Close manual analysis of high-similarity functions with divergent costs.
Smart contract design.
Execution has a price.
My bachelor's thesis studied whether the way an Ethereum program is designed, not only how each line is written, can make its transactions more expensive.
Optimization starts before the first line of code.
Running a smart contract costs gas, which users pay as a transaction fee. Most optimization work looks at individual coding mistakes, such as unnecessary storage operations or repeated calculations.
My thesis looked one level higher: the design of the whole contract. Two functions can be almost identical but live inside systems with different roles, withdrawal methods, and lifecycle rules. I studied whether those surrounding decisions were linked to different costs.
Which recurring smart-contract design solutions are statistically associated with high gas execution costs?
Compare code that is almost identical.
The dataset contained 4,965 pairs of similar functions from real Ethereum contracts. I focused on 43 pairs that were more than 95% alike but had different execution costs. Keeping the function code nearly constant made the surrounding design easier to compare.
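The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code: the `FunctionPair` record, its field names, and the sample values are hypothetical, and "different execution costs" is assumed here to mean any non-equal gas measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionPair:
    """One pair of similar functions (hypothetical record layout)."""
    similarity: float  # code similarity between the two functions, 0.0 to 1.0
    gas_a: int         # measured gas cost of the first function
    gas_b: int         # measured gas cost of the second function


def select_candidates(pairs, min_similarity=0.95):
    """Keep pairs above the similarity threshold whose gas costs differ."""
    return [
        p for p in pairs
        if p.similarity > min_similarity and p.gas_a != p.gas_b
    ]


# Made-up sample data, only to show the filter in action.
pairs = [
    FunctionPair(0.99, 21000, 48000),  # near-identical code, different cost
    FunctionPair(0.97, 30000, 30000),  # near-identical code, same cost
    FunctionPair(0.80, 21000, 90000),  # different cost, but code too dissimilar
]
candidates = select_candidates(pairs)
```

Only the first sample pair survives the filter, which mirrors the idea of holding the function code nearly constant so that cost differences point at the surrounding design.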
Manual comparison of those pairs generated candidate patterns. The broader validation stage then operated on 1,010 unique functions and their associated contracts.
Discover patterns manually, then test them at scale.
I grouped repeated differences into four areas: administrative roles, withdrawal methods, market lifecycle, and supporting functions. This produced 18 candidate design patterns.
I converted each pattern into a tested search rule and scanned the full dataset. Python prepared the comparisons, and Fisher's exact test in R checked whether each pattern was statistically associated with higher cost.
Regex-based detectors across four design areas.
Automated scanning and contingency-table construction.
Association strength evaluated through p-values and odds ratios.
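A detector and its contingency table might look roughly like the following sketch. The regular expression is an illustrative rule written for this example, not one of the thesis's 18 tested detectors. The `onlyOwner` modifier name, the sample contracts, and the boolean "high gas" label are all assumptions, since the source does not say how high cost was defined.

```python
import re

# Illustrative rule only: a function that carries an `onlyOwner` modifier
# and whose body reads the contract's full balance.
FULL_BALANCE_WITHDRAWAL = re.compile(
    r"function\s+\w+\s*\([^)]*\)[^{]*\bonlyOwner\b[^{]*\{[^}]*address\(this\)\.balance"
)


def contingency_table(records, detector):
    """Build a 2x2 table from (source_code, is_high_gas) records.

    Rows: pattern present / pattern absent.
    Columns: high gas / not high gas.
    """
    table = [[0, 0], [0, 0]]
    for source, is_high_gas in records:
        row = 0 if detector.search(source) else 1
        col = 0 if is_high_gas else 1
        table[row][col] += 1
    return table


# Hypothetical Solidity snippets used as sample input.
VAULT = """
contract Vault {
    function withdrawAll() external onlyOwner {
        payable(msg.sender).transfer(address(this).balance);
    }
}
"""
TOKEN = """
contract Token {
    function balanceOf(address a) external view returns (uint256) {
        return balances[a];
    }
}
"""

# Made-up labels: True means the function was classified as high gas.
records = [
    (VAULT, True), (VAULT, True), (VAULT, False),
    (TOKEN, True), (TOKEN, False), (TOKEN, False),
]
table = contingency_table(records, FULL_BALANCE_WITHDRAWAL)
```

A purely textual rule like this one cannot follow nested braces or renamed modifiers. That is the same weakness of regex detectors that the limitations section acknowledges.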
Six patterns showed a meaningful association.
Six of the 18 candidates were statistically associated with higher gas costs in this dataset, with p-values at or below 0.05. The measured odds ratios ranged from approximately 2.22 to 5.33.
A fund-redistribution pattern involving payable cross-contract interaction.
A full-balance withdrawal guarded by ownership checks.
Hard-coded role initialization and its storage implications.
Administrative state transitions around a restricted market phase.
The other 12 patterns had p-values above 0.05. The study therefore did not claim that every plausible design difference increases cost: absence of sufficient evidence was kept distinct from evidence of no effect.
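To make the two reported quantities concrete, here is a small pure-Python sketch of a two-sided Fisher's exact test and a sample odds ratio for one 2x2 table. The thesis ran the test in R, so this is a stand-in for illustration, and the counts are made up rather than taken from the study. Note that R's `fisher.test` reports a conditional maximum likelihood estimate of the odds ratio, which can differ slightly from the simple cross-product ratio computed here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x under fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def sample_odds_ratio(a, b, c, d):
    """Cross-product odds ratio; infinite when b or c is zero."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


# Made-up counts: rows are pattern present / absent,
# columns are high gas / not high gas.
a, b, c, d = 3, 1, 1, 3
p_value = fisher_exact_two_sided(a, b, c, d)
odds_ratio = sample_odds_ratio(a, b, c, d)
```

With these toy counts the odds ratio is large but the p-value stays well above 0.05. That is a small-sample illustration of why insufficient evidence is kept separate from evidence of no effect.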
Gas efficiency is also a design concern.
The findings suggest that developers should reason about gas consumption while defining roles, lifecycle transitions, fund movement, and contract boundaries, not only during low-level Solidity optimization.
A practical continuation would turn the validated patterns into design guidelines or analysis tooling that flags potentially expensive architectural choices before deployment.
An association is not a universal rule.
The dataset represents a specific family of real contracts, many centered on token farming, sales, and market behavior. This limits generalization to other application domains. Regex detectors can also miss semantically equivalent patterns expressed through different syntax.
Future work should use broader multi-chain datasets, semantic code representations, and machine-learning-based detectors. The reported odds ratios quantify association inside this dataset; they are not guarantees that adding one pattern will multiply every transaction's gas consumption by the same factor.