Ethereum has established itself as a foundational platform for decentralized applications, moving far beyond simple digital currency transfers. Its architecture is a complex interplay of cryptographic principles, data structures, and a unique state machine. This analysis delves into the core components that make Ethereum function: its blocks, transactions, smart contracts, and the Ethereum Virtual Machine (EVM), providing a clear view of how they interact to create a secure and programmable blockchain environment.
Foundational Concepts and Data Structures
Cryptographic Primitives: SHA-3 and RLP
At the heart of Ethereum's security and data management are two key technologies: the SHA-3 hashing algorithm and RLP (Recursive Length Prefix) encoding.
The hash function used throughout the Ethereum codebase is Keccak-256, which Ethereum's code and documentation often call SHA-3. Keccak is the sponge-construction design that NIST selected as the basis for the SHA-3 standard, a structure quite different from the Merkle-Damgård construction behind SHA-1 and SHA-2. Ethereum adopted Keccak before standardization was complete, and the final standard (FIPS 202) changed the padding rule, so Keccak-256 and the standardized SHA3-256 produce different digests for the same input. In Ethereum, this hash generates fixed-size fingerprints for any piece of data, which serve as immutable identifiers for blocks, transactions, and state objects. A hash function is a one-way process: you cannot reconstruct the original input data from its hash output.
RLP encoding serves as Ethereum's primary serialization method. It takes arbitrarily nested structures of binary data and encodes them into a flat array of bytes ([]byte). This is essential because the underlying database that stores Ethereum's state is a simple key-value store. The encoding is reversible, so the original structure can be reconstructed exactly. The most common pattern is to RLP-encode an object and then hash the result with Keccak-256 to create its key for storage and lookup.
Essential Data Types: Hash and Address
Ethereum defines custom data types for consistency and clarity across its codebase. The two most fundamental are common.Hash and common.Address.
- common.Hash: A 32-byte (256-bit) array used to represent the output of the Keccak-256 hashing function (SHA-3 in Ethereum's terminology). Any hash within the system, whether for a block, transaction, or state, uses this type.
- common.Address: A 20-byte array that uniquely identifies an account (either a user-owned Externally Owned Account or a smart contract) on the network. For an Externally Owned Account it is the last 20 bytes of the Keccak-256 hash of the account's public key; a contract's address is instead derived from its creator's address and nonce.
For handling large numbers, such as token balances and gas calculations, Ethereum heavily utilizes Go's big.Int type, which can handle integers of arbitrary size.
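As a rough illustration of these types, the sketch below declares fixed-size Hash and Address arrays and uses big.Int for a balance. It is a simplified stand-in rather than the go-ethereum common package itself: the Hex method here prints plain lowercase hex, and the balance figures are made up for the example.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"math/big"
)

const (
	HashLength    = 32
	AddressLength = 20
)

// Hash holds a 32-byte digest; Address holds a 20-byte account identifier.
type Hash [HashLength]byte
type Address [AddressLength]byte

// BytesToAddress keeps the rightmost 20 bytes of b and
// left-pads shorter input with zeros.
func BytesToAddress(b []byte) Address {
	var a Address
	if len(b) > AddressLength {
		b = b[len(b)-AddressLength:]
	}
	copy(a[AddressLength-len(b):], b)
	return a
}

// Hex renders the address as 0x-prefixed lowercase hex (simplified).
func (a Address) Hex() string { return "0x" + hex.EncodeToString(a[:]) }

func main() {
	addr := BytesToAddress([]byte{0xde, 0xad, 0xbe, 0xef})
	fmt.Println(addr.Hex())

	// One ether is 10^18 wei; big.Int keeps such values exact
	// even when they grow beyond the range of a uint64.
	oneEther := new(big.Int).Exp(big.NewInt(10), big.NewInt(18), nil)
	balance := new(big.Int).Mul(oneEther, big.NewInt(1000))
	fmt.Println(balance) // 1000000000000000000000
}
```

Fixed-size arrays rather than slices make these values comparable with == and usable as map keys, which is convenient for lookups keyed by hash or address.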
The Role of Gas and Ether
Ethereum introduces an economic model to regulate resource consumption on the network through two concepts: Gas and Ether.
- Gas: This is the fundamental unit of computational effort. Every operation on the Ethereum network—a simple addition, a memory store, or a contract creation—has a predetermined gas cost. Gas acts as a measure of the network resources required to execute an operation.
- Ether (ETH): This is the native cryptocurrency of the Ethereum blockchain. While Ether has value in its own right, it is also used to pay for gas. The cost of a transaction is calculated as Gas Used * Gas Price, where the gas price is the amount of Ether a user is willing to pay per unit of gas.
This system creates a market for block space and prevents malicious actors from spamming the network with computationally expensive operations, as they must pay for the resources they consume.
The Building Blocks: Transactions and Blocks
Anatomy of a Block
A block is the fundamental container for data in Ethereum's blockchain. Blocks are organized into a chain by cryptographically linking each block to its predecessor.
The Block data structure contains two primary elements:
- Header: A structure containing metadata about the block, such as:
  - ParentHash: The hash of the previous block, forming the chain.
  - Number: The block's height (genesis block is #0).
  - Other consensus-related fields like Difficulty, Timestamp, and GasLimit.
- Transactions: An array of transaction objects ([]*Transaction) that this block includes. The execution of these transactions is what changes the global state of the Ethereum network.
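The parent-hash linkage can be sketched in a few lines. The structs below carry only some of the fields named above, with simplified types, and VerifyChain is a hypothetical helper. One substitution is worth stating plainly: the header hash here is SHA-256 over a fixed-width layout, standing in for Ethereum's Keccak-256 over the RLP-encoded header, because SHA-256 is available in Go's standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

type Hash [32]byte

// Transaction is an empty placeholder in this sketch.
type Transaction struct{}

// Header holds a subset of the real header fields, with simplified types.
type Header struct {
	ParentHash Hash
	Number     uint64
	Timestamp  uint64
	GasLimit   uint64
}

type Block struct {
	Header       *Header
	Transactions []*Transaction
}

// Hash fingerprints the header.
// ASSUMPTION: SHA-256 over a fixed layout replaces Keccak-256 over RLP.
func (h *Header) Hash() Hash {
	var buf [56]byte
	copy(buf[:32], h.ParentHash[:])
	binary.BigEndian.PutUint64(buf[32:40], h.Number)
	binary.BigEndian.PutUint64(buf[40:48], h.Timestamp)
	binary.BigEndian.PutUint64(buf[48:56], h.GasLimit)
	return Hash(sha256.Sum256(buf[:]))
}

// VerifyChain checks that every block points at the hash of its
// predecessor and that heights increase by exactly one.
func VerifyChain(blocks []*Block) error {
	for i := 1; i < len(blocks); i++ {
		parent, child := blocks[i-1].Header, blocks[i].Header
		if child.ParentHash != parent.Hash() {
			return fmt.Errorf("block %d: parent hash mismatch", child.Number)
		}
		if child.Number != parent.Number+1 {
			return fmt.Errorf("block %d: unexpected height", child.Number)
		}
	}
	return nil
}

func main() {
	genesis := &Block{Header: &Header{Number: 0, GasLimit: 5000}}
	next := &Block{Header: &Header{
		ParentHash: genesis.Header.Hash(), Number: 1, GasLimit: 5000,
	}}
	fmt.Println(VerifyChain([]*Block{genesis, next})) // <nil>

	// Changing any field of an earlier header changes its hash,
	// so the child's ParentHash no longer matches.
	genesis.Header.GasLimit = 9999
	fmt.Println(VerifyChain([]*Block{genesis, next}) != nil) // true
}
```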
Deconstructing a Transaction
A transaction (Transaction) is a signed data package that instructs the network to perform an action. Its core data (txdata) includes:
- AccountNonce: A sequence number issued by the sender to prevent replay attacks.
- GasPrice: The amount of Ether the sender will pay per unit of gas.
- GasLimit: The maximum amount of gas the sender is willing to consume.
- Recipient: The destination Address (can be nil for contract creation).
- Amount: The quantity of Ether to transfer.
- Payload: A byte array that can contain either the init code for a new contract (if Recipient is nil) or the input data for a call to an existing contract.
- V, R, S: The three components of the ECDSA digital signature that authorizes the transaction.
Notably, the sender's address is not directly stored in the transaction. Instead, it is cryptographically recovered from the signature (V, R, S) when the transaction is processed, verifying the sender's identity and ensuring the transaction's integrity.
The Execution Engine: From Transaction to State Change
The Outer Execution Layer
The journey of a transaction begins in the outer execution layer. The StateProcessor.Process() function is the main entry point for processing a block. It iterates through each transaction in the block and applies it to the current world state using ApplyTransaction().
For each transaction, the process involves:
- Converting to a Message: The transaction is converted into a Message object. This involves recovering the sender's address from the transaction's signature.
- Gas Handling: The sender's account is charged an upfront fee (GasLimit * GasPrice). A GasPool tracks the remaining gas available for all transactions in the block.
- EVM Execution: The Message and a new EVM context are passed to the TransitionDb() function, which is responsible for the core execution logic.
- Creating a Receipt: After execution, a Receipt is generated. This receipt contains critical information about the transaction's outcome, including:
  - PostState: The hash of the state root after the transaction was applied.
  - CumulativeGasUsed: The total gas used so far in the block.
  - Logs: An array of log entries generated by the transaction's execution (e.g., from Solidity event emits). A Bloom filter is also included for efficient log searching.
- Refunds and Rewards: Any unused gas is refunded to the sender. The miner of the block is rewarded with the gas fees of the transactions in the block (for each transaction, Gas Used * Gas Price), incentivizing network security.
Inside the Ethereum Virtual Machine (EVM)
The EVM is a quasi-Turing complete state machine and the runtime environment for all smart contracts on Ethereum. Its operation can be broken down into several key components.
1. Context and State
The EVM does not operate in a vacuum. It is initialized with a Context and has access to a StateDB interface.
- Context: Provides information about the current block and transaction (e.g., GasPrice, Block.Number) and functions for transferring value (Transfer).
- StateDB: An interface representing the state database. It provides methods to retrieve, update, and commit changes to account balances, contract code, and storage. The EVM interacts with a cached state; changes are only permanently written to the underlying database when a block is finalized.
2. The Contract Object
A Contract is the central object the EVM executes. It encapsulates:
- CallerAddress: The address that initiated this contract execution.
- self: The address of the contract itself (this is the Recipient from the transaction).
- Code: The bytecode of the contract to be executed.
- Input: The data provided to the contract for this execution.
- Gas: The gas allotted for this execution.
- Value: The amount of Ether to be transferred.
3. Execution Paths: Call vs. Create
The EVM has different execution modes, primarily Call and Create.
- Call: Used when a transaction's Recipient is an existing contract address. The EVM loads the contract's bytecode from the state, sets up the Contract object, and executes it.
- Create: Used when a transaction's Recipient is nil, signaling the intent to deploy a new contract. The EVM generates a new contract address, uses the transaction's Payload as the init code to run, and then saves the returned code as the new contract's permanent bytecode at that address.
4. The Interpreter and Precompiles
Execution within the EVM is handled by the Interpreter.
- Precompiled Contracts: For common, complex cryptographic operations (e.g., elliptic curve pairings, SHA-256), Ethereum uses precompiled contracts. These are native Go functions with fixed gas costs that run much faster than if the same operation were implemented in EVM bytecode. The interpreter checks if a contract address is in the precompile list before attempting to interpret its bytecode.
- Bytecode Interpretation: For regular smart contract bytecode, the interpreter uses a JumpTable, an array of 256 operations (one slot per possible opcode byte). It fetches the opcode at the current program counter from the contract's Code, looks up the corresponding operation, and executes it. These operations manipulate the EVM's execution stack, memory, and storage.
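The fetch, look up, execute cycle can be illustrated with a toy interpreter. This is a sketch under clear simplifications: stack words are uint64 rather than the EVM's 256-bit words, only four opcodes are defined (STOP 0x00, ADD 0x01, MUL 0x02, PUSH1 0x60, using their real opcode values and base gas costs), and memory, storage, and the stack size limit are omitted. Run and the operation struct are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// Stack is a simplified operand stack of 64-bit words.
type Stack struct{ data []uint64 }

func (s *Stack) push(v uint64) { s.data = append(s.data, v) }
func (s *Stack) pop() uint64 {
	v := s.data[len(s.data)-1]
	s.data = s.data[:len(s.data)-1]
	return v
}

// operation describes one opcode: its behavior, gas cost,
// required stack depth, and whether it ends execution.
type operation struct {
	execute  func(pc *uint64, code []byte, st *Stack)
	gas      uint64
	minStack int
	halts    bool
}

// JumpTable has one slot per possible opcode byte; nil means undefined.
type JumpTable [256]*operation

func NewJumpTable() *JumpTable {
	jt := new(JumpTable)
	jt[0x00] = &operation{halts: true, execute: func(*uint64, []byte, *Stack) {}} // STOP
	jt[0x01] = &operation{gas: 3, minStack: 2, execute: func(_ *uint64, _ []byte, st *Stack) {
		st.push(st.pop() + st.pop()) // ADD
	}}
	jt[0x02] = &operation{gas: 5, minStack: 2, execute: func(_ *uint64, _ []byte, st *Stack) {
		st.push(st.pop() * st.pop()) // MUL
	}}
	jt[0x60] = &operation{gas: 3, execute: func(pc *uint64, code []byte, st *Stack) {
		// PUSH1: the next code byte is data, not an opcode.
		*pc++
		var v uint64
		if *pc < uint64(len(code)) {
			v = uint64(code[*pc])
		}
		st.push(v)
	}}
	return jt
}

// Run interprets code with the given gas budget and returns the
// top of the stack and the gas left over.
func Run(code []byte, gas uint64) (uint64, uint64, error) {
	jt := NewJumpTable()
	st := &Stack{}
	for pc := uint64(0); pc < uint64(len(code)); pc++ {
		op := jt[code[pc]]
		switch {
		case op == nil:
			return 0, 0, fmt.Errorf("invalid opcode 0x%02x", code[pc])
		case len(st.data) < op.minStack:
			return 0, 0, errors.New("stack underflow")
		case gas < op.gas:
			return 0, 0, errors.New("out of gas")
		}
		gas -= op.gas
		op.execute(&pc, code, st)
		if op.halts {
			break
		}
	}
	var top uint64
	if len(st.data) > 0 {
		top = st.data[len(st.data)-1]
	}
	return top, gas, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP computes (2 + 3) * 4.
	code := []byte{0x60, 0x02, 0x60, 0x03, 0x01, 0x60, 0x04, 0x02, 0x00}
	fmt.Println(Run(code, 100)) // 20 83 <nil>
}
```

A table indexed directly by the opcode byte turns dispatch into a single array lookup, and leaving undefined slots nil gives invalid-opcode detection for free.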
Frequently Asked Questions
What is the main difference between a transaction and a message?
A transaction is a signed piece of data originating from an external actor (an Externally Owned Account). A message is the unsigned form of a call that the EVM actually executes: every transaction is first converted into a top-level message, and contracts can then send further internal messages to one another during execution. Transactions are signed and pay for gas, while internal message calls are not signed and draw on the gas supplied by the originating transaction.
Why is gas such an important concept in Ethereum?
Gas is the mechanism that measures and prices computational effort on the Ethereum network. It serves two critical purposes: it prevents Denial-of-Service (DoS) attacks by making computation expensive for an attacker, and it creates a market-based mechanism for allocating block space, ensuring that the network's resources are used efficiently and that miners are compensated for their work.
How is a new smart contract address determined?
A new contract address is deterministically generated from the sender's address (creator) and their nonce (the number of transactions they have sent). The formula is essentially keccak256(rlp.encode([creator, nonce]))[12:]. This means you can always predict the address of a contract you are about to deploy, which is useful for patterns like counterfactual instantiation.
What are the V, R, S values in a transaction?
These values constitute the Elliptic Curve Digital Signature Algorithm (ECDSA) signature of the transaction. R and S are the components of the signature itself, while V is the recovery identifier. Together, they allow any node to cryptographically verify that the transaction was authorized by the holder of the sender's private key and to recover the sender's public key, and therefore the sender's address, from the signature.
What is the purpose of the receipt's logs and bloom filter?
Logs are emitted by smart contracts using the LOG opcodes to record events that occurred during execution (e.g., a token transfer). They are stored in the transaction receipt. The bloom filter is a space-efficient data structure that provides a probabilistic way to quickly check if a specific log might be present in the block. It allows light clients to efficiently search for logs without downloading all transaction receipts for every block.
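The "might be present" behavior is easiest to see in code. The sketch below follows the general shape of Ethereum's log bloom, a 2048-bit filter with three bits set per item, each chosen from 11 bits of the item's hash. Two things are assumptions for illustration: SHA-256 replaces Keccak-256 (so the bit positions differ from real Ethereum blooms), and the bit layout inside the byte array is simplified.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Bloom is a 2048-bit filter stored as 256 bytes.
type Bloom [256]byte

// bitPositions derives three bit indexes in [0, 2047] from the
// first three byte pairs of the item's hash.
// ASSUMPTION: SHA-256 stands in for Keccak-256 here.
func bitPositions(item []byte) [3]uint {
	h := sha256.Sum256(item)
	var pos [3]uint
	for i := 0; i < 3; i++ {
		pos[i] = (uint(h[2*i])<<8 | uint(h[2*i+1])) & 2047
	}
	return pos
}

// Add sets the three bits belonging to item.
func (b *Bloom) Add(item []byte) {
	for _, p := range bitPositions(item) {
		b[p/8] |= 1 << (p % 8)
	}
}

// MayContain returns false if item was definitely never added,
// and true if it was added or its bits collide with other items.
func (b *Bloom) MayContain(item []byte) bool {
	for _, p := range bitPositions(item) {
		if b[p/8]&(1<<(p%8)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	var bloom Bloom
	fmt.Println(bloom.MayContain([]byte("Transfer"))) // false: nothing added yet
	bloom.Add([]byte("Transfer"))
	fmt.Println(bloom.MayContain([]byte("Transfer"))) // true
}
```

A false answer is definitive, so a client can skip any block whose filter rules out the log it is looking for and fetch receipts only for the blocks that report a possible match.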
Can the EVM run code from any language?
Not directly. The EVM executes a specific low-level bytecode instruction set. However, you can write smart contracts in high-level languages like Solidity or Vyper, which are then compiled down to EVM bytecode. This bytecode is what is deployed and executed on the blockchain.