Learn Computing from the Experts | The Rheinwerk Computing Blog

What Is Ethereum?

Written by Rheinwerk Computing | Oct 7, 2024 1:00:00 PM

Ethereum has managed to construct a blockchain that can do much more than just transfer cryptocurrencies between participants.

 

Complex applications can be developed and run on the platform, all with the unique features of blockchain technology. The platform has also been evolving steadily over the past few years. The biggest event was The Merge, in which Ethereum switched to the proof-of-stake (PoS) consensus mechanism. Well before The Merge, in December 2020, the PoS chain (the Beacon Chain) went live and ran parallel to the Ethereum proof-of-work (PoW) chain. Then, in The Merge in September 2022, the two structures were merged together. This historical development is still very evident in the platform’s architecture. For example, there are basically two layers in Ethereum: the execution layer, which contains the remnants of the old Ethereum PoW chain, and the consensus layer, which carries the innovations of the Beacon Chain introduced with PoS. Because PoS has been the only consensus mechanism since The Merge, the Ethereum chain as a whole is now often referred to as the Beacon Chain.

 

In this blog post, we would like to give you some general information to help you differentiate Blockchain 2.0 from Blockchain 1.0 in a comprehensible way.

 

State Machine

Ethereum sees itself as a transaction-based state machine that starts with an initial genesis state and is transformed, transaction by transaction, into a final state. This final state is not a state with which the system ends but is always the most up-to-date state of the platform. (If you want to read more about this, see the Ethereum yellow paper at https://ethereum.github.io/yellowpaper/paper.pdf.)
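The idea of a transaction-based state machine can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the state is reduced to a mapping from account names to balances, a transaction is a hypothetical (sender, recipient, amount) tuple, and signatures, gas, and nonces are ignored.

```python
from functools import reduce


def apply_transaction(state, tx):
    """Return the successor state after applying one transfer.

    `state` maps account names to balances; `tx` is (sender, recipient, amount).
    Both shapes are assumptions made for this sketch.
    """
    sender, recipient, amount = tx
    if state.get(sender, 0) < amount:
        return state  # an invalid transaction leaves the state unchanged
    new_state = dict(state)  # never mutate the previous state
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


genesis = {"alice": 10, "bob": 0}
transactions = [
    ("alice", "bob", 4),
    ("bob", "carol", 1),
    ("carol", "alice", 5),  # rejected: carol only owns 1
]

# Folding all transactions over the genesis state yields the current state.
final_state = reduce(apply_transaction, transactions, genesis)
print(final_state)
```

Each call produces a new state rather than changing the old one, which mirrors the description of a sequence of states that starts at genesis.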

 

The Bitcoin project can also be described as a state machine, with the state represented by the global collection of all unspent transaction outputs (UTXOs; see https://medium.com/cybermiles/diving-into-ethereums-world-state-c893102030ed for more information). Bitcoin’s state is also altered by transactions on the network. To initiate these transactions, the participant must use their key to unlock one or more UTXOs and convert them into new UTXOs. With Bitcoin, users don’t have an account balance associated with their address. They only manage keys in their wallets that can unlock the UTXOs assigned to them. So, while the state of Bitcoin is rather abstract, Ethereum treats state as a basic concept on which its whole project is built. Unlike in Bitcoin, accounts form an important basic construct in the Ethereum network. They represent the addresses of the participants in the network but can contain much more information.
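To make the UTXO view of state more concrete, here is a minimal Python sketch. The data layout is an assumption for illustration only: a UTXO is identified by a (transaction ID, output index) pair and holds an (owner, amount) tuple. Key handling, signatures, and fees are left out.

```python
def spend(utxos, tx_id, inputs, outputs):
    """Consume the referenced UTXOs and create new ones; return the new UTXO set.

    utxos:   dict mapping (tx_id, index) -> (owner, amount)
    inputs:  list of (tx_id, index) references to spend
    outputs: list of (owner, amount) tuples to create
    """
    if len(set(inputs)) != len(inputs) or any(ref not in utxos for ref in inputs):
        raise ValueError("an input is duplicated, unknown, or already spent")
    total_in = sum(utxos[ref][1] for ref in inputs)
    total_out = sum(amount for _owner, amount in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    # Spent outputs disappear from the state; new outputs are added to it.
    new_utxos = {ref: out for ref, out in utxos.items() if ref not in inputs}
    for index, output in enumerate(outputs):
        new_utxos[(tx_id, index)] = output
    return new_utxos


def balance(utxos, owner):
    """A 'balance' is not stored anywhere; it is derived by summing UTXOs."""
    return sum(amount for who, amount in utxos.values() if who == owner)


utxos = {("tx0", 0): ("alice", 5), ("tx0", 1): ("alice", 3)}
utxos = spend(utxos, "tx1", [("tx0", 0), ("tx0", 1)], [("bob", 6), ("alice", 2)])
print(balance(utxos, "alice"), balance(utxos, "bob"))
```

Note that the balance has to be computed from the set of unspent outputs, whereas an account-based system stores it directly in the account.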

 

Merkle Patricia Trie

Now, we’ll introduce you to a data structure that is omnipresent in the Ethereum project: the Modified Merkle Patricia Trie (MPT). The MPT is a combination of two data structures: the Merkle tree and the Patricia trie.

 

The word trie is derived from the word retrieval, as in information retrieval. The similarity to the word tree is intentional because a trie is structured like a tree and forms a key-value store. In general, a trie stores strings that can then be looked up. Starting at the root and following a path spells out a word, character by character. Instead of words, IP addresses can also be stored in such a trie. In a regular trie, each additional character in a word adds a new node to the path. The Patricia trie, by contrast, merges consecutive characters into a single node wherever there is no branching, which saves space. In the Patricia trie, the strings represent the keys that lead to a certain value, as shown in the figure below. For example, the value can be the word itself or the ID of a word. Note, however, that the value itself is only recorded in the leaf at the very bottom of the trie. Along the way, the key-value pairs are formed by the substring acting as the key and the child node forming the value.

 

 

The MPT combines the search properties of the Patricia trie and the hash property of the Merkle tree. The keys introduced with the Patricia trie are broken down into nibbles in the MPT. In computer science, a nibble is a unit of four bits (half a byte). Here, a nibble is represented as a hexadecimal digit, which means that it can take the values 0 to 9 or a to f. There are basically three different types of nodes in an MPT:

  • Branch node: The branch node acts as a kind of signpost when two keys start to differ. To do this, the node has a slot for every value that a nibble can take. Depending on the next character of a key, there is a reference to the child node in the slot.
  • Leaf node: The leaf node is the lowest node in this data structure and is characterized by the fact that it has no child nodes. It forms the end of the key and contains the corresponding value.
  • Extension node: The extension node is used when the keys have equal parts. Think of the Patricia trie featured above, which, unlike other tries, summarizes identical characters. This functionality happens in the extension nodes of the MPT.
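The step of breaking keys into nibbles and finding the shared part that an extension node would hold can be sketched as follows. The function names and the two sample keys are made up for this illustration; the sample keys have an even number of hex digits so that they can be written as whole bytes.

```python
def to_nibbles(key: bytes) -> list[int]:
    """Split every byte into its high and low four bits."""
    nibbles = []
    for byte in key:
        nibbles.append(byte >> 4)    # upper half of the byte
        nibbles.append(byte & 0x0F)  # lower half of the byte
    return nibbles


def common_prefix(a: list[int], b: list[int]) -> list[int]:
    """Return the leading nibbles that both keys share."""
    length = 0
    while length < min(len(a), len(b)) and a[length] == b[length]:
        length += 1
    return a[:length]


def as_hex(nibbles: list[int]) -> str:
    """Render nibbles as hexadecimal digits for display."""
    return "".join(format(n, "x") for n in nibbles)


key_a = to_nibbles(bytes.fromhex("fa83bc94"))
key_b = to_nibbles(bytes.fromhex("fa83b147"))
print(as_hex(key_a), as_hex(key_b), as_hex(common_prefix(key_a, key_b)))
```

The shared nibbles are what an extension node would store, and the first nibble after them selects the slot in the following branch node.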

We would now like to illustrate this abstract explanation with an example. Let’s say we have four keys already converted to nibbles: fa284b1, fa83bc9, fa83b14, and fad3492. These keys are linked to the values value1, value2, value3, and value4. This scenario is now to be represented in an MPT. The beginning fa is the same for all strings. Therefore, as shown below, an extension node is formed for this purpose.

 

 

This extension node is also the root of the trie, but the following characters of the keys differ. Therefore, the extension node points to a branch node, in which the references to the next child nodes can be entered. For the first key, a reference is created in slot 2; for the second and third keys, a reference is created in slot 8; and for the last key, a reference is created in slot d. Since the first and last keys remain unique from this point on, a leaf node is created for each of them with the remaining nibbles of the key. These leaf nodes contain the values. The second and third keys, however, also share their next characters. Therefore, slot 8 refers to a new extension node with the matching nibbles of the two keys. To represent the differing nibbles that follow, a branch node is used again, with references to the child nodes at slot c and slot 1. These children are the final leaf nodes, which hold the last nibble and the values. Note that if the nibble in the branch node were the last digit of the key, the value would be stored directly there, just like in a leaf node.

 

To further illustrate the concept of an MPT, the listing below shows what an implementation of the example trie looks like.

 

#Extension{:key [f a]
           :child #Branch{2 #Leaf{:key [8 4 b 1]
                                  :value "value1"}
                          8 #Extension{:key [3 b]
                                       :child #Branch{c #Leaf{:key [9]
                                                              :value "value2"}
                                                      1 #Leaf{:key [4]
                                                              :value "value3"}}}
                          d #Leaf{:key [3 4 9 2]
                                  :value "value4"}}}

 

Now, the Merkle part of the MPT comes into play: hashing. Hashing proceeds from the bottom up. First, the leaf nodes are hashed along with their data. Then, in the parent nodes (nodes that directly reference the affected leaf nodes), the pointers are replaced by the hash of the node in question. The parent node is then hashed, and the pointers are also replaced in its own parent node. This continues until it’s the root’s turn. The resulting hash is ultimately the root hash of the MPT.
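The bottom-up hashing can be sketched in Python using the example trie with the four keys. Two loud simplifications apply: Ethereum uses Keccak-256 and RLP-encoded nodes, whereas this sketch uses hashlib.sha3_256 as a stand-in hash (it does not produce the same digests as Keccak-256) and Python's repr as a stand-in serialization. The tuple layout of the nodes is likewise an assumption for illustration.

```python
import hashlib


def node_hash(node) -> bytes:
    """Hash a node after replacing its child pointers with the child hashes."""
    kind = node[0]
    if kind == "leaf":
        _, key, value = node
        payload = ("leaf", key, value)
    elif kind == "extension":
        _, key, child = node
        payload = ("extension", key, node_hash(child).hex())
    else:  # branch: one reference per occupied slot
        _, slots = node
        payload = ("branch", sorted((slot, node_hash(child).hex())
                                    for slot, child in slots.items()))
    # Stand-in for Keccak-256 over the RLP encoding of the node.
    return hashlib.sha3_256(repr(payload).encode()).digest()


def example_trie(value2="value2"):
    """The example trie for the keys fa284b1, fa83bc9, fa83b14, and fad3492."""
    return ("extension", "fa", ("branch", {
        "2": ("leaf", "84b1", "value1"),
        "8": ("extension", "3b", ("branch", {
            "c": ("leaf", "9", value2),
            "1": ("leaf", "4", "value3"),
        })),
        "d": ("leaf", "3492", "value4"),
    }))


root = node_hash(example_trie())
tampered_root = node_hash(example_trie(value2="tampered"))
print(root.hex())
print(root != tampered_root)
```

Because every parent contains the hashes of its children, changing a single value deep in the trie changes the root hash as well.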

 

In Ethereum, the MPT is used multiple times, as you’ll see in the following sections.

 

Accounts and State Trie

The state trie is the heart of Ethereum and is made up of many individual accounts. In this section, we explain how this important data structure works.

Accounts

The global state of Ethereum is the sum of the states of the many accounts that exist on the network. An account is represented by an address that uniquely identifies it. There are two types of accounts in Ethereum: externally owned accounts (EOAs) and contract accounts (CAs).

Ethereum Addresses

The addresses of EOAs in Ethereum are calculated from the public key of the user: the public key is hashed with Keccak-256, and the last 20 bytes of the hash form the address. An Ethereum address therefore has a size of 20 bytes and is written as 40 hexadecimal characters. Each Ethereum address is preceded by the identifier 0x, which indicates the use of the hexadecimal system. Originally, the characters a to f only appeared in lowercase in the addresses, but a mixed-case variant (EIP-55) has since been introduced. The capitalization pattern in this variant acts as a checksum that detects when an address has been entered incorrectly.

 

The addresses of CAs are calculated from the address of the account that creates the contract and the nonce of that account. For this purpose, the two values are RLP-encoded and hashed.

 

EOAs are accounts used by external users, outside of the Ethereum platform (e.g., real people as users). These accounts are accessed via a private key. Smart contracts are represented in the network via the CAs; instead of a private key, they are controlled only by the program code of the smart contract. In addition, the CAs can be connected to other program code in other CAs.

 

The state of any account, regardless of its form, consists of four components, as shown in the next figure and the following list:

  • Account balance: This shows how much Wei (the smallest unit of Ether) the account holds.
  • Nonce: In an EOA, the nonce counts how many transactions have been sent from the account. In a contract account, the nonce counts how many contracts the account has created. The nonce is also added to the account’s transactions, forming a sequential number. In this way, the nonce can be used to prevent transactions from being executed twice and to ensure that the transactions of an account are processed in the correct order.
  • Storage root: Every contract account needs an internal memory in which the variables of the contract can be stored. This data is stored in the form of an MPT called a storage trie. In the account itself, however, only the storage root, which is the hash of the root of the storage trie, is stored. Since EOAs don’t use storage, the field in these accounts holds the root hash of an empty trie.
  • codeHash: The codeHash is the hash of the program code for the Ethereum Virtual Machine (EVM), and it is used by the CAs. The program code can be triggered by messages from other accounts, and it performs operations on the internal memory at runtime. Once the contract account has been created, the code can no longer be changed. In the case of EOAs, the code is simply an empty string, and thus the codeHash field contains the hash of an empty string.

Together, the individual components represent the state of an account.
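As a rough sketch, the four components and the role of the nonce could be modeled like this in Python. The class and function names are hypothetical, and hashlib.sha3_256 is only a stand-in for Keccak-256, so the two constants below are not the real Ethereum values.

```python
import hashlib
from dataclasses import dataclass, replace

# Stand-ins: Ethereum uses Keccak-256, which differs from hashlib.sha3_256.
EMPTY_CODE_HASH = hashlib.sha3_256(b"").digest()         # hash of empty code
EMPTY_STORAGE_ROOT = hashlib.sha3_256(b"\x80").digest()  # root of an empty trie


@dataclass(frozen=True)
class Account:
    balance: int         # in Wei
    nonce: int           # number of transactions sent (EOA)
    storage_root: bytes  # root hash of the storage trie
    code_hash: bytes     # hash of the EVM code


def is_contract(account: Account) -> bool:
    """An account with non-empty code is a contract account."""
    return account.code_hash != EMPTY_CODE_HASH


def accept_nonce(account: Account, tx_nonce: int) -> Account:
    """Accept a transaction only if its nonce matches, then count it."""
    if tx_nonce != account.nonce:
        raise ValueError(f"expected nonce {account.nonce}, got {tx_nonce}")
    return replace(account, nonce=account.nonce + 1)


eoa = Account(balance=10**18, nonce=0,
              storage_root=EMPTY_STORAGE_ROOT, code_hash=EMPTY_CODE_HASH)
after_first_tx = accept_nonce(eoa, tx_nonce=0)
print(is_contract(eoa), after_first_tx.nonce)
```

Submitting the same transaction again would carry nonce 0 while the account already expects 1, so the replay is rejected.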

 

State Trie

We’ve explained that states play an important role in the Ethereum network. The current state of the Ethereum blockchain is constantly updated on the network, creating a global state called the platform’s world state. The state trie maps this global state, making it the heart of the Ethereum blockchain. So, it’s a snapshot of the entire system. A copy of the state trie is stored on each node in the network.

 

The state of the network is the sum of the states of all accounts in the network, which is why all existing accounts are stored in the state trie as key-value pairs. The key is derived from the address of an account. The value contains the associated account with its four components, encoded in the Recursive Length Prefix (RLP) format. This means that the current balance and the current nonce of each account can be read directly from the state trie, while the storage and the code of the account are referenced there through the storage root and the codeHash.

 

Recursive Length Prefix Format: The RLP format is used in Ethereum for serializing objects into byte streams. RLP takes either a byte string or a (possibly nested) list of such items as input. It only encodes the raw structure of these objects and doesn’t care about how they were interpreted before encoding. This interpretation is made again by the decoder at a later stage. With RLP, it’s possible to store data compactly in the tries or transfer it between nodes.
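A compact RLP encoder can be sketched in Python as follows. The function names are chosen for this illustration. Accepting Python integers and text strings is a convenience of the sketch: integers are first converted to big-endian bytes without leading zeros (the convention Ethereum uses for numbers), and text is converted to bytes.

```python
def encode_length(length: int, offset: int) -> bytes:
    """Build the prefix for a payload of the given length."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode bytes, text, non-negative integers, or nested lists of them."""
    if isinstance(item, int):
        if item < 0:
            raise ValueError("RLP has no encoding for negative integers")
        # Big-endian without leading zeros; 0 becomes the empty byte string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, str):
        item = item.encode()
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single small byte is its own encoding
        return encode_length(len(item), 0x80) + item
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")


print(rlp_encode("dog").hex())
print(rlp_encode(["cat", "dog"]).hex())
print(rlp_encode(1024).hex())
```

The prefix byte alone tells a decoder whether a string or a list follows and how long it is, which is why no further type information needs to be stored.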

 

Transactions and the Transaction Trie

As in Bitcoin, there are transactions on the Ethereum platform that are stored in a transaction trie. In this section, we’ll explain how transactions fit into the system.

Transactions

Transactions are an important construct in the Ethereum blockchain because they are what sets the platform in motion. When transactions happen between accounts, Ethereum moves from one state to a new final state that can then be stored again.

 

As in the Bitcoin blockchain, a transaction is usually a message between actors in the network. Dr. Gavin Wood’s Ethereum yellow paper, “Ethereum: A Secure Decentralised Generalised Transaction Ledger,” describes a transaction as a single cryptographically signed instruction initiated by an EOA. Messages can be sent to other EOAs or CAs via a message call. If the transaction takes place between two EOAs, it’s simply a matter of sending a certain amount of Ether. This is the use case that Bitcoin and other cryptocurrencies cover. When a transaction takes place between an EOA and a CA, it’s done to call the internal program code of the CA. This entails operations on the internal memory. Transactions can serve another purpose: to create CAs by deploying a smart contract. The different use cases for transactions are shown here.

 

 

As with Bitcoin, all transactions can be identified with a unique hash, which in Ethereum is called a TxHash. A standard transaction in Ethereum consists of several components detailed in the following list and figure:

  • nonce: There is a field in the transaction with the name nonce. The field is filled with the current value of the nonce from the sender’s account, which we presented in the previous section.
  • from: This field contains the address of the sender of the transaction.
  • signature: The signature of the sender is calculated with the private key.
  • to: The to field contains the address of the recipient of the transaction. If it’s a transaction that is supposed to create a contract, this field will be filled with an empty value because no address exists yet.
  • value: In the value field, a value in Wei is entered and is to be transmitted to the recipient by the transaction. This field is also used for a transaction to create a new contract. The value entered here then represents the initial balance.
  • Input data: The input data field is designed to interact with smart contracts. Here, for example, required input parameters can be entered; they are required to execute the code on a contract account. If the transaction is a contract deployment transaction, the contract code is stored in the data field. The contract code is represented in bytes and executed exactly once when the contract account is initialized. The data field is responsible for storing the program code for the individual logic of the contract in the new contract account. The input data field is optional, so it can be empty, or users can store arbitrary data like messages in it.
  • gasLimit: Transactions on the Ethereum network cost money. Gas is the unit in which the cost of transactions and other actions on the network is measured. It was introduced so that the cost of an operation can be stated independently of the Ether currency and its market value. The price a user pays per unit of gas is made up of the base fee, which is set by the network, and the priority fee, which is a voluntary tip. The gasLimit field (sometimes also called startGas) determines the maximum amount of gas the user is willing to spend in total to carry out their transaction.

Gas: The Fuel of Ethereum: If you actively participate in the Ethereum network, you can’t avoid gas. Gas keeps Ethereum running and measures what users pay for when they consume computing power in the network. Transactions, the creation of smart contracts, and the use of smart contracts: every operation performed requires a predetermined amount of gas. This allows developers to add up how much gas their smart contract will consume during operation and optimize it accordingly. Gas is not a currency, but the price of gas is expressed in Ether. The unit used is Gwei; 1 Gwei corresponds to 1,000,000,000 Wei. The price of gas is determined by supply and demand in the network, so gas is a constant unit in a market where prices fluctuate. You can think of it like your car: if you have 5 gallons left in the tank, you know how far you can get with your car, no matter how high the price of gasoline is.

  • maxPriorityFeePerGas: Users can give validators a tip that gets their transaction priority for inclusion in the next block. This tip is called a priority fee. The maximum price (in Gwei) of this priority fee per unit of gas can be specified in this data field. In addition to the priority fee, users pay the base fee, but this is burned after the transaction is carried out (i.e., permanently removed from circulation). The priority fee is therefore the actual reward for the validators. The higher the tip, the faster and more reliably the validators will consider the transaction.
  • maxFeePerGas: This field indicates the maximum total fee per gas unit that users are willing to pay as part of the transaction. The total fee is made up of the base fee and the priority fee.
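How the three gas-related fields interact can be illustrated with a small calculation. This is a sketch with made-up function names and example numbers: it caps the tip so that base fee plus tip never exceeds maxFeePerGas, and it splits the amount paid into the burned part and the validator's part.

```python
GWEI = 10**9  # 1 Gwei = 1,000,000,000 Wei


def effective_gas_price(base_fee, max_fee_per_gas, max_priority_fee_per_gas):
    """Return (price per gas unit, tip per gas unit), all in Wei."""
    if max_fee_per_gas < base_fee:
        raise ValueError("maxFeePerGas is below the current base fee")
    # The tip is limited by what is left under maxFeePerGas.
    tip = min(max_priority_fee_per_gas, max_fee_per_gas - base_fee)
    return base_fee + tip, tip


def settle(gas_used, gas_limit, base_fee, max_fee_per_gas, max_priority_fee_per_gas):
    """Split the cost of an executed transaction into its parts (in Wei)."""
    if gas_used > gas_limit:
        raise ValueError("transaction ran out of gas")
    price, tip = effective_gas_price(base_fee, max_fee_per_gas,
                                     max_priority_fee_per_gas)
    return {
        "paid_by_sender": gas_used * price,
        "burned": gas_used * base_fee,
        "validator_tip": gas_used * tip,
    }


# Example numbers: a plain Ether transfer uses 21,000 gas.
costs = settle(gas_used=21_000, gas_limit=30_000, base_fee=20 * GWEI,
               max_fee_per_gas=30 * GWEI, max_priority_fee_per_gas=2 * GWEI)
print(costs)
```

In this example the sender pays 22 Gwei per unit of gas even though up to 30 Gwei was allowed; unused gas below the gasLimit is not charged.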

 

In a constantly evolving system like Ethereum, important components such as transactions are also changing. In addition to the standard transactions that we’ve described, there are modified transaction types with extended features. To make the system able to respond well to future developments and maintain backward compatibility, a typed transaction envelope was introduced and can handle a wide variety of transaction types (see https://eips.ethereum.org/EIPS/eip-2718). New transaction types can be wrapped in the envelope and just need to ensure backward compatibility.

Transaction Trie

Unlike the data in accounts, transactions in the block are not subsequently changed. It therefore makes sense to store transactions in a separate data structure. For this purpose, the Ethereum network uses the transaction trie, in which the transactions collected in the block’s transaction list are stored. Here, Ethereum again resembles Bitcoin: unlike the state trie, a transaction trie doesn’t contain all transactions in the network but only the transactions included in its block. So, there are several transaction tries (one per block, to be exact). Otherwise, the transaction trie works like a normal MPT. The transactions are stored in key-value pairs in the trie, with the RLP-encoded index of the transaction (which is important for the order) representing the key and the transaction components described previously representing the value.

Messages

You’ve now learned how human users (or off-chain software) as external actors can influence the network with external transactions via the EOAs. However, contract accounts that are located exclusively within the platform boundaries must also be able to actively participate in the network. Ethereum enables messages for this reason. With the help of messages, the contract accounts can communicate with other contract accounts and call functions there. Messages are similar to transactions, but they have some peculiarities. For example, messages can never be sent spontaneously. Each first message is preceded by an initial transaction of an EOA, but it can then trigger further messages. Another special feature is that messages don’t become part of the blockchain but only exist in the execution environment during runtime.

 

Nevertheless, messages can influence the state of an account. For example, it’s often the task of a contract account to send Ether and thus update the balance of an EOA or a CA. Such a message is sometimes referred to as a value transfer or an internal transaction. Again, these special messages are not stored in the blockchain but still change the balance of the account in question (see figure below). This may sound unusual, as we know from the Bitcoin blockchain that all transactions are stored without gaps. In Ethereum, the stored data alone therefore doesn’t show directly where the Ether in a balance originally came from. However, the initial transaction, its input parameters, and the publicly visible program code of the called CA can be used to re-execute the call and reconstruct where the money came from. For example, the leading Ethereum block explorer Etherscan uses this approach to display the Ether-transferring messages to its users in an accessible way. However, since these value transfers don’t have a TxHash for unique identification, the TxHash of the parent transaction (the initial transaction that directly or indirectly triggered the message) is used.

 

 

The structure and components of a message are very similar to the transactions, but there are some differences. Since a message comes from a CA, it doesn’t have a signature due to the lack of a private key. In addition, a message doesn’t have the gas-related fields, as this was already set by the EOA in the initial transaction.

 

Receipts and Receipts Trie

The receipts and the receipts trie store the results of transactions. In this section, we’ll explain why the receipts and the associated tries are important in Ethereum.

Receipts

Transactions are instructions from an EOA that clearly state what the EOA wants the network to do. The transactions don’t show what happened after the transaction was executed and what effects it had. However, to be able to understand the change in a state, you need to know exactly what happened.

 

This issue is resolved with the receipts of the transactions. Receipts provide detailed information about how the transaction was carried out, and they consist of several individual components, as shown in the following list and figure:

  • General data: The receipt contains some general data that helps to locate the transaction. The blockHash and blockNumber provide information about the block in which the transaction is stored. The transactionHash clearly indicates which transaction it is, and the transactionIndex shows where the transaction is in the block. The components from and to identify the sender and the recipient. If a contract account was generated by the transaction, the contractAddress displays the address of that account. The component type shows the type of transaction.
  • Status: The status of a receipt indicates whether a transaction was successful or not. If the status is 1, it was executed successfully; if it’s 0, it failed.
  • cumulativeGasUsed: This component is the sum of the gas consumed by the transaction under consideration and the gas consumed by all the transactions that precede it in the same block.
  • gasUsed: This provides information on how much gas the transaction actually consumed.
  • effectiveGasPrice: This is the base fee plus the priority fee paid for each unit of gas.
  • Logs: This component is a list of log objects caused by the transaction. A log is created for a transaction by a smart contract it uses, and it does so whenever that transaction triggers an event. Events can be implemented by smart contract developers to document specific activities of the contract. A log consists of the address of the logging account plus the topics (the hash of the event and the indexed data types used as input variables), data, block number, transactionHash, transactionIndex, blockHash, logIndex, and removed field (which indicates whether the log was removed).
  • logsBloom: A Bloom filter is a compact data structure that can quickly answer whether an item is possibly contained in a set or definitely not contained in it. This is especially important for the logs to enable data analysis in connection with events, because blocks without matching logs can be skipped during a search. The logsBloom component is the Bloom filter built from the logs described previously.
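The principle behind the logsBloom field can be sketched as follows. This is a simplified illustration, not Ethereum's exact bit layout: the filter is modeled as a Python integer with 2,048 bits, three bits are set per item, and hashlib.sha3_256 is used as a stand-in for Keccak-256. The sample address and event signature are made-up inputs.

```python
import hashlib

BLOOM_BITS = 2048  # size of the filter in bits


def bloom_positions(item: bytes) -> list[int]:
    """Derive three bit positions from the hash of the item."""
    digest = hashlib.sha3_256(item).digest()  # stand-in for Keccak-256
    # Each of the first three byte pairs is reduced to a position 0..2047.
    return [int.from_bytes(digest[i:i + 2], "big") % BLOOM_BITS for i in (0, 2, 4)]


def bloom_add(bloom: int, item: bytes) -> int:
    """Return the filter with the item's three bits switched on."""
    for position in bloom_positions(item):
        bloom |= 1 << position
    return bloom


def bloom_might_contain(bloom: int, item: bytes) -> bool:
    """False means definitely absent; True means possibly present."""
    return all((bloom >> position) & 1 for position in bloom_positions(item))


logging_address = bytes.fromhex("00" * 19 + "01")  # hypothetical contract address
event_topic = b"Transfer(address,address,uint256)"   # hypothetical event signature

bloom = 0
for entry in (logging_address, event_topic):
    bloom = bloom_add(bloom, entry)

print(bloom_might_contain(bloom, event_topic))
print(bloom_might_contain(0, event_topic))
```

A search for a particular event only needs to inspect the logs of a block in detail if the filter reports a possible match; a negative answer is always reliable.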

Receipts Trie

Like the transactions, the receipts in the block are not changed afterwards. However, the receipts must still be kept separately. The transaction trie is formed before execution and then already has the required immutable hash values. If the receipts were also stored in this trie, its hash values would subsequently change. For this reason, a separate trie is created for the receipts. The receipts trie is very similar to the trie for the transactions: again, an instance contains only the receipts of the particular block. The receipts are also stored in key-value pairs in the trie, with the RLP-encoded index of the receipt representing the key and the components of the receipt representing the value.

 

Editor’s note: This post has been adapted from a section of the book Blockchain: The Comprehensive Guide to Blockchain Development, Ethereum, Solidity, and Smart Contracts by Tobias Fertig and Andreas Schütz.