# EVM Internals

In this tutorial, we take a deep dive into the low-level details of EVM bytecode. First, you will learn how to debug or trace a transaction at the bytecode level so that you can understand the details of contract creation and cross-contract calls.

# Debugging on Remix

Remix provides a GUI-based debugging interface that is handy for quickly debugging a contract's logic. To start a debugging session, set a breakpoint (click the circle next to the line number) and press the "Debug" button next to a transaction in the message window. In this case, we'd like to debug a transaction that invokes store(0xdeadbeef).

This opens Remix's debugging interface. It is primitive but powerful enough for common cases. It lets you move the program counter (PC), also called the instruction pointer (explained below), and shows EVM state such as the stack, memory, storage, and global variables like block and msg. Before going further, please read the Debugger section of the Remix documentation.

Once spawned, it stops at the breakpoint (line 18) and shows the current state. You can try to navigate the PC by using the interface below and see how the state changes when the PC moves back and forth.

Navigation on Remix (ref. Remix Document)

We will walk you through one step of a transaction. Before going further, read the Debugging Transactions section.

First, move the PC to SSTORE, which is part of the number = num statement, by clicking either "step into" or "step over forward".

In this tutorial, we'd like to show how to read each EVM instruction and how to examine its state. The yellow paper explains each instruction in an opaque, hard-to-read manner, so please refer to EVM.codes for a concise and cleaner explanation of each opcode.

SSTORE before

You can see two values (256 bits, i.e., 32 bytes, each) on top of the stack, namely 0x0 and 0xdeadbeef. They are the two operands of the SSTORE instruction: the storage address (key) and the value to store.

SSTORE

The gas calculation of SSTORE is a bit complicated but, for now, we don't need to comprehend every single detail of it. Let's see how the state changes after executing SSTORE.

SSTORE after

First, the PC advances by one (from 132 to 133) and the two stack values are popped (consumed by SSTORE). More importantly, in storage, address 0 (the key) now holds 0xdeadbeef (the value). Feel free to move the PC back and forth to get a high-level idea of how each instruction works.
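The effect of SSTORE described above can be mimicked in a few lines of Python. This is only a toy model, assuming a list as the stack (top at the end) and a dict as the storage; the function name `sstore` is made up for illustration, and gas accounting is ignored.

```python
def sstore(pc, stack, storage):
    """Toy model of SSTORE: pop key and value, write storage, advance the PC."""
    key = stack.pop()    # top of the stack: storage slot (key)
    value = stack.pop()  # next item: value to store
    storage[key] = value
    return pc + 1        # SSTORE is a one-byte instruction


stack = [0xdeadbeef, 0x0]   # the top of the stack is the last element
storage = {}
pc = sstore(132, stack, storage)
print(pc, stack, {k: hex(v) for k, v in storage.items()})
# 133 [] {0: '0xdeadbeef'}
```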

# Tracing on Brownie

Let's see how Brownie provides an interface for tracing.

$ git pull
$ brownie run scripts/pwn.py -I

>>> s = Storage.deploy()
Transaction sent: 0xed56633bc73ce0068b871ada77cb7bf1c8ccf418df278752b3d7b75cd1481113
  Gas price: 0.0 gwei   Gas limit: 12000000   Nonce: 657
  Storage.constructor confirmed   Block: 666   Gas used: 90551 (0.75%)
  Storage deployed at: 0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A

>>> tx = s.store(0xdeadbeef)
Transaction sent: 0xe2ba023cd6007edb46c15b1ed2fa5a4db09116b84b7b7697099ca913f242be25
  Gas price: 0.0 gwei   Gas limit: 12000000   Nonce: 658
  Storage.store confirmed   Block: 667   Gas used: 41452 (0.35%)

>>> tx.info()
Transaction was Mined
---------------------
Tx Hash: 0xe2ba023cd6007edb46c15b1ed2fa5a4db09116b84b7b7697099ca913f242be25
From: 0xfed61D4a212A14143DC51f21d8D6617072e7a621
To: 0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A
Value: 0
Function: Storage.store
Block: 667
Gas Used: 41452 / 12000000 (0.3%)

We simply deploy and invoke store() of the Storage contract. The transaction, tx, can be inspected with various functions that Brownie provides.

>>> tx.call_trace()
Call trace for '0xe2ba023cd6007edb46c15b1ed2fa5a4db09116b84b7b7697099ca913f242be25':
Initial call cost  [21240 gas]
Storage.store  0:59  [20212 gas]

0:59 indicates a range of indexes of traces (tx.trace) for Storage.store(). For example, you can examine its state like below:

>>> for t in tx.trace:
...   if t["op"] == "SSTORE":
...     pprint(t)
...
{'address': '0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A',
 'contractName': 'Storage',
 'depth': 0,
 'error': '',
 'fn': 'Storage.store',
 'gas': 11978557,
 'gasCost': 20000,
 'jumpDepth': 0,
 'memory': ['0000000000000000000000000000000000000000000000000000000000000000',
            '0000000000000000000000000000000000000000000000000000000000000000',
            '0000000000000000000000000000000000000000000000000000000000000080'],
 'op': 'SSTORE',
 'pc': 90,
 'source': {'filename': 'contracts/Storage.sol', 'offset': [316, 328]},
 'stack': ['000000000000000000000000000000000000000000000000000000006057361d',
           '000000000000000000000000000000000000000000000000000000000000005c',
           '00000000000000000000000000000000000000000000000000000000deadbeef',
           '0000000000000000000000000000000000000000000000000000000000000000'],
 'storage': {}}

In this trace, you can see the state right before SSTORE executes, such as the remaining gas, stack, memory, and storage. For more details on Brownie's tracing capabilities, please read Inspecting and Debugging Transactions.
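To read the operands out of such a trace entry, note that each stack item is one 32-byte word printed as hex, and that the last list element is the top of the stack. A minimal sketch, using a hand-copied subset of the entry above; the helper name `sstore_args` is hypothetical and not part of Brownie:

```python
def sstore_args(step):
    """Return the (key, value) pair that an SSTORE step is about to consume."""
    if step["op"] != "SSTORE":
        raise ValueError("not an SSTORE step")
    key = int(step["stack"][-1], 16)    # top of the stack: storage key
    value = int(step["stack"][-2], 16)  # second item: value to store
    return key, value


# Fields copied from the trace printed above (other fields omitted).
step = {
    "op": "SSTORE",
    "pc": 90,
    "stack": [
        "000000000000000000000000000000000000000000000000000000006057361d",
        "000000000000000000000000000000000000000000000000000000000000005c",
        "00000000000000000000000000000000000000000000000000000000deadbeef",
        "0000000000000000000000000000000000000000000000000000000000000000",
    ],
}
key, value = sstore_args(step)
print(hex(key), hex(value))  # 0x0 0xdeadbeef
```

In the Brownie console, the same helper could be applied to each entry of tx.trace whose "op" is "SSTORE".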

TIP

Debugging and tracing are in fact built on top of the interface that each node provides. For example, in geth, debug_traceTransaction under the debug namespace (debug.traceTransaction() on the console) provides more versatile control over inspecting a transaction. You can even write a debugging script in JavaScript on the geth console.
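As a rough illustration, the JSON-RPC request body for debug_traceTransaction can be built as below. This sketch assumes the node exposes the debug namespace over its RPC endpoint; the helper name `trace_request` is made up, the options dict is left empty, and actually sending the request (e.g., via HTTP POST) is not shown.

```python
import json


def trace_request(tx_hash, options=None, request_id=1):
    """Build the JSON-RPC body for a debug_traceTransaction call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "debug_traceTransaction",
        # First parameter: transaction hash; second: tracer options.
        "params": [tx_hash, options or {}],
    }


# The hash of the store(0xdeadbeef) transaction from this tutorial.
body = trace_request(
    "0xe2ba023cd6007edb46c15b1ed2fa5a4db09116b84b7b7697099ca913f242be25"
)
print(json.dumps(body, indent=2))
```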

# Contract Internal

Let's try to understand the last four bytecode instructions executed when invoking store() of the Storage contract.

>>> tx.trace[-4:]
[
    {
        'address': "0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A",
        'contractName': "Storage",
        'depth': 0,
        'error': "",
        'fn': "Storage.store",
        'gas': 11978557,
        'gasCost': 20000,
        'jumpDepth': 0,
        'memory': [
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000080"
        ],
        'op': "SSTORE",
        'pc': 90,
        'source': {
            'filename': "contracts/Storage.sol",
            'offset': [316, 328]
        },
        'stack': [
            "000000000000000000000000000000000000000000000000000000006057361d",
            "000000000000000000000000000000000000000000000000000000000000005c",
            "00000000000000000000000000000000000000000000000000000000deadbeef",
            "0000000000000000000000000000000000000000000000000000000000000000"
        ],
        'storage': {
        }
    },

# Once `SSTORE` has executed, the next instruction, `JUMP`, jumps to the PC stored
# on top of the stack (0x5c), which happens to be the very next instruction!

    {
        'address': "0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A",
        'contractName': "Storage",
        'depth': 0,
        'error': "",
        'fn': "Storage.store",
        'gas': 11958557,
        'gasCost': 8,
        'jumpDepth': 0,
        'memory': [
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000080"
        ],
        'op': "JUMP",
        'pc': 91,
        'source': {
            'filename': "contracts/Storage.sol",
            'offset': [271, 335]
        },
        'stack': [
            "000000000000000000000000000000000000000000000000000000006057361d",
            "000000000000000000000000000000000000000000000000000000000000005c"
        ],
        'storage': {
            '0000000000000000000000000000000000000000000000000000000000000000': 
            "00000000000000000000000000000000000000000000000000000000deadbeef"
        }
    },

# `JUMP` can only jump to an instruction marker, called `JUMPDEST`, which makes
# it easy to enforce the integrity of the control flow. It's just a marker (a nop), so
# nothing happens other than increasing the PC.

    {
        'address': "0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A",
        'contractName': "Storage",
        'depth': 0,
        'error': "",
        'fn': "Storage.store",
        'gas': 11958549,
        'gasCost': 1,
        'jumpDepth': 0,
        'memory': [
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000080"
        ],
        'op': "JUMPDEST",
        'pc': 92,
        'source': {
            'filename': "contracts/Storage.sol",
            'offset': [271, 335]
        },
        'stack': [
            "000000000000000000000000000000000000000000000000000000006057361d"
        ],
        'storage': {
            '0000000000000000000000000000000000000000000000000000000000000000': 
            "00000000000000000000000000000000000000000000000000000000deadbeef"
        }
    },
    
# `STOP` halts the transaction successfully.

    {
        'address': "0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A",
        'contractName': "Storage",
        'depth': 0,
        'error': "",
        'fn': "Storage.store",
        'gas': 11958548,
        'gasCost': 0,
        'jumpDepth': 0,
        'memory': [
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000080"
        ],
        'op': "STOP",
        'pc': 93,
        'source': {
            'filename': "contracts/Storage.sol",
            'offset': [271, 335]
        },
        'stack': [
            "000000000000000000000000000000000000000000000000000000006057361d"
        ],
        'storage': {
            '0000000000000000000000000000000000000000000000000000000000000000': 
            "00000000000000000000000000000000000000000000000000000000deadbeef"
        }
    }
]
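The four entries above also show how gas is accounted for: each entry records the gas remaining before the instruction runs, so subtracting its gasCost should give the gas of the next entry. A small check, with the numbers copied by hand from the trace:

```python
# (op, gas before the instruction, gasCost), copied from the trace above.
steps = [
    ("SSTORE", 11978557, 20000),
    ("JUMP", 11958557, 8),
    ("JUMPDEST", 11958549, 1),
    ("STOP", 11958548, 0),
]

for (op, gas, cost), (_, next_gas, _) in zip(steps, steps[1:]):
    # Remaining gas after this step must equal the next entry's 'gas'.
    assert gas - cost == next_gas, op
    print(f"{op:8s} {gas} - {cost} = {next_gas}")
```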

We can also retrieve the bytecode of the contract by using web3.eth.get_code(address), as shown below.

>>> code = web3.eth.get_code('0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A')

>>> print(evm_disasm(code))
00000: PUSH1 0x80
00002: PUSH1 0x40
00004: MSTORE
00005: CALLVALUE
00006: DUP1
00007: ISZERO
00008: PUSH1 0x0f
...
0005a: SSTORE
0005b: JUMP
0005c: JUMPDEST
0005d: STOP
...

You can see SSTORE at 90/0x5a and its flow to STOP at 93/0x5d as in the trace.
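To see how such a listing is produced, here is a minimal disassembler sketch. It is not the `evm_disasm` helper used above; the opcode table only covers instructions that appear in this tutorial, and the name `disasm` is made up. The one subtlety is that PUSH1 to PUSH32 carry immediate bytes that must be skipped when advancing the PC.

```python
# Only the opcodes that appear in this tutorial; anything else prints as UNKNOWN.
OPCODES = {
    0x00: "STOP", 0x15: "ISZERO", 0x34: "CALLVALUE", 0x39: "CODECOPY",
    0x50: "POP", 0x52: "MSTORE", 0x55: "SSTORE", 0x56: "JUMP",
    0x57: "JUMPI", 0x5b: "JUMPDEST", 0x80: "DUP1", 0xf3: "RETURN",
    0xfd: "REVERT", 0xfe: "INVALID",
}


def disasm(code):
    """Yield (pc, text) pairs for a bytes object holding EVM bytecode."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7f:            # PUSH1 .. PUSH32
            n = op - 0x5f                 # number of immediate bytes
            imm = code[pc + 1:pc + 1 + n]
            yield pc, f"PUSH{n} 0x{imm.hex()}"
            pc += 1 + n                   # skip the immediate operand
        else:
            yield pc, OPCODES.get(op, f"UNKNOWN(0x{op:02x})")
            pc += 1


# The first ten bytes of the contract code, matching the listing above.
head = bytes.fromhex("6080604052348015600f")
for pc, text in disasm(head):
    print(f"{pc:05x}: {text}")
```

Feeding it the four bytes 55 56 5b 00 yields SSTORE, JUMP, JUMPDEST, and STOP, the sequence seen at 0x5a to 0x5d.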

# Contract Creation Internal

What we have seen so far is the bytecode of the contract itself. To create a contract, by convention, we simply send the constructor code (data) in a transaction without specifying a receiver address (to).

For example, Storage.bytecode contains the deployment code, which is slightly different from the contract code.

# deployment code. [] indicates the contract's code
>>> Storage.bytecode
'6080604052348015600f57600080fd5b5060ac8061001e6000396000f3fe[6080604052348015600f57600080fd5...]'

# contract code
>>> web3.eth.get_code('0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A')
'[6080604052348015600f57600080fd5...]'

Let's manually deploy the Storage contract (without using Storage.deploy()):

>>> tx = web3.eth.send_transaction({"from": a[0].address, "data": Storage.bytecode})
>>> tx = TransactionReceipt(tx)

>>> tx.info()
Transaction was Mined
---------------------
Tx Hash: 0x1477ec61faaa156c85b4c10a30719b3959c4a0bd62ad5b09f07aa16821c57ce7
From: 0xfed61D4a212A14143DC51f21d8D6617072e7a621
New UnknownContract address: 0x1b336F2650da1b0702Be8967532A3Aa8f16fdBa0
Block: 678
Gas Used: 90551 / 190551 (47.5%)

The transaction receipt indicates a newly created contract as part of the transaction.

How does the deployment work? In fact, the simplest constructor (i.e., an empty one in Solidity) literally just copies the contract code (appended at the end of the data) and returns it! We will walk through each instruction, but pseudocode (i.e., a decompilation of Storage.bytecode) looks like the following:

contract Contract {
    function main() {
        // heap top (discussed in lab04)
        memory[0x40:0x60] = 0x80;

        // non payable constructor (discussed in lab04)
        var var0 = msg.value;
        if (var0) { revert(memory[0x00:0x00]); }
    
        // HERE! copy the code@[0x1e:0xca] to the memory@[0x00:0xac]
        memory[0x00:0xac] = code[0x1e:0xca];
        return memory[0x00:0xac];
    }
}

You can check that the payload is indeed the contract's code as well.

>>> HexBytes(Storage.bytecode)[0x1e:0xca] == web3.eth.get_code('0xb6dbB5627379c0094408B3dbfE1f802E54b1c68A')
True
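The copy-and-return step of the constructor can be modeled in plain Python. This is a toy sketch: memory is a bytearray, the constructor and payload are placeholder bytes rather than the real Storage code, and the function names `codecopy` and `evm_return` are made up for illustration.

```python
def codecopy(memory, code, dest_offset, offset, size):
    """Toy CODECOPY: copy code[offset:offset+size] into memory at dest_offset."""
    chunk = code[offset:offset + size]
    chunk = chunk.ljust(size, b"\x00")        # reads past the end of code yield zeros
    if len(memory) < dest_offset + size:      # grow memory on demand
        memory.extend(bytes(dest_offset + size - len(memory)))
    memory[dest_offset:dest_offset + size] = chunk


def evm_return(memory, offset, size):
    """Toy RETURN: hand memory[offset:offset+size] back to the caller."""
    return bytes(memory[offset:offset + size])


# Placeholder data: a 0x1e-byte constructor followed by a 0xac-byte payload.
constructor = bytes(0x1e)
payload = bytes(range(0xac))
deployment = constructor + payload

memory = bytearray()
codecopy(memory, deployment, dest_offset=0x00, offset=0x1e, size=0xac)
runtime = evm_return(memory, 0x00, 0xac)
print(runtime == payload, len(deployment) == 0xca)  # True True
```

Whatever the constructor returns is what the node stores as the code of the new contract, which is why the comparison above holds.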

Let's read the EVM bytecode of the contract creation.

>>> tx.trace
    ...
    #
    # CODECOPY(destOffset /* in memory */, offset /* in code */, size)
    # 
    # This copies the code at 0x1e-0xca (the payload, 0xac bytes) to the memory at 0x0
    # 
    {
        'memory': [
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000000",
            "0000000000000000000000000000000000000000000000000000000000000080"
        ],
        'op': "CODECOPY",
        'pc': 25,
        'stack': [
            "00000000000000000000000000000000000000000000000000000000000000ac",
            "00000000000000000000000000000000000000000000000000000000000000ac",
            "000000000000000000000000000000000000000000000000000000000000001e",
            "0000000000000000000000000000000000000000000000000000000000000000"
        ]
    },
    #
    # Now, you see the contract's code in memory
    #
    # This pushes 0x00, the memory offset to return; 0xac (the size of the
    # contract code) is already on the stack
    {
        'memory': [
            "6080604052348015600f57600080fd5b506004361060325760003560e01c8063",
            "2e64cec11460375780636057361d14604c575b600080fd5b6000546040519081",
            "5260200160405180910390f35b605c6057366004605e565b600055565b005b60",
            "0060208284031215606f57600080fd5b503591905056fea26469706673582212",
            "20c909574d6a76c0095f3e1cf55edabf499e6feb67a09f312ee3fa40fe8cbf51",
            "8964736f6c634300081100330000000000000000000000000000000000000000"
        ],
        'op': "PUSH1",
        'pc': 26,
        'stack': [
            "00000000000000000000000000000000000000000000000000000000000000ac"
        ],
    },
    #
    # RETURN(offset, size)
    #
    # It terminates the current context with the value at offset w/ size, which
    # is the contract's code
    # 
    {
        'memory': [
            "6080604052348015600f57600080fd5b506004361060325760003560e01c8063",
            "2e64cec11460375780636057361d14604c575b600080fd5b6000546040519081",
            "5260200160405180910390f35b605c6057366004605e565b600055565b005b60",
            "0060208284031215606f57600080fd5b503591905056fea26469706673582212",
            "20c909574d6a76c0095f3e1cf55edabf499e6feb67a09f312ee3fa40fe8cbf51",
            "8964736f6c634300081100330000000000000000000000000000000000000000"
        ],
        'op': "RETURN",
        'pc': 28,
        'stack': [
            "00000000000000000000000000000000000000000000000000000000000000ac",
            "0000000000000000000000000000000000000000000000000000000000000000"
        ],
    }

# Task

In this tutorial, you are asked to write a small contract (in assembly) that transfers its ether to you when invoked.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.11;

contract Lab03 {
    address public owner;
    bool public completed;

    constructor(address _player) {
        owner = _player;
    }

    function submitContract(address _addr) external {
        uint256 size;
        assembly {
            size := extcodesize(_addr)
        }
        require(size < 0x20, "Need a contract whose size is less than 0x20 bytes");
        require(_addr.balance == 1 gwei, "The contract should have 1 gwei");

        uint balance = owner.balance;

        (bool success, ) = _addr.call(abi.encodeWithSignature("withdraw()"));
        require(success, "Failed to invoke the contract");

        require(_addr.balance == 0, "The contract still has some balance");
        require(owner.balance == balance + 1 gwei, "The balance wasn't transferred to the owner");

        completed = true;
    }
}

TIP

First, you might want to deploy some random bytes and see whether it really works. As shown in the lecture, use the assembly below for the deployment.

code = HexBytes("0xdeadbeef")
deployer = evm_asm("""\
PUSH1 %d
DUP1
PUSH1 0x0b
PUSH1 0x00
CODECOPY
PUSH1 0x00
RETURN
""" % len(code))

txid = web3.eth.send_transaction({"from": a[0].address, "data": deployer + code})
tx = web3.eth.get_transaction_receipt(txid)

# what's here?
web3.eth.get_code(tx.contractAddress)
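To see why the deployer pushes 0x0b as the code offset, the same seven instructions can be encoded by hand. This sketch does not use the `evm_asm` helper; the function name `build_deployer` is made up, and it assumes a payload shorter than 256 bytes so that its length fits in PUSH1.

```python
def build_deployer(code):
    """Hand-assemble the deployer above for a payload shorter than 256 bytes."""
    if len(code) > 0xff:
        raise ValueError("PUSH1 can only encode a one-byte length")
    return bytes([
        0x60, len(code),  # PUSH1 <size>
        0x80,             # DUP1        (size is used by CODECOPY and RETURN)
        0x60, 0x0b,       # PUSH1 0x0b  (payload starts right after the deployer)
        0x60, 0x00,       # PUSH1 0x00  (destination offset in memory)
        0x39,             # CODECOPY
        0x60, 0x00,       # PUSH1 0x00  (memory offset to return)
        0xf3,             # RETURN
    ])


code = bytes.fromhex("deadbeef")
deployer = build_deployer(code)
data = deployer + code
print(deployer.hex(), len(deployer))  # 600480600b6000396000f3 11
```

The deployer is exactly 11 (0x0b) bytes long, so the payload begins at offset 0x0b of the transaction data.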

TIP

You might want to read about CALL in detail. Or what about writing a contract in Solidity and checking its assembly? Ah, you might have to think about how to optimize your bytecode so that it fits under 0x20 bytes! In fact, our code fits in exactly 10 bytes 😃