The "UniversalAddress" structure

In Qtum (as well as Bitcoin and others) there are multiple "address" formats. These might look like QNPkSpiwVfvMoMK8wqpKN9TdyLgRwJTYKz for a standard pay-to-pubkeyhash address or the address might begin with the letter M indicating a pay-to-scripthash address. The prefix of the address (ie, the bit that determines if it starts with a Q or M) determines what format the address is, and thus how coins can be sent to that address. There is actually more data in address than what goes into the blockchain. An address is made like the following:

base58(version + data + checksum)

Version is a 1 byte prefix that determines what letter the base58 address begins with. Data is the actual data that will go on the blockchain. For a pubkeyhash address, this would be a ripemd160 hashed public key. The checksum is the bottom 4 bytes of sha256(sha256(version + data)). Satoshi himself designed this system to prevent mistakes where it is possible to send coins to a wrong address, etc.

Since Qtum was first thought of, the mutliple address formats has been difficult to reconcile with the design of the EVM. In the EVM, all addresses are assumed to be 160 bits. This does not allow for any simple method of disambiguating between a pubkeyhash or scripthash form of address, there is no room for the "version" field. Furthermore, it's always been rather confusing because Qtum uses base58 addresses everywhere like Bitcoin does, but the EVM and Solidity naturally uses hex addresses like Ethereum. It's easy to convert a base58 address to a hex address, but it strips away the version number and it's easy to make mistakes.

Our best workaround for this problem was basically to have soldiity contracts etc continue to use hex addresses. If the hex address exists in the EVM as a contract, it sends coins or whatever to that contract address. If the address does not exist, then it assumes that the address is a pubkeyhash address. With this system it's impossible to make use of pay-to-scripthash or multisig addresses.

With the design of the x86 VM in Qtum, this was one of the more important things to be fixed. It will be possible to use established and safe Bitcoin style multisig addresses to manage x86 contracts, and contracts can send and receive funds to/from these addresses. Exposing the base58 version number was not enough however. To avoid mistakes like sending mainnet Qtum to a testnet network address, the exact version number can vary per network. It would not be ideal to need to compile special versions of a contract that work only on testnet, or only on mainnet. So, instead the "UniversalAddress" structure was created. It is currently defined as the following:

#define ADDRESS_DATA_SIZE 32
typedef struct{
    //Do not modify this struct's fields
    //This is consensus critical!
    uint32_t version;
    uint8_t data[ADDRESS_DATA_SIZE];
} __attribute__((__packed__)) UniversalAddressABI;

//Note: we don't use Params().Base58Prefixes for these version numbers because otherwise
//contracts would need to use two different SDKs since the the base58 version for pubkeyhash etc changes
//between regtest, testnet, and mainnet
enum AddressVersion{
    UNKNOWN = 0,
    //legacy is either pubkeyhash or EVM, depending on if the address already exists
    LEGACYEVM = 1,
    PUBKEYHASH = 2,
    EVM = 3,
    X86 = 4,
    SCRIPTHASH = 5
};

The version number was made 4 bytes instead of 1 to allow for future flags and to keep with a 4 byte alignment. Allowing the structure to be 33 bytes might be rather inefficient due to unaligned memory access on some processors being more expensive. The address data is 32 bytes because it can fit a 256 bit hash. No current address format uses a full 256 bit hash though, so there has been some consideration of reducing that size to 20 bytes instead, to fit the standard ripemd160 bit hash.. but thus far we are leaning on the side of flexibility.

The version number used in UniversalAddresses are not the same as the base58 address version. Instead, these addresses are not intended to be used directly, but rather to be easy to write code around. Each value corresponds to pubkeyhash, scripthash, EVM, or x86 addresses. These version numbers are always the same regardless of network, and tools are provided to decode and map a base58 address into a UniversalAddress. Here is an example in a C contract:

UniversalAddressABI a;
if(qtumParseAddress("xCREYTgvdk3XzFNpdXkj35fZX4hJK2Ckpd", &a)){
    qtumError("Error parsing address");
}
Posted: 11/19/2018 5:46:12 AM

Memory map of the x86 VM

This is a continuing series of x86 blog posts that basically documents things as needed for both team members and the public.

Unlike the EVM, in x86 memory is used extensively, and gaps in memory is possible. For example, accessing memory at 0x1000 might give an error about there being no memory there, but 0x2000 might have something useful. The EVM is actually the odd one out here as most VM and CPU architectures use this concept to their advantage. Back when 1Mb of memory was rare, the 8086 actually abused this by having most memory located starting at 0, but with the read-only BIOS memory located at 0xF0000, and sometimes with various sections of memory that actually connected to external devices rather than to RAM.

Anyway, Qtum's x86 VM design abuses this feature as well so that certain areas of memory can be shared etc. This is Qtum's core memory map; explanation for each memory section is below:

  • Emergency stack - 0x100, length 64, read-write
  • Contract Code - 0x1000, length 0x10000 (max), read-only
  • Contract Data - 0x100000, length 0x10000 (max), read-write
  • Stack memory - 0x200000, length 8196, read-write
  • Execution data - 0xD0000000, length TBD, read-only
  • Transaction data - 0xD1000000, length dynamic, read-only
  • Blockchain data - 0xD2000000, length TBD, read-only

Note that a memory section can be "init on read". ie, there's not necessarily a need to construct the data for a section before a contract needs to read that data. There may also be additional gas costs for the first access of such a memory area.

Emergency Stack

This is an emergency stack that is used when a double-fault exception occurs. Unused right now since exception support is not implemented

Contract Code

This is where the actual contract code is loaded into. This is read-only for security and optimization purposes. The actual size of this is determined by how much contract code needs to be loaded. Memory beyond the actual size can be read, but it can not be written to since it's read-only, and is fixed at 0

Contract Data

This is where contract data is loaded into. Unlike in the EVM, if you have a variable with some value, there is no need for CPU code to be written to reserve some bit of memory and then set it to that value. Variables in x86 is instead just a simple pointer to a memory address. That memory address is expected by the loader (in normal operating systems, the executable parser and loader.. in Qtum, the VM initialization process). This is nice because we can use fast native code to initialize all the variables in one quick "memcpy" operation. Variable pointers end up pointing to somewhere in this memory section. The actual size is currently fixed at 1Mb, but this will be changed later beyond the prototype.

Note that read-only data is (depending on linker configuration) stored in the code section. Read-only data in common programs includes strings, constants, etc.

Stack memory

This is a section of memory reserved for the x86 call stack. The call stack is used for passing arguments to functions, storing return addresses, local variables etc. The reason for not overloading the data section with stack data is that this section of memory is surrounded by missing memory. This means that if the stack over or under flows it can be immediately "detected" by getting an error thrown. Explicit errors are almost always preferrable to continuing to run with corrupted state.

Execution data

This is data that is read-only and specific to each contract execution. This means that the data will be different when calling a contract within a contract. This data includes things like the sender address (what address initiated this execution), gas limit, and other data.

Transaction data

This is dynamic length data that encodes the complete transaction data which caused the current contract execution. This includes all inputs and outputs. There will be both raw script access, as well as simplified "100 coins was sent to address A" type of data that can be used very easily. There will be helper library functions to assist in script decoding etc.

Blockchain data

This is read-only global blockchain data that is constant for all contract executions in the current block. This includes things like the block gas limit, current block height, previous block hashes, current difficulty, etc.

Rationale

The point of most of this data being available in memory rather than in syscalls is primarily for two reasons:

  • Every syscall has a non-negligible security risk
  • Syscalls are expensive in contract code size, contract code gas overhead, and in VM implementation speeds (it means exiting JIT etc)

Every syscall is an interface for exposing contract code to the outside world. The majority of OS kernel security exploits come from buggy syscall behavior. Thus, there is a big benefit to exposing as few syscalls to the VM as possible. Less ways for the VM to escape the sandbox, less places to audit.

The other factor that encouraged this is that the syscall process is quite expensive in multiple ways. The code size is small, but not negligible. It requires a minimum of about 30 bytes to make a syscall from C. Other languages might be smaller, but are probably bigger. The expensive part of the syscall process is that there must be translation between the syscall and the standard C interface. This involves calling a helper function, preserving a few registers, getting a bunch of items from the stack and storing them in registers, doing the actual syscall, and then finally restoring those previously preserved registers. The 30 byte point previously actually only covers the helper function part. The total number of bytes executed is significantly more but not represented in contract size since that code is written only once and used for every syscall.

And then finally, beyond actual contract ineffeciencies, each syscall can potentially wreck VM performance. In a JIT workflow, the actual hardware CPU can cache the JIT overhead code and JIT compiled contract code very quickly, but upon leaving that section to some other piece of code, it typically must clear at least portions of the cache to make room for this new code. And in a true virtualization workflow using hardware support it becomes even more expensive as it can involve a context switch and hardware level system call. This is a common problem with most languages involving a VM. This includes Webassembly (up until recently, I think that team made a major improvement somehow), Javascript/V8, etc.

Prototype

Qtum x86 is still in prototype. These memory areas are fairly certain, but as with everything in this series, things are still up in the air and this should not be considered a final specification

Posted: 10/27/2018 12:20:38 AM

Exposed Formats in Qtum x86

The following formats are exposed to interface and smart contract code

Runtime Length Encoding Format

On the blockchain all contract data for x86 uses a variant of Runtime Length Encoding (RLE) for bytes that are equal to 0. Non-zero data is not encoded. When a 0 is encountered in the bytestream it becomes a length to indicate how many zero bytes follow it. Having a length of 0 for the RLE is expressly forbidden and will result in an invalid transaction that can not be included in blocks or in the mempool

The RLE payload is prefixed with a 32 bit integer which contains a 24-bit decoded size field. If this size does not exactly match the decoded payload, the transaction is invalid.

the other 8 bits remaining in the 32 bit integer is reserved as a version number for potentially implementing other compression formats

Note that the size of the payload can be verified without allocating additional memory.

Relevant code comment:

//format: decoded payload length (uint32_t) | payload
//the compression only compresses 0 bytes. 
//when 0x00 is encountered in the bytestream, it is transformed converted to a run-length encoding
//encoding examples:
//0x00 00 00 00 -> 0x00 04
//0x00 -> 0x00 01
//0x00 00 -> 0x00 02
//0x00 (repeated 500 times) -> 0x00 0xFF 0x00 0xF5
//as part of the encoding process, a 32 bit length field is prefixed to the payload

The RLE encoding is used on top of any other formats.

Code for encoding and decoding is included in x86Lib, but should be trivial to implement in any language needed.

The length field may be removed at a later time as right now it provides no benefit. The original rationale was so that a constant memory allocation profile could be used, rather than potentially needing multiple allocations. However, the decoded format of data is stored in backing database in Qtum right now, making decoding a one-time operation.

Smart contract bytecode format

The x86 VM works differently from how the EVM encodes and stores contracts. In the EVM, a flat array of opcodes is all that is given. These opcodes are then executed and they decide what part of those opcodes and data is actually stored in permanent storage as the contract bytecode.

With Qtum's x86 VM it's much simpler and generally easier to reason about. A custom binary format with separate sections for options (ie, contract configuration info etc), initialized data, and code. The entire binary is stored directly in permanent storage. This prevents the situation in the EVM where a contract must also contain a contract saver. This is not too complicated in the EVM, but with existing paradigms used in x86 development this would be extremely unusual, probably necessitating the need to write two completely separate programs in order to prevent assumed address memory errors etc.

Tooling is provided in the x86Lib testbench program to convert an ELF program to this custom binary format. The rationale for not using ELF directly is that ELF contains a significant amount of potential complexity, even if most ELF programs are fairly simple to parse. Rather than making a special case "you can't use these features in ELF" type implementation, it is instead easier to use a very simple custom format and providee conversion tools. Because of this, it is also possible to potentially provide flat binary, PE, and other binary format support.

The custom format is a flat byte array with a prefix map that indicates the length of each section:

//The data field available is only a flat data field, so we need some format for storing
//Code, data, and options.
//Thus, it is prefixed with 4 uint32 integers.
//1st, the size of options, 2nd the size of code, 3rd the size of data
//4th is unused (for now) but is kept for padding and alignment purposes
struct ContractMapInfo {
    //This structure is CONSENSUS-CRITICAL
    //Do not add or remove fields nor reorder them!
    uint32_t optionsSize;
    uint32_t codeSize;
    uint32_t dataSize;
    uint32_t reserved;
} __attribute__((__packed__));

After this map, just a flat array of bytes remains.

Flow:

  • LOCATION = 0, which is the first byte after ContractMapInfo
  • copy option data into a buffer from LOCATION to LOCATION + optionsSize. Increment LOCATION by optionsSize
  • copy code data into a buffer from LOCATION to LOCATION + codeSize. Increment LOCATION by codeSize
  • etc etc...

Section description:

  • Options is currently unused and thus optionsSize must be 0. Later it will contain dependency graphs, trusted contract options, and other specialized configuration data. This data is not directly exposed to contract code, unless the option specifically indicates so
  • Code is read-only executable code. This is loaded into the contract's memory space at 0x1000 and currently has a max size of 1Mb. The contract is not allowed to modify the contents of this memory. Their may also be a gas surcharge for executing code outside of this section, since this may make JIT and other optimizations more difficult. Right now this section has a fixed size of 1Mb and any data beyond codeSize is set to 0.
  • Data is initialized and read-write data. This is loaded into the contract's memory space at 0x100000 and also has a 1Mb limit. In some documents this section of memory is also referred to as "scratch memory". Right now this space has a fixed size of 1Mb and any data beyond dataSize is set to 0. The ".BSS" section of ELF files (uninitialized memory) will end up in this memory area as well, but since there is no data associated with it, there is no need to provide any info about .BSS to the VM. The ELF file converter will check that the data size + bss size does not exceed 1Mb.

Smart contract transaction-call format

Contracts have a call stack which can be used to send arguments to called contracts and for called contracts to return data. However, having numerous data entities in a transaction is non-trivial to deal with for verification purposes. So, there will only be 1 actual data field in the transaction format for calling contracts (same as we currently have for the EVM). However, an ABI is provided to simplify the smart contract's responsibiltiy for parsing this. The format can of course by ignored by just having one large argument that is the byte array you want to pass.

Flow:

  • LOCATION = 0, the first byte of the data
  • 32 bit integer, SIZE, is read from LOCATION
  • LOCATION is incremented by 2 (size of integer)
  • Buffer is allocated of size SIZE and then memory is copied to it from LOCATION to LOCATION + SIZE
  • LOCATION is incremented by SIZE
  • Repeat until no data remains

System call interface

Qtum uses syscalls in a very similar way to Linux based systems. The stack is not used at all, and only registers are used. If more arguments are needed than registers exist, then one register needs to point to an area of memory wiht these additional arguments in some kind of structure. The interrupt number 0x40 is used for all Qtum syscalls at this time.

Registers for calling:

  • EAX - system call number
  • EBX, ECX, EDX, ESI, EDI, EBP - arguments 1-6
  • ESP, EFLAGS - unused

Registers upon return from syscall:

  • EAX - return value (0 means success typically, excluding operations that return a length)
  • EBX, ECX, EDX, ESI, EBP, ESP, EFLAGS - not modified

A wrapper function is provided in libqtum to simplify this interface for C ABI using languages:

.global __qtum_syscall
// long syscall(long number, long p1, long p2, long p3, long p4, long p5, long p6)
__qtum_syscall:
  push %ebp
  mov %esp,%ebp
  push %edi
  push %esi
  push %ebx
  mov 8+0*4(%ebp),%eax
  mov 8+1*4(%ebp),%ebx
  mov 8+2*4(%ebp),%ecx
  mov 8+3*4(%ebp),%edx
  mov 8+4*4(%ebp),%esi
  mov 8+5*4(%ebp),%edi
  mov 8+6*4(%ebp),%ebp
  int $0x40
  pop %ebx
  pop %esi
  pop %edi
  pop %ebp
  ret

Most people should never need to do anything with syscalls however, as libqtum provides easy to use wrapper functions for each syscall.

Posted: 10/22/2018 10:52:10 PM