The "UniversalAddress" structure

In Qtum (as well as Bitcoin and others) there are multiple "address" formats. These might look like QNPkSpiwVfvMoMK8wqpKN9TdyLgRwJTYKz for a standard pay-to-pubkeyhash address or the address might begin with the letter M indicating a pay-to-scripthash address. The prefix of the address (ie, the bit that determines if it starts with a Q or M) determines what format the address is, and thus how coins can be sent to that address. There is actually more data in address than what goes into the blockchain. An address is made like the following:

base58(version + data + checksum)

Version is a 1 byte prefix that determines what letter the base58 address begins with. Data is the actual data that will go on the blockchain. For a pubkeyhash address, this would be a ripemd160 hashed public key. The checksum is the bottom 4 bytes of sha256(sha256(version + data)). Satoshi himself designed this system to prevent mistakes where it is possible to send coins to a wrong address, etc.

Since Qtum was first thought of, the mutliple address formats has been difficult to reconcile with the design of the EVM. In the EVM, all addresses are assumed to be 160 bits. This does not allow for any simple method of disambiguating between a pubkeyhash or scripthash form of address, there is no room for the "version" field. Furthermore, it's always been rather confusing because Qtum uses base58 addresses everywhere like Bitcoin does, but the EVM and Solidity naturally uses hex addresses like Ethereum. It's easy to convert a base58 address to a hex address, but it strips away the version number and it's easy to make mistakes.

Our best workaround for this problem was basically to have soldiity contracts etc continue to use hex addresses. If the hex address exists in the EVM as a contract, it sends coins or whatever to that contract address. If the address does not exist, then it assumes that the address is a pubkeyhash address. With this system it's impossible to make use of pay-to-scripthash or multisig addresses.

With the design of the x86 VM in Qtum, this was one of the more important things to be fixed. It will be possible to use established and safe Bitcoin style multisig addresses to manage x86 contracts, and contracts can send and receive funds to/from these addresses. Exposing the base58 version number was not enough however. To avoid mistakes like sending mainnet Qtum to a testnet network address, the exact version number can vary per network. It would not be ideal to need to compile special versions of a contract that work only on testnet, or only on mainnet. So, instead the "UniversalAddress" structure was created. It is currently defined as the following:

#define ADDRESS_DATA_SIZE 32
typedef struct{
    //Do not modify this struct's fields
    //This is consensus critical!
    uint32_t version;
    uint8_t data[ADDRESS_DATA_SIZE];
} __attribute__((__packed__)) UniversalAddressABI;

//Note: we don't use Params().Base58Prefixes for these version numbers because otherwise
//contracts would need to use two different SDKs since the the base58 version for pubkeyhash etc changes
//between regtest, testnet, and mainnet
enum AddressVersion{
    UNKNOWN = 0,
    //legacy is either pubkeyhash or EVM, depending on if the address already exists
    LEGACYEVM = 1,
    PUBKEYHASH = 2,
    EVM = 3,
    X86 = 4,
    SCRIPTHASH = 5
};

The version number was made 4 bytes instead of 1 to allow for future flags and to keep with a 4 byte alignment. Allowing the structure to be 33 bytes might be rather inefficient due to unaligned memory access on some processors being more expensive. The address data is 32 bytes because it can fit a 256 bit hash. No current address format uses a full 256 bit hash though, so there has been some consideration of reducing that size to 20 bytes instead, to fit the standard ripemd160 bit hash.. but thus far we are leaning on the side of flexibility.

The version number used in UniversalAddresses are not the same as the base58 address version. Instead, these addresses are not intended to be used directly, but rather to be easy to write code around. Each value corresponds to pubkeyhash, scripthash, EVM, or x86 addresses. These version numbers are always the same regardless of network, and tools are provided to decode and map a base58 address into a UniversalAddress. Here is an example in a C contract:

UniversalAddressABI a;
if(qtumParseAddress("xCREYTgvdk3XzFNpdXkj35fZX4hJK2Ckpd", &a)){
    qtumError("Error parsing address");
}
Posted: 11/19/2018 5:46:12 AM

Memory map of the x86 VM

This is a continuing series of x86 blog posts that basically documents things as needed for both team members and the public.

Unlike the EVM, in x86 memory is used extensively, and gaps in memory is possible. For example, accessing memory at 0x1000 might give an error about there being no memory there, but 0x2000 might have something useful. The EVM is actually the odd one out here as most VM and CPU architectures use this concept to their advantage. Back when 1Mb of memory was rare, the 8086 actually abused this by having most memory located starting at 0, but with the read-only BIOS memory located at 0xF0000, and sometimes with various sections of memory that actually connected to external devices rather than to RAM.

Anyway, Qtum's x86 VM design abuses this feature as well so that certain areas of memory can be shared etc. This is Qtum's core memory map; explanation for each memory section is below:

  • Emergency stack - 0x100, length 64, read-write
  • Contract Code - 0x1000, length 0x10000 (max), read-only
  • Contract Data - 0x100000, length 0x10000 (max), read-write
  • Stack memory - 0x200000, length 8196, read-write
  • Execution data - 0xD0000000, length TBD, read-only
  • Transaction data - 0xD1000000, length dynamic, read-only
  • Blockchain data - 0xD2000000, length TBD, read-only

Note that a memory section can be "init on read". ie, there's not necessarily a need to construct the data for a section before a contract needs to read that data. There may also be additional gas costs for the first access of such a memory area.

Emergency Stack

This is an emergency stack that is used when a double-fault exception occurs. Unused right now since exception support is not implemented

Contract Code

This is where the actual contract code is loaded into. This is read-only for security and optimization purposes. The actual size of this is determined by how much contract code needs to be loaded. Memory beyond the actual size can be read, but it can not be written to since it's read-only, and is fixed at 0

Contract Data

This is where contract data is loaded into. Unlike in the EVM, if you have a variable with some value, there is no need for CPU code to be written to reserve some bit of memory and then set it to that value. Variables in x86 is instead just a simple pointer to a memory address. That memory address is expected by the loader (in normal operating systems, the executable parser and loader.. in Qtum, the VM initialization process). This is nice because we can use fast native code to initialize all the variables in one quick "memcpy" operation. Variable pointers end up pointing to somewhere in this memory section. The actual size is currently fixed at 1Mb, but this will be changed later beyond the prototype.

Note that read-only data is (depending on linker configuration) stored in the code section. Read-only data in common programs includes strings, constants, etc.

Stack memory

This is a section of memory reserved for the x86 call stack. The call stack is used for passing arguments to functions, storing return addresses, local variables etc. The reason for not overloading the data section with stack data is that this section of memory is surrounded by missing memory. This means that if the stack over or under flows it can be immediately "detected" by getting an error thrown. Explicit errors are almost always preferrable to continuing to run with corrupted state.

Execution data

This is data that is read-only and specific to each contract execution. This means that the data will be different when calling a contract within a contract. This data includes things like the sender address (what address initiated this execution), gas limit, and other data.

Transaction data

This is dynamic length data that encodes the complete transaction data which caused the current contract execution. This includes all inputs and outputs. There will be both raw script access, as well as simplified "100 coins was sent to address A" type of data that can be used very easily. There will be helper library functions to assist in script decoding etc.

Blockchain data

This is read-only global blockchain data that is constant for all contract executions in the current block. This includes things like the block gas limit, current block height, previous block hashes, current difficulty, etc.

Rationale

The point of most of this data being available in memory rather than in syscalls is primarily for two reasons:

  • Every syscall has a non-negligible security risk
  • Syscalls are expensive in contract code size, contract code gas overhead, and in VM implementation speeds (it means exiting JIT etc)

Every syscall is an interface for exposing contract code to the outside world. The majority of OS kernel security exploits come from buggy syscall behavior. Thus, there is a big benefit to exposing as few syscalls to the VM as possible. Less ways for the VM to escape the sandbox, less places to audit.

The other factor that encouraged this is that the syscall process is quite expensive in multiple ways. The code size is small, but not negligible. It requires a minimum of about 30 bytes to make a syscall from C. Other languages might be smaller, but are probably bigger. The expensive part of the syscall process is that there must be translation between the syscall and the standard C interface. This involves calling a helper function, preserving a few registers, getting a bunch of items from the stack and storing them in registers, doing the actual syscall, and then finally restoring those previously preserved registers. The 30 byte point previously actually only covers the helper function part. The total number of bytes executed is significantly more but not represented in contract size since that code is written only once and used for every syscall.

And then finally, beyond actual contract ineffeciencies, each syscall can potentially wreck VM performance. In a JIT workflow, the actual hardware CPU can cache the JIT overhead code and JIT compiled contract code very quickly, but upon leaving that section to some other piece of code, it typically must clear at least portions of the cache to make room for this new code. And in a true virtualization workflow using hardware support it becomes even more expensive as it can involve a context switch and hardware level system call. This is a common problem with most languages involving a VM. This includes Webassembly (up until recently, I think that team made a major improvement somehow), Javascript/V8, etc.

Prototype

Qtum x86 is still in prototype. These memory areas are fairly certain, but as with everything in this series, things are still up in the air and this should not be considered a final specification

Posted: 10/27/2018 12:20:38 AM

Exposed Formats in Qtum x86

The following formats are exposed to interface and smart contract code

Runtime Length Encoding Format

On the blockchain all contract data for x86 uses a variant of Runtime Length Encoding (RLE) for bytes that are equal to 0. Non-zero data is not encoded. When a 0 is encountered in the bytestream it becomes a length to indicate how many zero bytes follow it. Having a length of 0 for the RLE is expressly forbidden and will result in an invalid transaction that can not be included in blocks or in the mempool

The RLE payload is prefixed with a 32 bit integer which contains a 24-bit decoded size field. If this size does not exactly match the decoded payload, the transaction is invalid.

the other 8 bits remaining in the 32 bit integer is reserved as a version number for potentially implementing other compression formats

Note that the size of the payload can be verified without allocating additional memory.

Relevant code comment:

//format: decoded payload length (uint32_t) | payload
//the compression only compresses 0 bytes. 
//when 0x00 is encountered in the bytestream, it is transformed converted to a run-length encoding
//encoding examples:
//0x00 00 00 00 -> 0x00 04
//0x00 -> 0x00 01
//0x00 00 -> 0x00 02
//0x00 (repeated 500 times) -> 0x00 0xFF 0x00 0xF5
//as part of the encoding process, a 32 bit length field is prefixed to the payload

The RLE encoding is used on top of any other formats.

Code for encoding and decoding is included in x86Lib, but should be trivial to implement in any language needed.

The length field may be removed at a later time as right now it provides no benefit. The original rationale was so that a constant memory allocation profile could be used, rather than potentially needing multiple allocations. However, the decoded format of data is stored in backing database in Qtum right now, making decoding a one-time operation.

Smart contract bytecode format

The x86 VM works differently from how the EVM encodes and stores contracts. In the EVM, a flat array of opcodes is all that is given. These opcodes are then executed and they decide what part of those opcodes and data is actually stored in permanent storage as the contract bytecode.

With Qtum's x86 VM it's much simpler and generally easier to reason about. A custom binary format with separate sections for options (ie, contract configuration info etc), initialized data, and code. The entire binary is stored directly in permanent storage. This prevents the situation in the EVM where a contract must also contain a contract saver. This is not too complicated in the EVM, but with existing paradigms used in x86 development this would be extremely unusual, probably necessitating the need to write two completely separate programs in order to prevent assumed address memory errors etc.

Tooling is provided in the x86Lib testbench program to convert an ELF program to this custom binary format. The rationale for not using ELF directly is that ELF contains a significant amount of potential complexity, even if most ELF programs are fairly simple to parse. Rather than making a special case "you can't use these features in ELF" type implementation, it is instead easier to use a very simple custom format and providee conversion tools. Because of this, it is also possible to potentially provide flat binary, PE, and other binary format support.

The custom format is a flat byte array with a prefix map that indicates the length of each section:

//The data field available is only a flat data field, so we need some format for storing
//Code, data, and options.
//Thus, it is prefixed with 4 uint32 integers.
//1st, the size of options, 2nd the size of code, 3rd the size of data
//4th is unused (for now) but is kept for padding and alignment purposes
struct ContractMapInfo {
    //This structure is CONSENSUS-CRITICAL
    //Do not add or remove fields nor reorder them!
    uint32_t optionsSize;
    uint32_t codeSize;
    uint32_t dataSize;
    uint32_t reserved;
} __attribute__((__packed__));

After this map, just a flat array of bytes remains.

Flow:

  • LOCATION = 0, which is the first byte after ContractMapInfo
  • copy option data into a buffer from LOCATION to LOCATION + optionsSize. Increment LOCATION by optionsSize
  • copy code data into a buffer from LOCATION to LOCATION + codeSize. Increment LOCATION by codeSize
  • etc etc...

Section description:

  • Options is currently unused and thus optionsSize must be 0. Later it will contain dependency graphs, trusted contract options, and other specialized configuration data. This data is not directly exposed to contract code, unless the option specifically indicates so
  • Code is read-only executable code. This is loaded into the contract's memory space at 0x1000 and currently has a max size of 1Mb. The contract is not allowed to modify the contents of this memory. Their may also be a gas surcharge for executing code outside of this section, since this may make JIT and other optimizations more difficult. Right now this section has a fixed size of 1Mb and any data beyond codeSize is set to 0.
  • Data is initialized and read-write data. This is loaded into the contract's memory space at 0x100000 and also has a 1Mb limit. In some documents this section of memory is also referred to as "scratch memory". Right now this space has a fixed size of 1Mb and any data beyond dataSize is set to 0. The ".BSS" section of ELF files (uninitialized memory) will end up in this memory area as well, but since there is no data associated with it, there is no need to provide any info about .BSS to the VM. The ELF file converter will check that the data size + bss size does not exceed 1Mb.

Smart contract transaction-call format

Contracts have a call stack which can be used to send arguments to called contracts and for called contracts to return data. However, having numerous data entities in a transaction is non-trivial to deal with for verification purposes. So, there will only be 1 actual data field in the transaction format for calling contracts (same as we currently have for the EVM). However, an ABI is provided to simplify the smart contract's responsibiltiy for parsing this. The format can of course by ignored by just having one large argument that is the byte array you want to pass.

Flow:

  • LOCATION = 0, the first byte of the data
  • 32 bit integer, SIZE, is read from LOCATION
  • LOCATION is incremented by 2 (size of integer)
  • Buffer is allocated of size SIZE and then memory is copied to it from LOCATION to LOCATION + SIZE
  • LOCATION is incremented by SIZE
  • Repeat until no data remains

System call interface

Qtum uses syscalls in a very similar way to Linux based systems. The stack is not used at all, and only registers are used. If more arguments are needed than registers exist, then one register needs to point to an area of memory wiht these additional arguments in some kind of structure. The interrupt number 0x40 is used for all Qtum syscalls at this time.

Registers for calling:

  • EAX - system call number
  • EBX, ECX, EDX, ESI, EDI, EBP - arguments 1-6
  • ESP, EFLAGS - unused

Registers upon return from syscall:

  • EAX - return value (0 means success typically, excluding operations that return a length)
  • EBX, ECX, EDX, ESI, EBP, ESP, EFLAGS - not modified

A wrapper function is provided in libqtum to simplify this interface for C ABI using languages:

.global __qtum_syscall
// long syscall(long number, long p1, long p2, long p3, long p4, long p5, long p6)
__qtum_syscall:
  push %ebp
  mov %esp,%ebp
  push %edi
  push %esi
  push %ebx
  mov 8+0*4(%ebp),%eax
  mov 8+1*4(%ebp),%ebx
  mov 8+2*4(%ebp),%ecx
  mov 8+3*4(%ebp),%edx
  mov 8+4*4(%ebp),%esi
  mov 8+5*4(%ebp),%edi
  mov 8+6*4(%ebp),%ebp
  int $0x40
  pop %ebx
  pop %esi
  pop %edi
  pop %ebp
  ret

Most people should never need to do anything with syscalls however, as libqtum provides easy to use wrapper functions for each syscall.

Posted: 10/22/2018 10:52:10 PM

Thoughts and Goals on Qtum's x86 VM

So we've been notoriously quiet about what the x86 VM will allow in Qtum, beyond just more programming language support. This is basically because the design process is easy to make a mediocre version on, but hard to build a well-optimized, efficient, and easy to use version. So here I won't dip into the details of the design just yet, but I'd like to announce the goals we have in mind.

Programming Language Support

Of course, programming language support is a great reason to build this x86 VM. I personally would like to make 2018 the year of smart contracts written in Rust. Rust is incredibly efficient, lightweight, and above all, focused on safety and avoidance of programmer mistakes. There is a lot more than just Rust of course though. Bringing easy to use languages like C# or Go is also a goal.

The basic gist of the x86 VM's power is that you can take pretty much any existing compiler or programming language, and just make some modifications so that it can run on Qtum's operating system like environment. Almost every compiler out there already has x86 support, so that actual bytecode and architecture support is already there.

Standard Library

One of the common complaints about the EVM is a lack of a standard library. This is not only a developer annoyance, but also is a direct consumer of precious blockchain space. Providing a standard library allows not only for Qtum's blockchain to be made slimmer and more efficient, but also it allows for these standard library functions to have special internal code, similar to Ethereum's pre-compiled contracts. This functionality can happen without needing to add special support for a new pre-compiled contract, and then relying on contracts to begin using this special functionality. Instead, contracts can use the same old unoptimized code, and when they call it, that code is opaque and can be optimized at will without needing to make any change to consensus. Optimization of the standard library like this is an implementation detail. However, as the ecosystem's implementations become efficient, the gas model can be tuned for these functions to make their gas cost reflect their true resource cost.

Additionally, the standard library need not require a fork to extend. Common functions could easily enter the specialized standard library memory space by using the Decentralized Governance Protocol. This mechanism also allows for patching bugs in the standard library, though that power must be especially audited, as smart contracts can come to rely on buggy behavior for proper function. And so, potential upgrades to standard library functions may be only by an opt-in function, or it may not exist at all. Our goal with the DGP is always to keep it conservative and to ensure that even in the case of complete compromise, smart contract logic is unaffected and user's funds are safe.

Optimized Gas Model

This part is very tricky, and just as a forewarning, we will most likely launch the x86 VM initially with a very simple gas model akin to the EVM. However, because of how much more powerful the ISA and features available in x86 are, there is some fairly straightforward ways to advance this space.

One of the simple but powerful solutions will be not only providing a standard library with common functions used by smart contracts and programs in general. But also there is a fairly straightforward path to making these standard library functions have a human-set cost, rather than requiring for these functions to rely on the simplistic and general gas model used for general computation. So, for example, strlen in the simplistic gas model may require 90 gas per character in the string. However, upon developer inspection, it's discovered strlen is extremely cheap to actually execute on the Qtum VM. So, Qtum's Decentralized Governance Protocol is used to propose a special gas rule for this function. So, now the cost of calling this function might be a flat initial cost of 10 gas, plus 1 gas per character. It is impossible to make a perfectly accurate gas model, so, we want to utilize the DGP mechanism in Qtum in order to make this approximation as optimized and efficient as possible.

Unlocking the Full Power of the AAL

Right now, we tend to pitch the Account Abstraction Layer as "what was necessary to make the EVM work". However, the AAL has a lot more power hidden in it beyond what was needed to make the EVM work on top of Qtum. Qtum was designed from the start to support multiple virtual machines, and the EVM was just the first one to be supported. That being said, the Account Abstraction Layer is currently limited by what can easily be exposed through the EVM. The x86 VM we are designing will not face such limitations. Some of the powerful things we want to expose because of this:

  • P2SH (ie, Bitcoin style) multi-sig as a first class citizen, for both sending and receiving payments from smart contracts
  • Raw transaction script support to send custom transactions to take full advantage of the Script functionality in Qtum
  • Allowing segwit transactions to contain and execute smart contracts

New Possibilities for Smart Contracts

Using x86 we get the Von Neumman architecture of computing. This means that code is data and vice-versa. This feature, as well as things like hardware/software interrupts allow for potential operating-system like constructs and features to be integrated into a single smart contract with multiple semi-trusted participants. This includes things like cooperative-multitasking, pause and resume execution (ie, resuming execution in a later transaction), and watchdog timers (though instead of "time", it would work on gas). This also of course includes a direct mechanism of updating contract bytecode without needing to transfer funds and data to a new contract.

The x86 instruction set also includes many specialized functions to control things like specialized permissions for certain code spaces, paging and memory mapping, as well as system calls. It is not expected for Qtum to expose most of these specialized system-level instructions. They greatly complicate the gas model design, and make everything significantly harder to optimize.

However, despite that let-down, there are relatively few things relevant to smart contracts within that set of instructions. There is very little practical need on a public blockchain for having separate ring-0 privileged code and ring-3 unprivileged code within a single smart contract's code. Where the use cases tend to really be prevalent for these features is in privileged and semi-privileged blockchains. So, when we begin to focus on Qtum's enterprise side, we will of course revisit this.

First-Class Oracles

In the x86 VM model for transactions, there is no need to call a contract if you know the data you need from that contract. It will be possible to load data directly from the external contract's storage space. This allows for first-class oracles, where contracts can establish their own ABI and API mechanisms to standardize their storage space. Then, a contract can simply load the storage data directly, with no need to do an expensive call that requires loading the entire contract bytecode, creating a new VM environment, etc. This will finally bring Oracles to be first-class citizens on the blockchain, instead of being limited by the functionality of smart contracts. s

Blockchain Analysis

The large memory space available to x86 as well as it's efficient set of opcodes for general computation allows for the potential of blockchain analysis that is just a pipe-dream with the EVM. It's possible to expose full blockchain data for contract analysis. This could allow AI-based smart contracts to automatically monitor the blockchain, potentially functioning as an oracle, in order to allow smart contracts to adjust their own behavior so that they operate as efficiently as possible in the current network conditions. This blockchain data could include full transaction data or statistics computed by the blockchain nodes (in a consensus-critical way). There is little downside to exposing this data since it is fully constant and only results in a few Mb of additional memory usage.

Alternative Data Storage

Currently, the EVM forces everyone to use 32-byte keys that point to 32-byte data. This can be quite painful to manage, especially when considering fragmentation and maintenance of that space. Moreover, there is no great reason for this. So, in the x86 machine we intend to give smart contracts a general purpose key-value store. So, you can store anything from 1 to some large number of bytes as a key, and have it point to an equally variant value. The proposed gas model for this functionality so far basically involves a flat-fee for writing/reading to this database, and then a per-byte rate fee for each byte that needs to be touched. This functionality would of course still be covered under the stateRootHash so that SPV wallets can interact with the smart contracts using this database.

Explicit Dependency Trees

Another, somewhat lofty, goal is to allow for dependency trees for smart contracts to be explicitly declared and enforced. This would be an opt-in function only so that contracts touching unknown smart contracts are still possible. However, for contracts that know exactly what dependencies they will rely on, this allows for these contracts to be executed in parallel in some cases, and thus they may received reduced gas costs, among other benefits. This would be a major scaling advantage for x86 based smart contracts that choose to opt-in to this feature.

Why x86? Why not ARM?

I've heard many people ask "why x86? Why not ARM?". This is a very good question. We think that x86 is the most well understood platform for virtual machines and emulators. There has been decades of collective mindshare put forth into making efficient and secure VMs for x86. If you want to really research this point, look no further than the many questions on Stackoverflow about making the Android emulator operate at a reasonable performance level. Basically, the solution in most cases is to use an x86 virtual machine instead of an ARM one. There are of course some projects out there for ARM VMs that aren�t regarded as terrible, like Qemu and I'm sure others I don't know about. But the point is that x86 emulation is a known problem with a fairly straightforward solution. The famous Intel Manuals are regarded as some of the most well written and clear documents available for a CPU architecture like this. And there are even some kids in high school that write x86 emulators for fun (haha, that's me!). The compiler support for ARM is always improving, but it is still no where near on par with x86's support.

Now, all that being said, x86 is by no means a simple ISA. It's existed since the 70s with the 8008, and has stayed backwards compatible since the 8088/8086. This really takes a toll on it's design, and is why there are just a vast amount of opcodes available, including some that are arguably useless and actually execute slower on hardware than if you wrote your code to avoid those instructions. My favorite example of that is the cursed Binary Coded Decimal set of instructions which haven't been popular since the 80s. However, this is an added complexity that is a few hours of extra work. The benefits to using x86 far outweigh the costs.

ARM is still on our radar though, especially for IoT use cases which typically operate natively on ARM processors. Right now we are focused on x86, but after that, who knows, especially for enterprise and permissioned blockchains.

Posted: 10/2/2017 8:01:55 AM