Whether you’re interested in becoming a developer for blockchain applications, or you’re just looking to understand what happens under the hood when you send bitcoin to a friend, it’s good to have a working knowledge of what happens when you create and broadcast Bitcoin transactions to the Bitcoin network. Why?
Because transactions are the basic entity on top of which the bitcoin blockchain is constructed. Transactions are the result of a brilliant collision of cryptography, data structures, and simple non-Turing-complete scripting. They’re simple enough that common transaction types aren’t overly complex, but flexible enough to allow developers to encode fairly customized transaction types as well. Today we’ll take a tour of the former.
As a developer, how does your bitcoin client post a new transaction to the network (and what happens when it’s received)?
What exactly is happening when you send some bitcoin to a friend?
This post will assume that the reader has a basic understanding of hashing, asymmetric cryptography, and P2P networking. It’s also a good idea to have a good sense for what exactly a blockchain is, even if you’re unfamiliar with any specific mechanics.
Bitcoin Transactions and their role in the bigger picture
Bitcoin is composed of two major pieces: nodes and a blockchain. The role of a typical node is to maintain its own copy of the blockchain and update it once it hears of a “better” version (loosely, a longer one; more precisely, the one with the most cumulative proof-of-work). Simply put, the blockchain has blocks, and blocks have transactions.
With this simplified but accurate picture in mind, you might be wondering what exactly a transaction is made out of.
How do transactions allow me to transfer some bitcoin to a friend?
It turns out that the answers to these questions vary based on many things. Even assuming that we’re talking only about bitcoin, we can use transactions in a number of creative ways to accomplish a variety of personalized goals. Let’s start at the beginning, that is, let’s take a look at a good old-fashioned pay-to-PK-hash transaction type. After all, this type of transaction accounts for over 99% of all transactions on the bitcoin blockchain.
First, let’s build a mental model. It’s tempting to think of bitcoin as an account-based system. After all, when I send bitcoin to somebody, that person receives money and I’m left with a remaining balance. Under the hood though, things are represented a bit differently. Generally speaking, when I send money to somebody I am spending all of the money held in the outputs I claim (minus transaction fees). If there is a remaining balance, some of that money is sent back to an address that I control as “change”. The point is that all of the claimed money moves every single time. You can skip to section 3.1 of for an explanation of why this model is preferable.
With that in mind, we can generalize and say that a bitcoin transaction has some inputs and outputs. A graphical representation might look something like this:
This was somewhat confusing to me when I first saw it, so I’ll elaborate a bit. When I post a transaction, I’m essentially “claiming” an output and proving that I have permission to spend the amount of money at that output. So if I’m Bob and I want to pay Alice, those inputs are my proof that I have been given a certain amount of money (although this might just be a portion of my total balance), and the outputs will correspond to Alice’s account. In this simple case, there would be only a single input and a single output.
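To make the “all of the money moves every time” idea concrete, here is a minimal sketch in Python. It is not how a real client is written: the `Output` class, the `spend` function, and the plain-string owner labels are all hypothetical stand-ins chosen for illustration (a real output carries a locking script, not a name).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Output:
    owner: str  # hypothetical label; a real output holds a locking script
    value: int  # amount in satoshis


def spend(utxo: Output, recipient: str, amount: int, fee: int) -> List[Output]:
    """Consume one unspent output entirely and return the newly created outputs."""
    change = utxo.value - amount - fee
    if amount <= 0 or fee < 0 or change < 0:
        raise ValueError("invalid amounts or insufficient funds")
    outputs = [Output(recipient, amount)]
    if change > 0:
        # Whatever is left over goes back to the spender as a new output.
        outputs.append(Output(utxo.owner, change))
    return outputs


# Bob holds a single 100,000-satoshi output and pays Alice 60,000 of it.
bob_utxo = Output("bob", 100_000)
new_outputs = spend(bob_utxo, "alice", 60_000, fee=1_000)
print(new_outputs)
```

Notice that Bob’s original output is never modified. It is consumed whole, and two brand-new outputs (Alice’s payment and Bob’s change) take its place. The fee is simply the part of the input that no output claims.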
A deeper look into Bitcoin transactions
Let’s understand the mechanics of a real bitcoin transaction. We’ll use the image above as a reference.
If you were to cut open a typical bitcoin transaction, you’d end up with three major pieces: the header, the input(s), and the output(s). Let’s briefly look at the fields available to us in these sections, as they’ll be important for discussion. Note that these are the fields that are in a so-called raw transaction. Raw transactions are broadcast between peers when a transaction is created.
hash: The hash over this entire transaction. Bitcoin generally uses hash values both as a pointer and as a means to check the integrity of a piece of data. We’ll look at this more in the next section.
ver: The version number that indicates which rules should be used to verify this transaction. This is separate from the block version, which was bumped to version 4 by a soft fork that became active in December 2015.
vin_sz: The number of inputs to this transaction. Similarly, vout_sz counts the number of outputs.
lock_time: We’ll look at this more in later articles, but this basically describes the earliest time at which this transaction can be added to the blockchain. It is interpreted as either a block height or a Unix timestamp.
previous output hash: This is a hash pointer to the previous transaction that contains the unspent transaction output (UTXO) you want to spend. Essentially, this is money that belongs to you that you are about to spend in this transaction.
n: An index into the list of outputs of the previous transaction. This is the actual output that you are spending.
scriptSig: This is a spending script that proves that the creator of this transaction has permission to spend the money referenced by the previous output hash and n.
value: The amount being spent, in satoshis (1 BTC = 100,000,000 satoshis).
scriptPubKey: The second of two scripts provided in a bitcoin transaction, which points to a recipient’s hashed public key. More on this in the last section of this article.
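The fields above can be sketched as a few Python dataclasses. This is only an illustrative model: the class and function names (`TxIn`, `TxOut`, `Tx`, `describe_lock_time`) are my own, and the wire serialization is deliberately left out. Two details are standard Bitcoin behavior rather than assumptions: hashes are computed by applying SHA-256 twice, and a lock_time below 500,000,000 is read as a block height while anything at or above it is read as a Unix timestamp.

```python
import hashlib
from dataclasses import dataclass
from typing import List

SATOSHI_PER_BTC = 100_000_000
LOCKTIME_THRESHOLD = 500_000_000  # below: block height; at or above: Unix time


@dataclass
class TxIn:
    prev_hash: bytes   # hash of the transaction holding the output being spent
    n: int             # index of that output within the previous transaction
    script_sig: bytes  # proof of permission to spend it


@dataclass
class TxOut:
    value: int            # amount in satoshis
    script_pubkey: bytes  # conditions a future spender must satisfy


@dataclass
class Tx:
    ver: int
    vin: List[TxIn]
    vout: List[TxOut]
    lock_time: int = 0

    @property
    def vin_sz(self) -> int:
        return len(self.vin)

    @property
    def vout_sz(self) -> int:
        return len(self.vout)


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def describe_lock_time(lock_time: int) -> str:
    """Explain how a lock_time value is interpreted."""
    if lock_time == 0:
        return "no lock"
    if lock_time < LOCKTIME_THRESHOLD:
        return f"block height {lock_time}"
    return f"unix time {lock_time}"


tx = Tx(ver=1,
        vin=[TxIn(prev_hash=bytes(32), n=0, script_sig=b"")],
        vout=[TxOut(value=SATOSHI_PER_BTC // 2, script_pubkey=b"")])
print(tx.vin_sz, tx.vout_sz, describe_lock_time(tx.lock_time))
```

Note that vin_sz and vout_sz are derived from the lists here instead of being stored, since in this model they can never disagree with the actual number of inputs and outputs.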
One of the jobs of a bitcoin node is to verify that incoming transactions are correct (data hasn’t been tampered with, money isn’t being created, only intended recipients spend UTXOs, etc). A more exhaustive list can be found online, but I’ll list out a few of the important ones here:
All outputs claimed by inputs of this transaction are in the UTXO pool. Unspent outputs can only ever be claimed once.
The signatures on each input are valid. More precisely, we’re saying that the combined scripts return true after executing them one after the other. More on this in the last section.
No UTXO is spent more than once by this transaction. Notice how this is different than the first item.
All of the transaction’s output values are non-negative.
The sum of this transaction’s input values is greater than or equal to the sum of its output values. Note that if the two sums differ, the difference is considered to be a transaction fee that can be claimed by the miner.
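The checks listed above can be sketched as a single function. This is a simplified illustration, not the logic of any particular client: the `validate` function, the `OutPoint` alias, and the `scripts_ok` callback (which stands in for real script execution) are all hypothetical names, and the UTXO pool is modeled as a plain dictionary from (transaction hash, output index) to value.

```python
from typing import Callable, Dict, List, Tuple

OutPoint = Tuple[str, int]  # (previous transaction hash, output index n)


def validate(inputs: List[OutPoint],
             output_values: List[int],
             utxo_pool: Dict[OutPoint, int],
             scripts_ok: Callable[[OutPoint], bool]) -> int:
    """Run the basic checks; return the implied fee or raise ValueError."""
    # No UTXO may be claimed twice within this one transaction.
    if len(set(inputs)) != len(inputs):
        raise ValueError("output claimed twice within the transaction")
    for outpoint in inputs:
        # Every claimed output must still be unspent.
        if outpoint not in utxo_pool:
            raise ValueError(f"{outpoint} is not in the UTXO pool")
        # The combined scripts for this input must evaluate to true.
        if not scripts_ok(outpoint):
            raise ValueError(f"script check failed for {outpoint}")
    # Output values must be non-negative.
    if any(value < 0 for value in output_values):
        raise ValueError("negative output value")
    # Inputs must cover outputs; any surplus is the miner's fee.
    fee = sum(utxo_pool[op] for op in inputs) - sum(output_values)
    if fee < 0:
        raise ValueError("outputs exceed inputs")
    return fee


pool = {("aa", 0): 50_000}
fee = validate([("aa", 0)], [30_000, 19_000], pool, lambda op: True)
print(fee)
```

The first and third items in the list map to two separate checks here: membership in the pool guards against spending something that was already spent by an earlier transaction, while the duplicate check guards against one transaction listing the same output twice.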
A basic pay-to-PK-hash transaction
Bitcoin has its own custom (Forth-like) scripting language that is powerful enough to allow developers to create complicated and custom types of transactions. There are five or so standard transaction types that are accepted by standard bitcoin clients; however, there exist other clients that will accept other types of transactions for a fee. We’ll just cover the mechanics of pay-to-PK-hash here.
For any transaction to be valid, a combined scriptSig/scriptPubKey pair must evaluate to true. More specifically, a transaction spender provides a scriptSig that is executed and followed by the scriptPubKey of the claimed transaction output (remember how we said inputs claim previous unspent transaction outputs?). Both scripts share the same stack.
In the interest of efficiency, let’s use the official bitcoin wiki as a reference as we discuss. When you visit the link, go about halfway down to find a table containing 7 rows. This table shows how the scripts are combined, how execution occurs, and what the stack looks like at each step.
One thing to note is that, because bitcoin addresses are actually hashes (well, it gets even a bit more complicated), the sender has no way of knowing the recipient’s actual public key, and so cannot place it in the scriptPubKey to check a signature against. Therefore, the redeemer supplies both the public key and a signature created with the matching private key (the private key itself is never revealed), and the scriptPubKey will duplicate and hash the public key to make sure that the redeemer is indeed the intended recipient.
During execution, you can see that constants are placed directly onto the stack when they are encountered. Operations add or remove items from the stack as they are evaluated. For example, OP_HASH160 will take the top item from the stack and hash it twice, first with SHA-256 and then with RIPEMD-160, pushing the result back onto the stack. When all items in our script have been evaluated, our entire script will evaluate to true if true remains on top of the stack, and false otherwise.
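The stack walk-through above can be mimicked with a toy interpreter. Treat this strictly as a sketch under several assumptions: the scripts are Python lists instead of serialized bytes, the scriptSig and scriptPubKey are simply concatenated, and `toy_sign` / `toy_checksig` are made-up stand-ins that do NOT perform real ECDSA signing or verification. The `hash160` helper also falls back to a non-standard substitute if the local Python build lacks RIPEMD-160.

```python
import hashlib


def hash160(data: bytes) -> bytes:
    """SHA-256 followed by RIPEMD-160, as OP_HASH160 does."""
    sha = hashlib.sha256(data).digest()
    try:
        return hashlib.new("ripemd160", sha).digest()
    except ValueError:
        # ASSUMPTION: RIPEMD-160 is unavailable in this build, so use a
        # NON-standard 20-byte stand-in just to keep the demo runnable.
        return hashlib.sha256(sha).digest()[:20]


def toy_sign(pubkey: bytes) -> bytes:
    """Placeholder 'signature'; real Bitcoin uses ECDSA over the transaction."""
    return b"signed-by:" + pubkey


def toy_checksig(sig: bytes, pubkey: bytes) -> bool:
    return sig == b"signed-by:" + pubkey


def run(script, check_sig=toy_checksig) -> bool:
    """Evaluate a script; bytes items are constants, strings are opcodes."""
    stack = []
    try:
        for item in script:
            if isinstance(item, bytes):
                stack.append(item)                  # constants are pushed as-is
            elif item == "OP_DUP":
                stack.append(stack[-1])
            elif item == "OP_HASH160":
                stack.append(hash160(stack.pop()))
            elif item == "OP_EQUALVERIFY":
                if stack.pop() != stack.pop():
                    return False                    # hashes differ: fail at once
            elif item == "OP_CHECKSIG":
                pubkey, sig = stack.pop(), stack.pop()
                stack.append(b"\x01" if check_sig(sig, pubkey) else b"")
            else:
                raise ValueError(f"unsupported opcode {item}")
    except IndexError:
        return False                                # stack underflow
    return bool(stack) and stack[-1] != b""


pubkey = b"\x02" + b"\x11" * 32
script_pubkey = ["OP_DUP", "OP_HASH160", hash160(pubkey),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]
script_sig = [toy_sign(pubkey), pubkey]
ok = run(script_sig + script_pubkey)
print(ok)
```

Swapping in a different public key in the scriptSig makes OP_EQUALVERIFY fail, because its hash no longer matches the one embedded in the scriptPubKey. That is exactly the check that ties an output to its intended recipient.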
All in all, the pay-to-PK-hash is a pretty straightforward transaction type. It ensures that only a redeemer with the appropriate public/private key pair can claim and subsequently spend bitcoin. Assuming that all other criteria are met (see the previous section), then the transaction is a good one and it can be placed into a block.
In future articles, I’ll break down more complicated types of transactions. We’ll see how more than two parties can participate in a transaction, and we’ll see how longer-running transaction types can be implemented.