In this section we will finally unpack how payment channels can be connected to a network of other payment channels via a process called routing. Note that we separate the concept of routing from the concept of path finding. Routing refers to the series of interactions across the network that allow a payment to flow from point A to point B, i.e. the active process of sending a payment. An important rule of thumb is that it’s possible for a path to exist between Alice and Bob, yet there may not be an active route on which to send the payment. One example is the scenario where all the nodes connecting Alice and Bob are currently offline. In theory, one can examine the channel graph and connect a series of payment channels from Alice to Bob, hence a path exists. However, as the intermediary nodes are offline, the payment cannot be sent and so no route exists.
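The distinction between a path and a route can be illustrated with a minimal sketch. It assumes the channel graph is a plain list of node pairs and that "online" is a simple set of node names; the intermediary names `Node1` and `Node2` and the function `find_path` are hypothetical, chosen only for this illustration.

```python
from collections import deque

def find_path(channels, source, target, usable=None):
    """Breadth-first search over an undirected channel graph.

    channels: iterable of (node_a, node_b) pairs.
    usable:   optional set of nodes allowed as intermediaries
              (for example, the nodes that are currently online).
    Returns a list of nodes from source to target, or None.
    """
    neighbours = {}
    for a, b in channels:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in neighbours.get(node, []):
            if nxt in visited:
                continue
            # Intermediaries must be usable; the target is always allowed here.
            if usable is not None and nxt != target and nxt not in usable:
                continue
            visited.add(nxt)
            queue.append(path + [nxt])
    return None

# Hypothetical graph: two intermediaries connect Alice and Bob.
channels = [("Alice", "Node1"), ("Node1", "Node2"), ("Node2", "Bob")]
path = find_path(channels, "Alice", "Bob")                 # a path exists in the graph
route = find_path(channels, "Alice", "Bob", usable=set())  # all intermediaries offline
```

Here `path` is found from the graph alone, while `route` is `None` because no intermediary is available to forward the payment.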
The innovation of routed payment channels allows our gamer Gloria to receive funds from her fans without maintaining a separate channel with every one of her fans who want to tip her. Instead Gloria will be able to receive payment from a fan as long as there exists a path of well-funded channels from that viewer to Gloria. The nodes along the path from the fan to Gloria are intermediaries and called "routing nodes" for the purpose of routing a payment.
Importantly, the routing nodes are unable to steal the funds while routing a payment from a fan to Gloria. Furthermore, routing nodes cannot lose money while participating in the routing process. They can however charge a routing fee for acting as an intermediary (although they don’t have to. It is possible to route payments for free!).
Another important detail is that, due to the use of onion routing, intermediary nodes are only explicitly aware of the nodes before and after them in the route. They will not necessarily know who the originator and the recipient of the payment are. This enables fans to use intermediary nodes to pay Gloria without leaking private information and without risking theft.
This process of connecting a series of payment channels with end-to-end security, and the incentive structure for nodes to forward payments, is one of the key innovations of the Lightning Network.
In this chapter, we’ll dive into the mechanism of routing in the Lightning Network, detailing the precise manner in which payments flow through the network. First, we will cover the concept of a conditional chained end-to-end secure payment, most commonly referred to as a Hash Time Locked Contract (HTLC). Having learned how payments can be transmitted through the network, we will then discuss the concept of source-based routing and contrast it to the privacy preserving onion routing used in the network today. Finally, we will explore the exact mechanism of payment forwarding. We will discuss how the structure (edges, fees, time-locks, etc.) of the route is determined by the sender, and is then transmitted to each individual node along the route.
Before we dive into the concept of a conditional chained end-to-end secure payment, let’s work through an example. Let us return to Alice who, in previous chapters, purchased a coffee from Bob with whom she has an open channel. Alice now watches a live stream from Gloria the gamer, and wants to send her a tip via the Lightning Network. However, Alice has no open channel with Gloria. Alice could open one; however, this would require liquidity and on-chain fees which could be more than the value of the tip itself. Alice might also wish to minimize the total number of channels she has open. Instead, Alice can repurpose her existing open channels to send a tip to Gloria without the need to open a channel directly with Gloria. This is possible as long as there exists some path of channels from Alice to Gloria with sufficient capacity to route the tip.
From previous chapters, we know Alice has an open channel with Bob, the coffee shop owner. Bob, in turn, has an open channel with the software developer Wei who helps him with the point of sale system he uses in his coffee shop. Wei is also the owner of a large software company which develops the game that Gloria plays, and they already have an open channel which Gloria uses to pay for the game’s license and in-game items.
If we draw out this series of payment channels, it’s possible to manually trace a path from Alice to Gloria that uses Bob and Wei as intermediary routing nodes. Alice can then craft a route from this outlined path, and use it to send a tip of a few thousand satoshis to Gloria, with the payment being forwarded by Bob and Wei. Essentially, Alice will pay Bob, who will pay Wei, who will pay Gloria. And no direct channel from Alice to Gloria is required.
The main challenge is to do this in a way that prevents Bob and Wei from stealing the money that Alice wants delivered to Gloria. To understand how the Lightning Network protects the payment while being routed, we can compare to an example of routing physical payments with golden coins in the real world.
Assume Alice wants to give 10 gold coins to Gloria, but does not have direct access to Gloria. However, Alice knows Bob, who knows Wei, who knows Gloria and so she decides to ask Bob and Wei for help. She can pay Bob to pay Wei to pay Gloria, but how does she make sure that Bob or Wei don’t run off with the coins after receiving them? In the physical world contracts could be used for safely carrying out a series of payments.
Alice could negotiate a contract with Bob which reads:
_I (Alice) will give you (Bob) 10 gold coins if you pass them on to Wei_
While this contract is nice in the abstract, in the real world, Alice runs the risk that Bob might breach the contract and hope to not get caught by law enforcement. Even if Bob gets caught by law enforcement, Alice faces the risk that he might be bankrupt and be unable to return her 10 gold coins. Assuming these issues are magically solved, it’s still unclear how to leverage such a contract to achieve our desired outcome: the coins ultimately being delivered to Gloria.
We thus improve our contract:
_I (Alice) will reimburse you (Bob) with 10 gold coins if you can prove to me (for example via a receipt) that you already have delivered 10 gold coins to Wei_
You might ask yourself why Bob should sign such a contract. He has to pay Wei but ultimately gets nothing out of the exchange, and he runs the risk that Alice might not reimburse him. Bob could offer Wei a similar contract to pay Gloria, but similarly Wei has no reason to accept it either. Even putting aside the risk, Bob and Wei must already have 10 gold coins to send, otherwise they wouldn’t be able to participate in the contract. Thus Bob and Wei face both risk and opportunity cost for agreeing to this contract, and they would need to be compensated in order for them to accept it.
Alice can make this attractive to both Bob and Wei by offering them fees of 1 gold coin each, if they transmit her payment to Gloria. The final contract would instead read:
_I (Alice) will reimburse you (Bob) with 12 gold coins if you can prove to me (for example via a receipt) that you already have delivered 11 golden coins to Wei_
Alice now promises Bob 12 gold coins. There are 10 to be delivered to Gloria and 2 for the fees. She promises 12 to Bob if he can prove that he has forwarded 11 to Wei. The difference of 1 gold coin is the fee that Bob will earn for helping out with this particular payment.
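The arithmetic behind these numbers can be written down as a small sketch. The function name `amounts_along_route` is hypothetical; it simply works backward from the amount Gloria should receive and adds each intermediary's fee.

```python
def amounts_along_route(final_amount, hop_fees):
    """Return the amount promised in each contract, from first hop to last.

    final_amount: what the final recipient should receive.
    hop_fees:     fee charged by each intermediary, in route order.
    """
    amounts = [final_amount]
    # Work backward from the recipient: each earlier contract must also
    # cover the fee of the intermediary who forwards it.
    for fee in reversed(hop_fees):
        amounts.append(amounts[-1] + fee)
    return list(reversed(amounts))

# Fees of 1 gold coin each for Bob and Wei, 10 coins for Gloria.
promised = amounts_along_route(10, [1, 1])
```

With these inputs, `promised` holds the amounts for the contracts Alice to Bob, Bob to Wei, and Wei to Gloria, in that order.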
As there is still the issue of trust and the risk that either Alice or Bob don’t honour the contract, all parties decide to use an escrow service. At the start of the exchange, Alice could "lock up" these 12 golden coins in escrow that will only be paid to Bob once he proves that he’s paid 11 golden coins to Wei. This escrow service is an "ideal functionality", which will later be replaced by a more trust-minimized mechanism. Let’s assume for now that everyone trusts this escrow.
In the Lightning Network, this "proof" of payment could take the form of a secret that only Gloria knows. In practice, this secret would be a random number that is large enough to prevent others from guessing it (typically a very, very large number, encoded using 256 bits!). The secret could then be committed to the contract by including the sha256 hash of the secret in the contract itself. We call this hash of the payment’s secret the payment hash. The secret which "unlocks" the payment is called the payment secret.
For now, we keep things simple and assume that Gloria’s secret is simply the text line: `Glorias secret`. In order to "commit" to this secret, she computes the sha256 hash which, when encoded in hex, can be displayed as: `f23c83babfb0e5f001c5030cf2a06626f8a940af939c1c35bd4526e90f9759f5`. [1]
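The commit-and-verify idea can be sketched with Python's standard `hashlib` module. The helper names `payment_hash` and `is_valid_preimage` are hypothetical, and the sketch uses a freshly generated random secret rather than the text line from the example, so its output is not expected to match the hash printed above.

```python
import hashlib
import hmac
import secrets

def payment_hash(secret: bytes) -> str:
    """Commit to a secret by publishing its SHA-256 hash (hex encoded)."""
    return hashlib.sha256(secret).hexdigest()

def is_valid_preimage(candidate: bytes, expected_hash: str) -> bool:
    """Check that a revealed message hashes to the committed value."""
    return hmac.compare_digest(payment_hash(candidate), expected_hash)

# Gloria picks a 256-bit random secret and shares only its hash.
secret = secrets.token_bytes(32)
commitment = payment_hash(secret)
```

Anyone holding `commitment` can later verify a revealed secret with `is_valid_preimage`, but cannot feasibly derive the secret from the hash alone.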
Since Alice wants to send 10 gold coins to Gloria, she is told by Gloria to use this payment hash to receive proof of payment. Alice now amends the previous contract to read:
_I (Alice) will reimburse you (Bob) with 12 gold coins if you can show me a valid message that hashes to:`*f23c83...*`. You can acquire this message by setting up a similar contract with Wei who has to set up a similar contract with Gloria. In order to assure you that you will get reimbursed I will provide the 12 gold coins to a trusted escrow before you set up your next contract._
This new contract now protects Alice from Bob not forwarding to Wei, protects Bob from not being reimbursed by Alice, and ensures that there will be proof that Gloria was ultimately paid via the hash of Gloria’s secret. The valid message that hashes to the required number `f23c83…` is called a pre-image.
After Bob and Alice agree to the contract, and Bob receives the message from the escrow that Alice has deposited the 12 gold coins, Bob can now negotiate a similar contract with Wei.
Note that since Bob is taking a service fee of 1 coin, he will only forward 11 gold coins to Wei once Wei shows proof that he has paid Gloria. Similarly, Wei will also demand a fee and will expect to receive 11 gold coins once he has proved that he has paid Gloria the promised 10 gold coins.
Bob’s contract with Wei will read:
_I (Bob) will reimburse you (Wei) with 11 gold coins if you can show me a valid message that hashes to:`*f23c83...*`. You can acquire this message by setting up a similar contract with Gloria. In order to assure you that you will get reimbursed I will provide the 11 gold coins to a trusted escrow before you set up your next contract._
Once Wei gets the message from the escrow that Bob has deposited the 11 gold coins, Wei sets up a similar contract with Gloria:
_I (Wei) will reimburse you (Gloria) with 10 golden coins if you can show me a valid message that hashes to:`*f23c83...*`. In order to assure you that you will get reimbursed after revealing the secret I will provide the 10 gold coins to a trusted escrow._
Everything is now in place. Alice has a contract with Bob and has placed 12 gold coins in escrow. Bob has a contract with Wei and has placed 11 gold coins in escrow. Wei has a contract with Gloria and has placed 10 gold coins in escrow. It is now up to Gloria to reveal the secret, the pre-image.
Since Gloria is the one who came up with the secret (and committed it to the contract in the form of a payment hash), she now provides it to Wei. He checks that it hashes to `f23c83…` and the escrow releases the 10 golden coins to Gloria. Wei now provides the secret to Bob. Bob checks it and the escrow releases the 11 gold coins to Wei. Bob now provides the secret to Alice. Alice checks it and the escrow releases 12 gold coins to Bob.
All the contracts are now settled. Alice has paid a total of 12 gold coins, 1 of which was received by Bob, 1 of which was received by Wei, and 10 of which were received by Gloria. With a chain of contracts like this in place, Bob and Wei would not have been able to run with the money as they actually deposited their money first.
However, one issue still remains. If Gloria refused to release her secret pre-image, then Wei, Bob, and Alice would all have their coins stuck in escrow but wouldn’t be reimbursed. And similarly if anyone else along the chain failed to pass on the secret, the same thing would happen. So while no one can steal money from Alice, everyone can still lose money.
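The backward settlement described here can be traced with a short simulation. This is only a sketch under simplifying assumptions: the escrow is modeled as a function, `settle_chain` is a hypothetical name, and a placeholder secret stands in for Gloria's pre-image.

```python
import hashlib

def settle_chain(contracts, preimage, expected_hash):
    """Release every escrowed deposit once a valid pre-image is shown.

    contracts: list of (payer, payee, amount) in the order they were set up.
    Returns the net change in coins for every participant.
    """
    if hashlib.sha256(preimage).hexdigest() != expected_hash:
        raise ValueError("pre-image does not match the payment hash")
    balances = {}
    # Settlement runs backward: the last contract is released first.
    for payer, payee, amount in reversed(contracts):
        balances[payer] = balances.get(payer, 0) - amount
        balances[payee] = balances.get(payee, 0) + amount
    return balances

preimage = b"placeholder secret"  # stands in for Gloria's secret
expected = hashlib.sha256(preimage).hexdigest()
contracts = [("Alice", "Bob", 12), ("Bob", "Wei", 11), ("Wei", "Gloria", 10)]
net = settle_chain(contracts, preimage, expected)
```

The resulting `net` dictionary shows Alice down 12 coins, Bob and Wei each up 1, and Gloria up 10, matching the tally in the text.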
Luckily, this can be resolved by adding a deadline to the contract.
We could amend the contract so that if it is not fulfilled by a certain deadline, then the contract expires and the escrow service returns the money to the person who made the original deposit. We call this deadline a "time lock". The deposit is locked with the escrow service for a certain amount of time, and is eventually released even if no proof of payment was provided.
In order to factor this in, the contract between Alice and Bob is once again amended with a new clause:
_Bob has 24 hours to show the secret after the contract was signed. If he does not provide the secret by this time, Alice's deposit will be refunded by the escrow service and the contract becomes invalid._
Bob, of course, now has to make sure he receives the proof of payment within 24 hours. Even if he successfully pays Wei, if he receives the proof of payment later than 24 hours he will not be reimbursed. In turn, he will alter his contract with Wei in the following way:
_Wei has 22 hours to show the secret after the contract was signed. If he does not provide the secret by this time, Bob's deposit will be refunded by the escrow service and the contract becomes invalid._
As you might have guessed, Wei is now incentivized to also alter his contract with Gloria:
_Gloria has 20 hours to show the secret after the contract was signed. If she does not provide the secret by this time, Wei's deposit will be refunded by the escrow service and the contract becomes invalid._
With such a chain of contracts we can ensure that, after 24 hours, the payment will successfully deliver from Alice to Bob to Wei to Gloria, or it will fail and everyone will be refunded. Either the contract failed or succeeded, there’s no middle ground. In the context of the Lightning Network, we call this "all or nothing" property "atomicity".
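To see why the deadlines shrink along the chain, consider the following sketch. It assumes, purely for illustration, that each party needs a fixed number of hours (`relay_delay`) to pass the secret to the previous party; the function name `resolve` and this delay model are not part of the example above.

```python
def resolve(deadlines, reveal_hour, relay_delay):
    """Decide the fate of each contract in a chain.

    deadlines:   hours allowed per contract, from first contract to last.
    reveal_hour: hour at which the final recipient reveals the secret.
    relay_delay: assumed hours each party needs to pass the secret upstream.
    """
    n = len(deadlines)
    results = []
    for i, deadline in enumerate(deadlines):
        # The secret reaches contract i after travelling back n-1-i hops.
        arrival = reveal_hour + (n - 1 - i) * relay_delay
        results.append("settled" if arrival <= deadline else "refunded")
    return results

# Staggered deadlines as in the contracts above: a last-minute reveal still
# leaves every upstream party enough time to claim their reimbursement.
staggered = resolve([24, 22, 20], reveal_hour=20, relay_delay=1)

# Identical deadlines: a last-minute reveal settles only the final contract,
# so an intermediary pays out without being reimbursed.
flat = resolve([24, 24, 24], reveal_hour=24, relay_delay=1)
```

In this model the staggered deadlines keep the outcome all-or-nothing, while equal deadlines can leave Wei having paid Gloria without being reimbursed by Bob.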
As long as the escrow is trustworthy and faithfully performs its duty, then no party will have their coins stolen in the process. The pre-condition to this route working at all, is that all parties in the path already needed to have enough money to satisfy the required series of deposits.
While this seems like a minor detail, we will see later in this chapter that this requirement is actually one of the more difficult issues for Lightning Network nodes. It becomes progressively more difficult as the size of the payment increases. Furthermore, the parties cannot use their money while it is locked in escrow. Thus users forwarding payments face an opportunity cost for locking the money, which is ultimately reimbursed through routing fees, as we saw in the above example.
In the following two sections we will discuss how the Bitcoin scripting language can be used to set up conditional chained end-to-end secure payment contracts without third party escrows, similar to the gold coin contracts described above. These are called Hash Time Locked Contracts (HTLCs). For HTLCs, there are no trusted third parties who act as an escrow; the Bitcoin Network itself becomes the "escrow" service.
After that, we will discuss how users are able to use an HTLC to "route" a payment through the network securely. In the Lightning Network in 2020 we use a technique called source-based onion routing, although it is also possible to route payments with alternative techniques. Finally we will discuss the precise details concerning the exact mechanics of forwarding, settling, and cancelling HTLCs in the network.
Our example in the prior section using "golden coins" was intended to lay some base intuition which we’ll leverage in this section to explain how HTLCs work in practice. HTLC is an acronym that stands for "Hash Time-Locked Contract". An HTLC is a specific instantiation of a Conditional Chained End to End Secure Payment. As we’ll see in the later chapters, given a set of adequate cryptographic constructs, many other instantiations are possible as well.
Before we dive into the specifics of HTLCs, it may be helpful to first build intuition on an abstraction over this concrete concept. First, let’s unpack what it means for something to be a conditional chained end to end secure payment:
A payment can be said to be conditional, if the completion of the payment relies on the completion of a certain event. In the golden coins example, this "condition" was the reveal of a hash pre-image. We could feasibly substitute this hash pre-image reveal for any other construct with "hardness" properties. Namely: it should be infeasible for a party that doesn’t know the proper "solution" of the condition to satisfy it, the "description" of the condition shouldn’t give away any information about the true "solution", and once a solution has been chosen and a description created from it, it shouldn’t be possible to "alter" that solution and have it still be a valid condition for the description.
The payment should only be redeemable if a valid solution is revealed. Critically, all conditions need to be timed in order to allow the construct to return the funds to the sender if a solution to the condition isn’t revealed. The combination of the condition and a timeout on the condition gives the payment a trait we commonly refer to as atomicity: either the payment happens, or the sender is refunded.
Building upon our conditional payment, it may be possible to chain this payment, allowing it to involve the payer, the payee, and possibly several intermediaries. Each intermediary is able to present a slightly modified version of the condition (without invalidating it altogether), and to do so in an iterated manner until the conditional payment reaches the payee. Once it reaches the payee, the payment should be able to be iteratively resolved, starting at the payee all the way back to the payer.
Each chaining creates an "incoming" and "outgoing" conditional payment. A node receives a conditional payment from a party (incoming condition), and then extends the conditional payment to the next party in the chain (outgoing condition). The payment is extended from payer to payee, but settled from payee to payer, as each of the intermediaries gains the solution to the outgoing condition and uses it (possibly augmenting it) to satisfy the incoming condition.
Typically the payer rewards the intermediaries by sending slightly more than the payment amount, in order to allow the intermediaries to send out less with their outgoing payment than what they received from the incoming payment. The difference between these two payment values makes up the "forwarding fee" collected by the intermediary.
With our final addition, we’ll achieve "end to end security". By this we mean that no intermediaries are able to "claim" the payment without first obtaining the solution from someone further down from them in the chain. Additionally, we also require that the amount the payer intended to send is fully received by the payee. Finally, we require that none of the intermediaries are able to "contaminate" the payment beyond giving incorrect directions to the party that directly follows them. In other words, an intermediary shouldn’t be able to materially affect the propagation of the payment several hops away from it.
In this section, we’ll construct a conditional chained end to end payment known as the HTLC. At each step we’ll add a new component, then examine it in light of our original definition to ensure all requirements and security properties are reached.
First, the "condition". For an HTLC, the condition is typically the reveal of a hash pre-image that matches a particular hash. This hash is typically referred to as the "payment hash", with the pre-image being called the "payment pre-image". If the name didn’t give too much away, for an HTLC, we’ll use a cryptographically secure hash function as one part of our condition. By using a cryptographic hash function, we ensure that it’s infeasible for another party to "guess" the solution of our condition, it’s easy for anyone to verify the solution, and there’s only one "solution" to the condition.
In order to implement the "refund" functionality, we rely on the "absolute time lock" functionality of Bitcoin script.
With all that said, a basic Bitcoin script implementing a hash time-locked contract would look something like the following:
OP_SIZE 32 OP_EQUAL
OP_IF
    OP_HASH160 <ripemd(payHash)> OP_EQUALVERIFY
    <receiver key>
OP_ELSE
    OP_DROP
    <timeout> OP_CHECKLOCKTIMEVERIFY OP_DROP
    <sender key>
OP_ENDIF
OP_CHECKSIG
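The two spending paths of this script can be modeled in a few lines of Python. This is a simplified sketch, not a script interpreter: signatures are reduced to the name of the signer, the hash check uses SHA-256 of the pre-image directly instead of the HASH160 comparison in the script, and the function `can_spend` and its parameters are hypothetical.

```python
import hashlib

def can_spend(witness_item: bytes, signer: str, current_height: int,
              payment_hash: bytes, timeout_height: int,
              receiver: str = "receiver", sender: str = "sender") -> bool:
    """Model the success and timeout branches of a basic HTLC."""
    if len(witness_item) == 32:
        # Success branch: a 32-byte pre-image that matches the payment hash,
        # spent by the receiver.
        matches = hashlib.sha256(witness_item).digest() == payment_hash
        return matches and signer == receiver
    # Timeout branch: only after the absolute time lock, spent by the sender.
    return current_height >= timeout_height and signer == sender

preimage = bytes(range(32))                   # placeholder 32-byte pre-image
pay_hash = hashlib.sha256(preimage).digest()

claimed = can_spend(preimage, "receiver", 100, pay_hash, timeout_height=200)
too_early = can_spend(b"", "sender", 100, pay_hash, timeout_height=200)
refunded = can_spend(b"", "sender", 200, pay_hash, timeout_height=200)
```

The receiver can claim at any height with the pre-image, while the sender's refund only becomes valid once the timeout height is reached.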
Alice can present this script to Bob in order to kick off the conditional payment. For the chained aspect, Alice needs to be able to communicate the proper payment details to each hop in the route. Recall that each hop will specify a forwarding fee rate, as well as other parameters that express their forwarding policy. In addition to this forwarding rate, Alice also needs to be concerned about what time locks to use. Each node in the route needs some time to be able to settle the outgoing, then the incoming payment on-chain in the worst case. As a result, when constructing the final route, we need to give each node some buffer time; we call this buffer time the "time lock delta". Factoring in this time-lock delta, the time-locks of the HTLCs will decrease as the route progresses, as each outgoing HTLC must expire before its incoming HTLC. This set of decrementing time-locks is critical to the operation of the system, as it ensures our atomicity property for each hop, assuming they’re able to get into the chain in time.
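As a rough illustration of how a sender might derive the amount and time-lock of each HTLC, the sketch below works backward from the final hop. The fee formula (a base fee plus a proportional part in millionths of the forwarded amount) follows the way forwarding fees are commonly advertised in the network; the class `HopPolicy`, the function `build_htlcs`, and all numeric values are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class HopPolicy:
    name: str
    fee_base_msat: int   # flat fee per forwarded HTLC, in millisatoshi
    fee_ppm: int         # proportional fee, in millionths of the amount
    cltv_delta: int      # time lock delta required by this hop, in blocks

def build_htlcs(amount_msat, final_expiry, intermediaries):
    """Return (amount, expiry) for each HTLC, from the first hop to the last.

    intermediaries: policies of the forwarding nodes in route order.
    """
    htlcs = [(amount_msat, final_expiry)]
    # Work backward: each intermediary's incoming HTLC must cover its fee
    # and expire later than its outgoing HTLC by its time lock delta.
    for hop in reversed(intermediaries):
        out_amount, out_expiry = htlcs[0]
        fee = hop.fee_base_msat + out_amount * hop.fee_ppm // 1_000_000
        htlcs.insert(0, (out_amount + fee, out_expiry + hop.cltv_delta))
    return htlcs

# Hypothetical policies for Bob and Wei; the expiry is an example block height.
route = [HopPolicy("Bob", 1000, 100, 40), HopPolicy("Wei", 1000, 200, 40)]
htlcs = build_htlcs(10_000_000, 700_040, route)
```

The first tuple describes the HTLC Alice offers Bob and the last one the HTLC Wei offers Gloria; both the amounts and the expiry heights decrease along the route.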
In the next section, we’ll go into the exact mechanism of how Alice is able to deliver forwarding details to each hop in the route. In addition, we’ll dive further into proper time-lock construction, as incorrect time-lock set up can violate our atomicity property and lead to a loss of funds.
So far you have learnt that payment channels can be connected to a network which can be utilized to send a payment from one participant to another through a path of payment channels. You have seen that with the use of HTLCs the intermediary nodes along the path are not able to steal any funds that they are supposed to forward, and also how a node can set up and settle an HTLC. With this bare foundation laid, the following questions may have crossed your mind:
- Who chooses the path for a candidate route?
- How is a path selected as a candidate to attempt to route the HTLC for a payment?
- How much information do nodes know about the total path?
- How exactly does a payment flow through the network at each node?
In the network today, the sender is the one that selects the route and decides nearly all the details of the resulting route.
As for how path finding is done, there is no single approach that all nodes in the network use. Instead, the answer to the second question has a very large solution space, meaning there are several algorithms and heuristics used in the network today. Most commonly, a variation of Dijkstra’s algorithm is used which takes into account additional Lightning Network details such as fees and time-locks. Remember from earlier that a path turns into a route which is used to trigger a payment attempt. As several conditions need to be satisfied for the HTLC to be completely extended, the sender may need to try several routes until one succeeds. However, the user of the wallet typically will not be aware of these failed payment attempts, just as when we load a web page on the Internet, we don’t learn of any TCP packet retransmissions.
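A bare-bones version of Dijkstra’s algorithm over a channel graph might look as follows. This is a sketch under strong simplifications: each directed channel carries a single fixed cost (standing in for fees and time-locks), capacities are ignored, and the node `NodeX`, the function `cheapest_path`, and the costs are hypothetical.

```python
import heapq

def cheapest_path(graph, source, target):
    """Dijkstra's algorithm. graph maps node -> {neighbour: cost}.

    Returns (path, total_cost), or (None, None) if no path exists.
    """
    best = {source: 0}
    previous = {}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, edge_cost in graph.get(node, {}).items():
            new_cost = cost + edge_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                previous[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None, None
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return list(reversed(path)), best[target]

graph = {
    "Alice": {"Bob": 1, "NodeX": 5},
    "Bob": {"Wei": 1},
    "NodeX": {"Gloria": 1},
    "Wei": {"Gloria": 1},
}
path, cost = cheapest_path(graph, "Alice", "Gloria")
```

In this toy graph the three-hop path via Bob and Wei is preferred over the shorter but costlier path via `NodeX`.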
In the early days of the network, a payment could only utilize a single path as its final route. With the rise of Multi-Path Payments, the sender is able to split the amount into smaller pieces, and use distinct strategies to route all the payment chunks. This splitting behavior is similar to packet fragmentation on the IP layer: each node expresses its Maximum Payment Unit, with the sender using this as a guide to adequately split all payments. In later chapters, we’ll discuss further details of payment splitting and combination once we get to advanced path finding.
At a high level, each node in the route is only explicitly told: how to validate the incoming HTLC packet (remember all details need to be correct for a payment to flow!), who the next hop in the route is, and how to modify the incoming HTLC packet into a valid outgoing HTLC packet to forward to the next node. Combined with the fact that intermediate forwarding nodes aren’t explicitly given the sender and receiver of a payment, nodes are given the least amount of information they need to successfully forward a payment. In addition to these privacy enhancing attributes, intermediate nodes aren’t able to arbitrarily modify an HTLC packet, as all information is encrypted and cryptographically authenticated, with integrity checks carried out at each hop to ensure the contents haven’t been modified. Readers familiar with onion routing may have realized that we’ll be using some clever applications of cryptographic techniques to achieve all these traits. We call this series of clever applications of cryptographic techniques source-based onion routing!
Source-based routing (the non-cryptographic portion of onion routing) is distinct from how packets are typically transmitted on the IP layer. On the Internet today, packet switching is widely used to transmit data. Packet switching typically explicitly indicates the sender and receiver of a given packet. Intermediate routing nodes then attempt to deliver the packet on a best effort basis, with great freedom as to exactly how they select the next node in the route. However, the lack of encryption and end-to-end integrity checks, together with the arbitrary choice of routes, makes this a poor system to use in a payment network.
Source routing instead has the sender select the route entirely (which, as we’ll learn later, is important due to fees and timelocks). The onion routing layer then gives the sender nearly complete control of the route, and allows the sender to tell the intermediate nodes only what they need to successfully forward a payment. Onion routing is used in several popular protocols on the Internet, with the most notable of them being Tor. In the Lightning Network, we use a specific onion routing packet format called Sphinx, with some special modifications made in order to make it more suited to the unique constraints of the Lightning Network.
Note
While the Lightning Network also uses an onion routing scheme, it is actually very different from the onion routing scheme that is used in the Tor network. Aside from the distinct cryptographic techniques they use, the biggest difference is that Tor is used to exchange arbitrary data between two participants, whereas on the Lightning Network the main use case is to pay people and transfer data that encodes monetary value. In the Lightning Network, we’re only concerned with transmitting the details that are needed for a successful payment. On the Lightning Network there is no analogy to the exit nodes of the Tor network, as there’s no need to "exit" the network: all payments flow within the network.
Although in an ideal model only a precise amount of information is leaked by a route, in practice several "side channels" exist that may allow an adversary to deduce more information about a route. As an example, information about CLTV deltas, or the set of possible routes in the network, may give away additional information about a given route. Similar to Tor, onion routing in the Lightning Network isn’t secure against a global passive adversary (one that can monitor all links and information flows in the network). Today in the network, every node in the route sees the same payment hash, meaning that if two nodes are "compromised" more details of the route are leaked.
On the Tor network, nodes can theoretically be connected via a full graph, as every node could create an encrypted connection with every other node on top of the Internet Protocol almost instantaneously and at no cost. On the Lightning Network, payments can only flow along existing payment channels. Removing and adding those channels is a slow and expensive process as it requires on-chain bitcoin transactions. On the Lightning Network, nodes might not be able to forward a payment package because they do not own enough funds on their side of the payment channel. On the other hand, there are hardly any plausible reasons, other than a wish to act maliciously, why a Tor node might not be able to forward an onion.
Last but not least, the Lightning Network can actually run on Tor, using it as a message transport layer. This means that all connections of a node with its peers, and the resulting communication, will be obfuscated once more through the Tor network.
Let’s stick to our example in which Alice still wants to tip Gloria and has decided to use the path via Bob and Wei. We note that there might have been alternative paths from Alice to Gloria, but for now we will just assume it is this path that Alice has decided to use. In order to kick off the multi-hop transfer, Alice needs to send a special message to Bob. You’ll learn about the specific structure of this message in later chapters, but for now we’ll call it an "HTLC Add" message. Aside from the amount, the payment hash, and the time-lock, this message also contains an opaque field used to store encrypted forwarding information. Today in the network, this field is 1366 bytes, as that’s the fixed size of the onion packet. This onion contains all the information about the path that Alice intends to use to send the payment to Gloria. However, Bob, who receives the onion, cannot read all the information about the path, as most of the onion is hidden from him through a sequence of encryptions. The name onion comes from the analogy to an onion that consists of several layers. In our case every layer corresponds to one round of encryption. Each round of encryption uses different encryption keys. They are chosen by Alice in a way that only the rightful recipient of an onion can peel off (decrypt) the top layer of the onion.
For example, after Bob receives the onion from Alice he will be able to decrypt the first layer, and he will only see the information that he is supposed to forward the onion to Wei by setting up an HTLC with Wei. The HTLC with Wei should use the same payment hash as the incoming HTLC from Alice. The amount of the forwarded HTLC is specified in Bob’s decrypted layer of the onion. It will be slightly smaller than the amount of his incoming HTLC from Alice. The difference between these two amounts has to be at least large enough to cover the routing fees that Bob’s node announced earlier on the gossip protocol.
In order to set up the HTLC Bob will modify the onion a little bit in a deterministic manner. He removes the information that he could read from it and passes it along to Wei.
Wei in turn is only able to see that he is supposed to forward the package to Gloria. Wei knows he received the onion from Bob but has no clue that it was actually Alice who initiated the onion in the first place. In this way every participant is only able to peel of one layer of the onion by decrypting it. Each participant will only learn the information it has to learn to fulfill the routing request. For example Bob will explicitly be told that Alice offered him an HTLC and sent him an onion and that he is supposed to offer an HTLC to Wei and forward a slightly modified onion. Bob isn’t explicitly told if Alice is the originator of this payment as she could also just have forwarded the payment to him. Due to the layered encryption he cannot see the inside of Wei’s, and Gloria’s layer. The only thing Bob is told explicitly is that he was involved in a path that involved Alice, him and Wei.
The Onion is decrypted layer by layer as it travels along the path from Alice via Bob and Wei to Gloria, but it is created from the inside layer to the outside layers via several rounds of encryption. Being created from the inside means that the construction starts with the Onion Package that Gloria is supposed to receive, in plain text. Let us now look at the construction that Alice has to follow and at the exact information that is put inside each layer of the onion.
The onions are a data structure that at every hop consists of four parts:
-
The version byte
-
The header, consisting of a public key that the recipient uses to produce the shared secret for decrypting the outer layer and to derive the public key that has to be put in the header of the modified onion for the next recipient.
-
The payload
-
an authentication via an HMAC.
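To make the four parts concrete, here is a minimal Python sketch that splits a raw 1366-byte onion into its pieces. It assumes that the public key in the header is a 33-byte compressed key and that the HMAC is 32 bytes long, which leaves 1300 bytes for the payload. The function name split_onion and the dictionary keys are made up for this illustration.

```python
VERSION_LEN = 1
PUBKEY_LEN = 33     # assumption: compressed public key in the header
PAYLOAD_LEN = 1300  # what remains of the 1366 bytes for the payload
HMAC_LEN = 32
PACKET_LEN = VERSION_LEN + PUBKEY_LEN + PAYLOAD_LEN + HMAC_LEN  # 1366


def split_onion(packet: bytes) -> dict:
    """Split a raw onion packet into version, public key, payload and HMAC."""
    if len(packet) != PACKET_LEN:
        raise ValueError(f"expected {PACKET_LEN} bytes, got {len(packet)}")
    pubkey_end = VERSION_LEN + PUBKEY_LEN
    payload_end = pubkey_end + PAYLOAD_LEN
    return {
        "version": packet[0],
        "public_key": packet[VERSION_LEN:pubkey_end],
        "payload": packet[pubkey_end:payload_end],
        "hmac": packet[payload_end:],
    }


# Usage with a dummy packet (the contents are placeholders, not a real onion).
dummy = bytes([0]) + b"\x02" * 33 + b"\xaa" * 1300 + b"\xff" * 32
parts = split_onion(dummy)
```

Only the payload part changes in a meaningful way from hop to hop; the version byte stays the same, and the public key and HMAC are replaced by each forwarding node.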
For now we will ignore how the public keys are derived and exchanged and focus on the payload of the onion. Only the payload is actually encrypted and will be peeled off layer by layer. The payload consists of a sequence of per hop data. This data can come in two formats: the legacy format and the Type Length Value (TLV) format. While the TLV format offers more flexibility, in both cases the routing information that is encoded into the onion is the same for every hop but the last. For example, with the new TLV format, the sender can actually include the preimage in the payload for the last hop. This is nice, as it allows a payer to initiate a payment without having to ask the payee for an invoice and payment hash first. We will discuss this feature, called keysend, in a different chapter.
A node needs three pieces of information to forward the package:
-
The short channel id of the next channel along which it is supposed to forward the onion by setting up an HTLC with the same payment hash.
-
The amount that is supposed to be forwarded and thus used in the HTLC.
-
Timelock information, encoded as a cltv_delta, which is needed because HTLCs are hashed time-locked contracts.
For easier readability we have used small integers as short_channel_ids in the following example and graphics.
We can see that Alice has created some per hop data for David. The short channel id is set to 0, signaling to David that this payment is intended for him. The amount to forward is set to 3000, and David should have seen exactly that amount on the incoming HTLC. Usually this field says how many satoshis should be forwarded; since the short channel id is zero in this particular case, it is interpreted as the payment amount. Finally, the CLTV delta that David would use to forward the payment is also set to zero, as David is the final hop. These data fields take up 20 Bytes. The Lightning Network protocol actually allows 65 Bytes of data to be stored in the Onion for every hop:
-
1 Byte Realm, which signals to nodes how to decode the following 32 Bytes.
-
32 Bytes for routing hints (20 of which we have already used).
-
32 Bytes for a Hashed Message Authentication Code (HMAC).
Since the additional 12 Bytes of data for the routing hints are not needed at this time, they are set to zero. In the next diagram we can see what the per hop payload for David looks like.
One important feature for protecting privacy is to make sure that onions are always of equal length, independent of their position along the payment path. Thus onions are always expected to contain 20 entries of 65 Bytes of per hop data. As David is the final recipient, only the first 65 Bytes contain meaningful per hop data. This is not a problem, as the other 19 entries are filled with junk data. You could also see this in the previous diagram.
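The byte layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: integers are encoded big-endian, the HMAC is left as a zero placeholder because its computation has not been covered yet, and os.urandom stands in for the junk data (the real protocol generates its filler in a specific way that is discussed later). The function names are hypothetical.

```python
import os
import struct

HOP_DATA_LEN = 65  # 1 Byte realm + 32 Bytes routing hints + 32 Bytes HMAC
MAX_HOPS = 20


def pack_hop_data(short_channel_id, amt_to_forward, cltv_delta, hmac_tag=bytes(32)):
    """Build one 65 Byte entry of per hop data (HMAC is a placeholder here)."""
    realm = b"\x00"
    # 8 + 8 + 4 = 20 Bytes of routing information, big-endian (assumption).
    routing = struct.pack(">QQI", short_channel_id, amt_to_forward, cltv_delta)
    padding = bytes(12)  # the unused 12 Bytes are set to zero
    entry = realm + routing + padding + hmac_tag
    assert len(entry) == HOP_DATA_LEN
    return entry


def final_hop_payload(amount):
    """Plain text payload for the final hop: one real entry plus 19 junk entries."""
    entry = pack_hop_data(short_channel_id=0, amt_to_forward=amount, cltv_delta=0)
    junk = os.urandom(HOP_DATA_LEN * (MAX_HOPS - 1))  # stand-in for the filler
    return entry + junk


payload_for_david = final_hop_payload(3000)
```

Whatever the position of the hop, the payload is always 20 x 65 = 1300 Bytes long, so its length reveals nothing about how many hops remain.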
After Alice has set all the data, she needs to encrypt the onion payload. For this she derives a shared secret from David’s public node key and the private key that she generated for David. This process is well known as an Elliptic Curve Diffie-Hellman (ECDH) key exchange and is a standard technique in cryptography and in Bitcoin.
You can see that Alice put the encrypted payload inside the full Onion Package, which contains the public key belonging to the secret key that she used to derive the shared secret. The full onion package also has a version byte at the beginning and an HMAC for the entire Onion. When David receives the Onion package he will extract the public key from the unencrypted part of the onion package. The property of the Elliptic Curve Diffie-Hellman key exchange is that if he multiplies this public key with his private node key he will get the same shared secret that Alice did. However, others cannot derive the same shared secret, as they know neither Alice’s nor David’s private key.
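The symmetry of the key exchange can be checked with a small, self-contained Python sketch. It implements just enough arithmetic on Bitcoin’s curve (secp256k1) to multiply points, and it hashes the resulting point with SHA-256 to obtain the shared secret, which is an assumption about the exact derivation. The private keys are derived from text labels purely for illustration. This code is slow and not hardened against side channels, so treat it as a teaching aid and not as something to use with real keys.

```python
import hashlib

# secp256k1 parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def point_mul(k, point):
    """Multiply a point by the scalar k using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def compress(point):
    """Serialize a point as a 33 Byte compressed public key."""
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def shared_secret(private_key, public_point):
    """ECDH: hash of (private_key * public_point)."""
    return hashlib.sha256(compress(point_mul(private_key, public_point))).digest()


def demo_key(label):
    """Hypothetical private key derived from a label, for illustration only."""
    return int.from_bytes(hashlib.sha256(label).digest(), "big") % N


david_private = demo_key(b"david node key")
alice_ephemeral_private = demo_key(b"alice ephemeral key for david")
david_public = point_mul(david_private, G)
alice_ephemeral_public = point_mul(alice_ephemeral_private, G)

secret_alice = shared_secret(alice_ephemeral_private, david_public)
secret_david = shared_secret(david_private, alice_ephemeral_public)
```

Both sides end up with the same 32 Bytes although neither has seen the other’s private key: Alice only needs David’s public node key, and David only needs the public key from the onion header.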
Note
|
Let |
After the encrypted Onion for David is created Alice will create the next outer layer by creating the onion for Wei.
She truncates 65 Bytes from the end of the encrypted onion and prepends 65 Bytes of per hop data for Wei to the truncated onion. The per hop data follows the same structure as the per hop data for David. Thus she starts with the Realm Byte, which she sets to 0 again. Then comes the short channel id. This is set to 452, as Wei is supposed to use that channel to forward the onion. She sets the amount to 3000 satoshi, as this is the amount that David is supposed to receive. Finally she uses the CLTV delta that was announced for this channel on the gossip protocol and that Wei should use for the HTLC when he forwards the Onion. Again 12 Bytes of zeros are added as padding and an HMAC is computed. Note that she did not have to compute filler this time, as she already has too much data with the encrypted inner onion. That is why the inner onion had to be truncated at the end. This is the plain text version of Wei’s Onion payload and can be seen in the following diagram:
We emphasize that Wei has no chance to decrypt the inner part of the onion. However, the information for Wei should also be protected from others. Thus Alice conducts another ECDH, this time with Wei’s public key and an ephemeral keypair that she has generated specifically for Wei. She uses the shared secret to encrypt the onion payload. She would be able to construct the entire onion for Wei at this point, although it is actually Bob who does so when he forwards the onion. The Onion that Wei would receive can be seen in the following diagram:
Note that the entire onion contains Wei’s ephemeral public key. David’s ephemeral public key is not stored anywhere in the onion, neither in the header nor in the payload data. However, we have seen that David needs to have this key in the header of the Onion that he receives. Luckily, the ephemeral key that Alice used for the ECDH with David can be derived from the ephemeral key that she used for Wei. Thus, after Wei decrypts his layer, he can use the shared secret and his ephemeral public key to derive the ephemeral public key that David is supposed to use, and store it in the header of the Onion that he forwards to David. The exact process for generating the ephemeral keys for every hop will be explained at the very end of the chapter. Similarly, it is important to recognize that Alice removed data from the end of David’s onion payload to create space for the per hop data in Wei’s onion. Thus, when Wei has received his onion and removed his routing hints and per hop data, the onion would be too short, and he somehow needs to be able to append 65 Bytes of filler junk data in such a way that the HMACs will still be valid. This process of filler generation, as well as the process of deriving the ephemeral keys, is described at the end of this chapter. What is important to know is that every hop can derive the ephemeral public key that is necessary for the next hop, and that the onions save space by always storing only one ephemeral key instead of all the keys for all the hops.
Finally, after Alice has computed the encrypted version for Wei, she will use the exact same process to compute the encrypted version for Bob. For Bob’s onion she actually computes the header and provides the ephemeral public key herself. Note how Wei was still supposed to forward 3000 satoshis but Bob was supposed to forward a different amount. The difference is the routing fee for Wei. Bob, on the other hand, will only forward the onion if the difference between the amount to forward and the amount of the HTLC that Alice sets up while transferring the Onion to him is large enough to cover the fees that he would like to earn.
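Because the onion is built from the inside out, the amounts are also computed backwards, starting from what David should receive. The following Python sketch illustrates that bookkeeping. The flat fees, the timelock deltas and the final expiry height are hypothetical values chosen for the example, and the helper names are not taken from any implementation.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    name: str
    fee: int         # hypothetical flat fee charged for forwarding
    cltv_delta: int  # hypothetical timelock delta announced by this node


def plan_route(final_amount, final_expiry, hops):
    """Walk the path backwards and compute what each hop has to forward.

    `hops` lists the intermediary nodes in path order (sender and recipient
    excluded). Returns the amount and expiry of the sender's first HTLC and,
    per hop, a tuple (name, amount to forward, outgoing expiry).
    """
    amount, expiry = final_amount, final_expiry
    plan = []
    for hop in reversed(hops):
        plan.append((hop.name, amount, expiry))
        amount += hop.fee          # the previous node must also cover this fee
        expiry += hop.cltv_delta   # and leave this hop enough time to react
    plan.reverse()
    return amount, expiry, plan


hops = [Hop("Bob", fee=2, cltv_delta=40), Hop("Wei", fee=1, cltv_delta=40)]
first_amount, first_expiry, plan = plan_route(3000, 1000, hops)
```

With these example values Wei forwards 3000, Bob forwards 3001 (the extra satoshi is Wei’s fee), and Alice has to offer Bob an HTLC of 3003 so that Bob’s fee of 2 is covered as well.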
Note
|
We have not discussed the exact cryptographic algorithms and schemes that are being used to compute the ciphertext from the plain text. Also we have not discussed how the HMACs are being computed at every step and how everything fits together while the Onions are always being truncated and modified on the outer layer. If everything until here made perfect sense to you and you want to learn about those details we believe that you have all the necessary tools at hand to read BOLT 04 which is why we decided not to include all those technical details here in the book. BOLT 04 is the open source specification of the onion routing scheme that is being used on the Lightning Network and a perfect resource for the missing details. |
Onions are constructed from the inside to the outside. As the inside of the onion is decrypted last, it has to correspond to the recipient, which in our case is Gloria. As every layer of the Onion is encrypted by Alice in such a way that only the respective recipient can decrypt their layer, Alice needs to come up with a sequence of encryption keys that she will use for each and every hop. The main concept that is used is the computation of a shared secret via an Elliptic Curve Diffie-Hellman key exchange (ECDH) between Alice and each of the hops. However, for the recipients to be able to compute their shared secret, they have to know a public key which they can use. If Alice used the same private key for the computation of each of the shared secrets, Alice would have to send the same public key with every onion, and the different payments could be linked together by an attacker.
Every layer of the onion has 32 Bytes of per_hop data.
This data is split into 4 data fields:
-
The 8 Byte short_channel_id indicates on which channel the onion should be forwarded next.
-
The 8 Byte amt_to_forward is a 64 Bit unsigned integer that encodes an amount in millisatoshi and indicates the amount that is supposed to be forwarded.
-
The 4 Byte cltv_delta is a 32 Bit unsigned integer that is used for the time locks in the HTLCs.
-
Finally there are 12 Bytes left for padding and for future versions and updates of the onion package format.
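From the point of view of a node that has just decrypted its layer, the four fields can be read back as sketched below. This is an illustrative Python snippet that assumes big-endian encoding; the PerHop type and the is_final helper are hypothetical names, and the convention that a short_channel_id of zero marks the final hop follows the example used earlier in this chapter.

```python
import struct
from typing import NamedTuple


class PerHop(NamedTuple):
    short_channel_id: int
    amt_to_forward: int
    cltv_delta: int

    @property
    def is_final(self) -> bool:
        # In this chapter's example, a zero short_channel_id marks the last hop.
        return self.short_channel_id == 0


def unpack_per_hop(data: bytes) -> PerHop:
    """Decode 32 Bytes of per_hop data: 8 + 8 + 4 Bytes of fields, 12 of padding."""
    if len(data) != 32:
        raise ValueError("per_hop data must be exactly 32 Bytes")
    short_channel_id, amt, cltv_delta, _padding = struct.unpack(">QQI12s", data)
    return PerHop(short_channel_id, amt, cltv_delta)


# Example values: channel 452 and amount 3000 as in the text, delta 40 is made up.
raw = struct.pack(">QQI12s", 452, 3000, 40, bytes(12))
hop = unpack_per_hop(raw)
```

The padding is read but ignored, which is what allows future versions of the format to give those 12 Bytes a meaning.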
Interestingly enough, Alice can construct the onion with different encryption keys for Bob, Wei and Gloria without having to establish a peer connection with them.
She only needs a public key from each participant, which is the public node_id of the Lightning node and is known to Alice.
Like other nodes, she has learnt about the existence of public payment channels and the public node_id of other participants via the gossip protocol, which we describe in its own chapter.
In order to have a different encryption key for every layer, Alice produces a shared secret with each hop by conducting an Elliptic Curve Diffie-Hellman (ECDH) key exchange with the public node_id of each node.
She starts by generating a temporary session key. This key is also called the ephemeral key. This private key, multiplied with the generator point of the elliptic curve that is used in Bitcoin, produces a public key. This happens in the same way that a node’s public key is generated from the secret private key of the node. Alice could use this session key to conduct the Diffie-Hellman key exchange with every node if she sent the public key along with the onion. However, she wishes to use a different session key for the Diffie-Hellman key exchange with each of the nodes along the path. Yet she does not want to add a public key (which consumes quite some space) to every layer of the onion. Luckily there is a nice deterministic way in which she can derive a different session key for every hop, execute the Diffie-Hellman key exchange, and allow the hops to use their shared secret to derive the next session public key. Let’s explore this in detail with the following example:
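The sketch below follows the derivation as specified in BOLT 04: the shared secret is the SHA-256 hash of the ECDH point, a blinding factor is computed as the SHA-256 hash of the current ephemeral public key concatenated with the shared secret, and the next ephemeral key is the current one multiplied by that blinding factor. The curve arithmetic is a minimal pure Python version for illustration only, and the node keys and the session key are hypothetical values derived from labels.

```python
import hashlib

P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two secp256k1 points; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def point_mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def compress(point):
    x, y = point
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def sha256(data):
    return hashlib.sha256(data).digest()


def sender_keys(session_key, node_publics):
    """Alice's view: (ephemeral public key, shared secret) for every hop."""
    keys, ephemeral = [], session_key
    for node_public in node_publics:
        ephemeral_public = point_mul(ephemeral, G)
        secret = sha256(compress(point_mul(ephemeral, node_public)))
        keys.append((ephemeral_public, secret))
        blinding = int.from_bytes(sha256(compress(ephemeral_public) + secret), "big")
        ephemeral = ephemeral * blinding % N  # session key for the next hop
    return keys


def hop_step(node_private, ephemeral_public):
    """A hop's view: its shared secret and the public key for the next hop."""
    secret = sha256(compress(point_mul(node_private, ephemeral_public)))
    blinding = int.from_bytes(sha256(compress(ephemeral_public) + secret), "big")
    return secret, point_mul(blinding, ephemeral_public)


def demo_key(label):
    """Hypothetical private key derived from a label, for illustration only."""
    return int.from_bytes(sha256(label), "big") % N


node_privates = [demo_key(name) for name in (b"Bob", b"Wei", b"Gloria")]
node_publics = [point_mul(k, G) for k in node_privates]
session_key = demo_key(b"Alice session key")
alice_view = sender_keys(session_key, node_publics)
```

Only the first ephemeral public key travels in the onion that Alice sends to Bob. Each hop calls something like hop_step on the key it received and writes the resulting public key into the header of the onion it forwards.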
Of course the Lightning Network protocol could have been designed in such a way that Alice would only use her node’s key to conduct the ECDH with every node’s public key. However, she would then have to put her public key in the header of the onion. This is necessary for nodes to be able to execute an ECDH and produce the same shared secret that Alice used for the respective layer of the Onion. With that information, however, nodes would know that Alice was the originator of the payment, which would remove the anonymity of the payer by design.
In the first part of the routing chapter you have learnt that payments securely flow through the network via a path of HTLCs. You saw how a single HTLC is negotiated between two peers and added to the commitment transaction of each peer. In the second part you have seen how the information necessary for setting up HTLCs along a path of hops is transferred via onions from the sender to the recipient, a mechanism that protects the privacy of payer and payee. However, there are quite a few challenges, and things may not go as expected. This is why we want to discuss how errors are handled and what users and developers should take into consideration.
Most importantly, it is absolutely necessary that you understand that once your node has sent out an onion on your behalf (most likely because you wanted to pay someone), everything that happens to the onion is out of your control.
-
You cannot force nodes to forward the onion immediately.
-
You cannot force nodes to send back an error if they cannot forward the onion because of missing liquidity or other reasons.
-
You cannot be sure that the recipient has the preimage to the payment hash or releases it as soon as the HTLCs of the correct amount arrived.
By setting up an HTLC - which you do by sending out an onion - you have committed to settle the HTLC in exchange for the preimage if the preimage arrives before the absolute timelock of the HTLC expires. This can be very frustrating from a user experience point of view. You want to quickly pay a person, but the payment path that your node chose has CLTV deltas that quickly add up to several hundred blocks, which is a couple of days. This means that if nodes on the path misbehave - on purpose or maybe just because they have downtime that your node didn’t know about - you will have to wait even though you don’t see a preimage. You must not send out another onion along a different path, because there is a risk that both payments will settle eventually. While our experience is that most payments find a path and settle in far less than 10 seconds, the Lightning Network protocol cannot and does not give any service level agreement that payments will settle or fail within this time.
Note
|
There are ideas out that might solve this issue to some degree by allowing the payer to abort a payment. You can find more about that under the terms |
Despite these fundamental problems, there are plausible situations in which the routing process fails and in which honest nodes can and should react. This is why the onion protocol has the ability to send back errors. Some, but not all, of the reasons for errors could be:
-
A node does not have enough liquidity to set up the next HTLC.
-
The next payment channel does not exist anymore, as it might have been closed while the onion was being routed to the node that was supposed to forward the onion along that channel.
-
While the channel might still be open, as the funding transaction was never spent, the other peer might be offline. This of course prevents the node from forwarding the onion.
-
The key exchanges of the sender might have been wrong, so that the decryption of the onion fails or the HMACs do not match. This can also happen because someone tried to tamper with the onion.
-
The recipient might not have issued an invoice and does not know the payment details.
-
The amount of the final HTLC is too low and the recipient does not want to release the preimage.
If errors like these occur, a node should send back a reply onion.
The reply onion is encrypted at each hop with the same shared secret that was used to construct the onion or to decrypt a layer.
These shared secrets are all known to the originator of the payment.
The innermost onion contains the error message and an HMAC for the error message.
This process makes sure that the sender of the onion, who is the recipient of the reply, can be sure that the error really originated from the node that the error message claims.
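The idea of the reply onion can be illustrated with a toy model in Python. It is deliberately simplified and does not follow the wire format: a SHA-256 based keystream stands in for the real stream cipher, the shared secret is used directly as the key for both the HMAC and the keystream (the actual protocol derives separate keys), and error messages are not padded to a fixed length. All function names are hypothetical.

```python
import hashlib
import hmac


def keystream(secret: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (stand-in for a real stream cipher)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(secret: bytes, data: bytes) -> bytes:
    """Add or remove one layer of obfuscation (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(secret, len(data))))


def create_error(secret: bytes, message: bytes) -> bytes:
    """The failing node prepends an HMAC to its message and adds its layer."""
    tag = hmac.new(secret, message, hashlib.sha256).digest()
    return xor_layer(secret, tag + message)


def decode_error(shared_secrets, packet: bytes):
    """The originator peels layers in path order until an HMAC verifies.

    Returns (index of the hop that reported the error, error message).
    """
    for index, secret in enumerate(shared_secrets):
        packet = xor_layer(secret, packet)
        tag, message = packet[:32], packet[32:]
        expected = hmac.new(secret, message, hashlib.sha256).digest()
        if hmac.compare_digest(tag, expected):
            return index, message
    raise ValueError("no hop on the path produced a valid HMAC")


# Hypothetical shared secrets of Alice with Bob, Wei and the recipient.
secrets = [hashlib.sha256(name).digest() for name in (b"bob", b"wei", b"recipient")]

# Wei (index 1) fails; Bob adds his own layer while passing the reply back.
reply = xor_layer(secrets[0], create_error(secrets[1], b"not enough liquidity"))
failing_hop, error_message = decode_error(secrets, reply)
```

Because only Alice knows all of the shared secrets, only she can remove every layer, and the HMAC that finally verifies tells her which hop produced the message.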
Another important step in the process of handling errors is to abort the routing process.
We discussed that the sender of a payment cannot just remove the HTLC on the channel along which the sender sent the payment.
Recall for example the situation in which Alice sent an onion to Bob, who set up an HTLC with Wei.
If Alice wanted to remove the HTLC with Bob, this would put a financial risk on Bob.
He has to fear that his HTLC with Wei might still be fulfilled, in which case he could not claim the reimbursement from Alice.
Thus Bob would never agree to remove the HTLC with Alice unless he has already removed his HTLC with Wei.
If, however, the HTLCs between Alice and Bob and between Bob and Wei are set up but Wei encounters problems with forwarding the onion, Wei has more options than Alice.
While sending back the error Onion to Bob, Wei could ask him to remove the HTLC.
Bob has no risk in removing the HTLC with Wei, and Wei also has no risk, as there is no downstream HTLC.
Removing an HTLC works very similarly to adding an HTLC.
Due to the argument just presented, only peers who have accepted an offered HTLC can initiate the removal of HTLCs.
In the case of errors, peers signal that they wish to remove the HTLC by sending an update_fail_htlc or update_fail_malformed_htlc message.
These messages contain the id of an HTLC that should be removed in the next version of the commitment transaction.
In the same handshake-like process that was used to exchange commitment_signed and revoke_and_ack messages, the new state, and thus a new pair of commitment signatures, has to be negotiated and agreed upon.
This also means that while the balance of a channel that was involved in a failed routing process will not have changed at the end, the channel will have negotiated two new commitment transactions.
Despite having the same balance, a peer must not go back to the previous commitment transaction that did not include the HTLC, as this commitment transaction was revoked.
If it was used to force close the channel, the channel partner would have the ability to create a penalty transaction and get all the funds.
In the last section you learned about the error cases that can happen with onion routing via the chain of HTLCs.
You have learnt how HTLCs are removed if there is an error.
Of course HTLCs also need to be removed and the balance needs to be updated if the chain of HTLCs was successfully set up to the destination and the preimage is released.
Not surprisingly, this process is initiated with another Lightning message called update_fulfill_htlc.
You will remember that HTLCs are set up and are supposed to be removed with a new balance for the recipient in exchange for a secret preimage.
Recalling the complex protocol with commitment_signed and revoke_and_ack messages, you might wonder how to make this exchange of the preimage for a new state atomic.
The cool thing is that it doesn’t have to be.
Once a channel partner with an accepted incoming HTLC knows the preimage, that partner can safely just pass it to the other channel partner.
That is why the update_fulfill_htlc message contains only the channel_id, the id of the HTLC, and the preimage.
You might worry that the channel partner could now refuse to negotiate a new channel state, which is done by sending commitment_signed and revoke_and_ack messages.
This is not a problem though.
In that case the recipient of the offered HTLC can just go on chain by force closing the channel.
Once that has happened, the preimage can be used to claim the HTLC output.
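The reason the preimage can be handed over safely is that anyone can check it against the payment hash of the HTLC. A minimal Python sketch of that check follows. The class only mirrors the three fields named above and is not the wire encoding of the message; the helper name, the channel id and the preimage are made-up example values.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateFulfillHtlc:
    channel_id: bytes
    id: int          # id of the HTLC that is being settled
    preimage: bytes


def settles_htlc(message: UpdateFulfillHtlc, payment_hash: bytes) -> bool:
    """An HTLC is settled by a preimage whose SHA-256 hash is the payment hash."""
    return hashlib.sha256(message.preimage).digest() == payment_hash


# Example values, made up for illustration.
preimage = bytes(range(32))
payment_hash = hashlib.sha256(preimage).digest()
fulfill = UpdateFulfillHtlc(channel_id=bytes(32), id=0, preimage=preimage)
is_valid = settles_htlc(fulfill, payment_hash)
```

The same check is what makes the on-chain fallback work: the HTLC output can be claimed by whoever presents a preimage that hashes to the payment hash before the timelock expires.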
Accepting an HTLC locks funds that a peer cannot utilize until the HTLC is removed due to success or failure. Similarly, forwarding an HTLC binds some funds in your node’s payment channel until the HTLC is removed again. As we explained at the very beginning of the chapter, engaging in the forwarding of HTLCs neither carries a direct risk of losing funds nor offers a chance to gain funds. However, the funds involved could be locked for some time. In the worst case the routing process needs to be resolved on chain because the payment channel was force closed due to some other circumstances. In that case outstanding HTLCs produce an additional onchain footprint and additional costs. Thus there are two small economic risks involved in participating in the routing process.
-
Higher onchain fees in case of forced channel closes due to the higher footprint of HTLCs
-
Opportunity costs of locked funds. While the HTLC is active the funds cannot be used otherwise.
In economics and financial mathematics, the idea of paying another person for taking a risk is widespread and seems reasonable. Owners of routing nodes might want to monitor the routing behavior and opportunities and compare them to the onchain costs and the opportunity costs in order to compute the routing fees that they wish to charge for accepting and forwarding HTLCs.
One should also notice that HTLCs are outputs in the commitment transaction. The Lightning Network protocol allows users to pay a single satoshi. However, it is impossible to set up HTLC outputs for this amount. The reason is that the corresponding outputs in the commitment transaction would be below the dust limit. Such cases are solved in practice with the following trick: instead of setting up an HTLC output, the amount is taken from the output of the sender but not added to the output of the recipient. Thus HTLCs that are below the dust limit can be understood as additional fees in the commitment transaction. Most Lightning nodes support the configuration of a minimum accepted HTLC value. Operators have to consider whether they want to risk overpaying fees or losing funds in the case of a forced channel close, because the amounts of such HTLCs have been added to the fees of the commitment transaction.
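The trick can be expressed as simple bookkeeping. The Python sketch below is a simplification: it compares each HTLC amount directly with a dust limit, whereas real nodes also take transaction fees into account when deciding whether an HTLC gets its own output. The dust limit of 546 satoshis and the function name are assumptions made for the example.

```python
DUST_LIMIT = 546  # hypothetical dust limit in satoshis


def commitment_outputs(sender_balance, recipient_balance, offered_htlcs,
                       dust_limit=DUST_LIMIT):
    """Split offered HTLCs into real outputs and amounts that become fees.

    Every HTLC amount is deducted from the sender's balance. HTLCs below the
    dust limit get no output of their own; their amounts are simply missing
    from the outputs and therefore end up as additional transaction fees.
    """
    htlc_outputs = [amount for amount in offered_htlcs if amount >= dust_limit]
    extra_fee = sum(amount for amount in offered_htlcs if amount < dust_limit)
    outputs = {
        "to_sender": sender_balance - sum(offered_htlcs),
        "to_recipient": recipient_balance,
        "htlc_outputs": htlc_outputs,
    }
    return outputs, extra_fee


# A 1 satoshi payment and a 3000 satoshi payment offered at the same time.
outputs, extra_fee = commitment_outputs(100_000, 50_000, [1, 3000])
```

In this example the 3000 satoshi HTLC gets its own output, while the single satoshi only shows up as one extra satoshi of fees if the commitment transaction is ever broadcast.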
-
Explain fee and time-lock considerations
-
The “HTLC Switch” analogy compared to regular network switch
-
Circuit map concept, how to handle forwarding
-
Pipeline styles for HTLCs
-
Error handling and encryption for HTLCs
-
Explain “one little trick” of DH re-randomization
-
Explain how we keep the packet size fixed, what’s MAC’d, etc
-
Introduce the new modern payload format which uses TLV
echo -n "Glorias secret" | sha256sum
to your Linux command line shell.