Part 2: 技术概览

共识

LMD Ghost

LMD GHOST 是一种分叉选择规则，被节点用来确定最佳的链。
它根据所有活跃验证者的投票而为链的不同分支赋予权重。
LMD GHOST 不提供最终确定性，但的确支持确认规则（confirmation rule）。
罚没被用来解决“无利害关系”问题。

介绍

在这一部分，我们将单独探讨 LMD GHOST，完全忽略 Casper FFG 的最终确定性覆盖层¹。LMD GHOST 自身就是一种共识机制的核心——它是一个分叉选择规则，就像中本聪共识中的最重链规则一样——并且有其自身属性和权衡。

当前，我们只考虑故事中的“它如何运作”这个部分——畅通无阻的流程。在后面的“议题和修复（Issues and Fixes）”部分，我们会去探索“它如何出错”的那部分。

命名

LMD GHOST 由两个缩写词组成，分别是“最新信息驱动的（Latest Message Driven, LMD）”，和“贪婪的、最重的可被观察子树（Greedy Heaviest-Observed Sub-Tree, GHOST）”。我们会先打开 GHOST，然后是 LMD。

GHOST

GHOST 协议源自 Sompolinsky 和 Zohar 2013 年的一篇论文，其中讨论了如何安全地提高比特币区块链的交易吞吐量。增加区块大小或减少区块之间的间隔，会使链更容易在无法控制延迟的网络（如互联网）中分叉。分叉的链会存在更多的重构，而重构不利于交易的安全性。用 GHOST 分叉选择规则替换比特币的最长链分叉选择规则，会让链在存在延迟的情况下更加稳定，允许更频繁的区块生产。

GHOST 全称是“贪婪的、最重的可被观察子树”，这描述了 GHOST 算法的工作方式。我们将在下面展开讨论这一点。简而言之，GHOST 的分叉选择不是遵循最重的链，而是遵循最重的子树。它认识到对某个区块的投票不仅仅是针对该区块自身，也隐含着对该区块每一个祖先的投票，因此整个子树都会获得相关权重。

比特币从未采用 GHOST。和论文中声称的不同，工作量证明下的以太坊也没有采用这一协议。尽管以太坊最初计划采用 GHOST，且先前工作量证明中的“叔块（uncle）”奖励与此相关。

LMD

以太坊权益证明中使用的 GHOST 协议能够处理认证，是扩展后的版本。在工作量证明中，投票者是区块提议者。他们通过将自己的区块构建在某个特定分支上来为其投票。在权益证明协议中，所有验证者都是投票者。通过发布认证，每个验证者平均每 6.4 分钟为其对网络的视图进行一次投票。因此，在权益证明中，我们可以获得更多关于参与者对网络状态看法的可用信息。

这就是所谓的“消息驱动（message driven）”，我们有了 LMD 中的 MD。分叉选择不是由提议者所添加的区块驱动，而是由所有验证者发布的消息（认证、投票）驱动。

这里的“L”代表“最新（latest）”：LMD GHOST 只考虑每个验证者的最新信息，即我们从验证者那里收到的最新认证。验证者之前的所有信息都会被丢弃，但最新投票被保留，并且无限期地具有权重。

顺便说一句，还有其他版本的消息驱动的 GHOST 可用。Vitalik 最初倾向于 IMD GHOST，即“即时消息驱动的（Immediate Message Driven）” GHOST。据我所知²，这会无限期地保留所有认证，且分叉选择基于当前最新认证运行。然后是 FMD GHOST，即“新鲜消息驱动（Fresh Message Driven）”的 GHOST，它只考虑当前时段和上一个时段的认证。还有 RLMD，即“近期的最新消息驱动（Recent Latest Message Driven）”的 GHOST，它只记下验证者在可参数化的时段内的最新认证。

它如何运作

LMD GHOST 首先是一个分叉选择规则。给定一个区块树和一系列投票，LMD GHOST 会告诉我应该将哪个区块视为最佳头块，从而提供一个从该头块一直到创世区块的线性历史视图。这一决策是基于我对链的本地视图，即节点收到的消息（区块和认证）——记住，没有“上帝视角”，我的本地视图是我所能利用的全部，它很可能与其他节点的本地视图不同。我们的想法是，诚实的验证者会根据他们看到的最佳头块来构建自己的区块，并将根据他们看到的最佳头块投票。

一个好的分叉选择规则将提供以下几点：³

多数节点诚实推进：如果超过 50% 的节点按照分叉选择规则构建区块，链就会前进，并且（指数级地）不太可能回滚旧区块。
稳定性：分叉选择可以很好地预测未来的分叉选择。
抗操纵性：即使攻击者在一小部分参与者中暂时成为绝对多数，它们也不太可能回滚区块。

这几点相互关联。对于区块提议者来说，稳定性尤其重要。提出一个区块时，我希望尽可能地确定该区块永远保留在链上。也就是，我希望尽可能确信它不会被重组出链。找到头块意味着：当我找到它并在其基础上构建新区块时，该头块最有可能使我的构建的新区块成为其他节点的视图中的头块。

我们将一分为二地探讨 LMD GHOST 如何运作。先看 LMD 部分，即最新消息，然后再看 GHOST 部分，即找到头块。

Finding the head block

In essence, the LMD GHOST fork choice rule is a function $\text{GetHead}(\text{Store}) \rightarrow \text{HeadBlock}$ . As we've seen, a node's Store is its view of the world: all the relevant information it has received that could affect the fork choice. For the pure LMD GHOST algorithm we are looking at here, the relevant parts of the Store are the following.

The block tree, which is just a list of blocks. The blocks' parent links join them logically into a tree.
The list of latest messages (votes) from validators.
The validators' effective balances (based on some state) as these provide the weights used in the algorithm.

The goal of the GHOST algorithm is to select a single leaf block from the given block tree, where a leaf is a block without any descendants. This will be our chosen head block.

We are going to assume that all the blocks in our block tree descend from a single root block. In a pure GHOST algorithm, that would be the genesis block: by definition, all blocks must descend from genesis. In our full consensus implementation, that root block will be the last justified checkpoint block. For our present purposes, all we need to know is that the GHOST algorithm starts from a given block, and ignores all blocks not descended from that block.

Get weight

The first thing we do is calculate the "weight" of each branch in the tree. A branch's weight is its score, in some sense.

The weight of a vote is the effective balance of the validator that made the vote. This will usually be 32 ETH, the maximum effective balance, but could be less. A vote, recall, is the latest message we have from that validator.

The weight of a branch is the weight of the votes for the block that roots it, plus the weights of that block's child branches. By including the weights of child branches, we are acknowledging that a vote for a block is also a vote for each of the ancestors of that block. The weight of a branch consisting of only a leaf block will be just the weight of the votes for that block.

$B_N$ is the sum of the effective balances of the validators whose most recent head vote was for block $N$ , and $W_N$ is the weight of the branch rooted at block $N$ .

Some obvious relationships apply between the weights, $W_x$ , of branches, and $B_x$ , the weights of the votes for blocks.

For a branch comprising only a leaf block, $L$ , $W_L = B_L$ .
The weight of a branch is the weight of the votes for the block at its root plus the sum of the weights of all branches below it. So, in the diagram, $W_1 = B_1 + W_2 + W_3$ .
The weight of a branch is the sum of the weights of all votes for blocks in the subtree that forms the branch.

Since votes always carry a positive weight, no block will have a greater weight than the root block, and the root block's weight is the sum of the weights of all the votes for blocks in the tree. Every validator has at most one latest message – that is, one vote –, so that weight is bounded above by the total effective balance of all active validators.⁴

Get head

Once we have the weight of each branch or subtree, the algorithm proceeds recursively. Given a block, we select the heaviest branch descending from it. We then repeat the process with the block at that branch's root. If a block has only one child, then the choice is obvious and there is no work to do. If two or more branches have equal weight, we arbitrarily choose the branch rooted at the child block with the highest block hash value. Eventually we will reach a block with no children. This is a leaf block and will be the output of the algorithm.

Unpacking the GHOST name, we see that the algorithm: is Greedy, meaning that it takes the Heaviest-Observed branch immediately, without looking further; and deals with Sub-Trees, the weight of a branch being the sum of all the weights of votes for blocks in the subtree.

Here's a simple example. In the diagrams, I've distinguished between (1) the weight of the votes for a particular block, which are the numbers attached to each block, and (2) the weights of branches, which I've added to the lines joining the blocks to their parents.

First, from the latest messages in the Store, we calculate the weight of votes for each block in the tree.

get_head() starts from the root block, $A$ , of a block tree. The numbers show the weights of the votes for each block.

Second, from the weights of votes for each block, we can calculate the weight of each branch or subtree.

The get_weight() function when applied to a block returns the total weight of the subtree of the block and all its descendants. These weights are shown on the lines between child and parent blocks.

Third, we move recursively through the blocks at the roots of the subtrees, always choosing the branch or subtree with the highest weight. Eventually, we will reach a leaf, which is our chosen head block.

At block $A$ , branch $C$ is heavier. At $C$ , branch $E$ is heavier. Block $E$ is a leaf block, so it is GHOST's choice for the head of the chain. Our blockchain is $[A \leftarrow C \leftarrow E]$ . A "longest chain" rule would have chosen block $G$ , although that branch has minority support from validators.

If, say, the subtrees rooted at $B$ and $C$ had equal weight, we would choose according to which of blocks $B$ and $C$ had the greatest block hash - this is a completely arbitrary tie-break mechanism.

The specification

The code that implements all this in the specification is get_head(), which walks up through the tree from the root, taking the heaviest branch at each fork. The code for calculating the weight of a subtree is get_weight().

If you look at the get_weight(), you'll find that it is more complex than what we've covered here, due to something called "proposer boost". We shall discuss proposer boost in detail in the Issues and Fixes section.

Intuition

Having seen how the GHOST protocol works, it's perhaps easier to gain some intuition for why we prefer it to a longest chain rule. The occurrence of forks suggests that block propagation time has become of similar order to, or exceeds, block production intervals (slots). In short, not all validators are seeing all the blocks in time to attest to them or to build on them.⁵

In these circumstances, we want to take advantage of the maximum amount of information available. Votes for two different children of the same parent block should be taken as confirmation that all those validators favour the parent block's branch, even if there is disagreement about the child blocks. GHOST achieves this simply by allowing a vote for a child block to add weight to all of its ancestors. Thus, when faced with a choice, we favour the branch with the greatest total support from validators. I've illustrated this in the diagram above: branch $C$ is favoured over branch $B$ , even though block $B$ has more direct votes than block $C$ , since more validators overall made latest votes for branch $C$ .

The longest chain rule discards all this information, and can allow a branch to win even if only a minority of validators has been working on it.

Confirmation rule

In proof of work, the only data available for input into the fork choice comes from block production, which represents the view of a single miner at the time of publishing the block.

In Ethereum's proof of stake protocol, we have vastly more information available to us, in the form of head block votes from $\frac{1}{32}$ of the validator set every 12 seconds, in addition to the block proposer's view.

In principle, all this extra information ought to allow us to be very sure, very quickly, about whether blocks will remain canonical or are in danger of being reverted. Proof of work chains tend to use a heuristic around the number of confirmations that a block has received. That is, blocks are assumed to be exponentially less likely to be reverted the more blocks have been built on top of them. This is generally true, but can fail badly in high-latency environments (such as an attacker making a longer chain in secret).

Proof of stake's ultimate answer to this is finality, which we shall look at in the next section. LMD GHOST on its own does not provide finality. However, it's interesting to consider whether there is some heuristic analogous to proof of work's confirmations rule that we can use, and it turns out that there is. In fact, it improves on the PoW confirmation rule by giving a "yes/no" statement about the safety of a block, rather than a probability.

The details of the confirmation rule are described by Aditya Asgaonkar in a blog post and an accompanying paper. The general idea is that, for a block $b$ , we calculate two values $q$ and $q_\text{min}$ . When $q > q_\text{min}$ , and the network remains close to synchronous, then that block is confirmed as "safe". We can be fully confident that it will not be reorged.

The quantity $q^n_b$ is defined as the weight of the subtree rooted at $b$ at slot $n$ divided by the total weight of votes cast since $b$ was produced. Simply put, if, in the slots since $b$ was proposed, 80% of validators voted for $b$ or a descendant of $b$ , then $q^n_b$ would be $0.8$ .

$q_\text{min}$ is defined as $\frac{1}{2} + \beta$ , where $\beta$ is the fraction of stake we believe is controlled by an adversary. This fraction is unknown, but is assumed to be less than one third, otherwise we have big problems.

Now, if $q^n_{b'} > q_\text{min}$ for $b$ and all its (non-finalised) ancestors, $b'$ , then $b$ is considered to be confirmed, or "safe".

The idea is that, once a branch up to block $b$ has accumulated a simple majority of the available voting weight, then all honest validators will continue to vote for that branch, so it will maintain its majority indefinitely. One way this can fail is if dishonest validators swap their votes to another branch, giving that branch a majority instead. That's why we add a fraction $\beta$ to the basic majority safety parameter of $\frac{1}{2}$ , so that, even if all the dishonest validators swap branches, our branch maintains a majority. The other way this can fail is if the network begins to suffer delays or partitions (loses synchrony), so that some honest validators cannot see other honest validators' votes, hence the reliance on the network remaining sufficiently synchronous.

While this seems very intuitive, there are some important subtleties and complications related to the integration with Casper FFG that modify the confirmation rule, so anyone dealing with this should consult the full paper. In addition, proposer boost makes it easier for an adversarial block proposer to reorg a block, so we need to account for that as well.

The table below summarises the differences between the confirmation rule and finality for a block.

	Confirmation	Finality
Time	One slot in ideal circumstances, less than a minute more generally.	At least two epochs / 13 minutes.
Assumptions	Network remains synchronous until finality.	No synchrony assumption.
Breakage	A confirmed block can be reorged if the network does not remain synchronous.	A conflicting block can be finalized if more than $\frac{1}{3}$ of the validators commit a slashable action.

The confirmation rule is not yet implemented in client software as I write, but should be available in due course via the safe block specification.

Incentives in LMD GHOST

One of the ways cryptoeconomic systems secure themselves is by rewarding good behaviour and penalising bad behaviour. In our implementation of LMD GHOST, both proposers and attesters are rewarded in different ways for accurately finding the head of the chain.

The incentive for a block proposer to build on the best head is clear. If it doesn't, then there's a good chance that its block will not be included in the eventual canonical chain – it will be orphaned – in which case the proposer will not receive any of its block rewards. This is an implicit incentive rather than explicit; miners in proof of work are in a similar situation.

By contrast, validators are directly rewarded for voting accurately. When a validator makes an accurate head vote, and its attestation is included in a block in the very next slot, it receives a micro reward. A perfectly performing validator will gain about 22% of its total protocol rewards from making accurate head votes. Proposers, in turn, are incentivised to include such attestations in blocks as they receive a proportionate micro reward for each one they manage to get in.

A vote is accurate if it matches what ends up in the canonical chain. So if the validator voted for a block at the previous slot, and the canonical chain has a matching block there, then it receives its reward. If it voted for a block from a previous slot, indicating a skipped slot, and the canonical chain has skipped slots between that block and the present, then it will also receive its reward.

There is no penalty if a validator makes an inaccurate head vote, or their head vote is not included on chain within one slot. Getting the head vote right is difficult when the beacon chain is under stress, and there are late and missing blocks. This is often no fault of the validator itself, and it was felt unfair to penalise them in such circumstances.⁶

Slashing in LMD GHOST

One of the big breakthroughs in proof of stake design was the adoption of slashing as a way around the "nothing at stake" problem. The problem, in essence, is that under proof of stake it is almost costless for a validator to equivocate by publishing multiple contradictory messages.

The solution turns out to be rather elegant. We detect when a validator has equivocated and punish it. The punishment is called "slashing", and entails removing a proportion of the validator's stake, and ejecting the validator from the protocol. Since validators digitally sign their messages, finding two contradictory signed messages is absolute proof of misbehaviour, so we can slash with confidence.

The name "slashing" comes from Vitalik's Slasher algorithm from early 2014. That was a very early proposal for solving the nothing at stake problem. Our current design doesn't look much like Slasher, but some things have carried over, not least the name.

Proposer slashing

When it is a validator's turn to produce a block in a particular slot, the validator is supposed to run the fork choice rule in order to decide which existing block it will build its own block on. Its goal is to identify the fork that is most likely, based on the evidence it has, to be the one that eventually becomes canonical. That is, the one that the whole set of correct validators will converge on.

The proposer of a block needs to choose which block to build on. The best strategy is to build on the block least likely to be reorged, as indicated by the fork choice rule.

However, why choose? Under proof of stake – unlike under proof of work – it is almost costless for validators to produce blocks. Therefore, a good strategy would seem to be to propose multiple blocks, one built on each possible head, then at least one of my blocks is guaranteed to become part of the eventual canonical chain.

In the absence of punishment, a lazy or dishonest proposer might choose to extend both sides of the fork.

This is undesirable because it prolongs any fork and prevents the network from converging on a linear history. Users of the chain may not be able to work out which fork is correct, and that makes them vulnerable to double spend attacks, the very thing we wish to avoid.

This illustrates the nothing at stake problem. As outlined above, the solution is to detect the two contradictory blocks and slash the validator that proposed them.

Proposer equivocation is not detected in-protocol, but relies on a third-party constructing a proof of equivocation in the form of a ProposerSlashing object. The proof comprises just two signed beacon block headers: that's enough to prove that the proposer signed off on two blocks in the same slot. A subsequent block proposer will include the proof in a block (and be well rewarded for it), and the protocol will slash the offending validator's balance and eject it from the active validator set.

Attester slashing

Similarly, when it is a validator's turn to publish an attestation, it is supposed to run its fork choice rule and vote for the head block that it thinks is best. The issue here is that an attacker could make multiple contradictory attestations in order to provoke or prolong forks and prevent the network from converging on a single chain. Even a mostly-honest validator might be tempted to make two attestations for a borderline late block: one voting for the block, one voting for an empty slot. If it weren't punishable behaviour, this would hedge the chance of voting the wrong way and missing out on a reward.

The remedy is the same: detect and punish contradictory attestations. That is, attestations made in by the same validator in the same slot for different head blocks.

History

The LMD GHOST fork choice in Ethereum has its origin's in Vlad Zamfir's work on the Casper CBC protocol (then known as Casper TFG, and not to be confused with Casper FFG, which is something quite different⁷). The initial announcement of Casper TFG was in 2015, and in his 2017 Casper the Friendly Ghost paper, Zamfir describes combining Sompolinsky and Zohar's GHOST protocol with a "latest message" construction.

In August 2018, Vitalik still favoured a fork choice called IMD GHOST (formerly known as Recursive Proximity to Justification), that was more aware of finalisation and justification than the pure LMD GHOST that we have today. As the Eth2 consensus mini-spec evolved, IMD GHOST was changed to LMD GHOST in November 2018⁸. This was due to concerns about the stability properties of IMD GHOST.

That November 2018 description of LMD GHOST in the mini-spec is essentially what we are using today.