Enhancing Threshold Security and Performance at Scale: Introducing Shadow Splicing

In the ever-evolving field of threshold cryptography, securing key shares through multi-party computation (MPC) offers powerful protection against compromise. MPC at Lit keeps threshold-based signing secure: no single party ever holds the full key, and all nodes run in trusted execution environments (TEEs). However, scaling under this security model poses significant performance challenges. As network membership grows, the signing threshold grows with it, and the communication overhead between nodes rises steeply, which reduces throughput and limits scalability. The problem to solve, then, is how to handle more client requests (performance) while maintaining security.

To address these performance constraints, various solutions have been proposed, each operating within distinct security models. While an exhaustive discussion is beyond the scope of this post, the prevailing strategies can be categorized as follows: (1) increase the number of nodes while maintaining the existing threshold, (2) replicate the network by duplicating key shares across parallel instances, or (3) develop novel cryptographic techniques. The first two approaches are effectively variations of horizontal scaling—either by expanding the existing network or by cloning it. However, both methods inherently weaken the security guarantees of the threshold, a concern that will be elaborated upon later. Despite their shortcomings, no formal research has demonstrated the existence of superior alternatives. The third approach, designing new cryptographic primitives, presents significant challenges, requiring extensive mathematical rigor, expert validation, and a deep understanding of security models. Moreover, the specialized knowledge necessary for such advancements is often inaccessible to those outside the field.

Lit, on the other hand, uses an approach similar to the first two but mitigates many of their potential pitfalls without requiring new cryptography. Lit organizes node operators into realms: discrete groups of nodes that collectively handle signing requests for the network. Each realm manages at least one keyset (a hierarchy of root keys) whose individual key shares are generated and managed by the participating nodes via a distributed key generation (DKG) procedure. This DKG process is used not only for the initial distribution of key shares but also for periodic share refreshes (adding a sharing of zero, so that every share changes while the secret stays the same, which provides proactive security) and for resharing events (when the set of participating nodes is updated).
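The share refresh described above can be illustrated with plain Shamir secret sharing. The sketch below is a toy, single-process model and not Lit's DKG: the modulus `P`, the function names `share`, `reconstruct`, and `refresh`, and the use of a single dealer for the zero sharing are all assumptions made for illustration. It shows that adding a sharing of zero changes every share, while any 6 of the 10 refreshed shares still recover the same secret.

```python
import secrets

P = 2**127 - 1  # toy prime modulus (assumption); not the field Lit uses

def share(value, t, n):
    """Shamir-share `value` among nodes 1..n so that any t shares recover it."""
    coeffs = [value % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    return {x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
            for x in range(1, n + 1)}

def reconstruct(shares):
    """Lagrange interpolation at x = 0. Used here only to check the math;
    a threshold network never reassembles the key in one place."""
    total = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def refresh(shares, t):
    """Proactive refresh: add a fresh sharing of zero to every share.
    Assumes the shares belong to nodes 1..n (hypothetical numbering)."""
    zero = share(0, t, len(shares))
    return {x: (y + zero[x]) % P for x, y in shares.items()}

secret = 424242
old = share(secret, t=6, n=10)
new = refresh(old, t=6)

first_six = {x: new[x] for x in range(1, 7)}
print(reconstruct(first_six) == secret)  # True: the secret is unchanged
print(new != old)                        # True except with negligible probability
```

In a real deployment every node would contribute its own zero sharing, so no single dealer ever knows the refresh polynomial. The single-dealer shortcut here only keeps the example short.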

At Lit, the process of creating new realms from preexisting key shares is referred to as key branching. Key branching, much like branching in Git, creates an independent lineage of keys from a shared ancestor. Even though both realms stem from the same underlying secret, their respective key shares remain incompatible, so nodes from one realm cannot collude with those of another to generate signatures. If this were not the case, that is, if realms could collude, security would indeed be weakened.

Key branching

At present, this process is carried out using Lit's backup and recovery system, which brings key shares from a preexisting realm into a newly formed one made up of at least a threshold of valid operators running the attested protocol in TEEs. Although this approach functions effectively, it necessitates a labor-intensive procedure: recovery stakeholders must provide decryption shares from existing backups to the newly created realm. Moreover, it carries a risk of security entanglement if both realms maintain compatible key shares that could be used in collusion. This is the same issue that cloning the network presents, as discussed earlier. To avoid it, the new realm's shares MUST remain mutually incompatible with those of the original realm: shares from realm A must not be compatible with shares from realm B, even though both sets represent the same secrets.

To streamline the key branching process, Lit investigated two biologically inspired strategies: Mitosis and Conjugation. Mitosis continuously expands a single realm until it can divide (growing and splitting a realm), whereas Conjugation, mimicking viral replication, copies key shares into another realm, allowing node operators to join independently (duplicating and transferring key shares). However, Mitosis suffers from the performance issues described earlier, because the threshold increases as the realm grows (e.g. going from 6-out-of-10 to 12-out-of-20). Conjugation briefly duplicates the entire key share set, effectively halving security for that duration (e.g. going from 6-out-of-10 to 6-out-of-20), which is the same issue as cloning the network or adding nodes without increasing the threshold.
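The trade-off between the two strategies comes down to simple arithmetic on the thresholds quoted above. The following sketch uses a hypothetical helper, `collusion_fraction`, to compute the fraction of share-holding nodes that must collude to sign. That fraction is only a rough proxy for security, chosen here for illustration.

```python
from fractions import Fraction

def collusion_fraction(threshold, holders):
    """Fraction of nodes holding compatible shares that must collude to sign."""
    return Fraction(threshold, holders)

# (label, threshold, nodes holding compatible shares), taken from the examples in the text
scenarios = [
    ("baseline realm", 6, 10),
    ("Mitosis, before the split", 12, 20),
    ("Conjugation, during the copy", 6, 20),
]

for label, t, n in scenarios:
    print(f"{label}: {t}-out-of-{n}, signers per request = {t}, "
          f"fraction to compromise = {collusion_fraction(t, n)}")
```

Mitosis keeps the fraction at 3/5 but doubles the number of signers per request from 6 to 12, which is where the performance cost comes from. Conjugation keeps 6 signers but drops the fraction to 3/10, half of the baseline.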

Shadow Splicing

Enter Shadow Splicing, a hybrid solution. This technique automates key branching while preserving the desired strong security guarantees, combining the best of both techniques. Here's how it works:

  1. Realm Designation:
    1. The existing realm with key shares is designated as the Shadower. This realm is already handling client signing requests.
    2. The new realm is the Shadowee. These nodes are set up ahead of time, established independently, and ready to go before key branching begins.
  2. Partial Reshare:
    1. A selected group from both realms jointly performs a resharing, resulting in unique "shadow shares" for the new realm. These shadow shares are intentionally incompatible with the original realm's shares, which maintains the desired security separation. The number of Shadower nodes in the group equals the current threshold, and the remaining nodes are drawn from the Shadowee. For example, with a 6-out-of-10 threshold (any threshold or node count works), each realm has 10 nodes and the partial reshare group is composed of 6 Shadower nodes and 4 Shadowee nodes.
  3. Progressive Replacement:
    1. A second round of resharing introduces more Shadowee nodes while maintaining the threshold and removing Shadower nodes. This subsequent reshare is conducted using only enough Shadower nodes to maintain the threshold. After this stage, the new realm can begin to independently fulfill signing requests. Continuing the previous example, with the threshold set to 6, only 2 Shadower nodes are needed to maintain the threshold of 6 since 4 Shadowee nodes have shares. Thus 4 more Shadowee nodes are added and 4 Shadower nodes are removed. This progressive replacement finishes with 8 Shadowee nodes with valid shares and only 2 Shadower nodes. Since the Shadowee realm is over the threshold of 6, it may begin service.
  4. Complete Migration:
    1. This final step replaces all remaining Shadower nodes with Shadowee nodes, establishing a completely redundant, autonomous, and secure realm.
Example of Shadow Splicing in Action

After Shadow Splicing completes, the two realms function separately without any further interaction. All of these steps rely on the same DKG reshare procedure that Lit already employs, so no additional cryptographic techniques are needed, which keeps the design simple.
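The three reshare rounds can be sketched with textbook Shamir resharing, using the 6-out-of-10 example from the steps above. This is a minimal single-process model under stated assumptions, not Lit's implementation: the modulus `P`, the helper names (`share`, `reshare`, `pick`, and so on), and the choice of node ids 1..10 for the Shadower and 11..20 for the Shadowee are all hypothetical, and the sketch omits share verification, networking, and TEEs.

```python
import secrets

P = 2**127 - 1  # toy prime modulus (assumption)

def lagrange_at_zero(xs):
    """Lagrange coefficients for interpolating at x = 0 over the points xs."""
    lam = {}
    for xi in xs:
        num, den = 1, 1
        for xj in xs:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        lam[xi] = num * pow(den, -1, P) % P
    return lam

def reconstruct(shares):
    """Only for checking the math; real nodes never reassemble the key."""
    lam = lagrange_at_zero(list(shares))
    return sum(lam[x] * y for x, y in shares.items()) % P

def share(value, t, recipients):
    """Shamir-share `value` with threshold t; node ids double as x-coordinates."""
    coeffs = [value % P] + [secrets.randbelow(P) for _ in range(t - 1)]
    return {x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
            for x in recipients}

def reshare(dealer_shares, t, recipients):
    """Each of t dealers re-shares its own share; every recipient combines
    the sub-shares it receives using the dealers' Lagrange coefficients."""
    assert len(dealer_shares) == t
    lam = lagrange_at_zero(list(dealer_shares))
    sub = {d: share(y, t, recipients) for d, y in dealer_shares.items()}
    return {r: sum(lam[d] * sub[d][r] for d in sub) % P for r in recipients}

def pick(shares, ids):
    return {x: shares[x] for x in ids}

T, N = 6, 10
shadower = list(range(1, N + 1))
shadowee = list(range(N + 1, 2 * N + 1))
secret = secrets.randbelow(P)
original = share(secret, T, shadower)  # stands in for the realm's existing shares

# 1. Partial reshare: 6 Shadower dealers; 6 Shadower + 4 Shadowee nodes receive shadow shares.
round1 = reshare(pick(original, shadower[:T]), T, shadower[:T] + shadowee[:N - T])

# 2. Progressive replacement: 2 Shadower + 4 Shadowee dealers; 2 + 8 nodes receive shares.
keep = T - (N - T)  # Shadower nodes still needed to reach the threshold (2 in this example)
round2 = reshare(pick(round1, shadower[:keep] + shadowee[:N - T]), T,
                 shadower[:keep] + shadowee[:N - keep])

# 3. Complete migration: 6 Shadowee dealers; all 10 Shadowee nodes receive shares.
final = reshare(pick(round2, shadowee[:T]), T, shadowee)

print(reconstruct(pick(round2, shadowee[:T])) == secret)  # True: 8 Shadowee nodes exceed the threshold
print(reconstruct(pick(final, shadowee[4:])) == secret)   # True: any 6 final shares work
```

Because each round draws fresh random polynomials, shares from different rounds (or from the original realm) lie on different polynomials. Mixing three original shares with three final shares therefore fails to recover the secret, except with negligible probability, which is the incompatibility property key branching requires.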

Shadow Splicing thus performs key branching, seamlessly transferring cryptographic keys between realms without compromising threshold security. Unlike the previous methods, it avoids both temporary key duplication (introducing copies of the same shares) and intervals of reduced security (e.g. going from 6-out-of-10 to 6-out-of-20). Because there is never a point at which both realms hold identical key material, the realms are isolated from the start, and the creation of a successor realm is automated, which significantly improves efficiency and scalability.

This innovation demonstrates Lit's commitment to secure, scalable threshold cryptography and marks an important milestone in maintaining robust security standards while scaling effectively. Further advancements and insights in secure, scalable cryptographic solutions will continue to be shared as they are developed.