Featured Project II
Reliability vs Security
Privacy-Preserving: Hatching → Computing
Ostriches often seek concealed sites for incubation rather than exposed open ground (quite intuitive🙂). What is more interesting, however, is that multiple ostriches may concentrate their eggs in a shared nesting site, which lets them use this carefully prepared and protected nest more efficiently. Yet this strategy introduces a new risk: any failure during incubation could affect more eggs.

Security May Need More Reliability
Fully Homomorphic Encryption, or FHE, is a fascinating technique. It allows data to be processed while it remains encrypted, without decrypting it first, which is often called privacy-preserving computation. The ostrich example maps onto it directly: hatching eggs is like processing data, and the shared nest is like data packing. To improve efficiency, FHE often packs many data items into a single ciphertext. But this also introduces a similar risk: faults during FHE computation can be greatly amplified. For example, some AI models that are normally fault-tolerant may become unable to tolerate even a single-bit error when running under FHE.
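
To make the shared-nest risk concrete, here is a minimal, unencrypted toy in Python. It assumes a packing scheme based on a number-theoretic transform, where slot values are stored as polynomial coefficients; the modulus `Q`, slot count `N`, and the helper names `pack`, `unpack`, and `flip_bit` are all illustrative choices, not parameters of any real FHE library or of this project. It shows only one possible mechanism of amplification: a single flipped bit in one stored coefficient disturbs every packed slot.

```python
# Toy, unencrypted illustration of slot packing. All parameters are assumptions.
Q = 257                           # small prime modulus (real FHE moduli are far larger)
N = 8                             # number of packed slots
OMEGA = pow(3, (Q - 1) // N, Q)   # primitive N-th root of unity mod Q (3 generates the units mod 257)


def pack(slots):
    """Inverse transform: slot values -> polynomial coefficients."""
    n_inv = pow(N, -1, Q)             # modular inverse, needs Python 3.8+
    omega_inv = pow(OMEGA, -1, Q)
    return [
        n_inv * sum(s * pow(omega_inv, i * j, Q) for j, s in enumerate(slots)) % Q
        for i in range(N)
    ]


def unpack(coeffs):
    """Forward transform: polynomial coefficients -> slot values."""
    return [
        sum(c * pow(OMEGA, i * j, Q) for i, c in enumerate(coeffs)) % Q
        for j in range(N)
    ]


def flip_bit(coeffs, index, bit):
    """Return a copy of coeffs with one bit flipped in one coefficient."""
    faulty = list(coeffs)
    faulty[index] = (faulty[index] ^ (1 << bit)) % Q
    return faulty


slots = [10, 20, 30, 40, 50, 60, 70, 80]
coeffs = pack(slots)

# Inject a single-bit fault into one coefficient, then unpack.
faulty_slots = unpack(flip_bit(coeffs, index=3, bit=0))
corrupted = sum(a != b for a, b in zip(slots, faulty_slots))
print(f"{corrupted} of {N} slots corrupted by a single bit flip")
```

In this toy, the error in one coefficient is multiplied by a different nonzero power of `OMEGA` for each slot, so no slot escapes. Real FHE adds encryption on top of packing, which this sketch deliberately leaves out.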
Contribution
Existing FHE accelerators often overlook a simple fact: hardware cannot compute everything perfectly. More seriously, as mentioned above, FHE can make applications more fault-sensitive because of its encryption and data-packing schemes. This creates a stronger need for reliability.
We propose a resilient framework that protects both storage and computation in FHE accelerators. One key idea is to use lightweight checksum-based schemes to build efficient codewords, taking advantage of the fact that FHE data is organized as very large ciphertext polynomials. In theory, this framework can improve reliability by several orders of magnitude with only about 1% overhead.
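
The checksum idea can be sketched with a generic linear checksum over one polynomial's coefficients. This is a simplified illustration, not the framework's actual codeword construction: the modulus `Q`, degree `N`, and the functions `checksum`, `poly_add`, and `checked_add` are hypothetical names chosen for the example. It covers only coefficient-wise addition, where the checksum of the result can be predicted from the checksums of the inputs; other kernels would need their own invariants.

```python
import random

Q = (1 << 31) - 1   # hypothetical coefficient modulus (a Mersenne prime)
N = 4096            # hypothetical polynomial degree


def checksum(poly):
    """Sum of all coefficients mod Q: one extra word per polynomial."""
    return sum(poly) % Q


def poly_add(a, b):
    """Coefficient-wise addition mod Q, the kernel being protected."""
    return [(x + y) % Q for x, y in zip(a, b)]


def checked_add(a, cs_a, b, cs_b, fault=None):
    """Add two polynomials and verify the result against a predicted checksum.

    fault: optional (index, bit) pair that injects a single-bit error into
    the result, for demonstration only.
    Returns (result, predicted_checksum, ok).
    """
    result = poly_add(a, b)
    if fault is not None:
        index, bit = fault
        result[index] ^= 1 << bit
    # Linearity: checksum(a + b) == checksum(a) + checksum(b) (mod Q).
    predicted = (cs_a + cs_b) % Q
    return result, predicted, checksum(result) == predicted


rng = random.Random(0)  # fixed seed so the demo is repeatable
a = [rng.randrange(Q) for _ in range(N)]
b = [rng.randrange(Q) for _ in range(N)]
cs_a, cs_b = checksum(a), checksum(b)  # stored alongside the data

_, _, ok_clean = checked_add(a, cs_a, b, cs_b)
_, _, ok_faulty = checked_add(a, cs_a, b, cs_b, fault=(123, 17))
print("clean result passes check:", ok_clean)
print("faulty result passes check:", ok_faulty)
```

Because a polynomial holds thousands of coefficients, a single checksum word adds very little storage per polynomial in this sketch. A lone checksum only detects errors, though, and multiple faults that cancel each other could slip through; correcting errors requires a richer code.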

Limitation
- No design is perfect!
- This design is kernel-specific. It does not provide a uniform solution for all kernels. This also makes some design details harder to understand.
- This design currently focuses on FHE. We are exploring additional solutions that can support broader secure computing systems.