Push-Button Verification of File Systems via Crash Refinement.

We will be discussing Sigurbjarnarson, Helgi, et al. “Push-Button Verification of File Systems via Crash Refinement.” OSDI. Vol. 16. 2016.

Problem

File systems are a huge component of operating systems. There is difficulty in properly implementing a file system that is bug free. The reason for this is because file systems are complex data structures that need to be resilient to system crashes or power failures. Currently verifying a file system has a huge proof burden. There is a lot of time spent creating proofs and testing against the file system.

The main problems that the authors attempt to address are:

  • Difficult to create bug-free file systems
  • Proof burden in current verification methods

Solution

The authors attempt to solve this problem by using crash refinement. Crash refinement requires the set of possible disk states produced by an implementation to be a subset of those allowed by the developers specifications. Crash refinement captures the presence of non-determinism. The authors encapsulate this with their toolkit Yggdrasil.

Method

Yggdrasil is the authors’ implementation of a push-button version to verify file systems. Yggdrasil is specified to be:

  • a verification process based on crash refinement
  • uses fully automated reasoning
  • for single threaded systems
  • provides optimizations and reverifies the code
  • asks for 3 inputs:
    • specification of expected behavior
    • implememntation
    • consistency invariants indicating whether a file system image is in a consistent state

Yggdrasil has been used to test YminLFS and Yxv6.

  • YminLFS is a log-structured file system that is minimal to construct, but still tricky. Yggdrasil found 2 bugs when implementing this.

  • Yxv6 is a journaling file system. Yxv6 takes advantage of multiple layers to keep a modular design that is simple to verify.

Takeaway

The authors present a push-button method of verification with Yggdrasil that takes advantage of crash refinement. The authors tested this implememntation with 2 file systems and showed its success rate. There are limitations, however, such as single threaded systems, reliance on other ideas (such as Z3), and the inability to repair corrupted file systems.

Get the presentation slides here

Written on October 3, 2018