Pung: Unobservable communication over fully untrusted infrastructure

In their paper from 2016, Angel & Setty present Pung, a PIR-based messaging system that exploits clever data structure organization to amortize the traditionally high cost of computational private information retrieval to enable metadata-free communication without trusted parties. The primary contribution of the paper is the realization that a combination of bucketing and batch coding techniques would allow users to retrieve multiple messages in a single request with sub-linear costs the majority of the time. Its iterative optimizations to the naive C-PIR design that puts it closer in throughput to mixnet systems while offering lower latency and stronger privacy guarantees.

Discussion

The paper makes the assumption that multiple, preferably a large consistent number of, potential messages are to be retrieved by each user in every epoch for amortization to be of effect. Failing which, the system is on par in computational costs and actually worse in network overhead compared to even a naive C-PIR design. This makes Pung more attractive for group chatrooms implemented as a fully connected graph (it amortizes its n2 growth) than one-to-one conversations.

Several members of the class brought up potential deployment challenges in Pung’s design, specifically its seemingly single point of failure and inability to scale. The presenter, on the contrary, opined that Pung can in fact scale exceedingly well. He noted that the database rows involved in each round of requests are independent and can be effectively sharded across multiple machines with little synchronization. Availability challenges can be resolved by replicating a shard across yet more machines that are more tightly synchronized. Brilliantly, this can happen between discrete epochs and does not further introduce database consistency challenges.

On the downside, Pung does incur quite high network costs with its numerous rounds of retrieval per epoch and, as with most private messaging schemes, require full and continuous participation by clients even if they expect no message or have none to send for the privacy guarantees to hold. This limits its usefulness to mobile clients who often have intermittent connectivity and high cellular transmission costs.

Slides for the presentation can be found here
Written on November 7, 2018