{"ID":2859569,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.06404","arxiv_id":"2510.06404","title":"Asynchronous Checkpoint for Eventually Consistent Databases","abstract":"We focus on the problem of checkpointing (or taking a snapshot) in fully replicated eventually consistent distributed databases. In particular, we consider the problem of taking Distributed Transaction-Consistent Snapshots (DTCS). A typical example of such a system is a replicated main-memory database that provides strong eventual consistency. This problem is important and challenging for several reasons: (1) eventual consistency often creates anomalies that the users do not anticipate. Hence, frequent snapshots that can be used to ascertain desired invariants are highly beneficial in their maintenance, and (2) traditional distributed snapshot algorithms lead to significant overhead and/or inconsistencies such as storing dirty writes of incomplete transactions. A key benefit of DTCS is that it summarizes the computation by a sequence of snapshots that are strongly consistent even though the underlying computation is only weakly consistent. In essence, when anomalies arise in an eventually consistent system, DTCS enables one to concentrate solely on the snapshots surrounding the time point of the anomaly. By showing that traditional distributed snapshots lead to inconsistencies and/or excessive overhead, we define the notion of size-minimal DTCS for fully replicated databases. We present MuFASA, an algorithm for a size-minimal DTCS with minimal checkpointing overhead (only O(n) new messages and the addition of a single counter for existing messages). MuFASA also provides a significant benefit over existing checkpointing algorithms for distributed systems and replicated main-memory databases by being a fully asynchronous protocol.","short_abstract":"We focus on the problem of checkpointing (or taking a snapshot) in fully replicated eventually consistent distributed databases. In particular, we consider the problem of taking Distributed Transaction-Consistent Snapshots (DTCS). A typical example of such a system is a replicated main-memory database that provides str...","url_abs":"https://arxiv.org/abs/2510.06404","url_pdf":"https://arxiv.org/pdf/2510.06404v2","authors":"[\"Raaghav Ravishankar\",\"Sandeep Kulkarni\",\"Nitin H Vaidya\"]","published":"2025-10-07T19:29:41Z","proceeding":"cs.DC","tasks":"[\"cs.DC\"]","methods":"[]","has_code":false}
