From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0387C433F5 for ; Fri, 5 Nov 2021 07:07:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B21426124F for ; Fri, 5 Nov 2021 07:07:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232009AbhKEHJ7 (ORCPT ); Fri, 5 Nov 2021 03:09:59 -0400 Received: from cloud.peff.net ([104.130.231.41]:53688 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229494AbhKEHJ6 (ORCPT ); Fri, 5 Nov 2021 03:09:58 -0400 Received: (qmail 10730 invoked by uid 109); 5 Nov 2021 07:07:19 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Fri, 05 Nov 2021 07:07:19 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 15088 invoked by uid 111); 5 Nov 2021 07:07:21 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Fri, 05 Nov 2021 03:07:21 -0400 Authentication-Results: peff.net; auth=none Date: Fri, 5 Nov 2021 03:07:18 -0400 From: Jeff King To: Patrick Steinhardt Cc: git@vger.kernel.org Subject: Re: [PATCH] refs: sync loose refs to disk before committing them Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Thu, Nov 04, 2021 at 01:38:22PM +0100, Patrick Steinhardt wrote: > When writing loose refs, we first create a lockfile, write the new ref > into that lockfile, close it and then rename the lockfile into place > such that the actual update is atomic for that single ref. While this > works as intended under normal circumstences, at GitLab we infrequently > encounter corrupt loose refs in repositories after a machine encountered > a hard reset. The corruption is always of the same type: the ref has > been committed into place, but it is completely empty. Yes, I've seen this quite a bit, too, and I agree that more fsync()-ing will probably help. There are two things that have made me hesitate, though: 1. This doesn't quite make the write durable. Doing that would also require fsyncing the surrounding directory (and so on up the tree). This is a much trickier problem, because we often don't even have open descriptors for those. I think we can ignore that, though. Doing an fsync() here makes things strictly better with respect to robustness (especially if you care less about "has my ref update been committed to disk" and more about "does this end up corrupt after a power failure"). 2. It's not clear what the performance implications will be, especially on a busy server doing a lot of ref updates, or on a filesystem where fsync() ends up syncing everything, not just the one file (my impression is ext3 is such a system, but not ext4). Whereas another solution may be journaling data and metadata writes in order without worrying about the durability of writing them to disk. I suspect for small updates (say, a push of one or two refs), this will have little impact. We'd generally fsync the incoming packfile and its idx anyway, so we're adding may one or two fsyncs on top of that. But if you're pushing 100 refs, that will be 100 sequential fsyncs, which may add up to quite a bit of latency. It would be nice if we could batch these by somehow (e.g., by opening up all of the lockfiles, writing and fsyncing them, and then renaming one by one). So I think we want to at least add a config knob here. That would make it easier to experiment, and provides an escape hatch. > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 151b0056fe..06a3f0bdea 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1749,6 +1749,7 @@ static int write_ref_to_lockfile(struct ref_lock *lock, > fd = get_lock_file_fd(&lock->lk); > if (write_in_full(fd, oid_to_hex(oid), the_hash_algo->hexsz) < 0 || > write_in_full(fd, &term, 1) < 0 || > + fsync(fd) < 0 || > close_ref_gently(lock) < 0) { > strbuf_addf(err, > "couldn't write '%s'", get_lock_file_path(&lock->lk)); I agree with the sentiment elsewhere that this would probably make sense for any lockfile. That does probably make batching a bit more difficult, though it could be layered on top (i.e., we could still fsync in the ref code, and the lockfile fsync-ing should then become a noop). -Peff