From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 1 Mar 2022 19:58:17 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v2 08/17] builtin/pack-objects.c: --cruft without expiration
Message-ID: <2517a6be3d48a721dee6b5aa54f73b64e6abd1d6.1646182671.git.me@ttaylorr.com>

Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.

When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".
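
As a rough illustration of the "kept pack" rule described above, the
sketch below selects every object that does not appear in any kept pack.
It is a toy model only: `struct toy_pack`, `in_kept_pack()`,
`already_listed()` and `collect_cruft()` are hypothetical names invented
for this example, not the structures or functions touched by this patch,
and it ignores loose objects and mtimes entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a pack; NOT git's real data structure. */
struct toy_pack {
	const char *name;
	const char **objects;	/* object ids stored in this pack */
	size_t nr_objects;
	bool kept_in_core;	/* true: treat contents as "reachable" */
};

/* Does any kept pack contain this object id? */
static bool in_kept_pack(const struct toy_pack *packs, size_t nr_packs,
			 const char *oid)
{
	for (size_t i = 0; i < nr_packs; i++) {
		if (!packs[i].kept_in_core)
			continue;
		for (size_t j = 0; j < packs[i].nr_objects; j++)
			if (!strcmp(packs[i].objects[j], oid))
				return true;
	}
	return false;
}

/* Has this object id already been selected? */
static bool already_listed(const char **out, size_t nr, const char *oid)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(out[i], oid))
			return true;
	return false;
}

/*
 * Walk every non-kept pack and collect each object that is absent from
 * all kept packs, skipping duplicates. Returns the number of ids
 * written to `out` (at most `out_cap`).
 */
static size_t collect_cruft(const struct toy_pack *packs, size_t nr_packs,
			    const char **out, size_t out_cap)
{
	size_t nr = 0;

	assert(packs && out);
	for (size_t i = 0; i < nr_packs; i++) {
		if (packs[i].kept_in_core)
			continue;
		for (size_t j = 0; j < packs[i].nr_objects; j++) {
			const char *oid = packs[i].objects[j];

			if (in_kept_pack(packs, nr_packs, oid) ||
			    already_listed(out, nr, oid))
				continue;
			if (nr < out_cap)
				out[nr++] = oid;
		}
	}
	return nr;
}
```

Given one kept pack holding the reachable objects and any number of
other packs, `collect_cruft()` returns only the ids found outside the
kept pack, which is the set the text above calls unreachable.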

Generating a non-expiring cruft pack works as follows:

  - Callers provide a list of every pack they know about, and indicate
    which packs are about to be removed.

  - All packs which are not going to be removed are marked as kept
    in-core.

    Any packs the caller did not mention (but are known to the
    `pack-objects` process) are also marked as kept in-core. Packs not
    mentioned by the caller are assumed to be unknown to them, i.e.,
    they entered the repository after the caller decided which packs
    should be kept and which should be discarded.

    Since we do not want to include objects in these "unknown" packs
    (because we don't know which of their objects are or aren't
    reachable), these are also marked as kept in-core.

  - Then, we enumerate all objects in the repository, and add them to
    our packing list if they do not appear in an in-core kept pack.

This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.

Signed-off-by: Taylor Blau
---
 Documentation/git-pack-objects.txt |  30 ++++
 builtin/pack-objects.c             | 201 +++++++++++++++++++++++++-
 object-file.c                      |   2 +-
 object-store.h                     |   2 +
 t/t5328-pack-objects-cruft.sh      | 218 +++++++++++++++++++++++++++++
 5 files changed, 448 insertions(+), 5 deletions(-)
 create mode 100755 t/t5328-pack-objects-cruft.sh

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index f8344e1e5b..a9995a932c 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 	[--no-reuse-delta] [--delta-base-offset] [--non-empty]
 	[--local] [--incremental] [--window=] [--depth=]
 	[--revs [--unpacked | --all]] [--keep-pack=]
+	[--cruft] [--cruft-expiration=