Date: Wed, 2 Mar 2022 19:21:02 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: tytso@mit.edu, derrickstolee@github.com, gitster@pobox.com, larsxschneider@gmail.com
Subject: [PATCH v3 08/17] builtin/pack-objects.c: --cruft without expiration
Message-ID: <22705e4887b5c9e3d7ef9ff1eadaabeeac0d57da.1646266835.git.me@ttaylorr.com>

Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.

When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".
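
As a rough illustration of the paragraph above (this is not the code in
this patch): if each pack is modeled as a mapping from object ID to
mtime, a non-expiring cruft pack is every object that is absent from the
kept packs. The names `select_cruft_objects`, `kept_packs`, and
`other_sources` are hypothetical, and keeping the newest mtime for an
object seen more than once is an assumption of this sketch.

```python
from typing import Dict, Iterable

# Hypothetical model: a "pack" (or the set of loose objects) is a dict
# mapping an object ID to that object's mtime in seconds.
Pack = Dict[str, int]


def select_cruft_objects(kept_packs: Iterable[Pack],
                         other_sources: Iterable[Pack]) -> Pack:
    """Return {object ID: mtime} for every object not in a kept pack."""
    kept_ids = set()
    for pack in kept_packs:
        kept_ids.update(pack)

    cruft: Pack = {}
    for source in other_sources:
        for oid, mtime in source.items():
            if oid in kept_ids:
                continue  # stand-in for "reachable": already in a kept pack
            # Assumption: remember the most recent mtime seen for an object.
            cruft[oid] = max(mtime, cruft.get(oid, mtime))
    return cruft


kept = {"aaa": 100, "bbb": 100}     # e.g. the result of `git repack -A`
old_pack = {"bbb": 50, "ccc": 60}   # a pack that is about to be removed
loose = {"ccc": 70, "ddd": 80}      # loose objects

cruft = select_cruft_objects([kept], [old_pack, loose])
print(sorted(cruft.items()))  # [('ccc', 70), ('ddd', 80)]
```

Here `bbb` is left out because it already lives in the kept pack, and
`ccc` carries the newer of its two mtimes.
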

Generating a non-expiring cruft pack works as follows:

  - Callers provide a list of every pack they know about, and indicate
    which packs are about to be removed.

  - All packs which are not going to be removed are marked as kept
    in-core. The packs which are going to be removed (we'll call these
    the redundant ones) are left unmarked, so their objects remain
    candidates for the cruft pack.

    Any packs the caller did not mention (but are known to the
    `pack-objects` process) are also marked as kept in-core. Packs not
    mentioned by the caller are assumed to be unknown to them, i.e.,
    they entered the repository after the caller decided which packs
    should be kept and which should be discarded. We do not want to
    include objects from these "unknown" packs, because we don't know
    which of their objects are or aren't reachable.

  - Then, we enumerate all objects in the repository, and add them to
    our packing list if they do not appear in an in-core kept pack.

This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.

Signed-off-by: Taylor Blau
---
 Documentation/git-pack-objects.txt |  30 ++++
 builtin/pack-objects.c             | 201 +++++++++++++++++++++++++-
 object-file.c                      |   2 +-
 object-store.h                     |   2 +
 t/t5329-pack-objects-cruft.sh      | 218 +++++++++++++++++++++++++++++
 5 files changed, 448 insertions(+), 5 deletions(-)
 create mode 100755 t/t5329-pack-objects-cruft.sh

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index f8344e1e5b..a9995a932c 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 	[--no-reuse-delta] [--delta-base-offset] [--non-empty]
 	[--local] [--incremental] [--window=] [--depth=]
 	[--revs [--unpacked | --all]] [--keep-pack=]
+	[--cruft] [--cruft-expiration=