All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] rev-parse: implement object type filter
@ 2021-03-01 12:20 Patrick Steinhardt
  2021-03-01 12:20 ` [PATCH 1/7] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
                   ` (9 more replies)
  0 siblings, 10 replies; 97+ messages in thread
From: Patrick Steinhardt @ 2021-03-01 12:20 UTC (permalink / raw)
  To: git; +Cc: Christian Couder

[-- Attachment #1: Type: text/plain, Size: 3150 bytes --]

Hi,

I've recently had the usecase to retrieve all blobs introduces between
two versions which have a limit smaller than 200 bytes in order to find
all potential candidates for LFS pointers. This is currently done with
`git rev-list --objects --filter=blob:limit=200 <newrev> ^<oldrev>`, but
this is kind of inefficient: the resulting list is way too long as it
also potentially includes tags, commits and trees.

To be able to more efficiently answer this query, I've implemented
multiple things:

- A new object type filter `--filter=object:type=<type>` for
  git-rev-list(1), which is implemented both for normal graph walks and
  for the packfile bitmap index.

- Given that above usecase requires two filters (the object type
  and blob size filters), bitmap filters were extended to support
  combined filters.

- git-rev-list(1) doesn't filter user-provided objects and always prints
  them. I don't want the listed commits though and only their referenced
  potential LFS blobs. So I've added a new flag `--filter-provided`
  which marks all provided objects as not-user-provided such that they
  get filtered the same as all the other objects.

Altogether, this ends up with the following queries, both of which have
been executed in a well-packed linux.git repository:

    # Previous query which uses object names as a heuristic to filter
    # non-blob objects, which bars us from using bitmap indices because
    # they cannot print paths.
    $ time git rev-list --objects --filter=blob:limit=200 \
        --object-names --all | sed -r '/^.{,41}$/d' | wc -l
    4502300

    real 1m23.872s
    user 1m30.076s
    sys  0m6.002s

    # New query.
    $ time git rev-list --objects --filter-provided \
        --filter=object:type=blob --filter=blob:limit=200 \
        --use-bitmap-index --all | wc -l
    22585

    real 0m19.216s
    user 0m16.768s
    sys  0m2.450s

So with the new optimized query, we can both significantly reduce the
list of candidate LFS pointers and execution time.

Patrick

Patrick Steinhardt (7):
  revision: mark commit parents as NOT_USER_GIVEN
  list-objects: move tag processing into its own function
  list-objects: support filtering by tag and commit
  list-objects: implement object type filter
  pack-bitmap: implement object type filter
  pack-bitmap: implement combined filter
  rev-list: allow filtering of provided items

 Documentation/rev-list-options.txt  |   3 +
 builtin/rev-list.c                  |  14 ++++
 list-objects-filter-options.c       |  14 ++++
 list-objects-filter-options.h       |   8 ++
 list-objects-filter.c               | 116 ++++++++++++++++++++++++++++
 list-objects-filter.h               |   2 +
 list-objects.c                      |  32 +++++++-
 pack-bitmap.c                       |  71 +++++++++++++++--
 revision.c                          |   4 +-
 revision.h                          |   3 -
 t/t6112-rev-list-filters-objects.sh |  76 ++++++++++++++++++
 t/t6113-rev-list-bitmap-filters.sh  |  54 ++++++++++++-
 12 files changed, 380 insertions(+), 17 deletions(-)

-- 
2.30.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 97+ messages in thread

end of thread, other threads:[~2021-04-28  2:18 UTC | newest]

Thread overview: 97+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-01 12:20 [PATCH 0/7] rev-parse: implement object type filter Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 1/7] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 2/7] list-objects: move tag processing into its own function Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 3/7] list-objects: support filtering by tag and commit Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 4/7] list-objects: implement object type filter Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 5/7] pack-bitmap: " Patrick Steinhardt
2021-03-01 12:20 ` [PATCH 6/7] pack-bitmap: implement combined filter Patrick Steinhardt
2021-03-01 12:21 ` [PATCH 7/7] rev-list: allow filtering of provided items Patrick Steinhardt
2021-03-10 21:39 ` [PATCH 0/7] rev-parse: implement object type filter Jeff King
2021-03-11 14:38   ` Patrick Steinhardt
2021-03-11 17:54     ` Jeff King
2021-03-15 11:25   ` Patrick Steinhardt
2021-03-10 21:58 ` Taylor Blau
2021-03-10 22:19   ` Jeff King
2021-03-11 14:43     ` Patrick Steinhardt
2021-03-11 17:56       ` Jeff King
2021-03-15 13:14 ` [PATCH v2 0/8] " Patrick Steinhardt
2021-03-15 13:14   ` [PATCH v2 1/8] uploadpack.txt: document implication of `uploadpackfilter.allow` Patrick Steinhardt
2021-04-06 17:17     ` Jeff King
2021-03-15 13:14   ` [PATCH v2 2/8] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
2021-04-06 17:30     ` Jeff King
2021-04-09 10:19       ` Patrick Steinhardt
2021-03-15 13:14   ` [PATCH v2 3/8] list-objects: move tag processing into its own function Patrick Steinhardt
2021-04-06 17:39     ` Jeff King
2021-03-15 13:14   ` [PATCH v2 4/8] list-objects: support filtering by tag and commit Patrick Steinhardt
2021-03-15 13:14   ` [PATCH v2 5/8] list-objects: implement object type filter Patrick Steinhardt
2021-04-06 17:42     ` Jeff King
2021-03-15 13:14   ` [PATCH v2 6/8] pack-bitmap: " Patrick Steinhardt
2021-04-06 17:48     ` Jeff King
2021-03-15 13:14   ` [PATCH v2 7/8] pack-bitmap: implement combined filter Patrick Steinhardt
2021-04-06 17:54     ` Jeff King
2021-04-09 10:31       ` Patrick Steinhardt
2021-04-09 15:53         ` Jeff King
2021-04-09 11:17       ` Patrick Steinhardt
2021-04-09 15:55         ` Jeff King
2021-03-15 13:15   ` [PATCH v2 8/8] rev-list: allow filtering of provided items Patrick Steinhardt
2021-04-06 18:04     ` Jeff King
2021-04-09 10:59       ` Patrick Steinhardt
2021-04-09 15:58         ` Jeff King
2021-03-20 21:10   ` [PATCH v2 0/8] rev-parse: implement object type filter Junio C Hamano
2021-04-06 18:08     ` Jeff King
2021-04-09 11:14       ` Patrick Steinhardt
2021-04-09 16:05         ` Jeff King
2021-04-09 11:27   ` [PATCH v3 " Patrick Steinhardt
2021-04-09 11:27     ` [PATCH v3 1/8] uploadpack.txt: document implication of `uploadpackfilter.allow` Patrick Steinhardt
2021-04-09 11:27     ` [PATCH v3 2/8] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
2021-04-09 11:28     ` [PATCH v3 3/8] list-objects: move tag processing into its own function Patrick Steinhardt
2021-04-09 11:28     ` [PATCH v3 4/8] list-objects: support filtering by tag and commit Patrick Steinhardt
2021-04-11  6:49       ` Junio C Hamano
2021-04-09 11:28     ` [PATCH v3 5/8] list-objects: implement object type filter Patrick Steinhardt
2021-04-09 11:28     ` [PATCH v3 6/8] pack-bitmap: " Patrick Steinhardt
2021-04-09 11:28     ` [PATCH v3 7/8] pack-bitmap: implement combined filter Patrick Steinhardt
2021-04-09 11:28     ` [PATCH v3 8/8] rev-list: allow filtering of provided items Patrick Steinhardt
2021-04-09 11:32       ` [RESEND PATCH " Patrick Steinhardt
2021-04-09 15:00       ` [PATCH " Philip Oakley
2021-04-12 13:15         ` Patrick Steinhardt
2021-04-11  6:02     ` [PATCH v3 0/8] rev-parse: implement object type filter Junio C Hamano
2021-04-12 13:12       ` Patrick Steinhardt
2021-04-12 13:37     ` [PATCH v4 0/8] rev-list: " Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 1/8] uploadpack.txt: document implication of `uploadpackfilter.allow` Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 2/8] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 3/8] list-objects: move tag processing into its own function Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 4/8] list-objects: support filtering by tag and commit Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 5/8] list-objects: implement object type filter Patrick Steinhardt
2021-04-13  9:57         ` Ævar Arnfjörð Bjarmason
2021-04-13 10:43           ` Andreas Schwab
2021-04-14 11:32           ` Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 6/8] pack-bitmap: " Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 7/8] pack-bitmap: implement combined filter Patrick Steinhardt
2021-04-12 13:37       ` [PATCH v4 8/8] rev-list: allow filtering of provided items Patrick Steinhardt
2021-04-13  7:45       ` [PATCH v4 0/8] rev-list: implement object type filter Jeff King
2021-04-13  8:06         ` Patrick Steinhardt
2021-04-15  9:42           ` Jeff King
2021-04-16 22:06             ` Junio C Hamano
2021-04-16 23:15               ` Junio C Hamano
2021-04-17  1:17                 ` Ramsay Jones
2021-04-17  9:01                   ` Jeff King
2021-04-17 21:45                     ` Junio C Hamano
2021-04-13 21:03         ` Junio C Hamano
2021-04-14 11:59           ` Patrick Steinhardt
2021-04-14 21:07             ` Junio C Hamano
2021-04-15  9:57               ` Jeff King
2021-04-15 17:53                 ` Junio C Hamano
2021-04-15 17:57                   ` Junio C Hamano
2021-04-17  8:58                     ` Jeff King
2021-04-19 11:46       ` [PATCH v5 " Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 1/8] uploadpack.txt: document implication of `uploadpackfilter.allow` Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 2/8] revision: mark commit parents as NOT_USER_GIVEN Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 3/8] list-objects: move tag processing into its own function Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 4/8] list-objects: support filtering by tag and commit Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 5/8] list-objects: implement object type filter Patrick Steinhardt
2021-04-19 11:46         ` [PATCH v5 6/8] pack-bitmap: " Patrick Steinhardt
2021-04-19 11:47         ` [PATCH v5 7/8] pack-bitmap: implement combined filter Patrick Steinhardt
2021-04-19 11:47         ` [PATCH v5 8/8] rev-list: allow filtering of provided items Patrick Steinhardt
2021-04-19 23:16         ` [PATCH v5 0/8] rev-list: implement object type filter Junio C Hamano
2021-04-23  9:13           ` Jeff King
2021-04-28  2:18             ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.