All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Matthew DeVore <matvore@comcast.net>
Cc: Matthew DeVore <matvore@google.com>,
	git@vger.kernel.org, jonathantanmy@google.com, jrn@google.com,
	dstolee@microsoft.com, jeffhost@microsoft.com,
	jrnieder@gmail.com, pclouds@gmail.com, emilyshaffer@google.com,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v2 4/9] list-objects-filter: implement composite filters
Date: Fri, 7 Jun 2019 13:58:01 -0400	[thread overview]
Message-ID: <5dc58484-b206-dfe0-981a-437d0a162444@jeffhostetler.com> (raw)
In-Reply-To: <20190606223251.GA7246@comcast.net>



On 6/6/2019 6:32 PM, Matthew DeVore wrote:
> On Mon, Jun 03, 2019 at 05:51:28PM -0400, Jeff Hostetler wrote:
>> Since we are assuming 'compose' is an AND operation, there may be an
>> opportunity to short-cut some of this loop for blobs.  That is, if the
>> object is a blob and any filter rejects it, it is omitted, so we don't
>> need to keep looping for that object.  (Tree objects cannot be short-cut
>> this way because a tree may appear at different depths or in different
>> sparse "cones" and may have to be reconsidered.)
> 
> Blobs are also treated almost the same way as tree objects in tree:<depth>
> filters - they can be included by tree:<depth> - so they also need to be
> reconsidered when found at different depths.
> 
> But I agree it's always true that if some prior filter has excluded a blob, the
> later filters don't even need to be *called at all* for that blob, unless
> perhaps it's found under a different tree later. I also think it may be too
> early to implement this optimization, since filter in a later release may just
> want to "know" about a blob even if it must be excluded in the final result.
> 
> Does the optimization apply to trees as well? Does a tree:<depth> filter still
> want to consider children of tree X if tree X has already been excluded by
> another filter? If it doesn't want to consider, we can short-circuit the checks
> very aggressively. If it does want to consider, we want the short-circuiting to
> be customizable at least for trees.
> 
> A minor point - I don't think that short-circuiting the for loop (breaking out
> early) is important, since it will be very rare that a combine: filter has more
> than 4 or so sub-filters anyway. Calling the filter_fn implementation and
> letting that do internal short-circuiting (informed by the previous filters'
> results) can, however, skip a lot of computation.
> 
>> So you could add an "affects blobs only" bit to the per-filter data
>> and try this out.  For example a "compose:blob:none+sparse:foo" should
>> perform better than "compose:sparse:foo+blob:none" but give the same
>> results.
> 
> Does "affects blobs only" mean the filter includes all non-blob objects?
> 

I just meant that the blobs:none and blobs:limit filters give you a hard
omit.  Other filters later in the chain cannot change or override that
answer (because of the AND assumption); it doesn't matter how deep or
shallow the blob is the tree.

In the case of the tree:depth filter, a blob deep in the tree should
be provisionally omitted in case it appears later in a shallow tree
and should be included.  The tree filter can't do a hard omit on a blob
(just like it can't do a hard omit on a tree node).

WRT your question about a later filter "just wanting to know" about
a blob, I'm not sure.

So yeah, let's wait on this.  We can always add it later as an
optimization if/when it becomes a perf problem (and we have more
experience using them in practice).

Jeff



  reply	other threads:[~2019-06-07 17:58 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-01  0:35 [PATCH v2 0/9] Filter combination Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 1/9] list-objects-filter: make API easier to use Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 2/9] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 3/9] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-01  0:35 ` [PATCH v2 4/9] list-objects-filter: implement composite filters Matthew DeVore
2019-06-03 21:51   ` Jeff Hostetler
2019-06-06 22:32     ` Matthew DeVore
2019-06-07 17:58       ` Jeff Hostetler [this message]
2019-06-01  0:35 ` [PATCH v2 5/9] list-objects-filter-options: move error check up Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 6/9] list-objects-filter-options: make filter_spec a strbuf Matthew DeVore
2019-06-10 20:13   ` Junio C Hamano
2019-06-11  0:34     ` Matthew DeVore
2019-06-11 17:33       ` Junio C Hamano
2019-06-11 18:44         ` Matthew DeVore
2019-06-11 21:34           ` Matthew DeVore
2019-06-11 21:48           ` Junio C Hamano
2019-06-12  0:37             ` Matthew DeVore
2019-06-12 14:55               ` Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 7/9] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-01  0:36 ` [PATCH v2 8/9] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-03 22:07   ` Jacob Keller
2019-06-03 22:39     ` Matthew DeVore
2019-06-04  3:16       ` Jacob Keller
2019-06-01  0:36 ` [PATCH v2 9/9] list-objects-filter-options: make parser void Matthew DeVore
2019-06-03 21:35 ` [PATCH v2 0/9] Filter combination Jeff Hostetler
2019-06-13 21:51 ` [PATCH v3 00/10] " Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 01/10] list-objects-filter: make API easier to use Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-13 21:51   ` [PATCH v3 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-14 19:50   ` [PATCH v3 00/10] Filter combination Junio C Hamano
2019-06-15  0:40 ` [PATCH v4 " Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 01/10] list-objects-filter: make API easier to use Matthew DeVore
2019-06-21 22:58     ` Jonathan Tan
2019-06-27  0:46       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-18  8:42     ` Johannes Schindelin
2019-06-18 20:22       ` Matthew DeVore
2019-06-21 18:17         ` Johannes Schindelin
2019-06-22  0:26     ` Jonathan Tan
2019-06-27 21:12       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-22  0:37     ` Jonathan Tan
2019-06-27 21:17       ` Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-15  0:40   ` [PATCH v4 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-22  0:46     ` Jonathan Tan
2019-06-27 21:24       ` Matthew DeVore
2019-06-27 22:27         ` Matthew DeVore
2019-06-18  1:25   ` [PATCH v4 00/10] Filter combination Junio C Hamano
2019-06-27 22:54 ` [PATCH v5 " Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 01/10] list-objects-filter: encapsulate filter components Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 02/10] list-objects-filter: put omits set in filter struct Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 03/10] list-objects-filter-options: always supply *errbuf Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 04/10] list-objects-filter: implement composite filters Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 05/10] list-objects-filter-options: move error check up Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 06/10] list-objects-filter-options: make filter_spec a string_list Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 07/10] strbuf: give URL-encoding API a char predicate fn Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 08/10] list-objects-filter-options: allow mult. --filter Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 09/10] list-objects-filter-options: clean up use of ALLOC_GROW Matthew DeVore
2019-06-27 22:54   ` [PATCH v5 10/10] list-objects-filter-options: make parser void Matthew DeVore
2019-06-28 16:05   ` [PATCH v5 00/10] Filter combination Junio C Hamano
2019-06-28 17:16   ` Jonathan Tan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5dc58484-b206-dfe0-981a-437d0a162444@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=dstolee@microsoft.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jonathantanmy@google.com \
    --cc=jrn@google.com \
    --cc=jrnieder@gmail.com \
    --cc=matvore@comcast.net \
    --cc=matvore@google.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.