Hi,

I have a copy of the Gitea application repository [1] on my local Git
server on my computer. I was playing with partial clones using that
repository as the remote.

I began with a blobless clone of the repo:

  $ git clone https://<myhost>/bagas/gitea.git gitea --filter=blob:none

Then I tried to fetch the missing objects. First, I gathered a list of
them:

  $ git rev-list --objects --all --missing=print | grep -o -P '^\?\K\w+' > .git/missing.list

I had asked before on this list how to properly fetch the objects from
the list above, and brian m. carlson (CC'ed) suggested [2] that I
should use xargs:

  $ xargs git fetch origin < .git/missing.list

I expected to receive all the missing objects. However, I got an error
message like the one below, repeated:

> remote: ...<skipped>
> Receiving objects: 100% (64/64), 154.49 KiB | 2.97 MiB/s, done.
> remote: ...<skipped>
> Receiving objects: 100% (37/37), 168.35 KiB | 4.95 MiB/s, done.
> Resolving deltas: 100% (5/5), done.
> Resolving deltas: 100% (49/49), completed with 47 local objects.
> fatal: bad revision 'd5e9cd36ab21839af3d116eff3221c53f6ca7fd6'
> error: https://<myhost>/bagas/gitea.git did not send all necessary objects

Then I repeated the list gathering and fetching above, and the error
still occurred. Even fetching just one of the problematic objects
errored.

When I inspected that object with `git cat-file -p` in my other, full
clone (which I use for submitting changes upstream), the object was
displayed properly (in this case, as Go source code).

Note that I have uploadpack.allowfilter=true configured on the server.

Am I missing something?

[1]: https://github.com/go-gitea/gitea
[2]: https://lore.kernel.org/git/YD7bczBsIR5rkqfc@camp.crustytoothpaste.net/

-- 
An old man doll... just what I always wanted! - Clara
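The list-then-fetch steps above can be combined into a single pipeline. The sketch below is illustrative only: it builds a throwaway local repository in place of the real remote, and the repository contents are made up.

```shell
# Build a throwaway blobless clone (stand-in for the real remote),
# then list every missing object and fetch them in batches via xargs.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
echo content >"$tmp/repo/file"
git -C "$tmp/repo" add file
git -C "$tmp/repo" -c user.name=t -c user.email=t@example.com \
	commit -q -m base
git -C "$tmp/repo" config uploadpack.allowfilter true
git clone -q --no-local --bare --filter=blob:none "$tmp/repo" "$tmp/clone"

cd "$tmp/clone"
git rev-list --objects --all --missing=print |
	sed -n 's/^?//p' |            # keep only the "?<oid>" lines
	xargs -r git fetch -q origin  # xargs splits long lists itself
```

Against a server that triggers the bug discussed later in this thread the fetch can fail; against a fixed server it leaves no missing objects behind.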
On 5/7/21 8:56 AM, Bagas Sanjaya wrote:
> I began with blobless clone the repo by:
>
> $ git clone https://<myhost>/bagas/gitea.git gitea --filter=blob:none
>
> [...]
>
> $ xargs git fetch origin < .git/missing.list
>
> I expected that I received all missing objects. However, the error message
> I got was something like below, repeated:
>
>> fatal: bad revision 'd5e9cd36ab21839af3d116eff3221c53f6ca7fd6'
>> error: https://<myhost>/bagas/gitea.git did not send all necessary
>> objects
>
> [...]
>
> Am I missing something?

I'm not sure why you're getting that error, but you might also try
enabling `uploadpack.allowAnySHA1InWant` (or one of the other
"allow*SHA1InWant" values) and see if that helps.

https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackallowAnySHA1InWant

Jeff
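For concreteness, the suggestion above is a server-side setting on the repository being served. A minimal sketch (the bare repository path here is purely illustrative):

```shell
# Hypothetical server-side setup: allow partial-clone filters and let
# clients request arbitrary object IDs, not just ref tips.
repo=$(mktemp -d)/repo.git
git init -q --bare "$repo"
git -C "$repo" config uploadpack.allowFilter true
git -C "$repo" config uploadpack.allowAnySHA1InWant true
git -C "$repo" config --get uploadpack.allowanysha1inwant  # prints "true"
```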
On Tue, May 11, 2021 at 02:44:52PM -0400, Jeff Hostetler wrote:
> > > remote: ...<skipped>
> > > Receiving objects: 100% (64/64), 154.49 KiB | 2.97 MiB/s, done.
> > > remote: ...<skipped>
> > > Receiving objects: 100% (37/37), 168.35 KiB | 4.95 MiB/s, done.
> > > Resolving deltas: 100% (5/5), done.
> > > Resolving deltas: 100% (49/49), completed with 47 local objects.
> > > fatal: bad revision 'd5e9cd36ab21839af3d116eff3221c53f6ca7fd6'
> > > error: https://<myhost>/bagas/gitea.git did not send all necessary
> > > objects
> [...]
>
> I'm not sure why you're getting that error, but you might
> also try enabling `uploadpack.allowAnySHA1InWant`
> (or one of the other "allow*SHA1InWant" values) and see if
> that helps.
That shouldn't be necessary these days, since the v2 protocol allows
arbitrary objects to be fetched.
I think it is actually a bug with pack-objects not sending the object,
but it only seems to trigger with bitmaps. This works:
git init repo
cd repo
echo content >file
git add file
git commit -m base
git config uploadpack.allowfilter true
git clone --no-local --bare --filter=blob:none . clone
cd clone
git fetch origin $(git rev-parse HEAD:file)
But if I add a "git repack -adb" in the parent repository before the "cd
clone", then I get:
remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
error: /home/peff/tmp/repo/. did not send all necessary objects
So presumably this is a bug in the bitmap-aware filtering code that is
not present in the regular filter-traversing code. But what really
puzzles me is that the result seems totally broken. Yet the test
coverage in t5310 passes, and nobody has noticed on major sites like
GitHub (which supports partial clones and most certainly has bitmaps
enabled).
So I think it will require some digging. My _guess_ is that it has to do
with the "never filter out an object that was explicitly requested" rule
not being consistently followed. But we'll see.
-Peff
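The recipe above can be run as one self-contained script. This is a sketch (the temporary paths and committer identity are incidental); on affected versions the final fetch fails as described, while on fixed versions it succeeds:

```shell
# Reproduce the bitmap case: one commit, repacked with bitmaps, then a
# blobless clone that fetches the missing blob back.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
echo content >"$tmp/repo/file"
git -C "$tmp/repo" add file
git -C "$tmp/repo" -c user.name=t -c user.email=t@example.com \
	commit -q -m base
git -C "$tmp/repo" repack -adb -q        # write a pack plus .bitmap
git -C "$tmp/repo" config uploadpack.allowfilter true
git clone -q --no-local --bare --filter=blob:none "$tmp/repo" "$tmp/clone"

cd "$tmp/clone"
blob=$(git rev-parse HEAD:file)
git fetch -q origin "$blob" &&
	echo "fetched $blob" ||
	echo "did not receive $blob (buggy version)"
```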
[+cc Patrick, who also worked in this area recently and is probably
interested to hear about this bug]

On Thu, May 13, 2021 at 05:57:57AM -0400, Jeff King wrote:

> I think it is actually a bug with pack-objects not sending the object,
> but it only seems to trigger with bitmaps. This works:
>
> git init repo
> cd repo
>
> echo content >file
> git add file
> git commit -m base
>
> git config uploadpack.allowfilter true
> git clone --no-local --bare --filter=blob:none . clone
>
> cd clone
> git fetch origin $(git rev-parse HEAD:file)
>
> But if I add a "git repack -adb" in the parent repository before the "cd
> clone", then I get:
>
> remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
> fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
> error: /home/peff/tmp/repo/. did not send all necessary objects
>
> So presumably this is a bug in the bitmap-aware filtering code that is
> not present in the regular filter-traversing code. But what really
> puzzles me is that the result seems totally broken. Yet the test
> coverage in t5310 passes, and nobody has noticed on major sites like
> GitHub (which supports partial clones and most certainly has bitmaps
> enabled).
>
> So I think it will require some digging. My _guess_ is that it has to do
> with the "never filter out an object that was explicitly requested" rule
> not being consistently followed. But we'll see.

OK, I think I've unraveled the mysteries.

It is indeed a problem with the "never filter out an object that was
explicitly requested" rule. But really, that rule is stronger: it is
"always include an explicitly requested object". I.e., it must override
even the usual "you said commit X was uninteresting, so we did not
include objects reachable from X" logic.

And that is what was happening here. In that final fetch, the client
says: "I have the commit pointed to by HEAD, but I need the blob from
HEAD:file". So the server's bitmap traversal, even before we hit the
blob:none filtering, has cleared the bit for HEAD:file's object! So
there is nothing to filter, but the filtering code must _add it back
in_ to the result. And indeed, if we are adding it back in anyway, we
do not need to exclude it when filtering in the first place.

So I think the fix looks something like this:

diff --git a/pack-bitmap.c b/pack-bitmap.c
index d90e1d9d8c..8458eedbdb 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -797,8 +797,6 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	for (i = 0, init_type_iterator(&it, bitmap_git, type);
 	     i < to_filter->word_alloc &&
 	     ewah_iterator_next(&mask, &it);
 	     i++) {
-		if (i < tips->word_alloc)
-			mask &= ~tips->words[i];
 		to_filter->words[i] &= ~mask;
 	}
 
@@ -810,11 +808,18 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	for (i = 0; i < eindex->count; i++) {
 		uint32_t pos = i + bitmap_git->pack->num_objects;
 		if (eindex->objects[i]->type == type &&
-		    bitmap_get(to_filter, pos) &&
-		    !bitmap_get(tips, pos))
+		    bitmap_get(to_filter, pos))
 			bitmap_unset(to_filter, pos);
 	}
 
+	/*
+	 * Add back in any bits that were explicitly asked for (we may have
+	 * cleared them as part of our filtering just now, or they may
+	 * have been dropped because they were reachable from the "have"
+	 * side of the bitmap).
+	 */
+	bitmap_or(to_filter, tips);
+
 	bitmap_free(tips);
 }

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index b02838750e..d34f086228 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -214,6 +214,16 @@ test_expect_success 'partial clone from bitmapped repository' '
 	)
 '
 
+test_expect_success 'follow-up fetch from bitmapped repository' '
+	test_config uploadpack.allowfilter true &&
+	(
+		cd partial-clone.git &&
+		blob=$(git rev-parse HEAD:file-1.t) &&
+		git fetch origin "$blob" &&
+		git cat-file -t "$blob"
+	)
+'
+
 test_expect_success 'setup further non-bitmapped commits' '
 	test_commit_bulk --id=further 10
 '

That fixes the example I showed above, as well as the included test in
t5310. Some follow-on thoughts:

  - this covers blob:none, and indeed any type-exclusion filter, as
    they share this code. But we probably need a similar fix for other
    filters (like the blob-limit filter). I think this could even be
    hoisted out to the filter_bitmap() function, and then we only have
    to do it in one place.

  - alternatively, the actual bitmap traversal could be smart enough to
    say "this is an explicitly-requested tip object; make sure we
    _don't_ mark it as uninteresting". Which is exactly what the
    non-bitmap traversal code does (I think this is the USER_GIVEN
    stuff). And we even try to build on that code by passing the
    list_objects_filter struct along, but:

      - that is only for fill-in traversal; an on-disk bitmap means we
        might touch those bits without calling into the list-objects
        code at all.

      - we don't feed the list-objects code all the information it
        needs anyway. We do two separate traversals, one for the
        "haves" bitmap and one for the "wants", and then AND-NOT the
        former into the latter. So when we do that "haves" traversal,
        those objects are _not_ being marked as UNINTERESTING. They are
        considered positive tips from the perspective of the traversal
        code.

    So I think we really do need to manipulate the bits in the bitmap
    to get the right answer. But it seems like there's a big
    optimization opportunity we're missing here. If the other side says
    "I don't have this commit, but I need this blob" (as our examples
    do), then there is no point in figuring out the "haves" bitmap at
    all! We know that it will not be used. In practice I'm not sure how
    big a deal this is (if the other side asked for any commit, we'd
    need it, and in theory it's quick to compute because of the on-disk
    bitmaps). At any rate, I think we should focus on correctness first
    here.

  - as to the mystery of why nobody has noticed this breakage: it's
    because the usual code to fetch-on-demand an object that we are
    missing _doesn't_ send any other "we already have this commit"
    tips. It just says "hey, give me these objects". So a workaround in
    the meantime is:

      git -c fetch.negotiationAlgorithm=noop fetch origin ...

I'm going to sleep on it, but I think the fix above is the right
direction. I just need to look at the other filters, and add a bit of
polish.

-Peff
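The noop-negotiation workaround can be exercised end to end (a sketch; the throwaway paths are illustrative). Sending no "have" lines at all means the server never computes the "haves" side that swallows the requested blob:

```shell
# Fetch a missing blob from a bitmapped server without sending any
# "have" lines (noop negotiation), sidestepping the bug.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
echo content >"$tmp/repo/file"
git -C "$tmp/repo" add file
git -C "$tmp/repo" -c user.name=t -c user.email=t@example.com \
	commit -q -m base
git -C "$tmp/repo" repack -adb -q       # server side has bitmaps
git -C "$tmp/repo" config uploadpack.allowfilter true
git clone -q --no-local --bare --filter=blob:none "$tmp/repo" "$tmp/clone"

blob=$(git -C "$tmp/clone" rev-parse HEAD:file)
git -C "$tmp/clone" -c fetch.negotiationAlgorithm=noop fetch -q origin "$blob"
git -C "$tmp/clone" cat-file -t "$blob"  # prints "blob"
```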
[+cc Jonathan Tan for partial clone wisdom]

On Thu, May 13, 2021 at 06:53:36AM -0400, Jeff King wrote:

> > So I think it will require some digging. My _guess_ is that it has to do
> > with the "never filter out an object that was explicitly requested" rule
> > not being consistently followed. But we'll see.
>
> OK, I think I've unraveled the mysteries.

Nope. This part is wrong:

> It is indeed a problem with the "never filter out an object that was
> explicitly requested" rule. But really, that rule is stronger: it is
> "always include an explicitly requested object". I.e., it must override
> even the usual "you said commit X was uninteresting, so we did not
> include objects reachable from X" logic.

The rule really is the softer "don't filter out explicitly-requested
objects". It's just that the non-bitmap traversal code is way less
careful about finding uninteresting objects.

Here's the same scenario failing without using bitmaps at all:

  # same as before; repo with one commit
  git init repo
  cd repo

  echo content >file
  git add file
  git commit -m base

  # and now we make a partial clone omitting the blob
  git config uploadpack.allowfilter true
  git clone --no-local --bare --filter=blob:none . clone

  # but here's the twist. Now we make a second commit...
  echo more >file
  git commit -am more

  # ...and then we fetch both the object we need _and_ that second
  # commit. That causes pack-objects to traverse from base..more.
  # The boundary is at "base", so we mark its tree and blob as
  # UNINTERESTING, and thus we _don't_ send them.
  cd clone
  git fetch origin $(git rev-parse HEAD:file) HEAD

So I guess the first question is: is this supposed to work? Without
bitmaps, it often will. Because we walk commits first, and then only
mark trees uninteresting at the boundary; so if there were more commits
here, and we were asking to get a blob from one of the middle ones, it
would probably work. But fundamentally the client is lying to the
server here (as all partial clones must); it is saying "I have that
first commit", but of course we don't have all of the reachable
objects.

If this is supposed to work, I think we need to teach the traversal
code to "add back" all of the objects that were explicitly given when a
filter is in use (either explicitly, or perhaps just clearing or
avoiding the UNINTERESTING flag on user-given objects in the first
place). And my earlier patch does that for the bitmap side, but not the
regular traversal.

Alternatively, part of that fundamental "partial clones are lying about
their reachability" property is that we assume they can fetch the
objects they need on-demand. And indeed, the on-demand fetch we'd do
will use the noop negotiation algorithm, and will succeed. So should
we, after receiving an empty pack from the other side (because we
claimed to have objects reachable from the commit), then do a follow-up
on-demand fetch? If so, then why isn't that kicking in (I can guess
there might be some limits to on-demand fetching during a fetch itself,
in order to avoid infinite recursion)?

-Peff
On 13/05/21 16.57, Jeff King wrote:
> I think it is actually a bug with pack-objects not sending the object,
> but it only seems to trigger with bitmaps. This works:
>
> git init repo
> cd repo
>
> echo content >file
> git add file
> git commit -m base
>
> git config uploadpack.allowfilter true
> git clone --no-local --bare --filter=blob:none . clone
>
> cd clone
> git fetch origin $(git rev-parse HEAD:file)
>
> But if I add a "git repack -adb" in the parent repository before the "cd
> clone", then I get:
>
> remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
> fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
> error: /home/peff/tmp/repo/. did not send all necessary objects
>
> So presumably this is a bug in the bitmap-aware filtering code that is
> not present in the regular filter-traversing code. But what really
> puzzles me is that the result seems totally broken. Yet the test
> coverage in t5310 passes, and nobody has noticed on major sites like
> GitHub (which supports partial clones and most certainly has bitmaps
> enabled).
I can reproduce using your test case above, thanks.
Anyway, for the time being I clone using "git clone --depth <N>" and
then deepen the clone with "git fetch --deepen <N>" iteratively until
it is complete. I do that because of quota constraints on my network.
--
An old man doll... just what I always wanted! - Clara
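The deepen-in-steps approach described above can be sketched as a loop against a throwaway local repository (the five commits, the step size of 2, and the iteration cap are all illustrative):

```shell
# Start from a depth-1 clone and deepen in fixed-size steps until the
# repository is no longer shallow (i.e., full history has arrived).
tmp=$(mktemp -d)
git init -q "$tmp/repo"
for i in 1 2 3 4 5; do
	echo "$i" >"$tmp/repo/file"
	git -C "$tmp/repo" add file
	git -C "$tmp/repo" -c user.name=t -c user.email=t@example.com \
		commit -q -m "commit $i"
done

git clone -q --no-local --depth 1 "$tmp/repo" "$tmp/shallow"
for try in 1 2 3 4 5; do
	[ "$(git -C "$tmp/shallow" rev-parse --is-shallow-repository)" = true ] || break
	git -C "$tmp/shallow" fetch -q --deepen=2   # grab a bounded slice
done
git -C "$tmp/shallow" rev-list --count HEAD   # all five commits present
```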
On Fri, May 14, 2021 at 03:27:12AM -0400, Jeff King wrote:

> [+cc Jonathan Tan for partial clone wisdom]
>
> [...]
>
> The rule really is the softer "don't filter out explicitly-requested
> objects". It's just that the non-bitmap traversal code is way less
> careful about finding uninteresting objects.
>
> Here's the same scenario failing without using bitmaps at all:
>
> [...]
>
> So I guess the first question is: is this supposed to work? Without
> bitmaps, it often will. Because we walk commits first, and then only
> mark trees uninteresting at the boundary; so if there were more commits
> here, and we were asking to get a blob from one of the middle ones, it
> would probably work. But fundamentally the client is lying to the server
> here (as all partial clones must); it is saying "I have that first
> commit", but of course we don't have all of the reachable objects.

I'm not really in a position to judge partial clones given that I have
next to no experience with them, so I'm going to skip commenting on
this part.

> If this is supposed to work, I think we need to teach the traversal code
> to "add back" all of the objects that were explicitly given when a
> filter is in use (either explicitly, or perhaps just clearing or
> avoiding the UNINTERESTING flag on user-given objects in the first
> place). And my earlier patch does that for the bitmap side, but not the
> regular traversal.

I think adding these objects back in after computations is the easiest
thing we can do here. For bitmaps, it should be relatively easy to do.
The only thing I wonder about in your patch is whether we should really
do this in a specific filter, or if we can't just do it at the end
after all filters have been computed.

For the traversal-based code, it feels like the current design with the
`NOT_USER_GIVEN` flag is making this harder than necessary. I really
wonder whether we should refactor the code to use a positive
`USER_GIVEN` flag again: it's much more explicit, there's only one site
where we need to add it, and ultimately it's a lot easier to reason
about. Furthermore, we can easily avoid marking such an object as
`UNINTERESTING` if we see that it was explicitly added to the set while
still marking its referenced objects as `UNINTERESTING` if need be.

Patrick

> Alternatively, part of that fundamental "partial clones are lying about
> their reachability" property is that we assume they can fetch the
> objects they need on-demand. And indeed, the on-demand fetch we'd do
> will use the noop negotiation algorithm, and will succeed. So should we,
> after receiving an empty pack from the other side (because we claimed to
> have objects reachable from the commit), then do a follow-up on-demand
> fetch? If so, then why isn't that kicking in (I can guess there might be
> some limits to on-demand fetching during a fetch itself, in order to
> avoid infinite recursion)?
>
> -Peff
On Mon, May 17, 2021 at 08:03:22AM +0200, Patrick Steinhardt wrote:

> > If this is supposed to work, I think we need to teach the traversal code
> > to "add back" all of the objects that were explicitly given when a
> > filter is in use (either explicitly, or perhaps just clearing or
> > avoiding the UNINTERESTING flag on user-given objects in the first
> > place). And my earlier patch does that for the bitmap side, but not the
> > regular traversal.
>
> I think adding these objects back in after computations is the easiest
> thing we can do here. For bitmaps, it should be relatively easy to do.
> The only thing I wonder about in your patch is whether we should really
> do this in a specific filter, or if we can't just do it at the end after
> all filters have been computed.

We definitely should do it at the end for all filters; my patch was
just illustrating the fix for this specific problem. It was actually
while cleaning it up to work on all filters that I dug further and
found the non-bitmap behavior.

> For the traversal-based code, it feels like the current design with the
> `NOT_USER_GIVEN` flag is making this harder than necessary. I really
> wonder whether we should again refactor the code to use a positive
> `USER_GIVEN` flag again: it's much more explicit, there's only one site
> where we need to add it and ultimately it's a lot easier to reason
> about. Furthermore, we can easily avoid marking such an object as
> `UNINTERESTING` if we see that it was explicitly added to the set while
> still marking its referenced objects as `UNINTERESTING` if need be.

That's my gut feeling, too. But I'm a little afraid that it would be
opening up a can of worms that we attempted to fix with 99c9aa9579
(revision: mark non-user-given objects instead, 2018-10-05) in the
first place.

For the purposes of this particular fix, though, we might be able to
skip this question either way with the "add back in" strategy. The
objects found in revs->pending right before prepare_revision_walk()
make up the "user given" list, so we could possibly just make use of
that (possibly stashing it away as necessary).

Thanks for the response. I'll be curious to hear the thoughts of folks
like Jonathan Tan who were involved in the original partial-clone
design (i.e., is this behavior of filters a surprise to them, or just a
subtle corner of the initial design).

-Peff
On 5/14/2021 3:27 AM, Jeff King wrote:

> [+cc Jonathan Tan for partial clone wisdom]
>
> On Thu, May 13, 2021 at 06:53:36AM -0400, Jeff King wrote:
>
>>> So I think it will require some digging. My _guess_ is that it has to do
>>> with the "never filter out an object that was explicitly requested" rule
>>> not being consistently followed. But we'll see.
>>
>> OK, I think I've unraveled the mysteries.
>
> Nope. This part is wrong:
>
>> It is indeed a problem with the "never filter out an object that was
>> explicitly requested" rule. But really, that rule is stronger: it is
>> "always include an explicitly requested object". I.e., it must override
>> even the usual "you said commit X was uninteresting, so we did not
>> include objects reachable from X" logic.
>
> The rule really is the softer "don't filter out explicitly-requested
> objects". It's just that the non-bitmap traversal code is way less
> careful about finding uninteresting objects.
>
> Here's the same scenario failing without using bitmaps at all:
> ...
> # ...and then we fetch both the object we need _and_ that second
> # commit. That causes pack-objects to traverse from base..more.
> # The boundary is at "base", so we mark its tree and blob as
> # UNINTERESTING, and thus we _don't_ send them.
> cd clone
> git fetch origin $(git rev-parse HEAD:file) HEAD

This is the critical reason why this isn't failing in practice: it is
very rare to ask for a commit and a blob at the same time. Usually, a
blob request comes when something like 'git checkout' is actually
navigating to a commit, and then it only asks for missing blobs.

> So I guess the first question is: is this supposed to work? Without
> bitmaps, it often will. Because we walk commits first, and then only
> mark trees uninteresting at the boundary; so if there were more commits
> here, and we were asking to get a blob from one of the middle ones, it
> would probably work. But fundamentally the client is lying to the server
> here (as all partial clones must); it is saying "I have that first
> commit", but of course we don't have all of the reachable objects.

It _should_ work. We should be specifying the blob:none filter, so when
we say "we have these commits" it should apply that filter to those
commits for the "haves". If there was room to adjust the protocol more,
it would perhaps be helpful to annotate the haves in this way
("have-filtered" or something, similar to how the "shallow"
advertisement works).

> If this is supposed to work, I think we need to teach the traversal code
> to "add back" all of the objects that were explicitly given when a
> filter is in use (either explicitly, or perhaps just clearing or
> avoiding the UNINTERESTING flag on user-given objects in the first
> place). And my earlier patch does that for the bitmap side, but not the
> regular traversal.

This gets more complicated if we were in a treeless clone, for example.
We could explicitly ask for a tree and really need all of its reachable
trees and blobs. It's not enough to just isolate that single object.

I wouldn't spend too much time optimizing for the treeless clone case,
as I believe the client will self-recover by asking for those reachable
trees when it walks to them.

> Alternatively, part of that fundamental "partial clones are lying about
> their reachability" property is that we assume they can fetch the
> objects they need on-demand. And indeed, the on-demand fetch we'd do
> will use the noop negotiation algorithm, and will succeed. So should we,
> after receiving an empty pack from the other side (because we claimed to
> have objects reachable from the commit), then do a follow-up on-demand
> fetch? If so, then why isn't that kicking in (I can guess there might be
> some limits to on-demand fetching during a fetch itself, in order to
> avoid infinite recursion)?

The client has an opportunity to be more robust here, especially in the
case of a promisor remote. A simple retry would be a good band-aid for
now. A more involved case might be to isolate objects that we have
identified as reachable from a local object, and set those aside for a
separate request. Here, we could ask for the blob in a different
request than for the remote's HEAD reference.

Thanks,
-Stolee
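The "simple retry" band-aid could live entirely on the client. Below is a purely hypothetical wrapper (not an existing git option or feature): try a normal fetch, and on failure retry with the noop negotiation algorithm, which is what the on-demand promisor fetch already uses.

```shell
# Hypothetical client-side band-aid: if a normal fetch fails, retry it
# without sending any "have" lines, like the on-demand fetch does.
fetch_with_retry () {
	remote=$1
	shift
	git fetch "$remote" "$@" ||
		git -c fetch.negotiationAlgorithm=noop fetch "$remote" "$@"
}
```

In a partial clone one would call it as, e.g., `fetch_with_retry origin <blob-oid>`.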
On Mon, May 17, 2021 at 12:25:02PM -0400, Derrick Stolee wrote:

> > Here's the same scenario failing without using bitmaps at all:
> >
> > ...
> > # ...and then we fetch both the object we need _and_ that second
> > # commit. That causes pack-objects to traverse from base..more.
> > # The boundary is at "base", so we mark its tree and blob as
> > # UNINTERESTING, and thus we _don't_ send them.
> > cd clone
> > git fetch origin $(git rev-parse HEAD:file) HEAD
>
> This is the critical reason why this isn't failing in practice: it
> is very rare to ask for a commit and a blob at the same time. Usually,
> a blob request comes when something like 'git checkout' is actually
> navigating to a commit, and then it only asks for missing blobs.

Yes, I think that may be part of it. But more fundamentally, the
request done by "git checkout" does not send any "have" lines at all,
so it could never trigger this bug, even if it did try to ask for a
commit.

> > So I guess the first question is: is this supposed to work? [...]
>
> It _should_ work. We should be specifying the blob:none filter, so when
> we say "we have these commits" it should apply that filter to those
> commits for the "haves".

I guess my "should it work" was more about: are filter requests that
feed arbitrary combinations to git-fetch, including "haves", something
we want to support? I think the world is a better place if we do. But
what I'm wondering is if there was some intentional choice to avoid
triggering this (especially in the way that the on-demand fetcher uses
the "noop" negotiation algorithm). And yes, we do specify the blob:none
filter.

BTW, I saw an interesting side-behavior here. If the server _stops_
supporting filters and then we try to fetch from it, we will happily
say "oh well, the other side doesn't support filters, so I won't bother
sending my filter spec". And then all hell breaks loose, as the other
side has no clue that we are a partial clone.

In practice I think it's an unlikely failure mode for a server you
partial-cloned from to turn off filters, so it's probably not that
important. I hit it because a test script used test_config to enable
them, and then the follow-on test I added to run git-fetch got quite
confused. A more likely scenario is that you might see it in a
misconfigured load-balanced pool of servers.

I do wonder how hitting a third-party server should work, though. E.g.,
I partial clone from A, and then ask B to fetch some related history
built on top. Do I tell B that I'm a partial clone and might be missing
some objects? Or do I behave as normal, and expect to fault in objects
that it assumes I have (e.g., a delta base)? And if the latter, does
that work (and if it does, then why doesn't the same logic kick in for
this fetch?).

Anyway, that's maybe orthogonal to the bug at hand (and my questions
above are all sincere; it might well work just fine, but I haven't dug
into it further).

> > If this is supposed to work, I think we need to teach the traversal code
> > to "add back" all of the objects that were explicitly given when a
> > filter is in use (either explicitly, or perhaps just clearing or
> > avoiding the UNINTERESTING flag on user-given objects in the first
> > place). And my earlier patch does that for the bitmap side, but not the
> > regular traversal.
>
> This gets more complicated if we were in a treeless clone, for example.
> We could explicitly ask for a tree and really need all of its reachable
> trees and blobs. It's not enough to just isolate that single object.

Good point. That really requires walking any objects that were listed
to "add back", but _only_ the ones that would have been filtered. In
the treeless clone case, that's easier. But if I say "don't give me any
trees deeper than X", how do I even know which ones those are?

> I wouldn't spend too much time optimizing for the treeless clone case,
> as I believe the client will self-recover by asking for those reachable
> trees when it walks to them.

Yes, I think as long as the on-demand fetch kicks in, then it becomes
an optimization problem, not a correctness one. So perhaps the first
fix should focus on that, even for the blob case being discussed. Then
it _works_, just with an extra round-trip for the on-demand fetch.

-Peff
On 18/05/21 15.11, Jeff King wrote:

> In practice I think it's an unlikely failure mode for a server you
> partial-cloned from to turn off filters, so it's probably not that
> important. I hit it because a test script used test_config to enable
> them, and then the follow-on test I added to run git-fetch got quite
> confused. A more likely scenario is that you might see it a
> misconfigured load-balanced pool of servers.
>
> I do wonder how hitting a third-party server should work, though. E.g.,
> I partial clone from A, and then ask B to fetch some related history
> built on top. Do I tell B that I'm a partial clone and might be missing
> some objects? Or do I behave as normal, and expect to fault in objects
> that it assumes I have (e.g., a delta base)? And if the latter, does
> that work (if it does, then why doesn't the same logic kick in for this
> fetch?).

My server is running Gitea (compiled from the main branch [1]). The
server itself runs Git 2.31.1. I can also reproduce the issue using
Jeff's test case [2] without Gitea, so this is not a Gitea issue; it is
a Git issue.

Also, my server setup is just an application server plus a separate
database server, not a load-balanced pool.

[1]: https://github.com/go-gitea/gitea/tree/main
[2]: https://lore.kernel.org/git/YJz4JTsFjTtL7mE2@coredump.intra.peff.net/

-- 
An old man doll... just what I always wanted! - Clara
On 5/18/2021 6:14 AM, Bagas Sanjaya wrote:
> On 18/05/21 15.11, Jeff King wrote:
>> In practice I think it's an unlikely failure mode for a server you
>> partial-cloned from to turn off filters, so it's probably not that
>> important. I hit it because a test script used test_config to enable
>> them, and then the follow-on test I added to run git-fetch got quite
>> confused. A more likely scenario is that you might see it a
>> misconfigured load-balanced pool of servers.
>>
>> I do wonder how hitting a third-party server should work, though. E.g.,
>> I partial clone from A, and then ask B to fetch some related history
>> built on top. Do I tell B that I'm a partial clone and might be missing
>> some objects? Or do I behave as normal, and expect to fault in objects
>> that it assumes I have (e.g., a delta base)? And if the latter, does
>> that work (if it does, then why doesn't the same logic kick in for this
>> fetch?).
>>
>
> My server is running Gitea (compiled from main branch [1]). The server
> itself runs Git 2.31.1. I can also reproduce the issue using Jeff's
> test case [2] without Gitea. So this is not Gitea's issue, this is Git's
> issue.
>
> Even my server setup is just application server + separate database
> server and not load-balanced pool ones.
I think the "third party" thing is mostly about whether we have forks
hosted in different places. Imagine that git/git is hosted on
GitHub with partial clone enabled, but git-for-windows/git is
hosted on Azure Repos, which doesn't have partial clone.
What happens when we partial clone git/git from GitHub, but then
add git-for-windows/git as a remote and fetch from it?
This is a separate discussion. I think it really points out the
issues that arise because partial clone makes the remote a more
critical resource. This removes some decentralization from Git,
as requested by the user. These concerns about the remote changing
the availability of filters or about other remotes hosted
elsewhere are unlikely to come up in a centralized workflow (i.e.
a set of users working against a common remote without forks).
Thanks,
-Stolee