git.vger.kernel.org archive mirror
* Bug in 2.26: git-fetch fetching too many objects?
@ 2020-04-20 15:44 Dixit, Ashutosh
  2020-04-20 17:03 ` Konstantin Ryabitsev
  2020-04-21  6:45 ` Jonathan Nieder
  0 siblings, 2 replies; 8+ messages in thread
From: Dixit, Ashutosh @ 2020-04-20 15:44 UTC
  To: git

I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
from a couple of linux kernel remotes (so you will have an idea how big the
repo is). I have a different system with 2.20 on which I never see a
problem.

So let us say I fetch with 2.20 and it fetches say 20,000 objects. However
with 2.26 it starts fetching millions of objects, objects which are already
present locally. I don't know yet if this happens each time or only once in
a while, I have seen it happen twice, will keep an eye out for this.

If you open a bug please let me know and I can update it with my
findings. Unless it is a known issue, perhaps already fixed?

Thanks!
--
Ashutosh

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-20 15:44 Bug in 2.26: git-fetch fetching too many objects? Dixit, Ashutosh
@ 2020-04-20 17:03 ` Konstantin Ryabitsev
  2020-04-21  5:18   ` Dixit, Ashutosh
  2020-04-21  6:45 ` Jonathan Nieder
  1 sibling, 1 reply; 8+ messages in thread
From: Konstantin Ryabitsev @ 2020-04-20 17:03 UTC
  To: Dixit, Ashutosh; +Cc: git

On Mon, Apr 20, 2020 at 08:44:39AM -0700, Dixit, Ashutosh wrote:
> I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
> from a couple of linux kernel remotes (so you will have an idea how big the
> repo is). I have a different system with 2.20 on which I never see a
> problem.
> 
> So let us say I fetch with 2.20 and it fetches say 20,000 objects. However
> with 2.26 it starts fetching millions of objects, objects which are already
> present locally. I don't know yet if this happens each time or only once in
> a while, I have seen it happen twice, will keep an eye out for this.
> 
> If you open a bug please let me know and I can update it with my
> findings. Unless it is a known issue, perhaps already fixed?

It's a known issue with protocol v2, but nobody's been able to properly 
reproduce it in order to debug. If you can reliably make it reoccur, 
then please make a copy of your local tree and share with this list, 
together with your gitconfig and the remote you're pulling.

Setting protocol.version=1 should fix it, but if you are willing to help 
troubleshoot it, a bunch of people will be super thankful to you for 
that, as it affects quite a number of kernel developers.

-K
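
As a minimal sketch of the workaround suggested above, the setting can be applied to one repository instead of globally. The scratch repository and the variable names are assumptions for illustration; `protocol.version` is the real Git configuration key.

```shell
set -eu

# Hypothetical scratch repository, so no real checkout or global config is touched.
repo=$(mktemp -d)
git init -q "$repo"

# Pin this repository to wire protocol v1, as suggested in the message above.
git -C "$repo" config protocol.version 1

# Read the value back to confirm what fetches in this repository will use.
proto=$(git -C "$repo" config --get protocol.version)
echo "protocol.version=$proto"
```

For a single fetch without touching any config file, `git -c protocol.version=1 fetch <remote>` should have the same effect.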

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-20 17:03 ` Konstantin Ryabitsev
@ 2020-04-21  5:18   ` Dixit, Ashutosh
  0 siblings, 0 replies; 8+ messages in thread
From: Dixit, Ashutosh @ 2020-04-21  5:18 UTC
  To: Dixit, Ashutosh, git

On Mon, 20 Apr 2020 10:03:01 -0700, Konstantin Ryabitsev wrote:
>
> On Mon, Apr 20, 2020 at 08:44:39AM -0700, Dixit, Ashutosh wrote:
> > I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
> > from a couple of linux kernel remotes (so you will have an idea how big the
> > repo is). I have a different system with 2.20 on which I never see a
> > problem.
> >
> > So let us say I fetch with 2.20 and it fetches say 20,000 objects. However
> > with 2.26 it starts fetching millions of objects, objects which are already
> > present locally. I don't know yet if this happens each time or only once in
> > a while, I have seen it happen twice, will keep an eye out for this.
> >
> > If you open a bug please let me know and I can update it with my
> > findings. Unless it is a known issue, perhaps already fixed?
>
> It's a known issue with protocol v2, but nobody's been able to properly
> reproduce it in order to debug. If you can reliably make it reoccur,
> then please make a copy of your local tree and share with this list,
> together with your gitconfig and the remote you're pulling.
>
> Setting protocol.version=1 should fix it, but if you are willing to help
> troubleshoot it, a bunch of people will be super thankful to you for
> that, as it affects quite a number of kernel developers.

I will see what I can do. I think I am seeing an instance where a branch
has incorrect SHAs during the clone itself, but I haven't been able to
reproduce it after the HEAD moved at the remote. I'll report back if I have
a reliable reproducer. Thanks!
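
One way to act on the request for a copy of the local tree is to snapshot the repository before fetching, so that the same fetch can be replayed under each protocol version. The sketch below uses throwaway local repositories; every path, identity and commit message in it is hypothetical, and an example this small is not expected to trigger the over-fetching reported in this thread.

```shell
set -eu

work=$(mktemp -d)

# Hypothetical upstream with one commit, and a local clone of it.
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m "first"
git clone -q "$work/upstream" "$work/local"

# The upstream moves on after the clone was taken.
git -C "$work/upstream" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m "second"

# Snapshot the pre-fetch state twice so each protocol version starts from the same tree.
cp -R "$work/local" "$work/replay-v1"
cp -R "$work/local" "$work/replay-v2"

# Replay the same fetch once per protocol version.
git -C "$work/replay-v1" -c protocol.version=1 fetch -q origin
git -C "$work/replay-v2" -c protocol.version=2 fetch -q origin

# Both replays should end up at the upstream tip.
tip_upstream=$(git -C "$work/upstream" rev-parse HEAD)
tip_v1=$(git -C "$work/replay-v1" rev-parse refs/remotes/origin/HEAD)
tip_v2=$(git -C "$work/replay-v2" rev-parse refs/remotes/origin/HEAD)
echo "upstream=$tip_upstream v1=$tip_v1 v2=$tip_v2"
```

On a real kernel tree, dropping `-q` from the two fetches would show the object counts each protocol version transfers, which is the figure the original report is about.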

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-20 15:44 Bug in 2.26: git-fetch fetching too many objects? Dixit, Ashutosh
  2020-04-20 17:03 ` Konstantin Ryabitsev
@ 2020-04-21  6:45 ` Jonathan Nieder
  2020-04-21 19:16   ` Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Jonathan Nieder @ 2020-04-21  6:45 UTC
  To: Dixit, Ashutosh; +Cc: git, Jonathan Tan, Konstantin Ryabitsev

(+cc: Jonathan Tan, fetch negotiation expert)
Hi,

Dixit, Ashutosh wrote:

> I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
> from a couple of linux kernel remotes (so you will have an idea how big the
> repo is). I have a different system with 2.20 on which I never see a
> problem.
>
> So let us say I fetch with 2.20 and it fetches say 20,000 objects. However
> with 2.26 it starts fetching millions of objects, objects which are already
> present locally. I don't know yet if this happens each time or only once in
> a while, I have seen it happen twice, will keep an eye out for this.
>
> If you open a bug please let me know and I can update it with my
> findings. Unless it is a known issue, perhaps already fixed?

Does "git config --global fetch.negotiationAlgorithm skipping" help?
It might be time for us to make that the default.

I suspect this is related to the change that protocol v2 does to use
stateless-rpc even in stateful protocols.  If my suspicion is correct,
then the same behavior would show up with protocol v0 over http and
https as well.

Thanks and hope that helps,
Jonathan
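
The suggestion above can also be tried in a single repository before changing the global default. This is only a sketch: the scratch repository and variable names are assumptions, while the `fetch.negotiationAlgorithm` key and the `skipping` value are taken from the message.

```shell
set -eu

# Hypothetical scratch repository standing in for the affected checkout.
repo=$(mktemp -d)
git init -q "$repo"

# Per-repository equivalent of the --global command quoted above.
git -C "$repo" config fetch.negotiationAlgorithm skipping

# Confirm the negotiation algorithm that fetches in this repository will request.
algo=$(git -C "$repo" config --get fetch.negotiationAlgorithm)
echo "fetch.negotiationAlgorithm=$algo"
```

Running `git -C "$repo" config --unset fetch.negotiationAlgorithm` afterwards restores the default, which makes it easier to compare fetches with and without the setting.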

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-21  6:45 ` Jonathan Nieder
@ 2020-04-21 19:16   ` Junio C Hamano
  2020-04-21 19:36     ` Jonathan Nieder
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2020-04-21 19:16 UTC
  To: Jonathan Nieder; +Cc: Dixit, Ashutosh, git, Jonathan Tan, Konstantin Ryabitsev

Jonathan Nieder <jrnieder@gmail.com> writes:

> (+cc: Jonathan Tan, fetch negotiation expert)
> Hi,
>
> Dixit, Ashutosh wrote:
>
>> I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
>> from a couple of linux kernel remotes (so you will have an idea how big the
>> repo is). I have a different system with 2.20 on which I never see a
>> problem.
> ...
> I suspect this is related to the change that protocol v2 does to use
> stateless-rpc even in stateful protocols.  If my suspicion is correct,
> then the same behavior would show up with protocol v0 over http and
> https as well.

Thanks.  

This is at least the fourth time we have heard that v2 may not be
ready for real-world use.  Perhaps we should revert the default flip
on the maintenance track while we hunt for bugs and improve the
protocol support?


* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-21 19:16   ` Junio C Hamano
@ 2020-04-21 19:36     ` Jonathan Nieder
  2020-04-21 20:06       ` Junio C Hamano
  2020-04-21 21:11       ` Jeff King
  0 siblings, 2 replies; 8+ messages in thread
From: Jonathan Nieder @ 2020-04-21 19:36 UTC
  To: Junio C Hamano; +Cc: Dixit, Ashutosh, git, Jonathan Tan, Konstantin Ryabitsev

Hi,

Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>> Dixit, Ashutosh wrote:

>>> I am seeing a strange behavior in git-fetch in 2.26. I frequently fetch
>>> from a couple of linux kernel remotes (so you will have an idea how big the
>>> repo is). I have a different system with 2.20 on which I never see a
>>> problem.
>> ...
>> I suspect this is related to the change that protocol v2 does to use
>> stateless-rpc even in stateful protocols.  If my suspicion is correct,
>> then the same behavior would show up with protocol v0 over http and
>> https as well.
[...]
> This is at least the fourth time we have heard that v2 may not be
> ready for real-world use.  Perhaps we should revert the default flip
> on the maintenance track while we hunt for bugs and improve the
> protocol support?

That feels to me like an overreaction, since these are all reports of
the same issue that we have a fix for.  Shouldn't we just flip the
default for fetch.negotiationAlgorithm to skipping?  If we revert to
buy time, what would we do with that time?

In other words, if I understand correctly, it's describing an issue
that also exists in protocol v0 for https.  I would be *very*
interested in any evidence one way or another about whether I am
understanding correctly.  If we flip the default, I don't see how
we'll get that evidence, since we've been using protocol v2 as the
default at $DAYJOB for quite a long time now.

Thanks,
Jonathan

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-21 19:36     ` Jonathan Nieder
@ 2020-04-21 20:06       ` Junio C Hamano
  2020-04-21 21:11       ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2020-04-21 20:06 UTC
  To: Jonathan Nieder; +Cc: Dixit, Ashutosh, git, Jonathan Tan, Konstantin Ryabitsev

Jonathan Nieder <jrnieder@gmail.com> writes:

> In other words, if I understand correctly, it's describing an issue
> that also exists in protocol v0 for https.  I would be *very*
> interested in any evidence one way or another about whether I am
> understanding correctly.

I am assuming that the issue experienced by these people after
flipping the default to v2 was *not* experienced by the same folks
back when they were not on v2.  If not, I cannot explain why their
reports say "it suddenly started doing this".

> ..., since we've been using protocol v2 as the
> default at $DAYJOB for quite a long time now.

Is it possible that folks getting hurt after 2.26 got released have
quite different use case / fetch pattern from what you see at
$DAYJOB, which are covered well in the current code?  Continuing to use
v2 at $DAYJOB may not help us diagnose the issue more than flipping the
default back (and at $DAYJOB the default is under your control ;-).

* Re: Bug in 2.26: git-fetch fetching too many objects?
  2020-04-21 19:36     ` Jonathan Nieder
  2020-04-21 20:06       ` Junio C Hamano
@ 2020-04-21 21:11       ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2020-04-21 21:11 UTC
  To: Jonathan Nieder
  Cc: Junio C Hamano, Dixit, Ashutosh, git, Jonathan Tan, Konstantin Ryabitsev

On Tue, Apr 21, 2020 at 12:36:11PM -0700, Jonathan Nieder wrote:

> > This is at least the fourth time we have heard that v2 may not be
> > ready for real-world use.  Perhaps we should revert the default flip
> > on the maintenance track while we hunt for bugs and improve the
> > protocol support?
> 
> That feels to me like an overreaction, since these are all reports of
> the same issue that we have a fix for.  Shouldn't we just flip the
> default for fetch.negotiationAlgorithm to skipping?  If we revert to
> buy time, what would we do with that time?

Do we know that fetch.negotiationAlgorithm helps? I thought we didn't
yet know the actual cause of the bug. If that is the culprit, and people
would have seen this under v0 using stateless-http, why didn't we get
more reports of it then? Surely some people used http over git://?

I do agree that flipping the default away from v2 may be premature,
especially if we don't have a plan for moving forward. It sounds like
swapping out the negotiationAlgorithm would at least be likely to
generate more data, even if it is only a guess.

-Peff

Thread overview: 8+ messages
2020-04-20 15:44 Bug in 2.26: git-fetch fetching too many objects? Dixit, Ashutosh
2020-04-20 17:03 ` Konstantin Ryabitsev
2020-04-21  5:18   ` Dixit, Ashutosh
2020-04-21  6:45 ` Jonathan Nieder
2020-04-21 19:16   ` Junio C Hamano
2020-04-21 19:36     ` Jonathan Nieder
2020-04-21 20:06       ` Junio C Hamano
2020-04-21 21:11       ` Jeff King