* Fetching too many tags?
From: Ronan Pigott @ 2023-08-10 6:08 UTC
To: git

Hey git,

I am interested in git performance today and can't figure out what's going on
here. I was wondering why my git-fetch might be slow in an up-to-date repo:

  $ git pull
  Already up to date.
  $ time git fetch origin master
  From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
   * branch            master     -> FETCH_HEAD
  git fetch origin master  0.13s user 0.06s system 10% cpu 1.705 total

GIT_TRACE_CURL shows it spends most of the time transferring (all) tags from the
remote. It's much faster with --no-tags:

  $ time git fetch -n origin master
  From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
   * branch            master     -> FETCH_HEAD
  git fetch -n origin master  0.11s user 0.03s system 36% cpu 0.383 total

But I don't have tagOpt set:

  $ git config remote.origin.tagOpt || echo $?
  1

And the remote doesn't have to send me any commits, so I don't see why I should
receive any tags at all. Why might I be receiving so many tags?

Thanks,

Ronan

* Re: Fetching too many tags?
From: Jeff King @ 2023-08-11 18:09 UTC
To: Ronan Pigott; +Cc: git

On Thu, Aug 10, 2023 at 06:08:34AM +0000, Ronan Pigott wrote:

> I am interested in git performance today and can't figure out what's going on
> here. I was wondering why my git-fetch might be slow in an up-to-date repo:
>
>   $ git pull
>   Already up to date.
>   $ time git fetch origin master
>   From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
>    * branch            master     -> FETCH_HEAD
>   git fetch origin master  0.13s user 0.06s system 10% cpu 1.705 total
>
> GIT_TRACE_CURL shows it spends most of the time transferring (all) tags from the
> remote. It's much faster with --no-tags:
>
>   $ time git fetch -n origin master
>   From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
>    * branch            master     -> FETCH_HEAD
>   git fetch -n origin master  0.11s user 0.03s system 36% cpu 0.383 total
>
> But I don't have tagOpt set:
>
>   $ git config remote.origin.tagOpt || echo $?
>   1
>
> And the remote doesn't have to send me any commits, so I don't see why I should
> receive any tags at all. Why might I be receiving so many tags?

You didn't define "receiving tags", but I assume you just mean that you
saw the tag names and object ids in the trace output. From the output
above, it looks like no actual tag objects were transferred.

And the answer, then, is that this is how the Git protocol works. The
server says "here are all the refs I know about", then the client
decides what it wants from that list and asks the server to send the
necessary objects, after which it updates its local refs.

So the server will necessarily send all of the tags. Only the client
knows what it already has and whether any of them are new.

And in the default mode, which will fetch tags that point to commits we
have, it is checking each such new tag to see if it is worth fetching.
Even if we did not fetch new commits, we might see new tags that point
to existing commits.

When you use "--no-tags", that explicitly says "do not bother with tags
at all". Recent versions of Git have a protocol extension where the
client can say "I am only interested in refs/heads/master; don't bother
telling me about other stuff". Since the client knows we do not care
about tags, it can use that extension to get a much smaller ref
advertisement from the server.

-Peff
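
As a rough way to see how much of a ref advertisement is made up of tags,
the sketch below counts refs by namespace. It assumes input in the
tab-separated "<oid><TAB><refname>" form that `git ls-remote` prints. The
function name `count_refs_by_namespace` is made up for illustration, and
peeled entries (names ending in `^{}`) are skipped so that each tag is
counted once.

```shell
# count_refs_by_namespace (hypothetical helper): read "<oid>\t<refname>"
# lines on stdin and report how many are branches and how many are tags.
count_refs_by_namespace() {
  awk -F'\t' '
    index($2, "^{}")         { next }     # skip peeled tag entries
    $2 ~ /^refs\/heads\//    { heads++ }  # branch refs
    $2 ~ /^refs\/tags\//     { tags++ }   # tag refs
    END { printf "heads=%d tags=%d\n", heads, tags }
  '
}
```

Something like `git ls-remote origin | count_refs_by_namespace` should then
summarize what the remote is prepared to advertise. On a repository with
many releases, the tag count is likely to dwarf the branch count.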

* Re: Fetching too many tags?
From: Ronan Pigott @ 2023-08-11 22:06 UTC
To: Jeff King; +Cc: git

> And the answer, then, is that this is how the Git protocol works. The
> server says "here are all the refs I know about", then the client
> decides what it wants from that list and asks the server to send the
> necessary objects, after which it updates its local refs.

Thanks, this clears up some of my confusion. I had thought that the client sent
the server what we had and that the server would then decide what objects to
send over.

> When you use "--no-tags", that explicitly says "do not bother with tags
> at all". Recent versions of Git have a protocol extension where the
> client can say "I am only interested in refs/heads/master; don't bother
> telling me about other stuff". Since the client knows we do not care
> about tags, it can use that extension to get a much smaller ref
> advertisement from the server.

Do you mean the --negotiation-tip fetch option? In my experience, it doesn't
appear to have much of an effect in this case.

  $ time git fetch origin master
  From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
   * branch            master     -> FETCH_HEAD
  git fetch origin master  0.13s user 0.04s system 9% cpu 1.793 total
  $ time git fetch --negotiation-tip=master origin master
  From https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux
   * branch            master     -> FETCH_HEAD
  git fetch --negotiation-tip=master origin master  0.10s user 0.06s system 9% cpu 1.762 total

Is that because (most of) the tags point to commits reachable from master?

My prior (apparently incorrect) understanding of the fetch negotiation is based
on my interpretation of the description of this option in git-fetch(1):

> By default, Git will report, to the server, commits reachable from all local
> refs to find common commits in an attempt to reduce the size of the
> to-be-received packfile. If specified, Git will only report commits reachable
> from the given tips. This is useful to speed up fetches when the user knows
> which local ref is likely to have commits in common with the upstream ref being
> fetched.

Now, if I understand correctly, the report does not include the tags that we
already have?

Cheers,

Ronan

* Re: Fetching too many tags?
From: Jeff King @ 2023-08-11 23:58 UTC
To: Ronan Pigott; +Cc: git

On Fri, Aug 11, 2023 at 10:06:43PM +0000, Ronan Pigott wrote:

> > When you use "--no-tags", that explicitly says "do not bother with tags
> > at all". Recent versions of Git have a protocol extension where the
> > client can say "I am only interested in refs/heads/master; don't bother
> > telling me about other stuff". Since the client knows we do not care
> > about tags, it can use that extension to get a much smaller ref
> > advertisement from the server.
>
> Do you mean the --negotiation-tip fetch option? In my experience, it doesn't
> appear to have much of an effect in this case.

No, the "negotiation" phase only happens when there are objects to
fetch, and the client and server have to agree on which ones. That's not
happening at all in your case (so --negotiation-tip won't have any
effect).

The feature I was thinking of is that in Git's "v2" protocol, the client
gets to speak first, and so it can say "btw, I am only interested in
these refs". v2 became the default in git v2.29 (of course both client
and server have to support it, but kernel.org is definitely up to date
there).

You can see it in action with something like this:

  GIT_TRACE_PACKET=1 git fetch --no-tags origin master

The "ref-prefix" lines are the client telling the server which prefixes
it's interested in (we have to ask for several variants because "master"
from the command line gets fully qualified based on what the other side
offers). Try it without --no-tags and you'll see a wider ref-prefix
request. If you try:

  GIT_TRACE_PACKET=1 git -c protocol.version=0 fetch --no-tags origin master

you'll see the full advertisement, even with --no-tags. In v0, the
server speaks first and just dumps its complete list of refs.
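
A minimal sketch of how such a v2 "ls-refs" request might be framed on the
wire, assuming the pkt-line format (four hex digits giving the total length
including those four digits, then the payload), with "0001" as the delimiter
packet and "0000" as the flush packet. The helper names `pkt_line` and
`ls_refs_request` are hypothetical, and the capability lines and other
arguments a real client sends are left out for brevity.

```shell
# pkt_line (hypothetical helper): frame one payload as a pkt-line.
# ${#1} counts characters, so this assumes an ASCII-only payload.
pkt_line() {
  printf '%04x%s' "$(( ${#1} + 4 ))" "$1"
}

# ls_refs_request (hypothetical helper): build a simplified ls-refs
# request that asks only for refs matching the given prefixes.
ls_refs_request() {
  nl='
'
  pkt_line "command=ls-refs$nl"
  printf '0001'                        # delimiter: arguments follow
  for prefix in "$@"; do
    pkt_line "ref-prefix $prefix$nl"   # one line per prefix of interest
  done
  printf '0000'                        # flush: end of request
}
```

Calling `ls_refs_request refs/heads/master` yields a request with a single
ref-prefix line, whereas `ls_refs_request refs/heads/master refs/tags/` adds
the tag prefix. That second form corresponds to the wider request a fetch
without --no-tags would be expected to make.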

> > By default, Git will report, to the server, commits reachable from all local
> > refs to find common commits in an attempt to reduce the size of the
> > to-be-received packfile. If specified, Git will only report commits reachable
> > from the given tips. This is useful to speed up fetches when the user knows
> > which local ref is likely to have commits in common with the upstream ref being
> > fetched.
>
> Now, if I understand correctly, the report does not include the tags that we
> already have?

So there's no negotiation here at all, as I explained above. But when it
does happen, Git should use all refs, including tags and branches, to
try to reach a common point in the history graph. If you run with
GIT_TRACE_PACKET on a request that actually fetches objects, you'll see
"have" and "want" lines from the client.

For a vanilla fetch from a server you regularly fetch from, the
negotiation is pretty boring and fast (the client tells the server about
the old commit at the tip of the branch, and the server immediately says
"OK, I know about that").

A more interesting one is if you fetch the kernel from Linus's repo, and
then fetch from the stable kernel repo after that. Or maybe vice versa.
There you have two histories that share significant chunks, but also
have each diverged. So you should see the client and server dumping
sha1's at each other until they reach a common point. That's a case
where --negotiation-tip can sometimes speed things up.

-Peff
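
To get a feel for how chatty a given negotiation is, one could count the
"want" and "have" lines in a packet trace. The sketch below assumes trace
lines of the form "<time> pkt-line.c:<n> packet: <who>< <payload>" (or with
">" as the direction marker). That layout is an assumption about the trace
format, and `count_negotiation_lines` is a made-up name.

```shell
# count_negotiation_lines (hypothetical helper): read a GIT_TRACE_PACKET
# log on stdin and count the "want" and "have" payload lines.
count_negotiation_lines() {
  awk '
    /packet:/ {
      for (i = 1; i < NF; i++) {
        if ($i ~ /[<>]$/) {            # direction marker such as "git<"
          if ($(i + 1) == "want") wants++
          if ($(i + 1) == "have") haves++
          break                        # only the first payload word matters
        }
      }
    }
    END { printf "want=%d have=%d\n", wants, haves }
  '
}
```

The trace goes to stderr, so a run might look like
`GIT_TRACE_PACKET=1 git fetch origin 2>&1 | count_negotiation_lines`. An
up-to-date fetch should report zeros, while a fetch between two diverged
histories would be expected to show many more "have" lines.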

* Re: Fetching too many tags?
From: Ronan Pigott @ 2023-08-12 1:04 UTC
To: Jeff King; +Cc: git

> No, the "negotiation" phase only happens when there are objects to
> fetch, and the client and server have to agree on which ones. That's not
> happening at all in your case (so --negotiation-tip won't have any
> effect).

Ah, I see.

> The feature I was thinking of is that in Git's "v2" protocol, the client
> gets to speak first, and so it can say "btw, I am only interested in
> these refs". v2 became the default in git v2.29 (of course both client
> and server have to support it, but kernel.org is definitely up to date
> there).
>
> You can see it in action with something like this:
>
>   GIT_TRACE_PACKET=1 git fetch --no-tags origin master
>
> The "ref-prefix" lines are the client telling the server which prefixes
> it's interested in (we have to ask for several variants because "master"
> from the command line gets fully qualified based on what the other side
> offers). Try it without --no-tags and you'll see a wider ref-prefix
> request. If you try:

Thanks. I tried this and indeed without --no-tags there is an additional line:

> 17:41:29.163545 pkt-line.c:86 packet: git< ref-prefix refs/tags/

I understand now that this is why the server is telling me about all those tags.
I had thought it would only need to tell me about tags that point to something
reachable from master, and was confused why the server was advertising all the
tags.

Thanks,

Ronan