* [BUGREPORT] Why is git-push fetching content?
@ 2023-02-21 22:01 Sean Allred
  2023-02-21 23:02 ` brian m. carlson
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Allred @ 2023-02-21 22:01 UTC
To: Sean Allred, Kyle VandeWalle, git

What did you do before the bug happened? (Steps to reproduce your issue)

    # in a new directory,
    cd $(mktemp -d)

    # initialize a new repository
    git init

    # fetch a single commit from a remote
    git fetch --filter=tree:0 --depth=1 $REMOTE $COMMIT_OID

    # create a ref on that remote
    git push --no-verify $REMOTE $COMMIT_OID:$REFNAME

What did you expect to happen? (Expected behavior)

    I expected this process to complete very, very quickly. We believe
    the version where it had been doing so was ~2.37.

What happened instead? (Actual behavior)

    The fetch completes nearly instantly as expected. We receive ~200B
    from the remote for the commit object itself. What's truly bizarre
    is what happens during the push. It starts receiving objects from
    the remote! By the end of this process, the local repository is a
    whopping ~700MB -- though interestingly only about a tenth of the
    full repository size.

    This result in particular is strange in context. I would expect to
    either see 'almost all' the repository content, 'about half' (we
    have two trunks and fetching a single commit would at most fetch one
    of them), or 'virtually none at all'. There isn't a straightforward
    explanation for why 'one tenth' would make sense.

What's different between what you expected and what actually happened?

    Why should git-push ever be fetching objects? This doesn't map well
    to my mental model of the relationship between push/fetch. I would
    expect the local repository to stay in that 'git init'+200B range.

Anything else you want to add:

Please review the rest of the bug report below.
You can delete any lines you don't wish to share.

I've truncated the system information normally included by git-bugreport
as I am sending this email from a different machine.

Versions of Git that can reproduce:

- 2.39.2.windows.1 (Windows 10)

    git version:
    git version 2.39.2.windows.1
    cpu: x86_64
    built from commit: a82fa99b36ddfd643e61ed45e52abe314687df67
    sizeof-long: 4
    sizeof-size_t: 8
    shell-path: /bin/sh
    feature: fsmonitor--daemon
    uname: Windows 10.0 19044
    compiler info: gnuc: 12.2
    libc info: no libc information available
    $SHELL (typically, interactive shell): C:\Program Files\Git\usr\bin\bash.exe

- 2.31.1 (AIX UNIX 7.2)

    git version:
    git version 2.31.1
    cpu: 00F905E64C00
    no commit associated with this build
    sizeof-long: 8
    sizeof-size_t: 8
    shell-path: /opt/freeware/bin/bash
    uname: AIX 2 7 00FBC37A4C00
    compiler info: gnuc: 8.3
    libc info: no libc information available
    $SHELL (typically, interactive shell): /usr/bin/ksh

--
Sean Allred
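One way to put a number on the "'git init'+200B" expectation is to count
objects right after the fetch and again after the push. The helper below is
only a sketch: `object_total` is a hypothetical name, and it relies on the
`count:` and `in-pack:` fields printed by `git count-objects -v`.

```shell
# Hypothetical helper: total number of objects (loose + packed) in a repository.
# Usage: object_total [repo-dir]
object_total() {
    git -C "${1:-.}" count-objects -v |
        awk '$1 == "count:" || $1 == "in-pack:" { total += $2 }
             END { print total + 0 }'
}
```

Running `before=$(object_total)` after the fetch and `after=$(object_total)`
after the push would show how many objects the push brought in; any
difference at all contradicts the expectation stated in the report.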
* Re: [BUGREPORT] Why is git-push fetching content? 2023-02-21 22:01 [BUGREPORT] Why is git-push fetching content? Sean Allred @ 2023-02-21 23:02 ` brian m. carlson 2023-02-22 15:04 ` Sean Allred [not found] ` <7bfb7ecd4a4c78668f97b00d5f06af0c9b2878269476e89c3311eeb8071b1ab3@mu.id> 0 siblings, 2 replies; 7+ messages in thread From: brian m. carlson @ 2023-02-21 23:02 UTC (permalink / raw) To: Sean Allred; +Cc: Sean Allred, Kyle VandeWalle, git [-- Attachment #1: Type: text/plain, Size: 2696 bytes --] On 2023-02-21 at 22:01:04, Sean Allred wrote: > What did you do before the bug happened? (Steps to reproduce your issue) > > # in a new directory, > cd $(mktemp -d) > > # initialize a new repository > git init > > # fetch a single commit from a remote > git fetch --filter=tree:0 --depth=1 $REMOTE $COMMIT_OID > > # create a ref on that remote > git push --no-verify $REMOTE $COMMIT_OID:$REFNAME > > What did you expect to happen? (Expected behavior) > > I expected this process to complete very, very quickly. We believe > the version where it had been doing so was ~2.37. > > What happened instead? (Actual behavior) > > The fetch completes nearly instantly as expected. We receive ~200B > from the remote for the commit object itself. What's truly bizarre > is what happens during the push. It starts receiving objects from > the remote! By the end of this process, the local repository is a > whopping ~700MB -- though interestingly only about a tenth of the > full repository size. > > This result in particular is strange in context. I would expect to > either see 'almost all' the repository content, 'about half' (we > have two trunks and fetching a single commit would at most fetch one > of them), or 'virtual none at all'. There isn't a straightforward > explanation for why 'one tenth' would make sense. It's hard to know for certain what's going on here, but it depends on your history. 
You did a partial clone with no trees, so you've likely received a single commit object and no trees or blobs. However, when you push a commit, that necessitates pushing the trees and blobs as well, and you don't have those. If the remote said that it already had the commit, then it might push no objects at all (which I've seen before) and thus just update the references. However, if it pushes even one commit, it may need to walk the history and find common commits, which will necessitate fetching objects, and it will have to push any trees and blobs as well, which also will require objects to be fetched. My guess is that this is probably made worse by the fact that this is shallow, and that necessitates certain additional computations, which means more objects are fetched. However, I'm not super sure how that code works, so I think it may be helpful for someone else to chime in who's more familiar with this. If you want to see what's going on, you can run with `GIT_TRACE=1 GIT_TRACE_PACKET=1`, which may show interesting information about the negotiation. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
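The tracing suggestion above produces a lot of output on stderr, so it can
help to collect it in a file. A minimal sketch, assuming a POSIX shell;
`trace_git` is a hypothetical wrapper name, and the log also receives the
command's ordinary stderr output (progress and `remote:` lines).

```shell
# Hypothetical wrapper: run a git command with tracing enabled and append
# everything written to stderr (trace lines included) to a log file.
# Usage: trace_git <logfile> <git arguments...>
trace_git() {
    trace_log=$1
    shift
    GIT_TRACE=1 GIT_TRACE_PACKET=1 git "$@" 2>>"$trace_log"
}
```

For the push in the report this would be something like
`trace_git push-trace.log push --no-verify "$REMOTE" "$COMMIT_OID:$REFNAME"`,
where the log file name is arbitrary.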
* Re: [BUGREPORT] Why is git-push fetching content? 2023-02-21 23:02 ` brian m. carlson @ 2023-02-22 15:04 ` Sean Allred 2023-06-20 11:26 ` Tao Klerks [not found] ` <7bfb7ecd4a4c78668f97b00d5f06af0c9b2878269476e89c3311eeb8071b1ab3@mu.id> 1 sibling, 1 reply; 7+ messages in thread From: Sean Allred @ 2023-02-22 15:04 UTC (permalink / raw) To: brian m. carlson; +Cc: Sean Allred, Kyle VandeWalle, git "brian m. carlson" <sandals@crustytoothpaste.net> writes: > It's hard to know for certain what's going on here, but it depends on > your history. You did a partial clone with no trees, so you've likely > received a single commit object and no trees or blobs. Yup, this was the intention behind `--depth=1 --filter=tree:0`. The server doing this ref update needs to be faster than having the full history would allow. > However, when you push a commit, that necessitates pushing the trees and > blobs as well, and you don't have those. If the remote said that it > already had the commit, then it might push no objects at all (which I've > seen before) and thus just update the references. However, if it pushes > even one commit, it may need to walk the history and find common > commits, which will necessitate fetching objects, and it will have to > push any trees and blobs as well, which also will require objects to be > fetched. Absolutely. The commit in question was fetched from the same remote to which we're pushing, so it would seem by definition that git-push should not need to push *any* object content whatsoever. > My guess is that this is probably made worse by the fact that this is > shallow, and that necessitates certain additional computations, which > means more objects are fetched. However, I'm not super sure how that > code works, so I think it may be helpful for someone else to chime in > who's more familiar with this. I'm certain this is just an unforeseen interaction between all these pieces. 
I wouldn't be too surprised if we're among only a handful of folks
using git in this way.

> If you want to see what's going on, you can run with
> `GIT_TRACE=1 GIT_TRACE_PACKET=1`, which may show interesting information
> about the negotiation.

I'm not sure of the best way to include this information, so I'm just
going to inline it. I've edited this log file to remove several tens of
thousands of lines of object hashes and operational refnames. I've
annotated it with some guesses of what things might mean. I'm still
*relatively* new to reading such log files for serious debugging.

    08:30:47.655623 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/bin
    08:30:47.655623 git.c:460 trace: built-in: git push --no-verify -o emc2.enable-logging git@$REPO.git FETCH_HEAD:refs/tags/hswebrec/app/stage1/latest

This is the command actually run in the foreground. As you might
surmise, we're using Git4Win. It's worth noting that FETCH_HEAD here is
0962f6cd9b1f2b5a012581823c12f0f0619bd3f5.

    08:30:47.655623 run-command.c:655 trace: run_command: unset GIT_PREFIX; ssh git@$REPO_HOST 'git-receive-pack '\''$REPO_PROJECT.git'\'''
    08:30:48.100544 pkt-line.c:80 packet: push< 4edfa5e150857e21c686826e1e430f6b014ed173 refs/archive/app/devnull\0report-status report-status-v2 delete-refs side-band-64k quiet atomic ofs-delta push-options object-format=sha1 agent=git/2.38.4.gl1
    08:30:48.100544 pkt-line.c:80 packet: push< 5d27e7331f08365c9bb3d342ae020807a386f42a refs/heads/app/10.1/stage1

    ---8<--- literally thousands of refs removed...

    08:30:49.121310 pkt-line.c:80 packet: push< 90c12d8c0ad0559047b3b9de78d948901fcffac3 refs/tags/zrb/I10121236/726711
    08:30:49.121310 pkt-line.c:80 packet: push< 90c12d8c0ad0559047b3b9de78d948901fcffac3 refs/tags/zrb/I10121236/726712
    08:30:49.121310 pkt-line.c:80 packet: push< 0000

Looks like the above was the remote telling the client what objects it
has by virtue of what refs it is tracking.

08:30:49.193113 pkt-line.c:80 packet: push> shallow 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 08:30:49.193113 pkt-line.c:80 packet: push> 0000000000000000000000000000000000000000 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 refs/tags/hswebrec/app/stage1/latest\0 report-status-v2 side-band-64k quiet push-options object-format=sha1 agent=git/2.39.2.windows.1 08:30:49.193113 pkt-line.c:80 packet: push> 0000 And this is the client telling the remote what objects it has ('shallow 0962f6...'?) and what changes it would like to make. 08:30:49.193113 pkt-line.c:80 packet: push> emc2.enable-logging 08:30:49.193113 pkt-line.c:80 packet: push> 0000 ...as well as our push-option that enabled more logging for us (nothing relevant to Git communication -- mostly just internal web services and print-statement debugging). 08:30:49.193113 run-command.c:655 trace: run_command: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset -q --shallow 08:30:49.224949 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:30:49.224949 git.c:460 trace: built-in: git pack-objects --all-progress-implied --revs --stdout --thin --delta-base-offset -q --shallow 08:30:49.224949 run-command.c:655 trace: run_command: git -c fetch.negotiationAlgorithm=noop fetch git@$REPO.git --no-tags --no-write-fetch-head --recurse-submodules=no --filter=blob:none --stdin 08:30:49.246673 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:30:49.256718 git.c:460 trace: built-in: git fetch git@$REPO.git --no-tags --no-write-fetch-head --recurse-submodules=no --filter=blob:none --stdin 08:30:49.256718 run-command.c:655 trace: run_command: unset GIT_CONFIG_PARAMETERS GIT_PREFIX; GIT_PROTOCOL=version=2 ssh -o SendEnv=GIT_PROTOCOL git@tracklab.epic.com 'git-upload-pack '\''epic/test/trackdev/mono-23/app.git'\''' It looks like this is the client initiating a fetch. 
08:30:49.699720 pkt-line.c:80 packet: fetch< version 2 08:30:49.699720 pkt-line.c:80 packet: fetch< agent=git/2.38.4.gl1 08:30:49.699720 pkt-line.c:80 packet: fetch< ls-refs=unborn 08:30:49.699720 pkt-line.c:80 packet: fetch< fetch=shallow wait-for-done filter 08:30:49.699720 pkt-line.c:80 packet: fetch< server-option 08:30:49.699720 pkt-line.c:80 packet: fetch< object-format=sha1 08:30:49.699720 pkt-line.c:80 packet: fetch< object-info The remote tells the client what version it is so the client can send a request the remote understands. 08:30:49.699720 pkt-line.c:80 packet: fetch< 0000 08:30:49.699720 pkt-line.c:80 packet: fetch> command=fetch 08:30:49.699720 pkt-line.c:80 packet: fetch> agent=git/2.39.2.windows.1 08:30:49.699720 pkt-line.c:80 packet: fetch> object-format=sha1 08:30:49.699720 pkt-line.c:80 packet: fetch> 0001 08:30:49.699720 pkt-line.c:80 packet: fetch> thin-pack 08:30:49.699720 pkt-line.c:80 packet: fetch> no-progress 08:30:49.699720 pkt-line.c:80 packet: fetch> ofs-delta 08:30:49.699720 pkt-line.c:80 packet: fetch> shallow 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 08:30:49.699720 pkt-line.c:80 packet: fetch> filter blob:none 08:30:49.699720 pkt-line.c:80 packet: fetch> want 8da2fa849db733188b1820865deb800d8e6abfc6 08:30:49.699720 pkt-line.c:80 packet: fetch> done 08:30:49.699720 pkt-line.c:80 packet: fetch> 0000 The client asks the remote for content. Looks like the filter here got changed from tree:0 to blob:none. This could be the bug -- and could explain the 'weird' amount of content that was actually downloaded. A fully-fleshed-out clone would be about 8GB, but I could certainly see a blobless history being ~700MB. Interesting to note here that 0962f6^{tree} is 8da2fa. 08:30:49.715316 pkt-line.c:80 packet: fetch< shallow-info 08:30:49.715316 pkt-line.c:80 packet: fetch< 0001 08:30:49.715316 pkt-line.c:80 packet: fetch< packfile Not sure what this bit is, to be honest. 08:30:50.527739 pkt-line.c:80 packet: sideband< PACK ... 
08:30:50.542281 run-command.c:655 trace: run_command: git index-pack --stdin --fix-thin '--keep=fetch-pack 6448 on win-pool7447' --promisor --pack_header=2,71814 08:30:50.574053 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:30:50.590271 git.c:460 trace: built-in: git index-pack --stdin --fix-thin '--keep=fetch-pack 6448 on win-pool7447' --promisor --pack_header=2,71814 08:30:53.027721 pkt-line.c:80 packet: sideband< 0000 08:30:53.200395 run-command.c:655 trace: run_command: git maintenance run --auto --no-quiet 08:30:53.246231 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:30:53.248264 git.c:460 trace: built-in: git maintenance run --auto --no-quiet 08:30:59.362381 run-command.c:655 trace: run_command: git -c fetch.negotiationAlgorithm=noop fetch git@$REPO.git --no-tags --no-write-fetch-head --recurse-submodules=no --filter=blob:none --stdin 08:30:59.378458 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:30:59.394133 git.c:460 trace: built-in: git fetch git@$REPO.git --no-tags --no-write-fetch-head --recurse-submodules=no --filter=blob:none --stdin 08:33:28.966748 run-command.c:655 trace: run_command: unset GIT_CONFIG_PARAMETERS GIT_PREFIX; GIT_PROTOCOL=version=2 ssh -o SendEnv=GIT_PROTOCOL git@tracklab.epic.com 'git-upload-pack '\''epic/test/trackdev/mono-23/app.git'\''' Nor why we'd go through a separate round of fetching. 
08:33:29.530124 pkt-line.c:80 packet: fetch< version 2 08:33:29.530124 pkt-line.c:80 packet: fetch< agent=git/2.38.4.gl1 08:33:29.530124 pkt-line.c:80 packet: fetch< ls-refs=unborn 08:33:29.530124 pkt-line.c:80 packet: fetch< fetch=shallow wait-for-done filter 08:33:29.530124 pkt-line.c:80 packet: fetch< server-option 08:33:29.530124 pkt-line.c:80 packet: fetch< object-format=sha1 08:33:29.530124 pkt-line.c:80 packet: fetch< object-info 08:33:29.530124 pkt-line.c:80 packet: fetch< 0000 08:33:51.764098 pkt-line.c:80 packet: fetch> command=fetch 08:33:51.764098 pkt-line.c:80 packet: fetch> agent=git/2.39.2.windows.1 08:33:51.764098 pkt-line.c:80 packet: fetch> object-format=sha1 08:33:51.764098 pkt-line.c:80 packet: fetch> 0001 08:33:51.764098 pkt-line.c:80 packet: fetch> thin-pack 08:33:51.764098 pkt-line.c:80 packet: fetch> no-progress 08:33:51.764098 pkt-line.c:80 packet: fetch> ofs-delta 08:33:51.764098 pkt-line.c:80 packet: fetch> shallow 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 08:33:51.764098 pkt-line.c:80 packet: fetch> filter blob:none But we can see this blob:none 'mistake' again... 08:33:51.764098 pkt-line.c:80 packet: fetch> want 0000063ae70e4385b0527df060daf0a81b306c8d 08:33:51.764098 pkt-line.c:80 packet: fetch> want 00003efd5bd3b3795950588f7d051e8d0b42def3 ---8<--- many, many thousands of objects removed 08:33:54.154726 pkt-line.c:80 packet: fetch> want ffffd8aad00ab11c0672096203f57564b286da08 08:33:54.154726 pkt-line.c:80 packet: fetch> want ffffd967eaed43bf87d40a84f2f9e12c59575abe 08:33:54.154726 pkt-line.c:80 packet: fetch> done 08:33:54.154726 pkt-line.c:80 packet: fetch> 0000 ... with all of those trees 08:33:56.058314 pkt-line.c:80 packet: fetch< shallow-info 08:33:56.058314 pkt-line.c:80 packet: fetch< 0001 08:33:56.058314 pkt-line.c:80 packet: fetch< packfile 08:33:57.783631 pkt-line.c:80 packet: sideband< PACK ... 
08:33:57.799321 run-command.c:655 trace: run_command: git index-pack --stdin --fix-thin '--keep=fetch-pack 6540 on win-pool7447' --promisor --pack_header=2,344628 08:33:57.830596 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:33:57.830596 git.c:460 trace: built-in: git index-pack --stdin --fix-thin '--keep=fetch-pack 6540 on win-pool7447' --promisor --pack_header=2,344628 08:34:35.033275 pkt-line.c:80 packet: sideband< 0000 08:34:46.298777 run-command.c:655 trace: run_command: git maintenance run --auto --no-quiet 08:34:46.330052 exec-cmd.c:237 trace: resolved executable dir: C:/Program Files/Git/mingw64/libexec/git-core 08:34:46.345653 git.c:460 trace: built-in: git maintenance run --auto --no-quiet 08:45:14.094158 pkt-line.c:80 packet: sideband< \1 08:45:18.281129 pkt-line.c:80 packet: sideband< \2pre-receive started at 1677077118283 remote: pre-receive started at 1677077118283 08:45:18.297266 pkt-line.c:80 packet: sideband< \2Received line '0000000000000000000000000000000000000000 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 refs/tags/hswebrec/app/stage1/l 08:45:18.297266 pkt-line.c:80 packet: sideband< \2atest' remote: Received line '0000000000000000000000000000000000000000 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 refs/tags/hswebrec/app/stage1/latest' ---8<--- clipped pre-receive output 08:45:18.550847 pkt-line.c:80 packet: sideband< \2pre-receive finished at 1677077118556 (273 ms) remote: pre-receive finished at 1677077118556 (273 ms) 08:45:21.448647 pkt-line.c:80 packet: sideband< \1000eunpack ok002cok refs/tags/hswebrec/app/stage1/latest0000 08:45:21.448647 pkt-line.c:80 packet: push< unpack ok 08:45:21.448647 pkt-line.c:80 packet: push< ok refs/tags/hswebrec/app/stage1/latest 08:45:21.448647 pkt-line.c:80 packet: push< 0000 08:45:21.798337 pkt-line.c:80 packet: sideband< \2post-receive started at 1677077121804 remote: post-receive started at 1677077121804 08:45:21.814043 pkt-line.c:80 packet: sideband< \2Received 
line '0000000000000000000000000000000000000000 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 refs/tags/hswebrec/app/stage1/l 08:45:21.814043 pkt-line.c:80 packet: sideband< \2atest' remote: Received line '0000000000000000000000000000000000000000 0962f6cd9b1f2b5a012581823c12f0f0619bd3f5 refs/tags/hswebrec/app/stage1/latest' ---8<--- clipped post-receive output 08:45:21.972423 pkt-line.c:80 packet: sideband< \23.0000; path=/; Httponly; Secure"]}}post-receive finished at 1677077121979 (175 ms) remote: post-receive finished at 1677077121979 (175 ms) 08:45:22.051793 pkt-line.c:80 packet: sideband< 0000 To $REPO.git * [new tag] FETCH_HEAD -> hswebrec/app/stage1/latest -- Sean Allred ^ permalink raw reply [flat|nested] 7+ messages in thread
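With tens of thousands of `want` lines, the two fetch rounds in a trace like
the one above are easier to compare when summarized. The following is a
sketch only: `summarize_trace` is a hypothetical name, and it assumes the
`packet: fetch> ...` line shapes seen in this log (one `command=fetch` line
per round, followed by its `filter` and `want` lines).

```shell
# Hypothetical helper: for each fetch round in a GIT_TRACE_PACKET log,
# print the filter that was requested and the number of "want" lines.
# Usage: summarize_trace <logfile>
summarize_trace() {
    awk '
        /fetch> command=fetch/ { n++ }                 # a new fetch round starts
        /fetch> filter /       { filter[n] = $NF }     # e.g. blob:none
        /fetch> want /         { wants[n]++ }          # one object requested
        END {
            for (i = 1; i <= n; i++)
                printf "fetch %d: filter=%s wants=%d\n", i, filter[i], wants[i] + 0
        }
    ' "$1"
}
```

On the full, unedited log this would show at a glance that both rounds asked
for `blob:none` and how many objects each round requested.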
* Re: [BUGREPORT] Why is git-push fetching content? 2023-02-22 15:04 ` Sean Allred @ 2023-06-20 11:26 ` Tao Klerks 2023-07-08 6:27 ` Sean Allred 0 siblings, 1 reply; 7+ messages in thread From: Tao Klerks @ 2023-06-20 11:26 UTC (permalink / raw) To: Sean Allred; +Cc: brian m. carlson, Sean Allred, Kyle VandeWalle, git On Wed, Feb 22, 2023 at 4:45 PM Sean Allred <allred.sean@gmail.com> wrote: > > > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > > It's hard to know for certain what's going on here, but it depends on > > your history. You did a partial clone with no trees, so you've likely > > received a single commit object and no trees or blobs. > > Yup, this was the intention behind `--depth=1 --filter=tree:0`. The > server doing this ref update needs to be faster than having the full > history would allow. > FWIW, you're not alone - we do exactly the same thing, for the same reasons, and get the same outcome: We want to create a tag in a CI job, that particular CI job has no reason to check out the code, all we know is we want ref XXXXX to point to commit YYYYY. The most logical way to achieve that seems to be to do a shallow partial no-checkout clone of commit YYYYY, and then push to remote ref XXXXX, but the push ends up doing extra seemingly-unnecessary jit-fetching work. In our case it's still better than any alternative we've found, but wastes a few seconds that we'd love to see optimized away. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [BUGREPORT] Why is git-push fetching content? 2023-06-20 11:26 ` Tao Klerks @ 2023-07-08 6:27 ` Sean Allred 2023-07-08 8:39 ` Sean Allred 0 siblings, 1 reply; 7+ messages in thread From: Sean Allred @ 2023-07-08 6:27 UTC (permalink / raw) To: Tao Klerks Cc: Sean Allred, brian m. carlson, Sean Allred, Kyle VandeWalle, git Thanks for the replies. I'd like to bump this up again. This has come up in a new context and I don't see a viable workaround for us that doesn't involve a rewrite of the process and an excessive amount of new infrastructure. I have a feeling this is somehow a general issue with promisor remotes, though I don't know enough about how they work to know where to start investigation. I've got what I believe to be minimal reproduction steps below. Tao Klerks <tao@klerks.biz> writes: > On Wed, Feb 22, 2023 at 4:45 PM Sean Allred <allred.sean@gmail.com> wrote: >> "brian m. carlson" <sandals@crustytoothpaste.net> writes: >> > It's hard to know for certain what's going on here, but it depends on >> > your history. You did a partial clone with no trees, so you've likely >> > received a single commit object and no trees or blobs. >> >> Yup, this was the intention behind `--depth=1 --filter=tree:0`. The >> server doing this ref update needs to be faster than having the full >> history would allow. >> > > FWIW, you're not alone - we do exactly the same thing, for the same > reasons, and get the same outcome: We want to create a tag in a CI > job, that particular CI job has no reason to check out the code, all > we know is we want ref XXXXX to point to commit YYYYY. > > [...] > > In our case it's still better than any alternative we've found, but > wastes a few seconds that we'd love to see optimized away. Unfortunately in our case, 'a few seconds' is tens of minutes (I'm working with a repository of several million commits) and is timing out the remote host. 
----

I devised some minimal steps to reproduce what I believe to be a related
issue: rev-list fetching content. I've prepared a public repository on
github.com to demonstrate, but you should be able to recreate this
repository if needed by just making a handful of commits to a couple
arbitrary files.

    (cwd:tmp) $ git clone --no-checkout --depth=1 --no-tags --filter=tree:0 https://github.com/vermiculus/testibus.git
    Cloning into 'testibus'...
    remote: Enumerating objects: 1, done.
    remote: Counting objects: 100% (1/1), done.
    remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
    Receiving objects: 100% (1/1), done.

Sweet, I've only received one object from the remote. This makes sense
per what I want: a treeless, blobless, fetch of a single commit. Let's
double-check.

    (cwd:testibus) $ git fsck
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (2/2), done.

I have two objects? How'd that second one get in there? What is it?
Let's try to find out...

    (cwd:testibus) $ git rev-list --objects --all
    d86642e7ae089b69e8a0b20a3e39337435833f92

Alright, I've got the commit object. That makes sense.

    c0fa909c5f67047abc027d9b06e1352954ee33f7

Weird, I also got the tree on the commit, even though I specified that
this should be a treeless clone.

    remote: Enumerating objects: 1, done.
    remote: Counting objects: 100% (1/1), done.
    remote: Total 1 (delta 0), reused 1 (delta 0), pack-reused 0
    Receiving objects: 100% (1/1), 54 bytes | 54.00 KiB/s, done.
    94b334d80405218e281a6f5b48d31f73cd3af4be file

Woah woah! All I did was rev-list; why are we fetching content?

This is why I believe this is related to the push issue I'm ultimately
facing -- I'm not familiar with the specifics, but it stands to reason
that git-push needs to (somehow) iterate through objects in order to
negotiate a packfile with the remote. I suspect these two issues have
the same root cause.

I believe the following can be used with git-bisect to determine if this
truly ever worked or is a regression:

setup:

    #!/bin/bash

    repo="https://github.com/vermiculus/testibus.git"
    # "$HOME" rather than "~": a tilde inside double quotes is not expanded
    repo_dir="$HOME/path/to/repo"

    git clone --no-checkout --depth=1 --no-tags --filter=tree:0 "$repo" "$repo_dir"
    # make any later attempt to fetch from the remote fail
    git -C "$repo_dir" remote set-url origin unreachable

bisect script:

    git -C "$repo_dir" rev-list --objects --all

(obviously using the just-built git)

I'm going to start running this bisect, but I suspect it will take a
while, so I wanted to get this out there.

--
Sean Allred
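For inspecting which objects a partial clone lacks without triggering the
lazy fetch seen in the rev-list session above, `git rev-list` has a
`--missing=print` option which, as far as I understand it, lists absent
objects with a `?` prefix instead of requesting them from the promisor
remote. A minimal sketch, with `list_missing` as a hypothetical helper name:

```shell
# Hypothetical helper: print the IDs of objects that are reachable from any
# ref but not present locally, one per line, without the leading "?".
# Prints nothing when the repository is complete.
# Usage: list_missing [repo-dir]
list_missing() {
    git -C "${1:-.}" rev-list --objects --all --missing=print |
        sed -n 's/^?//p'
}
```

In the `testibus` clone above one would expect this to name the tree or blob
that the plain `rev-list --objects --all` ended up downloading, though I
have not verified that against that repository.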
* Re: [BUGREPORT] Why is git-push fetching content? 2023-07-08 6:27 ` Sean Allred @ 2023-07-08 8:39 ` Sean Allred 0 siblings, 0 replies; 7+ messages in thread From: Sean Allred @ 2023-07-08 8:39 UTC (permalink / raw) To: Sean Allred Cc: Tao Klerks, brian m. carlson, Sean Allred, Kyle VandeWalle, git Following up with the results of my bisect (more discussion below). I'm forced to conclude this may somehow have never worked as I'm expecting (even though I do recall it working well in a long-gone environment), but I'm very much hoping I just did the bisect incorrectly. (It's not a feature I need to use much.) So, is this a bug or is this working as intended for a good reason? Sean Allred <allred.sean@gmail.com> writes: > Thanks for the replies. I'd like to bump this up again. This has come up > in a new context and I don't see a viable workaround for us that doesn't > involve a rewrite of the process and an excessive amount of new > infrastructure. > > I have a feeling this is somehow a general issue with promisor remotes, > though I don't know enough about how they work to know where to start > investigation. I've got what I believe to be minimal reproduction steps > below. > > [...] > > I believe the following can be used with git-bisect to determine if this > truly ever worked or is a regression: > > setup: > #!/bin/bash > > repo="https://github.com/vermiculus/testibus.git" > repo_dir="~/path/to/repo" > > git clone --no-checkout --depth=1 --no-tags --filter=tree:0 "$repo" "$repo_dir" > git -C "$repo_dir" remote set-url origin unreachable > > bisect script: > git -C "$repo_dir" rev-list --objects --all > > (obviously using the just-built git) > > I'm going to start running this bisect, but I suspect it will take a > while, so I wanted to get this out there. 
I ended up using a bisect script that looks like this

    #!/bin/bash
    make clean
    NO_GETTEXT=1 make -j8 || exit 125
    ./bin-wrappers/git -C "$1" rev-list --objects --all || exit 1
    git rev-parse HEAD >> ../good-commits

and running

    git bisect start main 637fc4467e57872008171958eda0428818a7ee03
    git bisect run ../bisect-script.sh ~/tmp/testibus/

It took less time than I thought, but unfortunately I was never able to
actually find a 'good' commit. I arbitrarily chose "partial-clone:
design doc" (Jeff Hostetler, Dec 14 2017) as the first commit to the
partial-clone design document (under the assumption that it worked at
some point). If potentially lying to git-bisect in this way is
especially liable to bust it, I can start the
exponentially-more-expensive process of testing every commit along
--first-parent, but I suspect this may have never worked as I'm
expecting.

--
Sean Allred
[parent not found: <7bfb7ecd4a4c78668f97b00d5f06af0c9b2878269476e89c3311eeb8071b1ab3@mu.id>]
* Re: [BUGREPORT] Why is git-push fetching content? [not found] ` <7bfb7ecd4a4c78668f97b00d5f06af0c9b2878269476e89c3311eeb8071b1ab3@mu.id> @ 2023-02-22 15:48 ` Sean Allred 0 siblings, 0 replies; 7+ messages in thread From: Sean Allred @ 2023-02-22 15:48 UTC (permalink / raw) To: brian m. carlson; +Cc: Sean Allred, Kyle VandeWalle, git Apologies for the double-email; in switching between desktops, I prematurely sent my last message. Luckily I was very nearly done. Sean Allred <allred.sean@gmail.com> writes: > But we can see this blob:none 'mistake' again... > > 08:33:51.764098 pkt-line.c:80 packet: fetch> want 0000063ae70e4385b0527df060daf0a81b306c8d > 08:33:51.764098 pkt-line.c:80 packet: fetch> want 00003efd5bd3b3795950588f7d051e8d0b42def3 > > ---8<--- many, many thousands of objects removed > > 08:33:54.154726 pkt-line.c:80 packet: fetch> want ffffd8aad00ab11c0672096203f57564b286da08 > 08:33:54.154726 pkt-line.c:80 packet: fetch> want ffffd967eaed43bf87d40a84f2f9e12c59575abe > 08:33:54.154726 pkt-line.c:80 packet: fetch> done > 08:33:54.154726 pkt-line.c:80 packet: fetch> 0000 > > ... with all of those trees I was verifying my suspicion in the other desktop -- but my suspicion was incorrect. These aren't all trees; in fact, the objects listed above are *just* blobs -- no other object types. One could assume that these are all the blobs in the 8da2fa tree, but I would expect that we'd get all the subtrees in 8da2fa as well in that case. I'm not sure how much more information I can extract from this list of blobs, but I'm open to suggestions if we think there's a pattern here to be discovered. -- Sean Allred ^ permalink raw reply [flat|nested] 7+ messages in thread
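The check described above (which object types the `want` lines refer to) can
be done mechanically against the trace log. This is a sketch, not what was
actually run: `want_types` is a hypothetical name, the `fetch> want <oid>`
line shape is taken from the trace in this thread, and it should be run only
once the objects are already local, since asking a partial clone about an
absent object may itself trigger a fetch.

```shell
# Hypothetical helper: tally the object types of every ID that appears in a
# "fetch> want <oid>" line of a GIT_TRACE_PACKET log.
# Usage: want_types <logfile> [repo-dir]
want_types() {
    sed -n 's/.*fetch> want \([0-9a-f]\{40,64\}\)$/\1/p' "$1" |
        sort -u |
        git -C "${2:-.}" cat-file --batch-check='%(objecttype)' |
        sort | uniq -c
}
```

A result consisting of a single `blob` line would match the observation that
the requested objects are *just* blobs.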
end of thread, other threads:[~2023-07-08 8:51 UTC]

Thread overview: 7+ messages
2023-02-21 22:01 [BUGREPORT] Why is git-push fetching content? Sean Allred
2023-02-21 23:02 ` brian m. carlson
2023-02-22 15:04   ` Sean Allred
2023-06-20 11:26     ` Tao Klerks
2023-07-08  6:27       ` Sean Allred
2023-07-08  8:39         ` Sean Allred
     [not found]   ` <7bfb7ecd4a4c78668f97b00d5f06af0c9b2878269476e89c3311eeb8071b1ab3@mu.id>
2023-02-22 15:48     ` Sean Allred