From: Lucas De Marchi <lucas.demarchi@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: git@vger.kernel.org
Subject: Re: [BUG REPORT] split-index behavior during interactive rebase
Date: Tue, 21 Sep 2021 00:34:02 -0700
Message-ID: <20210921073402.cf4y3gp7yyfirfnq@ldmartin-desk2>
In-Reply-To: <20210916055057.GT3389343@mdroper-desk1.amr.corp.intel.com>

On Wed, Sep 15, 2021 at 10:50:57PM -0700, Matt Roper wrote:
>What did you do before the bug happened? (Steps to reproduce your issue)
>
>  I activated split index mode on a repo ("git config core.splitIndex
>  true"), performed an interactive rebase, modified a commit earlier in
>  the history.
>
>  The steps can be reproduced via a sequence of:
>      $ mkdir tmp && cd tmp && git init
>      $ git config core.splitIndex true
>      $ for x in `seq 20`; do echo $x >> count; git add count; git commit -m "Commit $x"; done
>      $ git rebase -i HEAD~10
>
>      ## Add "x git commit --amend --no-edit" as the first command of
>      ## the todo list.
>
>What did you expect to happen? (Expected behavior)
>
>  My expectation was that there would still only be a single shared index
>  file in the .git directory upon completion of the rebase.
>
>What happened instead? (Actual behavior)
>
>  A large number of distinct sharedindex.* files were generated in the .git
>  directory during the rebase.
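
For anyone who wants to rerun the steps quoted above without editing the todo list by hand, here is a rough scripted sketch. It is not what Matt ran: the temporary directory, the placeholder user.name/user.email and the small todo-editing script are my own assumptions, and it uses GIT_SEQUENCE_EDITOR to prepend the "x git commit --amend --no-edit" line. How many files it reports will depend on the git version.

```shell
#!/bin/sh
# Sketch of the quoted reproduction steps, made non-interactive.
# Assumptions (not from the report): throwaway repo under mktemp,
# placeholder committer identity, helper script for the todo list.

repro_dir=$(mktemp -d)
cd "$repro_dir" || exit 1
git init -q .
git config core.splitIndex true
git config user.name "Repro User"          # placeholder identity
git config user.email "repro@example.com"  # placeholder identity

for x in $(seq 20); do
	echo "$x" >>count
	git add count
	git commit -q -m "Commit $x"
done

# Make sure the amended commit gets a later committer timestamp than
# the original one; otherwise the amend can recreate the very same
# commit and the history is not actually modified.
sleep 1

# Tiny "editor" that prepends the exec line to the rebase todo file ($1).
todo_editor=$(mktemp)
cat >"$todo_editor" <<'EOF'
{ echo 'x git commit --amend --no-edit'; cat "$1"; } >"$1.new" &&
mv "$1.new" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $todo_editor" git rebase -i HEAD~10

# Count the shared index files left behind after the rebase.
find .git -maxdepth 1 -type f -name 'sharedindex.*' | wc -l
```

The sleep matters only because the script runs much faster than a human would: the point is to force a real rewrite of the first commit so that the later picks cannot simply fast-forward.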

This is probably relevant for debugging, although I haven't figured out the
cause yet. The following works fine and creates only one .git/sharedindex.* file:

	git config core.splitIndex true
	git am 000[123].patch
	git config core.splitIndex false

Prepare test:
	git config core.splitIndex false
	git update-index --no-split-index
	rm .git/sharedindex.*
	git reset --hard HEAD~3

	git -c core.splitIndex=true am 000[123].patch

This will create 4 .git/sharedindex.* files.

After that, each call to status creates 1 more .git/sharedindex.* file if the
current HEAD doesn't match the previous one and the splitIndex setting doesn't
match the previous one. The count keeps increasing:

	git reset --hard ORIG_HEAD; git -c core.splitIndex=true status; ls -l .git/sharedindex.* | wc -l
	...
	4
	git reset --hard ORIG_HEAD; git -c core.splitIndex=true status; ls -l .git/sharedindex.* | wc -l
	...
	5
	...
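
To make the counting in the commands above a bit less noisy, a small helper along these lines could be used instead of "ls -l ... | wc -l". This is only a sketch; the function name count_shared_indexes is made up for illustration, and it assumes a find that supports -maxdepth (GNU and BSD find both do).

```shell
# count_shared_indexes [GIT_DIR]
# Print how many sharedindex.* files sit directly inside GIT_DIR
# (default: .git). The name is hypothetical, not a git command.
count_shared_indexes() {
	git_dir=${1:-.git}
	# Unlike "ls .git/sharedindex.*", find prints nothing rather than
	# an error when no file matches, so the result is a clean 0.
	find "$git_dir" -maxdepth 1 -type f -name 'sharedindex.*' |
		wc -l | tr -d ' '
}
```

With that defined in the current shell, each step above could be written as:

	git reset --hard ORIG_HEAD; git -c core.splitIndex=true status; count_shared_indexes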

Note that if I also pass -c core.splitIndex=true to git reset, this behavior
goes away. It seems that the splitIndex setting is somehow getting reset
during git-am with multiple patches (or during rebase)... ?

Lucas De Marchi

>
>What's different between what you expected and what actually happened?
>
>  Rather than a single shared index file, I wound up with a huge number of
>  large shared index files.  The real repository I was working with (a Linux
>  kernel source tree) had a shared index file size of about 7MB, and I was
>  modifying a commit several hundred back in history (in case it
>  matters, these were all linear commits, no merges), so the resulting
>  collection of shared index files consumed a surprising amount of disk
>  space.
>
>Anything else you want to add:
>
>  As an experiment, I tried setting splitIndex.sharedIndexExpire=now to see
>  if it would avoid the explosion of shared index files, but it appears the
>  stale index files are still not being removed during the rebase, and I
>  still wind up with a huge number at the end of the rebase.  If I manually
>  run "git update-index --split-index" after the rebase completes it will
>  properly delete all of the stale ones at that point.
>
>  Rebases that do not actually modify the history do _not_ trigger the
>  explosion of shared index files (e.g., "git rebase -i HEAD~10 --exec 'echo
>  foo'").
>
>  If I do not set the core.splitIndex setting on the repository, but only
>  activate split index manually via "git update-index --split-index" there
>  is only one shared index file at the end of the rebase, but based on the
>  file size it appears the repository is no longer operating in split index
>  mode.
>
>  Before:
>  $ ll .git | grep index
>  -rw-rw-r--   1 mdroper mdroper   149165 Sep 15 22:21 index
>  -rw-rw-r--   1 mdroper mdroper  7296080 Sep 15 22:21 sharedindex.f916dd59ccc22ca34298f557a4659aca2767dae4
>
>  After (just amending HEAD~1 in this case):
>  $ ls -l .git | grep index
>  -rw-rw-r--   1 mdroper mdroper  7445145 Sep 15 22:22 index
>  -rw-rw-r--   1 mdroper mdroper  7296080 Sep 15 22:22 sharedindex.f916dd59ccc22ca34298f557a4659aca2767dae4
>
>
>[System Info]
>git version:
>git version 2.33.0
>cpu: x86_64
>no commit associated with this build
>sizeof-long: 8
>sizeof-size_t: 8
>shell-path: /bin/sh
>uname: Linux 5.8.18-100.fc31.x86_64 #1 SMP Mon Nov 2 20:32:55 UTC 2020 x86_64
>compiler info: gnuc: 9.3
>libc info: glibc: 2.30
>$SHELL (typically, interactive shell): /bin/bash
>
>
>[Enabled Hooks]
>
>-- 
>Matt Roper
>Graphics Software Engineer
>VTT-OSGC Platform Enablement
>Intel Corporation
>(916) 356-2795

Thread overview: 4+ messages
2021-09-16  5:50 [BUG REPORT] split-index behavior during interactive rebase Matt Roper
2021-09-21  7:34 ` Lucas De Marchi [this message]
2021-09-26 21:57 ` SZEDER Gábor
2021-09-27  2:17   ` Matt Roper
