* Cleaning up history with git rebase
@ 2011-07-31 17:20 Ricky Egeland
  2011-07-31 20:21 ` Michael Witten
  0 siblings, 1 reply; 12+ messages in thread
From: Ricky Egeland @ 2011-07-31 17:20 UTC (permalink / raw)
  To: git; +Cc: Patricia Bittencourt Egeland

Hi,

I'll put the question here before the long story: is there a way to automatically rebase a repository (i.e. no conflicts that need manual resolution) from root to HEAD such that the final state always ends up the same as the HEAD?

I'm a relative novice at git and I've been faced with the following task: take a large, 5-year old git repository that contains code for several modular components, and break it apart into ~240 separate git repositories, while preserving history.

big-repo.git (1000's of files) -> component-A.git (10's of files)
                               -> component-B.git (10's of files)
                               -> component-C.git (10's of files)
                               ... (~ 240 new sub-repositories)

The reason for this is to mutually decouple the versioning and release of these components into production, something that seems impossible to do from the single huge git repository we have today.

I've succeeded at breaking apart this big repository using `git filter-branch`, but where I am failing is the cleanup of the history of these new sub-repositories.  The original big repository was used for years in a CVS-like fashion, with about 20 or so developers doing a pull/edit/pull/push workflow using a centralized shared repository.  Most developers were working on unrelated components, so merge conflicts were rare, but there are some exceptions to that.  The end result is that there are a lot of merge commits in big-repo.git, and in the case of my split sub-repositories these merge commits still appear in the history, even for merges which did not involve files that end up in a given repository.  In most cases, there are more merge commits in the history than there are commits that actually affected the code that is left in these sub-repositories.  I really want to clean this up.

Looking online, it seems that `git rebase` is the way to go for this cleanup.  I use it like this:

git rebase $(git rev-list --reverse HEAD | head -n 1)

Which I take to mean "rebase this repository from the root to the current HEAD".  In many cases it works perfectly, resulting in a short, clean history that only pertains to the files left in the new sub-repository.  But some of the more actively developed components are problematic, as `git rebase` runs into conflicts and becomes interactive, and it is simply too tedious to use the interactive mode to resolve these problems.  I've found a recipe for resolving these conflicts:

 - git status
 - # look for files with problems like "both modified", or "both added", set $CONFLICTFILE
 - git checkout --theirs $CONFLICTFILE
 - git add $CONFLICTFILE
 - git commit -m 'Fixing conflict during rebase' $CONFLICTFILE
 - git rebase --continue
 - # look for message like "did you forget to add..." if so, use --skip
 - git rebase --skip
 - # repeat as often as necessary

For some of my sub-repositories this recipe did exactly what I wanted after repeating only a couple times.  However, some of my sub-repositories have been forcing me to repeat this more than 50 times, and I grew tired and started to look for ways to automate this.  In essence, I want a non-interactive `git rebase`.

To that end I upgraded my version of git to 1.7.4 and tried (without really understanding what these were doing):

1. git rebase -s recursive -X theirs $(git rev-list --reverse HEAD | head -n 1)
2. git rebase -s recursive -X ours $(git rev-list --reverse HEAD | head -n 1)
3. git rebase -s ours $(git rev-list --reverse HEAD | head -n 1)

Methods 1 and 2 were still interactive and stopped at conflicts.  Method 3 was automatic but left me with the sub-repository at the state of the root commit... the opposite of what I want.

So finally, my question: is there a way to automatically rebase a repository (i.e. no conflicts that need manual resolution) from root to HEAD such that the final state always ends up the same as the HEAD?  I could try to script my recipe above in order to automate that, but the -s and -X options of `git rebase` lead me to believe git developers have really been working to try to automate stuff like this.  I don't know, I'm stuck and looking for suggestions.

Thanks,
Ricky Egeland

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Cleaning up history with git rebase
  2011-07-31 17:20 Cleaning up history with git rebase Ricky Egeland
@ 2011-07-31 20:21 ` Michael Witten
  2011-07-31 21:33   ` Michael Witten
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Witten @ 2011-07-31 20:21 UTC (permalink / raw)
  To: Ricky Egeland; +Cc: Patricia Bittencourt Egeland, git

On Sun, 31 Jul 2011 14:20:37 -0300, Ricky Egeland wrote:

> I've succeeded at breaking apart this big repository using
> `git filter-branch`, but where I am failing is the cleanup of
> the history of these new sub-repositories. The original big
> repository was used for years in a CVS-like fashion, with about
> 20 or so developers doing a pull/edit/pull/push workflow using a
> centralized shared repository. Most developers were working on
> unrelated components, so merge conflicts were rare, but there are
> some exceptions to that. The end result is that there are a lot
> of merge commits in big-repo.git, and in the case of my split
> sub-repositories these merge commits still appear in the history,
> even for merges which did not involve files that end up in a
> given repository. In most cases, there are more merge commits in
> the history than there are commits that actually affected the
> code that is left in these sub-repositories. I really want to
> clean this up.


Maybe you could use a more sophisticated filter-branch script to
examine merge commits and split them up or throw them out as
necessary.


> git rebase $(git rev-list --reverse HEAD | head -n 1)


As an aside, I would have expected to be able to limit the
`rev-list' output directly as in the following:

  git rebase $(git rev-list -1 --reverse HEAD)

but it doesn't work; when `-1' is passed, rev-list ignores
`--reverse', which I think is a bug.
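
A minimal sketch of this behaviour, assuming a throwaway repository in
a temporary directory (the file name, identity, and commit messages are
made up for illustration). It suggests that the `-1' limit is applied
before `--reverse' reorders anything, so only the newest commit is ever
selected. The `--max-parents=0' option shown as an alternative is
believed to exist only in git releases newer than the 1.7.4 mentioned
in this thread:

```shell
# Throwaway repository with three commits (names are illustrative only).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
for n in 1 2 3; do
  echo "$n" > file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

head_commit=$(git rev-parse HEAD)

# The limit is applied first, so --reverse only reorders a single commit.
limited=$(git rev-list -1 --reverse HEAD)

# Reverse the whole list, then keep the first line: the root commit.
root_by_pipe=$(git rev-list --reverse HEAD | head -n 1)

# Alternative on newer git: select commits that have no parents.
root_by_parents=$(git rev-list --max-parents=0 HEAD)

echo "HEAD:                  $head_commit"
echo "rev-list -1 --reverse: $limited"
echo "root via head -n 1:    $root_by_pipe"
echo "root via max-parents:  $root_by_parents"
```

In this sketch the first two printed IDs match each other, and so do
the last two.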


> Which I take to mean "rebase this repository from the root to the
> current HEAD". In many cases it works perfectly, resulting in
> a short, clean history that only pertains to the files left in
> the new sub-repository. But some of the more actively developed
> components are problematic, as `git rebase` runs into
> conflicts and becomes interactive, and it is simply too tedious
> to use the interactive mode to resolve these problems. I've found
> a recipe for resolving these conflicts:


I don't understand why there would even be conflicts; can you give
a concrete example so that I might be able to understand the
scenario better?


>
>  - git status
>  - # look for files with problems like "both modified", or
>    "both added", set $CONFLICTFILE
>  - git checkout --theirs $CONFLICTFILE
>  - git add $CONFLICTFILE
>  - git commit -m 'Fixing conflict during rebase' $CONFLICTFILE


As an aside, putting `$CONFLICTFILE' on the `git commit' command
line seems to be redundant; from `git help commit':
  
  When files are given on the command line, the command commits
  the contents of the named files, without recording the changes
  already staged. The contents of these files are also staged for
  the next commit on top of what have been staged before.


>  - git rebase --continue
>  - # look for message like "did you forget to add..." if so, use --skip
>  - git rebase --skip
>  - # repeat as often as necessary
>
> For some of my sub-repositories this recipe did exactly what I
> wanted after repeating only a couple times. However, some of my
> sub-repositories have been forcing me to repeat this more than 50
> times, and I grew tired and started to look for ways to automate
> this. In essence, I want a non-interactive `git rebase`.
>
> To that end I upgraded my version of git to 1.7.4 and tried
> (without really understanding what these were doing):
>
> 1. git rebase -s recursive -X theirs \
>      $(git rev-list --reverse HEAD | head -n 1)
>
> 2. git rebase -s recursive -X ours \
>      $(git rev-list --reverse HEAD | head -n 1)
>
> 3. git rebase -s ours $(git rev-list --reverse HEAD | head -n 1)
>
> Method 1 and 2 were still interactive and stopped at conflicts.
> Method 3 was automatic but left me with the sub-repository at the
> state of the root commit... the opposite of what I want.

The results of Method 3 are predicted by the `git help rebase' page:

  Because git rebase replays each commit from the working branch
  on top of the <upstream> branch using the given strategy, using
  the ours strategy simply discards all patches from the <branch>,
  which makes little sense.

In any case, how would you know which side to take in general?
It seems to me that you need to do something different at the
filter-branch step. Why are there conflicts anyway?

Sincerely,
Michael Witten


* Re: Cleaning up history with git rebase
  2011-07-31 20:21 ` Michael Witten
@ 2011-07-31 21:33   ` Michael Witten
  2011-07-31 21:44     ` Ricky Egeland
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Witten @ 2011-07-31 21:33 UTC (permalink / raw)
  To: Ricky Egeland; +Cc: Patricia Bittencourt Egeland, git

On Sun, Jul 31, 2011 at 20:21, Michael Witten <mfwitten@gmail.com> wrote:
> Why are there conflicts anyway?

Oh...

I guess there were conflicts when the merge commit was made in the
original repository, and these conflicts were resolved by the merge
commit itself. Hence, when rebase tries to split up a merge by dealing
with just the non-merge parents, you end up having to deal with the
conflict again.

Shouldn't rebase take this into account?
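
The scenario guessed at above can be reproduced in miniature. The
following is only a sketch under assumptions: a throwaway repository,
one made-up file, and a branch called `side' that exists purely for
the demonstration. Two branches change the same line, the merge commit
records the resolution, and a plain rebase onto the root then stops on
the same conflict because the merge commit is not replayed:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Root commit, then two branches that change the same line differently.
echo "base" > file.txt
git add file.txt
git commit -q -m "root"
root=$(git rev-parse HEAD)
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b side
echo "side" > file.txt
git commit -q -a -m "change on side"

git checkout -q "$main_branch"
echo "main" > file.txt
git commit -q -a -m "change on main"

# The merge conflicts; the resolution is recorded only in the merge commit.
git merge side >/dev/null 2>&1 || true
echo "resolved" > file.txt
git add file.txt
git commit -q -m "merge side, resolving the conflict"

# A plain rebase leaves the merge commit out and replays both changes
# one after the other, so the conflict comes back without its resolution.
if git rebase "$root" >/dev/null 2>&1; then
  outcome="clean"
else
  outcome="conflict"
  git rebase --abort
fi
echo "rebase outcome: $outcome"
```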


* Re: Cleaning up history with git rebase
  2011-07-31 21:33   ` Michael Witten
@ 2011-07-31 21:44     ` Ricky Egeland
  2011-08-01  1:02       ` Michael Witten
  0 siblings, 1 reply; 12+ messages in thread
From: Ricky Egeland @ 2011-07-31 21:44 UTC (permalink / raw)
  To: Michael Witten; +Cc: Patricia Bittencourt Egeland, git


On Jul 31, 2011, at 6:33 PM, Michael Witten wrote:

> On Sun, Jul 31, 2011 at 20:21, Michael Witten <mfwitten@gmail.com> wrote:
>> Why are there conflicts anyway?
> 
> Oh...
> 
> I guess there were conflicts when the merge commit was made in the
> original repository, and these conflicts were resolved by the merge
> commit itself. Hence, when rebase tries to split up a merge by dealing
> with just the non-merge parents, you end up having to deal with the
> conflict again.

Yes, I thought something like this was going on, too.  In the pre-rebase history, wherever there is a commit with "Conflict:" that lists a file which is in the sub-repository history, rebase stops with a conflict.

> Shouldn't rebase take this into account?

Not sure.  It seems that it does not; it makes me resolve the conflict again.

-- Ricky


* Re: Cleaning up history with git rebase
  2011-07-31 21:44     ` Ricky Egeland
@ 2011-08-01  1:02       ` Michael Witten
  2011-08-01  1:07         ` Michael Witten
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Witten @ 2011-08-01  1:02 UTC (permalink / raw)
  To: Ricky Egeland; +Cc: Patricia Bittencourt Egeland, git

On Sun, 31 Jul 2011 18:44:43 -0300, Ricky Egeland wrote:

> On Jul 31, 2011, at 6:33 PM, Michael Witten wrote:
> 
>> On Sun, Jul 31, 2011 at 20:21, Michael Witten <mfwitten@gmail.com> wrote:
>>> Why are there conflicts anyway?
>>
>> Oh...
>>
>> I guess there were conflicts when the merge commit was made in
>> the original repository, and these conflicts were resolved by
>> the merge commit itself. Hence, when rebase tries to split up
>> a merge by dealing with just the non-merge parents, you end up
>> having to deal with the conflict again.
> 
> Yes, I thought it was something like this going on, too. In the
> pre-rebase history, when there is a commit with "Conflict:" and
> listing file which is in the sub-repository history, this is a
> point where rebase stops with a conflict.
> 
>> Shouldn't rebase take this into account?
> 
> Not sure.  Seems that it does not, it makes me resolve the conflict
> again.

What I'm saying is that I think git rebase should take this into account.

The following implements what I think `git rebase' should be doing;
run it instead of `git rebase' in your repo:

  git branch saved
  git rev-list HEAD --reverse --first-parent --parents |
  {
    read root
    git reset --hard $root
    rebase_head=$root

    while read commit first_parent other_parents; do

      if [ -z "$other_parents" ]; then

        git cherry-pick $commit
        rebase_head=$commit

      else

        for parent in $other_parents; do

          if ! git cherry-pick $parent; then

            git reset --hard $rebase_head
            git merge $other_parents
            git rm -rf .
            git checkout -- $commit
            git commit -aC $commit 
            break

          fi

        done

        rebase_head=$(git rev-parse HEAD)

      fi

    done
  }
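
The loop above relies on how `read' splits each line printed by
`git rev-list --parents': the first word goes to the first variable,
the second word to the second, and everything left over goes to the
last. A small sketch of just that splitting, using hypothetical
two-character IDs in place of real commit hashes and a made-up helper
named `classify':

```shell
# Classify each "<commit> [<parent>...]" line by how many parents it has.
classify() {
  while read commit first_parent other_parents; do
    if [ -z "$first_parent" ]; then
      echo "$commit root"
    elif [ -z "$other_parents" ]; then
      echo "$commit ordinary"
    else
      echo "$commit merge (other parents: $other_parents)"
    fi
  done
}

# Hypothetical IDs: a root, an ordinary commit, a two-parent merge,
# and a three-parent (octopus) merge.
result=$(printf '%s\n' "a1" "b2 a1" "c3 b2 f9" "d4 c3 e7 e8" | classify)
echo "$result"
```

For an octopus merge, `$other_parents' holds several IDs separated by
spaces, which is why the script iterates over it unquoted.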

Sincerely,
Michael Witten


* Re: Cleaning up history with git rebase
  2011-08-01  1:02       ` Michael Witten
@ 2011-08-01  1:07         ` Michael Witten
  2011-08-03 20:58           ` pbegeland
  2011-08-04 14:18           ` Michael Witten
  0 siblings, 2 replies; 12+ messages in thread
From: Michael Witten @ 2011-08-01  1:07 UTC (permalink / raw)
  To: Ricky Egeland; +Cc: Patricia Bittencourt Egeland, git


Woops!

This line:

  git checkout -- $commit

should be:

  git checkout $commit -- .
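
The difference between the two forms can be seen in a small sketch,
assuming a throwaway repository with one made-up file. Everything
after `--' is treated as a path, so the first form looks for a file
named like the commit ID, while the second form takes every path from
the given commit and leaves HEAD where it is:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "v1" > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo "v2" > file.txt
git commit -q -a -m "second"
second=$(git rev-parse HEAD)

# Commit ID after `--': interpreted as a path that does not exist.
if git checkout -- "$first" 2>/dev/null; then
  as_path="succeeded"
else
  as_path="failed"
fi

# Commit ID before `--', paths after: restore all paths from that commit
# into the index and the working tree without moving HEAD.
git checkout "$first" -- .
content=$(cat file.txt)
head_now=$(git rev-parse HEAD)

echo "checkout -- <commit>:   $as_path"
echo "file.txt after restore: $content"
```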


* Re: Cleaning up history with git rebase
  2011-08-01  1:07         ` Michael Witten
@ 2011-08-03 20:58           ` pbegeland
  2011-08-04 14:35             ` Michael Witten
  2011-08-04 14:18           ` Michael Witten
  1 sibling, 1 reply; 12+ messages in thread
From: pbegeland @ 2011-08-03 20:58 UTC (permalink / raw)
  To: Michael Witten; +Cc: Ricky Egeland, git

 Dear Michael,

      I tried to run the script in my repo. However, it seems that the 
 'git merge $other_parents' step fails. In the script output I see some 
 lines saying that files could not be merged, e.g.:

 warning: Cannot merge binary files: scienceportal/images/tabs/tabs-gray.png (HEAD vs. 84f6fc283861aa7c5798f58769789dd0b91a5e9d)
 warning: Cannot merge binary files: scienceportal/images/waiting.gif (HEAD vs. e033cbbf1e9d24b66cb55a04701c059dc945c1c3)

      Do you have any suggestions?

 Thanks,
 Patricia



* Re: Cleaning up history with git rebase
  2011-08-01  1:07         ` Michael Witten
  2011-08-03 20:58           ` pbegeland
@ 2011-08-04 14:18           ` Michael Witten
  1 sibling, 0 replies; 12+ messages in thread
From: Michael Witten @ 2011-08-04 14:18 UTC (permalink / raw)
  To: Patricia Bittencourt Egeland; +Cc: Ricky Egeland, git

I noticed that my script has another problem; the line:

  rebase_head=$commit

should be:

  rebase_head=$(git rev-parse HEAD)

I was trying to make an optimization, but it's the wrong
logic :-/
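
A short sketch of why the original commit ID cannot serve as the new
head, assuming a throwaway repository with made-up file names and a
demonstration branch called `side': a cherry-picked commit has a
different parent, so it gets a new ID, and only `git rev-parse HEAD'
names the commit that was actually created.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "root"
main_branch=$(git symbolic-ref --short HEAD)

# A commit on a side branch that will be cherry-picked.
git checkout -q -b side
echo "side" > side.txt
git add side.txt
git commit -q -m "add side.txt"
commit=$(git rev-parse HEAD)

# Move the main branch forward so the pick lands on a different parent.
git checkout -q "$main_branch"
echo "main" > main.txt
git add main.txt
git commit -q -m "add main.txt"

git cherry-pick "$commit" >/dev/null
rebase_head=$(git rev-parse HEAD)

echo "original commit: $commit"
echo "picked commit:   $rebase_head"
```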

Sorry for the confusion. Here is an updated version of
the entire script:

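  # Keep a pointer to the original history.
  git branch saved
  git rev-list HEAD --reverse --first-parent --parents |
  {
    # The first line is the root commit, which has no parents.
    read root
    git reset --hard "$root"
    rebase_head=$root

    # Each remaining line is: <commit> <first parent> [<other parents>...]
    while read commit first_parent other_parents; do

      if [ -z "$other_parents" ]; then

        # Ordinary commit: replay it on top of the new history.
        git cherry-pick "$commit"

      else

        # Merge commit: try to cherry-pick each of the other parents.
        for parent in $other_parents; do

          if ! git cherry-pick "$parent"; then

            # If a pick fails: go back, merge the other parents, then
            # overwrite the result with the tree of the original merge.
            git reset --hard "$rebase_head"
            git merge $other_parents
            git rm -rf .
            git checkout "$commit" -- .
            git commit -aC "$commit"
            break

          fi

        done

      fi

      rebase_head=$(git rev-parse HEAD)

    done
  }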


* Re: Cleaning up history with git rebase
  2011-08-03 20:58           ` pbegeland
@ 2011-08-04 14:35             ` Michael Witten
  2011-08-05 21:26               ` pbegeland
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Witten @ 2011-08-04 14:35 UTC (permalink / raw)
  To: Patricia Bittencourt Egeland; +Cc: Ricky Egeland, git

[Put your reply text below the quoted text.]

On Wed, 03 Aug 2011 17:58:26 -0300, Patricia Egeland wrote:

>      I tried to run the script in my repo. However, seems like the 'git 
> merge $other_parents' process fails. In the script output I see some 
> lines saying that files were not able to be merged, ie:
>
> warning: Cannot merge binary files: 
> scienceportal/images/tabs/tabs-gray.png (HEAD vs. 
> 84f6fc283861aa7c5798f58769789dd0b91a5e9d)
> warning: Cannot merge binary files: scienceportal/images/waiting.gif 
> (HEAD vs. e033cbbf1e9d24b66cb55a04701c059dc945c1c3)
>
>      Do you have some suggestion?

That's probably as expected; the script is coming across the conflict, but
it should be taking care of the conflict automatically.

Unfortunately, though, the results probably end up almost identical
to the original un-rebased branch, because the original script
actually has ANOTHER mistake (sorry!). See the updated version here (or
in your inbox):

  http://marc.info/?l=git&m=131246773005168&w=2
  Message-ID: <d62225a3cc5740cda7cb163a94d55892-mfwitten@gmail.com>


* Re: Cleaning up history with git rebase
  2011-08-04 14:35             ` Michael Witten
@ 2011-08-05 21:26               ` pbegeland
  2011-08-06 23:59                 ` Michael Witten
  0 siblings, 1 reply; 12+ messages in thread
From: pbegeland @ 2011-08-05 21:26 UTC (permalink / raw)
  To: Michael Witten; +Cc: Ricky Egeland, git

 On Thu, 04 Aug 2011 14:35:19 -0000, Michael Witten <mfwitten@gmail.com> 
 wrote:
> That's probably as expected; the script is coming across the 
> conflict, but
> it should be taking care of the conflict automatically.
>
> Unfortunately, though, the results probably end up being almost 
> completely
> similar to the original un-rebased branch because the original script
> actually has ANOTHER mistake (sorry!). See the updated version here 
> (or
> in your inbox):
>
>   http://marc.info/?l=git&m=131246773005168&w=2
>   Message-ID: <d62225a3cc5740cda7cb163a94d55892-mfwitten@gmail.com>


 Thanks for taking a look at it again.
 I tried to run the script with that update, but in the end I got more 
 merge messages than I had originally: 71 additional merges, of which 
 53 have "Merge commit" messages, whereas the saved repo has only 1 
 "Merge commit". Do you see what may be causing that?

 Another thing I noticed is that the auto-merging is still failing:

 fatal: Commit b0596fce207735081b8aa9afdd9686b7d412f5d8 is a merge but no -m option was given.
 HEAD is now at ac5eaa2 *Continue Last Commit*
 Auto-merging scienceportal/css/myprofile.css
 CONFLICT (content): Merge conflict in scienceportal/css/myprofile.css
 Auto-merging scienceportal/css/qc.css
 Automatic merge failed; fix conflicts and then commit the result.
 scienceportal/css/myprofile.css: needs merge
 rm 'scienceportal/css/des.css'
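
 The "is a merge but no -m option was given" failure can be reproduced 
 in isolation. The sketch below assumes a throwaway repository with 
 made-up file names and demonstration branches called `side' and 
 `replay'. It shows that `git cherry-pick' refuses a merge commit 
 unless `-m' says which parent to diff against. The `-m 1' call is 
 there only to illustrate the option; whether adding it to the script 
 would give the desired history is not established in this thread.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > base.txt
git add base.txt
git commit -q -m "root"
root=$(git rev-parse HEAD)
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b side
echo "side" > side.txt
git add side.txt
git commit -q -m "add side.txt"

git checkout -q "$main_branch"
echo "main" > main.txt
git add main.txt
git commit -q -m "add main.txt"

# A real merge commit with two parents.
git merge -q --no-ff -m "merge side" side
merge_commit=$(git rev-parse HEAD)

# Start a fresh line of history at the root to pick onto.
git checkout -q -b replay "$root"

# Without -m, git cannot tell which parent the change is relative to.
if git cherry-pick "$merge_commit" >/dev/null 2>&1; then
  without_m="succeeded"
else
  without_m="failed"
fi

# With -m 1, the change is taken relative to the first parent, i.e.
# what the merge brought in from the side branch.
git cherry-pick -m 1 "$merge_commit" >/dev/null

echo "cherry-pick without -m: $without_m"
ls
```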

 Looking at this thread: 
 http://www.mail-archive.com/git-users@googlegroups.com/msg01046.html
 I thought that the attempt to remove the files was the first step to 
 hit conflicts like the one shown above. So I tried iterating through 
 the files and, whenever the removal of a file failed, running the 
 steps suggested in that thread:
 git checkout --theirs $file
 git add $file
 git commit -m 'Fixing conflict during rebase'

 But that didn't work either.

 I would greatly appreciate it if you are still willing to help.

 Thanks,
 Patricia


* Re: Cleaning up history with git rebase
  2011-08-05 21:26               ` pbegeland
@ 2011-08-06 23:59                 ` Michael Witten
  2011-08-09 12:29                   ` pbegeland
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Witten @ 2011-08-06 23:59 UTC (permalink / raw)
  To: Patricia Egeland; +Cc: Ricky Egeland, git

On Fri, 05 Aug 2011 18:26:27 -0300, Patricia Egeland wrote:

>> On Thu, 04 Aug 2011 14:35:19 -0000, Michael Witten wrote:
>>
>> [Put your reply text below the quoted text.]
>>
>> On Wed, 03 Aug 2011 17:58:26 -0300, Patricia Egeland wrote:
>>
>>> On Mon, 01 Aug 2011 01:07:33 -0000, Michael Witten wrote:
>>>
>>>> Michael Witten wrote:
>>>>
>>>>> On Sun, 31 Jul 2011 18:44:43 -0300, Ricky, Egeland wrote:
>>>>>
>>>>>> On Jul 31, 2011, at 6:33 PM, Michael Witten wrote:
>>>>>>
>>>>>>> On Sun, Jul 31, 2011 at 20:21, Michael Witten <mfwitten@gmail.com> wrote:
>>>>>>>> Why are there conflicts anyway?
>>>>>>>
>>>>>>> Oh...
>>>>>>>
>>>>>>> I guess there were conflicts when the merge commit was made in
>>>>>>> the original repository, and these conflicts were resolved by
>>>>>>> the merge commit itself. Hence, when rebase tries to split up
>>>>>>> a merge by dealing with just the non-merge parents, you end up
>>>>>>> having to deal with the conflict again.
>>>>>>
>>>>>> Yes, I thought it was something like this going on, too. In the
>>>>>> pre-rebase history, when there is a commit with "Conflict:" and
>>>>>> listing file which is in the sub-repository history, this is a
>>>>>> point where rebase stops with a conflict.
>>>>>>
>>>>>>> Shouldn't rebase take this into account?
>>>>>>
>>>>>> Not sure.  Seems that it does not, it makes me resolve the
>>>>>> conflict again.
>>>>>
>>>>> I think git rebase should take this into account is what I'm saying.
>>>>>
>>>>> The following implements what I think `git rebase' should be doing;
>>>>> run it instead of `git rebase' in your repo:
>>>>>
>>>>>   git branch saved
>>>>>   git rev-list HEAD --reverse --first-parent --parents |
>>>>>   {
>>>>>     read root
>>>>>     git reset --hard $root
>>>>>     rebase_head=$root
>>>>>
>>>>>     while read commit first_parent other_parents; do
>>>>>
>>>>>       if [ -z "$other_parents" ]; then
>>>>>
>>>>>         git cherry-pick $commit
>>>>>         rebase_head=$commit
>>>>>
>>>>>       else
>>>>>
>>>>>         for parent in $other_parents; do
>>>>>
>>>>>           if ! git cherry-pick $parent; then
>>>>>
>>>>>             git reset --hard $rebase_head
>>>>>             git merge $other_parents
>>>>>             git rm -rf .
>>>>>             git checkout -- $commit
>>>>>             git commit -aC $commit
>>>>>             break
>>>>>
>>>>>           fi
>>>>>
>>>>>         done
>>>>>
>>>>>         rebase_head=$(git rev-parse HEAD)
>>>>>
>>>>>       fi
>>>>>
>>>>>     done
>>>>>   }
>>>>
>>>> Woops!
>>>>
>>>> This line:
>>>>
>>>>   git checkout -- $commit
>>>>
>>>> should be:
>>>>
>>>>   git checkout $commit -- .
>>>
>>>      I tried to run the script in my repo. However, it seems like the
>>> 'git merge $other_parents' step fails. In the script output I see some
>>> lines saying that files could not be merged, e.g.:
>>>
>>> warning: Cannot merge binary files:
>>> scienceportal/images/tabs/tabs-gray.png (HEAD vs.
>>> 84f6fc283861aa7c5798f58769789dd0b91a5e9d)
>>> warning: Cannot merge binary files: scienceportal/images/waiting.gif
>>> (HEAD vs. e033cbbf1e9d24b66cb55a04701c059dc945c1c3)
>>>
>>>      Do you have some suggestion?
>>
>> That's probably as expected; the script is coming across the conflict,
>> but it should be taking care of the conflict automatically.
>>
>> Unfortunately, though, the results probably end up being almost completely
>> similar to the original un-rebased branch because the original script
>> actually has ANOTHER mistake (sorry!). See the updated version here
>> (or in your inbox):
>>
>>   http://marc.info/?l=git&m=131246773005168&w=2
>>   Message-ID: <d62225a3cc5740cda7cb163a94d55892-mfwitten@gmail.com>
>
>
> Thanks for taking a look at it again.
> I tried to run the script with that update, but in the end I got more
> merge messages than I had originally: 71 additional merges, of which
> 53 are "Merge commit" messages, whereas the saved repo has only 1
> "Merge commit". Do you see what may be causing that?
>
> Another thing I noticed is that the auto-merging is still failing:
>
> fatal: Commit b0596fce207735081b8aa9afdd9686b7d412f5d8 is a merge but 
> no -m option was given.
> HEAD is now at ac5eaa2 *Continue Last Commit*
> Auto-merging scienceportal/css/myprofile.css
> CONFLICT (content): Merge conflict in scienceportal/css/myprofile.css
> Auto-merging scienceportal/css/qc.css
> Automatic merge failed; fix conflicts and then commit the result.
> scienceportal/css/myprofile.css: needs merge
> rm 'scienceportal/css/des.css'
>
> Looking at this thread: 
> http://www.mail-archive.com/git-users@googlegroups.com/msg01046.html
> I thought that the attempt to remove the files was the first step to
> run into conflicts like the one shown above. So I tried to iterate
> through the files and, whenever the removal of a file failed, I ran
> the steps suggested in that thread:
> git checkout --theirs $file
> git add $file
> git commit -m 'Fixing conflict during rebase'
>
> But that didn't work either.
>
> I'd greatly appreciate it if you are still willing to help.

Those error messages are expected, but my original script is
incredibly naive (to the point of being incorrect).

Fortunately, I've thought a bit more about it, and I have a much
better solution in the works, so please hold on just a bit longer
while I work out the kinks.

* Re: Cleaning up history with git rebase
  2011-08-06 23:59                 ` Michael Witten
@ 2011-08-09 12:29                   ` pbegeland
  0 siblings, 0 replies; 12+ messages in thread
From: pbegeland @ 2011-08-09 12:29 UTC (permalink / raw)
  To: Michael Witten; +Cc: Ricky Egeland, git

 On Sun, 07 Aug 2011 19:13:01 -0000, Michael Witten <mfwitten@gmail.com> wrote:
> On Sat, 06 Aug 2011 23:59:50 -0000, Michael Witten wrote:
>
>>> On Fri, 05 Aug 2011 18:26:27 -0300, Patricia Egeland wrote:
>>>
>>> I'd greatly appreciate it if you are still willing to help.
>>
>> ... my original script is incredibly naive (to the point of
>> being incorrect).
>
> In particular, my use of `cherry-pick' doesn't make any sense
> for all but the most contrived case.
>
>> Fortunately, I've thought a bit more about it, and I have a much
>> better solution in the works, so please hold on just a bit longer
>> while I work out the kinks.
>
> Using `rebase' is not the right solution because it doesn't handle merges
> meaningfully and it also throws out information, especially when it does
> successfully flatten merges (something that my first script attempted
> to emulate, too).
>
> As I began figuring out the necessary bookkeeping required for achieving
> a cleaner history for your case, it occurred to me that `filter-branch'
> does indeed handle a lot of the details already.
>
> What follows is the new script, which I've tested fairly well this time,
> and it shouldn't give you any confusing output either:
>
>   #!/bin/sh
>
>   if b=$(git symbolic-ref -q HEAD); then
>     git branch "${b#refs/heads/}-saved" || exit 1;
>   fi
>
>   workdir= git filter-branch -f --commit-filter '
>
>     if [ $# -eq 1 ]; then
>
>       if [ $(git ls-tree --name-only $1 | wc -l) -eq 0 ]; then
>         echo # Possible abuse of internals
>       else
>         git commit-tree $@
>       fi
>
>     elif [ $# -eq 3 ]; then
>
>       git_commit_non_empty_tree $@
>
>     else
>
>       tree=$1
>       shift
>
>       parents=
>       while [ $# -gt 0 ]; do
>         shift
>         parents="$parents $1"
>         shift
>       done
>
>       parents_independent=
>       for parent in $(git merge-base --independent $parents); do
>         parents_independent="$parents_independent -p $parent"
>       done
>
>       git_commit_non_empty_tree $tree $parents_independent
>
>     fi
>
>   ' HEAD
>
> That `workdir=' bit at the top simply exports the `workdir' variable for
> the duration of the `filter-branch' command, and it is a workaround for
> the following bug:
>
>   Message-ID: <f06dd070abcc485e98c054ec3ee298f9-mfwitten@gmail.com>
>   http://marc.info/?l=git&m=131268922806303&w=2
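
The `workdir=' prefix relies on ordinary shell behaviour: an assignment
written in front of a command is placed in that command's environment
only, and does not persist in the calling shell. A minimal sketch of
that behaviour (the variable name `demo_var' is purely illustrative and
has nothing to do with `filter-branch'):

```shell
#!/bin/sh
unset demo_var

# Prefix assignment: the child process sees demo_var, set to the empty string.
during=$(demo_var= sh -c 'echo "${demo_var+set}"')

# Afterwards, the current shell still has no such variable.
after=${demo_var+set}

echo "during: ${during:-unset}"
echo "after: ${after:-unset}"
```

The `${name+set}' expansion yields `set' whenever the variable exists,
even when its value is empty, which is the situation the prefix creates.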
>
> In order to use this script, save it to a file (e.g. /tmp/s) and then:
>
>   git checkout branch-to-clean
>   sh /tmp/s
>
> That should save the messy history as a branch named
> `branch-to-clean-saved' and also leave the clean history behind in
> `branch-to-clean'.
>
> Now, you've been doing a lot of work with your subrepos, so please make
> sure that you're running the script on the right branch.
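
In the script above, the commit filter receives its arguments in the
form `<tree> [-p <parent>]...', and the merge case strips the `-p'
flags to collect the bare parent IDs. A minimal sketch of that parsing
step on its own, assuming placeholder strings instead of real object
IDs (the function name `list_parents' is illustrative):

```shell
#!/bin/sh
# Collect the parent IDs from arguments of the form
# "<tree> -p <parent> [-p <parent>]...".
list_parents() {
  shift                       # drop the tree ID
  parents=
  while [ $# -gt 0 ]; do
    shift                     # drop the "-p" flag
    parents="$parents $1"     # keep the parent ID
    shift
  done
  echo "${parents# }"         # strip the leading space
}

list_parents TREE -p P1 -p P2 -p P3
```

This is also why the script can distinguish its three cases by argument
count: 1 argument is a root commit, 3 is a single-parent commit, and
anything more is a merge.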


 It worked perfectly, Michael. Like a charm.
 I believe your script can become a great contribution to the git 
 community.

 My merge commit messages reduced from 470 to 155. The history looks 
 very clean and organized.
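
 One way to obtain counts like these is `git rev-list'. A minimal
 sketch, assuming the branch names produced by the script
 (`branch-to-clean' and `branch-to-clean-saved'); the function name is
 illustrative, and this is not necessarily how the figures above were
 measured:

```shell
#!/bin/sh
# Print the number of merge commits reachable from the given ref.
count_merges() {
  git rev-list --merges --count "$1"
}

# Example (hypothetical branch names, run inside the repository):
#   count_merges branch-to-clean-saved   # before the cleanup
#   count_merges branch-to-clean         # after the cleanup
```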

 Thanks a lot for all the time you spent helping us.

 Best,
 Patricia



>
> Sincerely,
> Michael Witten
>
> I built a test repository as follows (which requires a little manual
> intervention along the way, in that you must save commit messages by
> hand after the automatic conflict resolutions):
>
>   #!/bin/sh
>
>   git init test
>   cd test
>
>   printf A > foo
>   printf A > bar
>   git add .
>   git commit -m A
>   git branch A
>
>   git checkout -b B_foo A
>   printf B > foo
>   git commit -am B_foo
>
>   git checkout -b B_bar A
>   printf B > bar
>   git commit -am B_bar
>
>   git checkout -b B_baz A
>   printf B > baz
>   git add baz
>   git commit -m B_baz
>
>   git checkout -b C A
>   git merge --no-ff B_foo B_bar
>
>   git checkout -b D C
>   printf D > foo
>   printf D > bar
>   git commit -am D
>
>   git checkout -b E_foo_0 D
>   printf 0 > foo
>   git commit -am E_foo_0
>
>   git checkout -b E_foo_1 D
>   printf 1 > foo
>   git commit -am E_foo_1
>
>   git checkout -b F D
>   git merge --no-ff E_foo_0 E_foo_1  # Conflict.
>   printf F > foo                     # Resolve.
>   git commit -a                      # Commit (using the supplied message).
>
>   git checkout -b G_foo_0 F
>   printf 0 > foo
>   git commit -am G_foo_0
>
>   git checkout -b G_foo_1 F
>   printf 1 > foo
>   git commit -am G_foo_1
>
>   git checkout -b G_quux G_foo_1
>   printf G > quux
>   git add quux
>   git commit -m G_quux
>
>   git checkout -b G_foo_2 G_quux
>   printf 2 > foo
>   git commit -am G_foo_2
>
>   git checkout -b H G_foo_0
>   git merge G_foo_2                  # Conflict.
>   printf G > foo                     # Resolve.
>   git commit -a                      # Commit (using the supplied message).
>
>   git checkout -b I H
>   git merge --no-ff B_baz
>
>   git checkout -b J I
>   printf J > foo
>   printf J > bar
>   printf J > baz
>   git commit -am J
>
>   git checkout -b foo-only J
>   git filter-branch -f --index-filter '
>     git rm --cached --ignore-unmatch bar baz quux
>   ' HEAD
>
>   git checkout -b bar-only J
>   git filter-branch -f --index-filter '
>     git rm --cached --ignore-unmatch foo baz quux
>   ' HEAD
>
>   git checkout -b baz-only J
>   git filter-branch -f --index-filter '
>     git rm --cached --ignore-unmatch foo bar quux
>   ' HEAD
>
>   git checkout -b quux-only J
>   git filter-branch -f --index-filter '
>     git rm --cached --ignore-unmatch foo bar baz
>   ' HEAD
>
>   git checkout -b baz-quux-only J
>   git filter-branch -f --index-filter '
>     git rm --cached --ignore-unmatch foo bar
>   ' HEAD
>
> The branches of importance here are:
>
>   foo-only
>   bar-only
>   baz-only
>   quux-only
>   baz-quux-only
>
> Each is a slice of the 'big' branch `J', and each is reminiscent of one
> of your smaller subrepos.
>
> You'll notice that I didn't bother using `--prune-empty' when I created
> the slices, as the new script can handle empty commits better anyway
> (it accounts for merge commits, too). Thus, the result is that the
> history of each slice still has the same topology as 'big' branch `J',
> but many of the commits in that topology don't actually contribute any
> content.
>
> So, at this point, the graphs of the branches in question all look
> the same in terms of the number of commits and the connections
> between them (the only difference is between commit IDs, etc.). For
> instance, the unclean `baz-quux-only' history looks like this:
>
>   $ git log baz-quux-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * J
>   | commit: 26f3c0a6d026c8957439a52bf2d3e6955f627c82
>   | tree: ffe9a251c0df24b955f2e5ebf3d10592052c985a
>   |
>   |
>   *   Merge branch 'B_baz' into I
>   |\  commit: 9242bb70580a3bf11f20c8a2ab3fd277885739ad
>   | | tree: 155fd83a71d531e1a0413dc4a475af4e72d471c2
>   | |
>   | |
>   | * B_baz
>   | | commit: b95287cda5d2ed5e8af100cd059ac9aef25d4369
>   | | tree: e60c1f8cd2a0eccc2e4527aafa58865c37dc3ba9
>   | |
>   | |
>   * |   Merge branch 'G_foo_2' into H
>   |\ \  commit: 9093055281c6f971b9fbdede06e03bd4824e42ef
>   | | | tree: ff8ecc87699e099733f3145c4bb97466b0a9aec9
>   | | | Conflicts:
>   | | | 	foo
>   | | |
>   | | |
>   | * | G_foo_2
>   | | | commit: e9f3fd76023f0b3f32404b4123c541aa2acfef77
>   | | | tree: ff8ecc87699e099733f3145c4bb97466b0a9aec9
>   | | |
>   | | |
>   | * | G_quux
>   | | | commit: 6eeb3f33f356558fd603afe9dc52331cad279805
>   | | | tree: ff8ecc87699e099733f3145c4bb97466b0a9aec9
>   | | |
>   | | |
>   | * | G_foo_1
>   | | | commit: d33ca0afaa1a844fc84a2e7cba4bdd581a5794d4
>   | | | tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | | |
>   | | |
>   * | | G_foo_0
>   |/ /  commit: f2d083d0c92f1fca146893dd094daa5b2dfb73c0
>   | |   tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | |
>   | |
>   | |
>   |  \
>   *-. \   Merge branches 'E_foo_0' and 'E_foo_1' into F
>   |\ \ \  commit: f6b0822ff36b35e6239fffddaa82750d3d9e2832
>   | | | | tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | | | | Conflicts:
>   | | | | 	foo
>   | | | |
>   | | | |
>   | | * | E_foo_1
>   | |/ /  commit: f1f4dddcd5de8bdf23f700b3267b20ce9f88c58c
>   |/| |   tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | | |
>   | | |
>   | * | E_foo_0
>   |/ /  commit: dddb96a1b0b3659db3ed0245dc5c08d2d74855aa
>   | |   tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | |
>   | |
>   * | D
>   | | commit: 9d643ba07d58211211351c9e9266c5a9171965e8
>   | | tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | |
>   | |
>   | |
>   |  \
>   *-. \   Merge branches 'B_foo' and 'B_bar' into C
>   |\ \ \  commit: ed26df8f1192de9eae864f220dda35bbd53b8516
>   | |_|/  tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   |/| |
>   | | |
>   | | * B_bar
>   | |/  commit: 3202bd87d37ae084bee86158391489fc7c55002a
>   |/|   tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   | |
>   | |
>   | * B_foo
>   |/  commit: 6088f291122686d6f06ab33082e77937d486b354
>   |   tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>   |
>   |
>   * A
>     commit: 104393c27f4ee7a1cbbdae17d21a2f478e8f723f
>     tree: 4b825dc642cb6eb9a060e54bf8d69288fbee4904
>
> As you can see, there are a lot of commits there that are completely
> irrelevant.
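
The tree ID `4b825dc642cb6eb9a060e54bf8d69288fbee4904' that recurs in
the graph above is Git's empty tree, which is why those commits
contribute nothing to this slice. A minimal sketch for spotting such
commits, assuming a SHA-1 repository (the function name is
illustrative):

```shell
#!/bin/sh
# ID of the empty tree; with SHA-1 this is
# 4b825dc642cb6eb9a060e54bf8d69288fbee4904.
empty_tree=$(git hash-object -t tree /dev/null)

# List the commits reachable from a ref whose tree is the empty tree.
list_empty_tree_commits() {
  git log --format='%H %T' "$1" | while read -r commit tree; do
    [ "$tree" = "$empty_tree" ] && echo "$commit"
  done
  return 0
}

# Example, run inside the test repository:
#   list_empty_tree_commits baz-quux-only-saved
```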
>
> Assuming that the new script is stored as /tmp/s, then this history
> can be cleaned up as follows:
>
>   git checkout baz-quux-only
>   sh /tmp/s
>
> This should save the messy history as branch `baz-quux-only-saved' and
> also leave the following clean history behind in branch `baz-quux-only':
>
>   $ git log baz-quux-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * J
>   | commit: f460d134835ece09bab7b3584934930bc5abfb40
>   | tree: ffe9a251c0df24b955f2e5ebf3d10592052c985a
>   |
>   |
>   *   Merge branch 'B_baz' into I
>   |\  commit: 38da9ce07e6859b6cabadcaca1b2eed78c3f8cd5
>   | | tree: 155fd83a71d531e1a0413dc4a475af4e72d471c2
>   | |
>   | |
>   | * B_baz
>   |   commit: bd25181fd0c806a4495400cd13e21007e36c600b
>   |   tree: e60c1f8cd2a0eccc2e4527aafa58865c37dc3ba9
>   |
>   |
>   * G_quux
>     commit: 7c7080def8e2e000e9e9b1e9efc863155b8188e0
>     tree: ff8ecc87699e099733f3145c4bb97466b0a9aec9
>
> Notice how even the root commits have been improved; there is now:
>
>   * a root that creates the `quux' file.
>   * a root that creates the `baz'  file.
>
> and these 2 roots are merged together. There are no empty commits.
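
Two properties of the cleaned history can be checked mechanically: the
tip should still have the same tree as the saved branch (in the
listings above, `J' keeps tree `ffe9a251c0df24b955f2e5ebf3d10592052c985a'
before and after the cleanup), and the number of root commits can be
counted. A minimal sketch, with illustrative function names:

```shell
#!/bin/sh
# Succeed if both refs point at commits with identical trees.
same_tree() {
  [ "$(git rev-parse "$1^{tree}")" = "$(git rev-parse "$2^{tree}")" ]
}

# Print the number of root (parentless) commits reachable from a ref.
count_roots() {
  git rev-list --max-parents=0 "$1" | wc -l | tr -d ' '
}

# Example, run inside the test repository:
#   same_tree baz-quux-only baz-quux-only-saved && echo "content unchanged"
#   count_roots baz-quux-only
```

Comparing trees rather than commit IDs matters here, because
`filter-branch' rewrites every commit ID even when the content at the
tip is untouched.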
>
> Similarly, the unclean history for `foo-only' may be cleaned up as follows:
>
>   git checkout foo-only
>   sh /tmp/s
>
> producing the following history:
>
>   $ git log foo-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * J
>   | commit: 6c88a36c2bba348baea557687601a5337cf978c4
>   | tree: 9e990d3e85287e21abe7b0b06f062fb9b16e769c
>   |
>   |
>   *   Merge branch 'G_foo_2' into H
>   |\  commit: cdb5e89aad7b399bb8c7de5702350205158bdede
>   | | tree: 0f149c5cc454a9d23e00d40819c25f49c90f682a
>   | | Conflicts:
>   | | 	foo
>   | |
>   | |
>   | * G_foo_2
>   | | commit: 861c76fb89846c4100c241e554123befdd074fa0
>   | | tree: cfa452475c8f564b762ce25143cf905568c172a9
>   | |
>   | |
>   | * G_foo_1
>   | | commit: b8a3e5d6230c2312b6675e2b36deaa9310c5bd18
>   | | tree: 0c6433b15dcd449d4d6acf053a18301d96b43fc0
>   | |
>   | |
>   * | G_foo_0
>   |/  commit: 8c0c515f0f9ec5994d4f8e5f244c3e869abb0384
>   |   tree: a39b4d0fdb82ee3da3e01875236754ef983f431b
>   |
>   |
>   *   Merge branches 'E_foo_0' and 'E_foo_1' into F
>   |\  commit: be03f5a201d9fea098f733b9bb1d58d30e89ac9c
>   | | tree: 8d5bb46eff3e15e67f77e2fc9ddb6be755df2b5b
>   | | Conflicts:
>   | | 	foo
>   | |
>   | |
>   | * E_foo_1
>   | | commit: ba04ecb0bac23315b5ef1e68d571487766ab017e
>   | | tree: 0c6433b15dcd449d4d6acf053a18301d96b43fc0
>   | |
>   | |
>   * | E_foo_0
>   |/  commit: 676e381590a316d68b65c2c57d82f06d189e42c7
>   |   tree: a39b4d0fdb82ee3da3e01875236754ef983f431b
>   |
>   |
>   * D
>   | commit: 6ade1fbc6b41577c03bf38bfdb08e5fc8f8581e2
>   | tree: 6a34a4b5b563e07707e3da3e9c39de796b4bf790
>   |
>   |
>   * B_foo
>   | commit: 8dd97d2a76ea2a07741b86064fd44e8da1170a7e
>   | tree: 9712ce0451874e06403b4b60fe662ef2d111d8e5
>   |
>   |
>   * A
>     commit: 7d063cc6bd302f0c5d83bd7de3415397cf45cec8
>     tree: 150d77852b97356b90e6a1cae8b62e85613637a1
>
> As you can see, the first merge commit has been completely flattened,
> the branching histories have been simplified by fast-forwarding where
> possible, and non-contributing commits have been pruned.
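
The fast-forwarding comes from `git merge-base --independent', which
keeps only those commits that cannot be reached from any of the others.
A minimal sketch on a throwaway repository (the temporary path and the
identity values are illustrative):

```shell
#!/bin/sh
# Build a two-commit linear history in a temporary repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m A
a=$(git rev-parse HEAD)
git commit -q --allow-empty -m B
b=$(git rev-parse HEAD)

# A is an ancestor of B, so only B survives as an independent parent.
independent=$(git merge-base --independent "$a" "$b")
echo "$independent"
```

When a rewritten merge ends up with a single independent parent, it is
no longer a true merge, and `git_commit_non_empty_tree' can then drop it
if its tree matches that parent's.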
>
> The rest of the cleaned histories follow:
>
>   $ git log bar-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * J
>   | commit: bc915e51a3619584adbe048a5002a9be26503d1b
>   | tree: e175a92b6e4240caf24b6b8517ab184f5fbc2deb
>   |
>   |
>   * D
>   | commit: 7a5d0cf2761f430e8d2879b02476b1c2882f28d3
>   | tree: a8710e5cbcfd7b411113eb0c8417a82c6214afa8
>   |
>   |
>   * B_bar
>   | commit: 18c5f8e0b0934577d7aa04cb8b3692b295503fe1
>   | tree: b09152c27cc708366ee1d2e920adfc2faf021143
>   |
>   |
>   * A
>     commit: 6ed22455ae04e9e6fcd9a5001dd33703781d5d07
>     tree: 3786d2ba9edc25dddab82eb87e53ebbee57a8dab
>
>
>   $ git log baz-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * J
>   | commit: 1a86f4d0424367e7e75ad45f60127ff4ad0e23dd
>   | tree: 966dd81a7ee77d0b6da290769f22060fed665984
>   |
>   |
>   * B_baz
>     commit: bd25181fd0c806a4495400cd13e21007e36c600b
>     tree: e60c1f8cd2a0eccc2e4527aafa58865c37dc3ba9
>
>
>   $ git log quux-only --graph --format='%s%ncommit: %H%ntree: %T%n%b%n'
>   * G_quux
>     commit: 7c7080def8e2e000e9e9b1e9efc863155b8188e0
>     tree: ff8ecc87699e099733f3145c4bb97466b0a9aec9
>
> As you can see, there are some amazing simplifications here.

Thread overview: 12+ messages
2011-07-31 17:20 Cleaning up history with git rebase Ricky Egeland
2011-07-31 20:21 ` Michael Witten
2011-07-31 21:33   ` Michael Witten
2011-07-31 21:44     ` Ricky Egeland
2011-08-01  1:02       ` Michael Witten
2011-08-01  1:07         ` Michael Witten
2011-08-03 20:58           ` pbegeland
2011-08-04 14:35             ` Michael Witten
2011-08-05 21:26               ` pbegeland
2011-08-06 23:59                 ` Michael Witten
2011-08-09 12:29                   ` pbegeland
2011-08-04 14:18           ` Michael Witten