* [RFC] git submodule split
From: Michal Sojka @ 2014-04-02 21:52 UTC
  To: git

Hello,

I needed to convert a subdirectory of a repo to a submodule and have the
histories of both repos linked together. I found that this was discussed
a few years back [1], but the code seemed quite complicated and was not
merged.

[1]: http://git.661346.n2.nabble.com/RFC-What-s-the-best-UI-for-git-submodule-split-tp2318127.html

Now, the situation is better, because git subtree can already do most of
the work. Below is a script that I used to split a submodule from my
repo. It basically consists of a call to 'git subtree split' followed by
'git filter-branch' to link the histories together.

I'd like to get some initial feedback on it before attempting to
integrate it with git sources (i.e. writing tests and doc). What do you
think?

Thanks,
-Michal


#!/bin/sh

set -e

. "$(git --exec-path)/git-sh-setup"

url=$1
dir=$2

test -d "$dir" || die "$dir is not a directory"

# Create subtree corresponding to the directory
subtree=$(git subtree split --prefix="$dir")

subtree_tag=tmp/submodule-split-$$
git tag $subtree_tag $subtree
superproject=$PWD
export subtree subtree_tag superproject

# Replace the directory with submodule reference in the whole history
git filter-branch -f --index-filter "
    set -e
    # Check whether the $dir exists in this commit
    if git ls-files --error-unmatch '$dir' > /dev/null 2>&1; then

        # Find subtree commit corresponding to the commit in the
        # superproject (this could be made faster by not running git log
        # for every commit)
        subcommit=\$(git log --format='%T %H' $subtree |
            grep ^\$(git ls-tree \$GIT_COMMIT -- '$dir'|awk '{print \$3}') |
            awk '{print \$2}')

        # filter-branch runs the filter in an empty work-tree - create the
        # future submodule in it so that the 'git submodule add' below
        # does not try to clone it.
        if ! test -d '$dir'; then
            mkdir -p '$dir'
            ( cd '$dir' && clear_local_git_env && git init --quiet && git pull $superproject $subtree_tag )
        fi

        # Remove all files under $dir from index so that the 'git
        # submodule add' below does not complain.
        git ls-files '$dir'|git update-index --force-remove --stdin

        # Add the submodule - the goal here is to create/update .gitmodules
        git submodule add $url '$dir'

        # Update the submodule commit hash to the correct value
        echo \"160000 \$subcommit	$dir\"|git update-index --index-info
    fi
"

# Replace the directory in the working tree with the submodule
( cd "$dir" && find . -mindepth 1 -delete && git init && git pull "$superproject" "$subtree_tag" )

# Clean up
git tag --delete $subtree_tag
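
As the comment in the filter notes, running 'git log' for every rewritten
commit is slow. A minimal sketch of one way to avoid that, assuming the
tree-to-commit map is written once to a temporary file before
'git filter-branch' starts. The helper names build_tree_map and
lookup_subcommit are made up for illustration and are not part of the
script above:

```shell
#!/bin/sh
# Sketch only: compute the "<tree> <commit>" map once, then look trees up
# in it. Both helper names are hypothetical.

# build_tree_map REV FILE
# Write one "<tree> <commit>" line per commit reachable from REV.
build_tree_map() {
    git log --format='%T %H' "$1" >"$2"
}

# lookup_subcommit FILE TREE
# Print the commit of the first line in FILE whose tree equals TREE.
# Prints nothing if no line matches.
lookup_subcommit() {
    awk -v tree="$2" '$1 == tree { print $2; exit }' "$1"
}
```

The filter string is evaluated by 'git filter-branch' in its own shell, so
functions from the outer script are not visible there. The awk one-liner
would have to be inlined in the filter (or the function defined inside it),
with the path of the map file exported like the other variables.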


* Re: [RFC] git submodule split
From: Jens Lehmann @ 2014-04-06 16:08 UTC
  To: Michal Sojka, git

On 02.04.2014 23:52, Michal Sojka wrote:
> I'd like to get some initial feedback on it before attempting to
> integrate it with git sources (i.e. writing tests and doc). What do you
> think?

Why do you want to rewrite the whole history of the superproject,
wouldn't it suffice to turn a directory into a submodule with
the same content in a simple commit? Don't get me wrong, I'm
not against adding such a functionality to contrib, I'm just
trying to understand the motivation for your script.
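
For comparison, the single-commit conversion asked about here could look
roughly like the sketch below. It assumes the content of the directory has
already been pushed to the URL (for instance from the result of
'git subtree split'). The names simple_split, run and DRY_RUN are invented
for this illustration:

```shell
#!/bin/sh
# Sketch only: turn DIR into a submodule in one new commit, leaving the
# older history untouched. With DRY_RUN set, commands are printed instead
# of being run. All names here are hypothetical.

run() {
    if test -n "$DRY_RUN"; then
        echo "$*"
    else
        "$@"
    fi
}

# simple_split URL DIR
# Assumption: URL already holds the content of DIR.
simple_split() {
    url=$1
    dir=$2
    # Drop the directory from the index and the work tree.
    run git rm -r --quiet -- "$dir"
    # Clone URL into DIR and record it in .gitmodules.
    run git submodule add "$url" "$dir"
    # Record the conversion as a single ordinary commit.
    run git commit --quiet -m "Convert $dir into a submodule"
}
```

Older commits still contain the directory's files with this approach, which
is the difference from the history rewrite in the original script.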


* Re: [RFC] git submodule split
From: Michal Sojka @ 2014-04-06 21:18 UTC
  To: Jens Lehmann, git

On Sun, Apr 06 2014, Jens Lehmann wrote:
> Why do you want to rewrite the whole history of the superproject,
> wouldn't it suffice to turn a directory into a submodule with
> the same content in a simple commit? 

I wanted to publish a project including its history, but a part of that
project could not be made public for legal reasons. Putting that part
into a submodule seemed like the best idea.

-Michal


* Re: [RFC] git submodule split
From: Jens Lehmann @ 2014-04-07 19:04 UTC
  To: Michal Sojka, git

On 06.04.2014 23:18, Michal Sojka wrote:
> On Sun, Apr 06 2014, Jens Lehmann wrote:
>> Why do you want to rewrite the whole history of the superproject,
>> wouldn't it suffice to turn a directory into a submodule with
>> the same content in a simple commit? 
> 
> I wanted to publish a project including its history, but a part of that
> project could not be made public for legal reasons. Putting that part
> into a submodule seemed like the best idea.

Yep, that makes lots of sense.

I'm not sure yet if this functionality is needed often enough to
put the script under contrib, but I won't object as long as you'd
be willing to maintain it (and help people on this list when they
report any issues).
