* [GSoC 11 submodule] Status update
@ 2011-06-27 19:34 Fredrik Gustafsson
  2011-06-28  0:29 ` Phil Hord
  0 siblings, 1 reply; 6+ messages in thread
From: Fredrik Gustafsson @ 2011-06-27 19:34 UTC (permalink / raw)
  To: git; +Cc: iveqy, hvoigt, jens.lehmann

Hi,
It is time for a status update on the git submodule improvements GSoC 11
project. It is divided into three sections: technical progress, personal
reflection, and how to follow my work.

You can read my previous status update here [1].

Technical progress
------------------
Three patch series have been sent, with a total of five patches (six if you
count patches that I did not write myself).

First, a patch series to make git submodule update continue updating the
other submodules when one submodule fails to update [2]. This series is now
in Junio's pu branch.

Second, a minor patch to reduce memory consumption. The improvement was
pretty small, but I took it as an exercise in writing and sending patches
(in the first patch series I made too many non-code-related errors for my
taste). This patch was, however, rejected [3].

My third patch series makes push submodule-aware, to keep the user from
forgetting to push submodule changes. It is currently on the list as an
RFC [4]. This was the most challenging patch to write and a good start for
my next task.

My fourth task (and the main task of this summer) will start on June 27 and
will be to move a submodule's .git-dir into the super-project's .git-dir.
The design is already done and approved by my mentors.

Personal reflection
-------------------
Before starting this project I was a frequent git user. I use git every
day. Apparently, you can be a frequent git user without using the power of
git. I've learned a lot about git as a tool over the past weeks. I've
learned git rebase very well, a tool I had never used before. It's really
useful and dangerous. Other features I have started to use (but never had
any need for before) are cherry-pick and branches.

I can clearly see that git can be a huge problem in a workplace that mixes
git power users with unmotivated anti-SCM users.

Although I was familiar with valgrind and gdb before, I never had any
real use for these tools in my development. They are actually really good,
and I believe I have a lot to learn in this area.

My start has been very slow. It was harder than I thought to write a proper
patch, and I spent a lot of time on formalities and test writing, parts
that I previously thought were small or non-existent.

For the first time I've truly used a test-driven development cycle. My
school experience of this was very bad: it was hard, slow and not needed.
However, I have now learned that you actually can benefit from a
test-driven development cycle, that it can save time, and that tests
actually can be fairly easy to write.

The mentor/student communication has also been something completely new.
How much help can you ask for? That's a very hard question. Fortunately,
my mentors help me with this as well, and so far this has been really
good support. It's far, far better than the support I get at school.

The code review cycle is pretty amazing. All code that I submit is reviewed
by at least three people, probably even more. Even so, a few bugs have
slipped past to the last guard (Junio), which proves the real value of
code review. Git has by far the most serious code review process I have
ever worked with. This is amazing and gives a good platform for getting
good code written. It is also something that we do not practice in school.

How to follow my work
---------------------
You can follow my work on GitHub [5]. As of now there are five interesting
branches:

* gsoc11_submodule_enhancements
  Contains all patches sent to the list (this does not include RFC
  patches). This will always be clean and "stable".
* git-submodule-update
  Contains all commits for the git-submodule update patch series. This is
  to be considered "stable".
* buggfix
  Contains the minor patch to reduce memory consumption. This wasn't
  accepted by Junio. This is to be considered "stable".
* push_limits
  Contains all commits for the git push limitation patch series. This is
  subject to unstable updates.
* move_gitdir
  Contains all commits for the move git-dir patch-series. This is subject
  to unstable updates.

Links
-----
[1] My previous update
http://article.gmane.org/gmane.comp.version-control.git/173095/match=iveqy

[2] submodule update continue patch.
http://thread.gmane.org/gmane.comp.version-control.git/175500/focus=175725

[3] Use correct value when hinting strbuf_read()
http://article.gmane.org/gmane.comp.version-control.git/175844/match=iveqy

[4] push checks for unpushed remotes in submodules
http://thread.gmane.org/gmane.comp.version-control.git/176328/focus=176327

[5] GitHub page
https://github.com/iveqy/git
-- 
Kind regards
Fredrik Gustafsson

tel: 0733-608274
e-mail: iveqy@iveqy.com


* Re: [GSoC 11 submodule] Status update
  2011-06-27 19:34 [GSoC 11 submodule] Status update Fredrik Gustafsson
@ 2011-06-28  0:29 ` Phil Hord
  2011-06-28 18:43   ` Heiko Voigt
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Hord @ 2011-06-28  0:29 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

Hi Fredrik and git-submodule folks,

On 06/27/2011 03:34 PM, Fredrik Gustafsson wrote:
> My fourth task (and the main task of this summer) will start on June 27 and
> will be to move a submodules .git-dir into the super-projects .git-dir.
> Design of this is already done and approved by my mentors.

This frightens me a bit, so I read the wiki link about it.  Thanks for
explaining where I can find this information.

But I'm still confused.

If I understand right, the submodule/.git dirs will be moved into the
top-level at .git/submodule/.git.  The benefit is supposed to be that
this will free up contention on the non-empty submodule directory when
the super-project switches branches.

In the simple case, git warns "unable to rmdir sub: Directory not
empty".  But I can think of other conflicts as well.

My question is, how does this proposed change help the situation?

Phil


* Re: [GSoC 11 submodule] Status update
  2011-06-28  0:29 ` Phil Hord
@ 2011-06-28 18:43   ` Heiko Voigt
  2011-06-29 21:27     ` Phil Hord
  0 siblings, 1 reply; 6+ messages in thread
From: Heiko Voigt @ 2011-06-28 18:43 UTC (permalink / raw)
  To: Phil Hord; +Cc: Fredrik Gustafsson, git, jens.lehmann

Hi,

On Mon, Jun 27, 2011 at 08:29:18PM -0400, Phil Hord wrote:
> On 06/27/2011 03:34 PM, Fredrik Gustafsson wrote:
> > My fourth task (and the main task of this summer) will start on June 27 and
> > will be to move a submodules .git-dir into the super-projects .git-dir.
> > Design of this is already done and approved by my mentors.
> 
> This frightens me a bit, so I read the wiki link about it.  Thanks for
> explaining where I can find this information.

Which part of this change frightens you?

> But I'm still confused.
> 
> If I understand right, the submodule/.git dirs will be moved into the
> top-level at .git/submodule/.git.  The benefit is supposed to be that
> this will free up contention on the non-empty submodule directory when
> the super-project switches branches.
> 
> In the simple case, git warns "unable to rmdir sub: Directory not
> empty".  But I can think of other conflicts as well.
> 
> My question is, how does this proposed change help the situation?

The proposed change allows us to remove a submodule's directory
completely when it has been deleted or moved. If we did that today, you
would lose all local history of the submodule. I do not know what you
mean by "conflicts", but this change will help submodules move towards
behaving like ordinary directories in git.

Cheers Heiko


* Re: [GSoC 11 submodule] Status update
  2011-06-28 18:43   ` Heiko Voigt
@ 2011-06-29 21:27     ` Phil Hord
  2011-06-29 21:58       ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Hord @ 2011-06-29 21:27 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Fredrik Gustafsson, git, jens.lehmann

On 06/28/2011 02:43 PM, Heiko Voigt wrote:
> Hi,
>
> On Mon, Jun 27, 2011 at 08:29:18PM -0400, Phil Hord wrote:
>> On 06/27/2011 03:34 PM, Fredrik Gustafsson wrote:
>>> My fourth task (and the main task of this summer) will start on June 27 and
>>> will be to move a submodules .git-dir into the super-projects .git-dir.
>>> Design of this is already done and approved by my mentors.
>> This frightens me a bit, so I read the wiki link about it.  Thanks for
>> explaining where I can find this information.
> I do not know what part of this change frightens you?

It frightens me because it seems like a fundamental break from the
current submodule functionality.  Today a submodule exists as a git
repository with no knowledge that it is a submodule or who its
super-repository is. Maybe this is a design mistake in need of
correction.  But this change seems both huge and subtle to me.  I
suspect it will affect many tools which expect the traditional git
layout, submodules or not.  For example, a third-party tool might seek
out the ".git" directory by walking upwards.  Once it finds it, it will
(safely, today) assume that is the .git directory relating to its
files.  After this change, the tool will be broken.

    # Find my .git directory (stop at / so the walk always terminates)
    mygit=${PWD}
    while test "${mygit}" != "/" && ! test -d "${mygit}/.git" ; do
        mygit=$(dirname "$mygit")
    done

    # Now let's do things with "our" repo
    fetched_sha1=$(cat "${mygit}/.git/FETCH_HEAD")
        .
        .
        .
Granted, script-writers should be using the git plumbing as much as
possible to avoid this kind of change.  But not everyone can afford to
be so conscientious.  
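
For comparison, here is a minimal sketch of the same lookup done through
plumbing. It assumes git is on the PATH; fetched_sha1 is a hypothetical
function name used only for illustration. git rev-parse --git-dir reports
the repository location whether .git is a directory or a text file, so the
script no longer depends on the on-disk layout.

```shell
# Print the object name on the first line of FETCH_HEAD for the
# repository that contains directory $1 (default: current directory).
# The body runs in a subshell so the cd does not leak to the caller.
fetched_sha1() (
    cd "${1:-.}" || exit 1
    # Ask git where the repository lives instead of walking upwards.
    gitdir=$(git rev-parse --git-dir) || exit 1
    test -f "$gitdir/FETCH_HEAD" || exit 1
    # Each FETCH_HEAD line starts with an object name followed by
    # whitespace; keep only that first field of the first line.
    sed -n '1s/[[:space:]].*//p' "$gitdir/FETCH_HEAD"
)
```

Calling fetched_sha1 from any subdirectory of a work tree gives the same
answer before and after the .git directory is relocated.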

>> But I'm still confused.
>>
>> If I understand right, the submodule/.git dirs will be moved into the
>> top-level at .git/submodule/.git.  The benefit is supposed to be that
>> this will free up contention on the non-empty submodule directory when
>> the super-project switches branches.
>>
>> In the simple case, git warns "unable to rmdir sub: Directory not
>> empty".  But I can think of other conflicts as well.
>>
>> My question is, how does this proposed change help the situation?
> The proposed change allows us to implement that a submodules directory
> can be completely removed if it was deleted or moved. If we would do
> that currently you would loose all local history of the submodule. I do
> not know what you mean with "conflicts" but this change will help
> submodule towards behaving like they were ordinary directories in git.

I see.  By moving the local history out of the way, the submodule
directory is free to be changed or removed without harming the local
history.  That's clever.  I think Android's 'repo' tool does something
similar.

I think I can answer my other concerns now.  Do these answers sound right?

What happens if the submodule working directory is dirty?  
    Treat it the same as git does for its own working directory.

But what if the submodule working directory is 'clean' after considering
.gitignore?  Do untracked/ignored files also get deleted?
    Treat this the same as git does for its own working directory.

What if a 'git checkout' results in the submodule being removed?
    Remove the entire submodule directory (or just remove tracked files?)

What if a 'git checkout' or 'git merge' results in submodule 'foo' being
added where there is already a file named 'foo'?
    This is a working-directory merge conflict.

Thanks for explaining.  I feel better about it all now.  I remain
concerned about backwards compatibility, but I'm not so worried about
conflict-resolution anymore.

Phil


* Re: [GSoC 11 submodule] Status update
  2011-06-29 21:27     ` Phil Hord
@ 2011-06-29 21:58       ` Junio C Hamano
  2011-06-30 17:54         ` Phil Hord
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2011-06-29 21:58 UTC (permalink / raw)
  To: Phil Hord; +Cc: Heiko Voigt, Fredrik Gustafsson, git, jens.lehmann

Phil Hord <hordp@cisco.com> writes:

> It frightens me because it seems like a fundamental break from the
> current submodule functionality.  Today a submodule exists as a git
> repository with no knowledge that it is a submodule or who its
> super-repository is.

The use of a .git that is a text file recording the location of the real
directory is not limited to submodules.  Placing that "real directory"
somewhere in the .git directory of the superproject is merely a convention.

In other words, it does not change anything fundamental.

> Once it finds it, it will
> (safely, today) assume that is the .git directory relating to its
> files.  After this change, the tool will be broken.

Then it is already broken if it does not pay attention to .git that is not
a directory but is a text file that records where the real directory is.
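
To illustrate what "paying attention" could look like, here is a rough
sketch. It assumes the text file holds a single line of the form
"gitdir: <path>" and that a relative path is taken relative to the
directory holding that file; resolve_gitdir is a hypothetical helper name,
not part of git. Scripts that can run git would normally just ask
git rev-parse --git-dir instead.

```shell
# Hypothetical helper: print the real git directory for the work tree
# rooted at $1, or fail if $1 has no .git entry.
resolve_gitdir() {
    dir=$1
    if test -d "$dir/.git"; then
        # Traditional layout: .git is the repository itself.
        printf '%s\n' "$dir/.git"
    elif test -f "$dir/.git"; then
        # .git is a text file of the assumed form "gitdir: <path>".
        target=$(sed -n 's/^gitdir: //p' "$dir/.git")
        test -n "$target" || return 1
        case $target in
            /*) printf '%s\n' "$target" ;;       # absolute path
            *)  printf '%s\n' "$dir/$target" ;;  # relative to the .git file
        esac
    else
        return 1
    fi
}
```

A tool that only tests for a .git directory takes the first branch and
never reaches the second, which is the breakage described above.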

> I think I can answer my other concerns now.  Do these answers sound right?
>
> - What happens if the submodule working directory is dirty?  
> - But what if the submodule working directory is 'clean' after considering
>   .gitignore?  Do untracked/ignored files also get deleted?
> - What if a 'git checkout' results in the submodule being removed?
> - What if a 'git checkout' or 'git merge' results in submodule 'foo' being
>   added where there is already a file named 'foo'?

These "different" questions are answered exactly the same way, which is:

>   Treat it the same as git does for its own working directory.

When switching to another branch, a directory that does not exist in the
switched-to branch needs to be removed, but we would refrain from "rm -fr"
that directory if it has any leftover cruft in it (untracked and unignored
files). A submodule directory should behave in the same way.
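
As a rough sketch of that rule, and not git's actual implementation, the
check for leftover cruft could be written with git ls-files. The function
name safe_to_remove is hypothetical, and the sketch only looks at untracked
files; it says nothing about modified tracked files.

```shell
# Succeed only when directory $1 holds no untracked, unignored files.
# The body runs in a subshell so the cd does not leak to the caller.
safe_to_remove() (
    cd "$1" || exit 1
    # --others lists untracked files below the current directory;
    # --exclude-standard leaves out anything matched by the usual
    # ignore rules such as .gitignore.
    test -z "$(git ls-files --others --exclude-standard)"
)
```

A directory holding only ignored build products passes the check, while
one with a stray untracked note does not.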


* Re: [GSoC 11 submodule] Status update
  2011-06-29 21:58       ` Junio C Hamano
@ 2011-06-30 17:54         ` Phil Hord
  0 siblings, 0 replies; 6+ messages in thread
From: Phil Hord @ 2011-06-30 17:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Fredrik Gustafsson, git, jens.lehmann

On 06/29/2011 05:58 PM, Junio C Hamano wrote:
> Phil Hord <hordp@cisco.com> writes:
>> It frightens me because it seems like a fundamental break from the
>> current submodule functionality.  Today a submodule exists as a git
>> repository with no knowledge that it is a submodule or who its
>> super-repository is.
> The use of .git that is a text file that records where the real directory
> is not limited to submodules.  Placing that "real directory" somewhere in
> the .git directory of the superproject is merely a convention.
>
> In other words, it does not change anything fundamental.

Thanks for pointing that out.  I was unaware of that feature until I saw
it discussed in another thread this week.  Even then, it was not clear
to me that this is the same feature being employed here.

> When switching to another branch, a directory that does not exist in the
> switched-to branch needs to be removed, but we would refrain from "rm -fr"
> that directory if it has any leftover cruft in it (untracked and unignored
> files). A submodule directory should behave in the same way.

Thanks.  I am suitably enclued and no longer afraid.

Phil
