* Setting file timestamps to commit time (git-checkout)
@ 2013-12-09 11:25 Dominik Vogt
  2013-12-09 20:35 ` Junio C Hamano
  2013-12-09 20:48 ` Jonathan Nieder
  0 siblings, 2 replies; 11+ messages in thread
From: Dominik Vogt @ 2013-12-09 11:25 UTC (permalink / raw)
  To: git

Some colleagues and I work on gcc in lots of different branches.
For each branch there is a separate build directory, e.g. build-a,
build-b and build-c.  Let's assume that all branches are identical
at the moment.  If a file in branch a is changed that triggers a
complete rebuild of gcc (e.g. <target>.opt), rebuilding in build-a
takes about an hour.  Now, when I switch to one of the other
branches, said file is not identical anymore and is stamped with
the _current_ time during checkout.  Although branches b and c have
not changed at all, they will now be rebuilt completely because the
timestamp on that file has changed.  I.e. a change on one branch
forces a rebuild on n other branches, which can take many hours.

I think this situation could be improved with an option to
git-checkout with the following logic:

$ git checkout <new branch>
  FOR EACH <file> in working directory of <new branch>
    IF <file> is identical to the version in the <old branch>
      THEN leave the file untouched
    ELSE IF <commit timestamp> of the HEAD of the <new branch>
            is in the future
      THEN checkout the new version of <file> and stamp it with
           the current time
    ELSE (commit timestamp is current or in the past)
      THEN checkout the new version of <file> and stamp it with
           the commit timestamp of the current HEAD of <new branch>
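
For illustration only, the decision logic above can be sketched in Python.
This is not an existing git-checkout option; the function name choose_mtime
and its parameters are assumptions made for this sketch.

```python
import time
from typing import Optional


def choose_mtime(identical: bool, commit_ts: float,
                 now: Optional[float] = None) -> Optional[float]:
    """Return the mtime to stamp on a checked-out file, or None to leave it alone."""
    if now is None:
        now = time.time()
    if identical:
        # Same content in the old and the new branch: do not touch the file.
        return None
    if commit_ts > now:
        # Commit time lies in the future: fall back to the current time.
        return now
    # Commit time is current or in the past: use it as the file's mtime.
    return commit_ts
```

A caller would pass the commit timestamp of the new branch's HEAD and apply
the result with os.utime() after writing the file.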

Any comments?  Is there already a way to do this?

(Please do not cc me on replies, I'm subscribed to the list.)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-09 11:25 Setting file timestamps to commit time (git-checkout) Dominik Vogt
@ 2013-12-09 20:35 ` Junio C Hamano
  2013-12-10  8:35   ` Dominik Vogt
  2013-12-09 20:48 ` Jonathan Nieder
  1 sibling, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2013-12-09 20:35 UTC (permalink / raw)
  To: vogt; +Cc: git

Dominik Vogt <vogt@linux.vnet.ibm.com> writes:

> Me and some colleagues work on gcc in lots of different branches.
> For each branch there is a separate build directory for each
> branch, e.g. build-a, build-b and build-c.  Let's assume that all
> branches are identical at the moment.  If a file in branch a is
> changed that triggers a complete rebuild of gcc (e.g.
> <target>.opt), rebuilding in build-a takes about an hour.  Now,
>  when I switch to one of the other branches, said file is not
> identical anymore and stamped with the _current_ time during
> checkout.  Although branch b and c have not changed at all, they
> will now be rebuilt completely because the timestamp on that files
> has changed.

I am not quite sure I follow your set-up.  Do you have three working
trees connected to a repository (via contrib/workdir/git-new-workdir
perhaps), each having a checkout of its own branch?  And in the
working tree that has build-a checked out, you make a new commit
that touches one file, <target>.opt:

Before:

    ---o---o---X
               ^ refs/heads/build-a
                 refs/heads/build-b
                 refs/heads/build-c

After:
                   v refs/heads/build-a
    ---o---o---X---Y
               ^ refs/heads/build-b
                 refs/heads/build-c

Because you said that branches b and c haven't changed at all, I do
not see how your build-b and/or build-c directories become dirty.

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-09 11:25 Setting file timestamps to commit time (git-checkout) Dominik Vogt
  2013-12-09 20:35 ` Junio C Hamano
@ 2013-12-09 20:48 ` Jonathan Nieder
  2013-12-10  8:46   ` Dominik Vogt
  1 sibling, 1 reply; 11+ messages in thread
From: Jonathan Nieder @ 2013-12-09 20:48 UTC (permalink / raw)
  To: git

Hi,

Dominik Vogt wrote:

>                                                            Now,
> when I switch to one of the other branches, said file is not
> identical anymore and stamped with the _current_ time during
> checkout.  Although branch b and c have not changed at all, they
> will now be rebuilt completely because the timestamp on that files
> has changed.  I.e. a chance on one branch forces a rebuild on n
> other branches, which can take many hours.
>
> I think this situation could be improved with an option to
> git-checkout with the following logic:
>
> $ git checkout <new branch>
>   FOR EACH <file> in working directory of <new branch>
>     IF <file> is identical to the version in the <old branch>
>       THEN leave the file untouched
>     ELSE IF <commit timestamp> of the HEAD of the <new branch>
>             is in the future
>       THEN checkout the new version of <file> and stamp it with
>            the current time
>     ELSE (commit timestamp is current or in the past)
>       THEN checkout the new version of <file> and stamp it with
>            the commit timestamp of the current HEAD of <new branch>

Wouldn't that break "make"?  When you switch to an old branch, changed
files would then have a timestamp *before* the corresponding build
targets, causing the stale (wrong function signatures, etc) build
results from the newer branch to be reused and breaking the build.
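
A minimal sketch of that concern, assuming make's basic rule (rebuild a
target when a prerequisite is newer than it) and purely hypothetical
timestamps:

```python
def make_would_rebuild(source_mtime: float, target_mtime: float) -> bool:
    """Simplified make rule: rebuild when the source is newer than the target."""
    return source_mtime > target_mtime


# Hypothetical numbers: the object file was built at t=1000 from the
# newer branch's version of the source file.
object_built_at = 1000.0

# Switching to an older branch rewrites the source, but the proposal stamps
# it with the old commit time (t=400) instead of the current time.
old_commit_time = 400.0

# make sees an "up to date" object file and reuses the stale build result.
stale_reused = not make_would_rebuild(old_commit_time, object_built_at)
```

With one shared build directory, stale_reused ends up true, which is exactly
the breakage described above.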

I suspect the simplest way to accomplish what you're looking for would
be to keep separate worktrees for each branch you regularly build.
It's possible to do that using entirely independent clones, clones
sharing some objects (using "git clone --shared" from some master
copy), or even multiple worktrees for the same clone (using the
git-new-workdir script from contrib/workdir/).

See [1] and [2] for more hints.

[...]
> (Please do not cc me on replies, I'm subscribed to the list.)

The convention on this list is to always reply-to-all, but I'm happy
to make an exception. :)

Hope that helps,
Jonathan

[1] https://git.wiki.kernel.org/index.php/Git_FAQ#Why_isn.27t_Git_preserving_modification_time_on_files.3F
[2] https://git.wiki.kernel.org/index.php/ExampleScripts#Setting_the_timestamps_of_the_files_to_the_commit_timestamp_of_the_commit_which_last_touched_them

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-09 20:35 ` Junio C Hamano
@ 2013-12-10  8:35   ` Dominik Vogt
  2013-12-10 19:02     ` Andreas Schwab
  2013-12-11  1:39     ` Constantine A. Murenin
  0 siblings, 2 replies; 11+ messages in thread
From: Dominik Vogt @ 2013-12-10  8:35 UTC (permalink / raw)
  To: git

On Mon, Dec 09, 2013 at 12:35:38PM -0800, Junio C Hamano wrote:
> Dominik Vogt <vogt@linux.vnet.ibm.com> writes:
> 
> > Me and some colleagues work on gcc in lots of different branches.
> > For each branch there is a separate build directory for each
> > branch, e.g. build-a, build-b and build-c.  Let's assume that all
> > branches are identical at the moment.  If a file in branch a is
> > changed that triggers a complete rebuild of gcc (e.g.
> > <target>.opt), rebuilding in build-a takes about an hour.  Now,
> >  when I switch to one of the other branches, said file is not
> > identical anymore and stamped with the _current_ time during
> > checkout.  Although branch b and c have not changed at all, they
> > will now be rebuilt completely because the timestamp on that files
> > has changed.
> 
> I am not quite sure I follow your set-up.  Do you have three working
> trees connected to a repository (via contrib/workdir/git-new-workdir
> perhaps), each having a checkout of its own branch?

No, just one working tree, but three separate build directories
for various branches.  Actually, the build directories could be
located at some random place on disk, but it's convenient to keep
them inside the working tree.  Personally I do not use multiple
working trees because in the past I had the impression that this
kind of setup creates more problems than it solves.  Just to give
you an idea of what my current workspace looks like:

  ~/rpm/BUILD/gcc-4.1.2-20080825
    build-4.1/
    install-4.1/
    ...
  (branch "master")

  ~/rpm/BUILD/gcc-4.4.7-20120601
    build-4.4/
    install-4.1/
  (branch "master")

  ~/src/git/gcc-unpatched
    build/
    install/
    ...
  (branch "master")

  ~/src/git/gcc-patched
    build-4.8/
    build-4.9/
    build-somefeature/
    install-4.8/
    install-4.9/
    install-somefeature/
    ...
  (various feature branches)

> [snip]

Hm, the case I described was too simple.  Another try:

* With the setup described above I have, say, ten branches, namely
  a and b, b2, ..., b9:

  ---o---X     <== a
     |
     `---Y     <== b
         |
         |---o <== b2
         ...
         `---o <== b9

* The two commits X and Y both touch a file that triggers a
  complete rebuild, say gcc/common.opt.

* Each branch has a matching build directory build-<branch>, and
  all of them are built for the latest version of the
  corresponding branch.

* Switch to branch a and do some work or just look at it.

* When I switch back to any of the b-branches, gcc/common.opt gets
  stamped with the current time, i.e. "make" considers the whole
  build directory to be outdated and builds everything from
  scratch.  Then I switch to another b-branch and the whole thing
  starts over etc.  With gcc-bootstrapping enabled, such a build
  takes me almost an hour.  In other words, just looking at branch
  a entails a full day just rebuilding branches that have not
  changed at all.

I've discussed that with some of my co-workers, but we still
could not come up with a nice solution.  The "right" way to "fix"
this might be to stash all file modification dates on a branch
switch and restore them when switching back to the original.  But
that sounds awfully expensive, and really out of the scope of an
RCS.  The second best approach I could think of is to stamp files
with the timestamp of the last commit that touched them, but I
guess that is not a cheap operation either.
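
For illustration, the stash-and-restore idea could be sketched as follows
in Python.  The function names and the JSON stash file are assumptions made
for this sketch, and it presumes that a file has the same content when its
time is restored as it had when the time was saved.

```python
import json
import os
from typing import Iterable


def save_mtimes(paths: Iterable[str], stash_file: str) -> None:
    """Record the modification time (in ns) of every existing path."""
    stamps = {p: os.stat(p).st_mtime_ns for p in paths if os.path.exists(p)}
    with open(stash_file, "w") as fh:
        json.dump(stamps, fh)


def restore_saved_mtimes(stash_file: str) -> None:
    """Put back the recorded modification times for paths that still exist."""
    with open(stash_file) as fh:
        stamps = json.load(fh)
    for path, mtime_ns in stamps.items():
        if os.path.exists(path):
            # The access time is set to the same value for simplicity.
            os.utime(path, ns=(mtime_ns, mtime_ns))
```

One stash file per branch would be written before switching away and read
back after switching to that branch again.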

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-09 20:48 ` Jonathan Nieder
@ 2013-12-10  8:46   ` Dominik Vogt
  2013-12-10 10:34     ` Duy Nguyen
  2013-12-11  1:01     ` Jonathan Nieder
  0 siblings, 2 replies; 11+ messages in thread
From: Dominik Vogt @ 2013-12-10  8:46 UTC (permalink / raw)
  To: git

On Mon, Dec 09, 2013 at 12:48:16PM -0800, Jonathan Nieder wrote:
> Dominik Vogt wrote:
> > when I switch to one of the other branches, said file is not
> > identical anymore and stamped with the _current_ time during
> > checkout.  Although branch b and c have not changed at all, they
> > will now be rebuilt completely because the timestamp on that files
> > has changed.  I.e. a chance on one branch forces a rebuild on n
> > other branches, which can take many hours.
> >
> > I think this situation could be improved with an option to
> > git-checkout with the following logic:
> >
> > $ git checkout <new branch>
> >   FOR EACH <file> in working directory of <new branch>
> >     IF <file> is identical to the version in the <old branch>
> >       THEN leave the file untouched
> >     ELSE IF <commit timestamp> of the HEAD of the <new branch>
> >             is in the future
> >       THEN checkout the new version of <file> and stamp it with
> >            the current time
> >     ELSE (commit timestamp is current or in the past)
> >       THEN checkout the new version of <file> and stamp it with
> >            the commit timestamp of the current HEAD of <new branch>
> 
> Wouldn't that break "make"?  When you switch to an old branch, changed
> files would then a timestamp *before* the corresponding build targets,
> causing the stale (wrong function signatures, etc) build results from
> the newer branch to be reused and breaking the build.

Yes, if you share a common build directory, this logic would
utterly break the build system.  The point with gcc is that you
do not build it in the source tree but in a separate build
directory, and it's easy to have separate build directories for
your branches.

> I suspect the simplest way to accomplish what you're looking for would
> be to keep separate worktrees for each branch you regularly build.
> It's possible to do that using entirely independent clones, clones
> sharing some objects (using "git clone --shared" from some master
> copy), or even multiple worktrees for the same clone (using the
> git-new-workdir script from contrib/workdir/).

I've tried the first two ways for separate workdirs in the past
but did not like them.  How does git-new-workdir cope with
rebasing (e.g. you have the same branch checked out in two working
trees and "rebase -i" it in one of them)?  Is it really a working
option?

> > (Please do not cc me on replies, I'm subscribed to the list.)
> 
> The convention on this list is to always reply-to-all, but I'm happy
> to make an exception. :)

It's just a hint; anyway, I guess I should remove the Reply-To
header if I don't want direct replies.  ;-)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10  8:46   ` Dominik Vogt
@ 2013-12-10 10:34     ` Duy Nguyen
  2013-12-11  1:08       ` Jonathan Nieder
  2013-12-11  1:01     ` Jonathan Nieder
  1 sibling, 1 reply; 11+ messages in thread
From: Duy Nguyen @ 2013-12-10 10:34 UTC (permalink / raw)
  To: Git Mailing List

On Tue, Dec 10, 2013 at 3:46 PM, Dominik Vogt <vogt@linux.vnet.ibm.com> wrote:
>
> > I suspect the simplest way to accomplish what you're looking for would
> > be to keep separate worktrees for each branch you regularly build.
> > It's possible to do that using entirely independent clones, clones
> > sharing some objects (using "git clone --shared" from some master
> > copy), or even multiple worktrees for the same clone (using the
> > git-new-workdir script from contrib/workdir/).
>
> I've tried the first two ways for separate workdirs in the past
> but did not like them.  How does git-new-workdir cope with
> rebasing (e.g. you have the same branch checked out in two working
> trees and "rebase -i" it in one of them)?  Is it really a working
> option?

I wonder if we could promote multiple worktrees from a hack to a
supported feature. What I have in mind is that "clone
--separate-worktree" would create a .git file that describes the
separate worktree:

gitbasedir: /path/to/the/original/.git
name: foo

HEAD, index and logs/HEAD would be stored in
/path/to/the/original/.git/worktrees/foo/. GIT_DIR would be set to
.../foo/, while GIT_OBJECT_DIRECTORY, the new GIT_REF_DIRECTORY
(which covers the root for all of refs/, logs/ and packed-refs) and
maybe GIT_HOOKS_DIRECTORY would point to directories in
.../original/.git/...

This allows all worktrees to be aware of the others, and locking could
be implemented so that no two worktrees check out the same branch (or
they can, but the other becomes detached if the ref is updated in this
worktree).
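
To make the proposed layout concrete, here is a small Python sketch that
reads such a .git file and derives the per-worktree directory.  The file
format is only the proposal from this mail, not something git implements,
and both function names are hypothetical.

```python
import posixpath
from typing import Dict


def parse_worktree_file(text: str) -> Dict[str, str]:
    """Parse "key: value" lines of the proposed .git file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def worktree_private_dir(fields: Dict[str, str]) -> str:
    """Directory that would hold HEAD, index and logs/HEAD for this worktree."""
    return posixpath.join(fields["gitbasedir"], "worktrees", fields["name"])
```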
-- 
Duy

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10  8:35   ` Dominik Vogt
@ 2013-12-10 19:02     ` Andreas Schwab
  2013-12-11  7:37       ` Dominik Vogt
  2013-12-11  1:39     ` Constantine A. Murenin
  1 sibling, 1 reply; 11+ messages in thread
From: Andreas Schwab @ 2013-12-10 19:02 UTC (permalink / raw)
  To: git

Dominik Vogt <vogt@linux.vnet.ibm.com> writes:

> The second best approach I could think of is to stamp files with the
> timestamp of the last commit that touched that, but I guess that is
> not a cheap operation either.

I'm using this script for this:

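#!/bin/sh
# Stamp each file with the time of the last commit that touched it.
# For every commit, newest first, "git log" prints a blank line, the
# commit time (%ct) and then the names of the files that commit changed.
git log --name-only --format=format:%n%ct -- "$@" |
perl -e 'my $do_date = 0; my ($d, %seen);
    # Relative path from the current directory to the top of the work tree.
    chomp(my $cdup = `git rev-parse --show-cdup`);
    while (<>) {
        chomp;
        if ($do_date) {
            # Expecting a commit time; skip any extra blank lines.
            next if ($_ eq "");
            die "Unexpected $_\n" unless /^[0-9]+$/;
            $d = $_;
            $do_date = 0;
        } elsif ($_ eq "") {
            # A blank line announces the next commit time.
            $do_date = 1;
        } elsif (!defined($seen{$_})) {
            # First (newest) mention of this file: use that commit time.
            $seen{$_} = 1;
            utime $d, $d, "$cdup$_";
        }
    }'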
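
As a hedged illustration, the same idea can be written in Python.  This is
a sketch rather than a drop-in replacement: the function names are made up
here, and like the script above it does not handle paths that git prints
in quoted form.

```python
import os
import re
import subprocess
from typing import Dict, Iterable


def last_commit_times(log_lines: Iterable[str]) -> Dict[str, int]:
    """Map each file to the time of the newest commit that lists it."""
    times: Dict[str, int] = {}
    expect_date = False
    current = None
    for raw in log_lines:
        line = raw.rstrip("\n")
        if expect_date:
            if line == "":
                continue  # skip extra blank lines before the time
            if not re.fullmatch(r"[0-9]+", line):
                raise ValueError(f"Unexpected {line}")
            current = int(line)
            expect_date = False
        elif line == "":
            expect_date = True  # a blank line announces a commit time
        elif line not in times and current is not None:
            times[line] = current  # the log is newest first
    return times


def restore_commit_mtimes(paths: Iterable[str] = ()) -> None:
    """Run git log and stamp the files that still exist in the work tree."""
    def git(*args: str) -> str:
        return subprocess.run(["git", *args], capture_output=True,
                              text=True, check=True).stdout
    cdup = git("rev-parse", "--show-cdup").strip()
    log = git("log", "--name-only", "--format=format:%n%ct", "--", *paths)
    for name, stamp in last_commit_times(log.splitlines()).items():
        target = os.path.join(cdup, name)
        if os.path.exists(target):
            os.utime(target, (stamp, stamp))
```

Keeping the parsing in last_commit_times() separate from the git calls
makes it possible to check the state machine against canned log output.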

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10  8:46   ` Dominik Vogt
  2013-12-10 10:34     ` Duy Nguyen
@ 2013-12-11  1:01     ` Jonathan Nieder
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Nieder @ 2013-12-11  1:01 UTC (permalink / raw)
  To: git

Dominik Vogt wrote:

>                         How does git-new-workdir cope with
> rebasing (e.g. you have the same branch checked out in two working
> trees and "rebase -i" it in one of them)?

Generally you don't have the same branch checked out in two working
trees.  I tend to use "git checkout --detach" to not have *any*
branch checked out in most working trees, though that comes with its
own set of problems since the HEAD reflog is not shared.

>                                            Is it really a working
> option?

Yes, modulo the two warnings above. ;-)

If someone has time to work on it, the threads

 http://thread.gmane.org/gmane.comp.version-control.git/150559
 http://thread.gmane.org/gmane.comp.version-control.git/182821

describe one way to make those caveats go away.

Thanks,
Jonathan

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10 10:34     ` Duy Nguyen
@ 2013-12-11  1:08       ` Jonathan Nieder
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Nieder @ 2013-12-11  1:08 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Pierre Habouzit

Duy Nguyen wrote:

> I wonder if we could promote multiple worktree from a hack to a
> supported feature. What I have in mind is when you "clone
> --separate-worktree" it would create a .git file that describes
> separate worktree:
>
> gitbasedir: /path/to/the/original/.git
> name: foo
>
> HEAD, index and logs/HEAD would be stored in
> /path/to/the/original/.git/worktrees/foo/.

I like this idea a lot.

Jonathan

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10  8:35   ` Dominik Vogt
  2013-12-10 19:02     ` Andreas Schwab
@ 2013-12-11  1:39     ` Constantine A. Murenin
  1 sibling, 0 replies; 11+ messages in thread
From: Constantine A. Murenin @ 2013-12-11  1:39 UTC (permalink / raw)
  To: vogt, git

On 10 December 2013 00:35, Dominik Vogt <vogt@linux.vnet.ibm.com> wrote:
> that sounds awfully expensive, and really out of the scope of an
> RCS.  The second best approach I could think of is to stamp files
> with the timestamp of the last commit that touched that, but I
> guess that is not a cheap operation either.

You can already do this with a very small third-party script:

    https://github.com/cnst/git-tools/blob/master/git-restore-mtime-core

C.

* Re: Setting file timestamps to commit time (git-checkout)
  2013-12-10 19:02     ` Andreas Schwab
@ 2013-12-11  7:37       ` Dominik Vogt
  0 siblings, 0 replies; 11+ messages in thread
From: Dominik Vogt @ 2013-12-11  7:37 UTC (permalink / raw)
  To: git

On Tue, Dec 10, 2013 at 08:02:29PM +0100, Andreas Schwab wrote:
> Dominik Vogt <vogt@linux.vnet.ibm.com> writes:
> 
> > The second best approach I could think of is to stamp files with the
> > timestamp of the last commit that touched that, but I guess that is
> > not a cheap operation either.
> 
> I'm using this script for this:
[snip]

Hm, that takes 18 s on my local gcc repository.  That's not as
expensive as I would have thought, but definitely not suitable to
run automatically on each checkout.  I wonder if performance could
be improved by integrating the script logic into the git-checkout
code (activated by a command line option).

On Tue, Dec 10, 2013 at 05:39:05PM -0800, Constantine A. Murenin wrote:
> You can already do this with a very small third-party script:
>
>     https://github.com/cnst/git-tools/blob/master/git-restore-mtime-core

That script just produces error messages for me.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

end of thread, other threads:[~2013-12-11  7:37 UTC | newest]

Thread overview: 11+ messages
2013-12-09 11:25 Setting file timestamps to commit time (git-checkout) Dominik Vogt
2013-12-09 20:35 ` Junio C Hamano
2013-12-10  8:35   ` Dominik Vogt
2013-12-10 19:02     ` Andreas Schwab
2013-12-11  7:37       ` Dominik Vogt
2013-12-11  1:39     ` Constantine A. Murenin
2013-12-09 20:48 ` Jonathan Nieder
2013-12-10  8:46   ` Dominik Vogt
2013-12-10 10:34     ` Duy Nguyen
2013-12-11  1:08       ` Jonathan Nieder
2013-12-11  1:01     ` Jonathan Nieder
