* cvs revision number -> git commit name?
@ 2010-01-26 21:53 Hallvard B Furuseth
  2010-01-26 22:53 ` Aaron Crane
  0 siblings, 1 reply; 7+ messages in thread
From: Hallvard B Furuseth @ 2010-01-26 21:53 UTC (permalink / raw)
  To: git

When moving from CVS to Git, what's a good way to help Git users
find an old commit given the original CVS revision number?  Are
there tools available to help?

There are plenty of still-useful references to CVS revisions
floating around - in bug reports, mailing list archives, commit
messages referring to other commits.  Some loose thoughts:

One could commit a table with a (file,revision)->commit mapping,
I suppose something can generate it when importing from cvs?
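
As a rough sketch of what such a table could look like in use: the file
name cvs-revisions.txt and the three-column "path revision commit" layout
below are assumptions for illustration, not something an existing importer
is known to produce.

```shell
#!/bin/sh
# Hypothetical mapping file, one line per CVS file revision:
#   <path> <cvs-revision> <git-commit>
CVS_MAP=${CVS_MAP:-cvs-revisions.txt}

# Print the Git commit recorded for a given file and CVS revision.
# Returns non-zero if the pair is not in the table.
cvs_to_commit() {
    # Appending "" forces string comparison, so that revision 1.1 is
    # not treated as numerically equal to revision 1.10.
    awk -v path="$1" -v rev="$2" '
        ($1 "") == path && ($2 "") == rev { print $3; found = 1; exit }
        END { exit !found }
    ' "$CVS_MAP"
}
```

With such a table in place, "cvs_to_commit here/foo.c 1.23" would print the
commit name to hand to "git show".  Paths containing spaces would need a
different separator.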

Many, but far from all, old files contain the CVS ID, named $OpenLDAP$.
Can Git grep all versions of a file for '\$OpenLDAP:.* 1.23 '?
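
One way to approach this is to let Git list the commits that touched the
file and search the file as it existed in each of them.  A minimal sketch,
assuming the import kept the expanded keyword in the file contents; the
function name is made up for illustration.

```shell
#!/bin/sh
# List every commit in which <file> contains a line matching <pattern>
# (a basic regular expression).
find_cvs_id() {
    file=$1 pattern=$2
    # Only commits that changed the file are listed, so each version
    # of the file is examined once.
    git rev-list --all -- "$file" |
    while read -r commit
    do
        # "git grep <pattern> <commit> -- <file>" searches the file
        # contents as recorded in that commit.
        if git grep -q -e "$pattern" "$commit" -- "$file"
        then
            echo "$commit"
        fi
    done
}

# Example: find_cvs_id here/foo.c '\$OpenLDAP:.* 1\.23 '
```

This does nothing for files that never carried the keyword, so it can only
complement a mapping table, not replace it.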

Could maybe add a line like this to many of the log messages:
    "<cvs: version 1.23>"
for single-file commits, or
    "<cvs: here/foo.c 1.23, there/bar.c 1.45>"
for multi-file commits with few enough files that such an
annotation fits on one line.  That'll make log messages like "fix
rev 1.23" easier to read without need for a tool to find what the
message is talking about, but does clutter up the log a lot.
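
If the log messages did carry such annotations, finding a commit would be
an ordinary log search.  A sketch, assuming exactly the two "<cvs: ...>"
forms shown above; the function name is invented.

```shell
#!/bin/sh
# Print commits that touch <file> and whose log message carries a
# "<cvs: ...>" annotation for the given CVS revision.
find_by_annotation() {
    file=$1
    # Escape the dots so that "1.23" cannot match e.g. "1x23".
    rev=$(printf '%s' "$2" | sed 's/\./\\./g')
    # Several --grep options are ORed: the first covers single-file
    # commits, the second covers the multi-file form.
    git log --all --format=%H \
        --grep="<cvs: version ${rev}>" \
        --grep="<cvs: .*$file ${rev}[,>]" \
        -- "$file"
}
```

The file name is used as part of a regular expression here, which is
loose but probably good enough for typical source paths.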

Some stats:
    1600 files = 23M text, 770k lines, in 100 directories.
   Maybe 20000 Git commits, 50M ldap.git/.git/ directory.

-- 
Hallvard


* Re: cvs revision number -> git commit name?
  2010-01-26 21:53 cvs revision number -> git commit name? Hallvard B Furuseth
@ 2010-01-26 22:53 ` Aaron Crane
  2010-01-26 23:43   ` Johan Herland
  0 siblings, 1 reply; 7+ messages in thread
From: Aaron Crane @ 2010-01-26 22:53 UTC (permalink / raw)
  To: Hallvard B Furuseth; +Cc: git

Hallvard B Furuseth <h.b.furuseth@usit.uio.no> wrote:
> When moving from CVS to Git, what's a good way to help Git users
> find an old commit given the original CVS revision number?  Are
> there tools available to help?
>
> One could commit a table with a (file,revision)->commit mapping,
> I suppose something can generate it when importing from cvs?

That's what we decided to do on a recent CVS-to-Git conversion, though
like you, we also considered munging the log messages instead.  Our
jury's still out on whether it was the right decision; we haven't had
much cause to use the result yet.

One thing to be aware of (beyond the need to run grep to convert old
CVS revision numbers to Git commit IDs) is that there's a good chance
the mapping file will pollute the results of `git grep` for some
tasks.  (We've put the mapping file into our repo, where it's easy to
find.)  I'm considering gzipping the mapping file as a workaround;
that would mean our users will need to use zgrep (or equivalent) to
look up CVS revision numbers, which may or may not be a problem in
your situation.
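
A minimal sketch of the gzip workaround, assuming a plain-text table with
"path revision commit" columns; the layout and both function names are
illustrative only.

```shell
#!/bin/sh
# Compress the mapping so that text searches over the tree no longer
# match its lines.  -n leaves name and timestamp out of the gzip
# header, so recompressing unchanged data gives identical bytes.
pack_map() {
    gzip -9 -n -c "$1" >"$1.gz"
}

# Print the commit recorded for <path> <revision> in a packed table.
lookup_packed() {
    # Comparing the joined "path revision" string keeps revision 1.1
    # distinct from 1.10.
    gzip -dc "$1" |
    awk -v key="$2 $3" '($1 " " $2) == key { print $3 }'
}
```

For example, "lookup_packed cvs-revisions.txt.gz here/foo.c 1.23" would
print the matching commit name, if any.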

I have an initial patch to git-cvsimport that adds a switch to
generate the mapping as it goes.  I'm currently trying to find time to
clean it up and submit it.

-- 
Aaron Crane ** http://aaroncrane.co.uk/


* Re: cvs revision number -> git commit name?
  2010-01-26 22:53 ` Aaron Crane
@ 2010-01-26 23:43   ` Johan Herland
  2010-01-27  8:38     ` Junio C Hamano
  2010-01-27 17:47     ` cvs revision number -> git commit name? Hallvard B Furuseth
  0 siblings, 2 replies; 7+ messages in thread
From: Johan Herland @ 2010-01-26 23:43 UTC (permalink / raw)
  To: git; +Cc: Aaron Crane, Hallvard B Furuseth

On Tuesday 26 January 2010, Aaron Crane wrote:
> Hallvard B Furuseth <h.b.furuseth@usit.uio.no> wrote:
> > When moving from CVS to Git, what's a good way to help Git users
> > find an old commit given the original CVS revision number?  Are
> > there tools available to help?
> >
> > One could commit a table with a (file,revision)->commit mapping,
> > I suppose something can generate it when importing from cvs?
> 
> That's what we decided to do on a recent CVS-to-Git conversion, though
> like you, we also considered munging the log messages instead.  Our
> jury's still out on whether it was the right decision; we haven't had
> much cause to use the result yet.
> 
> One thing to be aware of (beyond the need to run grep to convert old
> CVS revision numbers to Git commit IDs) is that there's a good chance
> the mapping file will pollute the results of `git grep` for some
> tasks.  (We've put the mapping file into our repo, where it's easy to
> find.)  I'm considering gzipping the mapping file as a workaround;
> that would mean our users will need to use zgrep (or equivalent) to
> look up CVS revision numbers, which may or may not be a problem in
> your situation.
> 
> I have an initial patch to git-cvsimport that adds a switch to
> generate the mapping as it goes.  I'm currently trying to find time to
> clean it up and submit it.

You could consider adding the CVS revision numbers as notes (see "git help 
notes" in >= v1.6.6) to the corresponding commits. Then they don't pollute 
the commit messages, but instead live in a separate, but parallel hierarchy 
that can be easily pulled in when you need to reference them (e.g. 
GIT_NOTES_REF="refs/" git log).

The notes feature is still very new, and there are still outstanding patches 
to be merged, but the basics are there in v1.6.6.

FWIW, I was also working on a CVS-to-Git importer (based on what has later 
become the transport-helper infrastructure), that used notes to store 
exactly the metadata you mention above. However, I haven't worked on it for 
a while, and I probably won't have time to pick it up in the immediate 
future.


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net


* Re: cvs revision number -> git commit name?
  2010-01-26 23:43   ` Johan Herland
@ 2010-01-27  8:38     ` Junio C Hamano
  2010-01-27 11:28       ` git notes issues (was: cvs revision number -> git commit name?) Johan Herland
  2010-01-27 17:47     ` cvs revision number -> git commit name? Hallvard B Furuseth
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2010-01-27  8:38 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Aaron Crane, Hallvard B Furuseth

Johan Herland <johan@herland.net> writes:

> The notes feature is still very new, and there are still outstanding patches 
> to be merged, but the basics are there in v1.6.6.

By the way, we should seriously rethink how notes should propagate through
rebases and amends.  I've been using this in my post-applypatch hook
lately:

-- >8 -- cut here -- >8 --
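#!/bin/sh
# post-applypatch hook: record the Message-Id of the mail just applied
# by "git am" as a note on the resulting commit.
GIT_DIR=.git
dotest="$GIT_DIR/rebase-apply"

# "next" holds the number of the patch being applied; the mail itself
# is stored under that number, zero-padded to $prec digits.
prec=4 &&
this=$(cat 2>/dev/null "$dotest/next") &&
msgnum=$(printf "%0${prec}d" $this) &&
test -f "$dotest/$msgnum" &&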
message_id=$(sed -n '
	/^Message-I[Dd]:[ 	]*\(<.*>\)[ 	]*$/{
		s//\1/p
		q
	}
	/^$/q
' "$dotest/$msgnum") &&
test -n "$message_id" &&
GIT_NOTES_REF=refs/notes/amlog \
	git notes edit -m "Message-Id: $message_id" HEAD
-- 8< -- upto here -- 8< --

together with this in $HOME/.gitconfig

-- >8 -- cut here -- >8 --
[alias]
	lgm = "!sh -c 'GIT_NOTES_REF=refs/notes/amlog git log \"$@\" || :' -"
-- 8< -- upto here -- 8< --

so that I can say:

	$ git lgm -1 jh/maint-config-file-prefix
        commit 65807ee697a28cb30b8ad38ebb8b84cebd3f255d
        Author: Johan Herland <johan@herland.net>
        Date:   Tue Jan 26 16:02:16 2010 +0100

            builtin-config: Fix crash when using "-f <relative path>" from non-root dir

            When your current directory is not ...
	    ...

        Notes:
            Message-Id: <201001261602.16876.johan@herland.net>

A few observations I made myself so far:

 - I used to fix minor issues (styles, decl-after-stmt, etc.) using
   rebase-i long after running "am" in bulk, but these days I find myself
   going back to my "inbox" and fixing them in my MUA; this is only
   because I know these notes do not propagate across rebases and
   amends---adjusting the workflow to the tool's limitation is not very
   good.

 - The interface for telling tools which notes ref to use should be
   able to say "these refs", not just "this ref" (i.e. GIT_NOTES_REF=a:b,
   just like PATH=a:b:c...); I am fairly certain that we would want to
   store different kinds of information in separate notes trees and
   aggregate them, as we gain experience with notes.

 - There should be a way to tell tools which notes refs to use via
   command line options; "!alias" does not TAB-complete, and "git lgm"
   above doesn't, either. "git log --notes=notes/amlog --notes=notes/other"
   would probably be the way to go.

 - While reviewing the "inbox", I sometimes wonder if I applied a message
   to somewhere already, but there is no obvious way to grep in the notes
   tree and get the object name that a note is attached to.  Of course I
   know I can "git grep -c johan@herland.net notes/amlog" and it will give
   me something like:

    notes/amlog:65807ee697a28cb30b8ad38ebb8b84cebd3f255d:1
    notes/amlog:c789176020d6a008821e01af8b65f28abc138d4b:1

   but this won't scale and needs scripting to mechanize, once we start
   rebalancing the notes tree with different fan-outs.  The end user (me
   in this case) is interested in "set of objects that match this grep
   criteria", not "the pathnames the notes tree's implementation happens
   to use to store notes for them in the hierarchy".
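
The lookup asked for in the last point can be approximated with a short
loop over "git notes list", which prints one "<note blob> <annotated
object>" pair per line regardless of the fan-out used inside the notes
tree.  This is a sketch only: "notes_grep" is a made-up name, not an
existing subcommand, and it assumes a Git recent enough to have
"git notes list" and the --ref option.

```shell
#!/bin/sh
# Print the names of the objects whose note in refs/notes/<ref>
# contains a line matching <pattern>.
notes_grep() {
    ref=$1 pattern=$2
    git notes --ref="$ref" list |
    while read -r note object
    do
        # $note is the blob holding the note text, $object is the
        # commit (or other object) the note is attached to.
        if git cat-file blob "$note" | grep -q -e "$pattern"
        then
            echo "$object"
        fi
    done
}

# Example: notes_grep amlog 'johan@herland\.net'
```

It starts one "git cat-file" per note, so it is slow on a large notes
tree, but the output is the set of annotated objects rather than paths
inside the notes tree.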


* Re: git notes issues (was: cvs revision number -> git commit name?)
  2010-01-27  8:38     ` Junio C Hamano
@ 2010-01-27 11:28       ` Johan Herland
  0 siblings, 0 replies; 7+ messages in thread
From: Johan Herland @ 2010-01-27 11:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Aaron Crane, Hallvard B Furuseth

On Wednesday 27 January 2010, Junio C Hamano wrote:
> Johan Herland <johan@herland.net> writes:
> > The notes feature is still very new, and there are still outstanding
> > patches to be merged, but the basics are there in v1.6.6.
> 
> By the way, we should seriously rethink how notes should propagate
> through rebases and amends.  I've been using this in my post-applypatch
> hook lately:
> 
> [...]
> 
> A few observations I made myself so far:
> 
>  - I used to fix minor issues (styles, decl-after-stmt, etc.) using
>    rebase-i long after running "am" in bulk, but these days I find myself
>    going back to my "inbox" and fixing them in my MUA; this is only
>    because I know these notes do not propagate across rebases and
>    amends---adjusting the workflow to the tool's limitation is not very
>    good.

Agreed. I simply haven't had time to look much into this yet. We should 
probably add options (both command-line options and config variables) to 
'git rebase', 'git commit --amend', 'git cherry-pick', etc. for bringing 
notes across a commit rewrite. We should probably also add "git notes 
move/copy <old_object> <new_object>" subcommands to make the same operations 
available to scripts (and users).
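
Later Git releases did gain a "git notes copy" subcommand.  A sketch of
carrying a note across an amend by hand with it, assuming the notes live
in a ref such as refs/notes/amlog; the wrapper name is invented.

```shell
#!/bin/sh
# Amend HEAD and carry its note in refs/notes/<ref> over to the
# rewritten commit.  Extra arguments go to "git commit --amend".
amend_keeping_note() {
    ref=$1
    shift
    old=$(git rev-parse --verify HEAD) &&
    git commit --amend "$@" &&
    # "copy" fails when $old has no note (or HEAD already has one);
    # neither case is treated as an error here.
    { git notes --ref="$ref" copy "$old" HEAD 2>/dev/null || :; }
}

# Example: amend_keeping_note amlog -m "reworded subject"
```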

>  - The interface for telling tools which notes ref to use should be
>    able to say "these refs", not just "this ref" (i.e. GIT_NOTES_REF=a:b,
>    just like PATH=a:b:c...); I am fairly certain that we would want to
>    store different kinds of information in separate notes trees and
>    aggregate them, as we gain experience with notes.

Agreed.

>  - There should be a way to tell tools which notes refs to use via
>    command line options; "!alias" does not TAB-complete, and "git lgm"
>    above doesn't, either. "git log --notes=notes/amlog --notes=notes/other"
>    would probably be the way to go.

Agreed.

>  - While reviewing the "inbox", I sometimes wonder if I applied a message
>    to somewhere already, but there is no obvious way to grep in the notes
>    tree and get the object name that a note is attached to.  Of course I
>    know I can "git grep -c johan@herland.net notes/amlog" and it will give
>    me something like:
> 
>     notes/amlog:65807ee697a28cb30b8ad38ebb8b84cebd3f255d:1
>     notes/amlog:c789176020d6a008821e01af8b65f28abc138d4b:1
> 
>    but this won't scale and needs scripting to mechanize, once we start
>    rebalancing the notes tree with different fan-outs.  The end user (me
>    in this case) is interested in "set of objects that match this grep
>    criteria", not "the pathnames the notes tree's implementation happens
>    to use to store notes for them in the hierarchy".

Agreed. Should add a "git notes grep" subcommand.

I hope that once the updated notes API is in (I'll send an update to what's 
in 'pu' shortly), people will have the tools they need to start integrating 
notes with their pet Git features. As you illustrate perfectly above, there 
are many corners to be smoothed out before this is well integrated with the 
other Git tools.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net


* Re: cvs revision number -> git commit name?
  2010-01-26 23:43   ` Johan Herland
  2010-01-27  8:38     ` Junio C Hamano
@ 2010-01-27 17:47     ` Hallvard B Furuseth
  2010-01-27 22:19       ` Johan Herland
  1 sibling, 1 reply; 7+ messages in thread
From: Hallvard B Furuseth @ 2010-01-27 17:47 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Aaron Crane

Johan Herland writes:
>On Tuesday 26 January 2010, Aaron Crane wrote:
>> One thing to be aware of (beyond the need to run grep to convert old
>> CVS revision numbers to Git commit IDs)

which sounds like a job for a small tool, maybe aliased in .git/config.
'git cvsinfo file.c 1.23' or 'git cvsinfo [file.c] <git-commitname>' -->
output cvs and git commit info (cvs rev, commit, log message, etc).
Or maybe it shouldn't be cvs-specific.
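
A minimal sketch of what such a tool could look like.  Everything here is
hypothetical: the name cvsinfo, the table file name, and its
"path revision commit" layout are assumptions, and only the lookup part is
shown.

```shell
#!/bin/sh
# cvsinfo (hypothetical): translate between CVS revisions and commits
# using a table of "<path> <cvs-revision> <git-commit>" lines.
#   cvsinfo <file> <cvs-revision>  -> commit recorded for that revision
#   cvsinfo <commit>               -> "<path> <cvs-revision>" pairs in it
CVSINFO_MAP=${CVSINFO_MAP:-cvs-revisions.txt}

cvsinfo() {
    # Appending "" forces string comparison in awk, so revision 1.1
    # and revision 1.10 stay distinct.
    case $# in
    2)  awk -v p="$1" -v r="$2" \
            '($1 "") == p && ($2 "") == r { print $3 }' "$CVSINFO_MAP" ;;
    1)  awk -v c="$1" \
            '($3 "") == c { print $1, $2 }' "$CVSINFO_MAP" ;;
    *)  echo "usage: cvsinfo <file> <rev> | cvsinfo <commit>" >&2
        return 1 ;;
    esac
}
```

The commit must be given as the full name stored in the table.  The output
of the two-argument form can be passed to "git show" to get the log
message and diff.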

>> is that there's a good chance
>> the mapping file will pollute the results of `git grep` for some
>> tasks.  (We've put the mapping file into our repo, where it's easy to
>> find.)  I'm considering gzipping the mapping file as a workaround;
>> that would mean our users will need to use zgrep (or equivalent) to
>> look up CVS revision numbers, which may or may not be a problem in
>> your situation.

Thanks for the tip.  Zipping sounds good.  In particular combined with
the grepping tool above.  If the unzipping gets slow, cvsinfo --unpack
could always put a bunzipped file in .git/cvsinfo.txt or something.

> You could consider adding the CVS revision numbers as notes (see "git help 
> notes" in >= v1.6.6) to the corresponding commits. Then they don't pollute 
> the commit messages, but instead live in a separate, but parallel hierarchy 
> that can be easily pulled in when you need to reference them (e.g. 
> GIT_NOTES_REF="refs/" git log).

Thanks, looks better than munging the log.  Though with one common
weakness - should likely omit noting mass commits, since they'd clutter
what 'git log' displays too much.  Of course, either could be used
combined with a mapping table.

-- 
Hallvard


* Re: cvs revision number -> git commit name?
  2010-01-27 17:47     ` cvs revision number -> git commit name? Hallvard B Furuseth
@ 2010-01-27 22:19       ` Johan Herland
  0 siblings, 0 replies; 7+ messages in thread
From: Johan Herland @ 2010-01-27 22:19 UTC (permalink / raw)
  To: Hallvard B Furuseth; +Cc: git, Aaron Crane

On Wednesday 27 January 2010, Hallvard B Furuseth wrote:
> Johan Herland writes:
> > You could consider adding the CVS revision numbers as notes (see "git
> > help notes" in >= v1.6.6) to the corresponding commits. Then they don't
> > pollute the commit messages, but instead live in a separate, but
> > parallel hierarchy that can be easily pulled in when you need to
> > reference them (e.g. GIT_NOTES_REF="refs/" git log).
> 
> Thanks, looks better than munging the log.  Though with one common
> weakness - should likely omit noting mass commits, since they'd clutter
> what 'git log' displays too much.  Of course, either could be used
> combined with a mapping table.

Of course, you wouldn't put the cvs revision numbers on the default notes 
ref ("refs/notes/commits"). You would rather put them on a _different_ notes 
ref (e.g. "refs/notes/cvs"), where they would not clutter your "git log", 
and then when you _need_ to look at them, you simply run:

  GIT_NOTES_REF=refs/notes/cvs git log
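
A sketch of how an importer or a one-off script might fill and read such a
ref, assuming a Git whose "git notes" has the --ref option and the
"append" subcommand; the function names and the "cvs: path revision" note
text are made up for illustration.

```shell
#!/bin/sh
# Record a CVS revision on a commit in refs/notes/cvs.
# Usage: note_cvs_rev <commit> <path> <cvs-revision>
note_cvs_rev() {
    # "append" creates the note if the commit has none yet, so it can
    # be called once per file for multi-file commits.
    git notes --ref=cvs append -m "cvs: $2 $3" "$1"
}

# Show the log with the CVS notes, only when asked for.
log_with_cvs() {
    GIT_NOTES_REF=refs/notes/cvs git log "$@"
}
```

Plain "git log" keeps showing only the default notes ref, so the CVS
revisions stay out of the way until "log_with_cvs" is used.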


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

