linux-kernel.vger.kernel.org archive mirror
* Re: New SCM and commit list
@ 2005-04-12  3:02 Adam J. Richter
  2005-04-12 21:54 ` Daniel Barkalow
  0 siblings, 1 reply; 24+ messages in thread
From: Adam J. Richter @ 2005-04-12  3:02 UTC (permalink / raw)
  To: barkalow
  Cc: benh, dwmw2, greg, james.bottomley, jgarzik, linux-kernel, mason,
	mingo, torvalds

On 2005-04-11, Daniel Barkalow wrote:
>If merge took trees instead of single files, and had some way of detecting
>renames (or it got additional information about the differences between
>files), would that give BK-quality performance? Or does BK also support
>cases like:
>
>orig ---> first ---> first-merge -
> |                /               \
> |------> second -                 -> final
> |                \               /
> |------> third ---> third-merge -
>
>where the final merge requires, for complete cleanliness, a comparison of
>more than 3 states (since some changes will have orig as the common
>ancestor and some will have second).

	With 3-way merge and the ability to regenerate the relevant
files from each step, this should be easy to handle as long
as you have a list of which patches are considered to have been
duplicated.  Let's detail your example:

orig ---> first 1a 1b 1c ---> first-merge - 1d 1e
 |                          /                    \
 |------> second 2a 2b 2c -                       -> final
 |                          \                    /
 |------> third 3a 3b 3c ---> third-merge - 3d 3e

Here, 1a, 1b, etc. refer to specific states of the source tree.
I will refer to differences by a notation like "1a->1b", which
is the difference to go from snapshot 1a to 1b.  All that the
merge algorithm for the final merge needs to know is that the
ends of the branches (that is, 1e and 3e) both contain the
following diffs:

		orig->2a
		2a->2b
		2b->2c

	The function merge(orig, ver1, ver2) can try to reverse
the duplicate merges in one of the branches:

		1e' = merge( 1e, 2c->2b);
		1e'' = merge(1e', 2b->2a);
		1e''' = merge(1e'', 2a->orig);
		return merge(1e''', 2c->3e)

	Of course, conflicts can happen, but that can happen
in any merge.  There are also other ways to calculate the
merge and because there are different ways one can write a
merge function, it is possible that merging in a different
order might produce slightly different results.  For example,
it would be possible to reverse the duplicates in your "third merge"
branch instead of your "first merge" branch, or one could
reconstruct a branch without the duplicated merges by executing
the other changes forward from a common ancestor, like so:

		1e''' = merge(orig, 3d->3e);

	...regardless, the point is that all the information
that is absolutely needed is a list of instances of diffs
to be skipped.  It is not even necessary that the changes
have such a clearly explainable ancestry as the one you have
described.  All the merge program needs to know are the changes
to be skipped, although information such as which changes the
skipped patches are duplicating may be useful for things like
trying to reverse a patch in your "third-merge" branch in your
example if reversing the patch in "first-merge" fails.
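
The reversal steps above can be sketched on a toy model. This is only an illustration under stated assumptions, not how any of the tools mentioned works: a tree is a dict of path to content, a three-way merge picks whole files, and the helper names `merge3` and `apply_diff` are made up for the example. The last step applies orig->3e, which is this sketch's own reading, chosen so that the 2-series ends up in the result exactly once.

```python
def merge3(base, ours, theirs):
    """Three-way merge of toy trees given as {path: content} dicts."""
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            pick = o              # both sides agree, or only "ours" changed
        elif o == b:
            pick = t              # only "theirs" changed
        else:
            raise ValueError("conflict in " + path)
        if pick is not None:      # None means the file does not exist
            merged[path] = pick
    return merged


def apply_diff(state, frm, to):
    """Apply the diff frm->to to state by merging with frm as the base."""
    return merge3(frm, state, to)


# Hypothetical snapshots: each series touches its own file.
orig = {"first.c": "v0", "second.c": "v0", "third.c": "v0"}
s2a = dict(orig, **{"second.c": "2a"})
s2b = dict(orig, **{"second.c": "2b"})
s2c = dict(orig, **{"second.c": "2c"})
e1 = {"first.c": "1e", "second.c": "2c", "third.c": "v0"}   # end of branch 1
e3 = {"first.c": "v0", "second.c": "2c", "third.c": "3e"}   # end of branch 3

# Reverse the duplicated diffs in branch 1, newest first.
e1_rev = apply_diff(e1, s2c, s2b)
e1_rev = apply_diff(e1_rev, s2b, s2a)
e1_rev = apply_diff(e1_rev, s2a, orig)

# Then bring in everything branch 3 has, including the 2-series.
final = apply_diff(e1_rev, orig, e3)
print(final)
```

Because this model merges whole files, it cannot show the interesting failure case where 1d->1e edits the same lines as the 2-series; there the reversal raises a conflict instead.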

	I believe that at least bitkeeper, darcs, a free python-based
system that I can't remember at the moment, and possibly arch do this
sort of machination already.


>Does this happen in real life? [...]

	Yes.  Both individual users and Linux distributions incorporate
patches that they think are useful to them and then further patches
that they develop.  The time costs of rejecting such patches would
likely be paid for by other integration or development work not being
done.

>It seems like sane development processes
               ^^^^
>wouldn't have multiple mainline-candidate patch sets including the same
>patches, if for no other reason than that, should the merge fail, nobody
>with any clue about the original patches would be anywhere nearby.

	If you could avoid prejudicial subjective adjectives, it
would make it easier for the saneness or insaneness of an
approach to be apparent just by discussing your more objective criteria,
like the remainder of your sentence, which is where the focus should
be.

	(1) Does allowing duplicate patches really mean that
	   "nobody with any clue about the original patches would be
	   anywhere nearby?"  What attracts these clueful people
	   just by third parties having to rebase their patches?

	(2) Does this supposed benefit outweigh the cost of rejecting
	    many patches unnecessarily?  I know from my own experience
	    that I have either given up on or had to put into a very low
	    priority mode at least 66% of the patches that I haven't
	    gotten integrated, but which I am confident the kernel
	    would be better having (e.g.: devfs shrink, lookup()
	    trapping, ipv4 as a loadable (but not yet removable) module,
	    sysfs memory shrink, factoring much of the DMA mapping to
	    the common bus code from individual drivers, fewer kmap's
	    in crypto, I could go on).

>It
>seems better to throw something back to someone to rebase their diffs.
       ^^^^^^

	I try to avoid general subjective adjectives like "better"
unless I am claiming that I've covered the trade-offs fully, and, even
then, avoiding it keeps the focus on analyzing the trade-offs.

                    __     ______________ 
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: New SCM and commit list
  2005-04-12  3:02 New SCM and commit list Adam J. Richter
@ 2005-04-12 21:54 ` Daniel Barkalow
  0 siblings, 0 replies; 24+ messages in thread
From: Daniel Barkalow @ 2005-04-12 21:54 UTC (permalink / raw)
  To: Adam J. Richter
  Cc: benh, dwmw2, greg, james.bottomley, jgarzik, linux-kernel, mason,
	mingo, torvalds

On Tue, 12 Apr 2005, Adam J. Richter wrote:

> On 2005-04-11, Daniel Barkalow wrote:
> >If merge took trees instead of single files, and had some way of detecting
> >renames (or it got additional information about the differences between
> >files), would that give BK-quality performance? Or does BK also support
> >cases like:
> >
> >orig ---> first ---> first-merge -
> > |                /               \
> > |------> second -                 -> final
> > |                \               /
> > |------> third ---> third-merge -
> >
> >where the final merge requires, for complete cleanliness, a comparison of
> >more than 3 states (since some changes will have orig as the common
> >ancestor and some will have second).
> 
> 	With 3-way merge and the ability to regenerate the relevant
> files from each step, this should be easy to handle as long
> as you have a list of which patches are considered to have been
> duplicated.  Let's detail your example:
> 
> orig ---> first 1a 1b 1c ---> first-merge - 1d 1e
>  |                          /                    \
>  |------> second 2a 2b 2c -                       -> final
>  |                          \                    /
>  |------> third 3a 3b 3c ---> third-merge - 3d 3e
> 
> Here, 1a, 1b, etc. refer to specific states of the source tree.
> I will refer to differences by a notation like "1a->1b", which
> is the difference to go from snapshot 1a to 1b.  All that the
> merge algorithm for the final merge needs to know is that the
> ends of the branches (that is, 1e and 3e) both contain the
> following diffs:
> 
> 		orig->2a
> 		2a->2b
> 		2b->2c
> 
> 	The function merge(orig, ver1, ver2) can try to reverse
> the duplicate merges in one of the branches:
> 
> 		1e' = merge( 1e, 2c->2b);
> 		1e'' = merge(1e', 2b->2a);
> 		1e''' = merge(1e'', 2a->orig);
> 		return merge(1e''', 2c->3e)

If 1d->1e depends on something in the 2 series, which is why I would
expect 1e to be pushing something containing the 2 series, there must be
conflicts. Likewise on the 3 series.

> 	Of course, conflicts can happen, but that can happen
> in any merge.  There are also other ways to calculate the
> merge and because there are different ways one can write a
> merge function, it is possible that merging in a different
> order might produce slightly different results.  For example,
> it would be possible to reverse the duplicates in your "third merge"
> branch instead of your "first merge" branch, or one could
> reconstruct a branch without the duplicated merges by executing
> the other changes forward from a common ancestor, like so:
> 
> 		1e''' = merge(orig, 3d->3e);
> 
> 	...regardless, the point is that all the information
> that is absolutely needed is a list of instances of diffs
> to be skipped.  It is not even necessary that the changes
> have such a clearly explainable ancestry as the one you have
> described.  All the merge program needs to know are the changes
> to be skipped, although information such as which changes the
> skipped patches are duplicating may be useful for things like
> trying to reverse a patch in your "third-merge" branch in your
> example if reversing the patch in "first-merge" fails.

Right, an extended primitive solves the problem, certainly, and much more
effectively than sticking with 3-way merge.

> 	I believe that at least bitkeeper, darcs, a free python-based
> system that I can't remember at the moment, and possibly arch do this
> sort of machination already.
> 
> 
> >Does this happen in real life? [...]
> 
> 	Yes.  Both individual users and Linux distributions incorporate
> patches that they think are useful to them and then further patches
> that they develop.  The time costs of rejecting such patches would
> likely be paid for by other integration or development work not being
> done.

It seems to me that users who use extra patches keep these separate from
their own patches (which they often keep in multiple series):

orig ---> other-people ---> local use, distribution
 |                       /
 |------> mine ----------
 |                       \
 |------> etc    ---------> mainline

If mainline is going to get the third-party patches in a distro tree, it
should get them from the original authors, not as part of a miscellaneous
patch set from the distro. If one patch series depends on another patch
series, it should hold off until the other one goes in, not include it in
the submission.

> >It seems like sane development processes
>                ^^^^
> >wouldn't have multiple mainline-candidate patch sets including the same
> >patches, if for no other reason than that, should the merge fail, nobody
> >with any clue about the original patches would be anywhere nearby.
> 
> 	If you could avoid prejudicial subjective adjectives, it
> would make it easier for the saneness or insaneness of an
> approach to be apparent just by discussing your more objective criteria,
> like the remainder of your sentence, which is where the focus should
> be.
> 
> 	(1) Does allowing duplicate patches really mean that
> 	   "nobody with any clue about the original patches would be
> 	   anywhere nearby?"  What attracts these clueful people
> 	   just by third parties having to rebase their patches?

The clueful people are the original authors (first, second, and
third); 1d-1e and 3d-3e would be rebased by their authors against a new
orig that's the merge of the 1c, 2c, and 3c (which all have a good common
ancestor).

Actually, the best 3-way merge path might be:

merge(merge(merge(3d,orig->1c),3d->3e),1d->1e)

That is, generate a complete merge at the point where people each merged
in the second line, and then continue forward from there.
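
The merge path above can be traced on a toy model. This is a sketch under assumptions that are not in the thread: trees are dicts of path to content, merges pick whole files, 1d and 3d are taken to be the states right after each branch merged in the second line, and the names `merge3` and `apply_diff` are hypothetical.

```python
def merge3(base, ours, theirs):
    """Whole-file three-way merge of {path: content} dicts."""
    merged = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            pick = o              # both sides agree, or only "ours" changed
        elif o == b:
            pick = t              # only "theirs" changed
        else:
            raise ValueError("conflict in " + path)
        if pick is not None:      # None means the file does not exist
            merged[path] = pick
    return merged


def apply_diff(state, frm, to):
    """merge(state, frm->to): apply the diff frm->to on top of state."""
    return merge3(frm, state, to)


files = ["first.c", "second.c", "third.c", "first_extra.c", "third_extra.c"]
orig = {name: "v0" for name in files}
s1c = dict(orig, **{"first.c": "1c"})                       # first series
s1d = dict(s1c, **{"second.c": "2c"})                       # first-merge
s1e = dict(s1d, **{"first_extra.c": "1e"})                  # work after it
s3d = dict(orig, **{"third.c": "3c", "second.c": "2c"})     # third-merge
s3e = dict(s3d, **{"third_extra.c": "3e"})                  # work after it

# merge(merge(merge(3d, orig->1c), 3d->3e), 1d->1e)
complete = apply_diff(s3d, orig, s1c)      # all three series merged once
complete = apply_diff(complete, s3d, s3e)  # continue forward on branch 3
final = apply_diff(complete, s1d, s1e)     # continue forward on branch 1
print(final)
```

Each step uses a base that is a real ancestor of the diff being applied, so the second series never has to be backed out.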

> 	(2) Does this supposed benefit outweigh the cost of rejecting
> 	    many patches unnecessarily?  I know from my own experience
> 	    that I have either given up on or had to put into a very low
> 	    priority mode at least 66% of the patches that I haven't
> 	    gotten integrated, but which I am confident the kernel
> 	    would be better having (e.g.: devfs shrink, lookup()
> 	    trapping, ipv4 as a loadable (but not yet removable) module,
> 	    sysfs memory shrink, factoring much of the DMA mapping to
> 	    the common bus code from individual drivers, fewer kmap's
> 	    in crypto, I could go on).

This is unfortunate, certainly, but the alternative under discussion would
be to get those patches into as many other trees as possible until
Andrew/Linus picks them up and then finds that he's gotten multiple
copies of them via different routes. If each of these went in through the
respective maintainer, there would be no problem, and if any of them went
in without the respective maintainer's sign-off, that would upset
people.

	-Daniel
*This .sig left intentionally blank*



* Re: New SCM and commit list
  2005-04-16  8:35         ` Paul Jackson
@ 2005-04-18  8:18           ` Catalin Marinas
  0 siblings, 0 replies; 24+ messages in thread
From: Catalin Marinas @ 2005-04-18  8:18 UTC (permalink / raw)
  To: Paul Jackson
  Cc: torvalds, jgarzik, benh, linux-kernel, James.Bottomley, dwmw2

Paul Jackson <pj@sgi.com> wrote:
>> "merge" does a better job than "diff3" since it can resolve the
>
> The merge command I know of is part of Tichy's RCS tools,
> and calls diff3, and has no inherent superior abilities.

You are right, I missed some diff3 options. It looks like "diff3 -mE"
generates the same output as "merge" (i.e. solving the identical
changes in the derived files). Sorry for the noise :-)

-- 
Catalin



* Re: New SCM and commit list
  2005-04-12  9:52       ` Catalin Marinas
@ 2005-04-16  8:35         ` Paul Jackson
  2005-04-18  8:18           ` Catalin Marinas
  0 siblings, 1 reply; 24+ messages in thread
From: Paul Jackson @ 2005-04-16  8:35 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: torvalds, jgarzik, benh, linux-kernel, James.Bottomley, dwmw2

> "merge" does a better job than "diff3" since it can resolve the

The merge command I know of is part of Tichy's RCS tools,
and calls diff3, and has no inherent superior abilities.

Is this the merge command you have in mind here?

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401


* Re: New SCM and commit list
  2005-04-11 21:26       ` Linus Torvalds
  2005-04-11 21:31         ` James Bottomley
@ 2005-04-13 20:04         ` H. Peter Anvin
  1 sibling, 0 replies; 24+ messages in thread
From: H. Peter Anvin @ 2005-04-13 20:04 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <Pine.LNX.4.58.0504111424270.1267@ppc970.osdl.org>
By author:    Linus Torvalds <torvalds@osdl.org>
In newsgroup: linux.dev.kernel
> 
> On Mon, 11 Apr 2005, Greg KH wrote:
> > 
> > I have a feeling that the kernel.org mirror system is just going to
> > _love_ us using it to store temporary git trees :)
> 
> I don't think kernel.org mirrors the private home directories, so if you
> do _temporary_ trees, just make them readable in your private home
> directory rather than in /pub/linux/kernel/people. For people with 
> kernel.org accounts, we can use that as the "bkbits.net" thing.
> 
> For really public hosting, we need to find some other approach. 
> 

It's also pretty trivial to set up an additional /pub hierarchy, like
the current /pub/scm, which is up to individual mirrors to pick up or
not to pick up.  We only require /pub/linux and /pub/software to be
mirrored.

	-hpa


* Re: New SCM and commit list
  2005-04-11  6:15     ` Linus Torvalds
                         ` (3 preceding siblings ...)
  2005-04-11 22:50       ` Daniel Barkalow
@ 2005-04-12  9:52       ` Catalin Marinas
  2005-04-16  8:35         ` Paul Jackson
  4 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2005-04-12  9:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Benjamin Herrenschmidt, Linux Kernel list,
	James Bottomley, David Woodhouse

Linus Torvalds <torvalds@osdl.org> wrote:
> So anything that got modified in just one tree obviously merges to that 
> version. Any file that got modified in two trees will end up just being 
> passed to the "merge" program. See "man merge" and "man diff3". The merger 
> gets to fix up any conflicts by hand.

"merge" does a better job than "diff3" since it can resolve the
conflicts caused by similar changes to a "parent" file (this is
available in both BK and GNU Arch). This is useful when you try to
merge 2 branches that both include a patch which is not under the
revision control. It also solves the conflicts caused by
cherry-picking changes (just need to find the last consecutive common
changeset as the common ancestor).

-- 
Catalin



* Re: New SCM and commit list
  2005-04-11 22:50       ` Daniel Barkalow
@ 2005-04-12  8:36         ` Geert Uytterhoeven
  0 siblings, 0 replies; 24+ messages in thread
From: Geert Uytterhoeven @ 2005-04-12  8:36 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Linus Torvalds, Jeff Garzik, Benjamin Herrenschmidt,
	Linux Kernel list, James Bottomley, David Woodhouse

On Mon, 11 Apr 2005, Daniel Barkalow wrote:
> If merge took trees instead of single files, and had some way of detecting
> renames (or it got additional information about the differences between
> files), would that give BK-quality performance? Or does BK also support

I wrote a script to do merges on a tree (so far without rename detection,
though ;-) a long time ago, and still use it every time Linus or Marcelo
release a new version.

Look at `mergetree' on http://linux-m68k-cvs.ubb.ca/~geert/

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


* Re: New SCM and commit list
  2005-04-11 21:31         ` James Bottomley
@ 2005-04-12  4:24           ` Arjan van de Ven
  0 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2005-04-12  4:24 UTC (permalink / raw)
  To: James Bottomley
  Cc: Linus Torvalds, Greg KH, Benjamin Herrenschmidt, Linux Kernel list

On Mon, 2005-04-11 at 16:31 -0500, James Bottomley wrote:
> On Mon, 2005-04-11 at 14:26 -0700, Linus Torvalds wrote:
> > I don't think kernel.org mirrors the private home directories, so if you
> > do _temporary_ trees, just make them readable in your private home
> > directory rather than in /pub/linux/kernel/people. For people with 
> > kernel.org accounts, we can use that as the "bkbits.net" thing.
> 
> It's also going to be a slight problem for those of us who don't have a
> kernel.org account...although I think the hosting I use on the parisc
> website might actually be outside the HP firewall, so I can probably beg
> for it to run any protocol you need (like rsync).

rsync also runs over ssh so if you can ssh in you can rsync to it



* Re: New SCM and commit list
  2005-04-11  6:15     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2005-04-11  7:38       ` Ingo Molnar
@ 2005-04-11 22:50       ` Daniel Barkalow
  2005-04-12  8:36         ` Geert Uytterhoeven
  2005-04-12  9:52       ` Catalin Marinas
  4 siblings, 1 reply; 24+ messages in thread
From: Daniel Barkalow @ 2005-04-11 22:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Benjamin Herrenschmidt, Linux Kernel list,
	James Bottomley, David Woodhouse

On Sun, 10 Apr 2005, Linus Torvalds wrote:

> On Mon, 11 Apr 2005, Jeff Garzik wrote:
> > 
> > > But I hope that I can get non-conflicting merges done fairly soon, and 
> > > maybe I can con James or Jeff or somebody to try out GIT then...
> > 
> > I don't mind being a guinea pig as long as someone else does the hard 
> > work of finding a new way to merge :)
> 
> So I can tell you what merges are going to be like, just to prepare you.
> 
> First, the good news: I think we can make the workflow look like bk, ie
> pretty much like "git pull" and "git push".  And for well-behaved stuff
> (ie minimal changes to the same files on both sides) it will even be fast.  
> I think.
> 
> Then the bad news: the merge algorithm is going to suck. It's going to be
> just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
> understanding of renames etc. I'll try to find the best parent to base the
> merge off of, although early testers may have to tell the piece of crud
> what the most recent common parent was.
>
> So anything that got modified in just one tree obviously merges to that 
> version. Any file that got modified in two trees will end up just being 
> passed to the "merge" program. See "man merge" and "man diff3". The merger 
> gets to fix up any conflicts by hand.

If merge took trees instead of single files, and had some way of detecting
renames (or it got additional information about the differences between
files), would that give BK-quality performance? Or does BK also support
cases like:

orig ---> first ---> first-merge -
 |                /               \
 |------> second -                 -> final
 |                \               /
 |------> third ---> third-merge -

where the final merge requires, for complete cleanliness, a comparison of
more than 3 states (since some changes will have orig as the common
ancestor and some will have second).

Does this happen in real life? It seems like sane development processes
wouldn't have multiple mainline-candidate patch sets including the same
patches, if for no other reason than that, should the merge fail, nobody
with any clue about the original patches would be anywhere nearby. It
seems better to throw something back to someone to rebase their diffs.

Otherwise, the problem seems to boil down to finding the common ancestor
well, getting trees instead of files to merge, and improving merge until
it handles all of the tractable cases.
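
A tree-level version of the plain 3-way rule quoted above might look like the sketch below. It is not taken from git: it assumes each tree is a dict mapping a path to some content identifier, it knows nothing about renames, and the function name `merge_trees` is hypothetical.

```python
def merge_trees(base, ours, theirs):
    """Resolve every path that changed on at most one side.

    Returns (resolved, needs_file_merge): resolved maps path to the
    chosen content id, and needs_file_merge lists the paths that both
    sides changed differently and that a per-file "merge" must handle.
    """
    resolved = {}
    needs_file_merge = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            pick = o                    # untouched, or identical change
        elif o == b:
            pick = t                    # modified only in "theirs"
        elif t == b:
            pick = o                    # modified only in "ours"
        else:
            needs_file_merge.append(path)
            continue
        if pick is not None:            # None: the path was deleted
            resolved[path] = pick
    return resolved, needs_file_merge


base = {"Makefile": "m0", "kernel/sched.c": "s0",
        "drivers/ide/ide.c": "i0", "fs/devfs/base.c": "d0"}
ours = {"Makefile": "m0", "kernel/sched.c": "s1",
        "drivers/ide/ide.c": "i1"}                    # devfs file deleted
theirs = {"Makefile": "m0", "kernel/sched.c": "s0",
          "drivers/ide/ide.c": "i2", "fs/devfs/base.c": "d0",
          "Documentation/git.txt": "g1"}              # new file added

resolved, needs_file_merge = merge_trees(base, ours, theirs)
print(resolved)
print(needs_file_merge)
```

Only `drivers/ide/ide.c` is left for the file-level merge here; a renamed file would show up as one deletion plus one unrelated addition, which is the limitation discussed above.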

	-Daniel
*This .sig left intentionally blank*



* Re: New SCM and commit list
  2005-04-11 21:26       ` Linus Torvalds
@ 2005-04-11 21:31         ` James Bottomley
  2005-04-12  4:24           ` Arjan van de Ven
  2005-04-13 20:04         ` H. Peter Anvin
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2005-04-11 21:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Greg KH, Benjamin Herrenschmidt, Linux Kernel list

On Mon, 2005-04-11 at 14:26 -0700, Linus Torvalds wrote:
> I don't think kernel.org mirrors the private home directories, so if you
> do _temporary_ trees, just make them readable in your private home
> directory rather than in /pub/linux/kernel/people. For people with 
> kernel.org accounts, we can use that as the "bkbits.net" thing.

It's also going to be a slight problem for those of us who don't have a
kernel.org account...although I think the hosting I use on the parisc
website might actually be outside the HP firewall, so I can probably beg
for it to run any protocol you need (like rsync).

James




* Re: New SCM and commit list
  2005-04-11 20:53     ` Greg KH
@ 2005-04-11 21:26       ` Linus Torvalds
  2005-04-11 21:31         ` James Bottomley
  2005-04-13 20:04         ` H. Peter Anvin
  0 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2005-04-11 21:26 UTC (permalink / raw)
  To: Greg KH; +Cc: James Bottomley, Benjamin Herrenschmidt, Linux Kernel list



On Mon, 11 Apr 2005, Greg KH wrote:
> 
> I have a feeling that the kernel.org mirror system is just going to
> _love_ us using it to store temporary git trees :)

I don't think kernel.org mirrors the private home directories, so if you
do _temporary_ trees, just make them readable in your private home
directory rather than in /pub/linux/kernel/people. For people with 
kernel.org accounts, we can use that as the "bkbits.net" thing.

For really public hosting, we need to find some other approach. 

		Linus


* Re: New SCM and commit list
  2005-04-11  3:25   ` James Bottomley
@ 2005-04-11 20:53     ` Greg KH
  2005-04-11 21:26       ` Linus Torvalds
  0 siblings, 1 reply; 24+ messages in thread
From: Greg KH @ 2005-04-11 20:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: James Bottomley, Benjamin Herrenschmidt, Linux Kernel list

On Sun, Apr 10, 2005 at 10:25:22PM -0500, James Bottomley wrote:
> On Sun, 2005-04-10 at 16:26 -0700, Linus Torvalds wrote:
> > On Mon, 11 Apr 2005, Benjamin Herrenschmidt wrote:
> > > If yes, then I would appreciate if you could either keep the same list,
> > > or if you want to change the list name, keep the subscriber list so
> > > those of us who actually archive it don't miss anything ;)
> > 
> > I didn't even set up the list. I think it's Bottomley. I'm cc'ing him just 
> > so that he sees the message, but I don't actually expect him to do 
> > anything about it. I'm not even ready to start _testing_ real merges yet. 
> > But I hope that I can get non-conflicting merges done fairly soon, and 
> > maybe I can con James or Jeff or somebody to try out GIT then...
> 
> Not guilty.  If I remember correctly, the list was set up by the vger
> list maintainers (davem and company).  It was tied to a trigger in one
> of your trees (which I think Larry did).  It shouldn't be too difficult
> to add to git ... it just means traversing all the added patches on a
> merge and sending out mail.
> 
> I can try out your source control tools ... I have some rc fixes
> ready ... when you're ready to try out merges...

I have some rc fixes too, let us know when you are ready to accept them,
and what format you want them in.

I have a feeling that the kernel.org mirror system is just going to
_love_ us using it to store temporary git trees :)

thanks,

greg k-h


* Re: New SCM and commit list
  2005-04-11 12:51         ` Chris Mason
@ 2005-04-11 19:32           ` Chris Mason
  0 siblings, 0 replies; 24+ messages in thread
From: Chris Mason @ 2005-04-11 19:32 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Jeff Garzik, Benjamin Herrenschmidt,
	Linux Kernel list, James Bottomley, David Woodhouse

On Monday 11 April 2005 08:51, Chris Mason wrote:

> rej -M skips the merge program, so rej -a -M will give you something like
> this:
>
> coffee:/local/linux.p # rej -a -M drivers/ide/ide.c.rej
>         drivers/ide/ide.c: 1 matched, 0 conflicts remain
>
> But I would want to go over the bit that calculates the conflicts remaining
> more carefully if people plan on trusting this ;) 

Ok,  looks like this should be safe.  I changed -q to skip the gui compare 
when rej thinks it has resolved all the conflicts correctly.  With rej 0.14 
(just uploaded now) this should do what you want:

rej -q -a foo.rej 

Download site is here: ftp://ftp.suse.com/pub/people/mason/rej/

Please let me know if you find patches where rej is doing the wrong thing.

-chris


* Re: New SCM and commit list
@ 2005-04-11 18:18 Adam J. Richter
  0 siblings, 0 replies; 24+ messages in thread
From: Adam J. Richter @ 2005-04-11 18:18 UTC (permalink / raw)
  To: linux-kernel, torvalds

On 2005-04-11 Linus Torvalds wrote:
>Then the bad news: the merge algorithm is going to suck. It's going to be
>just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
>understanding of renames etc. I'll try to find the best parent to base the
>merge off of, although early testers may have to tell the piece of crud
>what the most recent common parent was.

	I've been surprised at how well it works to put each character on a
separate line, pipe the input into diff3 and then join the lines
back together.  For example, let's consider the case of
adding parameters to a function.  Here one version adds a parameter
before the existing parameter, and another version adds another parameter
after the existing parameter:

$ cat orig
call(bar);
$ cat ver1
call(foo,bar);
$ cat ver2
call(bar,baz);
$ charmerge ver1 orig ver2
call(foo,bar,baz);

	A more practically scaled application that I tried was with
another filter that I wrote that would automatically resolve certain
types of diff3 conflicts[1].  With that filter, I took the SCSI
FlashPoint driver, and made an edited version by piping it through GNU
indent, which not only reindents, but also splits and joins lines.
I made a second edited version by changing all 146 instances of
"SYNC" to "GROP" in the original.  It merged apparently successfully,
giving me a GNU indented version with all of the keyword changes.
This version of the resolution program dies if it hits a diff3
conflict of a type that it is not prepared to resolve.  I'll post
it once I've got it properly preserving the conflicts that it
doesn't try to fix.  In the meantime, here is an illustrative
script to get diff3 to do character-based merges, although it
gives garbage results if there are any conflicts.

[1] The type of conflict that was automatically resolved is as follows:

	variant1 = <prepended-new-text><original><appended-new-text>

	result --> <prepended-new-text><variant2><appended-new-text>

	...this is actually exactly the order one would want in the
case where <original> also occurs in variant2, but it was close
enough for this test.
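
The rule in footnote [1] can be written down as a small function. This is a sketch of the rule as stated, not the filter described in this message (which was not posted): the name `resolve_wrapped` is hypothetical, and using the first occurrence of the original text inside variant1 is an assumption.

```python
def resolve_wrapped(original, variant1, variant2):
    """Resolve one conflict hunk where variant1 only wraps the original.

    If variant1 is <prefix><original><suffix>, return
    <prefix><variant2><suffix>; otherwise return None so the caller
    can leave the conflict for a human.
    """
    if not original:
        return None                      # nothing to anchor on
    start = variant1.find(original)      # assumption: first occurrence
    if start < 0:
        return None                      # variant1 also changed the middle
    prefix = variant1[:start]
    suffix = variant1[start + len(original):]
    return prefix + variant2 + suffix


hunk = resolve_wrapped(
    "SYNC_RATE = 5;",
    "/* tuned */ SYNC_RATE = 5; /* end */",
    "GROP_RATE = 5;",
)
print(hunk)
```

Returning `None` instead of raising keeps any conflict the rule does not cover available for manual resolution.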

                    __     ______________ 
Adam J. Richter        \ /
adam@yggdrasil.com      | g g d r a s i l



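#!/bin/sh
# Usage: charmerge ver1_file orig_file ver2_file
# Character-based 3-way merge: put every character on its own line,
# run diff3 -m, then join the lines back together.
# The output is garbage if diff3 reports any conflicts.

# One character per line.  "." never matches the newline, so the empty
# line left after each input line stands for the original newline.
lineify() {
	sed 's/./&\
/g'
}

# Join the characters again; an empty line becomes a newline.
unlineify() {
	awk '/^$/ {print $0} /^..*/ { printf "%s", $0}'
}

# Private temporary directory, removed on exit.
tmpdir=$(mktemp -d "${TMPDIR:-/tmp}/charmerge.XXXXXX") || exit 1
trap 'rm -rf "$tmpdir"' EXIT

lineify < "$1" > "$tmpdir/1"
lineify < "$2" > "$tmpdir/2"
lineify < "$3" > "$tmpdir/3"
diff3 -m "$tmpdir/1" "$tmpdir/2" "$tmpdir/3" | unlineify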
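
For comparison, the same character-level idea can be sketched without diff3. This is a swapped-in technique, not the script above: it finds the stretches of the original that survive in both versions with Python's `difflib.SequenceMatcher` and resolves each gap with the usual 3-way rule. The names `merge3`, `charmerge` and `MergeConflict` are made up for the example, and a conflict raises an exception instead of producing markers.

```python
from difflib import SequenceMatcher


class MergeConflict(Exception):
    pass


def _sync_regions(base, a, b):
    """Stretches of base that are unchanged in both a and b."""
    ma = SequenceMatcher(None, base, a, autojunk=False).get_matching_blocks()
    mb = SequenceMatcher(None, base, b, autojunk=False).get_matching_blocks()
    regions = []
    i = j = 0
    while i < len(ma) and j < len(mb):
        a_base, a_pos, a_len = ma[i]
        b_base, b_pos, b_len = mb[j]
        lo = max(a_base, b_base)
        hi = min(a_base + a_len, b_base + b_len)
        if lo < hi:                      # overlap of the two matches
            regions.append((lo, hi,
                            a_pos + lo - a_base, a_pos + hi - a_base,
                            b_pos + lo - b_base, b_pos + hi - b_base))
        if a_base + a_len < b_base + b_len:
            i += 1                       # advance whichever ends first
        else:
            j += 1
    # Sentinel so the text after the last common stretch is handled too.
    regions.append((len(base), len(base), len(a), len(a), len(b), len(b)))
    return regions


def merge3(base, a, b):
    """Three-way merge of sequences; raises MergeConflict on overlap."""
    out = []
    pz = pa = pb = 0
    for z_lo, z_hi, a_lo, a_hi, b_lo, b_hi in _sync_regions(base, a, b):
        base_gap, a_gap, b_gap = base[pz:z_lo], a[pa:a_lo], b[pb:b_lo]
        if a_gap == b_gap or b_gap == base_gap:
            out.extend(a_gap)            # same change, or only a changed
        elif a_gap == base_gap:
            out.extend(b_gap)            # only b changed
        else:
            raise MergeConflict("both versions changed %r" % (base_gap,))
        out.extend(base[z_lo:z_hi])
        pz, pa, pb = z_hi, a_hi, b_hi
    return out


def charmerge(ver1, orig, ver2):
    """Same argument order as the shell script: ver1, orig, ver2."""
    return "".join(merge3(list(orig), list(ver1), list(ver2)))


print(charmerge("call(foo,bar);", "call(bar);", "call(bar,baz);"))
```

On the three inputs from the example earlier in this message it reproduces the merged call. SequenceMatcher is quadratic in the worst case, so a whole driver as one character sequence may be slow.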


* Re: New SCM and commit list
  2005-04-11  7:38       ` Ingo Molnar
@ 2005-04-11 12:51         ` Chris Mason
  2005-04-11 19:32           ` Chris Mason
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Mason @ 2005-04-11 12:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Jeff Garzik, Benjamin Herrenschmidt,
	Linux Kernel list, James Bottomley, David Woodhouse

On Monday 11 April 2005 03:38, Ingo Molnar wrote:
> * Linus Torvalds <torvalds@osdl.org> wrote:
> > So anything that got modified in just one tree obviously merges to
> > that version. Any file that got modified in two trees will end up just
> > being passed to the "merge" program. See "man merge" and "man diff3".
> > The merger gets to fix up any conflicts by hand.
>
> at that point Chris Mason's "rej" tool is pretty nifty:
>
>   ftp://ftp.suse.com/pub/people/mason/rej/rej-0.13.tar.gz
>
> (There is no fully automatic mode where it would not bother the user
> with the really trivial rejects - but it has an automatic mode where you
> basically have to do nothing - maybe a fully automatic one could be
> added that would resolve low-risk rejects?)
>

rej -M skips the merge program, so rej -a -M will give you something like 
this:

coffee:/local/linux.p # rej -a -M drivers/ide/ide.c.rej
        drivers/ide/ide.c: 1 matched, 0 conflicts remain

But I would want to go over the bit that calculates the conflicts remaining 
more carefully if people plan on trusting this ;)  It'll run on unified diffs 
too, although it will be slower than patch since the assumption is the quick 
and easy placement patch does has already failed.  (that's easy enough to fix 
though).

> it's really easy to use (but then again i'm a vim user, so i'm biased),
> just try it on a random .rej file you have ("rej -a kernel/sched.c.rej"
> or whatever).

You can use rej -m kdiff3|meld|tkdiff, or any program that does a side-by-side 
comparison of two files (export REJMERGE=foo sets the diff program as well).

I use rej frequently to merge patches in here, but that is mostly because 
there is no easy way to get the common ancestor and parent revision of the 
patches I'm merging.

With that info in hand, kdiff3 is pretty nice.  You would have to spoon feed 
it the renames, but it should have most of the other features you're looking 
for, including the 'no gui if all conflicts are auto-solvable'

-chris

* Re: New SCM and commit list
  2005-04-11  6:15     ` Linus Torvalds
  2005-04-11  6:40       ` Ryan Anderson
  2005-04-11  6:47       ` Geert Uytterhoeven
@ 2005-04-11  7:38       ` Ingo Molnar
  2005-04-11 12:51         ` Chris Mason
  2005-04-11 22:50       ` Daniel Barkalow
  2005-04-12  9:52       ` Catalin Marinas
  4 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2005-04-11  7:38 UTC
  To: Linus Torvalds
  Cc: Jeff Garzik, Benjamin Herrenschmidt, Linux Kernel list,
	James Bottomley, David Woodhouse, Chris Mason


* Linus Torvalds <torvalds@osdl.org> wrote:

> Then the bad news: the merge algorithm is going to suck. It's going to 
> be just plain 3-way merge, the same RCS/CVS thing you've seen before.  
> With no understanding of renames etc. I'll try to find the best parent 
> to base the merge off of, although early testers may have to tell the 
> piece of crud what the most recent common parent was.
> 
> So anything that got modified in just one tree obviously merges to 
> that version. Any file that got modified in two trees will end up just 
> being passed to the "merge" program. See "man merge" and "man diff3".  
> The merger gets to fix up any conflicts by hand.

at that point Chris Mason's "rej" tool is pretty nifty:

  ftp://ftp.suse.com/pub/people/mason/rej/rej-0.13.tar.gz

it gets the trivial rejects right, and makes it quick to cycle through 
the nontrivial ones too. It also shows the old and new code side by 
side.

(There is no fully automatic mode where it would not bother the user 
with the really trivial rejects - but it has an automatic mode where you 
basically have to do nothing - maybe a fully automatic one could be 
added that would resolve low-risk rejects?)

it's really easy to use (but then again i'm a vim user, so i'm biased), 
just try it on a random .rej file you have ("rej -a kernel/sched.c.rej" 
or whatever).

	Ingo

* Re: New SCM and commit list
  2005-04-10 23:10 Benjamin Herrenschmidt
  2005-04-10 23:26 ` Linus Torvalds
@ 2005-04-11  7:13 ` David Woodhouse
  1 sibling, 0 replies; 24+ messages in thread
From: David Woodhouse @ 2005-04-11  7:13 UTC
  To: Benjamin Herrenschmidt; +Cc: Linus Torvalds, Linux Kernel list

On Mon, 2005-04-11 at 09:10 +1000, Benjamin Herrenschmidt wrote:
> Do you intend to continue posting "committed" patches to a mailing list
> like bk scripts did to bk-commits-head@vger ? As I said a while ago, I
> find this very useful, especially with the actual patch included in the
> commit message (which isn't the case with most other projects' CVS commit
> lists, and I find that annoying).
> 
> If yes, then I would appreciate if you could either keep the same list,
> or if you want to change the list name, keep the subscriber list so
> those of us who actually archive it don't miss anything ;)

The commits lists currently only accept posts from dwmw2@hera, I
believe. That can relatively easily be changed if the mail is going to
come from somewhere else.

I did ask Linus to let me know as soon as possible when he starts to
commit patches, so we can come up with a way to keep the list fed. Since
he thinks I'm James, however, I suspect that part of the message didn't
get through. Perhaps he was just distracted by the Britishness?

-- 
dwmw2



* Re: New SCM and commit list
  2005-04-11  6:15     ` Linus Torvalds
  2005-04-11  6:40       ` Ryan Anderson
@ 2005-04-11  6:47       ` Geert Uytterhoeven
  2005-04-11  7:38       ` Ingo Molnar
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Geert Uytterhoeven @ 2005-04-11  6:47 UTC
  To: Linus Torvalds
  Cc: Jeff Garzik, Benjamin Herrenschmidt, Linux Kernel list,
	James Bottomley, David Woodhouse

On Sun, 10 Apr 2005, Linus Torvalds wrote:
> Then the bad news: the merge algorithm is going to suck. It's going to be
> just plain 3-way merge, the same RCS/CVS thing you've seen before. With no

Actually 3-way merge is not that bad. It's definitely better than ClearCase's
merge (I always fall back to RCS merge if ClearCase cannot resolve a merge
automatically).

> understanding of renames etc. I'll try to find the best parent to base the
> merge off of, although early testers may have to tell the piece of crud
> what the most recent common parent was.

Yep, finding the best parent is the important part :-)

I guess 3-way merge got a bad name because CVS always uses the branch point as
the parent, which fails miserably for any but the first merge after the branch.
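
To make the point about parents concrete, here is a rough shell sketch, not
taken from any existing tool. The toy history, the commit names (orig, t1..t3,
b1..b3) and the helper names parents_of, ancestors and best_parent are all
made up for the example. Trunk commit t2 is assumed to be an earlier merge of
branch commit b1. Picking the first shared ancestor in a breadth-first walk is
a simplification that works for this history; a real tool would need to handle
more tangled graphs.

```shell
#!/bin/sh
# Toy history (hypothetical): orig -> t1 -> t2 -> t3 on the trunk,
# orig -> b1 -> b2 -> b3 on the branch, and t2 is a merge of t1 and b1.
parents_of() {
    case $1 in
        t1|b1) echo orig ;;
        t2)    echo "t1 b1" ;;
        t3)    echo t2 ;;
        b2)    echo b1 ;;
        b3)    echo b2 ;;
    esac
}

# Print a commit and all of its ancestors, nearest first (breadth-first).
ancestors() {
    queue=$1 seen=
    while [ -n "$queue" ]; do
        set -- $queue
        cur=$1; shift
        queue="$*"
        case " $seen " in *" $cur "*) continue ;; esac
        seen="$seen $cur"
        echo "$cur"
        p=$(parents_of "$cur")
        queue="${queue:+$queue }$p"
    done
}

# Print the nearest ancestor of $1 that is also an ancestor of $2.
best_parent() {
    theirs=" $(ancestors "$2" | tr '\n' ' ')"
    for c in $(ancestors "$1"); do
        case "$theirs" in *" $c "*) echo "$c"; return ;; esac
    done
}

best_parent t3 b3   # prints "b1"; the original branch point is "orig"
```

Merging b3 into t3 with orig as the parent would present the changes up to b1
a second time, even though the trunk already contains them. Using b1 as the
parent leaves only b1->b3 to merge.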

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

* Re: New SCM and commit list
  2005-04-11  6:15     ` Linus Torvalds
@ 2005-04-11  6:40       ` Ryan Anderson
  2005-04-11  6:47       ` Geert Uytterhoeven
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Ryan Anderson @ 2005-04-11  6:40 UTC
  To: Linus Torvalds
  Cc: Jeff Garzik, Benjamin Herrenschmidt, Linux Kernel list,
	James Bottomley, David Woodhouse

On Sun, Apr 10, 2005 at 11:15:20PM -0700, Linus Torvalds wrote:
> On Mon, 11 Apr 2005, Jeff Garzik wrote:
> > > But I hope that I can get non-conflicting merges done fairly soon, and 
> > > maybe I can con James or Jeff or somebody to try out GIT then...
> > 
> > I don't mind being a guinea pig as long as someone else does the hard 
> > work of finding a new way to merge :)
> 
> So I can tell you what merges are going to be like, just to prepare you.
> 
> First, the good news: I think we can make the workflow look like bk, ie
> pretty much like "git pull" and "git push".  And for well-behaved stuff
> (ie minimal changes to the same files on both sides) it will even be fast.  
> I think.

If you can stick something meaningful in a simple text file, overwritten
after each merge completes, similar to the BitKeeper/csets-in file, it
should be trivial to write a wrapper for the basic merge tool that calls
a trigger after each merge and uses csets-in to generate diffs and email
them out.
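
A minimal sketch of what such a wrapper step might look like, assuming the
merge leaves behind a text file with one changeset id per line. The names
format_commit_mails and show_patch are invented for the example, and
show_patch is only a placeholder for whatever command would print the real
patch. Actually sending the mail is left out.

```shell
#!/bin/sh
# Placeholder (hypothetical): stands in for the command that would print
# the diff for one changeset.
show_patch() {
    echo "(patch for $1 would go here)"
}

# Read changeset ids, one per line, from the file named in $1 and print
# one mail-style message per changeset on standard output.
format_commit_mails() {
    while read -r cset; do
        [ -n "$cset" ] || continue        # skip blank lines
        printf 'Subject: [commit] %s\n\n' "$cset"
        show_patch "$cset"
        printf '\n'
    done < "$1"
}
```

A trigger run after each merge could call format_commit_mails on the list
file and hand each message to the local mailer.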

-- 

Ryan Anderson
  sometimes Pug Majere

* Re: New SCM and commit list
  2005-04-11  5:53   ` Jeff Garzik
@ 2005-04-11  6:15     ` Linus Torvalds
  2005-04-11  6:40       ` Ryan Anderson
                         ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Linus Torvalds @ 2005-04-11  6:15 UTC
  To: Jeff Garzik
  Cc: Benjamin Herrenschmidt, Linux Kernel list, James Bottomley,
	David Woodhouse



On Mon, 11 Apr 2005, Jeff Garzik wrote:
> 
> > But I hope that I can get non-conflicting merges done fairly soon, and 
> > maybe I can con James or Jeff or somebody to try out GIT then...
> 
> I don't mind being a guinea pig as long as someone else does the hard 
> work of finding a new way to merge :)

So I can tell you what merges are going to be like, just to prepare you.

First, the good news: I think we can make the workflow look like bk, ie
pretty much like "git pull" and "git push".  And for well-behaved stuff
(ie minimal changes to the same files on both sides) it will even be fast.  
I think.

Then the bad news: the merge algorithm is going to suck. It's going to be
just plain 3-way merge, the same RCS/CVS thing you've seen before. With no
understanding of renames etc. I'll try to find the best parent to base the
merge off of, although early testers may have to tell the piece of crud
what the most recent common parent was.

So anything that got modified in just one tree obviously merges to that 
version. Any file that got modified in two trees will end up just being 
passed to the "merge" program. See "man merge" and "man diff3". The merger 
gets to fix up any conflicts by hand.
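
The per-file rule described above can be sketched in shell roughly as follows.
This is an illustration only, not code from GIT. The function name
merge_action and the idea of comparing three content identifiers (hashes, for
example) for the common parent and the two trees are assumptions made for the
example.

```shell
#!/bin/sh
# Hypothetical helper: decide what to do with one file, given content
# identifiers for the common parent ($1), our tree ($2) and their tree ($3).
merge_action() {
    base=$1 ours=$2 theirs=$3
    if [ "$ours" = "$theirs" ]; then
        echo "keep"          # both sides agree, or neither changed the file
    elif [ "$base" = "$ours" ]; then
        echo "take-theirs"   # only their tree modified the file
    elif [ "$base" = "$theirs" ]; then
        echo "take-ours"     # only our tree modified the file
    else
        echo "run-merge"     # modified on both sides: hand off to merge/diff3
    fi
}

merge_action aaa aaa bbb   # prints "take-theirs"
merge_action aaa bbb ccc   # prints "run-merge"
```

Only the last case needs a file-level three-way merge, and that is where
conflicts that need fixing up by hand can appear.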

Quite frankly, that means that we really want to avoid any "exciting" 
merges with GIT. Maybe somebody can come up with something smarter. 
Eventually. Don't count on it, at least not in the near future.

The good news is that it's not like a three-way file merge is any worse
than many people are used to. The bad news is that BK is just a hell of a
lot better. So anybody who has been depending heavily on BK merges (and
hey, the beauty of them is that you often don't even _know_ that you are
depending on them) will be a bit bummed by the "Welcome back to the
1980's" message from a three-way merge.

		Linus

* Re: New SCM and commit list
  2005-04-10 23:26 ` Linus Torvalds
  2005-04-11  3:25   ` James Bottomley
@ 2005-04-11  5:53   ` Jeff Garzik
  2005-04-11  6:15     ` Linus Torvalds
  1 sibling, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2005-04-11  5:53 UTC
  To: Linus Torvalds
  Cc: Benjamin Herrenschmidt, Linux Kernel list, James Bottomley,
	David Woodhouse

Linus Torvalds wrote:
> On Mon, 11 Apr 2005, Benjamin Herrenschmidt wrote:
>>If yes, then I would appreciate if you could either keep the same list,
>>or if you want to change the list name, keep the subscriber list so
>>those of us who actually archive it don't miss anything ;)
> 
> 
> I didn't even set up the list. I think it's Bottomley. I'm cc'ing him just 
> so that he sees the message, but I don't actually expect him to do 
> anything about it. I'm not even ready to start _testing_ real merges yet. 

When you think kernel.org and BitKeeper, think either me or David 
Woodhouse.  :)

DaveM / Matti(?) manage the lists (postmaster@vger.kernel.org), but 
largely just create them on request from others, and make sure they 
continue to work.


> But I hope that I can get non-conflicting merges done fairly soon, and 
> maybe I can con James or Jeff or somebody to try out GIT then...

I don't mind being a guinea pig as long as someone else does the hard 
work of finding a new way to merge :)

	Jeff



* Re: New SCM and commit list
  2005-04-10 23:26 ` Linus Torvalds
@ 2005-04-11  3:25   ` James Bottomley
  2005-04-11 20:53     ` Greg KH
  2005-04-11  5:53   ` Jeff Garzik
  1 sibling, 1 reply; 24+ messages in thread
From: James Bottomley @ 2005-04-11  3:25 UTC
  To: Linus Torvalds; +Cc: Benjamin Herrenschmidt, Linux Kernel list

On Sun, 2005-04-10 at 16:26 -0700, Linus Torvalds wrote:
> On Mon, 11 Apr 2005, Benjamin Herrenschmidt wrote:
> > If yes, then I would appreciate if you could either keep the same list,
> > or if you want to change the list name, keep the subscriber list so
> > those of us who actually archive it don't miss anything ;)
> 
> I didn't even set up the list. I think it's Bottomley. I'm cc'ing him just 
> so that he sees the message, but I don't actually expect him to do 
> anything about it. I'm not even ready to start _testing_ real merges yet. 
> But I hope that I can get non-conflicting merges done fairly soon, and 
> maybe I can con James or Jeff or somebody to try out GIT then...

Not guilty.  If I remember correctly, the list was set up by the vger
list maintainers (davem and company).  It was tied to a trigger in one
of your trees (which I think Larry did).  It shouldn't be too difficult
to add to git ... it just means traversing all the added patches on a
merge and sending out mail.

I can try out your source control tools ... I have some rc fixes
ready ... when you're ready to try out merges...

James



* Re: New SCM and commit list
  2005-04-10 23:10 Benjamin Herrenschmidt
@ 2005-04-10 23:26 ` Linus Torvalds
  2005-04-11  3:25   ` James Bottomley
  2005-04-11  5:53   ` Jeff Garzik
  2005-04-11  7:13 ` David Woodhouse
  1 sibling, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2005-04-10 23:26 UTC
  To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, James Bottomley



On Mon, 11 Apr 2005, Benjamin Herrenschmidt wrote:
> 
> Do you intend to continue posting "committed" patches to a mailing list
> like bk scripts did to bk-commits-head@vger ? As I said a while ago, I
> find this very useful, especially with the actual patch included in the
> commit message (which isn't the case with most other projects' CVS commit
> lists, and I find that annoying).

Absolutely. GIT isn't quite at the point where I can start using it yet,
though.

I could actually start committing patches, but I want to make sure that I
can also do automated simple merges, so that there is any _point_ to doing
this in the first place. My plan is to not be very good at merging (in 
particular, I don't see GIT resolving renames _at_all_), but my hope is 
that the people who used to merge with me using BK might be able to still 
do so using GIT, as long as we try actively to be very careful.

> If yes, then I would appreciate if you could either keep the same list,
> or if you want to change the list name, keep the subscriber list so
> those of us who actually archive it don't miss anything ;)

I didn't even set up the list. I think it's Bottomley. I'm cc'ing him just 
so that he sees the message, but I don't actually expect him to do 
anything about it. I'm not even ready to start _testing_ real merges yet. 
But I hope that I can get non-conflicting merges done fairly soon, and 
maybe I can con James or Jeff or somebody to try out GIT then...

			Linus

* New SCM and commit list
@ 2005-04-10 23:10 Benjamin Herrenschmidt
  2005-04-10 23:26 ` Linus Torvalds
  2005-04-11  7:13 ` David Woodhouse
  0 siblings, 2 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-10 23:10 UTC
  To: Linus Torvalds; +Cc: Linux Kernel list

Hi Linus !

Do you intend to continue posting "committed" patches to a mailing list
like bk scripts did to bk-commits-head@vger ? As I said a while ago, I
find this very useful, especially with the actual patch included in the
commit message (which isn't the case with most other projects' CVS commit
lists, and I find that annoying).

If yes, then I would appreciate if you could either keep the same list,
or if you want to change the list name, keep the subscriber list so
those of us who actually archive it don't miss anything ;)

Thanks !

Regards,
Ben.



Thread overview: 24+ messages
2005-04-12  3:02 New SCM and commit list Adam J. Richter
2005-04-12 21:54 ` Daniel Barkalow
  -- strict thread matches above, loose matches on Subject: below --
2005-04-11 18:18 Adam J. Richter
2005-04-10 23:10 Benjamin Herrenschmidt
2005-04-10 23:26 ` Linus Torvalds
2005-04-11  3:25   ` James Bottomley
2005-04-11 20:53     ` Greg KH
2005-04-11 21:26       ` Linus Torvalds
2005-04-11 21:31         ` James Bottomley
2005-04-12  4:24           ` Arjan van de Ven
2005-04-13 20:04         ` H. Peter Anvin
2005-04-11  5:53   ` Jeff Garzik
2005-04-11  6:15     ` Linus Torvalds
2005-04-11  6:40       ` Ryan Anderson
2005-04-11  6:47       ` Geert Uytterhoeven
2005-04-11  7:38       ` Ingo Molnar
2005-04-11 12:51         ` Chris Mason
2005-04-11 19:32           ` Chris Mason
2005-04-11 22:50       ` Daniel Barkalow
2005-04-12  8:36         ` Geert Uytterhoeven
2005-04-12  9:52       ` Catalin Marinas
2005-04-16  8:35         ` Paul Jackson
2005-04-18  8:18           ` Catalin Marinas
2005-04-11  7:13 ` David Woodhouse
