* Assessing about commit order in upstream Linux
@ 2020-05-26  6:53 Eugeniu Rosca
  2020-05-26 15:21 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Eugeniu Rosca @ 2020-05-26  6:53 UTC
  To: git
  Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca, Eugeniu Rosca

Dear Git community,

Determining the correct order of upstream commits is essential
during the backporting process, since we aim to ensure that the
backporting result can be built and bisected at each commit.

However, there appear to be at least two ways to compute the
relative order of "mainline" commits, specifically based on the:

 * index/position of commit summary line in the output of
   'git log --oneline --topo-order upstream/master'

 * 'git describe --contains --match="v*" <SHA1>' of each commit

I had considered both approaches equivalent until I ran into commits
[A] and [B].

Judging by the index in the 'git log' output, commit [B] seems to
(topologically) come first and hence would need to be backported first:

$ git log --reverse --oneline --topo-order v4.14..v4.15 | grep -n "mm: slabinfo: remove CONFIG_SLABINFO" | cut -f1 -d:
7261
$ git log --reverse --oneline --topo-order v4.14..v4.15 | grep -n "RDMA/umem: Avoid partial declaration of non-static function" | cut -f1 -d:
7029

Judging by the version returned by 'git describe --contains', commit [A]
seems to (topologically) come first due to '~93' putting it (mentally)
"earlier" in the topological graph compared to '~73':

$ git describe --contains --match="v*" 5b36577109be
  v4.15-rc1~93^2~117
$ git describe --contains --match="v*" fec99ededf6b
  v4.15-rc1~73^2~56

So, the two approaches lead to different results. If you see any false
assumption or mistaken belief, could you please pinpoint that? TIA.

[A] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5b36577109be
  ("mm: slabinfo: remove CONFIG_SLABINFO")
[B] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fec99ededf6b
  ("RDMA/umem: Avoid partial declaration of non-static function")

-- 
Best regards,
Eugeniu Rosca


* Re: Assessing about commit order in upstream Linux
  2020-05-26  6:53 Assessing about commit order in upstream Linux Eugeniu Rosca
@ 2020-05-26 15:21 ` Junio C Hamano
  2020-05-26 17:14   ` Michal Suchánek
  2020-05-28 17:59   ` Eugeniu Rosca
  0 siblings, 2 replies; 6+ messages in thread
From: Junio C Hamano @ 2020-05-26 15:21 UTC
  To: Eugeniu Rosca
  Cc: git, Jeff King, Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca

Eugeniu Rosca <erosca@de.adit-jv.com> writes:

> So, the two approaches lead to different results. If you see any false
> assumption or mistaken belief, could you please pinpoint that? TIA.

Perhaps the assumption/belief that the set of commits in a history
can be totally ordered is the issue?  When multiple people work
together on a project, especially in a project where "pull --no-ff"
is not enforced, there can exist only partial order among them?
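
A minimal sketch of how one might test whether two commits are ordered
at all, using 'git merge-base --is-ancestor'. The relation() helper,
the throwaway repository and the tag names are illustrative assumptions
and not part of the original report; in a kernel clone the equivalent
check would be 'relation 5b36577109be fec99ededf6b' (not run here).

```shell
#!/bin/sh
# Sketch: classify two commits as ordered or unordered by ancestry.
set -e

# relation <x> <y>: print how x and y relate in the commit graph.
# (Hypothetical helper; only 'git merge-base --is-ancestor' is real.)
relation() {
    if git merge-base --is-ancestor "$1" "$2"; then
        echo "$1 precedes $2"
    elif git merge-base --is-ancestor "$2" "$1"; then
        echo "$2 precedes $1"
    else
        echo "unordered"
    fi
}

# Throwaway repository: 'left' and 'right' both fork from 'base'.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m base;  git tag base
git commit -q --allow-empty -m left;  git tag left
git checkout -q -b side base
git commit -q --allow-empty -m right; git tag right

r1=$(relation base left)    # one is an ancestor of the other
r2=$(relation left right)   # neither is an ancestor of the other
echo "$r1"
echo "$r2"
```

When the helper reports "unordered", any listing that puts one commit
before the other is making a choice that the history does not dictate.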



* Re: Assessing about commit order in upstream Linux
  2020-05-26 15:21 ` Junio C Hamano
@ 2020-05-26 17:14   ` Michal Suchánek
  2020-05-28 18:12     ` Eugeniu Rosca
  2020-05-28 17:59   ` Eugeniu Rosca
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Suchánek @ 2020-05-26 17:14 UTC
  To: Junio C Hamano
  Cc: Eugeniu Rosca, git, Jeff King, Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca

On Tue, May 26, 2020 at 08:21:25AM -0700, Junio C Hamano wrote:
> Eugeniu Rosca <erosca@de.adit-jv.com> writes:
> 
> > So, the two approaches lead to different results. If you see any false
> > assumption or mistaken belief, could you please pinpoint that? TIA.
> 
> Perhaps the assumption/belief that the set of commits in a history
> can be totally ordered is the issue?  When multiple people work
> together on a project, especially in a project where "pull --no-ff"
> is not enforced, there can exist only partial order among them?
> 
As in if you have history with two branches

   D
  / \
 B   C
  \ /
   A

commits B and C are not comparable. They are both between A and D but
the order of B and C is arbitrary. Different renderings of the history
may choose different order of B and C. This is a simple example. Linux
history is a spaghetti of tens of branches.
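
To make the point concrete, here is a sketch that rebuilds the diamond
above in a throwaway repository and lists it twice. The timestamps are
chosen on purpose (an assumption of this example) so that B has the
later author date and C the later committer date; with_dates() is a
hypothetical helper.

```shell
#!/bin/sh
# Sketch: two valid listings of the same A/B/C/D diamond.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# with_dates <author-ts> <committer-ts> <git args...>
# Runs git with fixed author and committer timestamps.
with_dates() {
    a=$1 c=$2; shift 2
    GIT_AUTHOR_DATE="$a +0000" GIT_COMMITTER_DATE="$c +0000" git "$@"
}

with_dates 1500000000 1500000000 commit -q --allow-empty -m A; git tag A
with_dates 1500000300 1500000100 commit -q --allow-empty -m B; git tag B
git checkout -q -b side A
with_dates 1500000200 1500000200 commit -q --allow-empty -m C; git tag C
git checkout -q trunk
with_dates 1500000400 1500000400 merge -q --no-ff -m D side; git tag D

# Both options keep children before parents; they differ only in how
# they break the tie between the incomparable commits B and C.
by_commit_date=$(git log --date-order --format=%s D | xargs)
by_author_date=$(git log --author-date-order --format=%s D | xargs)
echo "date-order:        $by_commit_date"
echo "author-date-order: $by_author_date"
```

With these dates the two listings should disagree about B and C while
agreeing that D comes first and A last, which is all the graph itself
guarantees.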

Thanks

Michal


* Re: Assessing about commit order in upstream Linux
  2020-05-26 15:21 ` Junio C Hamano
  2020-05-26 17:14   ` Michal Suchánek
@ 2020-05-28 17:59   ` Eugeniu Rosca
  1 sibling, 0 replies; 6+ messages in thread
From: Eugeniu Rosca @ 2020-05-28 17:59 UTC
  To: Junio C Hamano
  Cc: Eugeniu Rosca, git, Jeff King, Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca

Hi Junio,

On Tue, May 26, 2020 at 08:21:25AM -0700, Junio C Hamano wrote:
> Eugeniu Rosca <erosca@de.adit-jv.com> writes:
> 
> > So, the two approaches lead to different results. If you see any false
> > assumption or mistaken belief, could you please pinpoint that? TIA.
> 
> Perhaps the assumption/belief that the set of commits in a history
> can be totally ordered is the issue?  When multiple people work
> together on a project, especially in a project where "pull --no-ff"
> is not enforced, there can exist only partial order among them?
> 

IMHO it might be an issue in truly decentralized projects, for which we
can't define an upstream and a downstream. But is it an issue for Linux?

Here is a quick attempt to sketch how commits flow into linux/master,
every development cycle, again and again, respecting the same pattern.

       +-----------------o Linus
       |   +-------------o Maintainers
       |   |           +-o Contributors
       v   v           v

master o
       |   +-------------------+
       |   |                   |
   C(M)o---o Same story as "A" |
       |   |                   |
     B o   +-------------------+
       |
   A(M)o---o A^2 
       |   |   
   A~1 o   o A^2~1
       |   |   
   A~2 o   o A^2~2
           |
           o A^2~3(M)--o A^2~3^2
           |           |
           |           o A^2~3^2~1
           |           |
           |           o A^2~3^2~2
           |
           o A^2~4(M)--o A^2~4^2
           |           |
           o A^2~5     o A^2~4^2~1
                       |
                       o A^2~4^2~2

The order of these commits matters to me because:

 - Commits A^2~4^2~2 through A^2~4^2 likely originate from the same
   series, with a well defined topic/scope and inner sequence. It would
   be ideal to mirror this same order during backporting. Otherwise,
   the result of the port is questionable and the reviewing effort
   is high.

 - Likewise, commits A^2~3^2~2 through A^2~3^2 probably come from one
   single series. The reviewers would hugely appreciate it if these are
   not scattered during backporting, but are kept together (preferably
   in the exact same succession).

 - Any merge commit (marked with '(M)' above) might carry a conflict
   resolution in itself (aka 'evil merge') which might act as dependency
   to any of its children. So, cherry picking commits in no particular
   order may very likely introduce build and runtime failures, whose
   reasons may be difficult to spot in the downstream projects.
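
As a sketch of how such a series can be recovered from the merge that
brought it in: the commits reachable from the merge's second parent but
not from its first can be listed oldest first with 'M^1..M^2'. The
throwaway repository, the tag M and the patch-N subjects below are
assumptions made up for this example.

```shell
#!/bin/sh
# Sketch: list the series a merge brought in, in its original order.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m base
git checkout -q -b topic
for s in patch-1 patch-2 patch-3; do
    git commit -q --allow-empty -m "$s"
done
git checkout -q trunk
git commit -q --allow-empty -m mainline-work
git merge -q --no-ff -m "Merge topic" topic
git tag M

# M^1 is the mainline side, M^2 the merged topic branch.
series=$(git log --reverse --format=%s M^1..M^2 | xargs)
echo "$series"
```

This only recovers the order within one linear topic branch. It does
not say where that series belongs relative to other branches merged in
the same cycle.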

Having said that, I am curious: do these statements resonate with
anybody, based on personal experience (in Linux or other projects)?

-- 
Best regards,
Eugeniu Rosca


* Re: Assessing about commit order in upstream Linux
  2020-05-26 17:14   ` Michal Suchánek
@ 2020-05-28 18:12     ` Eugeniu Rosca
  2020-05-28 20:25       ` Michal Suchánek
  0 siblings, 1 reply; 6+ messages in thread
From: Eugeniu Rosca @ 2020-05-28 18:12 UTC
  To: Michal Suchánek
  Cc: Junio C Hamano, Eugeniu Rosca, git, Jeff King,
	Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca

Hi Michal,

On Tue, May 26, 2020 at 07:14:43PM +0200, Michal Suchánek wrote:
> On Tue, May 26, 2020 at 08:21:25AM -0700, Junio C Hamano wrote:
> > Eugeniu Rosca <erosca@de.adit-jv.com> writes:
> > 
> > > So, the two approaches lead to different results. If you see any false
> > > assumption or mistaken belief, could you please pinpoint that? TIA.
> > 
> > Perhaps the assumption/belief that the set of commits in a history
> > can be totally ordered is the issue?  When multiple people work
> > together on a project, especially in a project where "pull --no-ff"
> > is not enforced, there can exist only partial order among them?
> > 
> As in if you have history with two branches
> 
>    D
>   / \
>  B   C
>   \ /
>    A
> 
> commits B and C are not comparable. They are both between A and D but
> the order of B and C is arbitrary. Different renderings of the history
> may choose different order of B and C. This is a simple example. Linux
> history is a spaghetti of tens of branches.

While in theory 'B' and 'C' might look equivalent, IMHO in practice
there is a clear distinction between the two. It's commonly known that
Git refers to 'B' as the 'first parent' of 'D'. Git also provides means
to identify such first parents via 'git log --first-parent'.

A fun fact about first parents is that, unless Linus is on vacation
and hands over his responsibilities to GKH, you will be quite
confident that 'git log --first-parent linux/master' will list
stuff committed by Linus himself. That's why (I bet) in the minds
of people involved in Linux development, the diagram looks like:

    D
    | \
    B  C
    | /
    A

IMHO the fact that 'A' is the parent of 'C' (IOW 'C' has an appropriate
base version) is mostly important to achieve an effortless merge of 'C'
and later on loses its major significance. So, I would say that
(contents-wise) the diagram can be further reduced to:

    D
    | \
    B  D^2
    |
    A

Just visually, a sane backporting order looks like A, B and D^2 (A is assumed
non-merge and D is skipped, since cherry picking merges is not common).
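
A rough sketch of that heuristic, under the assumption that the
first-parent chain is the mainline: walk it oldest first, emit plain
commits as they are, and expand each merge into the commits of its
second parent. pick_order() is a hypothetical helper, the repository is
a throwaway reproduction of the diamond, and the range A..D excludes A
itself.

```shell
#!/bin/sh
# Sketch: candidate pick list derived from the first-parent chain.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m A; git tag A
git commit -q --allow-empty -m B; git tag B
git checkout -q -b side A
git commit -q --allow-empty -m C; git tag C
git checkout -q trunk
git merge -q --no-ff -m D side; git tag D

# pick_order <base> <tip>: print subjects of non-merge commits in
# base..tip, following first parents and expanding each merge.
pick_order() {
    git rev-list --reverse --first-parent "$1..$2" |
    while read -r c; do
        if git rev-parse -q --verify "$c^2" >/dev/null 2>&1; then
            # Merge: list what its second parent contributed.
            git log --reverse --no-merges --format=%s "$c^1..$c^2"
        else
            git log -1 --format=%s "$c"
        fi
    done
}

order=$(pick_order A D | xargs)
echo "$order"
```

The result is one consistent order, not the only valid one, and it
drops whatever conflict resolution the merge commit itself carried.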

I am quite sure people have thought about backporting techniques and
strategies long before I started to ask these questions. So, I am
still looking forward to seeing various experiences shared.

-- 
Best regards,
Eugeniu Rosca


* Re: Assessing about commit order in upstream Linux
  2020-05-28 18:12     ` Eugeniu Rosca
@ 2020-05-28 20:25       ` Michal Suchánek
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Suchánek @ 2020-05-28 20:25 UTC
  To: Eugeniu Rosca
  Cc: Junio C Hamano, git, Jeff King, Ævar Arnfjörð,
	SZEDER Gábor, Eugeniu Rosca

On Thu, May 28, 2020 at 08:12:26PM +0200, Eugeniu Rosca wrote:
> Hi Michal,
> 
> On Tue, May 26, 2020 at 07:14:43PM +0200, Michal Suchánek wrote:
> > On Tue, May 26, 2020 at 08:21:25AM -0700, Junio C Hamano wrote:
> > > Eugeniu Rosca <erosca@de.adit-jv.com> writes:
> > > 
> > > > So, the two approaches lead to different results. If you see any false
> > > > assumption or mistaken belief, could you please pinpoint that? TIA.
> > > 
> > > Perhaps the assumption/belief that the set of commits in a history
> > > can be totally ordered is the issue?  When multiple people work
> > > together on a project, especially in a project where "pull --no-ff"
> > > is not enforced, there can exist only partial order among them?
> > > 
> > As in if you have history with two branches
> > 
> >    D
> >   / \
> >  B   C
> >   \ /
> >    A
> > 
> > commits B and C are not comparable. They are both between A and D but
> > the order of B and C is arbitrary. Different renderings of the history
> > may choose different order of B and C. This is a simple example. Linux
> > history is a spaghetti of tens of branches.
> 
> While in theory 'B' and 'C' might look equivalent, IMHO in practice
> there is a clear distinction between the two. It's commonly known that
> Git refers to 'B' as the 'first parent' of 'D'. Git also provides means
> to identify such first parents via 'git log --first-parent'.
> 
> A fun fact about first parents is that, unless Linus is on vacation
> and hands over his responsibilities to GKH, you will be quite
> confident that 'git log --first-parent linux/master' will list
> stuff committed by Linus himself. That's why (I bet) in the minds
> of people involved in Linux development, the diagram looks like:
> 
>     D
>     | \
>     B  C
>     | /
>     A
And that's not the case. Commits B and C will typically come from
different subsystems, and are truly interchangeable. These subsystems,
in turn, will have a number of separate branches that are merged
together before they are submitted to Linus. Often a feature requires a
cross-merge between different subsystems, which further complicates the
history. Even if B is a commit authored by Linus, and you can infer
from that that it is on the master branch, it says nothing about the
order of B and C. They are still not comparable. And you may still need
to reconcile the changes in B and C in D: whichever order you choose
for backporting them, in the conflicting case you will need to reflect
D in both B and C.

HTH

Michal

