backports.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] docs: add backporting and conflict resolution document
@ 2023-08-24  9:23 Vegard Nossum
  2023-10-03 15:19 ` Jonathan Corbet
  0 siblings, 1 reply; 8+ messages in thread
From: Vegard Nossum @ 2023-08-24  9:23 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-doc, linux-kernel, stable, backports, Vegard Nossum,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev

This is a new document based on my 2022 blog post:

  https://blogs.oracle.com/linux/post/backporting-patches-using-git

Although this is aimed at stable contributors and distro maintainers,
it does also contain useful tips and tricks for anybody who needs to
resolve merge conflicts.

By adding this to the kernel as documentation we can more easily point
to it e.g. from stable emails about failed backports, as well as allow
the community to modify it over time if necessary.

I've added this under process/ since it also has
process/applying-patches.rst. Another interesting document is
maintainer/rebasing-and-merging.rst which maybe should eventually refer
to this one, but I'm leaving that as a future cleanup.

Thanks to Harshit Mogalapalli for helping with the original blog post
as well as this updated document and Bagas Sanjaya for providing
thoughtful feedback.

v2: fixed heading style, link style, placeholder style, other comments

Link: https://lore.kernel.org/linux-doc/20230303162553.17212-1-vegard.nossum@oracle.com/
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
 Documentation/process/backporting.rst | 514 ++++++++++++++++++++++++++
 Documentation/process/index.rst       |   1 +
 2 files changed, 515 insertions(+)
 create mode 100644 Documentation/process/backporting.rst

diff --git a/Documentation/process/backporting.rst b/Documentation/process/backporting.rst
new file mode 100644
index 0000000000000..7593980af9659
--- /dev/null
+++ b/Documentation/process/backporting.rst
@@ -0,0 +1,514 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Backporting and conflict resolution
+===================================
+
+:Author: Vegard Nossum <vegard.nossum@oracle.com>
+
+.. contents::
+    :local:
+    :depth: 3
+    :backlinks: none
+
+Introduction
+============
+
+Some developers may never really have to deal with backporting patches,
+merging branches, or resolving conflicts in their day-to-day work, so
+when a merge conflict does pop up, it can be daunting. Luckily,
+resolving conflicts is a skill like any other, and there are many useful
+techniques you can use to make the process smoother and increase your
+confidence in the result.
+
+This document aims to be a comprehensive, step-by-step guide to
+backporting and conflict resolution.
+
+Applying the patch to a tree
+============================
+
+Sometimes the patch you are backporting already exists as a git commit,
+in which case you just cherry-pick it directly using
+``git cherry-pick``. However, if the patch comes from an email, as it
+often does for the Linux kernel, you will need to apply it to a tree
+using ``git am``.
+
+If you've ever used ``git am``, you probably already know that it is
+quite picky about the patch applying perfectly to your source tree. In
+fact, you've probably had nightmares about ``.rej`` files and trying to
+edit the patch to make it apply.
+
+It is strongly recommended to instead find an appropriate base version
+where the patch applies cleanly and *then* cherry-pick it over to your
+destination tree, as this will make git output conflict markers and let
+you resolve conflicts with the help of git and any other conflict
+resolution tools you might prefer to use. For example, if you want to
+apply a patch that just arrived on LKML to an older stable kernel, you
+can apply it to the most recent mainline kernel and then cherry-pick it
+to your older stable branch.
+
+It's generally better to use the exact same base as the one the patch
+was generated from, but it doesn't really matter that much as long as it
+applies cleanly and isn't too far from the original base. The only
+problem with applying the patch to the "wrong" base is that it may pull
+in more unrelated changes in the context of the diff when cherry-picking
+it to the older branch.
+
+If you are using `b4`_. and you are applying the patch directly from an
+email, you can use ``b4 am`` with the options ``-g``/``--guess-base``
+and ``-3``/``--prep-3way`` to do some of this automatically (see the
+`b4 presentation`_ for more information). However, the rest of this
+article will assume that you are doing a plain ``git cherry-pick``.
+
+.. _b4: https://people.kernel.org/monsieuricon/introducing-b4-and-patch-attestation
+.. _b4 presentation: https://youtu.be/mF10hgVIx9o?t=2996
+
+Once you have the patch in git, you can go ahead and cherry-pick it into
+your source tree. Don't forget to cherry-pick with ``-x`` if you want a
+written record of where the patch came from!
+
+Note that if you are submiting a patch for stable, the format is
+slightly different; the first line after the subject line needs tobe
+either::
+
+    commit <upstream commit> upstream
+
+or::
+
+    [ Upstream commit <upstream commit> ]
+
+Resolving conflicts
+===================
+
+Uh-oh; the cherry-pick failed with a vaguely threatening message::
+
+    CONFLICT (content): Merge conflict
+
+What to do now?
+
+In general, conflicts appear when the context of the patch (i.e., the
+lines being changed and/or the lines surrounding the changes) doesn't
+match what's in the tree you are trying to apply the patch *to*.
+
+For backports, what likely happened was that the branch you are
+backporting from contains patches not in the branch you are backporting
+to. However, the reverse is also possible. In any case, the result is a
+conflict that needs to be resolved.
+
+If your attempted cherry-pick fails with a conflict, git automatically
+edits the files to include so-called conflict markers showing you where
+the conflict is and how the two branches have diverged. Resolving the
+conflict typically means editing the end result in such a way that it
+takes into account these other commits.
+
+Resolving the conflict can be done either by hand in a regular text
+editor or using a dedicated conflict resolution tool.
+
+Many people prefer to use their regular text editor and edit the
+conflict directly, as it may be easier to understand what you're doing
+and to control the final result. There are definitely pros and cons to
+each method, and sometimes there's value in using both.
+
+We will not cover using dedicated merge tools here beyond providing some
+pointers to various tools that you could use:
+
+-  `vimdiff/gvimdiff <https://linux.die.net/man/1/vimdiff>`__
+-  `KDiff3 <http://kdiff3.sourceforge.net/>`__
+-  `TortoiseMerge <https://tortoisesvn.net/TortoiseMerge.html>`__
+-  `Meld <https://meldmerge.org/help/>`__
+-  `P4Merge <https://www.perforce.com/products/helix-core-apps/merge-diff-tool-p4merge>`__
+-  `Beyond Compare <https://www.scootersoftware.com/>`__
+-  `IntelliJ <https://www.jetbrains.com/help/idea/resolve-conflicts.html>`__
+-  `VSCode <https://code.visualstudio.com/docs/editor/versioncontrol>`__
+
+To configure git to work with these, see ``git mergetool --help`` or
+the official `git-mergetool documentation`_.
+
+.. _git-mergetool documentation: https://git-scm.com/docs/git-mergetool
+
+Prerequisite patches
+--------------------
+
+Most conflicts happen because the branch you are backporting to is
+missing some patches compared to the branch you are backporting *from*.
+In the more general case (such as merging two independent branches),
+development could have happened on either branch, or the branches have
+simply diverged -- perhaps your older branch had some other backports
+applied to it that themselves needed conflict resolutions, causing a
+divergence.
+
+It's important to always identify the commit or commits that caused the
+conflict, as otherwise you cannot be confident in the correctness of
+your resolution. As an added bonus, especially if the patch is in an
+area you're not that famliar with, the changelogs of these commits will
+often give you the context to understand the code and potential problems
+or pitfalls with your conflict resolution.
+
+git log
+~~~~~~~
+
+A good first step is to look at ``git log`` for the file that has the
+conflict -- this is usually sufficient when there aren't a lot of
+patches to the file, but may get confusing if the file is big and
+frequently patched. You should run ``git log`` on the range of commits
+between your currently checked-out branch (``HEAD``) and the parent of
+the patch you are picking (``<commit>``), i.e.::
+
+    git log HEAD..<commit>^ -- <path>
+
+Even better, if you want to restrict this output to a single function
+(because that's where the conflict appears), you can use the following
+syntax::
+
+    git log -L:'\<function\>':<path> HEAD..<commit>^
+
+.. note::
+     The ``\<`` and ``\>`` around the function name ensure that the
+     matches are anchored on a word boundary. This is important, as this
+     part is actually a regex and git only follows the first match, so
+     if you use ``-L:thread_stack:kernel/fork.c`` it may only give you
+     results for the function ``try_release_thread_stack_to_cache`` even
+     though there are many other functions in that file containing the
+     string ``thread_stack`` in their names.
+
+Another useful option for ``git log`` is ``-G``, which allows you to
+filter on certain strings appearing in the diffs of the commits you are
+listing::
+
+    git log -G'regex' HEAD..<commit>^ -- <path>
+
+This can also be a handy way to quickly find when something (e.g. a
+function call or a variable) was changed, added, or removed. The search
+string is a regular expression, which means you can potentially search
+for more specific things like assignments to a specific struct member::
+
+    git log -G'\->index\>.*='
+
+git blame
+~~~~~~~~~
+
+Another way to find prerequisite commits (albeit only the most recent
+one for a given conflict) is to run ``git blame``. In this case, you
+need to run it against the parent commit of the patch you are
+cherry-picking and the file where the conflict appared, i.e.::
+
+    git blame <commit>^ -- <path>
+
+This command also accepts the ``-L`` argument (for restricting the
+output to a single function), but in this case you specify the filename
+at the end of the command as usual::
+
+    git blame -L:'\<function\>' <commit>^ -- <path>
+
+Navigate to the place where the conflict occurred. The first column of
+the blame output is the commit ID of the patch that added a given line
+of code.
+
+It might be a good idea to ``git show`` these commits and see if they
+look like they might be the source of the conflict. Sometimes there will
+be more than one of these commits, either because multiple commits
+changed different lines of the same conflict area *or* because multiple
+subsequent patches changed the same line (or lines) multiple times. In
+the latter case, you may have to run ``git blame`` again and specify the
+older version of the file to look at in order to dig further back in
+the history of the file.
+
+Prerequisite vs. incidental patches
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Having found the patch that caused the conflict, you need to determine
+whether it is a prerequisite for the patch you are backporting or
+whether it is just incidental and can be skipped. An incidental patch
+would be one that touches the same code as the patch you are
+backporting, but does not change the semantics of the code in any
+material way. For example, a whitespace cleanup patch is completely
+incidental -- likewise, a patch that simply renames a function or a
+variable would be incidental as well. On the other hand, if the function
+being changed does not even exist in your current branch then this would
+not be incidental at all and you need to carefully consider whether the
+patch adding the function should be cherry-picked first.
+
+If you find that there is a necessary prerequisite patch, then you need
+to stop and cherry-pick that instead. If you've already resolved some
+conflicts in a different file and don't want to do it again, you can
+create a temporary copy of that file.
+
+To abort the current cherry-pick, go ahead and run
+``git cherry-pick --abort``, then restart the cherry-picking process
+with the commit ID of the prerequisite patch instead.
+
+Understanding conflict markers
+------------------------------
+
+Combined diffs
+~~~~~~~~~~~~~~
+
+Let's say you've decided against picking (or reverting) additional
+patches and you just want to resolve the conflict. Git will have
+inserted conflict markers into your file. Out of the box, this will look
+something like::
+
+    <<<<<<< HEAD
+    this is what's in your current tree before cherry-picking
+    =======
+    this is what the patch wants it to be after cherry-picking
+    >>>>>>> <commit>... title
+
+This is what you would see if you opened the file in your editor.
+However, if you were to run run ``git diff`` without any arguments, the
+output would look something like this::
+
+    $ git diff
+    [...]
+    ++<<<<<<<< HEAD
+     +this is what's in your current tree before cherry-picking
+    ++========
+    + this is what the patch wants it to be after cherry-picking
+    ++>>>>>>>> <commit>... title
+
+When you are resolving a conflict, the behavior of ``git diff`` differs
+from its normal behavior. Notice the two columns of diff markers
+instead of the usual one; this is a so-called "`combined diff`_", here
+showing the 3-way diff (or diff-of-diffs) between
+
+#. the current branch (before cherry-picking) and the current working
+   directory, and
+#. the current branch (before cherry-picking) and the file as it looks
+   after the original patch has been applied.
+
+.. _combined diff: https://git-scm.com/docs/diff-format#_combined_diff_format
+
+
+Better diffs
+~~~~~~~~~~~~
+
+3-way combined diffs include all the other changes that happened to the
+file between your current branch and the branch you are cherry-picking
+from. While this is useful for spotting other changes that you need to
+take into account, this also makes the output of ``git diff`` somewhat
+intimidating and difficult to read. You may instead prefer to run
+``git diff HEAD`` (or ``git diff --ours``) which shows only the diff
+between the current branch before cherry-picking and the current working
+directory. It looks like this::
+
+    $ git diff HEAD
+    [...]
+    +<<<<<<<< HEAD
+     this is what's in your current tree before cherry-picking
+    +========
+    +this is what the patch wants it to be after cherry-picking
+    +>>>>>>>> <commit>... title
+
+As you can see, this reads just like any other diff and makes it clear
+which lines are in the current branch and which lines are being added
+because they are part of the merge conflict or the patch being
+cherry-picked.
+
+Merge styles and diff3
+~~~~~~~~~~~~~~~~~~~~~~
+
+The default conflict marker style shown above is known as the ``merge``
+style. There is also another style available, known as the ``diff3``
+style, which looks like this::
+
+    <<<<<<< HEAD
+    this is what is in your current tree before cherry-picking
+    ||||||| parent of <commit> (title)
+    this is what the patch expected to find there
+    =======
+    this is what the patch wants it to be after being applied
+    >>>>>>> <commit> (title)
+
+As you can see, this has 3 parts instead of 2, and includes what git
+expected to find there but didn't. Some people vastly prefer this style
+as it makes it much clearer what the patch actually changed; i.e., it
+allows you to compare the before-and-after versions of the file for the
+commit you are cherry-picking. This allows you to make better decisions
+about how to resolve the conflict.
+
+To change conflict marker styles, you can use the following command::
+
+    git config merge.conflictStyle diff3
+
+There is a third option, ``zdiff3``, introduced in `Git 2.35`_,
+which has the same 3 sections as ``diff3``, but where common lines have
+been trimmed off, making the conflict area smaller in some cases.
+
+.. _Git 2.35: https://github.blog/2022-01-24-highlights-from-git-2-35/
+
+Iterating on conflict resolutions
+---------------------------------
+
+The first step in any conflict resolution process is to understand the
+patch you are backporting. For the Linux kernel this is especially
+important, since an incorrect change can lead to the whole system
+crashing -- or worse, an undetected security vulnerability.
+
+Understanding the patch can be easy or difficult depending on the patch
+itself, the changelog, and your familiarity with the code being changed.
+However, a good question for every change (or every hunk of the patch)
+might be: "Why is this hunk in the patch?" The answers to these
+questions will inform your conflict resolution.
+
+Resolution process
+~~~~~~~~~~~~~~~~~~
+
+Sometimes the easiest thing to do is to just remove all but the first
+part of the conflict, leaving the file essentially unchanged, and apply
+the changes by hand. Perhaps the patch is changing a function call
+argument from ``0`` to ``1`` while a conflicting change added an
+entirely new (and insignificant) parameter to the end of the parameter
+list; in that case, it's easy enough to change the argument from ``0``
+to ``1`` by hand and leave the rest of the arguments alone. This
+technique of manually applying changes is mostly useful if the conflict
+pulled in a lot of unrelated context that you don't really need to care
+about.
+
+For particularly nasty conflicts with many conflict markers, you can use
+``git add`` or ``git add -i`` to selectively stage your resolutions to
+get them out of the way; this also lets you use ``git diff HEAD`` to
+always see what remains to be resolved or ``git diff --cached`` to see
+what your patch looks like so far.
+
+Function arguments
+~~~~~~~~~~~~~~~~~~
+
+Pay attention to changing function arguments! It's easy to gloss over
+details and think that two lines are the same but actually they differ
+in some small detail like which variable was passed as an argument
+(especially if the two variables are both a single character that look
+the same, like i and j).
+
+Error handling
+~~~~~~~~~~~~~~
+
+If you cherry-pick a patch that includes a ``goto`` statement (typically
+for error handling), it is absolutely imperative to double check that
+the target label is still correct in the branch you are backporting to.
+Error handling is typically located at the bottom of the function, so it
+may not be part of the conflict even though could have been changed by
+other patches.
+
+A good way to ensure that you review the error paths is to always use
+``git diff -W`` and ``git show -W`` (AKA ``--function-context``) when
+inspecting your changes.  For C code, this will show you the whole
+function that's being changed in a patch. One of the things that often
+go wrong during backports is that something else in the function changed
+on either of the branches that you're backporting from or to. By
+including the whole function in the diff you get more context and can
+more easily spot problems that might otherwise go unnoticed.
+
+Dealing with file renames
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+One of the most annoying things that can happen while backporting a
+patch is discovering that one of the files being patched has been
+renamed, as that typically means git won't even put in conflict markers,
+but will just throw up its hands and say (paraphrased): "Unmerged path!
+You do the work..."
+
+There are generally a few ways to deal with this. If the patch to the
+renamed file is small, like a one-line change, the easiest thing is to
+just go ahead and apply the change by hand and be done with it. On the
+other hand, if the change is big or complicated, you definitely don't
+want to do it by hand.
+
+Sometimes the right thing to do will be to also backport the patch that
+did the rename, but that's definitely not the most common case. Instead,
+what you can do is to temporarily rename the file in the branch you're
+backporting to (using ``git mv`` and committing the result), restart the
+attempt to cherry-pick the patch, rename the file back (``git mv`` and
+committing again), and finally squash the result using ``git rebase -i``
+(see the `rebase tutorial`_) so it appears as a single commit when you
+are done.
+
+.. _rebase tutorial: https://medium.com/@slamflipstrom/a-beginners-guide-to-squashing-commits-with-git-rebase-8185cf6e62ec
+
+Verifying the result
+====================
+
+colordiff
+---------
+
+Having committed a conflict-free new patch, you can now compare your
+patch to the original patch. It is highly recommended that you use a
+tool such as `colordiff`_ that can show two files side by side and color
+them according to the changes between them::
+
+    colordiff -yw -W 200 <(git diff -W <upstream commit>^-) <(git diff -W HEAD^-) | less -SR
+
+.. _colordiff: https://www.colordiff.org/
+
+Here, ``-y`` means to do a side-by-side comparison; ``-w`` ignores
+whitespace, and ``-W 200`` sets the width of the output (as otherwise it
+will use 130 by default, which is often a bit too little).
+
+The ``rev^-`` syntax is a handy shorthand for ``rev^..rev``, essentially
+giving you just the diff for that single commit; also see
+the official `git rev-parse documentation`_.
+
+.. _git rev-parse documentation: https://git-scm.com/docs/git-rev-parse#_other_rev_parent_shorthand_notations
+
+Again, note the inclusion of ``-W`` for ``git diff``; this ensures that
+you will see the full function for any function that has changed.
+
+One incredibly important thing that colordiff does is to highlight lines
+that are different. For example, if an error-handling ``goto`` has
+changed labels between the original and backported patch, colordiff will
+show these side-by-side but highlighted in a different color.  Thus, it
+is easy to see that the two ``goto`` statements are jumping to different
+labels. Likewise, lines that were not modified by either patch but
+differ in the context will also be highlighted and thus stand out during
+a manual inspection.
+
+Of course, this is just a visual inspection; the real test is building
+and running the patched kernel (or program).
+
+Build testing
+-------------
+
+We won't cover runtime testing here, but it can be a good idea to build
+just the files touched by the patch as a quick sanity check. For the
+Linux kernel you can build single files like this, assuming you have the
+``.config`` and build environment set up correctly::
+
+    make path/to/file.o
+
+Note that this won't discover linker errors, so you should still do a
+full build after verifying that the single file compiles. By compiling
+the single file first you can avoid having to wait for a full build *in
+case* there are compiler errors in any of the files you've changed.
+
+Runtime testing
+---------------
+
+Even a successful build or boot test is not necessarily enough to rule
+out a missing dependency somewhere. Even though the chances are small,
+there could be code changes where two independent changes to the same
+file result in no conflicts, no compile-time errors, and runtime errors
+only in exceptional cases.
+
+One concrete example of this was a pair of patches to the system call
+entry code where the first patch saved/restored a register and a later
+patch made use of the same register somewhere in the middle of this
+sequence. Since there was no overlap between the changes, one could
+cherry-pick the second patch, have no conflicts, and believe that
+everything was fine, when in fact the code was now scribbling over an
+unsaved register.
+
+Although the vast majority of errors will be caught during compilation
+or by superficially exercising the code, the only way to *really* verify
+a backport is to review the final patch with the same level of scrutiny
+as you would (or should) give to any other patch. Having unit tests and
+regression tests or other types of automatic testing can help increase
+the confidence in the correctness of a backport.
+
+Examples
+========
+
+The above shows roughly the idealized process of backporting a patch.
+For a more concrete example, see this video tutorial where two patches
+are backported from mainline to stable:
+`Backporting Linux Kernel Patches`_.
+
+.. _Backporting Linux Kernel Patches: https://youtu.be/sBR7R1V2FeA
diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
index b501cd977053f..13e522752f880 100644
--- a/Documentation/process/index.rst
+++ b/Documentation/process/index.rst
@@ -66,6 +66,7 @@ lack of a better place.
    :maxdepth: 1
 
    applying-patches
+   backporting
    adding-syscalls
    magic-number
    volatile-considered-harmful
-- 
2.25.1

--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-08-24  9:23 [PATCH v2] docs: add backporting and conflict resolution document Vegard Nossum
@ 2023-10-03 15:19 ` Jonathan Corbet
  2023-10-10 19:57   ` Jonathan Corbet
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Corbet @ 2023-10-03 15:19 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: linux-doc, linux-kernel, stable, backports, Vegard Nossum,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev

Vegard Nossum <vegard.nossum@oracle.com> writes:

> This is a new document based on my 2022 blog post:
>
>   https://blogs.oracle.com/linux/post/backporting-patches-using-git
>
> Although this is aimed at stable contributors and distro maintainers,
> it does also contain useful tips and tricks for anybody who needs to
> resolve merge conflicts.
>
> By adding this to the kernel as documentation we can more easily point
> to it e.g. from stable emails about failed backports, as well as allow
> the community to modify it over time if necessary.
>
> I've added this under process/ since it also has
> process/applying-patches.rst. Another interesting document is
> maintainer/rebasing-and-merging.rst which maybe should eventually refer
> to this one, but I'm leaving that as a future cleanup.
>
> Thanks to Harshit Mogalapalli for helping with the original blog post
> as well as this updated document and Bagas Sanjaya for providing
> thoughtful feedback.
>
> v2: fixed heading style, link style, placeholder style, other comments

So this seems generally good and useful.  I have a few small comments,
none of which necessarily block merging it in its current form:

- I would like to see an ack/reviewed-by tag by others with experience
  with this task if possible.  The lack of complaints is a good start,
  but not always indicative of a lack of disagreement...:)

- Might this be better placed in Documentation/maintainer?

- Colordiff looks cool, but I'd at least drop in a mention of the Emacs
  ediff mode, which offers (I believe) a lot of the same functionality.

Thanks,

jon
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-03 15:19 ` Jonathan Corbet
@ 2023-10-10 19:57   ` Jonathan Corbet
  2023-10-13 15:24     ` Vegard Nossum
  0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Corbet @ 2023-10-10 19:57 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: linux-doc, linux-kernel, stable, backports, Vegard Nossum,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev

Jonathan Corbet <corbet@lwn.net> writes:

> Vegard Nossum <vegard.nossum@oracle.com> writes:
>
>> This is a new document based on my 2022 blog post:
>>
>>   https://blogs.oracle.com/linux/post/backporting-patches-using-git
>>
>> Although this is aimed at stable contributors and distro maintainers,
>> it does also contain useful tips and tricks for anybody who needs to
>> resolve merge conflicts.
>>
>> By adding this to the kernel as documentation we can more easily point
>> to it e.g. from stable emails about failed backports, as well as allow
>> the community to modify it over time if necessary.
>>
>> I've added this under process/ since it also has
>> process/applying-patches.rst. Another interesting document is
>> maintainer/rebasing-and-merging.rst which maybe should eventually refer
>> to this one, but I'm leaving that as a future cleanup.
>>
>> Thanks to Harshit Mogalapalli for helping with the original blog post
>> as well as this updated document and Bagas Sanjaya for providing
>> thoughtful feedback.
>>
>> v2: fixed heading style, link style, placeholder style, other comments
>
> So this seems generally good and useful.  I have a few small comments,
> none of which necessarily block merging it in its current form:
>
> - I would like to see an ack/reviewed-by tag by others with experience
>   with this task if possible.  The lack of complaints is a good start,
>   but not always indicative of a lack of disagreement...:)
>
> - Might this be better placed in Documentation/maintainer?
>
> - Colordiff looks cool, but I'd at least drop in a mention of the Emacs
>   ediff mode, which offers (I believe) a lot of the same functionality.

So I never got an answer on any of this ...  I've gone ahead and applied
the patch on the theory that it clearly hasn't upset anybody; I do still
think we should consider moving it to the maintainer manual, though.

Thanks,

jon
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-10 19:57   ` Jonathan Corbet
@ 2023-10-13 15:24     ` Vegard Nossum
  2023-10-14  9:43       ` Willy Tarreau
  0 siblings, 1 reply; 8+ messages in thread
From: Vegard Nossum @ 2023-10-13 15:24 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: linux-doc, linux-kernel, stable, backports, Harshit Mogalapalli,
	Bagas Sanjaya, Greg Kroah-Hartman, Stephen Rothwell,
	Jason A . Donenfeld, Konstantin Ryabitsev, Steven Rostedt,
	Willy Tarreau

On 10/10/2023 21:57, Jonathan Corbet wrote:
> Jonathan Corbet <corbet@lwn.net> writes:
> 
>> Vegard Nossum <vegard.nossum@oracle.com> writes:
>>
>>> This is a new document based on my 2022 blog post:
>>>
>>>    https://blogs.oracle.com/linux/post/backporting-patches-using-git
>>>
>>> Although this is aimed at stable contributors and distro maintainers,
>>> it does also contain useful tips and tricks for anybody who needs to
>>> resolve merge conflicts.
>>>
>>> By adding this to the kernel as documentation we can more easily point
>>> to it e.g. from stable emails about failed backports, as well as allow
>>> the community to modify it over time if necessary.
>>>
>>> I've added this under process/ since it also has
>>> process/applying-patches.rst. Another interesting document is
>>> maintainer/rebasing-and-merging.rst which maybe should eventually refer
>>> to this one, but I'm leaving that as a future cleanup.

[...]

>> - I would like to see an ack/reviewed-by tag by others with experience
>>    with this task if possible.  The lack of complaints is a good start,
>>    but not always indicative of a lack of disagreement...:)

I tried to include people and lists who might be interested in Ccs.

I've now added Steven Rostedt and Willy Tarreau as well on the
off-chance that they have something to say about it (Steven presented
his conflict resolution method at Kernel Recipes and I think Willy is
experienced with backporting), but this is in no way meant as pressure
to review this patch. Here's a link to the top of the thread:

https://lore.kernel.org/all/20230824092325.1464227-1-vegard.nossum@oracle.com/

I feel like in the worst case, somebody sees the document down the line
and vehemently disagrees with something and we either fix it or take it
out completely.

I'd like to add that my impression is that a LOT of people *fear*
backporting and conflict resolution -- and it doesn't have to be that
way. We should be talking about merge conflicts and what good workflows
look like (one of the reasons why I was very happy to see Steven's
presentation at KR), instead of leaving everybody to figure it out on
their own. This document is my contribution towards that.

>> - Might this be better placed in Documentation/maintainer?

I can see the justification for that, given that maintainers probably
resolve merge conflicts more than plain contributors. However, this was
intended primarily as a guide to backporting stable patches, which is
not primarily done by subsystem maintainers, as far as I know. I'm not
sure where that leaves us. I thought it kind of fit next to "applying
patches" under process/ but I trust the documentation maintainer's
judgment :-) In either case, we can always move it (again) later.

>> - Colordiff looks cool, but I'd at least drop in a mention of the Emacs
>>    ediff mode, which offers (I believe) a lot of the same functionality.

I don't use emacs personally, but I would welcome this addition if
somebody were to write it!

> So I never got an answer on any of this ...  I've gone ahead and applied
> the patch on the theory that it clearly hasn't upset anybody; I do still
> think we should consider moving it to the maintainer manual, though.

Thanks.

I also saw a bot complaint about a repeated word; can you fix it up, do
I send a v3, or do I send an incremental patch?


Vegard
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-13 15:24     ` Vegard Nossum
@ 2023-10-14  9:43       ` Willy Tarreau
  2023-10-14 11:48         ` Vegard Nossum
  0 siblings, 1 reply; 8+ messages in thread
From: Willy Tarreau @ 2023-10-14  9:43 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Jonathan Corbet, linux-doc, linux-kernel, stable, backports,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev,
	Steven Rostedt

Hi Vegard,

On Fri, Oct 13, 2023 at 05:24:31PM +0200, Vegard Nossum wrote:
> I've now added Steven Rostedt and Willy Tarreau as well on the
> off-chance that they have something to say about it (Steven presented
> his conflict resolution method at Kernel Recipes and I think Willy is
> experienced with backporting), but this is in no way meant as pressure
> to review this patch. Here's a link to the top of the thread:
> 
> https://lore.kernel.org/all/20230824092325.1464227-1-vegard.nossum@oracle.com/

That's a very nice description, I'm sure it can help (and I learned a
few points there already). There are a few points I'm not seeing there,
though, base on my habits:
  - in my experience, there's a huge difference between backporting
    code you don't know and code you know. I'm dealing with haproxy
    backports several times a week and I tend to know this code, so I
    use my intuition a lot and have no problem adjusting stuff around
    the conflicting parts. However when I was dealing with extended
    kernels, that was not the case at all, because I didn't know that
    code, and worse, I wasn't skilled at all on many of the parts I had
    to deal with. While it's OK to take the blame for a failed backport,
    it's generally not OK to expose users to risks caused by your lack
    of knowledge. In this case it means you need to be extra cautious,
    and very often to actually *ask* authors or maintainers for help.
    If maintainers do not want to help backporting some patches to an
    older version of their code, usually it should be perceived as a
    hint that they'll find it complicated to do it right; then take that
    as a hint that there's little chances you'll get it right by yourself
    while ignoring that code. This may imply dropping the fix, documenting
    the area as broken, or asking for help on various lists until someone
    more knowledgeable can help.

  - the tool that helped me the most in resolving rename conflicts is
    "patch". As you explained, "git am" is a bit stubborn. But patch is
    way more lenient (and will often do mistakes). In the very old 2.6.32
    I used to rely on "git show XX | patch -p1" way more often than
    "git am". For a rename, you do "git show XX -- file |patch otherfile".
    It works the same with file-based patches or mbox: "patch -p1 < mbox".
    Patch however will not place the conflict markers and will leave .rej
    files. I then opened them in an editor next to the file to edit, to
    locate the area and copy the relevant part to its location. Emacs'
    ediff is also extremely convenient to pick select parts of each file.

  - control the patches: after I'm pretty sure I have resolved a patch,
    I compare it side by side with the original one. Normally, backported
    patches should have the same structure as the original. Using whatever
    editor supporting a vertical split helps a lot. Otherwise I also use
    "diff -y --width 200" between them to focus on differences (typically
    line numbers). It's not rare that a part is missing, either because
    I messed up, or because I forgot to process a .rej that I mistakenly
    removed, or because a file was added, I reset the tree and it's left
    untracked. So any difference between the patches should have its own
    explanation (line numbers, function names, file names, occurrences).
    By the way, it can very easily happen that applying a patch will work
    fine but it will apply to the wrong function because some code patterns
    especially in error unrolling paths are often the same between several
    functions. It happened to me quite a few times in the past, and even
    a few of these persisted till the public patch review. That's really
    a risk that must not be minimized!

  - something quite common is that as code evolves, it gets refactored
    so that what used to appear at 3 locations in the past now moved to
    a single function. But when you're backporting, you're doing the
    reverse work, you're taking a patch for a single location that may
    apply to multiple ones. Often the hint is that the function name
    changed. But sometimes it's not even the case. If what you've found
    looks like a nasty bug pattern that looks like it could easily have
    been copy-pasted at other places, it's worth looking for it elsewhere
    using git grep. If you find the same pattern, then you search for it
    into the tree the patch comes from. If you don't find it, it's likely
    that you'll need to adjust it, and git log is your friend to figure
    what happened to these areas. Note that git blame doesn't work for
    removed code. If you find the same pattern elsewhere in mainline, it's
    worth asking the patch author if that one is also an occurrence of the
    same bug or just normal. It's not uncommon to find new bugs during a
    backport.

  - color diff: usually I just rely on:

     [color]
         ui = true

    But I also recently got used to using diff-highlight that will
    highlight different characters between lines. It's nice for complex
    "if" conditions where you don't see the difference, or for spaces
    vs tabs:

    [pager]
        log = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
        show = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
        diff = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less

  - git add, git add and git add: when fixing patches by hand, it's very
    common to leave some parts not added (especially with | patch -p1).
    It's useful to work on a clean tree to more easily spot untracked
    files with "git status".

> I feel like in the worst case, somebody sees the document down the line
> and vehemently disagrees with something and we either fix it or take it
> out completely.

No I don't disagree and even find it useful. If at least it could help
people figure the pain it is to backport any single patch, and encourage
them to help stable maintainers, that would already be awesome!

> I'd like to add that my impression is that a LOT of people *fear*
> backporting and conflict resolution -- and it doesn't have to be that
> way. We should be talking about merge conflicts and what good workflows
> look like (one of the reasons why I was very happy to see Steven's
> presentation at KR), instead of leaving everybody to figure it out on
> their own. This document is my contribution towards that.

I'm not completely sold to this. Yes we should teach more people to
perform that task themselves. But there's a big difference between
backporting a few patches and feeling like you could maintain your own
kernel because now you know how to resolve conflicts. What I mentioned
above about dealing with patches you don't understand must not be
underestimated, that's the biggest challenge I faced when working on
stable kernels. There's probably a feeling of shame of not understanding
something, but I can say that many times I asked for help and was helped
even by top-ranked developers, and nobody ever laughed at me for not
understanding a certain area. But doing that in your garage for your
own kernel or for your company's products is a huge problem because it's
unlikely that you'll get help from the maintainers this time, so you're
left on your own with your own understanding of certain patches.

Thus, yes to backports, no to kernel forks being a collection of
backports.

> > > - Colordiff looks cool, but I'd at least drop in a mention of the Emacs
> > >    ediff mode, which offers (I believe) a lot of the same functionality.
> 
> I don't use emacs personally, but I would welcome this addition if
> somebody were to write it!

There's not much to write about it. You start "ediff-buffers" between two
files, or you can probably find "ediff" in one of the menus, or type "ediff"
then press tab to get the list of commands. It's not always completely
intuitive but easy to remember, and fully interactive and supports rolling
back operations.

Hoping this helps,
Willy
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-14  9:43       ` Willy Tarreau
@ 2023-10-14 11:48         ` Vegard Nossum
  2023-10-14 15:07           ` Willy Tarreau
  2023-10-17 16:42           ` Ben Hutchings
  0 siblings, 2 replies; 8+ messages in thread
From: Vegard Nossum @ 2023-10-14 11:48 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Jonathan Corbet, linux-doc, linux-kernel, stable, backports,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev,
	Steven Rostedt, Ben Hutchings


On 14/10/2023 11:43, Willy Tarreau wrote:
> Hi Vegard,
> 
> On Fri, Oct 13, 2023 at 05:24:31PM +0200, Vegard Nossum wrote:
>> I've now added Steven Rostedt and Willy Tarreau as well on the
>> off-chance that they have something to say about it (Steven presented
>> his conflict resolution method at Kernel Recipes and I think Willy is
>> experienced with backporting), but this is in no way meant as pressure
>> to review this patch. Here's a link to the top of the thread:
>>
>> https://lore.kernel.org/all/20230824092325.1464227-1-vegard.nossum@oracle.com/

(Adding Ben Hutchings to Cc as well for the same reasons.)

> That's a very nice description, I'm sure it can help (and I learned a
> few points there already). There are a few points I'm not seeing there,
> though, base on my habits:

Thanks for the quick and comprehensive response!

>    - in my experience, there's a huge difference between backporting
>      code you don't know and code you know. I'm dealing with haproxy
>      backports several times a week and I tend to know this code, so I
>      use my intuition a lot and have no problem adjusting stuff around
>      the conflicting parts. However when I was dealing with extended
>      kernels, that was not the case at all, because I didn't know that
>      code, and worse, I wasn't skilled at all on many of the parts I had
>      to deal with. While it's OK to take the blame for a failed backport,
>      it's generally not OK to expose users to risks caused by your lack
>      of knowledge. In this case it means you need to be extra cautious,
>      and very often to actually *ask* authors or maintainers for help.
>      If maintainers do not want to help backporting some patches to an
>      older version of their code, usually it should be perceived as a
>      hint that they'll find it complicated to do it right; then take that
>      as a hint that there's little chances you'll get it right by yourself
>      while ignoring that code. This may imply dropping the fix, documenting
>      the area as broken, or asking for help on various lists until someone
>      more knowledgeable can help.

I agree -- backports ARE very easy to get wrong, EVEN when you know the
code well. This is stressed several times in the document, especially in
the last two sections about build and runtime testing, but also in the
section on error handling.

However, I'm wary of being too stern as well. There are already a
million ways to introduce subtle bugs and put users at risk, but we
rarely try to put people off contributing regular patches (at least in
this specific way :-P).

Did you see this meme? https://i.imgur.com/yk5rf13.png

I think conflicts have a bit of a bad reputation exactly because you're
presented with something that's hard to make sense of at first sight.
I'd like to dispel this myth that you need to be an expert to make sense
of conflict markers. I think with the right attitude, the right tools,
and the right approach you can go a LONG way. Also, nobody is born an
expert and we should encourage people to gain experience in this area IMHO.

With that said, how about if we add a short section near the end about
submitting stable backports where we encourage submitters to 1) approach
the backporting process in a humble way, 2) describe the reason for the
conflict in the changelog and their resolution, 3) be honest about their
confidence in their resolution, 4) ask relevant maintainers for an
explicit ack?

I'm open to other ideas, I just want to make sure we strike the right
balance of encouragement vs. discouragement.

>    - the tool that helped me the most in resolving rename conflicts is
>      "patch". As you explained, "git am" is a bit stubborn. But patch is
>      way more lenient (and will often do mistakes). In the very old 2.6.32
>      I used to rely on "git show XX | patch -p1" way more often than
>      "git am". For a rename, you do "git show XX -- file |patch otherfile".
>      It works the same with file-based patches or mbox: "patch -p1 < mbox".
>      Patch however will not place the conflict markers and will leave .rej
>      files. I then opened them in an editor next to the file to edit, to
>      locate the area and copy the relevant part to its location. Emacs'
>      ediff is also extremely convenient to pick select parts of each file.
> 
>    - control the patches: after I'm pretty sure I have resolved a patch,
>      I compare it side by side with the original one. Normally, backported
>      patches should have the same structure as the original. Using whatever
>      editor supporting a vertical split helps a lot. Otherwise I also use
>      "diff -y --width 200" between them to focus on differences (typically
>      line numbers). It's not rare that a part is missing, either because
>      I messed up, or because I forgot to process a .rej that I mistakenly
>      removed, or because a file was added, I reset the tree and it's left
>      untracked. So any difference between the patches should have its own
>      explanation (line numbers, function names, file names, occurrences).
>      By the way, it can very easily happen that applying a patch will work
>      fine but it will apply to the wrong function because some code patterns
>      especially in error unrolling paths are often the same between several
>      functions. It happened to me quite a few times in the past, and even
>      a few of these persisted till the public patch review. That's really
>      a risk that must not be minimized!

There is a section on this: "Verifying the result", and also describes
doing a side-by-side diff of the old and new patches.

The bit about applying the patch to the wrong function -- I doubt this
happens that much when using cherry-pick, since it knows both sides of
the history and can tell when code moves around. You're probably right
that it can easily happen with plain git am/patch though. In my mind,
this is an argument in favour of using cherry-pick whenever possible.

>    - something quite common is that as code evolves, it gets refactored
>      so that what used to appear at 3 locations in the past now moved to
>      a single function. But when you're backporting, you're doing the
>      reverse work, you're taking a patch for a single location that may
>      apply to multiple ones. Often the hint is that the function name
>      changed. But sometimes it's not even the case. If what you've found
>      looks like a nasty bug pattern that looks like it could easily have
>      been copy-pasted at other places, it's worth looking for it elsewhere
>      using git grep. If you find the same pattern, then you search for it
>      into the tree the patch comes from. If you don't find it, it's likely
>      that you'll need to adjust it, and git log is your friend to figure
>      what happened to these areas. Note that git blame doesn't work for
>      removed code. If you find the same pattern elsewhere in mainline, it's
>      worth asking the patch author if that one is also an occurrence of the
>      same bug or just normal. It's not uncommon to find new bugs during a
>      backport.

Very good point.

I think this fits very well alongside the sections on function
arguments, error handling, etc. since it details a specific thing that
can go easily wrong.

Can I take what you wrote above, or do you want to submit your own
incremental patch? I think we could insert it almost verbatim.

> 
>    - color diff: usually I just rely on:
> 
>       [color]
>           ui = true
> 
>      But I also recently got used to using diff-highlight that will
>      highlight different characters between lines. It's nice for complex
>      "if" conditions where you don't see the difference, or for spaces
>      vs tabs:
> 
>      [pager]
>          log = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
>          show = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
>          diff = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less

Right, git log/show/diff --word-diff=color can also do this to some degree.

There is also core.whitespace/color.diff.whitespace that will highlight
some common whitespace errors.

I haven't used diff-highlight myself -- I'll give it a try.

In this case, I was using colordiff specifically to do the side-by-side
diff of the two patches since it handles the <() shell syntax _and_ does
the highlighting of differences between the patches.

>    - git add, git add and git add: when fixing patches by hand, it's very
>      common to leave some parts not added (especially with | patch -p1).
>      It's useful to work on a clean tree to more easily spot untracked
>      files with "git status".

Yet another reason to use git cherry-pick instead of manual patch
commands: it keeps track of unmerged files for you. ;-)

So I'm a bit torn on this. I think in this particular document I'd like
to encourage the use of git and doing things "the git way" as much as
possible.

>> I feel like in the worst case, somebody sees the document down the line
>> and vehemently disagrees with something and we either fix it or take it
>> out completely.
> 
> No I don't disagree and even find it useful. If at least it could help
> people figure the pain it is to backport any single patch, and encourage
> them to help stable maintainers, that would already be awesome!
> 
>> I'd like to add that my impression is that a LOT of people *fear*
>> backporting and conflict resolution -- and it doesn't have to be that
>> way. We should be talking about merge conflicts and what good workflows
>> look like (one of the reasons why I was very happy to see Steven's
>> presentation at KR), instead of leaving everybody to figure it out on
>> their own. This document is my contribution towards that.
> 
> I'm not completely sold to this. Yes we should teach more people to
> perform that task themselves. But there's a big difference between
> backporting a few patches and feeling like you could maintain your own
> kernel because now you know how to resolve conflicts. What I mentioned
> above about dealing with patches you don't understand must not be
> underestimated, that's the biggest challenge I faced when working on
> stable kernels. There's probably a feeling of shame of not understanding
> something, but I can say that many times I asked for help and was helped
> even by top-ranked developers, and nobody ever laughed at me for not
> understanding a certain area. But doing that in your garage for your
> own kernel or for your company's products is a huge problem because it's
> unlikely that you'll get help from the maintainers this time, so you're
> left on your own with your own understanding of certain patches.
> 
> Thus, yes to backports, no to kernel forks being a collection of
> backports.

Right; almost every time I talk about backporting it's really in the
context of contributing these backports to stable. I'm not in favour of
forks either and I'm not trying to encourage it.

Let me try to come up with a specific addition related to the changes
you requested above and see if you agree with the wording.

Thanks,


Vegard
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-14 11:48         ` Vegard Nossum
@ 2023-10-14 15:07           ` Willy Tarreau
  2023-10-17 16:42           ` Ben Hutchings
  1 sibling, 0 replies; 8+ messages in thread
From: Willy Tarreau @ 2023-10-14 15:07 UTC (permalink / raw)
  To: Vegard Nossum
  Cc: Jonathan Corbet, linux-doc, linux-kernel, stable, backports,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev,
	Steven Rostedt, Ben Hutchings

On Sat, Oct 14, 2023 at 01:48:54PM +0200, Vegard Nossum wrote:
> Did you see this meme? https://i.imgur.com/yk5rf13.png

No I didn't know it.

> I think conflicts have a bit of a bad reputation exactly because you're
> presented with something that's hard to make sense of at first sight.
> I'd like to dispel this myth that you need to be an expert to make sense
> of conflict markers. I think with the right attitude, the right tools,
> and the right approach you can go a LONG way. Also, nobody is born an
> expert and we should encourage people to gain experience in this area IMHO.

That's interesting that you mention this because I worked for a company
years ago where there was a person in charge of merges, who was not at
all part of the development team. I thought it made absolutely no sense
to request that someone who doesn't know the project would resolve
conflicts! When I expressed my surprise there, I was told "you know,
merging requires skills developers don't have"! So it's very possible
that there's this myth floating around, indeed.

> With that said, how about if we add a short section near the end about
> submitting stable backports where we encourage submitters to 1) approach
> the backporting process in a humble way, 2) describe the reason for the
> conflict in the changelog and their resolution, 3) be honest about their
> confidence in their resolution, 4) ask relevant maintainers for an
> explicit ack?

Maybe, by the way you remind me something that I've been used to doing
for many years: whenever I adapt a patch to resolve a conflict, I shortly
summarize my changes just before my signed-of-by line in brackets. It
helps me figure which patches needed adaptations, and others figure if
my reasoning is flawed. I seem to remember that Ben did the same. That's
something really convenient we should encourage I think.

> I'm open to other ideas, I just want to make sure we strike the right
> balance of encouragement vs. discouragement.

Sure. Another point which I didn't mention and which has to be considered
when backporting the same patch across multiple versions is that you often
face the choice of starting from the original patch and trying to stuff it
into an old version, or starting from the closest version and trying to
adapt it. I tend to prefer restarting from the closest (but more recent)
version, for multiple reasons:
  - less differences, some renames might already have been handled and
    in general any of the previously mentioned adaptations.

  - it limits the risk of leaving a failed backport in the middle of
    the chain. Indeed, let's that than when I was doing 3.10 I would
    have picked the original patch from 4.19 and adapted it myself.
    Maybe the 4.4 version was broken, and the user later upgrading
    from 3.10 to 4.4 could face a regression. Instead, by adapting
    the 4.4 one to 3.10, if 4.4 was wrong, 3.10 would also likely be
    and if detected there, the broken chain would be detected. This is
    particularly true for rarely used parts (e.g. uncommon drivers)
    where a bug can remain exposed for a long time without anyone
    noticing.

> >    - the tool that helped me the most in resolving rename conflicts is
> >      "patch". As you explained, "git am" is a bit stubborn. But patch is
> >      way more lenient (and will often do mistakes). In the very old 2.6.32
> >      I used to rely on "git show XX | patch -p1" way more often than
> >      "git am". For a rename, you do "git show XX -- file |patch otherfile".
> >      It works the same with file-based patches or mbox: "patch -p1 < mbox".
> >      Patch however will not place the conflict markers and will leave .rej
> >      files. I then opened them in an editor next to the file to edit, to
> >      locate the area and copy the relevant part to its location. Emacs'
> >      ediff is also extremely convenient to pick select parts of each file.
> > 
> >    - control the patches: after I'm pretty sure I have resolved a patch,
> >      I compare it side by side with the original one. Normally, backported
> >      patches should have the same structure as the original. Using whatever
> >      editor supporting a vertical split helps a lot. Otherwise I also use
> >      "diff -y --width 200" between them to focus on differences (typically
> >      line numbers). It's not rare that a part is missing, either because
> >      I messed up, or because I forgot to process a .rej that I mistakenly
> >      removed, or because a file was added, I reset the tree and it's left
> >      untracked. So any difference between the patches should have its own
> >      explanation (line numbers, function names, file names, occurrences).
> >      By the way, it can very easily happen that applying a patch will work
> >      fine but it will apply to the wrong function because some code patterns
> >      especially in error unrolling paths are often the same between several
> >      functions. It happened to me quite a few times in the past, and even
> >      a few of these persisted till the public patch review. That's really
> >      a risk that must not be minimized!
> 
> There is a section on this: "Verifying the result", and also describes
> doing a side-by-side diff of the old and new patches.

Ah yes, I didn't notice it the first time, I thought it meant comparing
the files, not the patches.

> The bit about applying the patch to the wrong function -- I doubt this
> happens that much when using cherry-pick, since it knows both sides of
> the history and can tell when code moves around. You're probably right
> that it can easily happen with plain git am/patch though. In my mind,
> this is an argument in favour of using cherry-pick whenever possible.

Yep I never had it with cherry-pick, only with git-am (rare) and patch
(more common). I totally agree with cherry-picking as much as possible,
and I also often preferred to git-am to the original base to later
cherry-pick, as you mentioned.

> >    - something quite common is that as code evolves, it gets refactored
> >      so that what used to appear at 3 locations in the past now moved to
> >      a single function. But when you're backporting, you're doing the
> >      reverse work, you're taking a patch for a single location that may
> >      apply to multiple ones. Often the hint is that the function name
> >      changed. But sometimes it's not even the case. If what you've found
> >      looks like a nasty bug pattern that looks like it could easily have
> >      been copy-pasted at other places, it's worth looking for it elsewhere
> >      using git grep. If you find the same pattern, then you search for it
> >      into the tree the patch comes from. If you don't find it, it's likely
> >      that you'll need to adjust it, and git log is your friend to figure
> >      what happened to these areas. Note that git blame doesn't work for
> >      removed code. If you find the same pattern elsewhere in mainline, it's
> >      worth asking the patch author if that one is also an occurrence of the
> >      same bug or just normal. It's not uncommon to find new bugs during a
> >      backport.
> 
> Very good point.
> 
> I think this fits very well alongside the sections on function
> arguments, error handling, etc. since it details a specific thing that
> can go easily wrong.
> 
> Can I take what you wrote above, or do you want to submit your own
> incremental patch? I think we could insert it almost verbatim.

Sure, feel free to take whatever I wrote and even rework it as suits you!

> >    - color diff: usually I just rely on:
> > 
> >       [color]
> >           ui = true
> > 
> >      But I also recently got used to using diff-highlight that will
> >      highlight different characters between lines. It's nice for complex
> >      "if" conditions where you don't see the difference, or for spaces
> >      vs tabs:
> > 
> >      [pager]
> >          log = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
> >          show = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
> >          diff = /usr/doc/git-2.35.3/contrib/diff-highlight/diff-highlight | less
> 
> Right, git log/show/diff --word-diff=color can also do this to some degree.

I didn't know word-diff. The format is unusual but can be convenient at
times. I think it's worth mentioning various options so that developers
try them, adopt or reject them based on habits and taste.

> There is also core.whitespace/color.diff.whitespace that will highlight
> some common whitespace errors.

yes.

> I haven't used diff-highlight myself -- I'll give it a try.

It gets a little while to get used to it as it changes the background
color around some parts. I find patches less easy to read when only
a few chars change on the line, but you get the info about what changes
and that's the point.

> In this case, I was using colordiff specifically to do the side-by-side
> diff of the two patches since it handles the <() shell syntax _and_ does
> the highlighting of differences between the patches.

Ah, then I understand, I didn't catch that you were using it to compare
the two patches. That's what I'm seeing on your site with the screenshots
indeed. In this case editing tools are of limited use here (except to
compare) since you definitely don't want to pick parts of a patch for
the other one.

> >    - git add, git add and git add: when fixing patches by hand, it's very
> >      common to leave some parts not added (especially with | patch -p1).
> >      It's useful to work on a clean tree to more easily spot untracked
> >      files with "git status".
> 
> Yet another reason to use git cherry-pick instead of manual patch
> commands: it keeps track of unmerged files for you. ;-)

Definitely. I think everyone prefers cherry-pick, but you quickly
learn it's not always an option. For old LTS kernels, they were so
far away from original ones that cherry-pick would systematically
pull in hundreds of context lines that made it very difficult to
locate the useful part :-/

> So I'm a bit torn on this. I think in this particular document I'd like
> to encourage the use of git and doing things "the git way" as much as
> possible.

Well, you asked how people in charge of this used to do it, right ? :-)
Similarly you've seen Steven's approach using Quilt. I had not thought
about using it this way, but the approach here was to keep track of
touched files. I think that in general patch management is ugly from a
software developemnt perspective. It relies on modifications arbitrarily
grouped by line because historically the tools have always worked this
way, and developers adapted to this method. But it doesn't follow any
sane workflow related to the software development cycle. It just
happens to work well enough. Based on this, I see nothing wrong with
doing what works for you. What matters is how much you can trust the
operations (hence the need for post-checks).

When working on old kernels, I used to find those rough ratios:
  - cherry-pick applies 1/3 of the time
  - 3/4 of applied cherry-picks actually build without error
  - 90% of cherry-picks that apply well really fix the problem

As such, as painful as it is, 10% of the patches that apply and build
after a successful cherry-pick do not correctly fix a problem or have
a side effect. Those that require manual fixing impose some analysis
that more often allows to detect that something is going to be wrong.
Several times it happened to me that I thought "uh oh 20 patches applied
in a row without problem, that's getting nasty".

I'm seeing a successful series of cherry-picks like a 100-line C program
that you write at once and that first compiles and starts without error.
You know that something's probably wrong but you don't know where to look.

So in short, yes, I do by far prefer cherry-pick but I don't trust it
that much because less human intervention usually means less validation.

> > > I'd like to add that my impression is that a LOT of people *fear*
> > > backporting and conflict resolution -- and it doesn't have to be that
> > > way. We should be talking about merge conflicts and what good workflows
> > > look like (one of the reasons why I was very happy to see Steven's
> > > presentation at KR), instead of leaving everybody to figure it out on
> > > their own. This document is my contribution towards that.
> > 
> > I'm not completely sold to this. Yes we should teach more people to
> > perform that task themselves. But there's a big difference between
> > backporting a few patches and feeling like you could maintain your own
> > kernel because now you know how to resolve conflicts. What I mentioned
> > above about dealing with patches you don't understand must not be
> > underestimated, that's the biggest challenge I faced when working on
> > stable kernels. There's probably a feeling of shame of not understanding
> > something, but I can say that many times I asked for help and was helped
> > even by top-ranked developers, and nobody ever laughed at me for not
> > understanding a certain area. But doing that in your garage for your
> > own kernel or for your company's products is a huge problem because it's
> > unlikely that you'll get help from the maintainers this time, so you're
> > left on your own with your own understanding of certain patches.
> > 
> > Thus, yes to backports, no to kernel forks being a collection of
> > backports.
> 
> Right; almost every time I talk about backporting it's really in the
> context of contributing these backports to stable. I'm not in favour of
> forks either and I'm not trying to encourage it.
> 
> Let me try to come up with a specific addition related to the changes
> you requested above and see if you agree with the wording.

OK!

Willy
--
To unsubscribe from this list: send the line "unsubscribe backports" in

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] docs: add backporting and conflict resolution document
  2023-10-14 11:48         ` Vegard Nossum
  2023-10-14 15:07           ` Willy Tarreau
@ 2023-10-17 16:42           ` Ben Hutchings
  1 sibling, 0 replies; 8+ messages in thread
From: Ben Hutchings @ 2023-10-17 16:42 UTC (permalink / raw)
  To: Vegard Nossum, Willy Tarreau
  Cc: Jonathan Corbet, linux-doc, linux-kernel, stable, backports,
	Harshit Mogalapalli, Bagas Sanjaya, Greg Kroah-Hartman,
	Stephen Rothwell, Jason A . Donenfeld, Konstantin Ryabitsev,
	Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 1164 bytes --]

On Sat, 2023-10-14 at 13:48 +0200, Vegard Nossum wrote:
> On 14/10/2023 11:43, Willy Tarreau wrote:
> > Hi Vegard,
> > 
> > On Fri, Oct 13, 2023 at 05:24:31PM +0200, Vegard Nossum wrote:
> > > I've now added Steven Rostedt and Willy Tarreau as well on the
> > > off-chance that they have something to say about it (Steven presented
> > > his conflict resolution method at Kernel Recipes and I think Willy is
> > > experienced with backporting), but this is in no way meant as pressure
> > > to review this patch. Here's a link to the top of the thread:
> > > 
> > > https://lore.kernel.org/all/20230824092325.1464227-1-vegard.nossum@oracle.com/
> 
> (Adding Ben Hutchings to Cc as well for the same reasons.)
[...]

I previously wrote some text for CIP about reviewing stable backports.
The "Read the code" section should be applicable to writing them as
well:
<https://wiki.linuxfoundation.org/civilinfrastructureplatform/cipkernelmaintenance#read_the_code>.

I didn't compare with your text so it may be that you already covered
all those points.

Ben.


-- 
Ben Hutchings
Who are all these weirdos? - David Bowie, on joining IRC


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-10-17 17:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-24  9:23 [PATCH v2] docs: add backporting and conflict resolution document Vegard Nossum
2023-10-03 15:19 ` Jonathan Corbet
2023-10-10 19:57   ` Jonathan Corbet
2023-10-13 15:24     ` Vegard Nossum
2023-10-14  9:43       ` Willy Tarreau
2023-10-14 11:48         ` Vegard Nossum
2023-10-14 15:07           ` Willy Tarreau
2023-10-17 16:42           ` Ben Hutchings

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).