All of lore.kernel.org
 help / color / mirror / Atom feed
* rejecting patches that have an offset
@ 2011-08-15 23:16 Eric Blake
       [not found] ` <4E49A8EA.5020507-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2011-08-16 23:22 ` Junio C Hamano
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Blake @ 2011-08-15 23:16 UTC (permalink / raw)
  To: bug-patch-mXXj517/zsQ, Git Mailing List

I ran into a case that cost me several hours today, while building an 
rpm file for libvirt.  I have a context patch that only adds lines (no 
deletions), and which had multiple places in the destination file where 
the patch would match context and still apply, although only one of 
those places will compile as correct.  However, the patch file was 
inadvertently generated by git against the wrong version of the 
destination, so the line numbers in the patch did not match the version 
of the file that I was trying to apply it to, and 'patch -p1 --fuzz=0 
-s' ended up triggering patch's sliding algorithm where it applied the 
patch with an offset of 11 lines.  Meanwhile, running the same patch 
through git applied the patch in a different offset: git found the 
offset that matched the function name in the @@ line, which was more 
than 11 lines away, but actually matched the intent of the patch better.

The problem is that the difference in choice between patch and git 
resulted in a patch series that works or fails according to which tool 
you pass it through.  But the whole point of an rpm file is that if the 
patches were generated correctly, none of them should ever have any 
offset - an rpm should be tool-independent.

It would have saved me a lot of time if both 'patch' and 'git apply' 
could be taught a mode of operation where they explicitly reject a patch 
that cannot be applied without relying on an offset.  That is, 'patch 
--fuzz=0' is too weak, and the fact that 'patch -s' squelched the error 
message meant that I had nothing to alert me to the fact that an offset 
even took place.  And no, I don't want to filterdiff from patchutils to 
convert the patch from context-diff over to ed-script-diff just to 
benefit from the fact that patch does not do offset detection on 
ed-script-patches.

If it were possible to optionally reject patches with offsets, then 
building rpm files could use this mode to insist that all patches apply 
offset-free, making for a more robust patch chain (of course, the 
default should remain that the offset algorithm is still applied, and 
only suppressed by explicit request, as the use of offsets is normally a 
very useful feature - my point is that rpm patch chains are an exception 
for the rule where offsets normally make life easier).

It might also be nice if patch could learn the algorithm that appears to 
match the git behavior, where when there are multiple points with 
identical context (viewing just the context in isolation), but where 
those locations differ in function location (as learned by the @@ header 
line in the patch file), then the preferred offset is the one in the 
named function, even if that is not the closes context match to the line 
number given in the patch file.

-- 
Eric Blake   eblake-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: rejecting patches that have an offset
       [not found] ` <4E49A8EA.5020507-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-08-16 22:48   ` Andreas Gruenbacher
  2011-08-16 23:10     ` [bug-patch] " Eric Blake
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Gruenbacher @ 2011-08-16 22:48 UTC (permalink / raw)
  To: Eric Blake; +Cc: bug-patch-mXXj517/zsQ, Git Mailing List

Eric,

On Mon, 2011-08-15 at 17:16 -0600, Eric Blake wrote:
> It would have saved me a lot of time if both 'patch' and 'git apply' 
> could be taught a mode of operation where they explicitly reject a patch 
> that cannot be applied without relying on an offset.

that sounds reasonable.  Can you send a patch or at least add a bug on
Savannah?

> It might also be nice if patch could learn the algorithm that appears to 
> match the git behavior, where when there are multiple points with 
> identical context (viewing just the context in isolation), but where 
> those locations differ in function location (as learned by the @@ header 
> line in the patch file), then the preferred offset is the one in the 
> named function, even if that is not the closes context match to the line 
> number given in the patch file.

Sounds interesting; a patch for that would be great as well.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [bug-patch] rejecting patches that have an offset
  2011-08-16 22:48   ` Andreas Gruenbacher
@ 2011-08-16 23:10     ` Eric Blake
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2011-08-16 23:10 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: bug-patch, Git Mailing List

On 08/16/2011 04:48 PM, Andreas Gruenbacher wrote:
> Eric,
>
> On Mon, 2011-08-15 at 17:16 -0600, Eric Blake wrote:
>> It would have saved me a lot of time if both 'patch' and 'git apply'
>> could be taught a mode of operation where they explicitly reject a patch
>> that cannot be applied without relying on an offset.
>
> that sounds reasonable.  Can you send a patch or at least add a bug on
> Savannah?

Bug opened: https://savannah.gnu.org/bugs/index.php?34031

>
>> It might also be nice if patch could learn the algorithm that appears to
>> match the git behavior, where when there are multiple points with
>> identical context (viewing just the context in isolation), but where
>> those locations differ in function location (as learned by the @@ header
>> line in the patch file), then the preferred offset is the one in the
>> named function, even if that is not the closes context match to the line
>> number given in the patch file.
>
> Sounds interesting; a patch for that would be great as well.

Bug opened: https://savannah.gnu.org/bugs/index.php?34032

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: rejecting patches that have an offset
  2011-08-15 23:16 rejecting patches that have an offset Eric Blake
       [not found] ` <4E49A8EA.5020507-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2011-08-16 23:22 ` Junio C Hamano
       [not found]   ` <7vobzpeybh.fsf-s2KvWo2KEQL18tm6hw+yZpy9Z0UEorGK@public.gmane.org>
  1 sibling, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2011-08-16 23:22 UTC (permalink / raw)
  To: Eric Blake; +Cc: bug-patch, Git Mailing List

Eric Blake <eblake@redhat.com> writes:

> It would have saved me a lot of time if both 'patch' and 'git apply'
> could be taught a mode of operation where they explicitly reject a
> patch that cannot be applied without relying on an offset.

I am not sure about this. I somehow doubt you would want to make sure that
the preimage your patch is to be applied must be bit-for-bit identical to
what you prepared your patch for, IOW, you are using a patchfile merely as
a mean to "compress" the replacement file. You would want your RPM change
to tolerate some changes in the upstream and keep your patch applicable to
the next version of the upstream, no?

Given a patch that is not precise and can apply to multiple places,
"patch" and/or "git apply" can apply it to a place you may not have
intended. It may feel like a bug if that happens to a preimage that is
bit-for-bit identical to the version you prepared your patch is against,
but I would rather think, instead of blaming "patch" and/or "git apply",
it would be more productive to prepare a patch with larger context when
you know that the preimage file you are patching has many similar looking
lines, to make it _impossible_ for it to apply to places different from
what you intended.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: rejecting patches that have an offset
       [not found]   ` <7vobzpeybh.fsf-s2KvWo2KEQL18tm6hw+yZpy9Z0UEorGK@public.gmane.org>
@ 2011-08-16 23:41     ` Eric Blake
  0 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2011-08-16 23:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: bug-patch-mXXj517/zsQ, Git Mailing List

On 08/16/2011 05:22 PM, Junio C Hamano wrote:
> Eric Blake<eblake-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>  writes:
>
>> It would have saved me a lot of time if both 'patch' and 'git apply'
>> could be taught a mode of operation where they explicitly reject a
>> patch that cannot be applied without relying on an offset.
>
> I am not sure about this. I somehow doubt you would want to make sure that
> the preimage your patch is to be applied must be bit-for-bit identical to
> what you prepared your patch for, IOW, you are using a patchfile merely as
> a mean to "compress" the replacement file. You would want your RPM change
> to tolerate some changes in the upstream and keep your patch applicable to
> the next version of the upstream, no?

When the RPM file is generated by git->patchfile list conversion, and I 
am trying to recreate a git repository from patchfile list->git, then 
yes, I _do_ want that patchfile list to apply bit-for-bit identical to 
anyone else starting from the same point, whether they use git or patch, 
so that anyone else can regenerate the end sources that were compiled 
into the rpm release.

Remember, the rpm file format includes both the starting point (it 
documents the upstream tarball) and the changes to that starting point 
(a patch stream) that were used to create a given released binary, in a 
format that is independent of git.  The idea is that managing an rpm 
patch series in git is much nicer for day-to-day work (and daily work 
within that git repository greatly benefits from the default of being 
able to assume patch offsets, such as rebasing a patch series to apply 
on top of newer upstream versions), but once converting from git out to 
rpm, the conversion from rpm back to git should give a bit-for-bit 
replay.  If heuristics for how to apply patch offsets change, and an rpm 
file includes an ambiguous patch that requires an offset, then you risk 
the rpm file being broken the moment you upgrade to a newer tool chain 
with a slightly different heuristic for where to resolve offsets; but if 
all patches in the series are 0-offset, then you have isolated the rpm 
from any implicit dependency on the version of the tool used to 
reconstruct the final software from the patch series.  So the question 
is now how to identify whether a patch series meets that 0-offset rule, 
and that's where a new option would be handy.

Hence, I'm requesting an option to reject patches with non-zero offsets, 
but not making it default, as there are only a few limited places (such 
as rebuilding a git repo starting from an rpm patch list) where 
bit-for-bit rebuild is more desirable than accounting for offsets due to 
changes in the starting point.

>
> Given a patch that is not precise and can apply to multiple places,
> "patch" and/or "git apply" can apply it to a place you may not have
> intended. It may feel like a bug if that happens to a preimage that is
> bit-for-bit identical to the version you prepared your patch is against,
> but I would rather think, instead of blaming "patch" and/or "git apply",
> it would be more productive to prepare a patch with larger context when
> you know that the preimage file you are patching has many similar looking
> lines, to make it _impossible_ for it to apply to places different from
> what you intended.

Yes, I know that as well - the particular patch that sparked this thread 
was ambiguous with three lines of context, but unambiguous with 6 lines, 
even when an offset still had to be applied.

So maybe you raise yet another feature proposal: What would it take for 
git to generate unambiguous patches - that is, when generating a patch 
with context, to then ensure that given the file it is being applied to, 
then context is auto-widened until there are no other offsets where the 
patch can possibly be applied?  In other words, if I say 'git diff HEAD^ 
--auto-context', then the resulting patch would have automatically have 
6 context lines for my problematic hunk, while sticking to the default 3 
context lines for other hunks that were already unambiguous.  Of course, 
this only protects you if starting from the same version of the file 
(since any other patch can introduce an ambiguity not present at the 
time you computed the minimal context needed for non-ambiguity in your 
version of the pre-patch file).

-- 
Eric Blake   eblake-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-08-16 23:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-15 23:16 rejecting patches that have an offset Eric Blake
     [not found] ` <4E49A8EA.5020507-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2011-08-16 22:48   ` Andreas Gruenbacher
2011-08-16 23:10     ` [bug-patch] " Eric Blake
2011-08-16 23:22 ` Junio C Hamano
     [not found]   ` <7vobzpeybh.fsf-s2KvWo2KEQL18tm6hw+yZpy9Z0UEorGK@public.gmane.org>
2011-08-16 23:41     ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.