[Qemu-devel] Large patch set advice

* [Qemu-devel] Large patch set advice
@ 2018-04-25 19:57 Warner Losh
  2018-04-26  4:18 ` Thomas Huth
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Warner Losh @ 2018-04-25 19:57 UTC (permalink / raw)
  To: qemu-devel

Greetings,

I’ve foolishly volunteered to rebase all the changes that the bad-user mode folks have done to a recent master rev to get these changes upstreamed. A number of people have been working on this for a long time. It’s possible now to run almost any FreeBSD binary from all the architectures. We use it to do ‘native’ builds of tens of thousands of packages in a chroot (so building FreeBSD/arm packages on a FreeBSD amd64 box). The diffs are quite large (on the order of 42k lines), so I anticipate some bumps in moving this stuff upstream.

I’ve read through the Qemu patch submission material and have been contributing to FreeBSD for the last 20 years or so, so I have a notion of proper patch size. I’m half way through rebasing and curating the branch. Before I “finish” I thought I’d ask for some advise. I’m not the primary author of most of this stuff: it’s been accumulating for the past 4 years as different people come and go on the project...

Every Open Source project has its own quirks. Nobody wants the 520 commits on the branch that I started with (too many merged rather than rebases in the last 5 years). The 320 real commits are a mess. I’ve been preening them to get rid of the silly stuff like (back this out, put it back, all the ‘spelling/typo/white space’ fixes). At the end of the rebase, I still wasn’t to ‘the same’ so I had a bunch of ‘true up’ commits to make it the same...

So as with everything that’s developed over time, the early patches no longer are bistable because there’s later patches that fix the old APIs that were current at the start of the project to use newer APIs. That’s true both on the FreeBSD side as well as the Qemu side. Add to the mix that the original code was done by user X, and the fix by user Y (and sometimes Z and W, though the paper trail is unclear and W may just be who reported it).

Oh, and the number of SoBs in the original code is somewhat lacking. A tedious detail I need to chase down independent of all the other stuff.

So, there’s a bunch of changes at first to un-if-def-ify main.c, elfload.c and others to keep the MD code separate from the MI code. There’s a bunch of changes to implement this batch of system calls, or that batch. There’s some patches to implement PowerPC and then more to finish some architectures. Plus the above mentioned API chasing...

My first question seems almost rhetorical: that’s not an acceptable state of affairs, and curation is needed to upstream. Right?

My curation plans:

(1) Find where each of the bits of my ‘true up’ commits at the end are needed and to merge those bits back to those commits (I’m guessing it’s due to conflicts, or false conflict detection in git’s merging logic, since it was 4 hours and there attepts to get through the ‘git rebase -i master’ phase). These aren’t true changes anyway: just my screw up. Most of them are easy to find where they belong.

(2) back merge the API changes as best I can to as early in the stream as possible. One wrinkle here is that there’s a lot of code motion in the early patches to get the MI/MD stuff right, but even in the original commits, it never was very pure at all. And these changes are 2-3k lines long, but completely confined to bsd-user so maybe that’s OK. (I say completely, but sometimes there’s this or that thing added to a Makefile).

(3) Keep the classes of syscall implementation commits separate, but merge back the API and/or related new syscalls that were added. It’s a tradeoff between having 200 diffs that are minnows and 20 diffs that are related schools of fish vs one huge whale that’s just too big.

(4) Deal with the cases where multiple people have worked on a patch by putting SoBs for all of them (or reworking them if I can’t get a hold of the original authors) on the commits that are merged. I didn’t see a specific policy for this, but this seems sane. The alternative seems worse (having elements that don’t compile)

(5) Send them in small batches. I’d go insane trying to manage 200 concurrent code reviews, and I can’t image that I’m unique in this. I also image that fixes in the early part of the series may have ripples to the later parts if my past experience with these things has any relevance.

Thanks for any help you can give me.

Warner Losh
imp@FreeSBD.org

P.S. Yes, I get that we should have been more engaged all along.

^ permalink raw reply	[flat|nested] 12+ messages in thread