linux-xfs.vger.kernel.org archive mirror
* [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
@ 2020-01-31  5:25 Darrick J. Wong
  2020-01-31  7:30 ` [Lsf-pc] " Amir Goldstein
  2020-02-07 22:03 ` Matthew Wilcox
  0 siblings, 2 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-01-31  5:25 UTC
  To: lsf-pc; +Cc: linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

Hi everyone,

I would like to discuss how to improve the process of shepherding code
into the kernel to make it more enjoyable for maintainers, reviewers,
and code authors.  Here is a brief summary of how we got here:

Years ago, XFS had one maintainer tending to all four key git repos
(kernel, userspace, documentation, testing).  Like most subsystems, the
maintainer did a lot of review and porting code between the kernel and
userspace, though with help from others.

It turns out that this didn't scale very well, so we split the
responsibilities into three maintainers.  Like most subsystems, the
maintainers still did a lot of review and porting work, though with help
from others.

It turns out that this system doesn't scale very well either.  Even with
three maintainers sharing access to the git trees and working together
to get reviews done, mailing list traffic has been trending upwards for
years, and we still can't keep up.  I fear that many maintainers are
burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
testing of the git trees, but keeping up with the mail and the reviews.

So what do we do about this?  I think we (the XFS project, anyway)
should increase the amount of organizing in our review process.  For
large patchsets, I would like to improve informal communication about
who the author might like to have conduct a review, who might be
interested in conducting a review, estimates of how much time a reviewer
has to spend on a patchset, and of course, feedback about how it went.
This of course is to lay the groundwork for making a case to our bosses
for growing our community, allocating time for reviews and for growing
our skills as reviewers.

---

I want to spend the time between right now and whenever this discussion
happens to make a list of everything that works and that could be made
better about our development process.

I want to spend five minutes at the start of the discussion to
acknowledge everyone's feelings around that list that we will have
compiled.

Then I want to spend the rest of the session breaking up the problems
into small enough pieces to solve, discussing solutions to those
problems, and (ideally) pushing towards a consensus on what series of
small adjustments we can make to arrive at something that works better
for everyone.

--D


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-01-31  5:25 [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale Darrick J. Wong
@ 2020-01-31  7:30 ` Amir Goldstein
  2020-02-01  3:20   ` Allison Collins
  2020-02-12 22:42   ` Darrick J. Wong
  2020-02-07 22:03 ` Matthew Wilcox
  1 sibling, 2 replies; 27+ messages in thread
From: Amir Goldstein @ 2020-01-31  7:30 UTC
  To: Darrick J. Wong; +Cc: lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> Hi everyone,
>
> I would like to discuss how to improve the process of shepherding code
> into the kernel to make it more enjoyable for maintainers, reviewers,
> and code authors.  Here is a brief summary of how we got here:
>
> Years ago, XFS had one maintainer tending to all four key git repos
> (kernel, userspace, documentation, testing).  Like most subsystems, the
> maintainer did a lot of review and porting code between the kernel and
> userspace, though with help from others.
>
> It turns out that this didn't scale very well, so we split the
> responsibilities into three maintainers.  Like most subsystems, the
> maintainers still did a lot of review and porting work, though with help
> from others.
>
> It turns out that this system doesn't scale very well either.  Even with
> three maintainers sharing access to the git trees and working together
> to get reviews done, mailing list traffic has been trending upwards for
> years, and we still can't keep up.  I fear that many maintainers are
> burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> testing of the git trees, but keeping up with the mail and the reviews.
>
> So what do we do about this?  I think we (the XFS project, anyway)
> should increase the amount of organizing in our review process.  For
> large patchsets, I would like to improve informal communication about
> who the author might like to have conduct a review, who might be
> interested in conducting a review, estimates of how much time a reviewer
> has to spend on a patchset, and of course, feedback about how it went.
> This of course is to lay the groundwork for making a case to our bosses
> for growing our community, allocating time for reviews and for growing
> our skills as reviewers.
>

Interesting.

Eryu usually posts a weekly status of the xfstests review queue, often with
a call for reviewers, sometimes with specific patch series mentioned.
That helps me as a developer to monitor the status of my own work,
and it helps me as a reviewer to put my efforts where the maintainer
needs me the most.

For xfs kernel patches, I can represent the voice of "new blood".
There is quite a high barrier to getting new people to join the review
effort.  I have taken a few stabs at reviewing xfs patch series over
the years, but it mostly ends up feeling like it helped me (get to know
xfs code better) more than it helped the maintainer: the chances of a
new reviewer catching meaningful bugs are very low, and if an
experienced reviewer is going to go over the same patch series anyway,
the chances of the new reviewer catching bugs that the experienced
reviewer misses are extremely low.

However, there are quite a few cleanup and refactoring patch series,
especially on the xfs list, where a review from an "outsider" could still
be of value to the xfs community. OTOH, for the xfs maintainer, those are
the easy patches to review, so is there a gain in offloading those reviews?

Bottom line - a report on the subsystem review queue status that calls
for reviewers and highlights specific areas in need of review is a good idea.
Developers responding to that report publicly with their availability,
intention, and expected time frame for taking on a review would be helpful
for both maintainers and potential reviewers.

Thanks,
Amir.

> ---
>
> I want to spend the time between right now and whenever this discussion
> happens to make a list of everything that works and that could be made
> better about our development process.
>
> I want to spend five minutes at the start of the discussion to
> acknowledge everyone's feelings around that list that we will have
> compiled.
>
> Then I want to spend the rest of the session breaking up the problems
> into small enough pieces to solve, discussing solutions to those
> problems, and (ideally) pushing towards a consensus on what series of
> small adjustments we can make to arrive at something that works better
> for everyone.
>
> --D
> _______________________________________________
> Lsf-pc mailing list
> Lsf-pc@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/lsf-pc


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-01-31  7:30 ` [Lsf-pc] " Amir Goldstein
@ 2020-02-01  3:20   ` Allison Collins
  2020-02-02 21:46     ` Dave Chinner
  2020-02-12 22:42   ` Darrick J. Wong
  1 sibling, 1 reply; 27+ messages in thread
From: Allison Collins @ 2020-02-01  3:20 UTC
  To: Amir Goldstein, Darrick J. Wong
  Cc: lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen



On 1/31/20 12:30 AM, Amir Goldstein wrote:
> On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>>
>> Hi everyone,
>>
>> I would like to discuss how to improve the process of shepherding code
>> into the kernel to make it more enjoyable for maintainers, reviewers,
>> and code authors.  Here is a brief summary of how we got here:
>>
>> Years ago, XFS had one maintainer tending to all four key git repos
>> (kernel, userspace, documentation, testing).  Like most subsystems, the
>> maintainer did a lot of review and porting code between the kernel and
>> userspace, though with help from others.
>>
>> It turns out that this didn't scale very well, so we split the
>> responsibilities into three maintainers.  Like most subsystems, the
>> maintainers still did a lot of review and porting work, though with help
>> from others.
>>
>> It turns out that this system doesn't scale very well either.  Even with
>> three maintainers sharing access to the git trees and working together
>> to get reviews done, mailing list traffic has been trending upwards for
>> years, and we still can't keep up.  I fear that many maintainers are
>> burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
>> testing of the git trees, but keeping up with the mail and the reviews.
>>
>> So what do we do about this?  I think we (the XFS project, anyway)
>> should increase the amount of organizing in our review process.  For
>> large patchsets, I would like to improve informal communication about
>> who the author might like to have conduct a review, who might be
>> interested in conducting a review, estimates of how much time a reviewer
>> has to spend on a patchset, and of course, feedback about how it went.
>> This of course is to lay the groundwork for making a case to our bosses
>> for growing our community, allocating time for reviews and for growing
>> our skills as reviewers.
>>
> 
> Interesting.
> 
> Eryu usually posts a weekly status of the xfstests review queue, often with
> a call for reviewers, sometimes with specific patch series mentioned.
> That helps me as a developer to monitor the status of my own work,
> and it helps me as a reviewer to put my efforts where the maintainer
> needs me the most.
> 
> For xfs kernel patches, I can represent the voice of "new blood".
> There is quite a high barrier to getting new people to join the review
> effort.  I have taken a few stabs at reviewing xfs patch series over
> the years, but it mostly ends up feeling like it helped me (get to know
> xfs code better) more than it helped the maintainer: the chances of a
> new reviewer catching meaningful bugs are very low, and if an
> experienced reviewer is going to go over the same patch series anyway,
> the chances of the new reviewer catching bugs that the experienced
> reviewer misses are extremely low.
That sounds like a familiar experience.  Lots of times I'll start a
review, but then someone else will finish it before I do, and catch more
things along the way.  So I sort of feel that if it's not something I
can get through quickly, it's not a very good distribution of work
effort and I should shift to something else.  Most of the time, I'll
study a patchset until I feel I understand what the author is trying to
do, and I might catch things that don't seem to align with that goal,
but I don't necessarily feel I can declare the code free of all
unforeseen bugs.

> 
> However, there are quite a few cleanup and refactoring patch series,
> especially on the xfs list, where a review from an "outsider" could still
> be of value to the xfs community. OTOH, for the xfs maintainer, those are
> the easy patches to review, so is there a gain in offloading those reviews?
> 
> Bottom line - a report on the subsystem review queue status that calls
> for reviewers and highlights specific areas in need of review is a good idea.
> Developers responding to that report publicly with their availability,
> intention, and expected time frame for taking on a review would be helpful
> for both maintainers and potential reviewers.
I definitely think that would help delegate review efforts a little 
more.  That way it's clear what people are working on, and what still 
needs attention.

Allison
> 
> Thanks,
> Amir.
> 
>> ---
>>
>> I want to spend the time between right now and whenever this discussion
>> happens to make a list of everything that works and that could be made
>> better about our development process.
>>
>> I want to spend five minutes at the start of the discussion to
>> acknowledge everyone's feelings around that list that we will have
>> compiled.
>>
>> Then I want to spend the rest of the session breaking up the problems
>> into small enough pieces to solve, discussing solutions to those
>> problems, and (ideally) pushing towards a consensus on what series of
>> small adjustments we can make to arrive at something that works better
>> for everyone.
>>
>> --D
>> _______________________________________________
>> Lsf-pc mailing list
>> Lsf-pc@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/lsf-pc


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-01  3:20   ` Allison Collins
@ 2020-02-02 21:46     ` Dave Chinner
  2020-02-09 17:12       ` Allison Collins
  2020-02-12 21:36       ` Darrick J. Wong
  0 siblings, 2 replies; 27+ messages in thread
From: Dave Chinner @ 2020-02-02 21:46 UTC
  To: Allison Collins
  Cc: Amir Goldstein, Darrick J. Wong, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen

On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> 
> 
> On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > 
> > > Hi everyone,
> > > 
> > > I would like to discuss how to improve the process of shepherding code
> > > into the kernel to make it more enjoyable for maintainers, reviewers,
> > > and code authors.  Here is a brief summary of how we got here:
> > > 
> > > Years ago, XFS had one maintainer tending to all four key git repos
> > > (kernel, userspace, documentation, testing).  Like most subsystems, the
> > > maintainer did a lot of review and porting code between the kernel and
> > > userspace, though with help from others.
> > > 
> > > It turns out that this didn't scale very well, so we split the
> > > responsibilities into three maintainers.  Like most subsystems, the
> > > maintainers still did a lot of review and porting work, though with help
> > > from others.
> > > 
> > > It turns out that this system doesn't scale very well either.  Even with
> > > three maintainers sharing access to the git trees and working together
> > > to get reviews done, mailing list traffic has been trending upwards for
> > > years, and we still can't keep up.  I fear that many maintainers are
> > > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > > testing of the git trees, but keeping up with the mail and the reviews.
> > > 
> > > So what do we do about this?  I think we (the XFS project, anyway)
> > > should increase the amount of organizing in our review process.  For
> > > large patchsets, I would like to improve informal communication about
> > > who the author might like to have conduct a review, who might be
> > > interested in conducting a review, estimates of how much time a reviewer
> > > has to spend on a patchset, and of course, feedback about how it went.
> > > This of course is to lay the groundwork for making a case to our bosses
> > > for growing our community, allocating time for reviews and for growing
> > > our skills as reviewers.
> > > 
> > 
> > Interesting.
> > 
> > Eryu usually posts a weekly status of the xfstests review queue, often with
> > a call for reviewers, sometimes with specific patch series mentioned.
> > That helps me as a developer to monitor the status of my own work,
> > and it helps me as a reviewer to put my efforts where the maintainer
> > needs me the most.
> > 
> > For xfs kernel patches, I can represent the voice of "new blood".
> > There is quite a high barrier to getting new people to join the review
> > effort.  I have taken a few stabs at reviewing xfs patch series over
> > the years, but it mostly ends up feeling like it helped me (get to know
> > xfs code better) more than it helped the maintainer: the chances of a
> > new reviewer catching meaningful bugs are very low, and if an
> > experienced reviewer is going to go over the same patch series anyway,
> > the chances of the new reviewer catching bugs that the experienced
> > reviewer misses are extremely low.
> That sounds like a familiar experience.  Lots of times I'll start a review,
> but then someone else will finish it before I do, and catch more things
> along the way.  So I sort of feel that if it's not something I can get
> through quickly, it's not a very good distribution of work effort and I
> should shift to something else.  Most of the time, I'll study a patchset
> until I feel I understand what the author is trying to do, and I might
> catch things that don't seem to align with that goal, but I don't
> necessarily feel I can declare the code free of all unforeseen bugs.

I think you are both underselling yourselves. Imposter syndrome and
all that jazz.

The reality is that we don't need more people doing the sorts of
"how does this work with the rest of XFS" reviews that people like
Darrick or Christoph do. What we really need is more people looking
at whether loops are correctly terminated, the right variable types
are used, we don't have signed vs unsigned issues, 32 bit overflows,
use the right 32/64 bit division functions, the error handling logic
is correct, etc.

It's those sorts of little details that lead to most bugs, and
that's precisely the sort of thing that is typically missed by an
experienced developer doing a "is this the best possible
implementation of this functionality" review.

A recent personal example: look at the review of Matthew Wilcox's
->readahead() series that I recently did. I noticed problems in the
core change and the erofs and btrfs implementations not because I
knew anything about those filesystems, but because I was checking
whether the new loops iterated the pages in the page cache
correctly. i.e. all I was really looking at was variable counting
and loop initialisation and termination conditions. Experience tells
me this stuff is notoriously difficult to get right, so that's what
I looked at....
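
To make that concrete, here's a contrived fragment (not from any real
patch) showing the buggy and fixed versions of the sort of loop such a
detail-focused review compares:

#include <stdint.h>

/*
 * BUGGY: if nr is 0, the unsigned "nr - 1" wraps to UINT_MAX and the
 * loop walks off the end of the array; the subtraction also happens
 * in 32 bits, so it can overflow before being widened to 64 bits.
 */
static int64_t sum_deltas_buggy(const int32_t *v, unsigned int nr)
{
	int64_t sum = 0;
	unsigned int i;

	for (i = 0; i < nr - 1; i++)
		sum += v[i + 1] - v[i];
	return sum;
}

/*
 * FIXED: "i + 1 < nr" cannot wrap, and widening before subtracting
 * avoids the 32-bit overflow.
 */
static int64_t sum_deltas_fixed(const int32_t *v, unsigned int nr)
{
	int64_t sum = 0;
	unsigned int i;

	for (i = 0; i + 1 < nr; i++)
		sum += (int64_t)v[i + 1] - v[i];
	return sum;
}

Spotting either bug requires no subsystem knowledge at all - it's
purely the "is the loop termination right, is the arithmetic width
right" pass.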

IOWs, you don't need to know anything about the subsystem to
perform such a useful review, and a lot of the time you won't find a
problem. But it's still a very useful review to perform, and in
doing so you've validated, to the best of your ability, that the
change is sound. Put simply:

	"I've checked <all these things> and it looks good to me.

	Reviewed-by: Joe Bloggs <joe@blogg.com>"

This is a very useful, valid review, regardless of whether you find
anything. It's also a method of review that you can use when you
have limited time - rather than trying to check everything and
spending hours on a patchset, pick one thing and get the entire
review done in 15 minutes. Then do the same thing for the next patch
set. You'll be surprised how many things you notice that aren't what
you are looking for when you do this.

Hence the fact that other people find (different) issues is
irrelevant - they'll be looking at different things to you, and
there may not even be any overlap in the focus/scope of the reviews
that have been performed. You may find the same things, but that is
also not a bad thing - I intentionally don't read other reviews
before I review a patch series, so that I don't taint my view of the
code before I look at it (e.g., darrick found a bug in this code, so
I don't need to look at it...).

IOWs, if you are starting from the premise that "I don't know this
code well enough to perform a useful review" then you are setting
yourself up for failure right at the start. Read the series
description, think about the change being made, use your experience
to answer the question "what's a mistake I could make performing
this change". Then go looking for that mistake through the
patch(es). In the process of performing this review, more than
likely, you'll notice bugs other than what you are actually looking
for...

This does not require any deep subsystem specific knowledge, but in
doing this sort of review you're going to notice things and learn
about the code and slowly build your knowledge and experience about
that subsystem.

> > However, there are quite a few cleanup and refactoring patch series,
> > especially on the xfs list, where a review from an "outsider" could still
> > be of value to the xfs community. OTOH, for the xfs maintainer, those are
> > the easy patches to review, so is there a gain in offloading those reviews?
> > 
> > Bottom line - a report on the subsystem review queue status that calls
> > for reviewers and highlights specific areas in need of review is a good idea.
> > Developers responding to that report publicly with their availability,
> > intention, and expected time frame for taking on a review would be helpful
> > for both maintainers and potential reviewers.
> I definitely think that would help delegate review efforts a little more.
> That way it's clear what people are working on, and what still needs
> attention.

It is not the maintainer's responsibility to gather reviewers. That
is entirely the responsibility of the patch submitter. That is, if
the code has gone unreviewed, it is up to the submitter to find
people to review the code, not the maintainer. If you, as a
developer, are unable to find people willing to review your code
then it's a sign you haven't been reviewing enough code yourself.

Good reviewers are a valuable resource - as a developer I rely on
reviewers to get my code merged, so if I don't review code and
everyone else behaves the same way, how can I possibly get my code
merged? IOWs, review is something every developer should be spending
a significant chunk of their time on. IMO, if you are not spending
*at least* a whole day a week reviewing code, you're not actually
doing enough code review to allow other developers to be as
productive as you are.

The more you review other people's code, the more you learn about
the code and the more likely other people will be to review your
code because they know you'll review their code in turn.  It's a
positive reinforcement cycle that benefits both the individual
developers personally and the wider community.

But this positive reinforcement cycle just doesn't happen if people
avoid reviewing code because they think "I don't know anything so my
review is not going to be worth anything".  Constructive review, no
matter whether it's performed at a simple or complex level, is
always valuable.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-01-31  5:25 [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale Darrick J. Wong
  2020-01-31  7:30 ` [Lsf-pc] " Amir Goldstein
@ 2020-02-07 22:03 ` Matthew Wilcox
  2020-02-12  3:51   ` Theodore Y. Ts'o
  2020-02-12 22:21   ` Darrick J. Wong
  1 sibling, 2 replies; 27+ messages in thread
From: Matthew Wilcox @ 2020-02-07 22:03 UTC
  To: Darrick J. Wong; +Cc: lsf-pc, linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

On Thu, Jan 30, 2020 at 09:25:20PM -0800, Darrick J. Wong wrote:
> It turns out that this system doesn't scale very well either.  Even with
> three maintainers sharing access to the git trees and working together
> to get reviews done, mailing list traffic has been trending upwards for
> years, and we still can't keep up.  I fear that many maintainers are
> burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> testing of the git trees, but keeping up with the mail and the reviews.

I think the LSFMMBPF conference is part of the problem.  With the best of
intentions, we have set up a system which serves to keep all but the most
dedicated from having a voice at the premier conference for filesystems,
memory management, storage (and now networking).  It wasn't intended to
be that way, but that's what has happened, and it isn't serving us well
as a result.

Three anecdotes.  First, look at Jason's mail from earlier today:
https://lore.kernel.org/linux-mm/20200207194620.GG8731@bombadil.infradead.org/T/#t

There are 11 people on that list, plus Jason, plus three more that I
recommended.  That's 15, just for that one topic.  I think maybe half
of those people will get an invite anyway, but adding on an extra 5-10
people for (what I think is) a critically important topic at the very
nexus of storage, filesystems, memory management, networking and graphics
is almost certainly out of bounds for the scale of the current conference.

Second, I've had Outreachy students who have made meaningful contributions
to the kernel.  Part of their bursary is a travel grant to go to a
conference and they were excited to come to LSFMM.  I've had to tell
them "this conference is invite-only for the top maintainers; you can't
come".  They ended up going to an Open Source Summit conference instead.
By excluding the people who are starting out, we are failing to grow
our community.  I don't think it would have hurt for them to be in the
room; they were unlikely to speak, and perhaps they would have gone on
to make larger contributions.

Third, I hear from people who work on a specific filesystem "Of the
twenty or so slots for the FS part of the conference, there are about
half a dozen generic filesystem people who'll get an invite, then maybe
six filesystems who'll get two slots each, but what we really want to
do is get everybody working on this filesystem in a room and go over
our particular problem areas".

This kills me because LSFMM has been such a critically important part of
Linux development for over a decade, but I think at this point it is at
least not serving us the way we want it to, and may even be doing more
harm than good.  I think it needs to change, and more people need to
be welcomed to the conference.  Maybe it needs to not be invite-only.
Maybe it can stay invite-only, but be twice as large.  Maybe everybody
who's coming needs to front $100 to put towards the costs of a larger
meeting space with more rooms.

Not achievable for this year, I'm sure, but if we start talking now
maybe we can have a better conference in 2021.


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-02 21:46     ` Dave Chinner
@ 2020-02-09 17:12       ` Allison Collins
  2020-02-12  0:21         ` NeilBrown
                           ` (2 more replies)
  2020-02-12 21:36       ` Darrick J. Wong
  1 sibling, 3 replies; 27+ messages in thread
From: Allison Collins @ 2020-02-09 17:12 UTC
  To: Dave Chinner
  Cc: Amir Goldstein, Darrick J. Wong, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen



On 2/2/20 2:46 PM, Dave Chinner wrote:
> On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
>>
>>
>> On 1/31/20 12:30 AM, Amir Goldstein wrote:
>>> On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I would like to discuss how to improve the process of shepherding code
>>>> into the kernel to make it more enjoyable for maintainers, reviewers,
>>>> and code authors.  Here is a brief summary of how we got here:
>>>>
>>>> Years ago, XFS had one maintainer tending to all four key git repos
>>>> (kernel, userspace, documentation, testing).  Like most subsystems, the
>>>> maintainer did a lot of review and porting code between the kernel and
>>>> userspace, though with help from others.
>>>>
>>>> It turns out that this didn't scale very well, so we split the
>>>> responsibilities into three maintainers.  Like most subsystems, the
>>>> maintainers still did a lot of review and porting work, though with help
>>>> from others.
>>>>
>>>> It turns out that this system doesn't scale very well either.  Even with
>>>> three maintainers sharing access to the git trees and working together
>>>> to get reviews done, mailing list traffic has been trending upwards for
>>>> years, and we still can't keep up.  I fear that many maintainers are
>>>> burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
>>>> testing of the git trees, but keeping up with the mail and the reviews.
>>>>
>>>> So what do we do about this?  I think we (the XFS project, anyway)
>>>> should increase the amount of organizing in our review process.  For
>>>> large patchsets, I would like to improve informal communication about
>>>> who the author might like to have conduct a review, who might be
>>>> interested in conducting a review, estimates of how much time a reviewer
>>>> has to spend on a patchset, and of course, feedback about how it went.
>>>> This of course is to lay the groundwork for making a case to our bosses
>>>> for growing our community, allocating time for reviews and for growing
>>>> our skills as reviewers.
>>>>
>>>
>>> Interesting.
>>>
>>> Eryu usually posts a weekly status of the xfstests review queue, often with
>>> a call for reviewers, sometimes with specific patch series mentioned.
>>> That helps me as a developer to monitor the status of my own work,
>>> and it helps me as a reviewer to put my efforts where the maintainer
>>> needs me the most.
>>>
>>> For xfs kernel patches, I can represent the voice of "new blood".
>>> There is quite a high barrier to getting new people to join the review
>>> effort.  I have taken a few stabs at reviewing xfs patch series over
>>> the years, but it mostly ends up feeling like it helped me (get to know
>>> xfs code better) more than it helped the maintainer: the chances of a
>>> new reviewer catching meaningful bugs are very low, and if an
>>> experienced reviewer is going to go over the same patch series anyway,
>>> the chances of the new reviewer catching bugs that the experienced
>>> reviewer misses are extremely low.
>> That sounds like a familiar experience.  Lots of times I'll start a review,
>> but then someone else will finish it before I do, and catch more things
>> along the way.  So I sort of feel that if it's not something I can get
>> through quickly, it's not a very good distribution of work effort and I
>> should shift to something else.  Most of the time, I'll study a patchset
>> until I feel I understand what the author is trying to do, and I might
>> catch things that don't seem to align with that goal, but I don't
>> necessarily feel I can declare the code free of all unforeseen bugs.
> 
> I think you are both underselling yourselves. Imposter syndrome and
> all that jazz.
> 
> The reality is that we don't need more people doing the sorts of
> "how does this work with the rest of XFS" reviews that people like
> Darrick or Christoph do. What we really need is more people looking
> at whether loops are correctly terminated, the right variable types
> are used, we don't have signed vs unsigned issues, 32 bit overflows,
> use the right 32/64 bit division functions, the error handling logic
> is correct, etc.
> 
> It's those sorts of little details that lead to most bugs, and
> that's precisely the sort of thing that is typically missed by an
> experienced developer doing a "is this the best possible
> implementation of this functionality" review.
> 
> A recent personal example: look at the review of Matthew Wilcox's
> ->readahead() series that I recently did. I noticed problems in the
> core change and the erofs and btrfs implementations not because I
> knew anything about those filesystems, but because I was checking
> whether the new loops iterated the pages in the page cache
> correctly. i.e. all I was really looking at was variable counting
> and loop initialisation and termination conditions. Experience tells
> me this stuff is notoriously difficult to get right, so that's what
> I looked at....
> 
> IOWs, you don't need to know anything about the subsystem to
> perform such a useful review, and a lot of the time you won't find a
> problem. But it's still a very useful review to perform, and in
> doing so you've validated, to the best of your ability, that the
> change is sound. Put simply:
> 
> 	"I've checked <all these things> and it looks good to me.
> 
> 	Reviewed-by: Joe Bloggs <joe@blogg.com>"
> 
> This is a very useful, valid review, regardless of whether you find
> anything. It's also a method of review that you can use when you
> have limited time - rather than trying to check everything and
> spending hours on a patchset, pick one thing and get the entire
> review done in 15 minutes. Then do the same thing for the next patch
> set. You'll be surprised how many things you notice that aren't what
> you are looking for when you do this.
> 
> Hence the fact that other people find (different) issues is
> irrelevant - they'll be looking at different things to you, and
> there may not even be any overlap in the focus/scope of the reviews
> that have been performed. You may find the same things, but that is
> also not a bad thing - I intentionally don't read other reviews
> before I review a patch series, so that I don't taint my view of the
> code before I look at it (e.g., darrick found a bug in this code, so
> I don't need to look at it...).
> 
> IOWs, if you are starting from the premise that "I don't know this
> code well enough to perform a useful review" then you are setting
> yourself up for failure right at the start. Read the series
> description, think about the change being made, use your experience
> to answer the question "what's a mistake I could make performing
> this change". Then go looking for that mistake through the
> patch(es). In the process of performing this review, more than
> likely, you'll notice bugs other than what you are actually looking
> for...
> 
> This does not require any deep subsystem specific knowledge, but in
> doing this sort of review you're going to notice things and learn
> about the code and slowly build your knowledge and experience about
> that subsystem.
> 
>>> However, there are quite a few cleanup and refactoring patch series,
>>> especially on the xfs list, where a review from an "outsider" could still
>>> be of value to the xfs community. OTOH, for the xfs maintainer, those are
>>> the easy patches to review, so is there a gain in offloading those reviews?
>>>
>>> Bottom line - a report on the subsystem review queue status that calls
>>> for reviewers and highlights specific areas in need of review is a good idea.
>>> Developers responding to that report publicly with their availability,
>>> intention, and expected time frame for taking on a review would be helpful
>>> for both maintainers and potential reviewers.
>> I definitely think that would help delegate review efforts a little more.
>> That way it's clear what people are working on, and what still needs
>> attention.
> 
> It is not the maintainer's responsibility to gather reviewers. That
> is entirely the responsibility of the patch submitter. That is, if
> the code has gone unreviewed, it is up to the submitter to find
> people to review the code, not the maintainer. If you, as a
> developer, are unable to find people willing to review your code
> then it's a sign you haven't been reviewing enough code yourself.
> 
> Good reviewers are a valuable resource - as a developer I rely on
> reviewers to get my code merged, so if I don't review code and
> everyone else behaves the same way, how can I possibly get my code
> merged? IOWs, review is something every developer should be spending
> a significant chunk of their time on. IMO, if you are not spending
> *at least* a whole day a week reviewing code, you're not actually
> doing enough code review to allow other developers to be as
> productive as you are.
> 
> The more you review other people's code, the more you learn about
> the code and the more likely other people will be to review your
> code because they know you'll review their code in turn.  It's a
> positive reinforcement cycle that benefits both the individual
> developers personally and the wider community.
> 
> But this positive reinforcement cycle just doesn't happen if people
> avoid reviewing code because they think "I don't know anything so my
> review is not going to be worth anything".  Constructive review, no
> matter whether it's performed at a simple or complex level, is
> always valuable.
> 
> Cheers,
> 
> Dave.
> 
Well, I can see the response is meant to be encouraging, and you are 
right that everyone needs to give to receive :-)

I have thought a lot about this, and I do have some opinions about how
the process is described to work vs how it ends up working, though.
There have been quite a few times when I got conflicting reviews from
multiple reviewers, I suspect either because reviewers are not seeing
each other's reviews, or because it is difficult for people to recall
or even find discussions on prior revisions.  And so at times, I find
myself puzzling a bit, trying to extrapolate what the community as a
whole really wants.

For example: a reviewer may propose a minor change, perhaps a style
change, and as long as it's not terrible I assume this is just how
people are used to seeing things implemented.  So I amend it, and in the
next revision someone expresses that they dislike it and proposes
something different.  Generally I'll mention that the change was
requested, and ask anyone who feels particularly strongly about it to
please chime in.  Most of the time I don't hear anything, I suspect
because either the first reviewer isn't around, or they don't have time
to revisit it?  Maybe they weren't strongly opinionated about it to
begin with?  It could have been they were feeling pressure to generate
reviews, or maybe an employer is measuring their engagement?  In any
case, if it goes around a third time, I'll usually start including links
to prior reviews to try and get people on the same page, but most of the
time I've found that the discussion just falls silent.

At this point though it feels unclear to me if everyone is happy?  Did 
we have a constructive review?  Maybe it's not a very big deal and I 
should just move on.  And in many scenarios like the one above, the 
exact outcome appears to be of little concern to people in the greater 
scheme of things.  But this pattern does not always scale well in all 
cases.  Complex issues that persist over time generally do so because no 
one yet has a clear idea of what a correct solution even looks like, or 
perhaps cannot agree on one.  In my experience, getting people to come 
together on a common goal requires a sort of exploratory coding effort. 
Like a prototype that people can look at, learn from, share ideas, and 
then adapt the model from there.  But for that to work, they need to 
have been engaged with the history of it.  They need the common 
experience of seeing what has worked and what hasn't.  It helps people 
to let go of theories that have not performed well in practice, and 
shift to alternate approaches that have.  In a way, reviewers who have
been historically more involved with a particular effort start to become
a little integral to it.  Which I *think* is what Darrick may be
alluding to in his initial proposal: people ask for certain reviewers,
or perhaps reviewers volunteer to be sort of assigned to an effort in
order to provide more constructive reviews.  In this way, reviewers
allocate their efforts where they are most effective, and in doing so
better distribute the workload as well.  Did I get that about right?
Thoughts?

Allison


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-09 17:12       ` Allison Collins
@ 2020-02-12  0:21         ` NeilBrown
  2020-02-12  6:58           ` Darrick J. Wong
  2020-02-12 22:06         ` Darrick J. Wong
  2020-02-12 23:39         ` Dave Chinner
  2 siblings, 1 reply; 27+ messages in thread
From: NeilBrown @ 2020-02-12  0:21 UTC
  To: Allison Collins, Dave Chinner
  Cc: Amir Goldstein, Darrick J. Wong, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen

On Sun, Feb 09 2020, Allison Collins wrote:

> Well, I can see the response is meant to be encouraging, and you are 
> right that everyone needs to give to receive :-)
>
> I have thought a lot about this, and I do have some opinions about how
> the process is described to work vs how it ends up working, though.
> There have been quite a few times when I got conflicting reviews from
> multiple reviewers, I suspect either because reviewers are not seeing
> each other's reviews, or because it is difficult for people to recall
> or even find discussions on prior revisions.  And so at times, I find
> myself puzzling a bit, trying to extrapolate what the community as a
> whole really wants.

The "community as a whole" is not a person and does not have a coherent
opinion.  You will never please everyone and as you've suggested below,
it can be hard to tell how strongly people really hold the opinions they
reveal.

You need to give up trying to please "the community", but instead develop
your own sense of taste that aligns with the concrete practice of the
community, and then please yourself.

Then when someone criticizes your code, you need to decide for yourself
whether it is a useful criticism or not.  This might involve hunting
through the existing body of code to see what patterns are most common.
The end result is that either you defend your code, or you change your
opinion (both can be quite appropriate).  If you change your opinion,
then you probably change your code too.

Your goal isn't to ensure everyone is happy, only to ensure that no-one
is justifiably angry.

NeilBrown

>
> For example: a reviewer may propose a minor change, perhaps a style
> change, and as long as it's not terrible I assume this is just how
> people are used to seeing things implemented.  So I amend it, and in the
> next revision someone expresses that they dislike it and proposes
> something different.  Generally I'll mention that the change was
> requested, and ask anyone who feels particularly strongly about it to
> please chime in.  Most of the time I don't hear anything, I suspect
> because either the first reviewer isn't around, or they don't have time
> to revisit it?  Maybe they weren't strongly opinionated about it to
> begin with?  It could have been they were feeling pressure to generate
> reviews, or maybe an employer is measuring their engagement?  In any
> case, if it goes around a third time, I'll usually start including links
> to prior reviews to try and get people on the same page, but most of the
> time I've found that the discussion just falls silent.
>
> At this point though it feels unclear to me if everyone is happy?  Did 
> we have a constructive review?  Maybe it's not a very big deal and I 
> should just move on.  And in many scenarios like the one above, the 
> exact outcome appears to be of little concern to people in the greater 
> scheme of things.  But this pattern does not always scale well in all 
> cases.  Complex issues that persist over time generally do so because no 
> one yet has a clear idea of what a correct solution even looks like, or 
> perhaps cannot agree on one.  In my experience, getting people to come 
> together on a common goal requires a sort of exploratory coding effort. 
> Like a prototype that people can look at, learn from, share ideas, and 
> then adapt the model from there.  But for that to work, they need to 
> have been engaged with the history of it.  They need the common 
> experience of seeing what has worked and what hasn't.  It helps people 
> to let go of theories that have not performed well in practice, and 
> shift to alternate approaches that have.  In a way, reviewers who have
> been historically more involved with a particular effort start to become
> a little integral to it.  Which I *think* is what Darrick may be
> alluding to in his initial proposal: people ask for certain reviewers,
> or perhaps reviewers volunteer to be sort of assigned to an effort in
> order to provide more constructive reviews.  In this way, reviewers
> allocate their efforts where they are most effective, and in doing so
> better distribute the workload as well.  Did I get that about right?
> Thoughts?
>
> Allison



* Re: [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-07 22:03 ` Matthew Wilcox
@ 2020-02-12  3:51   ` Theodore Y. Ts'o
  2020-02-12 22:29     ` Darrick J. Wong
  2020-02-12 22:21   ` Darrick J. Wong
  1 sibling, 1 reply; 27+ messages in thread
From: Theodore Y. Ts'o @ 2020-02-12  3:51 UTC
  To: Matthew Wilcox
  Cc: Darrick J. Wong, lsf-pc, linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

On Fri, Feb 07, 2020 at 02:03:33PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 30, 2020 at 09:25:20PM -0800, Darrick J. Wong wrote:
> > It turns out that this system doesn't scale very well either.  Even with
> > three maintainers sharing access to the git trees ...
>
> I think the LSFMMBPF conference is part of the problem.  With the best of
> intentions, we have set up a system which serves to keep all but the most
> dedicated from having a voice at the premier conference for filesystems,
> memory management, storage (and now networking).  It wasn't intended to
> be that way, but that's what has happened, and it isn't serving us well
> as a result.
>
> ...
>
> This kills me because LSFMM has been such a critically important part of
> Linux development for over a decade, but I think at this point it is at
> least not serving us the way we want it to, and may even be doing more
> harm than good.  I think it needs to change, and more people need to
> be welcomed to the conference.  Maybe it needs to not be invite-only.
> Maybe it can stay invite-only, but be twice as large.  Maybe everybody
> who's coming needs to front $100 to put towards the costs of a larger
> meeting space with more rooms.

One of the things that I've been trying to suggest for at least the last
year or two is that we need to colocate LSF/MM with a larger conference.
In my mind, what would be great would be something sort of like
Plumbers, but in the first half of year.  The general idea would be to
have two major systems-level conferences about six months apart.

The LSF/MM conference could still be invite only, much like we have
had the Maintainer's Summit and the Networking Summit colocated with
Plumbers in Lisbon in 2019 and Vancouver in 2018.  But it would be
colocated with other topic specific workshops / summits, and there
would be space for topics like what you described below:

> There are 11 people on that list, plus Jason, plus three more that I
> recommended.  That's 15, just for that one topic.  I think maybe half
> of those people will get an invite anyway, but adding on an extra 5-10
> people for (what I think is) a critically important topic at the very
> nexus of storage, filesystems, memory management, networking and graphics
> is almost certainly out of bounds for the scale of the current conference.

After all, this is *precisely* the scaling problem that we had with
the Kernel Summit.  The LSF/MM summit can really only deal with
subjects that require high-level coordination between maintainers.
For more focused topics, we will need a wider set of developers than
can fit in size constraints of the LSF/MM venue.

This also addresses Darrick's problem, in that most of us can probably
point to more junior engineers that we would like to help to develop,
which means they need to meet other Storage, File System, and MM
developers --- both more senior ones, and other colleagues in the
community.  Right now, we don't have a venue for this except for
Plumbers, and it's suffering from bursting at the seams.  If we can
encourage and grow our more junior developers, it will help us delegate
our work to a larger group of talent.  In other words, it will help us
scale.

There are some tradeoffs to doing this; if we are going to combine
LSF/MM with other workshops and summits into a larger "systems-level"
conference in the first half of the year, we're not going to be able
to fit in some of the smaller, "fun" cities, such as Palm Springs, San
Juan, Park City, etc.

One of the things that I had suggested for 2020 was to colocate
LSF/MM/BPF, the Kernel Summit, Maintainer's Summit, and perhaps Linux
Security Symposium to June, in Austin.  (Why Austin?  Because finding
kernel hackers who are interested in planning a conference in a hands
on fashion ala Plumbers is *hard*.  And if we're going to leverage the
LF Events Staff on short notice, holding something in the same city as
OSS was the only real option.)  I thought it made a lot of sense last
year, but a lot of people *hated* Austin, and they didn't want to be
anywhere near the Product Manager "fluff" talks that unfortunately,
are in large supply at OSS.   So that idea fell through.

In any case, this is a problem that has been recently discussed at the
TAB, but this is not an issue where we can force anybody to do
anything.  We need to get the stakeholders who plan all of these
conferences to get together, and figure out something for 2021 or
maybe 2022 that we can all live with.  It's going to require some
compromising on all sides, and we all will have different things that
we consider "must haves" versus "would be nice" as far as conference
venues are concerned, as well as dealing with financial
constraints.

Assuming I get an invite to LSF/MM (I guess they haven't gone out
yet?), I'd like to have a chance to chat with anyone who has strong
opinions on this issue in Palm Springs.  Maybe we could schedule a BOF
slot to hear from the folks who attend LSF/MM/BPF and learn what
things we all consider important vis-a-vis the technical conferences
that we attend?

Cheers,

							- Ted


* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12  0:21         ` NeilBrown
@ 2020-02-12  6:58           ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12  6:58 UTC
  To: NeilBrown
  Cc: Allison Collins, Dave Chinner, Amir Goldstein, lsf-pc,
	linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Wed, Feb 12, 2020 at 11:21:06AM +1100, NeilBrown wrote:
> On Sun, Feb 09 2020, Allison Collins wrote:
> 
> > Well, I can see the response is meant to be encouraging, and you are 
> > right that everyone needs to give to receive :-)
> >
> > I have thought a lot about this, and I do have some opinions about how
> > the process is described to work vs how it ends up working, though.
> > There have been quite a few times when I got conflicting reviews from
> > multiple reviewers, I suspect either because reviewers are not seeing
> > each other's reviews,

Yes, I've been caught in email storms with hch before, where we're both
firing off emails at the same time and not quite seeing each other's
replies to the same thread.

> > or because it is difficult for people to recall or even find
> > discussions on prior revisions.

Oh gosh yes, lore has been a useful tool for that, but we only got lore
working for linux-xfs (and previous lists) recently.

> > And so at times, I find myself puzzling
> > a bit, trying to extrapolate what the community as a whole really wants.

The bigger problem here is that there are multiple reviewers, each
building slightly different conceptions about a problem and how to solve
that problem, and that's how an author ends up ping-ponging between
reviewers.  Sometimes you can appeal to the maintainer to decide if
integrating the proposed patches are better than not having them, but
sometimes the maintainer is trapped in the M***can standoff.

> The "community as a whole" is not a person and does not have a coherent
> opinion.  You will never please everyone and as you've suggested below,
> it can be hard to tell how strongly people really hold the opinions they
> reveal.
> 
> You need to give up trying to please "the community", but instead develop
> your own sense of taste that aligns with the concrete practice of the
> community, and then please yourself.

The "community as a whole" is not a person and does not have a totally
uniform concrete practice, just like we don't collectively have
identical opinions.  Given a problem description, different people will
choose and reject different high level structures based on their own
experience and biases to solve the problem.  That is how we end up in
the pickle.

> Then when someone criticizes your code, you need to decide for yourself
> whether it is a useful criticism or not.  This might involve hunting
> through the existing body of code to see what patterns are most common.

Our design pattern language changes over time.  For example, we used to
sprinkle indirect function calls everywhere, then I***l screwed us all
over and now those are going away.
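
(Aside, for anyone outside the kernel bubble: once the Spectre
mitigations made retpoline-protected indirect calls expensive, the
pattern below became the common replacement.  This is a hypothetical
sketch with made-up names, not code from any real patch.)

#include <errno.h>

/* Stand-ins for per-version verifier helpers. */
static int verify_v4(const void *buf) { return 0; }
static int verify_v5(const void *buf) { return 0; }

/* Before: one retpoline-protected indirect call per use. */
static int verify_indirect(int (*verify)(const void *), const void *buf)
{
	return verify(buf);
}

/* After: a switch of direct calls the compiler can emit cheaply. */
enum fmt_version { FMT_V4, FMT_V5 };

static int verify_direct(enum fmt_version v, const void *buf)
{
	switch (v) {
	case FMT_V4: return verify_v4(buf);
	case FMT_V5: return verify_v5(buf);
	}
	return -EINVAL;
}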

> The end result is that either you defend your code, or you change your
> opinion (both can be quite appropriate).  If you change your opinion,
> then you probably change your code too.
> 
> Your goal isn't to ensure everyone is happy, only to ensure that no-one
> is justifiably angry.

Counterpoint: "DAX".

(Allison: I will continue working my way through the rest of your reply
tomorrow morning.)

--D

> NeilBrown
> 
> >
> > For example: a reviewer may propose a minor change, perhaps a style
> > change, and as long as it's not terrible I assume this is just how
> > people are used to seeing things implemented.  So I amend it, and in the
> > next revision someone expresses that they dislike it and proposes
> > something different.  Generally I'll mention that the change was
> > requested, and ask anyone who feels particularly strongly about it to
> > please chime in.  Most of the time I don't hear anything, I suspect
> > because either the first reviewer isn't around, or they don't have time
> > to revisit it?  Maybe they weren't strongly opinionated about it to
> > begin with?  It could have been they were feeling pressure to generate
> > reviews, or maybe an employer is measuring their engagement?  In any
> > case, if it goes around a third time, I'll usually start including links
> > to prior reviews to try and get people on the same page, but most of the
> > time I've found that the discussion just falls silent.
> >
> > At this point though it feels unclear to me if everyone is happy?  Did 
> > we have a constructive review?  Maybe it's not a very big deal and I 
> > should just move on.  And in many scenarios like the one above, the 
> > exact outcome appears to be of little concern to people in the greater 
> > scheme of things.  But this pattern does not always scale well in all 
> > cases.  Complex issues that persist over time generally do so because no 
> > one yet has a clear idea of what a correct solution even looks like, or 
> > perhaps cannot agree on one.  In my experience, getting people to come 
> > together on a common goal requires a sort of exploratory coding effort. 
> > Like a prototype that people can look at, learn from, share ideas, and 
> > then adapt the model from there.  But for that to work, they need to 
> > have been engaged with the history of it.  They need the common 
> > experience of seeing what has worked and what hasn't.  It helps people 
> > to let go of theories that have not performed well in practice, and 
> > shift to alternate approaches that have.  In a way, reviewers who have
> > been historically more involved with a particular effort start to become
> > a little integral to it.  Which I *think* is what Darrick may be
> > alluding to in his initial proposal: people ask for certain reviewers,
> > or perhaps reviewers volunteer to be sort of assigned to an effort in
> > order to provide more constructive reviews.  In this way, reviewers
> > allocate their efforts where they are most effective, and in doing so
> > better distribute the workload as well.  Did I get that about right?
> > Thoughts?
> >
> > Allison



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-02 21:46     ` Dave Chinner
  2020-02-09 17:12       ` Allison Collins
@ 2020-02-12 21:36       ` Darrick J. Wong
  1 sibling, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 21:36 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Allison Collins, Amir Goldstein, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen

On Mon, Feb 03, 2020 at 08:46:20AM +1100, Dave Chinner wrote:
> On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> > 
> > 
> > On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > 
> > > > Hi everyone,
> > > > 
> > > > I would like to discuss how to improve the process of shepherding code
> > > > into the kernel to make it more enjoyable for maintainers, reviewers,
> > > > and code authors.  Here is a brief summary of how we got here:
> > > > 
> > > > Years ago, XFS had one maintainer tending to all four key git repos
> > > > (kernel, userspace, documentation, testing).  Like most subsystems, the
> > > > maintainer did a lot of review and porting code between the kernel and
> > > > userspace, though with help from others.
> > > > 
> > > > It turns out that this didn't scale very well, so we split the
> > > > responsibilities into three maintainers.  Like most subsystems, the
> > > > maintainers still did a lot of review and porting work, though with help
> > > > from others.
> > > > 
> > > > It turns out that this system doesn't scale very well either.  Even with
> > > > three maintainers sharing access to the git trees and working together
> > > > to get reviews done, mailing list traffic has been trending upwards for
> > > > years, and we still can't keep up.  I fear that many maintainers are
> > > > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > > > testing of the git trees, but keeping up with the mail and the reviews.
> > > > 
> > > > So what do we do about this?  I think we (the XFS project, anyway)
> > > > should increase the amount of organizing in our review process.  For
> > > > large patchsets, I would like to improve informal communication about
> > > > who the author might like to have conduct a review, who might be
> > > > interested in conducting a review, estimates of how much time a reviewer
> > > > has to spend on a patchset, and of course, feedback about how it went.
> > > > This of course is to lay the groundwork for making a case to our bosses
> > > > for growing our community, allocating time for reviews and for growing
> > > > our skills as reviewers.
> > > > 
> > > 
> > > Interesting.
> > > 
> > > Eryu usually posts a weekly status of xfstests review queue, often with
> > > a call for reviewers, sometimes with specific patch series mentioned.
> > > That helps me as a developer to monitor the status of my own work
> > > and it helps me as a reviewer to put the efforts where the maintainer
> > > needs me the most.

FWIW, I wasn't aware that he did that.

> > > For xfs kernel patches, I can represent the voice of "new blood".
> > > Getting new people to join the review effort is quite a hard barrier.
> > > I have taken a few stabs at doing review for xfs patch series over
> > > the years, but it mostly ends up feeling like it helped me (get to
> > > know xfs code better) more than it helped the maintainer, because
> > > the chances of a new reviewer catching meaningful bugs are very
> > > low, and if another reviewer is going to go over the same patch
> > > series, the chances of the novice catching bugs that the other
> > > reviewer will not are extremely low.
> > That sounds like a familiar experience.  Lots of times I'll start a review,
> > but then someone else will finish it before I do, and catch more things
> > along the way.  So I sort of feel like if it's not something I can get
> > through quickly, then it's not a very good distribution of work effort and I
> > should shift to something else. Most of the time, I'll study it until I feel
> > like I understand what the person is trying to do, and I might catch stuff
> > that appears like it may not align with that pursuit, but I don't
> > necessarily feel I can deem it void of all unforeseen bugs.
> 
> I think you are both underselling yourselves. Imposter syndrome and
> all that jazz.
> 
> The reality is that we don't need more people doing the sorts of
> "how does this work with the rest of XFS" reviews that people like
> Darrick or Christoph do.

I 90% agree with that, except for the complexity that is the logging
system and how the CIL and AIL interact with each other.  I /really/
appreciate Brian digging into those kinds of things. :)

> What we really need is more people looking
> at whether loops are correctly terminated, the right variable types
> are used, we don't have signed vs unsigned issues, 32 bit overflows,
> use the right 32/64 bit division functions, the error handling logic
> is correct, etc.

Yeah, someone running sparse/smatch regularly would also help.  Granted,
Dan Carpenter does a good job of staying on top of that, but still...
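
(For the record, a minimal sketch of what "running sparse/smatch
regularly" looks like; the smatch path is wherever your local build of
it lives:)

	# sparse: check every file as it is compiled
	make C=2 fs/xfs/

	# smatch: same C= hook, different checker
	make CHECK="$HOME/src/smatch/smatch -p=kernel" C=2 fs/xfs/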

> It's those sorts of little details that lead to most bugs, and
> that's precisely the sort of thing that is typically missed by an
> experienced developer doing a "is this the best possible
> implementation of this functionality" review.

I would also add (to state this explicitly) that it was easier for me
earlier in my xfs career to spot certain kinds of bugs, because I had to
figure out what the code was doing on my own and couldn't lean on the
mental shortcut of "fmeh, it worked last time I went through here".

> A recent personal example: look at the review of Matthew Wilcox's
> ->readahead() series that I recently did. I noticed problems in the
> core change and the erofs and btrfs implementations not because I
> knew anything about those filesystems, but because I was checking
> whether the new loops iterated the pages in the page cache
> correctly. i.e. all I was really looking at was variable counting
> and loop initialisation and termination conditions. Experience tells
> me this stuff is notoriously difficult to get right, so that's what
> I looked at....
> 
> IOWs, you don't need to know anything about the subsystem to
> perform such a useful review, and a lot of the time you won't find a
> problem.

The other thing that would be helpful is reviewing code submissions
w.r.t. test cases.  Did the author submit a testcase to go with the
patch?  Does the code patch instead reference finding the bug by
studying failures of specific fstests?  (Or LTP, or blktests, or...)
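
(Concretely, checking that is usually just running the cited test from
an fstests checkout configured for the filesystem under test; the test
id here is only an example:)

	cd xfstests-dev
	sudo ./check generic/475	# the test cited in the patch
	sudo ./check -g quick		# or the whole quick group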

> But it's still a very useful review to perform, and in
> doing so you've validated, to the best of your ability, that the
> change is sound. Put simply:
> 
> 	"I've checked <all these things> and it looks good to me.
> 
> 	Reviewed-by: Joe Bloggs <joe@blogg.com>"
> 
> This is a very useful, valid review, regardless of whether you find
> anything. It's also a method of review that you can use when you
> have limited time - rather than trying to check everything and
> spending hours on a patchset, pick one thing and get the entire
> review done in 15 minutes. Then do the same thing for the next patch
> set. You'll be surprised how many things you notice that aren't what
> you are looking for when you do this.

> Hence the fact that other people find (different) issues is
> irrelevant - they'll be looking at different things to you, and
> there may not even be any overlap in the focus/scope of the reviews
> that have been performed. You may find the same things, but that is
> also not a bad thing - I intentionally don't read other reviews
> before I review a patch series, so that I don't taint my view of the
> code before I look at it (e.g., darrick found a bug in this code, so
> I don't need to look at it...).
> 
> IOWs, if you are starting from the premise that "I don't know this
> code well enough to perform a useful review" then you are setting

TBH I've wondered if this goes hand in hand with "I fear/figure/guess
the maintainer will probably review this and prove better at spotting
the bugs than me" and "Oh gosh what if the maintainer pushes this into
for-next before anyone even realizes I was reading this"?

This is why I want to push for /slightly/ more explicit coordination of
authors and reviewers for patch sets.  Let's say that Pavel sends a
patchset to the list.  If (for example) Allison immediately emails a
reply to the cover letter saying that she'll look at test coverage and
Bill sends his own email saying that he'll check the error paths, then I
will know expect further discussion (or RVBs) between the three of them,
and will wait for the reviews to wrap up before pushing to -next.

Obviously if this is some nasty bug that needs fixing quickly then cc me
to get my attention and I'll flag myself as conducting the review.

> yourself up for failure right at the start. Read the series
> description, think about the change being made, use your experience
> to answer the question "what's a mistake I could make performing
> this change". Then go looking for that mistake through the
> patch(es). In the process of performing this review, more than
> likely, you'll notice bugs other than what you are actually looking
> for...
> 
> This does not require any deep subsystem specific knowledge, but in
> doing this sort of review you're going to notice things and learn
> about the code and slowly build your knowledge and experience about
> that subsystem.

(Agreed, most of my familiarity from xfs comes either from reading
patches coming in or examining previously submitted patchsets, even if
it's been a while since the last submission.)

> > > However, there are quite a few cleanup and refactoring patch series,
> > > especially on the xfs list, where a review from an "outsider" could still
> > > be of value to the xfs community. OTOH, for xfs maintainer, those are
> > > the easy patches to review, so is there a gain in offloading those reviews?
> > > 
> > > Bottom line - a report of the subsystem review queue status, call for
> > > reviewers and highlighting specific areas in need of review is a good idea.
> > > Developers responding to that report publicly with availability for review,
> > > intention and expected time frame for taking on a review would be helpful
> > > for both maintainers and potential reviewers.
> > I definitely think that would help delegate review efforts a little more.
> > That way it's clear what people are working on, and what still needs
> > attention.
> 
> It is not the maintainer's responsibility to gather reviewers. That

No, but I'm conflating the two here because I'm engaged in all three
modes. :)

> is entirely the responsibility of the patch submitter. That is, if
> the code has gone unreviewed, it is up to the submitter to find
> people to review the code, not the maintainer. If you, as a
> developer, are unable to find people willing to review your code
> then it's a sign you haven't been reviewing enough code yourself.

I don't entirely agree with this statement.  I have loads of patches
waiting for review, but I also don't feel like I'm not reviewing enough
code.  I swear I'm not trying to guilt trip anyone by that statement. :)

Recently I was talking to Eric and he pointed out that certain things
like the NYE mass patchbombings aren't so helpful because 400 patches is
a lot to absorb.  I dump those patchbombs (both to linux-xfs and to
kernel.org) to try to keep my own bus factor as high as possible.  I
think I should shift to pushing the git trees to kernel.org and /only/
sending the cover letters with links, at least when I'm in "distributed
backup mode".

For everything else, maybe I should be more explicit about identifying
that which I'd like to get reviewed for the next cycle, e.g. "I would
like to try to get the btree bulk load code pushed for 5.7."  I think
some of the kernel subprojects actually do something like that, but it
doesn't seem like this is universally acknowledged as a strategy.

> Good reviewers are a valuable resource - as a developer I rely on
> reviewers to get my code merged, so if I don't review code and
> everyone else behaves the same way, how can I possibly get my code
> merged? IOWs, review is something every developer should be spending
> a significant chunk of their time on. IMO, if you are not spending
> *at least* a whole day a week reviewing code, you're not actually
> doing enough code review to allow other developers to be as
> productive as you are.

This kinda gets to my next topic, which is how do we make the business
case for more people to jump in and review code?  AFAICT people who work
for the longtime Usual Suspects (RH, IBM, SUSE, Oracle, etc.) are /very/
fortunate that their free software offices have enough sway that they
don't need to push very hard for paid review time, but I have little
clue if that's true for the more recent players.  If you work for a
company that doesn't let you count review time as part of your workday, I
would like to know that.

> The more you review other people's code, the more you learn about
> the code and the more likely other people will be to review your
> code because they know you'll review their code in turn.  It's a
> positive reinforcement cycle that benefits both the individual
> developers personally and the wider community.
> 
> But this positive reinforcement cycle just doesn't happen if people
> avoid reviewing code because they think "I don't know anything so my
> review is not going to be worth anything".  Constructive review, no
> matter whether it's performed at a simple or complex level, is
> always valuable.

Also sometimes /this/ reviewer is running in stupid/harried/burnout mode
and misses obvious things. :(

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-09 17:12       ` Allison Collins
  2020-02-12  0:21         ` NeilBrown
@ 2020-02-12 22:06         ` Darrick J. Wong
  2020-02-12 22:19           ` Dan Williams
  2020-02-13 15:11           ` Brian Foster
  2020-02-12 23:39         ` Dave Chinner
  2 siblings, 2 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 22:06 UTC (permalink / raw)
  To: Allison Collins
  Cc: Dave Chinner, Amir Goldstein, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen

On Sun, Feb 09, 2020 at 10:12:03AM -0700, Allison Collins wrote:
> 
> 
> On 2/2/20 2:46 PM, Dave Chinner wrote:
> > On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> > > 
> > > 
> > > On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > > > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > > 
> > > > > Hi everyone,
> > > > > 
> > > > > I would like to discuss how to improve the process of shepherding code
> > > > > into the kernel to make it more enjoyable for maintainers, reviewers,
> > > > > and code authors.  Here is a brief summary of how we got here:
> > > > > 
> > > > > Years ago, XFS had one maintainer tending to all four key git repos
> > > > > (kernel, userspace, documentation, testing).  Like most subsystems, the
> > > > > maintainer did a lot of review and porting code between the kernel and
> > > > > userspace, though with help from others.
> > > > > 
> > > > > It turns out that this didn't scale very well, so we split the
> > > > > responsibilities into three maintainers.  Like most subsystems, the
> > > > > maintainers still did a lot of review and porting work, though with help
> > > > > from others.
> > > > > 
> > > > > It turns out that this system doesn't scale very well either.  Even with
> > > > > three maintainers sharing access to the git trees and working together
> > > > > to get reviews done, mailing list traffic has been trending upwards for
> > > > > years, and we still can't keep up.  I fear that many maintainers are
> > > > > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > > > > testing of the git trees, but keeping up with the mail and the reviews.
> > > > > 
> > > > > So what do we do about this?  I think we (the XFS project, anyway)
> > > > > should increase the amount of organizing in our review process.  For
> > > > > large patchsets, I would like to improve informal communication about
> > > > > who the author might like to have conduct a review, who might be
> > > > > interested in conducting a review, estimates of how much time a reviewer
> > > > > has to spend on a patchset, and of course, feedback about how it went.
> > > > > This of course is to lay the groundwork for making a case to our bosses
> > > > > for growing our community, allocating time for reviews and for growing
> > > > > our skills as reviewers.
> > > > > 
> > > > 
> > > > Interesting.
> > > > 
> > > > Eryu usually posts a weekly status of xfstests review queue, often with
> > > > a call for reviewers, sometimes with specific patch series mentioned.
> > > > That helps me as a developer to monitor the status of my own work
> > > > and it helps me as a reviewer to put the efforts where the maintainer
> > > > needs me the most.
> > > > 
> > > > For xfs kernel patches, I can represent the voice of "new blood".
> > > > Getting new people to join the review effort is quite a hard barrier.
> > > > I have taken a few stabs at doing review for xfs patch series over
> > > > the years, but it mostly ends up feeling like it helped me (get to
> > > > know xfs code better) more than it helped the maintainer, because
> > > > the chances of a new reviewer catching meaningful bugs are very
> > > > low, and if another reviewer is going to go over the same patch
> > > > series, the chances of the novice catching bugs that the other
> > > > reviewer will not are extremely low.
> > > That sounds like a familiar experience.  Lots of times I'll start a review,
> > > but then someone else will finish it before I do, and catch more things
> > > along the way.  So I sort of feel like if it's not something I can get
> > > through quickly, then it's not a very good distribution of work effort and I
> > > should shift to something else. Most of the time, I'll study it until I feel
> > > like I understand what the person is trying to do, and I might catch stuff
> > > that appears like it may not align with that pursuit, but I don't
> > > necessarily feel I can deem it void of all unforeseen bugs.
> > 
> > I think you are both underselling yourselves. Imposter syndrome and
> > all that jazz.
> > 
> > The reality is that we don't need more people doing the sorts of
> > "how does this work with the rest of XFS" reviews that people like
> > Darrick or Christoph do. What we really need is more people looking
> > at whether loops are correctly terminated, the right variable types
> > are used, we don't have signed vs unsigned issues, 32 bit overflows,
> > use the right 32/64 bit division functions, the error handling logic
> > is correct, etc.
> > 
> > It's those sorts of little details that lead to most bugs, and
> > that's precisely the sort of thing that is typically missed by an
> > experienced developer doing a "is this the best possible
> > implementation of this functionality" review.
> > 
> > A recent personal example: look at the review of Matthew Wilcox's
> > ->readahead() series that I recently did. I noticed problems in the
> > core change and the erofs and btrfs implementations not because I
> > knew anything about those filesystems, but because I was checking
> > whether the new loops iterated the pages in the page cache
> > correctly. i.e. all I was really looking at was variable counting
> > and loop initialisation and termination conditions. Experience tells
> > me this stuff is notoriously difficult to get right, so that's what
> > I looked at....
> > 
> > IOWs, you don't need to know anything about the subsystem to
> > perform such a useful review, and a lot of the time you won't find a
> > problem. But it's still a very useful review to perform, and in
> > doing so you've validated, to the best of your ability, that the
> > change is sound. Put simply:
> > 
> > 	"I've checked <all these things> and it looks good to me.
> > 
> > 	Reviewed-by: Joe Bloggs <joe@blogg.com>"
> > 
> > This is a very useful, valid review, regardless of whether you find
> > anything. It's also a method of review that you can use when you
> > have limited time - rather than trying to check everything and
> > spending hours on a patchset, pick one thing and get the entire
> > review done in 15 minutes. Then do the same thing for the next patch
> > set. You'll be surprised how many things you notice that aren't what
> > you are looking for when you do this.
> > 
> > Hence the fact that other people find (different) issues is
> > irrelevant - they'll be looking at different things to you, and
> > there may not even be any overlap in the focus/scope of the reviews
> > that have been performed. You may find the same things, but that is
> > also not a bad thing - I intentionally don't read other reviews
> > before I review a patch series, so that I don't taint my view of the
> > code before I look at it (e.g., darrick found a bug in this code, so
> > I don't need to look at it...).
> > 
> > IOWs, if you are starting from the premise that "I don't know this
> > code well enough to perform a useful review" then you are setting
> > yourself up for failure right at the start. Read the series
> > description, think about the change being made, use your experience
> > to answer the question "what's a mistake I could make performing
> > this change". Then go looking for that mistake through the
> > patch(es). In the process of performing this review, more than
> > likely, you'll notice bugs other than what you are actually looking
> > for...
> > 
> > This does not require any deep subsystem specific knowledge, but in
> > doing this sort of review you're going to notice things and learn
> > about the code and slowly build your knowledge and experience about
> > that subsystem.
> > 
> > > > However, there are quite a few cleanup and refactoring patch series,
> > > > especially on the xfs list, where a review from an "outsider" could still
> > > > be of value to the xfs community. OTOH, for xfs maintainer, those are
> > > > the easy patches to review, so is there a gain in offloading those reviews?
> > > > 
> > > > Bottom line - a report of the subsystem review queue status, call for
> > > > reviewers and highlighting specific areas in need of review is a good idea.
> > > > Developers responding to that report publicly with availability for review,
> > > > intention and expected time frame for taking on a review would be helpful
> > > > for both maintainers and potential reviewers.
> > > I definitely think that would help delegate review efforts a little more.
> > > That way it's clear what people are working on, and what still needs
> > > attention.
> > 
> > It is not the maintainer's responsibility to gather reviewers. That
> > is entirely the responsibility of the patch submitter. That is, if
> > the code has gone unreviewed, it is up to the submitter to find
> > people to review the code, not the maintainer. If you, as a
> > developer, are unable to find people willing to review your code
> > then it's a sign you haven't been reviewing enough code yourself.
> > 
> > Good reviewers are a valuable resource - as a developer I rely on
> > reviewers to get my code merged, so if I don't review code and
> > everyone else behaves the same way, how can I possibly get my code
> > merged? IOWs, review is something every developer should be spending
> > a significant chunk of their time on. IMO, if you are not spending
> > *at least* a whole day a week reviewing code, you're not actually
> > doing enough code review to allow other developers to be as
> > productive as you are.
> > 
> > The more you review other people's code, the more you learn about
> > the code and the more likely other people will be to review your
> > code because they know you'll review their code in turn.  It's a
> > positive reinforcement cycle that benefits both the individual
> > developers personally and the wider community.
> > 
> > But this positive reinforcement cycle just doesn't happen if people
> > avoid reviewing code because they think "I don't know anything so my
> > review is not going to be worth anything".  Constructive review, no
> > matter whether it's performed at a simple or complex level, is
> > always valuable.
> > 
> > Cheers,
> > 
> > Dave.
> > 
> Well, I can see the response is meant to be encouraging, and you are right
> that everyone needs to give to receive :-)
> 
> I have thought a lot about this, and I do have some opinions about how
> the process is described to work vs. how it ends up working, though.
> There have been quite a few times when I got conflicting reviews from
> multiple reviewers.  I suspect either because reviewers are not seeing
> each other's reviews, or
> because it is difficult for people to recall or even find discussions on
> prior revisions.  And so at times, I find myself puzzling a bit trying to
> extrapolate what the community as a whole really wants.

<and now jumping back in from the reply I sent to Neil Brown earlier...>

> For example: a reviewer may propose a minor change, perhaps a style change,
> and as long as it's not terrible I assume this is just how people are used
> to seeing things implemented.  So I amend it, and in the next revision
> someone expresses that they dislike it and makes a different proposition.
> Generally I'll mention that this change was requested, but if anyone feels
> particularly strongly about it, to please chime in.  Most of the time I
> don't hear anything, I suspect because either the first reviewer isn't
> around, or they don't have time to revisit it?

Definitely guilty of that here. :(

I've noticed that my own reviews decline in quality and consistency of
pickiness as the version count increases.  It's not easy to tell what
the real difference is between v2 and v3 of a series without git-am'ing
both series, building branches, and then git diffing the branches, and
that is /very/ time consuming.
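
(git range-diff automates most of that, for what it's worth; a sketch,
assuming each posting was saved as an mbox:)

	# apply each version to its own branch off the same base
	git checkout -b review-v2 origin/master && git am v2.mbox
	git checkout -b review-v3 origin/master && git am v3.mbox

	# then compare the two series patch by patch
	git range-diff origin/master review-v2 review-v3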

I've really really wanted to be able to tell people to just send a pull
request for large series and skip all the email patch review stuff, but
I'm well aware that will start a popular revolt.  But maybe we can do
both?  Is it legit to ask that if you're sending more than a simple
quickfix, to please push a branch somewhere so that I can just yank it
down and have a look?  I try to do that with every series that I send, I
think Allison has been doing that, Christoph does it sometimes, etc.

> Maybe they weren't strongly
> opinionated about it to begin with?  It could have been they were feeling
> pressure to generate reviews, or maybe an employer is measuring their
> engagement?  In any case, if it goes around a third time, I'll usually start
> including links to prior reviews to try and get people on the same page, but
> most of the time I've found the result is that it just falls silent.

(And that's where we end up at this annoying thing of "Well the
reviewers don't agree, the maintainer notices the awkward silence, and
eventually gives in and 'resolves' the conflict by reading the whole
series...")

> At this point though it feels unclear to me if everyone is happy?  Did we
> have a constructive review?  Maybe it's not a very big deal and I should
> just move on.

Only you can't, unless the maintainer either pulls the patches or sends
back a forever-nak.  The whole process of shepherding code into the
kernel grinds to a halt because we can't agree on things like "two
functions with similar argument lists or one function with flags?".

> And in many scenarios like the one above, the exact outcome
> appears to be of little concern to people in the greater scheme of things.
> But this pattern does not always scale well in all cases.  Complex issues
> that persist over time generally do so because no one yet has a clear idea
> of what a correct solution even looks like, or perhaps cannot agree on one.
> In my experience, getting people to come together on a common goal requires
> a sort of exploratory coding effort.

Ok, so there's this yearslong D** mess where we can't quite agree on how
the user interface should work.

TBH, I've wondered if we could reduce the amount of resistance to new
code and ABIs by making /every/ new ioctl and syscall announce itself as
EXPERIMENTAL for the first 6-9 months, to give people time to prototype
how these things will work for real.  We wouldn't change the behavior
capriciously during that time, but if we had to break things to prevent
a greater harm, we could do so without having to rev the whole interface.
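
(Something like the warning XFS already prints for unstable features;
a sketch only -- the ioctl name is hypothetical:)

	#include <linux/fs.h>
	#include <linux/printk.h>

	/* Hypothetical sketch: tag a brand-new ioctl as unstable. */
	static long xfs_ioc_new_feature(struct file *filp, void __user *arg)
	{
		pr_warn_once("EXPERIMENTAL interface in use; the ABI may still change.");
		/* ... the actual ioctl work ... */
		return 0;
	}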

I've also wished we had a better way to prototype new things in XFS
while not affecting the stable codebase, but various people have warned
me not to repeat the mistake of ext4dev. :/

> Like a prototype that people can look
> at, learn from, share ideas, and then adapt the model from there.  But for
> that to work, they need to have been engaged with the history of it.  They
> need the common experience of seeing what has worked and what hasn't.  It
> helps people to let go of theories that have not performed well in practice,
> and shift to alternate approaches that have.  In a way, reviewers that have
> been historically more involved with a particular effort start to become a
> little integral to it as its reviewers.  Which I *think* is what Darrick may
> be alluding to in his initial proposition.

Yes.

> People request for certain
> reviewers, or perhaps the reviewers can volunteer to be sort of assigned to
> it in an effort to provide more constructive reviews.  In this way,
> reviewers allocate their efforts where they are most effective, and in doing
> so better distribute the work load as well.  Did I get that about right?
> Thoughts?

And yes.  We of course have zero ability to coerce anyone to do things,
but at the same time, we're all collaborating to build something better,
and hoping that people organically decide to do things really doesn't
scale any more, given that we're burning out a second generation of
maintainers. :/

--D

> Allison

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 22:06         ` Darrick J. Wong
@ 2020-02-12 22:19           ` Dan Williams
  2020-02-12 22:36             ` Darrick J. Wong
  2020-02-13 15:11           ` Brian Foster
  1 sibling, 1 reply; 27+ messages in thread
From: Dan Williams @ 2020-02-12 22:19 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Allison Collins, Eric Sandeen, Amir Goldstein, Dave Chinner, xfs,
	Eryu Guan, linux-fsdevel, lsf-pc

On Wed, Feb 12, 2020 at 2:07 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
[..]
> I've really really wanted to be able to tell people to just send a pull
> request for large series and skip all the email patch review stuff, but
> I'm well aware that will start a popular revolt.  But maybe we can do
> both?  Is it legit to ask that if you're sending more than a simple
> quickfix, to please push a branch somewhere so that I can just yank it
> down and have a look?  I try to do that with every series that I send, I
> think Allison has been doing that, Christoph does it sometimes, etc.

This is begging me to point out that Konstantin has automated this
with his get-lore-mbox tool [1]. As long as the submitter uses "git
format-patch --base..." then the tool can automate recreating a local
branch from a mail series.

[1]: https://lore.kernel.org/workflows/20200201030105.k6akvbjpmlpcuiky@chatter.i7.local/
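
(The submitter side of that is a single extra flag; a minimal sketch,
assuming the branch has an upstream set so --base=auto can find the
merge base:)

	git format-patch --base=auto --cover-letter -o out/ origin/master..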

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-07 22:03 ` Matthew Wilcox
  2020-02-12  3:51   ` Theodore Y. Ts'o
@ 2020-02-12 22:21   ` Darrick J. Wong
  2020-02-13  1:23     ` Dave Chinner
  1 sibling, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 22:21 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: lsf-pc, linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

On Fri, Feb 07, 2020 at 02:03:33PM -0800, Matthew Wilcox wrote:
> On Thu, Jan 30, 2020 at 09:25:20PM -0800, Darrick J. Wong wrote:
> > It turns out that this system doesn't scale very well either.  Even with
> > three maintainers sharing access to the git trees and working together
> > to get reviews done, mailing list traffic has been trending upwards for
> > years, and we still can't keep up.  I fear that many maintainers are
> > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > testing of the git trees, but keeping up with the mail and the reviews.
> 
> I think the LSFMMBPF conference is part of the problem.  With the best of
> intentions, we have set up a system which serves to keep all but the most
> dedicated from having a voice at the premier conference for filesystems,
> memory management, storage (and now networking).  It wasn't intended to
> be that way, but that's what has happened, and it isn't serving us well
> as a result.

Yeah... I gather that we now have BPF because it ties in with networking
and we added networking because it ties in with mm, but ... frankly I'd
like to be able to include a broader selection of people from each
community even if that results either in a larger event or splitting
things up again.

> Three anecdotes.  First, look at Jason's mail from earlier today:
> https://lore.kernel.org/linux-mm/20200207194620.GG8731@bombadil.infradead.org/T/#t
> 
> There are 11 people on that list, plus Jason, plus three more that I
> recommended.  That's 15, just for that one topic.  I think maybe half
> of those people will get an invite anyway, but adding on an extra 5-10
> people for (what I think is) a critically important topic at the very
> nexus of storage, filesystems, memory management, networking and graphics
> is almost certainly out of bounds for the scale of the current conference.
> 
> Second, I've had Outreachy students who have made meaningful contributions
> to the kernel.  Part of their bursary is a travel grant to go to a
> conference and they were excited to come to LSFMM.  I've had to tell
> them "this conference is invite-only for the top maintainers; you can't
> come".  They ended up going to an Open Source Summit conference instead.
> By excluding the people who are starting out, we are failing to grow
> our community.

Agree.

> I don't think it would have hurt for them to be in the room; they were
> unlikely to speak, and perhaps they would have gone on to make larger
> contributions.

Agree.  I've met all the Big Names, but I've never met most of the
people who are not full time contributors, and I've been attending LSFMM
for years.

> Third, I hear from people who work on a specific filesystem "Of the
> twenty or so slots for the FS part of the conference, there are about
> half a dozen generic filesystem people who'll get an invite, then maybe
> six filesystems who'll get two slots each, but what we really want to
> do is get everybody working on this filesystem in a room and go over
> our particular problem areas".

Yes!  One thousand times yes!  The best value I've gotten from LSF has
been the in-person interlocks with the XFS/ext4/btrfs developers, even
if the things we discuss in the hallway BOFs have not really been
"cross-subsystem topics".

> This kills me because LSFMM has been such a critically important part of
> Linux development for over a decade, but I think at this point it is at
> least not serving us the way we want it to, and may even be doing more
> harm than good.

I don't think I'd go quite that far, but it's definitely underserving
the people who can't get in, the people who can't go, and the people who
are too far away but gosh it would be nice to pull them in even if it's
only for 30 minutes over a conference call.

> I think it needs to change, and more people need to
> be welcomed to the conference.  Maybe it needs to not be invite-only.
> Maybe it can stay invite-only, but be twice as large.  Maybe everybody
> who's coming needs to front $100 to put towards the costs of a larger
> meeting space with more rooms.
> 
> Not achievable for this year, I'm sure, but if we start talking now
> maybe we can have a better conference in 2021.

<nod>

--D

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12  3:51   ` Theodore Y. Ts'o
@ 2020-02-12 22:29     ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 22:29 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Matthew Wilcox, lsf-pc, linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

On Tue, Feb 11, 2020 at 10:51:39PM -0500, Theodore Y. Ts'o wrote:
> On Fri, Feb 07, 2020 at 02:03:33PM -0800, Matthew Wilcox wrote:
> > On Thu, Jan 30, 2020 at 09:25:20PM -0800, Darrick J. Wong wrote:
> > > It turns out that this system doesn't scale very well either.  Even with
> > > three maintainers sharing access to the git trees...
> >
> > I think the LSFMMBPF conference is part of the problem.  With the best of
> > intentions, we have set up a system which serves to keep all but the most
> > dedicated from having a voice at the premier conference for filesystems,
> > memory management, storage (and now networking).  It wasn't intended to
> > be that way, but that's what has happened, and it isn't serving us well
> > as a result.
> >
> > ...
> >
> > This kills me because LSFMM has been such a critically important part of
> > Linux development for over a decade, but I think at this point it is at
> > least not serving us the way we want it to, and may even be doing more
> > harm than good.  I think it needs to change, and more people need to
> > be welcomed to the conference.  Maybe it needs to not be invite-only.
> > Maybe it can stay invite-only, but be twice as large.  Maybe everybody
> > who's coming needs to front $100 to put towards the costs of a larger
> > meeting space with more rooms.
> 
> One of the things that I've been trying to suggest for at least the last
> year or two is that we need to colocate LSF/MM with a larger conference.
> In my mind, what would be great would be something sort of like
> Plumbers, but in the first half of year.  The general idea would be to
> have two major systems-level conferences about six months apart.
> 
> The LSF/MM conference could still be invite only, much like we have
> had the Maintainer's Summit and the Networking Summit colocated with
> Plumbers in Lisbon in 2019 and Vancouver in 2018.  But it would be
> colocated with other topic specific workshops / summits, and there
> would be space for topics like what you described below:
> 
> > There are 11 people on that list, plus Jason, plus three more that I
> > recommended.  That's 15, just for that one topic.  I think maybe half
> > of those people will get an invite anyway, but adding on an extra 5-10
> > people for (what I think is) a critically important topic at the very
> > nexus of storage, filesystems, memory management, networking and graphics
> > is almost certainly out of bounds for the scale of the current conference.
> 
> After all, this is *precisely* the scaling problem that we had with
> the Kernel Summit.  The LSF/MM summit can really only deal with
> subjects that require high-level coordination between maintainers.
> For more focused topics, we will need a wider set of developers than
> can fit in size constraints of the LSF/MM venue.

<nod>

> This also addresses Darrick's problem, in that most of us can probably
> point to more junior engineers that we would like to help to develop,
> which means they need to meet other Storage, File System, and MM
> developers --- both more senior ones, and other colleagues in the
> community.  Right now, we don't have a venue for this except for
> Plumbers, and it's suffering from bursting at the seams.  If we can
> encourage grow our more junior developers, it will help us delegate
> our work to a larger group of talent.  In other words, it will help us
> scale.

Agreed.  The other downside of Plumbers is that there often isn't a
storage/fs track associated with it, which in the past has made getting
funding for my own participation very difficult.  If I have to choose
between LSFMM and Plumbers, LSF probably wins.

> There are some tradeoffs to doing this; if we are going to combine
> LSF/MM with other workshops and summits into a larger "systems-level"
> conference in the first half of the year, we're not going to be able
> to fit in some of the smaller, "fun" cities, such as Palm Springs, San
> Juan, Park City, etc.
> 
> One of the things that I had suggested for 2020 was to colocate
> LSF/MM/BPF, the Kernel Summit, Maintainer's Summit, and perhaps Linux
> Security Symposium to June, in Austin.  (Why Austin?  Because finding
> kernel hackers who are interested in planning a conference in a hands
> on fashion ala Plumbers is *hard*.  And if we're going to leverage the
> LF Events Staff on short notice, holding something in the same city as
> OSS was the only real option.)  I thought it made a lot of sense last
> year, but a lot of people *hated* Austin, and they didn't want to be
> anywhere near the Product Manager "fluff" talks that unfortunately,
> are in large supply at OSS.   So that idea fell through.
> 
> In any case, this is a problem that has been recently discussed at the
> TAB, but this is not an issue where we can force anybody to do
> anything.  We need to get the stakeholders who plan all of these
> conferences to get together, and figure out something for 2021 or
> maybe 2022 that we can all live with.  It's going to require some
> compromising on all sides, and we all will have different things that
> we consider "must haves" versus "would be nice" as far as conference
> venues are concerned, and as well as dealing with financial
> constraints.
> 
> Assuming I get an invite to LSF/MM (I guess they haven't gone out
> yet?), I'd like to have a chance to chat with anyone who has strong
> opinions on this issue in Palm Springs.  Maybe we could schedule a BOF
> slot to hear from the folks who attend LSF/MM/BPF and learn what
> things we all consider important vis-a-vis the technical conferences
> that we attend?

It seems like Future of LSF Planning has enough interest for its own
BOF, yes.  I'd attend that. :)

--D

> Cheers,
> 
> 							- Ted

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 22:19           ` Dan Williams
@ 2020-02-12 22:36             ` Darrick J. Wong
  0 siblings, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 22:36 UTC (permalink / raw)
  To: Dan Williams
  Cc: Allison Collins, Eric Sandeen, Amir Goldstein, Dave Chinner, xfs,
	Eryu Guan, linux-fsdevel, lsf-pc

On Wed, Feb 12, 2020 at 02:19:40PM -0800, Dan Williams wrote:
> On Wed, Feb 12, 2020 at 2:07 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> [..]
> > I've really really wanted to be able to tell people to just send a pull
> > request for large series and skip all the email patch review stuff, but
> > I'm well aware that will start a popular revolt.  But maybe we can do
> > both?  Is it legit to ask that if you're sending more than a simple
> > quickfix, to please push a branch somewhere so that I can just yank it
> > down and have a look?  I try to do that with every series that I send, I
> > think Allison has been doing that, Christoph does it sometimes, etc.
> 
> This is begging me to point out that Konstantin has automated this
> with his get-lore-mbox tool [1]. As long as the submitter uses "git
> format-patch --base..." then the tool can automate recreating a local
> branch from a mail series.
> 
> [1]: https://lore.kernel.org/workflows/20200201030105.k6akvbjpmlpcuiky@chatter.i7.local/

Yes, I'm aware of the development of git-lore-mbox, but I'd rather just
pull directly from a developer's git repo than create branches on my
computer.  I /already/ have my own script to extract patches and
apply/paste/staple them onto a git tree, and xfsprogs already has
scripts to automate porting of libxfs changes from the kernel.
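
For the record, the "pull directly" flow is just (URL and branch name
are placeholders):

	git fetch https://git.example.org/dev/xfs.git their-topic-branch
	git checkout -b review FETCH_HEAD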

My goal here is more to change the balance of who does what work a
little bit back towards patch authors, than it is to spend less time
putting together git trees.  I mean, most of you use git already
anyway, right?  :)

--D

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-01-31  7:30 ` [Lsf-pc] " Amir Goldstein
  2020-02-01  3:20   ` Allison Collins
@ 2020-02-12 22:42   ` Darrick J. Wong
  2020-02-13 10:21     ` Amir Goldstein
  1 sibling, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-12 22:42 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Fri, Jan 31, 2020 at 09:30:02AM +0200, Amir Goldstein wrote:
> On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > Hi everyone,
> >
> > I would like to discuss how to improve the process of shepherding code
> > into the kernel to make it more enjoyable for maintainers, reviewers,
> > and code authors.  Here is a brief summary of how we got here:
> >
> > Years ago, XFS had one maintainer tending to all four key git repos
> > (kernel, userspace, documentation, testing).  Like most subsystems, the
> > maintainer did a lot of review and porting code between the kernel and
> > userspace, though with help from others.
> >
> > It turns out that this didn't scale very well, so we split the
> > responsibilities into three maintainers.  Like most subsystems, the
> > maintainers still did a lot of review and porting work, though with help
> > from others.
> >
> > It turns out that this system doesn't scale very well either.  Even with
> > three maintainers sharing access to the git trees and working together
> > to get reviews done, mailing list traffic has been trending upwards for
> > years, and we still can't keep up.  I fear that many maintainers are
> > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > testing of the git trees, but keeping up with the mail and the reviews.
> >
> > So what do we do about this?  I think we (the XFS project, anyway)
> > should increase the amount of organizing in our review process.  For
> > large patchsets, I would like to improve informal communication about
> > who the author might like to have conduct a review, who might be
> > interested in conducting a review, estimates of how much time a reviewer
> > has to spend on a patchset, and of course, feedback about how it went.
> > This of course is to lay the groundwork for making a case to our bosses
> > for growing our community, allocating time for reviews and for growing
> > our skills as reviewers.
> >
> 
> Interesting.
> 
> Eryu usually posts a weekly status of xfstests review queue, often with
> a call for reviewers, sometimes with specific patch series mentioned.
> That helps me as a developer to monitor the status of my own work
> and it helps me as a reviewer to put the efforts where the maintainer
> needs me the most.

I wasn't aware of that, I'll ask him to put me on the list.

> For xfs kernel patches, I can represent the voice of "new blood".
> Getting new people to join the review effort is quite a hard barrier.
> I have taken a few stabs at doing review for xfs patch series over
> the years, but it mostly ends up feeling like it helped me (get to
> know xfs code better) more than it helped the maintainer, because
> the chances of a new reviewer catching meaningful bugs are very
> low, and if another reviewer is going to go over the same patch
> series, the chances of the novice catching bugs that the other
> reviewer will not are extremely low.

Probably not, I miss small things too. :)

Especially if one has to twist through a bunch of different functions to
figure out if there's really a bug there.

> However, there are quite a few cleanup and refactoring patch series,
> especially on the xfs list, where a review from an "outsider" could still
> be of value to the xfs community. OTOH, for xfs maintainer, those are
> the easy patches to review, so is there a gain in offloading those reviews?

Yes, because there's still a ton of things I suck at -- running sparse
and smatch on the git trees, figuring out how a given patch will affect
xfsprogs, etc.

> Bottom line - a report of the subsystem review queue status, call for
> reviewers and highlighting specific areas in need of review is a good idea.
> Developers responding to that report publicly with availability for review,
> intention and expected time frame for taking on a review would be helpful
> for both maintainers and potential reviewers.

<nod>

--D

> Thanks,
> Amir.
> 
> > ---
> >
> > I want to spend the time between right now and whenever this discussion
> > happens to make a list of everything that works and that could be made
> > better about our development process.
> >
> > I want to spend five minutes at the start of the discussion to
> > acknowledge everyone's feelings around that list that we will have
> > compiled.
> >
> > Then I want to spend the rest of the session breaking up the problems
> > into small enough pieces to solve, discussing solutions to those
> > problems, and (ideally) pushing towards a consensus on what series of
> > small adjustments we can make to arrive at something that works better
> > for everyone.
> >
> > --D
> > _______________________________________________
> > Lsf-pc mailing list
> > Lsf-pc@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/lsf-pc

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-09 17:12       ` Allison Collins
  2020-02-12  0:21         ` NeilBrown
  2020-02-12 22:06         ` Darrick J. Wong
@ 2020-02-12 23:39         ` Dave Chinner
  2020-02-13 15:19           ` Brian Foster
  2 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2020-02-12 23:39 UTC (permalink / raw)
  To: Allison Collins
  Cc: Amir Goldstein, Darrick J. Wong, lsf-pc, linux-fsdevel, xfs,
	Eryu Guan, Eric Sandeen

Hi Allison,

These are some very good observations and questions. I've found it
hard to write a meaningful response because there are many things I
just take for granted having done this for many many years...

I don't have all the answers. Lots of what I say is opinion (see my
comments about opinions below!) based on experience and the best I
can do to help improve the situation is to share what I think and
let people reading this decide what might be useful to their own
situations.

On Sun, Feb 09, 2020 at 10:12:03AM -0700, Allison Collins wrote:
> On 2/2/20 2:46 PM, Dave Chinner wrote:
> > On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> > > On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > > > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > Bottom line - a report of the subsystem review queue status, call for
> > > > reviewers and highlighting specific areas in need of review is a good idea.
> > > > Developers responding to that report publicly with availability for review,
> > > > intention and expected time frame for taking on a review would be helpful
> > > > for both maintainers and potential reviewers.
> > > I definitely think that would help delegate review efforts a little more.
> > > That way it's clear what people are working on, and what still needs
> > > attention.
> > 
> > It is not the maintainer's responsibility to gather reviewers. That
> > is entirely the responsibility of the patch submitter. That is, if
> > the code has gone unreviewed, it is up to the submitter to find
> > people to review the code, not the maintainer. If you, as a
> > developer, are unable to find people willing to review your code
> > then it's a sign you haven't been reviewing enough code yourself.
> > 
> > Good reviewers are a valuable resource - as a developer I rely on
> > reviewers to get my code merged, so if I don't review code and
> > everyone else behaves the same way, how can I possibly get my code
> > merged? IOWs, review is something every developer should be spending
> > a significant chunk of their time on. IMO, if you are not spending
> > *at least* a whole day a week reviewing code, you're not actually
> > doing enough code review to allow other developers to be as
> > productive as you are.
> > 
> > The more you review other people's code, the more you learn about
> > the code and the more likely other people will be to review your
> > code because they know you'll review their code in turn.  It's a
> > positive reinforcement cycle that benefits both the individual
> > developers personally and the wider community.
> > 
> > But this positive reinforcement cycle just doesn't happen if people
> > avoid reviewing code because they think "I don't know anything so my
> > review is not going to be worth anything".  Constructive review, no
> > matter whether it's performed at a simple or complex level, is
> > always valuable.
> > 
> > Cheers,
> > 
> > Dave.
> > 
> Well, I can see the response is meant to be encouraging, and you are right
> that everyone needs to give to receive :-)
> 
> I have thought a lot about this, and I do have some opinions about how
> the process is described to work vs. how it ends up working, though.
> There have been quite a few times when I got conflicting reviews from
> multiple reviewers.  I suspect either because reviewers are not seeing
> each other's reviews, or
> because it is difficult for people to recall or even find discussions on
> prior revisions.  And so at times, I find myself puzzling a bit trying to
> extrapolate what the community as a whole really wants.

IMO, "what the community wants" really doesn't matter in the end;
the community really doesn't know what it wants.  Note: I'm not
saying that "the community doesn't matter", what I'm trying to say
is that the community is full of diverse opinions and that they are
just that: opinions. And in the end, it's the code that is preserved
forever and the opinions get forgotten.

It's also worth reminding yourself that the author of the code is
also part of the community, and that they are the only community
member that has direct control of the changes that will be made to
the code.  Hence, to me, the only thing that really matters is
answering this question: "What am I, the author of this code,
willing to compromise on to get this code merged?"

That comes down to separating fact from opinion - review comments
are often opinion. e.g. A reviewer finds a bug then suggests how to
fix it.  There are two things here - the bug is fact, the suggested
fix is opinion. The fact needs to be addressed (i.e. the bug must be
fixed), but it does not need to be fixed the way the reviewer
suggested as that is an opinion.

Hence it can be perfectly reasonable to both agree with and disagree
with the reviewer on the same topic. Separate the fact from the
opinion, address the facts and then decide if the opinion improves
the code you have or not.  You may end up using some, all or
none of the reviewer's suggestion, but just because it was suggested
it doesn't mean you have to change the code that way. The code
author is ultimately in control of the process here.

IMO, people need to be better at saying "no" and accepting "no" as a
valid answer (both code authors and reviewers). "No" is not a
personal rejection, it's just a simple, clear statement that
further progress will not be made by continuing down that path and a
different solution/compromise will have to be found.

> For example: a reviewer may propose a minor change, perhaps a style change,
> and as long as it's not terrible I assume this is just how people are used
> to seeing things implemented.  So I amend it, and in the next revision
> someone expresses that they dislike it and makes a different proposition.

At which point you need to say "no". :) Probably with the
justification that this is "rearranging the deck chairs/painting the
bikeshed again".

I do understand that it might be difficult to ignore suggestions
that people like Darrick, myself, Christoph, etc might make.
However, you should really ignore who the suggestion came from while
thinking about whether it is actually a valuable or necessary
change. ie. don't make a change just because of the _reputation_ of
the person who suggested it.

> Generally I'll mention that this change was requested, but if
> anyone feels particularly strongly about it, to please chime in.
> Most of the time I don't hear anything, I suspect because either
> the first reviewer isn't around, or they don't have time to
> revisit it?  Maybe they weren't strongly opinionated about it to
> begin with?  It could have been they were feeling pressure to
> generate reviews, or maybe an employer is measuring their
> engagement?  In any case, if it goes around a third time, I'll
> usually start including links to prior reviews to try and get
> people on the same page, but most of the time I've found the
> result is that it just falls silent.
> 
> At this point though it feels unclear to me if everyone is happy?

IMO, "everyone is happy" is not acheivable. Trying to make everyone
happy simply leads to stalemate situations where the necessary
compromises to make progress cannot be made because they'll always
make someone "unhappy".

Hence for a significant patchset I always strive for "everyone is
a bit unhappy" as the end result. If everyone is unhappy, it means
everyone has had to compromise in some way to get the work merged.
And I most certainly include myself in the "everyone is unhappy"
group, too.

To reinforce that point, I often issue rvb tags for code that is
functionally correct and does what is needed, but that I'm not totally
happy with for some reason (e.g. structure, not the way I'd solve
the problem, etc).  Preventing that code from being merged because
it's not written exactly the way I'd write/solve it is petty and
nobody wins when that happens. Yes, I'd be happier if it was
different, but it's unreasonable to expect the author to do things
exactly the way I'd like it to be done.

This sort of "unhappy but OK" compromise across the board is
generally the best you can hope for. And it's much easier to deal
with competing reviews if you aim for such an "everyone unhappy"
compromise rather than searching for the magic "everyone happy"
unicorn.

> Did we have a constructive review?  Maybe it's not a very big deal
> and I should just move on.  And in many scenarios like the one
> above, the exact outcome appears to be of little concern to people
> in the greater scheme of things.

Exactly - making everyone happy about the code doesn't really
matter. What matters is finding the necessary compromise that will
get the code merged. Indeed, it's a bit of a conundrum - getting the
code merged is what makes people happy, but before that happens you
have to work to share the unhappiness around. :/

> But this pattern does not always
> scale well in all cases.  Complex issues that persist over time
> generally do so because no one yet has a clear idea of what a
> correct solution even looks like, or perhaps cannot agree on one.
> In my experience, getting people to come together on a common goal
> requires a sort of exploratory coding effort. Like a prototype
> that people can look at, learn from, share ideas, and then adapt
> the model from there.

Right, that's pretty common. Often I'll go through 3 or 4 private
prototypes before I even get to posting something on the mailing
list, and then it will take another couple of iterations because
other people start to understand the code. And then another couple
of iterations for them to become comfortable with the code and start
to have really meaningful suggestions for improvement.

> But for that to work, they need to have
> been engaged with the history of it.  They need the common
> experience of seeing what has worked and what hasn't.  It helps
> people to let go of theories that have not performed well in
> practice, and shift to alternate approaches that have.  In a way,
> reviewers that have been historically more involved with a
> particular effort start to become a little integral to it as its
> reviewers.

"reviewers turn into collaborators".

That, in itself, is not a bad thing.

However, if you get no other reviewers you'll never get a fresh
perspective, and if your reviewers don't have the necessary
expertise the collaboration won't result in better code. It also
tends to sideline people who think the problem should be tackled a
different way. We don't want review to be a rubber stamp issued by
"yes people". We want people who disagree to voice their
disagreement and ask for change because that's how we improve the
code.

Personally, I do not review every version of every patchset simply
because I don't have the time to do that.  Hence I tend to do design
review on early patchset iterations rather than code reviews.
I need to understand the design and structure of the
change before I'll be able to look at the detail of the code and
understand whether it is correct or not. Hence I might do an early
design review from reading the patches and make high level comments,
then skip the next 3-4 iterations where the code changes
significantly as bugs are fixed and code is cleaned up. Then I'll go
back and look at the code in detail and perform an actual code
review, and I will not have any idea about what was said in the
previous review iterations. And I'll come back every 3-4 versions of
the patchset, too.

IMO, expecting reviewers to know everything that happened in all the
previous reviews is unrealistic.  If there is some reason for the
code being the way it is and it is not documented in the patch, then
the change needs more work. "Why is the code like this?" followed
by "please document this in a comment" is a reasonable review
request and you won't get those questions from reviewers who look at
every version of the patchset because they are familiar with the
history of the code.

If this sort of question is not asked during review, we end up in
the situation where someone looks at the code a year after it is
merged and there's no documentation in the code or commit messages
that explains why the code is written the way it is. This is what
makes drive-by reviews valuable - people unfamiliar with the code
see it differently and ask different questions....

IOWs, everyone is free to review code at any time with no
constraints, past requirements or future liabilities.  If the code
author wants someone to review every version of a patchset, then
they need to co-ordinate those people to set aside the time to
review the code in a prompt and efficient manner. However, expecting
that other developers cannot review a V5 patchset without first
having read all of the reviews of v1-v4 is unrealistic, especially
as this reduces the value of their review substantially as their
viewpoint is now "tainted" by knowing the history of the code up to
this point....

> Which I *think* is what Darrick may be alluding to in
> his initial proposition.  People request for certain reviewers, or
> perhaps the reviewers can volunteer to be sort of assigned to it
> in an effort to provide more constructive reviews.  In this way,
> reviewers allocate their efforts where they are most effective,
> and in doing so better distribute the work load as well.  Did I
> get that about right?  Thoughts?

[ speaking as someone who was almost completely burnt out by
having to perform most of the review, testing and maintenance tasks
for kernel, userspace and test infrastructure for several years ]

Again, I don't think the maintainer should have anything to do with
this process. Distributed development requires peers to collaborate
and self-organise without central coordination of their efforts.
Hence it is the responsibility of the code author to organise review
for code they are working on. The code author wants the code merged,
therefore they are the people with the motivation to get the code
reviewed promptly.

And, quite frankly, the maintainer has no power to force someone to
review code. Hence asking the maintainer to manage reviewers on
behalf of code authors won't work any better than
distributing/delegating that responsibility to the code author...

Historically speaking, the XFS maintainer's prime responsibility has
been to merge community reviewed code into the upstream tree, not to
review code or tell people what code they should be reviewing. The
fact that we are still placing the burden of review and
co-ordinating review on the maintainer despite ample evidence that
it causes burn-out is a failure of our community to self-organise
and protect our valuable resources....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 22:21   ` Darrick J. Wong
@ 2020-02-13  1:23     ` Dave Chinner
  0 siblings, 0 replies; 27+ messages in thread
From: Dave Chinner @ 2020-02-13  1:23 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Matthew Wilcox, lsf-pc, linux-fsdevel, xfs, Eric Sandeen, Eryu Guan

On Wed, Feb 12, 2020 at 02:21:18PM -0800, Darrick J. Wong wrote:
> On Fri, Feb 07, 2020 at 02:03:33PM -0800, Matthew Wilcox wrote:
> > Third, I hear from people who work on a specific filesystem "Of the
> > twenty or so slots for the FS part of the conference, there are about
> > half a dozen generic filesystem people who'll get an invite, then maybe
> > six filesystems who'll get two slots each, but what we really want to
> > do is get everybody working on this filesystem in a room and go over
> > our particular problem areas".
> 
> Yes!  One thousand times yes!  The best value I've gotten from LSF has
> been the in-person interlocks with the XFS/ext4/btrfs developers, even
> if the things we discuss in the hallway BOFs have not really been
> "cross-subsystem topics".

On that note, I think the rigid "3 streams and half hour timeslot"
format of LSFMM is really the biggest issue LSFMM has. Most of the
time there are only 3-4 people discussing whatever topic is
scheduled, and the other 15-20 people in the room either have no
knowledge, no interest or no interactions with the issue/code being
discussed. That has always struck me as a massive waste of valuable
face-to-face time that could be put to much better use.

Making LSFMM bigger doesn't fix this problem, either. It just makes
it worse because there are more people sitting around twiddling their
thumbs while the same 3-4 people talk across the room at each other.

Then there are the stream topics that overlap and the complete lack of
recording of the discussions. i.e. you get a schedule conflict, and
you miss out on a critically important discussion completely. You
can't even go watch it back later in the day when you're sitting
waiting for some talk you have no interest in to complete....

Further, there is no real scope to allow groups of developers to
self-organise and sit down and solve a problem they need solved that
is not on the schedule. There are no small "breakout" rooms with
tables, power, and whiteboards, etc, and hence people who aren't
engaged in the scheduled topics have nowhere they can get together
and work through problems they need solved with other developers.

If you do need to skip scheduled discussions to get a group together
to solve problems not on the schedule, then everyone loses because
there are no recordings of the discussions they missed....

Unfortunately LSFMM hasn't really attempted to facilitate this sort
of face-to-face collaboration for some time - LSFMM is for talking,
not doing. What we need are better ways of doing, not talking...

> > This kills me because LSFMM has been such a critically important part of
> > Linux development for over a decade, but I think at this point it is at
> > least not serving us the way we want it to, and may even be doing more
> > harm than good.
> 
> I don't think I'd go quite that far, but it's definitely underserving
> the people who can't get in, the people who can't go, and the people who
> are too far away but gosh it would be nice to pull them in even if it's
> only for 30 minutes over a conference call.

A live stream and a restricted access IRC channel for all local and
remote participants would be just fine for all the scheduled
"talking heads" discussions....

e.g. a small group of devs are in a breakout room working on
something directly relevant to them, and they have a laptop playing
the live stream of a discussion they are interested in but that is
less important than what they are currently doing.  Someone hears
something they need to comment on, so they quickly jump onto IRC and
their comment is noticed by everyone in the main discussion room....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 22:42   ` Darrick J. Wong
@ 2020-02-13 10:21     ` Amir Goldstein
  0 siblings, 0 replies; 27+ messages in thread
From: Amir Goldstein @ 2020-02-13 10:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-fsdevel, xfs, lsf-pc, Eric Sandeen, Eryu Guan

> >
> > Eryu usually posts a weekly status of xfstests review queue, often with
> > a call for reviewers, sometimes with specific patch series mentioned.
> > That helps me as a developer to monitor the status of my own work
> > and it helps me as a reviewer to put the efforts where the maintainer
> > needs me the most.
>
> I wasn't aware of that, I'll ask him to put me on the list.
>

I was talking about the weekly xfstests ANNOUNCE email.
Most of the time, review information amounts to "there are still patches
in my review backlog", but sometimes the information is more
specific with call for specific reviews:
https://lore.kernel.org/fstests/5ddaafc5.1c69fb81.7e284.ad99@mx.google.com/

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 22:06         ` Darrick J. Wong
  2020-02-12 22:19           ` Dan Williams
@ 2020-02-13 15:11           ` Brian Foster
  2020-02-13 15:46             ` Matthew Wilcox
  1 sibling, 1 reply; 27+ messages in thread
From: Brian Foster @ 2020-02-13 15:11 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Allison Collins, Dave Chinner, Amir Goldstein, lsf-pc,
	linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Wed, Feb 12, 2020 at 02:06:00PM -0800, Darrick J. Wong wrote:
> On Sun, Feb 09, 2020 at 10:12:03AM -0700, Allison Collins wrote:
> > 
> > 
> > On 2/2/20 2:46 PM, Dave Chinner wrote:
> > > On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> > > > 
> > > > 
> > > > On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > > > > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > I would like to discuss how to improve the process of shepherding code
> > > > > > into the kernel to make it more enjoyable for maintainers, reviewers,
> > > > > > and code authors.  Here is a brief summary of how we got here:
> > > > > > 
> > > > > > Years ago, XFS had one maintainer tending to all four key git repos
> > > > > > (kernel, userspace, documentation, testing).  Like most subsystems, the
> > > > > > maintainer did a lot of review and porting code between the kernel and
> > > > > > userspace, though with help from others.
> > > > > > 
> > > > > > It turns out that this didn't scale very well, so we split the
> > > > > > responsibilities into three maintainers.  Like most subsystems, the
> > > > > > maintainers still did a lot of review and porting work, though with help
> > > > > > from others.
> > > > > > 
> > > > > > It turns out that this system doesn't scale very well either.  Even with
> > > > > > three maintainers sharing access to the git trees and working together
> > > > > > to get reviews done, mailing list traffic has been trending upwards for
> > > > > > years, and we still can't keep up.  I fear that many maintainers are
> > > > > > burning out.  For XFS, the biggest pain point (AFAICT) is not assembly and
> > > > > > testing of the git trees, but keeping up with the mail and the reviews.
> > > > > > 
> > > > > > So what do we do about this?  I think we (the XFS project, anyway)
> > > > > > should increase the amount of organizing in our review process.  For
> > > > > > large patchsets, I would like to improve informal communication about
> > > > > > who the author might like to have conduct a review, who might be
> > > > > > interested in conducting a review, estimates of how much time a reviewer
> > > > > > has to spend on a patchset, and of course, feedback about how it went.
> > > > > > This of course is to lay the groundwork for making a case to our bosses
> > > > > > for growing our community, allocating time for reviews and for growing
> > > > > > our skills as reviewers.
> > > > > > 
> > > > > 
> > > > > Interesting.
> > > > > 
> > > > > Eryu usually posts a weekly status of xfstests review queue, often with
> > > > > a call for reviewers, sometimes with specific patch series mentioned.
> > > > > That helps me as a developer to monitor the status of my own work
> > > > > and it helps me as a reviewer to put the efforts where the maintainer
> > > > > needs me the most.
> > > > > 
> > > > > For xfs kernel patches, I can represent the voice of "new blood".
> > > > > There is quite a high barrier for new people joining the review effort.
> > > > > I have taken a few stabs at doing review for xfs patch series over the
> > > > > years, but it mostly ends up feeling like it helped me (get to know xfs
> > > > > code better) more than it helped the maintainer, because the chances of
> > > > > a new reviewer catching meaningful bugs are very low, and if another
> > > > > reviewer is going to go over the same patch series, the chances of the
> > > > > new reviewer catching bugs that the other reviewer would not are
> > > > > extremely low.
> > > > That sounds like a familiar experience.  Lots of times I'll start a review,
> > > > but then someone else will finish it before I do, and catch more things
> > > > along the way.  So I sort of feel like if it's not something I can get
> > > > through quickly, then it's not a very good distribution of work effort and I
> > > > should shift to something else. Most of the time, I'll study it until I feel
> > > > like I understand what the person is trying to do, and I might catch stuff
> > > > that appears like it may not align with that pursuit, but I don't
> > > > necessarily feel I can deem it void of all unforeseen bugs.
> > > 
> > > I think you are both underselling yourselves. Imposter syndrome and
> > > all that jazz.
> > > 
> > > The reality is that we don't need more people doing the sorts of
> > > "how does this work with the rest of XFS" reviews that people like
> > > Darrick or Christoph do. What we really need is more people looking
> > > at whether loops are correctly terminated, the right variable types
> > > are used, we don't have signed vs unsigned issues, 32 bit overflows,
> > > use the right 32/64 bit division functions, the error handling logic
> > > is correct, etc.
> > > 
> > > It's those sorts of little details that lead to most bugs, and
> > > that's precisely the sort of thing that is typically missed by an
> > > experienced developer doing a "is this the best possible
> > > implementation of this functionality" review.
> > > 
> > > A recent personal example: look at the review of Matthew Wilcox's
> > > ->readahead() series that I recently did. I noticed problems in the
> > > core change and the erofs and btrfs implementations, not because I
> > > knew anything about those filesystems, but because I was checking
> > > whether the new loops iterated the pages in the page cache
> > > correctly. i.e. all I was really looking at was variable counting
> > > and loop initialisation and termination conditions. Experience tells
> > > me this stuff is notoriously difficult to get right, so that's what
> > > I looked at....
> > > 
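
As a minimal, self-contained sketch of the kind of defect this style
of review catches (the function below is invented purely for
illustration; it is not code from any series discussed in this
thread):

	#include <stddef.h>

	/*
	 * Buggy: 'i' is unsigned, so 'i >= 0' is always true.  Once i
	 * reaches 0 the decrement wraps to SIZE_MAX and the next store
	 * is wildly out of bounds; a len == 0 call is broken from the
	 * very first iteration for the same reason.
	 */
	void zero_backwards_buggy(char *buf, size_t len)
	{
		for (size_t i = len - 1; i >= 0; i--)
			buf[i] = 0;
	}

	/* Fixed: test before decrementing, which also handles len == 0. */
	void zero_backwards_fixed(char *buf, size_t len)
	{
		for (size_t i = len; i-- > 0; )
			buf[i] = 0;
	}

Spotting that the first loop can never exit requires no filesystem
knowledge at all; checking initialisation and termination conditions
is exactly this sort of focused, time-boxed pass.
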
> > > IOWs, you don't need to know anything about the subsystem to
> > > perform such a useful review, and a lot of the time you won't find a
> > > problem. But it's still a very useful review to perform, and in
> > > doing so you've validated, to the best of your ability, that the
> > > change is sound. Put simply:
> > > 
> > > 	"I've checked <all these things> and it looks good to me.
> > > 
> > > 	Reviewed-by: Joe Bloggs <joe@blogg.com>"
> > > 
> > > This is a very useful, valid review, regardless of whether you find
> > > anything. It's also a method of review that you can use when you
> > > have limited time - rather than trying to check everything and
> > > spending hours on a patchset, pick one thing and get the entire
> > > review done in 15 minutes. Then do the same thing for the next patch
> > > set. You'll be surprised how many things you notice that aren't what
> > > you are looking for when you do this.
> > > 
> > > Hence the fact that other people find (different) issues is
> > > irrelevant - they'll be looking at different things to you, and
> > > there may not even be any overlap in the focus/scope of the reviews
> > > that have been performed. You may find the same things, but that is
> > > also not a bad thing - I intentionally don't read other reviews
> > > before I review a patch series, so that I don't taint my view of the
> > > code before I look at it (e.g., Darrick found a bug in this code, so
> > > I don't need to look at it...).
> > > 
> > > IOWs, if you are starting from the premise that "I don't know this
> > > code well enough to perform a useful review" then you are setting
> > > yourself up for failure right at the start. Read the series
> > > description, think about the change being made, use your experience
> > > to answer the question "what's a mistake I could make performing
> > > this change". Then go looking for that mistake through the
> > > patch(es). In the process of performing this review, more than
> > > likely, you'll notice bugs other than what you are actually looking
> > > for...
> > > 
> > > This does not require any deep subsystem specific knowledge, but in
> > > doing this sort of review you're going to notice things and learn
> > > about the code and slowly build your knowledge and experience about
> > > that subsystem.
> > > 
> > > > > However, there are quite a few cleanup and refactoring patch series,
> > > > > especially on the xfs list, where a review from an "outsider" could still
> > > > > be of value to the xfs community. OTOH, for the xfs maintainer, those are
> > > > > the easy patches to review, so is there a gain in offloading those reviews?
> > > > > 
> > > > > Bottom line - a report of the subsystem review queue status, call for
> > > > > reviewers and highlighting specific areas in need of review is a good idea.
> > > > > Developers responding to that report publicly with availability for review,
> > > > > intention and expected time frame for taking on a review would be helpful
> > > > > for both maintainers and potential reviewers.
> > > > I definitely think that would help delegate review efforts a little more.
> > > > That way it's clear what people are working on, and what still needs
> > > > attention.
> > > 
> > > It is not the maintainer's responsibility to gather reviewers. That
> > > is entirely the responsibility of the patch submitter. That is, if
> > > the code has gone unreviewed, it is up to the submitter to find
> > > people to review the code, not the maintainer. If you, as a
> > > developer, are unable to find people willing to review your code
> > > then it's a sign you haven't been reviewing enough code yourself.
> > > 
> > > Good reviewers are a valuable resource - as a developer I rely on
> > > reviewers to get my code merged, so if I don't review code and
> > > everyone else behaves the same way, how can I possibly get my code
> > > merged? IOWs, review is something every developer should be spending
> > > a significant chunk of their time on. IMO, if you are not spending
> > > *at least* a whole day a week reviewing code, you're not actually
> > > doing enough code review to allow other developers to be as
> > > productive as you are.
> > > 
> > > The more you review other people's code, the more you learn about
> > > the code and the more likely other people will be to review your
> > > code because they know you'll review their code in turn.  It's a
> > > positive reinforcement cycle that benefits both the individual
> > > developers personally and the wider community.
> > > 
> > > But this positive reinforcement cycle just doesn't happen if people
> > > avoid reviewing code because they think "I don't know anything so my
> > > review is not going to be worth anything".  Constructive review, no
> > > matter whether it's performed at a simple or complex level, is
> > > always valuable.
> > > 
> > > Cheers,
> > > 
> > > Dave.
> > > 
> > Well, I can see the response is meant to be encouraging, and you are right
> > that everyone needs to give to receive :-)
> > 
> > I have thought a lot about this, and I do have some opinions about how
> > the process is described to work vs how it ends up working though.
> > There have been quite a few times I've gotten conflicting reviews from
> > multiple reviewers. I suspect either because reviewers are not seeing
> > each other's reviews, or because it is difficult for people to recall
> > or even find discussions on prior revisions.  And so at times, I find
> > myself puzzling a bit trying to extrapolate what the community as a
> > whole really wants.
> 
> <and now jumping back in from the reply I sent to Neil Brown earlier...>
> 
> > For example: a reviewer may propose a minor change, perhaps a style change,
> > and as long as it's not terrible I assume this is just how people are used
> > to seeing things implemented.  So I amend it, and in the next revision
> > someone expresses that they dislike it and makes a different proposition.
> > Generally I'll mention that this change was requested, but if anyone feels
> > particularly strongly about it, to please chime in.  Most of the time I
> > don't hear anything, I suspect because either the first reviewer isn't
> > around, or they don't have time to revisit it?
> 
> Definitely guilty of that here. :(
> 

I think we all are to some degree. :) If enough time passes between
review cycles, I've seen such conflicting feedback come from the same
person (and I'm pretty sure I'm guilty of that one too). ;P

It can be daunting or intimidating at first because the developer is
trying to please everybody, but I think over time you kind of
realize that you aren't ever going to please everybody and that the
patch author has just as much say on subjective things as reviewers
might (I think Dave touched on this more thoroughly in his reply).
Personally, when I see such conflicting suggestions, I tend to take it
as a confirmation that either approach is generally reasonable. :)

> I've noticed that my own reviews decline in quality and coherency of
> pickiness as the version count increases.  It's not easy to tell what's
> the real difference between v2 and v3 of a series without git-am'ing
> both series, building branches, and then git diffing the branches, and
> that is /very/ time consuming.
> 
> I've really really wanted to be able to tell people to just send a pull
> request for large series and skip all the email patch review stuff, but
> I'm well aware that will start a popular revolt.  But maybe we can do
> both?  Is it legit to ask that if you're sending more than a simple
> quickfix, to please push a branch somewhere so that I can just yank it
> down and have a look?  I try to do that with every series that I send, I
> think Allison has been doing that, Christoph does it sometimes, etc.
> 

I'm not a fan of "skipping all the email patch review stuff," but I
think it's reasonable to ask for a public repo to pull from for larger
series if that makes things easier and provided we have a host generally
available to developers. Not needing to set one up is really the only
reason I haven't done that to this point.

> > Maybe they weren't strongly
> > opinionated about it to begin with?  It could have been they were feeling
> > pressure to generate reviews, or maybe an employer is measuring their
> > engagement?  In any case, if it goes around a third time, I'll usually start
> > including links to prior reviews to try and get people on the same page, but
> > most of the time I've found the result is that it just falls silent.
> 
> (And that's where we end up at this annoying thing of "Well the
> reviewers don't agree, the maintainer notices the awkward silence, and
> eventually gives in and 'resolves' the conflict by reading the whole
> series...)
> 

I think it's also fair for the maintainer to indicate that the overhead
to resolve some focused conflict in the middle of some huge series is
too high to be worth the time commitment and that the author/reviewers
should resolve it amongst themselves to get the code merged. ;)

> > At this point though it feels unclear to me if everyone is happy?  Did we
> > have a constructive review?  Maybe it's not a very big deal and I should
> > just move on.
> 
> Only you can't, unless the maintainer either pulls the patches or sends
> back a forever-nak.  The whole process of shepherding code into the
> kernel grinds to a halt because we can't agree on things like "two
> functions with similar argument lists or one function with flags?".
> 
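
To make that concrete, the sort of choice being argued over looks
something like the sketch below; the names are made up purely for
illustration and neither variant comes from this thread:

	struct xfs_foo;

	/* Option A: two helpers with near-identical argument lists. */
	int xfs_foo_alloc_inline(struct xfs_foo *fp, unsigned int len);
	int xfs_foo_alloc_cow(struct xfs_foo *fp, unsigned int len);

	/* Option B: one helper that takes behaviour flags. */
	#define XFS_FOO_ALLOC_INLINE	(1U << 0)
	#define XFS_FOO_ALLOC_COW	(1U << 1)
	int xfs_foo_alloc(struct xfs_foo *fp, unsigned int len,
			  unsigned int flags);

Both options are functionally equivalent, which is why neither side
can win the argument on technical grounds alone.
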
> > And in many scenarios like the one above, the exact outcome
> > appears to be of little concern to people in the greater scheme of things.
> > But this pattern does not always scale well in all cases.  Complex issues
> > that persist over time generally do so because no one yet has a clear idea
> > of what a correct solution even looks like, or perhaps cannot agree on one.
> > In my experience, getting people to come together on a common goal requires
> > a sort of exploratory coding effort.
> 
> Ok, so there's this yearslong D** mess where we can't quite agree on how
> the user interface should work.
> 
> TBH, I've wondered if we could reduce the amount of resistance to new
> code and ABIs by making /every/ new ioctl and syscall announce itself as
> EXPERIMENTAL for the first 6-9mo to give people time to prototype how
> these things will work for real, and while we won't change the behavior
> capriciously during that time, if we have to break things to prevent a
> greater harm, we can do so without having to rev the whole interface.
> 
> I've also wished we had a better way to prototype new things in XFS
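
A hypothetical sketch of what that announcement might look like for a
new ioctl: the handler and its name are invented for illustration;
only pr_warn_once(), the kernel's one-shot warning helper, is real:

	/* Sketch only -- not actual kernel code. */
	static int xfs_ioc_newthing(struct file *filp, void __user *arg)
	{
		/* Announce once per boot that the ABI may still change. */
		pr_warn_once("XFS: the NEWTHING ioctl is EXPERIMENTAL\n");

		/* ... normal argument copying and ioctl work ... */
		return 0;
	}

This would mirror the way XFS already warns at mount time when an
experimental feature is enabled, so the cost of the convention is
roughly one line per new interface.
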
> while not affecting the stable codebase, but various people have warned
> me not to repeat the mistake of ext4dev. :/
> 
> > Like a prototype that people can look
> > at, learn from, share ideas, and then adapt the model from there.  But for
> > that to work, they need to have been engaged with the history of it.  They
> > need the common experience of seeing what has worked and what hasn't.  It
> > helps people to let go of theories that have not performed well in practice,
> > and shift to alternate approaches that have.  In a way, reviewers that have
> > been historically more involved with a particular effort start to become a
> > little integral to it as its reviewers.  Which I *think* is what Darrick may
> > be alluding to in his initial proposition.
> 
> Yes.
> 
> > People request for certain
> > reviewers, or perhaps the reviewers can volunteer to be sort of assigned to
> > it in an effort to provide more constructive reviews.  In this way,
> > reviewers allocate their efforts where they are most effective, and in doing
> > so better distribute the work load as well.  Did I get that about right?
> > Thoughts?
> 
> And yes.  We of course have zero ability to coerce anyone to do things,
> but at the same time, we're all collaborating to build something better,
> and hoping that people organically decide to do things really doesn't
> scale any more, given that we're burning out a second generation of
> maintainers. :/
> 

I wonder if we're conflating scaling with development process
management/expectations (or whatever) a little bit here. The broader
kernel has a (fairly frequent) rolling, time-based release cycle and
generally no explicit plan as to what features go into what release over
others. IOW, patches are either ready and merged in time or not. If they
aren't, thankfully there's another release deadline to miss right around
the corner. ;)

It seems a little strange (to me) for us to try and build some complex
development process on top of that, where we'd assign reviewers, target
features to specific releases before ready, etc. I see that as just
creating more overhead than actually fixing anything, but that's just my
initial reaction..

With regard to the burnout thing, ISTM the core functionality of the
maintainer is to maintain the integrity of the subtree. That involves
things like enforcing development process (i.e., requiring r-b tags on
all patches to merge), but not necessarily being obligated to resolve
conflicts or to review every patch that comes through in detail, or
guarantee that everything sent to the list makes it into the next
release, etc. If certain things aren't being reviewed in time or making
progress and that somehow results in additional pressure on the
maintainer, ISTM that something is wrong there..?

On a more logistical note, didn't we (XFS) discuss the idea of a
rotating maintainership at one point? I know Dave had dealt with burnout
after doing this job for quite some time, Darrick stepped in and it
sounds like he is now feeling it as well (and maybe Eric, having to hold
down the xfsprogs fort). I'm not a maintainer, nor do I really want to be,
but I'd be willing to contribute to maintainer-like duties on a limited
basis if there's a need. For example, if we had a per-release rotation
of 3+ people willing to contribute, perhaps that could spread the pain
around sufficiently..? Just a thought, and of course not being a
maintainer I have no idea how realistic something like that might be..

Brian

> --D
> 
> > Allison
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-12 23:39         ` Dave Chinner
@ 2020-02-13 15:19           ` Brian Foster
  2020-02-17  0:11             ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2020-02-13 15:19 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Allison Collins, Amir Goldstein, Darrick J. Wong, lsf-pc,
	linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Thu, Feb 13, 2020 at 10:39:29AM +1100, Dave Chinner wrote:
> Hi Allison,
> 
> These are some very good observations and questions. I've found it
> hard to write a meaningful response because there are many things I
just take for granted, having done this for many, many years...
> 
> I don't have all the answers. Lots of what I say is opinion (see my
comments about opinions below!) based on experience, and the best I
> can do to help improve the situation is to share what I think and
> let people reading this decide what might be useful to their own
> situations.
> 
> On Sun, Feb 09, 2020 at 10:12:03AM -0700, Allison Collins wrote:
> > On 2/2/20 2:46 PM, Dave Chinner wrote:
> > > On Fri, Jan 31, 2020 at 08:20:37PM -0700, Allison Collins wrote:
> > > > On 1/31/20 12:30 AM, Amir Goldstein wrote:
> > > > > On Fri, Jan 31, 2020 at 7:25 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > > Bottom line - a report of the subsystem review queue status, call for
> > > > > reviewers and highlighting specific areas in need of review is a good idea.
> > > > > Developers responding to that report publicly with availability for review,
> > > > > intention and expected time frame for taking on a review would be helpful
> > > > > for both maintainers and potential reviewers.
> > > > I definitely think that would help delegate review efforts a little more.
> > > > That way it's clear what people are working on, and what still needs
> > > > attention.
> > > 
> > > It is not the maintainer's responsibility to gather reviewers. That
> > > is entirely the responsibility of the patch submitter. That is, if
> > > the code has gone unreviewed, it is up to the submitter to find
> > > people to review the code, not the maintainer. If you, as a
> > > developer, are unable to find people willing to review your code
> > > then it's a sign you haven't been reviewing enough code yourself.
> > > 
> > > Good reviewers are a valuable resource - as a developer I rely on
> > > reviewers to get my code merged, so if I don't review code and
> > > everyone else behaves the same way, how can I possibly get my code
> > > merged? IOWs, review is something every developer should be spending
> > > a significant chunk of their time on. IMO, if you are not spending
> > > *at least* a whole day a week reviewing code, you're not actually
> > > doing enough code review to allow other developers to be as
> > > productive as you are.
> > > 
> > > The more you review other people's code, the more you learn about
> > > the code and the more likely other people will be to review your
> > > code because they know you'll review their code in turn.  It's a
> > > positive reinforcement cycle that benefits both the individual
> > > developers personally and the wider community.
> > > 
> > > But this positive reinforcement cycle just doesn't happen if people
> > > avoid reviewing code because they think "I don't know anything so my
> > > review is not going to be worth anything".  Constructive review, no
> > > matter whether it's performed at a simple or complex level, is
> > > always valuable.
> > > 
> > > Cheers,
> > > 
> > > Dave.
> > > 
> > Well, I can see the response is meant to be encouraging, and you are right
> > that everyone needs to give to receive :-)
> > 
> > I have thought a lot about this, and I do have some opinions about how
> > the process is described to work vs how it ends up working though.
> > There have been quite a few times I've gotten conflicting reviews from
> > multiple reviewers. I suspect either because reviewers are not seeing
> > each other's reviews, or because it is difficult for people to recall
> > or even find discussions on prior revisions.  And so at times, I find
> > myself puzzling a bit trying to extrapolate what the community as a
> > whole really wants.
> 
> IMO, "what the community wants" really doesn't matter in the end;
> the community really doesn't know what it wants.  Note: I'm not
> saying that "the community doesn't matter", what I'm trying to say
> is that the community is full of diverse opinions and that they are
> just that: opinions. And in the end, it's the code that is preserved
> forever and the opinions get forgotten.
> 
> It's also worth reminding yourself that the author of the code is
> also part of the community, and that they are the only community
> member that has direct control of the changes that will be made to
> the code.  Hence, to me, the only thing that really matters is
> answering this question: "What am I, the author of this code,
> willing to compromise on to get this code merged?"
> 
> That comes down to separating fact from opinion - review comments
> are often opinion. e.g. A reviewer finds a bug then suggests how to
> fix it.  There are two things here - the bug is fact, the suggested
> fix is opinion. The fact needs to be addressed (i.e. the bug must be
> fixed), but it does not need to be fixed the way the reviewer
> suggested as that is an opinion.
> 
> Hence it can be perfectly reasonable to both agree with and disagree
> with the reviewer on the same topic. Separate the fact from the
> opinion, address the facts and then decide if the opinion improves
> the code you have or not.  You may end up using some, all or
> none of the reviewer's suggestion, but just because it was suggested
> it doesn't mean you have to change the code that way. The code
> author is ultimately in control of the process here.
> 
> IMO, people need to be better at saying "no" and accepting "no" as a
> valid answer (both code authors and reviewers). "No" is not a
> personal rejection, it's just a simple, clear statement that
> further progress will not be made by continuing down that path and a
> different solution/compromise will have to be found.
> 
> > For example: a reviewer may propose a minor change, perhaps a style change,
> > and as long as it's not terrible I assume this is just how people are used
> > to seeing things implemented.  So I amend it, and in the next revision
> > someone expresses that they dislike it and makes a different proposition.
> 
> At which point you need to say "no". :) Probably with the
> justification that this is "rearranging the deck chairs/painting the
> bikeshed again".
> 
> I do understand that it might be difficult to ignore suggestions
> that people like Darrick, myself, Christoph, etc might make.
> However, you should really ignore who the suggestion came from while
> thinking about whether it is actually a valuable or necessary
> change. ie. don't make a change just because of the _reputation_ of
> the person who suggested it.
> 
> > Generally I'll mention that this change was requested, but if
> > anyone feels particularly strongly about it, to please chime in.
> > Most of the time I don't hear anything, I suspect because either
> > the first reviewer isn't around, or they don't have time to
> > revisit it?  Maybe they weren't strongly opinionated about it to
> > begin with?  It could have been they were feeling pressure to
> > generate reviews, or maybe an employer is measuring their
> > engagement?  In any case, if it goes around a third time, I'll
> > usually start including links to prior reviews to try and get
> > people on the same page, but most of the time I've found the
> > result is that it just falls silent.
> > 
> > At this point though it feels unclear to me if everyone is happy?
> 
> IMO, "everyone is happy" is not acheivable. Trying to make everyone
> happy simply leads to stalemate situations where the necessary
> compromises to make progress cannot be made because they'll always
> make someone "unhappy".
> 
> Hence for a significant patchset I always strive for "everyone is
> a bit unhappy" as the end result. If everyone is unhappy, it means
> everyone has had to compromise in some way to get the work merged.
> And I most certainly include myself in the "everyone is unhappy"
> group, too.
> 
> To reinforce that point, I often issue rvb tags for code that is
> functionally correct and does what is needed, but that I'm not totally
> happy with for some reason (e.g. structure, not the way I'd solve
> the problem, etc).  Preventing that code from being merged because
> it's not written exactly the way I'd write/solve it is petty and
> nobody wins when that happens. Yes, I'd be happier if it was
> different, but it's unreasonable to expect the author to do things
> exactly the way I'd like it to be done.
> 
> This sort of "unhappy but OK" compromise across the board is
> generally the best you can hope for. And it's much easier to deal
> with competing reviews if you aim for such an "everyone unhappy"
> compromise rather than searching for the magic "everyone happy"
> unicorn.
> 
> > Did we have a constructive review?  Maybe it's not a very big deal
> > and I should just move on.  And in many scenarios like the one
> > above, the exact outcome appears to be of little concern to people
> > in the greater scheme of things.
> 
> Exactly - making everyone happy about the code doesn't really
> matter. What matters is finding the necessary compromise that will
> get the code merged. Indeed, it's a bit of a conundrum - getting the
> code merged is what makes people happy, but before that happens you
> have to work to share the unhappiness around. :/
> 
> > But this pattern does not always
> > scale well in all cases.  Complex issues that persist over time
> > generally do so because no one yet has a clear idea of what a
> > correct solution even looks like, or perhaps cannot agree on one.
> > In my experience, getting people to come together on a common goal
> > requires a sort of exploratory coding effort. Like a prototype
> > that people can look at, learn from, share ideas, and then adapt
> > the model from there.
> 
> Right, that's pretty common. Often I'll go through 3 or 4 private
> prototypes before I even get to posting something on the mailing
> list, and then it will take another couple of iterations because
> other people start to understand the code. And then another couple
> of iterations for them to become comfortable with the code and start
> to have really meaningful suggestions for improvement.
> 
> > But for that to work, they need to have
> > been engaged with the history of it.  They need the common
> > experience of seeing what has worked and what hasn't.  It helps
> > people to let go of theories that have not performed well in
> > practice, and shift to alternate approaches that have.  In a way,
> > reviewers that have been historically more involved with a
> > particular effort start to become a little integral to it as its
> > reviewers.
> 
> "reviewers turn into collaborators".
> 
> That, in itself, is not a bad thing.
> 
> However, if you get no other reviewers you'll never get a fresh
> perspective, and if your reviewers don't have the necessary
> expertise the collaboration won't result in better code. It also
> tends to sideline people who think the problem should be tackled a
> different way. We don't want review to be a rubber stamp issued by
> "yes people". We want people who disagree to voice their
> disagreement and ask for change because that's how we improve the
> code.
> 
> Personally, I do not review every version of every patchset simply
> because I don't have the time to do that.  Hence I tend to do design
> review on early patchset iterations rather than code reviews.
> I need to understand the design and structure of the
> change before I'll be able to look at the detail of the code and
> understand whether it is correct or not. Hence I might do an early
> design review from reading the patches and make high level comments,
> then skip the next 3-4 iterations where the code changes
> significantly as bugs are fixed and code is cleaned up. Then I'll go
> back and look at the code in detail and perform an actual code
> review, and I will not have any idea about what was said in the
> previous review iterations. And I'll come back every 3-4 versions of
> the patchset, too.
> 
> IMO, expecting reviewers to know everything that happened in all the
> previous reviews is unrealistic.  If there is some reason for the
> code being the way it is and it is not documented in the patch, then
> the change needs more work. "Why is the code like this?" followed
> by "please document this in a comment" is a reasonable review
> request and you won't get those questions from reviewers who look at
> every version of the patchset because they are familiar with the
> history of the code.
> 
> If this sort of question is not asked during review, we end up in
> the situation where someone looks at the code a year after it is
> merged and there's no documentation in the code or commit messages
> that explains why the code is written the way it is. This is what
> makes drive-by reviews valuable - people unfamiliar with the code
> see it differently and ask different questions....
> 
> IOWs, everyone is free to review code at any time with no
> constraints, past requirements or future liabilities.  If the code
> author wants someone to review every version of a patchset, then
> they need to co-ordinate those people to set aside the time to
> review the code in a prompt and efficient manner. However, expecting
> that other developers cannot review a V5 patchset without first
> having read all of the reviews of v1-v4 is unrealistic, especially
> as this reduces the value of their review substantially as their
> viewpoint is now "tainted" by knowing the history of the code up to
> this point....
> 
> > Which I *think* is what Darrick may be alluding to in
> > his initial proposition.  People request for certain reviewers, or
> > perhaps the reviewers can volunteer to be sort of assigned to it
> > in an effort to provide more constructive reviews.  In this way,
> > reviewers allocate their efforts where they are most effective,
> > and in doing so better distribute the work load as well.  Did I
> > get that about right?  Thoughts?
> 
> [ speaking as someone who was almost completely burnt out by
> having to perform most of the review, testing and maintenance tasks
> for kernel, userspace and test infrastructure for several years ]
> 
> Again, I don't think the maintainer should have anything to do with
> this process. Distributed development requires peers to collaborate
> and self-organise without central coordination of their efforts.
> Hence it is the responsibility of the code author to organise review
> for code they are working on. The code author wants the code merged,
> therefore they are the people with the motivation to get the code
> reviewed promptly.
> 
> And, quite frankly, the maintainer has no power to force someone to
> review code. Hence asking the maintainer to manage reviewers on
> behalf of code authors won't work any better than
> distributing/delegating that responsibility to the code author...
> 
> Historically speaking, the XFS maintainer's prime responsibility has
> been to merge community reviewed code into the upstream tree, not to
> review code or tell people what code they should be reviewing. The
> fact that we are still placing the burden of review and
> co-ordinating review on the maintainer despite ample evidence that
> it causes burn-out is a failure of our community to self-organise
> and protect our valuable resources....
> 

I agree with just about all of this mail, but I do have a question for
Darrick and Dave.. in what ways do we (XFS community) place the burden
on the maintainer to garner/coordinate review? I get the impression that
some of this is implied or unintentional simply by reviews coming in
slow or whatever, and the maintainer feeling the need to fill the gap. I
can sort of understand that behavior as being human nature, but OTOH
there's only so many developers with so much time available for review.

That's not generally something we can control. Given that, I don't think
it's necessarily useful to say that review throughput is the fundamental
problem vs. this unfortunate dynamic of limited reviews not meeting
demand leading to excess maintainer pressure. I'm trying to think about
how we can deal with this more fundamentally beyond just saying "do more
review" or having some annoying ownership process, but I think this
requires more input from you guys who have experienced it...

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-13 15:11           ` Brian Foster
@ 2020-02-13 15:46             ` Matthew Wilcox
  2020-02-16 21:55               ` Dave Chinner
  0 siblings, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2020-02-13 15:46 UTC (permalink / raw)
  To: Brian Foster
  Cc: Darrick J. Wong, Allison Collins, Dave Chinner, Amir Goldstein,
	lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Thu, Feb 13, 2020 at 10:11:00AM -0500, Brian Foster wrote:
> With regard to the burnout thing, ISTM the core functionality of the
> maintainer is to maintain the integrity of the subtree. That involves
> things like enforcing development process (i.e., requiring r-b tags on
> all patches to merge), but not necessarily being obligated to resolve
> conflicts or to review every patch that comes through in detail, or
> guarantee that everything sent to the list makes it into the next
> release, etc. If certain things aren't being reviewed in time or making
> progress and that somehow results in additional pressure on the
> maintainer, ISTM that something is wrong there..?
> 
> On a more logistical note, didn't we (XFS) discuss the idea of a
> rotating maintainership at one point? I know Dave had dealt with burnout
> after doing this job for quite some time, Darrick stepped in and it
> sounds like he is now feeling it as well (and maybe Eric, having to hold
> down the xfsprogs fort). I'm not a maintainer, nor do I really want to be,
> but I'd be willing to contribute to maintainer-like duties on a limited
> basis if there's a need. For example, if we had a per-release rotation
> of 3+ people willing to contribute, perhaps that could spread the pain
> around sufficiently..? Just a thought, and of course not being a
> maintainer I have no idea how realistic something like that might be..

Not being an XFS person, I don't want to impose anything, but have
you read/seen Dan Vetter's talk on this subject?
https://blog.ffwll.ch/2017/01/maintainers-dont-scale.html (plenty of
links to follow, including particularly https://lwn.net/Articles/705228/ )

It seems like the XFS development community might benefit from a
group maintainer model.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-13 15:46             ` Matthew Wilcox
@ 2020-02-16 21:55               ` Dave Chinner
  2020-02-19  0:29                 ` Darrick J. Wong
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2020-02-16 21:55 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Brian Foster, Darrick J. Wong, Allison Collins, Amir Goldstein,
	lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Thu, Feb 13, 2020 at 07:46:32AM -0800, Matthew Wilcox wrote:
> On Thu, Feb 13, 2020 at 10:11:00AM -0500, Brian Foster wrote:
> > With regard to the burnout thing, ISTM the core functionality of the
> > maintainer is to maintain the integrity of the subtree. That involves
> > things like enforcing development process (i.e., requiring r-b tags on
> > all patches to merge), but not necessarily being obligated to resolve
> > conflicts or to review every patch that comes through in detail, or
> > guarantee that everything sent to the list makes it into the next
> > release, etc. If certain things aren't being reviewed in time or making
> > progress and that somehow results in additional pressure on the
> > maintainer, ISTM that something is wrong there..?
> > 
> > On a more logistical note, didn't we (XFS) discuss the idea of a
> > rotating maintainership at one point? I know Dave had dealt with burnout
> > after doing this job for quite some time, Darrick stepped in and it
> > sounds like he is now feeling it as well (and maybe Eric, having to hold
> > down the xfsprogs fort). I'm not a maintainer, nor do I really want to be,
> > but I'd be willing to contribute to maintainer-like duties on a limited
> > basis if there's a need. For example, if we had a per-release rotation
> > of 3+ people willing to contribute, perhaps that could spread the pain
> > around sufficiently..? Just a thought, and of course not being a
> > maintainer I have no idea how realistic something like that might be..
> 
> Not being an XFS person, I don't want to impose anything, but have
> you read/seen Dan Vetter's talk on this subject?
> https://blog.ffwll.ch/2017/01/maintainers-dont-scale.html (plenty of
> links to follow, including particularly https://lwn.net/Articles/705228/ )

Yes, and I've talked to Dan in great detail about it over several
past LCAs... :)

> It seems like the XFS development community might benefit from a
> group maintainer model.

Yes, it may well do. The problem is the pool of XFS developers is
*much smaller* than the graphics subsystem's, and so a "group
maintainership" is kinda what we do already. I mean, I do have
commit rights to the XFS trees on kernel.org, even though I'm not a
maintainer. I'm essentially the backup at this point - if someone
needs to take time off, I'll take over.

I think the biggest barrier to maintaining the XFS tree is the
amount of integration testing that the maintainer does on the merged
tree.  That's non-trivial, especially with how long it takes to run
fstests these days. If you're not set up to run QA 24x7 across a
bunch of different configs, then you need to get that into place
before being able to do the job of maintaining the XFS kernel
tree...

If everyone had that capability and resources at their disposal, then
rotating the tree maintenance responsibilities would be much
easier...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-13 15:19           ` Brian Foster
@ 2020-02-17  0:11             ` Dave Chinner
  2020-02-17 15:01               ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Dave Chinner @ 2020-02-17  0:11 UTC (permalink / raw)
  To: Brian Foster
  Cc: Allison Collins, Amir Goldstein, Darrick J. Wong, lsf-pc,
	linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Thu, Feb 13, 2020 at 10:19:28AM -0500, Brian Foster wrote:
> I agree with just about all of this mail, but I do have a question for
> Darrick and Dave.. in what ways do we (XFS community) place the burden
> on the maintainer to garner/coordinate review? I get the impression that
> some of this is implied or unintentional simply by reviews coming in
> slow or whatever, and the maintainer feeling the need to fill the gap. I

That's pretty much it. If nobody else reviews the code, it has
fallen to the maintainer to review it so that it can be merged. i.e.
review is needed to make forwards progress, so if nobody reviews
the code for weeks, it essentially falls to the maintainer to do a
review to determine whether the patchset should be merged or not.

> can sort of understand that behavior as being human nature, but OTOH
> there's only so many developers with so much time available for review.

Yet all those developers seem happy enough to write code under the
assumption that someone will review it....

> That's not generally something we can control. Given that, I don't think
> it's necessarily useful to say that review throughput is the fundamental
> problem vs. this unfortunate dynamic of limited reviews not meeting
> demand leading to excess maintainer pressure. I'm trying to think about
> how we can deal with this more fundamentally beyond just saying "do more
> review" or having some annoying ownership process, but I think this
> requires more input from you guys who have experienced it...

/me rocks back and forth on his rocking chair....

Back in the days of being an SGI engineer working on IRIX - which is
where this process of review came from - each 3-monthly OS release
had a couple of "release tech leads" whose only responsibility for
those entire 3 months of the release was to review and accept code
into the Irix tree from the entirety of engineering.  Every 3-month
release had different engineers performing the release lead role. It
was not a fixed position, and when you were assigned this role you
were given leave from your team responsibilities. i.e. the release
lead performed duties for the entire engineering department, not
just a single team.

Nothing got to the release leads without first having been peer
reviewed, unit tested and signed off by the reviewer. It was the
responsibility of the person who wrote the code to find a reviewer
and to get the review done, and only then could they tick the box
in the bug system that said "here's the change, here's the tree to
pull from, please review for inclusion". That's when the release
lead was notified that a change was ready to be merged.

IOWs, the release lead only ever saw reviewed, tested, completed
code ready for merge. Everything else was the responsibility of the
engineer tasked with implementing the change/fix via the bug system.
The release lead would check all the commit metadata was correct,
that the code merged cleanly, looked sane and, finally, passed
basic integration tests. They typically could not do more review
than this because most of what they looked at was outside of their
specific area of expertise.

Once the release lead approved the change, the engineer would
then run the "commit script" which would check approvals in the bug
system and then merge the code into the main tree (i.e. the release
lead didn't actually do the commits themselves, just approved them).
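
(As an aside, the mechanical part of such a commit script is easy to
sketch in today's git terms. This is purely illustrative - the real
SGI tooling checked approvals in the bug system, not git trailers -
but the gate boils down to something like:

	#!/bin/sh
	# merge-gate: hypothetical sketch of a pre-merge check that
	# rejects a branch unless every commit in the given range
	# carries a reviewer sign-off trailer.
	range="${1:?usage: merge-gate <rev-range>}"

	fail=0
	for rev in $(git rev-list --no-merges "$range"); do
		if ! git show -s --format=%B "$rev" |
				grep -q "^Reviewed-by:"; then
			echo "missing Reviewed-by: $(git show -s --oneline "$rev")" >&2
			fail=1
		fi
	done
	exit $fail

i.e. the human gatekeeper only pushes the button; the mechanical
checks live in the script.)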

We've always held that the position of "XFS Maintainer" was
effectively that of a "Release Lead" - someone who checks over
reviewed, tested, completed code, except with the added requirement
that they maintain the tree that everyone works from and sends
requests for the tree to be merged with upstream.

That is, organising review and/or reviewing every patch in great
detail was completely outside the scope of the Release Leads' role.
They look for integration and merge issues, not the fine details of
how a feature or fix is implemented. IOWs, the Release Lead role is
one of high level staging co-ordination, testing, oversight and tree
management.

Now, compare that to the traditional role of a Lead Developer. An LD
tends to spend a lot of their time on knowledge transfer, review,
architecture and project direction. The LD typically only looks at
code in their domain of expertise, and works with other LDs at
architectural/design levels.

But when the shit hits the fan, it's the LD who digs in to that
problem and does what is necessary to Get Stuff Done or find that
difficult bug. Hence if nobody is reviewing code and that is
creating a backlog, then the LD has a responsibility to the team to
make sure that review gets done in a timely fashion so code meets
the requirement for merging.

This is a very different role to a release lead role.

The problem is that the Linux Maintainer role combines the roles of
"Lead Developer" with "Release Lead". i.e. the person who has the
release lead role for maintaining the tree for the team is also
expected to perform the LD role, and that means they are also the
"reviewer of last resort".  Hence if no-one else reviews code, it
comes down to the maintainer to do that review.

IOWs, the Linux Maintainer has the responsibility to the team to
keep making forwards progress by merging code, but they also have
the responsibility to be the person who reviews code when nobody
else does. Rather than a separation of these roles, it makes the
assumption that someone who is a good enough engineer to reach a
senior position in a team is also proficient at managing source
trees, automating testing, and dealing with upstream merges.

Making your best developer also responsible for doing largely the
same thing over and over again (merge, test, push) is not a good use
of their time. Often it's a step too far - talented engineers often
only want to build things and don't respond well to being asked to
do mundane repetitive tasks. It's kinda like asking an aerospace
engineer who designs planes to also be the mechanic that makes sure
every bolt on every plane is tight and won't come undone during
flight...

In reality, anyone who can use git and has a decent understanding of
the code base, team rules and git workflow can perform the Release
Lead role. But a fair few people have said they don't want to do it
because they are scared of making a mistake and being yelled at by
Our Mighty Leaders.

That is a result of the fact that a Linux Maintainer is seen as a
_powerful position_ because of the _control and influence_ it gives
that person. It's also treated like an exclusive club (e.g.
invite-only conferences for Maintainers) and it's effectively a "job
for life". i.e. Once people get to this level, they don't want to
step away from the role even if they are bad at it or it stresses
them out severely.  How many people do you know who have voluntarily
given up a Maintainer position because they really didn't want to do
it or they thought someone else could do a better job?

Personally, I never wanted to be a Maintainer. For a long time I was
simply a "Lead Developer" and through no choice of my own I ended up
as the community elected Maintainer. Performing the roles of LD + TL
for XFS kernel, xfsprogs and xfstests was too much for me - I
dislike the mechanical, repetitive process of managing patches and
branches for a release, and when combined with doing review for so
many different things, it eventually burnt the care-factor out
of me.

Which brings us to today - when I stepped back, we ended up with
separate maintainers for kernel, progs and fstests. fstests is
relatively low volume compared to the XFS kernel and xfsprogs so
that's been fine. However, we're seeing that the maintainers of both
of the XFS trees are showing signs of stress from the
responsibility of maintaining both trees, acting as the reviewer of
last resort, doing all the integration testing and trying to keep up with
all the craziness that is happening throughout the rest of the
kernel.

So, really, it sounds like splitting what I was doing into 3
separate Maintainers hasn't really been that successful. It hasn't
solved the burnout issues, but we've known that for quite some time.
The concept of a rotating Maintainer role was born out of the idea
that we'd move it back closer to a Release Lead role and separate it
from the reviewer of last resort problem. We haven't really got past
talking about it, though, so perhaps we need to actually do
something about it.

I know this probably doesn't really answer your question, Brian, but
I thought it might be useful to give some further background on
how/why we have the review processes we do for XFS, and where the
maintainer has tended to fit in to the picture. Up until I was
elected maintainer, we'd kept the role of release lead separate to
the role of lead developer and we didn't have a burnout problem. It
certainly seems to me like we took a step in the wrong direction by
trying to conform to the Linux norms for selecting our most
productive developers to perform the official Linux Maintainer
role....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-17  0:11             ` Dave Chinner
@ 2020-02-17 15:01               ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2020-02-17 15:01 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Allison Collins, Amir Goldstein, Darrick J. Wong, lsf-pc,
	linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Mon, Feb 17, 2020 at 11:11:54AM +1100, Dave Chinner wrote:
> On Thu, Feb 13, 2020 at 10:19:28AM -0500, Brian Foster wrote:
> > I agree with just about all of this mail, but I do have a question for
> > Darrick and Dave.. in what ways do we (XFS community) place the burden
> > on the maintainer to garner/coordinate review? I get the impression that
> > some of this is implied or unintentional simply by reviews coming in
> > slow or whatever, and the maintainer feeling the need to fill the gap. I
> 
> That's pretty much it. If nobody else reviews the code, it has
> fallen to the maintainer to review it so that it can be merged. i.e.
> review is needed to make forwards progress, so if nobody reviews
> the code for weeks, it essentially falls to the maintainer to do a
> review to determine whether the patchset should be merged or not.
> 
> > can sort of understand that behavior as being human nature, but OTOH
> > there's only so many developers with so much time available for review.
> 
> Yet all those developers seem happy enough to write code under the
> assumption that someone will review it....
> 

I think all of our regular patch contributors also do a fair amount of
review. I'm sure we could all stand to do more review at times, but that
doesn't strike me as the fundamental problem here...

> > That's not generally something we can control. Given that, I don't think
> > it's necessarily useful to say that review throughput is the fundamental
> > problem vs. this unfortunate dynamic of limited reviews not meeting
> > demand leading to excess maintainer pressure. I'm trying to think about
> > how we can deal with this more fundamentally beyond just saying "do more
> > review" or having some annoying ownership process, but I think this
> > requires more input from you guys who have experienced it...
> 
> /me rocks back and forth on his rocking chair....
> 

Heh. ;)

> Back in the days of being an SGI engineer working on IRIX - which is
> where this process of review came from - each 3-monthly OS release
> had a couple of "release tech leads" whose only responsibility for
> those entire 3 months of the release was to review and accept code
> into the Irix tree from the entirety of engineering.  Every 3-month
> release had different engineers performing the release lead role. It
> was not a fixed position, and when you were assigned this role you
> were given leave from your team responsibilities. i.e. the release
> lead performed duties for the entire engineering department, not
> just a single team.
> 
> Nothing got to the release leads without first having been peer
> reviewed, unit tested and signed off by the reviewer. It was the
> responsibility of the person who wrote the code to find a reviewer
> and to get the review done, and only then could they tick the box
> in the bug system that said "here's the change, here's the tree to
> pull from, please review for inclusion". That's when the release
> lead was notified that a change was ready to be merged.
> 
> IOWs, the release lead only ever saw reviewed, tested, completed
> code ready for merge. Everything else was the responsibility of the
> engineer tasked with implementing the change/fix via the bug system.
> The release lead would check all the commit metadata was correct,
> that the code merged cleanly, looked sane and, finally, passed
> basic integration tests. They typically could not do more review
> than this because most of what they looked at was outside of their
> specific area of expertise.
> 
> Once the release lead approved the change, the engineer would
> then run the "commit script" which would check approvals in the bug
> system and then merge the code into the main tree (i.e. the release
> lead didn't actually do the commits themselves, just approved them).
> 
> We've always held that the position of "XFS Maintainer" was
> effectively that of a "Release Lead" - someone who checks over
> reviewed, tested, completed code, except with the added requirement
> that they maintain the tree that everyone works from and sends
> requests for the tree to be merged with upstream.
> 

Yep, this has always been my general perception of the XFS maintainer
role. Merge code while acting somewhat as a gatekeeper, send pull
requests, perform some very basic coordination/communication between
developers when necessary (i.e., if working on conflicting things, a
controversial change, etc.). The maintainer also sends and reviews code,
but I've always viewed those as developer tasks as the maintainer
usually happens to also be a developer.

> That is, organising review and/or reviewing every patch in great
> detail was completely outside the scope of the Release Leads' role.
> They look for integration and merge issues, not the fine details of
> how a feature or fix is implemented. IOWs, the Release Lead role is
> one of high level staging co-ordination, testing, oversight and tree
> management.
> 
> Now, compare that to the traditional role of a Lead Developer. An LD
> tends to spend a lot of their time on knowledge transfer, review,
> architecture and project direction. The LD typically only looks at
> code in their domain of expertise, and works with other LDs at
> architectural/design levels.
> 
> But when the shit hits the fan, it's the LD who digs in to that
> problem and does what is necessary to Get Stuff Done or find that
> difficult bug. Hence if nobody is reviewing code and that is
> creating a backlog, then the LD has a responsibility to the team to
> make sure that review gets done in a timely fashion so code meets
> the requirement for merging.
> 
> This is a very different role to a release lead role.
> 
> The problem is that the Linux Maintainer role combines the roles of
> "Lead Developer" with "Release Lead". i.e. the person who has the
> release lead role for maintaining the tree for the team is also
> expected to perform the LD role, and that means they are also the
> "reviewer of last resort".  Hence if no-one else reviews code, it
> comes down to the maintainer to do that review.
> 

Hmm.. I can only give my .02, but I've never really considered a generic
maintainer as necessarily an LD by default. I wonder a bit how much of
this is a general expectation of the role, particularly since I'm not
convinced other subsystems even have as stringent review requirements as
we happen to have in XFS.

Though perhaps that is part and parcel of the same thing.. if other
subsystems don't have some form of an independent Reviewed-by:
requirement for merging patches, then the maintainer might feel more
pressure to (at least informally) review everything that comes through
by virtue of being responsible for the subtree.

> IOWs, the Linux Maintainer has the responsibility to the team to
> keep making forwards progress by merging code, but they also have
> the responsibility to be the person who reviews code when nobody
> else does. Rather than a separation of these roles, it makes the
> assumption that someone who is a good enough engineer to reach a
> senior position in a team is also proficient at managing source
> trees, automating testing, and dealing with upstream merges.
> 
> Making your best developer also responsible for doing largely the
> same thing over and over again (merge, test, push) is not a good use
> of their time. Often it's a step too far - talented engineers often
> only want to build things and don't respond well to being asked to
> do mundane repetitive tasks. It's kinda like asking an aerospace
> engineer who designs planes to also be the mechanic that makes sure
> every bolt on every plane is tight and won't come undone during
> flight...
> 
> In reality, anyone who can use git and has a decent understanding of
> the code base, team rules and git workflow can perform the Release
> Lead role. But a fair few people have said they don't want to do it
> because they are scared of making a mistake and being yelled at by
> Our Mighty Leaders.
> 
> That is a result of the fact that a Linux Maintainer is seen as a
> _powerful position_ because of the _control and influence_ it gives
> that person. It's also treated like an exclusive club (e.g.
> invite-only conferences for Maintainers) and it's effectively a "job
> for life". i.e. Once people get to this level, they don't want to
> step away from the role even if they are bad at it or it stresses
> them out severely.  How many people do you know who have voluntarily
> given up a Maintainer position because they really didn't want to do
> it or they thought someone else could do a better job?
> 

I've always found that a bit strange. I've never considered it a
glamorous position. ;P

> Personally, I never wanted to be a Maintainer. For a long time I was
> simply a "Lead Developer" and through no choice of my own I ended up
> as the community elected Maintainer. Performing the roles of LD + TL
> for XFS kernel, xfsprogs and xfstests was too much for me - I
> dislike the mechanical, repetitive process of managing patches and
> branches for a release, and when combined with doing review for so
> many different things, it eventually burnt the care-factor out
> of me.
> 
> Which brings us to today - when I stepped back, we ended up with
> separate maintainers for kernel, progs and fstests. fstests is
> relatively low volume compared to the XFS kernel and xfsprogs so
> that's been fine. However, we're seeing that the maintainers of both
> of the XFS trees are showing signs of stress from the
> responsibility of maintaining both trees, acting as the reviewer of
> last resort, doing all the integration testing and trying to keep up with
> all the craziness that is happening throughout the rest of the
> kernel.
> 
> So, really, it sounds like splitting what I was doing into 3
> separate Maintainers hasn't really been that successful. It hasn't
> solved the burnout issues, but we've known that for quite some time.
> The concept of a rotating Maintainer role was born out of the idea
> that we'd move it back closer to a Release Lead role and separate it
> from the reviewer of last resort problem. We haven't really got past
> talking about it, though, so perhaps we need to actually do
> something about it.
> 
> I know this probably doesn't really answer your question, Brian, but
> I thought it might be useful to give some further background on
> how/why we have the review processes we do for XFS, and where the
> maintainer has tended to fit in to the picture. Up until I was
> elected maintainer, we'd kept the role of release lead separate to
> the role of lead developer and we didn't have a burnout problem. It
> certainly seems to me like we took a step in the wrong direction by
> trying to conform to the Linux norms for selecting our most
> productive developers to perform the official Linux Maintainer
> role....
> 

It kind of does and it doesn't, but a good discussion nonetheless...
While I agree with much of what you've written here, you've touched on
something that I suspected was missing up until the very end. I do
recall the time before your maintainership, but I don't recall any
explicit decision to fundamentally combine the LD and maintainer roles.
We just gave the maintainer role to somebody who also has LD
responsibilities, which is subtly different IMO. ;)

To me, the LD is more something that describes the regular contributions
of certain people than an explicit role (in Linux, anyways). The broader
point is that perhaps some of this is self-imposed by virtue of the
maintainer being somebody who is instinctively an LD (beyond saying this
is just "the Linux maintainer way of doing things," which also has an
element of truth to it).

For example, as an individual XFS developer, I have no real perception
of the natural strain that should be fed back from the maintainer in a
situation where there's a bunch of pending code, not enough review, and
enough of a hurdle to progress that the maintainer is
feeling undue pressure. That's presumably because the maintainer jumps
in to fill the gap, but even though I do try to do my part and review
as much as possible, I don't recall ever being asked to look at some
particular thing, or for thoughts on the priority of some particular
piece of work, for as far back as I can remember. So it's kind of hard for
others to help out without any kind of feedback or communication along
those lines. Somewhat related, how often do we not land patches in a
release due to lack of review? ISTM that _should_ happen from time to
time simply as a function of prioritization, and I wonder if we're
resisting that in some way.

I also think lack of communication (in both directions) makes
prioritization of patch series difficult. I.e., I might see something
pop up on the list and explicitly prioritize it as something that I
probably wouldn't get to within the next couple/few weeks, where the
maintainer might consider it unreviewed in the meantime and feel the
need to review it sooner. Unfortunately as it stands, neither person
really has any window into the other's priorities.

I'm not sure what the answer is to that. I hate the idea of "assigning"
reviewers, but at the same time it might be nice to have a way to say "I
acknowledge your series, I don't think it's critical for this release
but I plan to review it." At least then the maintainer could make a more
informed decision or poke reviewers about such things before
reviewing/merging. Tracking stuff like that is probably where it gets
difficult.

Anyways, ISTM that _something_ that separates maintainership tasks from
this implied individual responsibility to make progress might help
mitigate this. To me, maintainership is really just a set of tasks that
need to be done periodically in service to the broader project. It would
be nice if we found a way for everybody to be able to contribute in some
form from time to time, but I'm not sure how realistic that is. Perhaps
something like a rotation, or further subdivision of work like per-release
tree maintenance vs. integration/regression testing, etc. might help
force more communication between us and isolate the shared
responsibility to review/test code from the maintainer's responsibility to
merge it.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-16 21:55               ` Dave Chinner
@ 2020-02-19  0:29                 ` Darrick J. Wong
  2020-02-19  1:17                   ` Theodore Y. Ts'o
  0 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2020-02-19  0:29 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Matthew Wilcox, Brian Foster, Allison Collins, Amir Goldstein,
	lsf-pc, linux-fsdevel, xfs, Eryu Guan, Eric Sandeen

On Mon, Feb 17, 2020 at 08:55:56AM +1100, Dave Chinner wrote:
> On Thu, Feb 13, 2020 at 07:46:32AM -0800, Matthew Wilcox wrote:
> > On Thu, Feb 13, 2020 at 10:11:00AM -0500, Brian Foster wrote:
> > > With regard to the burnout thing, ISTM the core functionality of the
> > > maintainer is to maintain the integrity of the subtree. That involves
> > > things like enforcing development process (i.e., requiring r-b tags on
> > > all patches to merge), but not necessarily being obligated to resolve
> > > conflicts or to review every patch that comes through in detail, or
> > > guarantee that everything sent to the list makes it into the next
> > > release, etc. If certain things aren't being reviewed in time or making
> > > progress and that somehow results in additional pressure on the
> > > maintainer, ISTM that something is wrong there..?
> > > 
> > > On a more logistical note, didn't we (XFS) discuss the idea of a
> > > rotating maintainership at one point? I know Dave had dealt with burnout
> > > after doing this job for quite some time, Darrick stepped in and it
> > > sounds like he is now feeling it as well (and maybe Eric, having to hold
> > > down the xfsprogs fort). I'm not a maintainer, nor do I really want to be,
> > > but I'd be willing to contribute to maintainer like duties on a limited
> > > basis if there's a need. For example, if we had a per-release rotation
> > > of 3+ people willing to contribute, perhaps that could spread the pain
> > > around sufficiently..? Just a thought, and of course not being a
> > > maintainer I have no idea how realistic something like that might be..
> > 
> > Not being an XFS person, I don't want to impose anything, but have
> > you read/seen Dan Vetter's talk on this subject?
> > https://blog.ffwll.ch/2017/01/maintainers-dont-scale.html (plenty of
> > links to follow, including particularly https://lwn.net/Articles/705228/ )
> 
> Yes, and I've talked to Dan in great detail about it over several
> past LCAs... :)
> 
> > It seems like the XFS development community might benefit from a
> > group maintainer model.
> 
> Yes, it may well do. The problem is that the pool of XFS developers
> is *much smaller* than the graphics subsystem's, and so a "group
> maintainership" is kinda what we do already. I mean, I do have
> commit rights to the XFS trees on kernel.org, even though I'm not a
> maintainer. I'm essentially the backup at this point - if someone
> needs to take time off, I'll take over.
> 
> I think the biggest barrier to maintaining the XFS tree is the
> amount of integration testing that the maintainer does on the merged
> tree.  That's non-trivial, especially with how long it takes to run
> fstests these days. If you're not set up to run QA 24x7 across a
> bunch of different configs, then you need to get that into place
> before being able to do the job of maintaining the XFS kernel
> tree...
> 
> If everyone had that capability and resources at their disposal, then
> rotating the tree maintenance responsibilities would be much
> easier...

I kinda wish the LF or someone would open such a program to the kernel
maintainers.  I never liked that old maxim, "The maintainer is [the
stuckee] with the most testing resources" -- there shouldn't really have
to be a djwong cloud and a dchinner cloud. :/

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale
  2020-02-19  0:29                 ` Darrick J. Wong
@ 2020-02-19  1:17                   ` Theodore Y. Ts'o
  0 siblings, 0 replies; 27+ messages in thread
From: Theodore Y. Ts'o @ 2020-02-19  1:17 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Dave Chinner, Matthew Wilcox, Brian Foster, Allison Collins,
	Amir Goldstein, lsf-pc, linux-fsdevel, xfs, Eryu Guan,
	Eric Sandeen

On Tue, Feb 18, 2020 at 04:29:16PM -0800, Darrick J. Wong wrote:
> I kinda wish the LF or someone would open such a program to the kernel
> maintainers.  I never liked that old maxim, "The maintainer is [the
> stuckee] with the most testing resources" -- there shouldn't really have
> to be a djwong cloud and a dchinner cloud. :/

If there are people who are interested in using gce-xfstests for
testing their file systems, please contact me off-line.  I can't make
a promise that I can swing free GCE credits for *everyone*, but for
maintainers of major file systems, or senior file system developers in
general, we can probably work something out.
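
For reference, a typical run looks roughly like this - the config and
group names follow the xfstests-bld documentation, but treat the
exact flags as approximate:

	gce-xfstests smoke                # quick sanity run
	gce-xfstests -c xfs/4k -g auto    # full auto group, one config
	gce-xfstests ls                   # list test VMs still running

with the report mailed back to you once the test VM shuts itself down
(assuming email reporting is configured).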

						- Ted

P.S.  I also have blktests working in gce-xfstests; it's not clear
it's as useful, and there are a lot of test failures which I think are
test bugs, but in theory blktests can be used via gce-xfstests.  I
also have very rough Phoronix Test suite support, and I hope to try to
get mmtests running out of gce-xfstests as well.  Which probably means
I should think about renaming it.  :-)

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2020-02-19  1:18 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-31  5:25 [LSF/MM/BPF TOPIC] FS Maintainers Don't Scale Darrick J. Wong
2020-01-31  7:30 ` [Lsf-pc] " Amir Goldstein
2020-02-01  3:20   ` Allison Collins
2020-02-02 21:46     ` Dave Chinner
2020-02-09 17:12       ` Allison Collins
2020-02-12  0:21         ` NeilBrown
2020-02-12  6:58           ` Darrick J. Wong
2020-02-12 22:06         ` Darrick J. Wong
2020-02-12 22:19           ` Dan Williams
2020-02-12 22:36             ` Darrick J. Wong
2020-02-13 15:11           ` Brian Foster
2020-02-13 15:46             ` Matthew Wilcox
2020-02-16 21:55               ` Dave Chinner
2020-02-19  0:29                 ` Darrick J. Wong
2020-02-19  1:17                   ` Theodore Y. Ts'o
2020-02-12 23:39         ` Dave Chinner
2020-02-13 15:19           ` Brian Foster
2020-02-17  0:11             ` Dave Chinner
2020-02-17 15:01               ` Brian Foster
2020-02-12 21:36       ` Darrick J. Wong
2020-02-12 22:42   ` Darrick J. Wong
2020-02-13 10:21     ` Amir Goldstein
2020-02-07 22:03 ` Matthew Wilcox
2020-02-12  3:51   ` Theodore Y. Ts'o
2020-02-12 22:29     ` Darrick J. Wong
2020-02-12 22:21   ` Darrick J. Wong
2020-02-13  1:23     ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).