ksummit.lists.linux.dev archive mirror
* [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
@ 2021-09-15 17:42 Theodore Ts'o
  2021-09-15 18:03 ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2021-09-15 17:42 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Kent Overstreet, Matthew Wilcox, Linus Torvalds, linux-mm,
	linux-fsdevel, linux-kernel, Andrew Morton, Darrick J. Wong,
	Christoph Hellwig, David Howells, ksummit

Back when we could fit all or most of the Maintainers plus interested
developers in a single room, the question of how to make forward
progress on something like Folios could be hashed out in person.  These
days, all of the interested parties wouldn't fit in a single room, which
is why the Maintainers Summit focuses only on development process issues.

However, this means that when we need to make a call about what needs
to happen before Folios can be merged, we don't seem to have a good
way to make that happen.  And being a file system developer who is
eagerly looking forward to what Folios will enable, I'm a bit biased
in terms of wanting to see how we can break the logjam and move
forward.

So.... I have a proposal.  We could potentially schedule a Whither
Folios LPC BOF during one of the time slots on Friday when the
Maintainers Summit is taking place, and we arrange to have all of the
Maintainers switch over to the LPC BOF room.  If enough of the various
stakeholders for Folios are going to be attending LPC or the Maintainers
Summit, and folks (especially Linus, who ultimately needs to make the
final decision) are willing, this is something we could do.

Would this be helpful?  (Or Linus could pull either the folio or
pageset branch, and make this proposal obsolete, which would be great.  :-)

	    	      	       		 - Ted

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 17:42 [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic? Theodore Ts'o
@ 2021-09-15 18:03 ` James Bottomley
  2021-09-15 18:20   ` Theodore Ts'o
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2021-09-15 18:03 UTC (permalink / raw)
  To: Theodore Ts'o, Johannes Weiner
  Cc: Kent Overstreet, Matthew Wilcox, Linus Torvalds, linux-mm,
	linux-fsdevel, linux-kernel, Andrew Morton, Darrick J. Wong,
	Christoph Hellwig, David Howells, ksummit

On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
[...]
> Would this be helpful?  (Or Linus could pull either the folio or
> pageset branch, and make this proposal obsolete, which would be
> great.  :-)

This is a technical rather than process issue isn't it?  You don't have
enough technical people at the Maintainer summit to help meaningfully. 
The ideal location, of course, was LSF/MM which is now not happening.

However, we did offer the Plumbers BBB infrastructure to willy for a MM
gathering which could be expanded to include this.

James




* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 18:03 ` James Bottomley
@ 2021-09-15 18:20   ` Theodore Ts'o
  2021-09-15 18:41     ` Chris Mason
  0 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2021-09-15 18:20 UTC (permalink / raw)
  To: James Bottomley
  Cc: Johannes Weiner, Kent Overstreet, Matthew Wilcox, Linus Torvalds,
	linux-mm, linux-fsdevel, linux-kernel, Andrew Morton,
	Darrick J. Wong, Christoph Hellwig, David Howells, ksummit

On Wed, Sep 15, 2021 at 02:03:46PM -0400, James Bottomley wrote:
> On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
> [...]
> > Would this be helpful?  (Or Linus could pull either the folio or
> > pageset branch, and make this proposal obsolete, which would be
> > great.  :-)
> 
> This is a technical rather than process issue isn't it?  You don't have
> enough technical people at the Maintainer summit to help meaningfully. 
> The ideal location, of course, was LSF/MM which is now not happening.
> 
> However, we did offer the Plumbers BBB infrastructure to willy for a MM
> gathering which could be expanded to include this.

Well, that's why I was suggesting doing this as a LPC BOF, and using
an LPC BOF session on Friday --- I'm very much aware we don't have the
right technical people at the Maintainer Summit.

It's not clear we will have enough MM folks at the LPC, and I agree
LSF/MM would be a better venue --- but as you say, it's not happening.
We could also use the BBB infrastructure after the LPC as well, if we
can't get everyone lined up and available on short notice.  There are
a lot of different possibilities; I'm for anything that all of the
stakeholders agree will work, so we can make forward progress.

Cheers,

						- Ted


* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 18:20   ` Theodore Ts'o
@ 2021-09-15 18:41     ` Chris Mason
  2021-09-15 19:15       ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Chris Mason @ 2021-09-15 18:41 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: James Bottomley, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit


> On Sep 15, 2021, at 2:20 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> 
> On Wed, Sep 15, 2021 at 02:03:46PM -0400, James Bottomley wrote:
>> On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
>> [...]
>>> Would this be helpful?  (Or Linus could pull either the folio or
>>> pageset branch, and make this proposal obsolete, which would be
>>> great.  :-)
>> 
>> This is a technical rather than process issue isn't it?  You don't have
>> enough technical people at the Maintainer summit to help meaningfully. 
>> The ideal location, of course, was LSF/MM which is now not happening.
>> 
>> However, we did offer the Plumbers BBB infrastructure to willy for a MM
>> gathering which could be expanded to include this.
> 
> Well, that's why I was suggesting doing this as a LPC BOF, and using
> an LPC BOF session on Friday --- I'm very much aware we don't have the
> right technical people at the Maintainer Summit.
> 
> It's not clear we will have enough MM folks at the LPC, and I agree
> LSF/MM would be a better venue --- but as you say, it's not happening.
> We could also use the BBB infrastructure after the LPC as well, if we
> can't get everyone lined up and available on short notice.  There are
> a lot of different possibilities; I'm for anything where all of the
> stakeholders agree will work, so we can make forward progress.

I think the two different questions are:

* What work is left for merging folios?

* What process should we use to make the overall development of folio sized changes more predictable and rewarding for everyone involved?

-chris


* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 18:41     ` Chris Mason
@ 2021-09-15 19:15       ` James Bottomley
  2021-09-15 20:48         ` Theodore Ts'o
                           ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: James Bottomley @ 2021-09-15 19:15 UTC (permalink / raw)
  To: Chris Mason, Theodore Ts'o
  Cc: Johannes Weiner, Kent Overstreet, Matthew Wilcox, Linus Torvalds,
	linux-mm, linux-fsdevel, linux-kernel, Andrew Morton,
	Darrick J. Wong, Christoph Hellwig, David Howells, ksummit

On Wed, 2021-09-15 at 18:41 +0000, Chris Mason wrote:
> > On Sep 15, 2021, at 2:20 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> > 
> > On Wed, Sep 15, 2021 at 02:03:46PM -0400, James Bottomley wrote:
> > > On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
> > > [...]
> > > > Would this be helpful?  (Or Linus could pull either the folio
> > > > or pageset branch, and make this proposal obsolete, which would
> > > > be great.  :-)
> > > 
> > > This is a technical rather than process issue isn't it?  You
> > > don't have enough technical people at the Maintainer summit to
> > > help meaningfully.  The ideal location, of course, was LSF/MM
> > > which is now not happening.
> > > 
> > > However, we did offer the Plumbers BBB infrastructure to willy
> > > for a MM gathering which could be expanded to include this.
> > 
> > Well, that's why I was suggesting doing this as a LPC BOF, and
> > using an LPC BOF session on Friday --- I'm very much aware we don't
> > have the right technical people at the Maintainer Summit.
> > 
> > It's not clear we will have enough MM folks at the LPC, and I agree
> > LSF/MM would be a better venue --- but as you say, it's not
> > happening. We could also use the BBB infrastructure after the LPC
> > as well, if we can't get everyone lined up and available on short
> > notice.  There are a lot of different possibilities; I'm for
> > anything where all of the stakeholders agree will work, so we can
> > make forward progress.
> 
> I think the two different questions are:
> 
> * What work is left for merging folios?

My reading of the email threads is that they're iterating to an actual
conclusion (I admit, I'm surprised) ... or at least the disagreements
are getting less.  Since the merge window closed this is now a 5.16
thing, so there's no huge urgency to getting it resolved next week.

> * What process should we use to make the overall development of folio
> sized changes more predictable and rewarding for everyone involved?

Well, the current one seems to be working (admittedly eventually, so
achieving faster resolution next time might be good) ... but I'm sure
you could propose alternatives ... especially in the time to resolution
department.

James




* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 19:15       ` James Bottomley
@ 2021-09-15 20:48         ` Theodore Ts'o
  2021-09-16 14:55           ` Kent Overstreet
  2021-09-16 13:51         ` David Howells
  2021-09-16 16:46         ` Chris Mason
  2 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2021-09-15 20:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Chris Mason, Johannes Weiner, Kent Overstreet, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Wed, Sep 15, 2021 at 03:15:13PM -0400, James Bottomley wrote:
> 
> My reading of the email threads is that they're iterating to an actual
> conclusion (I admit, I'm surprised) ... or at least the disagreements
> are getting less.  Since the merge window closed this is now a 5.16
> thing, so there's no huge urgency to getting it resolved next week.

My read was that it was more that people were just getting exhausted,
and not necessarily that folks were converging.  (Also, Willy is
currently on vacation.)

I'm happy to be wrong, but the patches haven't changed since the merge
window opened, and it's not clear what *needs* to change before it can
be accepted at the next merge window.

> Well, the current one seems to be working (admittedly eventually, so
> achieving faster resolution next time might be good) ... but I'm sure
> you could propose alternatives ... especially in the time to resolution
> department.

Given how long it took for DAX to converge (years and years and years
and *multiple* LSF/MM's), I'm not as optimistic that Folios is
converging and is about to be merged at the next merge window.  But
again, I'm happy to be proven wrong.

						- Ted


* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 19:15       ` James Bottomley
  2021-09-15 20:48         ` Theodore Ts'o
@ 2021-09-16 13:51         ` David Howells
  2021-09-16 16:46         ` Chris Mason
  2 siblings, 0 replies; 23+ messages in thread
From: David Howells @ 2021-09-16 13:51 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: dhowells, James Bottomley, Chris Mason, Johannes Weiner,
	Kent Overstreet, Matthew Wilcox, Linus Torvalds, linux-mm,
	linux-fsdevel, linux-kernel, Andrew Morton, Darrick J. Wong,
	Christoph Hellwig, ksummit

Theodore Ts'o <tytso@mit.edu> wrote:

> > My reading of the email threads is that they're iterating to an actual
> > conclusion (I admit, I'm surprised) ... or at least the disagreements
> > are getting less.  Since the merge window closed this is now a 5.16
> > thing, so there's no huge urgency to getting it resolved next week.
> 
> My read was that it was more that people were just getting exhausted,
> and not necessarily that folks were converging.

The problem, from where I sit, is that I'd started rebasing my stuff on top of
Willy's patches and making use of them in the expectation that they were
likely to go in - and I think other people might have been doing that too
based on some of the comments.

However, that's all been thrown up in the air.  Not only did they not get
merged in this window, it's not currently looking certain that they'd get
merged in the next window either.

So what do I do?  Do I defoliate my patches - which then risks merge conflicts
with the folio patches?  Or do I stick with the foliation and hope that
Willy's goes in next time?

Some guidance as to what's likely to happen to the folio patches would be
really appreciated!

David



* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 20:48         ` Theodore Ts'o
@ 2021-09-16 14:55           ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2021-09-16 14:55 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: James Bottomley, Chris Mason, Johannes Weiner, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Wed, Sep 15, 2021 at 04:48:55PM -0400, Theodore Ts'o wrote:
> On Wed, Sep 15, 2021 at 03:15:13PM -0400, James Bottomley wrote:
> > 
> > My reading of the email threads is that they're iterating to an actual
> > conclusion (I admit, I'm surprised) ... or at least the disagreements
> > are getting less.  Since the merge window closed this is now a 5.16
> > thing, so there's no huge urgency to getting it resolved next week.
> 
> My read was that it was more that people were just getting exhausted,
> and not necessarily that folks were converging.  (Also, Willy is
> currently on vacation.)
> 
> I'm happy to be wrong, but the patches haven't changed since the merge
> window opened, and it's not clear what *needs* to change before it can
> be accepted at the next merge window.

I've personally been pretty disappointed by how the discussions went off the
rails. I don't think Willy was doing the best job of explaining and advocating
for his design decisions, and some of the objections of the MM people have been
just crazypants.

One thing I want to make clear: folios aren't about compound pages, compound
pages are just the MM-side mechanism for describing higher order allocations.
And folios are for filesystem pages (possibly including anonymous pages going
forward); they're _not_ for slab. 

Historically, we haven't had a clear allocator/allocatee interface or
distinction in our data structures, and our taxonomy of different types of pages
is also super confusing, and both of those things have been making these
discussions _really_ hard - but also, I expect better of some of you people. All
the bikeshedding over the naming and arguing over eventualities that will never
happen because they're just pants on head stupid makes it really hard to find
people's _real_ legitimate objections when reading through these discussions. 

I'm probably waiting for Willy to get back from vacation so I can hear more of
his rationale before doing another long recap, and I'm still waiting for
Johannes to retract his NACK. One of the good things that's come out of the
discussions with Johannes is we've got some good concrete ideas for cutting
apart the struct page mess - Willy has done most of the initial work, after all
- and I think it's now possible to work towards a clear distinction between
allocator and allocatee state and also separate data types for separate types of
pages. Fundamentally, the reason struct page exists at all is because we need
memory to be self describing, but a lot of stuff lives in struct page more
for convenience reasons - we have a lot of code/data sharing there that's more
accidental than principled. But I'm starting to see a way forward and it's
getting me pretty excited.

> 
> > Well, the current one seems to be working (admittedly eventually, so
> > achieving faster resolution next time might be good) ... but I'm sure
> > you could propose alternatives ... especially in the time to resolution
> > department.
> 
> Given how long it took for DAX to converge (years and years and years
> and *multiple* LSF/MM's), I'm not as optimistic that Folios is
> converging and is about to be merged at the next merge window.  But
> again, I'm happy to be proven wrong.

I hope it doesn't take _that_ long...


* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-15 19:15       ` James Bottomley
  2021-09-15 20:48         ` Theodore Ts'o
  2021-09-16 13:51         ` David Howells
@ 2021-09-16 16:46         ` Chris Mason
  2021-09-16 17:11           ` James Bottomley
  2021-09-16 17:15           ` Kent Overstreet
  2 siblings, 2 replies; 23+ messages in thread
From: Chris Mason @ 2021-09-16 16:46 UTC (permalink / raw)
  To: James Bottomley
  Cc: Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit


> On Sep 15, 2021, at 3:15 PM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Wed, 2021-09-15 at 18:41 +0000, Chris Mason wrote:
>>> On Sep 15, 2021, at 2:20 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>>> 
>>> On Wed, Sep 15, 2021 at 02:03:46PM -0400, James Bottomley wrote:
>>>> On Wed, 2021-09-15 at 13:42 -0400, Theodore Ts'o wrote:
>>>> [...]
>>>>> Would this be helpful?  (Or Linus could pull either the folio
>>>>> or pageset branch, and make this proposal obsolete, which would
>>>>> be great.  :-)
>>>> 
>>>> This is a technical rather than process issue isn't it?  You
>>>> don't have enough technical people at the Maintainer summit to
>>>> help meaningfully.  The ideal location, of course, was LSF/MM
>>>> which is now not happening.
>>>> 
>>>> However, we did offer the Plumbers BBB infrastructure to willy
>>>> for a MM gathering which could be expanded to include this.
>>> 
>>> Well, that's why I was suggesting doing this as a LPC BOF, and
>>> using an LPC BOF session on Friday --- I'm very much aware we don't
>>> have the right technical people at the Maintainer Summit.
>>> 
>>> It's not clear we will have enough MM folks at the LPC, and I agree
>>> LSF/MM would be a better venue --- but as you say, it's not
>>> happening. We could also use the BBB infrastructure after the LPC
>>> as well, if we can't get everyone lined up and available on short
>>> notice.  There are a lot of different possibilities; I'm for
>>> anything where all of the stakeholders agree will work, so we can
>>> make forward progress.
>> 
>> I think the two different questions are:
>> 
>> * What work is left for merging folios?
> 
> My reading of the email threads is that they're iterating to an actual
> conclusion (I admit, I'm surprised) ... or at least the disagreements
> are getting less.  Since the merge window closed this is now a 5.16
> thing, so there's no huge urgency to getting it resolved next week.
> 

I think the urgency is mostly around clarity for others with out of tree work, or who are depending on folios in some other way.  Setting up a clear set of conditions for the path forward should also be part of saying not-yet to merging them.

>> * What process should we use to make the overall development of folio
>> sized changes more predictable and rewarding for everyone involved?
> 
> Well, the current one seems to be working (admittedly eventually, so
> achieving faster resolution next time might be good) ... but I'm sure
> you could propose alternatives ... especially in the time to resolution
> department.

It feels like these patches are moving forward, but with a pretty heavy emotional cost for the people involved.  I'll definitely agree this has been our process for a long time, but I'm struggling to understand why we'd call it working.

In general, we've all come to terms with huge changes being a slog through  consensus building, design compromise, the actual technical work, and the rebase/test/fix iteration cycle.  It's stressful, both because of technical difficulty and because the whole process is filled with uncertainty.

With folios, we don't have general consensus on:

* Which problems are being solved?  Kent's writeup makes it pretty clear filesystems and memory management developers have diverging opinions on this.  Our process in general is to put this into patch 0.  It mostly works, but there's an intermediate step between patch 0 and the full lwn article that would be really nice to have.

* Who is responsible for accepting the design, and which acks must be obtained before it goes upstream?  Our process here is pretty similar to waiting for answers to messages in bottles.  We consistently leave it implicit and poorly defined.

* What work is left before it can go upstream?  Our process could be effectively modeled by postit notes on one person's monitor, which they may or may not share with the group.  Also, since we don't have agreement on which acks are required, there's no way to have any certainty about what work is left.  It leaves authors feeling derailed when discussion shifts and reviewers feeling frustrated and ignored.

* How do we divide up the long term future direction into individual steps that we can merge?  This also goes back to consensus on the design.  We can't decide which parts are going to get layered in future merge windows until we know if we're building a car or a banana stand.

* What tests will we use to validate it all?  Work this spread out is too big for one developer to test alone.  We need ways for people to sign up and agree on which tests/benchmarks provide meaningful results.

The end result of all of this is that missing a merge window isn't just about a time delay.  You add N months of total uncertainty, where every new email could result in having to start over from scratch.  Willy's do-whatever-the-fuck-you-want-I'm-going-on-vacation email is probably the least surprising part of the whole thread.

Internally, we tend to use a simple shared document to nail all of this down.  A two page google doc for folios could probably have avoided a lot of pain here, especially if we’re able to agree on stakeholders.

-chris


* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 16:46         ` Chris Mason
@ 2021-09-16 17:11           ` James Bottomley
  2021-09-16 19:15             ` Theodore Ts'o
  2021-09-16 20:38             ` Chris Mason
  2021-09-16 17:15           ` Kent Overstreet
  1 sibling, 2 replies; 23+ messages in thread
From: James Bottomley @ 2021-09-16 17:11 UTC (permalink / raw)
  To: Chris Mason
  Cc: Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit

On Thu, 2021-09-16 at 16:46 +0000, Chris Mason wrote:
> > On Sep 15, 2021, at 3:15 PM, James Bottomley <
> > James.Bottomley@HansenPartnership.com> wrote:
> > 
> > My reading of the email threads is that they're iterating to an
> > actual conclusion (I admit, I'm surprised) ... or at least the
> > disagreements are getting less.  Since the merge window closed this
> > is now a 5.16 thing, so there's no huge urgency to getting it
> > resolved next week.
> > 
> 
> I think the urgency is mostly around clarity for others with out of
> tree work, or who are depending on folios in some other way.  Setting
> up a clear set of conditions for the path forward should also be part
> of saying not-yet to merging them.
> 
> > > * What process should we use to make the overall development of
> > > folio sized changes more predictable and rewarding for everyone
> > > involved?
> > 
> > Well, the current one seems to be working (admittedly eventually,
> > so achieving faster resolution next time might be good) ... but I'm
> > sure you could propose alternatives ... especially in the time to
> > resolution department.
> 
> It feels like these patches are moving forward, but with a pretty
> heavy emotional cost for the people involved.  I'll definitely agree
> this has been our process for a long time, but I'm struggling to
> understand why we'd call it working.

Well ... moving forwards then.

> In general, we've all come to terms with huge changes being a slog
> through  consensus building, design compromise, the actual technical
> work, and the rebase/test/fix iteration cycle.  It's stressful, both
> because of technical difficulty and because the whole process is
> filled with uncertainty.
> 
> With folios, we don't have general consensus on:
> 
> * Which problems are being solved?  Kent's writeup makes it pretty
> clear filesystems and memory management developers have diverging
> opinions on this.  Our process in general is to put this into patch
> 0.  It mostly works, but there's an intermediate step between patch 0
> and the full lwn article that would be really nice to have.

I agree here ... but problem definition is supposed to be the job of
the submitter and fully laid out in the cover letter.

> * Who is responsible for accepting the design, and which acks must be
> obtained before it goes upstream?  Our process here is pretty similar
> to waiting for answers to messages in bottles.  We consistently leave
> it implicit and poorly defined.

My answer to this would be the same list of people who'd be responsible
for ack'ing the patches.  However, we're always very reluctant to ack
designs in case people don't like the look of the code when it appears
and don't want to be bound by the ack on the design.  I think we can
get around this by making it clear that design acks are equivalent to
"This sounds OK but I won't know for definite until I see the code"

> * What work is left before it can go upstream?  Our process could be
> effectively modeled by postit notes on one person's monitor, which
> they may or may not share with the group.  Also, since we don't have
> agreement on which acks are required, there's no way to have any
> certainty about what work is left.  It leaves authors feeling
> derailed when discussion shifts and reviewers feeling frustrated and
> ignored.

Actually, I don't see who should ack being an unknown.  The MAINTAINERS
file covers most of the kernel and a set of scripts will tell you based
on your code who the maintainers are ... that would seem to be the
definitive ack list.

I think the problem is the ack list for features covering large areas
is large and the problems come when the ackers don't agree ... some
like it, some don't.  The only deadlock breaking mechanism we have for
this is either Linus yelling at everyone or something happening to get
everyone into alignment (like an MM summit meeting).  Our current model
seems to be every acker has a foot on the brake, which means a single
nack can derail the process.  It gets even worse if you get a couple of
nacks each requesting mutually conflicting things.

We also have this other problem of subsystems not being entirely
collaborative.  If one subsystem really likes it and another doesn't,
there's a fear in the maintainers of simply being overridden by the
pull request going through the liking subsystem's tree.  This could be
seen as a deadlock breaking mechanism, but fear of this happening
drives overreactions.

We could definitely do with a clear definition of who is allowed to nack
and when that can be overridden.

> * How do we divide up the long term future direction into individual
> steps that we can merge?  This also goes back to consensus on the
> design.  We can't decide which parts are going to get layered in
> future merge windows until we know if we're building a car or a
> banana stand.

This is usual for all large patches, though, and the author gets to
design this.

> * What tests will we use to validate it all?  Work this spread out is
> too big for one developer to test alone.  We need ways for people
> to sign up and agree on which tests/benchmarks provide meaningful
> results.

In most large patches I've worked on, the maintainers raise worry about
various areas (usually performance) and the author gets to design tests
to validate or invalidate the concern ... which can become very open
ended if the concern is vague.

> The end result of all of this is that missing a merge window isn't
> just about a time delay.  You add N months of total uncertainty,
> where every new email could result in having to start over from
> scratch.  Willy's do-whatever-the-fuck-you-want-I'm-going-on-vacation 
> email is probably the least surprising part of the whole thread.
> 
> Internally, we tend to use a simple shared document to nail all of
> this down.  A two page google doc for folios could probably have
> avoided a lot of pain here, especially if we’re able to agree on
> stakeholders.

You mean like a cover letter?  Or do you mean a living document that
the ackers could comment on and amend?

James




* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 16:46         ` Chris Mason
  2021-09-16 17:11           ` James Bottomley
@ 2021-09-16 17:15           ` Kent Overstreet
  2021-09-16 22:27             ` Chris Mason
  1 sibling, 1 reply; 23+ messages in thread
From: Kent Overstreet @ 2021-09-16 17:15 UTC (permalink / raw)
  To: Chris Mason
  Cc: James Bottomley, Theodore Ts'o, Johannes Weiner,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit

On Thu, Sep 16, 2021 at 04:46:25PM +0000, Chris Mason wrote:
> It feels like these patches are moving forward, but with a pretty heavy
> emotional cost for the people involved.  I'll definitely agree this has been
> our process for a long time, but I'm struggling to understand why we'd call it
> working.
> 
> In general, we've all come to terms with huge changes being a slog through
> consensus building, design compromise, the actual technical work, and the
> rebase/test/fix iteration cycle.  It's stressful, both because of technical
> difficulty and because the whole process is filled with uncertainty.
> 
> With folios, we don't have general consensus on:
> 
> * Which problems are being solved?  Kent's writeup makes it pretty clear
> filesystems and memory management developers have diverging opinions on this.
> Our process in general is to put this into patch 0.  It mostly works, but
> there's an intermediate step between patch 0 and the full lwn article that
> would be really nice to have.
> 
> * Who is responsible for accepting the design, and which acks must be obtained
> before it goes upstream?  Our process here is pretty similar to waiting for
> answers to messages in bottles.  We consistently leave it implicit and poorly
> defined.
> 
> * What work is left before it can go upstream?  Our process could be
> effectively modeled by postit notes on one person's monitor, which they may or
> may not share with the group.  Also, since we don't have agreement on which
> acks are required, there's no way to have any certainty about what work is
> left.  It leaves authors feeling derailed when discussion shifts and reviewers
> feeling frustrated and ignored.
> 
> * How do we divide up the long term future direction into individual steps
> that we can merge?  This also goes back to consensus on the design.  We can't
> decide which parts are going to get layered in future merge windows until we
> know if we're building a car or a banana stand.
> 
> * What tests will we use to validate it all?  Work this spread out is too big
> for one developer to test alone.  We need ways for people to sign up and agree on
> which tests/benchmarks provide meaningful results.
> 
> The end result of all of this is that missing a merge window isn't just about
> a time delay.  You add N months of total uncertainty, where every new email
> could result in having to start over from scratch.  Willy's
> do-whatever-the-fuck-you-want-I'm-going-on-vacation email is probably the
> least surprising part of the whole thread.
> 
> Internally, we tend to use a simple shared document to nail all of this down.
> A two page google doc for folios could probably have avoided a lot of pain
> here, especially if we’re able to agree on stakeholders.
> 
> -chris

Agreed on all points. We don't have a culture of talking about design changes
before doing them, and maybe we should - the Rust RFC process is another
alternate model.

That isn't always a bad thing: I have often found that my best improvements to
my own code have come from doing a lot of exploratory refactoring, keeping what
works and discarding what doesn't, trusting my intuition and then looking
afterwards at what got better, and asking myself what that tells me about what
the design wants to be.

In hindsight I feel like Willy must have been doing the same thing; I think the
folio work is opening up _really_ interesting new avenues to explore - I was one
of the people talking about compound pages in the page cache early on, yet I did
not and would not have guessed where the work was actually going to lead, and I
find myself _really_ liking it.

But more than the question of whether we write design docs up front, I frankly
think we have a _broken_ culture with respect to supporting and enabling cross
subsystem refactorings and improvements. Instead of collectively coming up with
ideas for improvements, a lot of the discussions I see end up feeling like turf
wars and bikeshedding where everyone has their pet idea they want the thing to
be and no one is taking a step back and saying "look at this mess we created,
how are we going to simplify and clean it up."

And we have created some unholy messes, especially in MM land. I've been digging
into the rmap code and trying to figure out what the _inherent, fundamental_
differences between file and anonymous pages are - I think folios should also
include anonymous pages, but not yet - and I keep finding stuff that's just
gross. Endless if (old thing) if (new thing) where literally no effort has ever
been made to figure out if these things maybe should be the same thing.

It's like - seriously people, it's ok to create messes when we're doing new
things and figuring them out for the first time, but we have to go back and
clean up our messes or we end up with an unmaintainable Cthulhian horror no one
can untangle, and a lot of the MM code is just about at that point.

And if you look at our culture for how these kinds of deep invasive new features
get developed and reviewed and added, is it really any surprise? We bikeshed
things to death, which scares people off and means they make the minimal changes
they need to core code - which means not touching the existing paths any more
than necessary, and people don't want to come back when they're done. Our
process is not encouraging good work!

And when Willy comes along with folios - which by introducing a new data type
for our main subtype of pages, are a starting point to taming this insanity - he
gets hit with the most ridiculous objections, like whether folios are a
replacement for compound pages (answer: no, compound pages belong to the other
side of the allocator/allocatee divide). It's like no one has ever heard of
separation of concerns.

To everyone involved: if you want to do competent design work you have to be
able to separate yourself from the specific problems you've been staring at and
look at the wider picture, and ask yourself if this thing you want is a good
idea for the wider ecosystem, or whether your specific problem _matters_ in this
instance.

MM people: I know you care about fragmentation, and that a lot of your work days
are spent dealing with it. But it's not a concern for folios, because we can
always _fail the allocation and allocate a smaller one_. And I have specifically
pushed back when filesystem people wanted fixed size folios because they thought
it would make their lives easier: to restate my answer to that publicly,
folios are basically extents, and part of being a filesystem developer and
dealing with extents is that you have to get used to dealing with arbitrary
sized extents - i.e. processing them incrementally, you have to be more flexible
in your thinking than when you were writing code that was working with fixed
size blocks or pages. But you'll deal.
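
As a rough illustration of the "fail the allocation and allocate a smaller
one" fallback, here is a user-space C sketch. It is not kernel code:
struct chunk, try_alloc_order(), alloc_chunk(), fill_range() and the max_ok
knob are all hypothetical names made up for this example, and malloc() merely
stands in for the page allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define UNIT_SIZE 4096u	/* stand-in for a base page size (assumption) */

/* Hypothetical variable-sized chunk: covers (1 << order) base units. */
struct chunk {
	void *mem;
	unsigned int order;
};

/* Hypothetical allocator hook: orders above max_ok pretend to fail,
 * which simulates a fragmented system. */
static void *try_alloc_order(unsigned int order, unsigned int max_ok)
{
	if (order > max_ok)
		return NULL;
	return malloc((size_t)UNIT_SIZE << order);
}

/* Try the wanted order first, then fall back to smaller orders. */
static int alloc_chunk(struct chunk *c, unsigned int want_order,
		       unsigned int max_ok)
{
	unsigned int order = want_order;

	for (;;) {
		void *mem = try_alloc_order(order, max_ok);

		if (mem) {
			c->mem = mem;
			c->order = order;
			return 0;
		}
		if (order == 0)
			return -1;	/* even the smallest size failed */
		order--;
	}
}

/* Cover nr_units incrementally with whatever chunk sizes we get.
 * Returns the number of chunks used. */
static unsigned int fill_range(unsigned int nr_units, unsigned int want_order,
			       unsigned int max_ok)
{
	unsigned int done = 0, chunks = 0;

	assert(want_order < 32);
	while (done < nr_units) {
		struct chunk c;
		unsigned int order = want_order;

		/* never ask for more than what is left of the range */
		while ((1u << order) > nr_units - done)
			order--;
		if (alloc_chunk(&c, order, max_ok) < 0)
			break;
		done += 1u << c.order;
		chunks++;
		free(c.mem);	/* a real user would keep the chunk around */
	}
	return chunks;
}
```

The caller never depends on a fixed size: when large orders fail, it simply
ends up with more, smaller chunks, e.g. fill_range(16, 4, 2) covers 16 units
with four order-2 chunks instead of one order-4 chunk.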

/end rant

I apologize in advance if anyone feels I've been unfair to them; we are all,
after all, figuring this out as we go along. But we've got room for improvement!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 17:11           ` James Bottomley
@ 2021-09-16 19:15             ` Theodore Ts'o
  2021-09-16 19:26               ` Andrew Morton
  2021-09-16 20:16               ` Kent Overstreet
  2021-09-16 20:38             ` Chris Mason
  1 sibling, 2 replies; 23+ messages in thread
From: Theodore Ts'o @ 2021-09-16 19:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: Chris Mason, Johannes Weiner, Kent Overstreet, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Thu, Sep 16, 2021 at 01:11:21PM -0400, James Bottomley wrote:
> 
> Actually, I don't see who should ack being an unknown.  The MAINTAINERS
> file covers most of the kernel and a set of scripts will tell you based
> on your code who the maintainers are ... that would seem to be the
> definitive ack list.

It's *really* not that simple.  It is *not* the case that if a change
touches a single line of fs/ext4 (as well as 60+ other filesystems),
for example:

-       ei = kmem_cache_alloc(ext4_inode_cachep, GFP_NOFS);
+       ei = alloc_inode_sb(sb, ext4_inode_cachep, GFP_NOFS);

that the submitter *must* get an ACK from me --- or that I am entitled
to NACK the entire 79 patch series for any reason I feel like, or to
withhold my ACK as hostage until the submitter does some development
work that I want.

What typically happens is if someone were to try to play games like
this inside, say, the Networking subsystem, past a certain point,
David Miller will just take the patch series, ignoring people who have
NACK's down if they can't be justified.  The difference is that
Andrew Morton (the titular maintainer for all of Memory Management,
per the MAINTAINERS file) seems to have a much lighter touch on how
the mm subsystem is run.

> I think the problem is the ack list for features covering large areas
> is large and the problems come when the ackers don't agree ... some
> like it, some don't.  The only deadlock breaking mechanism we have for
> this is either Linus yelling at everyone or something happening to get
> everyone into alignment (like an MM summit meeting).  Our current model
> seems to be every acker has a foot on the brake, which means a single
> nack can derail the process.  It gets even worse if you get a couple of
> nacks each requesting mutually conflicting things.
> 
> We also have this other problem of subsystems not being entirely
> collaborative.  If one subsystem really likes it and another doesn't,
> there's a fear in the maintainers of simply being overridden by the
> pull request going through the liking subsystem's tree.  This could be
> seen as a deadlock breaking mechanism, but fear of this happening
> drives overreactions.
> 
> We could definitely do a clear definition of who is allowed to nack and
> when can that be overridden.

Well, yes.  And this is why I think there is a process issue here that
*is* within the MAINTAINERS SUMMIT purview, and if we need a
technical BOF to settle the specific question of what needs to happen,
whether it happens at LPC, or it needs to happen after LPC, then let's
have it happen.

I'd be really disappointed if we have to wait until December 2022 for
the next LSF/MM, and if we don't get consensus there, ala DAX, that we
then have to wait until late 2023, etc.  As others have said, this is
holding up some work that file system developers would really like to
see.

					- Ted

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 19:15             ` Theodore Ts'o
@ 2021-09-16 19:26               ` Andrew Morton
  2021-09-16 20:16               ` Kent Overstreet
  1 sibling, 0 replies; 23+ messages in thread
From: Andrew Morton @ 2021-09-16 19:26 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: James Bottomley, Chris Mason, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Thu, 16 Sep 2021 15:15:29 -0400 "Theodore Ts'o" <tytso@mit.edu> wrote:

> What typically happens is if someone were to try to play games like
> this inside, say, the Networking subsystem, past a certain point,
> David Miller will just take the patch series, ignoring people who have
> NACK's down if they can't be justified.  The difference is that
> Andrew Morton (the titular maintainer for all of Memory Management,
> per the MAINTAINERS file) seems to have a much lighter touch on how
> the mm subsystem is run.

I do the Dave thing sometimes.  We aren't at that point with folios
though.  The discussions and objections and approvals are all
substantial and things are still playing out.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 19:15             ` Theodore Ts'o
  2021-09-16 19:26               ` Andrew Morton
@ 2021-09-16 20:16               ` Kent Overstreet
  2021-09-17  1:42                 ` Theodore Ts'o
  1 sibling, 1 reply; 23+ messages in thread
From: Kent Overstreet @ 2021-09-16 20:16 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: James Bottomley, Chris Mason, Johannes Weiner, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Thu, Sep 16, 2021 at 03:15:29PM -0400, Theodore Ts'o wrote:
> On Thu, Sep 16, 2021 at 01:11:21PM -0400, James Bottomley wrote:
> > 
> > Actually, I don't see who should ack being an unknown.  The MAINTAINERS
> > file covers most of the kernel and a set of scripts will tell you based
> > on your code who the maintainers are ... that would seem to be the
> > definitive ack list.
> 
> It's *really* not that simple.  It is *not* the case that if a change
> touches a single line of fs/ext4 (as well as 60+ other filesystems),
> for example:
> 
> -       ei = kmem_cache_alloc(ext4_inode_cachep, GFP_NOFS);
> +       ei = alloc_inode_sb(sb, ext4_inode_cachep, GFP_NOFS);
> 
> that the submitter *must* get an ACK from me --- or that I am entitled
> to NACK the entire 79 patch series for any reason I feel like, or to
> withhold my ACK as hostage until the submitter does some development
> work that I want.
> 
> What typically happens is if someone were to try to play games like
> this inside, say, the Networking subsystem, past a certain point,
> David Miller will just take the patch series, ignoring people who have
> NACK's down if they can't be justified.  The difference is that
> Andrew Morton (the titular maintainer for all of Memory Management,
> per the MAINTAINERS file) seems to have a much lighter touch on how
> the mm subsystem is run.
> 
> > I think the problem is the ack list for features covering large areas
> > is large and the problems come when the ackers don't agree ... some
> > like it, some don't.  The only deadlock breaking mechanism we have for
> > this is either Linus yelling at everyone or something happening to get
> > everyone into alignment (like an MM summit meeting).  Our current model
> > seems to be every acker has a foot on the brake, which means a single
> > nack can derail the process.  It gets even worse if you get a couple of
> > nacks each requesting mutually conflicting things.
> > 
> > We also have this other problem of subsystems not being entirely
> > collaborative.  If one subsystem really likes it and another doesn't,
> > there's a fear in the maintainers of simply being overridden by the
> > pull request going through the liking subsystem's tree.  This could be
> > seen as a deadlock breaking mechanism, but fear of this happening
> > drives overreactions.
> > 
> > We could definitely do a clear definition of who is allowed to nack and
> > when can that be overridden.
> 
> Well, yes.  And this is why I think there is a process issue here that
> *is* within the MAINTAINERS SUMMIT purview, and if we need a
> technical BOF to settle the specific question of what needs to happen,
> whether it happens at LPC, or it needs to happen after LPC, then let's
> have it happen.

I would love to see us putting our energy into trying to have more productive
design discussions instead of getting more rules based. If someone feels
strongly enough to NACK a patch series, usually that's an indication of a
breakdown in communications and it means we need to put more effort into
figuring out what the real disagreement is. It's not like people usually NACK
things just to be petty - and if they are, that becomes apparent when we try to
communicate with them to find out what the disagreement is and they don't respond
with the same effort.

And if people aren't being petty and are making a genuine effort to communicate
well and we're still not reaching a consensus - that does happen and there most
definitely are times when we just have differences of opinion and technical
judgement, and the maintainer will have to come to a decision. But before that
happens, we should make sure we've actually had a productive effective
discussion and figured out what those concerns and differences of opinion are,
so that the maintainer can make an _informed_ decision.

> I'd be really disappointed if we have to wait until December 2022 for
> the next LSF/MM, and if we don't get consensus there, ala DAX, that we
> then have to wait until late 2023, etc.  As others have said, this is
> holding up some work that file system developers would really like to
> see.

So I think we're still trying to answer the "what exactly is a folio" question.
As I see it, there's two potential approaches:

 - The minimalist approach, where folios are just pagecache pages

 - The maximalist approach, where folios are also anonymous pages. Potentially
   all pages that could be mapped into userspace would be folios, possibly with
   some work to unify weird driver things.

Network pages, slab pages aren't folios - they're their own thing. Folios are
also not a replacement for compound pages. Whichever way we go, folios are for
things that can be mapped into userspace.

Also: folios are a start on cutting up the unholy mess that is struct page into
separate data types. In struct page, we have a big nested union of structs, for
different types of pages. As I understand it from perusing the code, Willy has
been basically taking the first struct in the big union-of-structs and (mostly?)
making everything that uses it a folio.
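
To make the "union of structs" point concrete, here is a deliberately tiny C
sketch. The layout is an assumption for illustration only and does not match
the real struct page; toy_page, toy_folio, toy_page_folio() and
toy_folio_index() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_CACHE, TOY_SLAB, TOY_NET };

/* One struct, several unrelated meanings: which union member is valid
 * depends on what kind of page this is. */
struct toy_page {
	enum toy_type type;
	union {
		struct { void *mapping; unsigned long index; } cache;
		struct { void *slab_cache; void *freelist; } slab;
		struct { void *pool; unsigned long dma_addr; } net;
	};
};

/* A distinct type for the first member of the union.  It overlays
 * toy_page exactly, so converting costs nothing at run time. */
struct toy_folio {
	struct toy_page page;
};

static_assert(sizeof(struct toy_folio) == sizeof(struct toy_page),
	      "toy_folio must overlay toy_page exactly");

/* The one place where "is this really that kind of page?" is checked. */
static struct toy_folio *toy_page_folio(struct toy_page *p)
{
	assert(p->type == TOY_CACHE);
	return (struct toy_folio *)p;
}

/* Code taking a toy_folio cannot be handed a slab or network page
 * without going through the explicit conversion above. */
static unsigned long toy_folio_index(const struct toy_folio *f)
{
	return f->page.cache.index;
}
```

The data is unchanged; only the type says which member of the union a
function is allowed to touch, which is the sense in which this adds types to
describe the world as it is.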

I think that is reasonable, because it's basically adding types to describe the
world as it is - I would say that if it leaves things looking like a mess with
confused module boundaries between MM and FS, that's because the code was
already a mess, and while we should certainly work on cleaning that up, those
cleanups shouldn't be done in _this_ giant patch series because that's how you
end up with bugs that you can't bisect.

However, Johannes has been pointing out that it's a real open question as to
whether anonymous pages should be folios! Willy's current code seems to leave
things in a somewhat intermediate state - some mm/ code treats anonymous pages
as folios, but it's not clear to me how much. And I still see a lot of
references to page->mapping; we should be clear on what's happening to those (if
the page is a folio, we should definitely not be referencing page->mapping or
page->index).

So: should anonymous pages be more like file pages? I think that's something
worth exploring, and potentially a lot of code could be unified and deleted with
that approach - a lot of the hugepage/transhuge code is doing similar stuff as
folios, but folios look to be doing it much cleaner. There's also things like
rmap.c, which is constantly asking is this page anonymous? is it file? and doing
different things that look somewhat similar (also KSM, but that's a whole nother
bag of crazy). Johannes thinks that anonymous pages differ too much from file
pages and that trying to unify them would be a mistake - perhaps he's right.
Perhaps we should create a new type analogous to folio for those pages - if all
the current places in the code where we're asking "Is this file? Is this anon?"
really do need to be doing that, then having our types match makes sense.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 17:11           ` James Bottomley
  2021-09-16 19:15             ` Theodore Ts'o
@ 2021-09-16 20:38             ` Chris Mason
  2021-09-16 21:00               ` Konstantin Ryabitsev
  1 sibling, 1 reply; 23+ messages in thread
From: Chris Mason @ 2021-09-16 20:38 UTC (permalink / raw)
  To: James Bottomley
  Cc: Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit



> On Sep 16, 2021, at 1:11 PM, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Thu, 2021-09-16 at 16:46 +0000, Chris Mason wrote:
>> 
>> With folios, we don't have general consensus on:
>> 
>> * Which problems are being solved?  Kent's writeup makes it pretty
>> clear filesystems and memory management developers have diverging
>> opinions on this.  Our process in general is to put this into patch
>> 0.  It mostly works, but there's an intermediate step between patch 0
>> and the full lwn article that would be really nice to have.
> 
> I agree here ... but problem definition is supposed to be the job of
> the submitter and fully laid out in the cover letter.
> 
>> * Who is responsible for accepting the design, and which acks must be
>> obtained before it goes upstream?  Our process here is pretty similar
>> to waiting for answers to messages in bottles.  We consistently leave
>> it implicit and poorly defined.
> 
> My answer to this would be the same list of people who'd be responsible
> for ack'ing the patches.  However, we're always very reluctant to ack
> designs in case people don't like the look of the code when it appears
> and don't want to be bound by the ack on the design.  I think we can
> get around this by making it clear that design acks are equivalent to
> "This sounds OK but I won't know for definite until I see the code"
> 
>> * What work is left before it can go upstream?  Our process could be
>> effectively modeled by postit notes on one person's monitor, which
>> they may or may not share with the group.  Also, since we don't have
>> agreement on which acks are required, there's no way to have any
>> certainty about what work is left.  It leaves authors feeling
>> derailed when discussion shifts and reviewers feeling frustrated and
>> ignored.
> 
> Actually, I don't see who should ack being an unknown.  The MAINTAINERS
> file covers most of the kernel and a set of scripts will tell you based
> on your code who the maintainers are ... that would seem to be the
> definitive ack list.

One risk with this thread is over-pivoting on folios.  It’s a great example exactly because Willy is so well established.  If the definitive ack list is easy, how do we consistently seem to mess it up?

Part of the problem is that we just leave it unsaid.  Andrew has a list in his head of acks he’s waiting for, and Willy has a slightly different list, and Linus again has a slightly different list.  

> 
> I think the problem is the ack list for features covering large areas
> is large and the problems come when the ackers don't agree ... some
> like it, some don't.  The only deadlock breaking mechanism we have for
> this is either Linus yelling at everyone or something happening to get
> everyone into alignment (like an MM summit meeting).  Our current model
> seems to be every acker has a foot on the brake, which means a single
> nack can derail the process.  It gets even worse if you get a couple of
> nacks each requesting mutually conflicting things.

Agree here.  Mailing lists make it really hard to figure out when these conflicts are resolved, which is why I love using google docs for that part.

> 
> We also have this other problem of subsystems not being entirely
> collaborative.  If one subsystem really likes it and another doesn't,
> there's a fear in the maintainers of simply being overridden by the
> pull request going through the liking subsystem's tree.  This could be
> seen as a deadlock breaking mechanism, but fear of this happening
> drives overreactions.

I do agree, but I think this part we actually get right more often than not.  It’s one of those places where you usually see Linus using his powers for good.

> 
> We could definitely do a clear definition of who is allowed to nack and
> when can that be overridden.
> 
>> * How do we divide up the long term future direction into individual
>> steps that we can merge?  This also goes back to consensus on the
>> design.  We can't decide which parts are going to get layered in
>> future merge windows until we know if we're building a car or a
>> banana stand.
> 
> This is usual for all large patches, though, and the author gets to
> design this.

Ex: patches tripping over unrelated but useful cleanups that don’t actually have to happen first but end up as requirements for inclusion.  The examples matter less than a way to document agreement on requirements for inclusion.

> 
>> * What tests will we use to validate it all?  Work this spread out is
>> too big for one developer to test alone.  We need ways for people
>> to sign up and agree on which tests/benchmarks provide meaningful
>> results.
> 
> In most large patches I've worked on, the maintainers raise worry about
> various areas (usually performance) and the author gets to design tests
> to validate or invalidate the concern ... which can become very open
> ended if the concern is vague.
> 
>> The end result of all of this is that missing a merge window isn't
>> just about a time delay.  You add N months of total uncertainty,
>> where every new email could result in having to start over from
>> scratch.  Willy's do-whatever-the-fuck-you-want-I'm-going-on-vacation 
>> email is probably the least surprising part of the whole thread.
>> 
>> Internally, we tend to use a simple shared document to nail all of
>> this down.  A two page google doc for folios could probably have
>> avoided a lot of pain here, especially if we’re able to agree on
>> stakeholders.
> 
> You mean like a cover letter?  Or do you mean a living document that
> the ackers could comment on and amend?

A living document with a single source of truth on key design points, work remaining, and stakeholders who are responsible for ack/nack decisions.  Basically if you don’t have edit permissions on the document, you’re not one of the people that can say no.

If you do have edit permissions, you’re expected to be on board with the overall goal and help work through the design/validation/code/etc until you’re ready to ack it, or until it’s clear the whole thing isn’t going to work.  If you feel you need to have edit permissions, you’ve got a defined set of people to talk with about it.

It can’t completely replace the mailing lists, but it can take a lot of the archeology out of understanding a given patch series and figuring out if it’s actually ready to go.

-chris




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 20:38             ` Chris Mason
@ 2021-09-16 21:00               ` Konstantin Ryabitsev
  2021-09-17 11:14                 ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ryabitsev @ 2021-09-16 21:00 UTC (permalink / raw)
  To: Chris Mason
  Cc: James Bottomley, Theodore Ts'o, Johannes Weiner,
	Kent Overstreet, Matthew Wilcox, Linus Torvalds, linux-mm,
	linux-fsdevel, linux-kernel, Andrew Morton, Darrick J. Wong,
	Christoph Hellwig, David Howells, ksummit

On Thu, Sep 16, 2021 at 08:38:13PM +0000, Chris Mason wrote:
> Agree here.  Mailing lists make it really hard to figure out when these
> conflicts are resolved, which is why I love using google docs for that part.

I would caution that Google docs aren't universally accessible. China blocks
access to many Google resources, and now Russia purportedly does the same.
Perhaps a similar effect can be reached with a git repository with limited
commit access? At least then commits can be attested to individual authors.

> A living document with a single source of truth on key design points, work
> remaining, and stakeholders who are responsible for ack/nack decisions.
> Basically if you don’t have edit permissions on the document, you’re not one
> of the people that can say no.
> 
> If you do have edit permissions, you’re expected to be on board with the
> overall goal and help work through the design/validation/code/etc until
> you’re ready to ack it, or until it’s clear the whole thing isn’t going to
> work.  If you feel you need to have edit permissions, you’ve got a defined
> set of people to talk with about it.
> 
> It can’t completely replace the mailing lists, but it can take a lot of the
> archeology out of understanding a given patch series and figuring out if
> it’s actually ready to go.

You can combine the two and use mailing lists as the source of truth by using
Link: tags in commits to make it easy to verify history and provenance.

-K

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 17:15           ` Kent Overstreet
@ 2021-09-16 22:27             ` Chris Mason
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Mason @ 2021-09-16 22:27 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: James Bottomley, Theodore Ts'o, Johannes Weiner,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit


> On Sep 16, 2021, at 1:15 PM, Kent Overstreet <kent.overstreet@gmail.com> wrote:
> 

[ general agreement ]

> But more than the question of whether we write design docs up front, I frankly
> think we have a _broken_ culture with respect to supporting and enabling cross
> subsystem refactorings and improvements. Instead of collectively coming up with
> ideas for improvements, a lot of the discussions I see end up feeling like turf
> wars and bikeshedding where everyone has their pet idea they want the thing to
> be and no one is taking a step back and saying "look at this mess we created,
> how are we going to simplify and clean it up."
> 
> And we have created some unholy messes, especially in MM land.
> 

[ … ]

> It's like - seriously people, it's ok to create messes when we're doing new
> things and figuring them out for the first time, but we have to go back and
> clean up our messes or we end up with an unmaintainable Cthulhian horror no one
> can untangle, and a lot of the MM code is just about at that point.
> 

You’ve been doing a lot of bridge building recently, so please don’t take this the wrong way.  I think a key component of avoiding the turf wars is recognizing that we don’t need to make people feel shitty about their subsystem before we can convince them to improve it.  We all have different priorities around what to improve, and we’ve all made compromises over the years.  It’s enough to just be excited about how things can be better.

This email is hard to write because I’m hoping my own messages from earlier today fall into the category of being excited for improvements, but here we are.

-chris

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 20:16               ` Kent Overstreet
@ 2021-09-17  1:42                 ` Theodore Ts'o
  2021-09-17  4:58                   ` Kent Overstreet
  0 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2021-09-17  1:42 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: James Bottomley, Chris Mason, Johannes Weiner, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Thu, Sep 16, 2021 at 04:16:27PM -0400, Kent Overstreet wrote:
> So I think we're still trying to answer the "what exactly is a folio"
> question....

> However, Johannes has been pointing out that it's a real open
> question as to whether anonymous pages should be folios! Willy's
> current code seems to leave things in a somewhat intermediate state
> - some mm/ code treats anonymous pages as folios, but it's not clear
> to me how much....

Kent, you raise some good questions, and good points.  However, it
seems to me that one of the other sources of the disagreement is the
question of whether this question needs to be answered at all before
the Folios patch can get merged.

We could engage in a process such as what Chris Mason has suggested,
with a more formal design doc, with stakeholders who have to review,
comment, and explicitly give their LGTM's.  We do that sort of thing
quite often at Google (and probably at many other companies), so it's
a familiar approach.  That would be a fine way of trying to come to a
formal agreement on that question.

What comes to my mind, though, is the quote, originally made by Linus,
"Linux is evolution, not Intelligent Design".  Greg K-H requoted Linus
in his 2006 Ottawa Linux Symposium keynote[1], “Myths, Lies, and Truths
about the Linux Kernel”, and further claimed, "The kernel is not developed
with big design documents, feature requests and so on."

[1] http://www.kroah.com/log/linux/ols_2006_keynote.html

Of course, that was 15 years ago, and things have gotten a lot more
complex.  And when things get more complex, a certain amount of
agreement ahead of time between developers, memorialized by Design
Docs, does become more and more inevitable.  The source of friction,
then, is how *much* pre-design and consensus is needed in a particular
case.

After all, as you said:

   ".... folios are a start on cutting up the unholy mess that is
   struct page into separate data types. In struct page, we have a big
   nested union of structs, for different types of pages."

So one could argue that folio makes things better.  It's not a 100%
solution, and perhaps it's unfortunate that it leaves things "in a
somewhat intermediate state".  But if it's better than what we
currently have, perhaps we should land this patch set, and if we need
to make further evolutionary changes, is that really such a tragedy?

After all, we've never guaranteed stable APIs (another thing which
Greg foot-stomped in his 2006 keynote).  Maybe after we live with
folios and learn more about the benefits and downsides, we can make
further changes --- evolution, as we might say.

Quoting further from Greg K-H:

    "The Linux USB code has been rewritten at least three times. We've
    done this over time in order to handle things that we didn't
    originally need to handle, like high speed devices, and just
    because we learned the problems of our first design, and to fix
    bugs and security issues. Each time we made changes in our api, we
    updated all of the kernel drivers that used the apis, so nothing
    would break. And we deleted the old functions as they were no
    longer needed, and did things wrong. Because of this, Linux now
    has the fastest USB bus speeds when you test out all of the
    different operating systems....."[1]

(And it's not just the USB subsystem that has been rewritten three
times; our networking stack has been rewritten at least 3 times as
well.)

It seems that part of the frustration is that people seem to agree
that Folios does make things better, and yet they *still* are NACK'ing
the patch series.  The argument for why it should not be merged yet
seems to be that it should be doing *more* --- that it doesn't go far
enough.

The opposing argument would be, "if folios improves things, and
doesn't introduce any bugs, why shouldn't we merge it, reap the
benefits, and then we can further evolve things?"

As Linus said, "Linux is evolution, not intelligent design."

	      	      	  	  	 - Ted

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-17  1:42                 ` Theodore Ts'o
@ 2021-09-17  4:58                   ` Kent Overstreet
  0 siblings, 0 replies; 23+ messages in thread
From: Kent Overstreet @ 2021-09-17  4:58 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: James Bottomley, Chris Mason, Johannes Weiner, Matthew Wilcox,
	Linus Torvalds, linux-mm, linux-fsdevel, linux-kernel,
	Andrew Morton, Darrick J. Wong, Christoph Hellwig, David Howells,
	ksummit

On Thu, Sep 16, 2021 at 09:42:21PM -0400, Theodore Ts'o wrote:
> On Thu, Sep 16, 2021 at 04:16:27PM -0400, Kent Overstreet wrote:
> > So I think we're still trying to answer the "what exactly is a folio"
> > question....
> 
> > However, Johannes has been pointing out that it's a real open
> > question as to whether anonymous pages should be folios! Willy's
> > current code seems to leave things in a somewhat intermediate state
> > - some mm/ code treats anonymous pages as folios, but it's not clear
> > to me how much....
> 
> Kent, you raise some good questions, and good points.  However, it
> seems to me that one of the other sources of the disagreement is the
> question of whether this question needs to be answered at all before
> the Folios patch can get merged.

...

> It seems that part of the frustration is that people seem to agree
> that Folios does make things better, and yet they *still* are NACK'ing
> the patch series.  The argument for why it should not be merged yet
> seems to be that it should be doing *more* --- that it doesn't go far
> enough.

Yeah, I agree 100%, and I've expressed my own frustrations with how the folios
discussions have been going (and I could, and will, express some more of those
frustrations - later).

But, that's water under the bridge. For now, I'm really just trying to drive the
technical discussion. I'm not Andrew or Linus, it's not my say whether folios
get merged, I'm just trying to dig to figure out what the _actual_ technical
points of contention are (and it's taken some real digging...)

And having done so, I think the question of whether or not anonymous pages are
becoming folios actually is extremely cogent - I think there's a lot of meat to
that discussion, and it definitely impacts _squarely_ in MM internals land.

So, let's just try to be more forward looking, try to forget the acrimony, and
get into that discussion, and remember that we'll all be having beers with each
other whenever the fsck LSF actually happens again.

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-16 21:00               ` Konstantin Ryabitsev
@ 2021-09-17 11:14                 ` James Bottomley
  2021-09-17 12:36                   ` Konstantin Ryabitsev
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2021-09-17 11:14 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Chris Mason
  Cc: Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit

On Thu, 2021-09-16 at 17:00 -0400, Konstantin Ryabitsev wrote:
> On Thu, Sep 16, 2021 at 08:38:13PM +0000, Chris Mason wrote:
> > Agree here.  Mailing lists make it really hard to figure out when
> > these conflicts are resolved, which is why I love using google docs
> > for that part.
> 
> I would caution that Google docs aren't universally accessible. China
> blocks access to many Google resources, and now Russia purportedly
> does the same. Perhaps a similar effect can be reached with a git
> repository with limited commit access? At least then commits can be
> attested to individual authors.

In days of old, when knights were bold and cloud silos weren't
invented, we had an ancient magic handed down by the old gods who spoke
non type safe languages.  They called it wiki and etherpad ... could we
make use of such tools today without committing heresy against our
cloud overlords?

James



* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-17 11:14                 ` James Bottomley
@ 2021-09-17 12:36                   ` Konstantin Ryabitsev
  2021-09-17 13:00                     ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ryabitsev @ 2021-09-17 12:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: Chris Mason, Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit

On Fri, Sep 17, 2021 at 07:14:11AM -0400, James Bottomley wrote:
> > I would caution that Google docs aren't universally accessible. China
> > blocks access to many Google resources, and now Russia purportedly
> > does the same. Perhaps a similar effect can be reached with a git
> > repository with limited commit access? At least then commits can be
> > attested to individual authors.
> 
> In days of old, when knights were bold and cloud silos weren't
> invented, we had an ancient magic handed down by the old gods who spoke
> non type safe languages.  They called it wiki and etherpad ... could we
> make use of such tools today without committing heresy against our
> cloud overlords?

You mean, like https://pad.kernel.org ? :)

However, a large part of why I was suggesting a git repo is because it is
automatically redistributable, clonable, and verifiable using builtin git
tools. We have end-to-end attestation with git, but we don't have it with
etherpad or a wiki. If the goal is to use a document that solicits acks and
other input across subsystems, then having a tamper-evident backend may be
important.

-K

* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-17 12:36                   ` Konstantin Ryabitsev
@ 2021-09-17 13:00                     ` James Bottomley
  2021-09-17 14:36                       ` Chris Mason
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2021-09-17 13:00 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Chris Mason, Theodore Ts'o, Johannes Weiner, Kent Overstreet,
	Matthew Wilcox, Linus Torvalds, linux-mm, linux-fsdevel,
	linux-kernel, Andrew Morton, Darrick J. Wong, Christoph Hellwig,
	David Howells, ksummit

On Fri, 2021-09-17 at 08:36 -0400, Konstantin Ryabitsev wrote:
> On Fri, Sep 17, 2021 at 07:14:11AM -0400, James Bottomley wrote:
> > > I would caution that Google docs aren't universally accessible.
> > > China blocks access to many Google resources, and now Russia
> > > purportedly does the same. Perhaps a similar effect can be
> > > reached with a git repository with limited commit access? At
> > > least then commits can be attested to individual authors.
> > 
> > In days of old, when knights were bold and cloud silos weren't
> > invented, we had an ancient magic handed down by the old gods who
> > spoke non type safe languages.  They called it wiki and etherpad
> > ... could we make use of such tools today without committing heresy
> > against our cloud overlords?
> 
> You mean, like https://pad.kernel.org ? :)
> 
> However, a large part of why I was suggesting a git repo is because
> it is automatically redistributable, clonable, and verifiable using
> builtin git tools. We have end-to-end attestation with git, but we
> don't have it with etherpad or a wiki. If the goal is to use a
> document that solicits acks and other input across subsystems, then
> having a tamper-evident backend may be important.

I think the goal is to have a living document that records who should
ack, what the design goals are, who has what current concerns and how
they're being addressed, and what the status of the patch set is.
Actually collecting acks for the patches would be the job of the author
as it is today and verification would be via the public lists.

James



* Re: [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic?
  2021-09-17 13:00                     ` James Bottomley
@ 2021-09-17 14:36                       ` Chris Mason
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Mason @ 2021-09-17 14:36 UTC (permalink / raw)
  To: James Bottomley
  Cc: Konstantin Ryabitsev, Theodore Ts'o, Johannes Weiner,
	Kent Overstreet, Matthew Wilcox, Linus Torvalds, linux-mm,
	linux-fsdevel, linux-kernel, Andrew Morton, Darrick J. Wong,
	Christoph Hellwig, David Howells, ksummit


> On Sep 17, 2021, at 9:00 AM, James Bottomley <James.Bottomley@hansenpartnership.com> wrote:
> 
> On Fri, 2021-09-17 at 08:36 -0400, Konstantin Ryabitsev wrote:
>> On Fri, Sep 17, 2021 at 07:14:11AM -0400, James Bottomley wrote:
>>>> I would caution that Google docs aren't universally accessible.
>>>> China blocks access to many Google resources, and now Russia
>>>> purportedly does the same. Perhaps a similar effect can be
>>>> reached with a git repository with limited commit access? At
>>>> least then commits can be attested to individual authors.
>>> 
>>> In days of old, when knights were bold and cloud silos weren't
>>> invented, we had an ancient magic handed down by the old gods who
>>> spoke non type safe languages.  They called it wiki and etherpad
>>> ... could we make use of such tools today without committing heresy
>>> against our cloud overlords?
>> 
>> You mean, like https://pad.kernel.org ? :)
>> 
>> However, a large part of why I was suggesting a git repo is because
>> it is automatically redistributable, clonable, and verifiable using
>> builtin git tools. We have end-to-end attestation with git, but we
>> don't have it with etherpad or a wiki. If the goal is to use a
>> document that solicits acks and other input across subsystems, then
>> having a tamper-evident backend may be important.
> 
> I think the goal is to have a living document that records who should
> ack, what the design goals are, who has what current concerns and how
> they're being addressed, and what the status of the patch set is.
> Actually collecting acks for the patches would be the job of the author
> as it is today and verification would be via the public lists.

Thanks, Konstantin, for bringing up the issues with Google Docs.  I
assumed different groups of people would store state differently, but
didn't think of this problem.  One nice feature of Google Docs is that
you can mark issues as resolved, etc., but obviously people can simulate
that in other ways with etherpad.

-chris


Thread overview: 23+ messages
2021-09-15 17:42 [MAINTAINER SUMMIT] Folios as a potential Kernel/Maintainers Summit topic? Theodore Ts'o
2021-09-15 18:03 ` James Bottomley
2021-09-15 18:20   ` Theodore Ts'o
2021-09-15 18:41     ` Chris Mason
2021-09-15 19:15       ` James Bottomley
2021-09-15 20:48         ` Theodore Ts'o
2021-09-16 14:55           ` Kent Overstreet
2021-09-16 13:51         ` David Howells
2021-09-16 16:46         ` Chris Mason
2021-09-16 17:11           ` James Bottomley
2021-09-16 19:15             ` Theodore Ts'o
2021-09-16 19:26               ` Andrew Morton
2021-09-16 20:16               ` Kent Overstreet
2021-09-17  1:42                 ` Theodore Ts'o
2021-09-17  4:58                   ` Kent Overstreet
2021-09-16 20:38             ` Chris Mason
2021-09-16 21:00               ` Konstantin Ryabitsev
2021-09-17 11:14                 ` James Bottomley
2021-09-17 12:36                   ` Konstantin Ryabitsev
2021-09-17 13:00                     ` James Bottomley
2021-09-17 14:36                       ` Chris Mason
2021-09-16 17:15           ` Kent Overstreet
2021-09-16 22:27             ` Chris Mason
