linux-fsdevel.vger.kernel.org archive mirror
* Let's get a File & Storage miniconf going at LPC2015!
@ 2015-05-15 20:58 Darrick J. Wong
  2015-05-19 15:42 ` Kent Overstreet
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2015-05-15 20:58 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: linux-scsi, xfs, device-mapper development, Linux FS Devel,
	linux-ext4, linux-btrfs

Hi again,

Early registration for this summer's Plumbers in Seattle, Washington ends
next Friday the 22nd.  With that in mind, I still don't have quite enough
people listed on the File & Storage miniconf wiki page for the organizers
to declare us an official miniconference, so if you're planning to go, or
even think you might go, please take a look at the planning page[1] and
add your name!

At a bare minimum, it seems like we can continue the ongoing discussions
around SMR exploration; the work going on with persistent memory and how to
expose it to filesystems and user apps; I see a proposal for the refereed
talks about the ongoing Open SSD work; and there have been prompts for more
conversation about rich ACL integration and supporting RDMA in NFS and Samba.
I think it would also be a good place to talk more about how to better
accommodate userspace filesystems like Ceph, given things like the recent
O_NOMTIME thread, and other things like copy_file_range().

We of course are not limited to just those topics -- if there's something
you'd really like to discuss with everyone, please add that to the wiki.
I realize that mid-August is family vacation time for many people, but
those are the constraints the conference has to work with this year.
There are fun things to do around Seattle, and that's probably the best
time of year to visit.

--Darrick

[1] http://wiki.linuxplumbersconf.org/2015:file_and_storage_systems

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Let's get a File & Storage miniconf going at LPC2015!
  2015-05-15 20:58 Let's get a File & Storage miniconf going at LPC2015! Darrick J. Wong
@ 2015-05-19 15:42 ` Kent Overstreet
  2015-05-19 20:10   ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2015-05-19 15:42 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Ric Wheeler, Linux FS Devel, linux-scsi,
	device-mapper development, linux-ext4, xfs, linux-btrfs

On Fri, May 15, 2015 at 01:58:25PM -0700, Darrick J. Wong wrote:
> We of course are not limited to just those topics -- if there's something
> you'd really like to discuss with everyone, please add that to the wiki.
> I realize that mid-August is family vacation time for many people, but
> those are the constraints the conference has to work with this year.
> There are fun things to do around Seattle, and that's probably the best
> time of year to visit.

It's getting near time for the big bcachefs announcement :)

Also, stable pages - what's been going on there? Last I heard you were talking
about using the page migration code to do COW, did anything come of that? I just
added data checksumming/compression to bcachefs, so that's been fresh on my
mind.

Also, there's probably always going to be situations where we're reading or
writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
to annotate this - "if you need this to be stable you have to bounce it".
Otherwise either filesystems/block drivers are going to be stuck bouncing
everything, or it'll just (continue to be) buggy.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Let's get a File & Storage miniconf going at LPC2015!
  2015-05-19 15:42 ` Kent Overstreet
@ 2015-05-19 20:10   ` Darrick J. Wong
  2015-05-21  1:04     ` Proposal for annotating _unstable_ pages Kent Overstreet
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2015-05-19 20:10 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Ric Wheeler, linux-scsi, xfs, device-mapper development,
	Linux FS Devel, linux-ext4, linux-btrfs

On Tue, May 19, 2015 at 08:42:00AM -0700, Kent Overstreet wrote:
> On Fri, May 15, 2015 at 01:58:25PM -0700, Darrick J. Wong wrote:
> > We of course are not limited to just those topics -- if there's something
> > you'd really like to discuss with everyone, please add that to the wiki.
> > I realize that mid-August is family vacation time for many people, but
> > those are the constraints the conference has to work with this year.
> > There are fun things to do around Seattle, and that's probably the best
> > time of year to visit.
> 
> It's getting near time for the big bcachefs announcement :)
> 
> Also, stable pages - what's been going on there? Last I heard you were talking
> about using the page migration code to do COW, did anything come of that? I just
> added data checksumming/compression to bcachefs, so that's been fresh on my
> mind.

Yeah.  I never figured out a sane way to migrate pages and keep everything
else happy.  Daniel Phillips is having a go at page forking for tux3; let's
see if the questions about that get resolved.

> Also, there's probably always going to be situations where we're reading or
> writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> to annotate this - "if you need this to be stable you have to bounce it".
> Otherwise either filesystems/block drivers are going to be stuck bouncing
> everything, or it'll just (continue to be) buggy.

Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
but right now ext3 is the last user of it, and afaict btrfs is the only other
FS that takes care of stable pages on its own.
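
Roughly, the contract is this (a toy model, not the actual mm/bounce.c code;
the names mimic the kernel's):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the BIO_SNAP_STABLE contract.  A submitter that can't
 * keep its pages stable (ext3) sets the flag, and the block layer
 * snapshots (bounces) the pages before issuing the IO. */

#define BIO_SNAP_STABLE (1u << 0)

struct bio {
	unsigned int flags;
};

/* Models the decision in blk_queue_bounce(): a flagged bio must be
 * copied to stable memory before it goes down the stack. */
static bool blk_must_bounce(const struct bio *bio)
{
	return (bio->flags & BIO_SNAP_STABLE) != 0;
}
```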

--D

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Proposal for annotating _unstable_ pages
  2015-05-19 20:10   ` Darrick J. Wong
@ 2015-05-21  1:04     ` Kent Overstreet
  2015-05-21 16:54       ` Jan Kara
  0 siblings, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2015-05-21  1:04 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Linux FS Devel, linux-scsi, device-mapper development,
	linux-btrfs, axboe, zab, neilb

On Tue, May 19, 2015 at 01:10:55PM -0700, Darrick J. Wong wrote:
> On Tue, May 19, 2015 at 08:42:00AM -0700, Kent Overstreet wrote:
> > Also, stable pages - what's been going on there? Last I heard you were talking
> > about using the page migration code to do COW, did anything come of that? I just
> > added data checksumming/compression to bcachefs, so that's been fresh on my
> > mind.
> 
> Yeah.  I never figured out a sane way to migrate pages and keep everything
> else happy.  Daniel Phillips is having a go at page forking for tux3; let's
> see if the questions about that get resolved.

That would be great, we need something.

I'd also be really curious what btrfs is doing today - is it just bouncing
everything internally, or did they come up with something more clever?

> > Also, there's probably always going to be situations where we're reading or
> > writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> > to annotate this - "if you need this to be stable you have to bounce it".
> > Otherwise either filesystems/block drivers are going to be stuck bouncing
> > everything, or it'll just (continue to be) buggy.
> 
> Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
> but right now ext3 is the last user of it, and afaict btrfs is the only other
> FS that takes care of stable pages on its own.

I have no idea what BIO_SNAP_STABLE was supposed to be for, but I don't see how
it's useful for anything sane.

I'm _guessing_ it's to get atomic snapshots? But if the upper layer is modifying
the data being written while the write is in flight, the memcpy() for the bounce
still isn't going to be atomic. If the upper layer cares about atomicity, it
needs to not diddle over the memory it's writing while the write is in flight.

But that's the complete opposite of the problem stable pages are supposed to
solve: stable pages are for when the _lower_ layer (be it filesystem, bcache,
md, lvm) needs the memory being either read to or written from (both, it's not
just writes) to not be diddled over while the IO is in flight.

Now, a point that I think has been missed is that stable pages are _not_ a
complete solution, at least for consumers in the block layer.

The situation today is that if I'm in the block layer, and I get handed a read
or write bio, I _don't know_ if it's from something that's going to diddle over
those pages or not. So if I require stable pages - be it for data checksumming
or for other things - I've just got to bounce the bio myself.

And then the really annoying thing is that if you've got stacked things that all
need stable pages (maybe btrfs on top of bcache on top of md) - they _all_ have
to assume the pages aren't going to be stable, so if they need them they _all_
have to bounce - even though once the first layer bounced the bio that made it
stable for everything underneath it.

Stable pages for IO to/from the pagecache are _not_ going to solve this problem,
because the page cache is not the only source of IO to non-stable pages (Direct
IO will always be one, even if everything else gets fixed).

So what I'm proposing is:

 - Add a new bio flag: BIO_PAGES_NOT_STABLE

 - Everything that submits a bio and _doesn't_ guarantee that the pages won't be
   touched while the IO is in flight has to set that flag.  This flag will have
   to be preserved when cloning a bio, but not when cloning a bio and its pages
   (i.e. bouncing it).

This is going to be a lot of not-fun work auditing code, but IMO it really needs
to be done. As a bonus, once it's done everything that generates IO that must be
expensively bounced will be nicely annotated.
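
To pin down the semantics, here's a toy model of the flag's lifecycle; the
struct and helpers are stand-ins, not real block layer code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of the proposed BIO_PAGES_NOT_STABLE semantics. */

#define BIO_PAGES_NOT_STABLE (1u << 0)

struct bio {
	unsigned int	flags;
	unsigned char	*data;	/* stands in for the bio_vec array */
	size_t		len;
};

/* A clone shares the data pages, so the instability follows the clone. */
static struct bio *bio_clone(const struct bio *src)
{
	struct bio *b = malloc(sizeof(*b));
	b->flags = src->flags;		/* flag preserved */
	b->data  = src->data;		/* pages shared, still unstable */
	b->len   = src->len;
	return b;
}

/* Bouncing copies the pages into memory only we own, so the copy is
 * stable and the flag must NOT be carried over. */
static struct bio *bio_bounce(const struct bio *src)
{
	struct bio *b = malloc(sizeof(*b));
	b->flags = src->flags & ~BIO_PAGES_NOT_STABLE;
	b->data  = malloc(src->len);
	memcpy(b->data, src->data, src->len);
	b->len   = src->len;
	return b;
}

/* A layer that needs stable pages (for checksumming, say) only pays
 * for the bounce when the submitter couldn't guarantee stability. */
static struct bio *make_stable(struct bio *b)
{
	if (b->flags & BIO_PAGES_NOT_STABLE)
		return bio_bounce(b);
	return b;
}
```

The point of the last helper is that once any layer bounces, everything
below it gets stable pages for free instead of bouncing again.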

To verify that the annotations are correct, for writes we can add some debug
code to the generic IO path that checksums the data before and after the IO and
complains loudly if the checksums don't match. Dunno what we can do for reads.
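
A toy sketch of that write-side check (the checksum and hook names are made
up, not existing kernel interfaces):

```c
#include <assert.h>
#include <stddef.h>

/* Toy sketch: checksum the payload at submission and again at
 * completion; a mismatch means someone wrote to the pages while the
 * IO was in flight. */

static unsigned long toy_csum(const unsigned char *p, size_t len)
{
	unsigned long sum = 0;

	while (len--)
		sum = sum * 31 + *p++;	/* any cheap checksum will do */
	return sum;
}

struct dbg_io {
	const unsigned char	*data;
	size_t			len;
	unsigned long		submit_csum;
};

static void dbg_submit(struct dbg_io *io)
{
	io->submit_csum = toy_csum(io->data, io->len);
}

/* Returns 0 if the pages stayed stable, -1 if they were modified. */
static int dbg_complete(const struct dbg_io *io)
{
	return toy_csum(io->data, io->len) == io->submit_csum ? 0 : -1;
}
```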

Thoughts?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Proposal for annotating _unstable_ pages
  2015-05-21  1:04     ` Proposal for annotating _unstable_ pages Kent Overstreet
@ 2015-05-21 16:54       ` Jan Kara
  2015-05-21 18:09         ` Kent Overstreet
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2015-05-21 16:54 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Darrick J. Wong, Linux FS Devel, linux-scsi,
	device-mapper development, linux-btrfs, axboe, zab, neilb

On Wed 20-05-15 18:04:40, Kent Overstreet wrote:
> > Yeah.  I never figured out a sane way to migrate pages and keep everything
> > else happy.  Daniel Phillips is having a go at page forking for tux3; let's
> > see if the questions about that get resolved.
> 
> That would be great, we need something.
> 
> I'd also be really curious what btrfs is doing today - is it just bouncing
> everything internally, or did they come up with something more clever?

Btrfs is just waiting for IO to complete.

> > > Also, there's probably always going to be situations where we're reading or
> > > writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> > > to annotate this - "if you need this to be stable you have to bounce it".
> > > Otherwise either filesystems/block drivers are going to be stuck bouncing
> > > everything, or it'll just (continue to be) buggy.
> > 
> > Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
> > but right now ext3 is the last user of it, and afaict btrfs is the only other
> > FS that takes care of stable pages on its own.
> 
> I have no idea what BIO_SNAP_STABLE was supposed to be for, but I don't see how
> it's useful for anything sane.

It's for the case where the lower layer requests stable pages but the
upper layer isn't able to provide them (as is the case for ext3). The block
layer then bounces the data for the caller.

> But that's the complete opposite of the problem stable pages are supposed to
> solve: stable pages are for when the _lower_ layer (be it filesystem, bcache,
> md, lvm) needs the memory being either read to or written from (both, it's not
> just writes) to not be diddled over while the IO is in flight.
> 
> Now, a point that I think has been missed is that stable pages are _not_ a
> complete solution, at least for consumers in the block layer.
> 
> The situation today is that if I'm in the block layer, and I get a handed a read
> or write bio, I _don't know_ if it's from something that's going to diddle over
> those pages or not. So if I require stable pages - be it for data checksumming
> or for other things - I've just got to bounce the bio myself.
> 
> And then the really annoying thing is that if you've got stacked things that all
> need stable pages (maybe btrfs on top of bcache on top of md) - they _all_ have
> to assume the pages aren't going to be stable, so if they need them they _all_
> have to bounce - even though once the first layer bounced the bio that made it
> stable for everything underneath it.

The current design is that if you need stable pages for your device, set
bdi capability BDI_CAP_STABLE_WRITES, fs then takes care of not scribbling
over your page while it is under writeback or uses BIO_SNAP_STABLE if it
cannot.
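
In code the idea is roughly this (a toy model; in the kernel the helpers are
bdi_cap_stable_pages_required() and wait_for_stable_page(), and the wait is
on actual writeback completion):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the BDI_CAP_STABLE_WRITES handshake.  Field and
 * function names mimic the kernel's; this is not kernel code. */

struct backing_dev {
	bool stable_writes_required;	/* device set BDI_CAP_STABLE_WRITES */
};

struct page_state {
	bool under_writeback;
	int  writes_while_in_flight;	/* debug counter for illustration */
};

/* Models wait_for_stable_page(): on a stable-writes device the writer
 * blocks until writeback completes; here we just flip the state to
 * "writeback done". */
static void wait_for_stable_page(struct backing_dev *bdi,
				 struct page_state *pg)
{
	if (bdi->stable_writes_required)
		pg->under_writeback = false;	/* "wait" for IO to finish */
}

static void modify_page(struct backing_dev *bdi, struct page_state *pg)
{
	wait_for_stable_page(bdi, pg);
	if (pg->under_writeback)
		pg->writes_while_in_flight++;	/* scribbled on in-flight IO */
}
```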

There is no reason why this shouldn't work with device stacking (although
I'm not sure it really works currently). You are right that this won't
solve the possible issues with direct IO where user scribbles over the
buffers while direct IO is in flight. We could make direct IO submit pages
with BIO_SNAP_STABLE when the underlying device declares they are required,
but I assume some users would rather promise they don't touch them than pay
the cost of the copy...

I have to say I don't quite see the advantage of your proposal over this...

> Stable pages for IO to/from the pagecache are _not_ going to solve this problem,
> because the page cache is not the only source of IO to non stable pages (Direct
> IO will always be, even if everything else gets fixed).
> 
> So what I'm proposing is:
> 
>  - Add a new bio flag: BIO_PAGES_NOT_STABLE
> 
>  - Everything that submits a bio and _doesn't_ guarantee that the pages won't be
>    touched while the IO is in flight has to set that flag.  This flag will have
>    to be preserved when cloning a bio, but not when cloning a bio and its pages
>    (i.e. bouncing it).
> 
> This is going to be a lot of not-fun work auditing code, but IMO it really needs
> to be done. As a bonus, once it's done everything that generates IO that must be
> expensively bounced will be nicely annotated.
> 
> To verify that the annotations are correct, for writes we can add some debug
> code to the generic IO path that checksums the data before and after the IO and
> complains loudly if the checksums don't match. Dunno what we can do for reads.
> 
> Thoughts?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Proposal for annotating _unstable_ pages
  2015-05-21 16:54       ` Jan Kara
@ 2015-05-21 18:09         ` Kent Overstreet
  2015-05-21 19:21           ` Jan Kara
  0 siblings, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2015-05-21 18:09 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, Linux FS Devel, linux-scsi,
	device-mapper development, linux-btrfs, axboe, zab, neilb

On Thu, May 21, 2015 at 06:54:53PM +0200, Jan Kara wrote:
> On Wed 20-05-15 18:04:40, Kent Overstreet wrote:
> > > Yeah.  I never figured out a sane way to migrate pages and keep everything
> > > else happy.  Daniel Phillips is having a go at page forking for tux3; let's
> > > see if the questions about that get resolved.
> > 
> > That would be great, we need something.
> > 
> > I'd also be really curious what btrfs is doing today - is it just bouncing
> > everything internally, or did they come up with something more clever?
> 
> Btrfs is just waiting for IO to complete.
> 
> > > > Also, there's probably always going to be situations where we're reading or
> > > > writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> > > > to annotate this - "if you need this to be stable you have to bounce it".
> > > > Otherwise either filesystems/block drivers are going to be stuck bouncing
> > > > everything, or it'll just (continue to be) buggy.
> > > 
> > > Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
> > > but right now ext3 is the last user of it, and afaict btrfs is the only other
> > > FS that takes care of stable pages on its own.
> > 
> > I have no idea what BIO_SNAP_STABLE was supposed to be for, but I don't see how
> > it's useful for anything sane.
> 
> It's for the case where lower layer requests it needs stable pages but
> upper layer isn't able to provide them (as is the case of ext3). Then block
> layer bounces the data for the caller.
> 
> > But that's the complete opposite of the problem stable pages are supposed to
> > solve: stable pages are for when the _lower_ layer (be it filesystem, bcache,
> > md, lvm) needs the memory being either read to or written from (both, it's not
> > just writes) to not be diddled over while the IO is in flight.
> > 
> > Now, a point that I think has been missed is that stable pages are _not_ a
> > complete solution, at least for consumers in the block layer.
> > 
> > The situation today is that if I'm in the block layer, and I get a handed a read
> > or write bio, I _don't know_ if it's from something that's going to diddle over
> > those pages or not. So if I require stable pages - be it for data checksumming
> > or for other things - I've just got to bounce the bio myself.
> > 
> > And then the really annoying thing is that if you've got stacked things that all
> > need stable pages (maybe btrfs on top of bcache on top of md) - they _all_ have
> > to assume the pages aren't going to be stable, so if they need them they _all_
> > have to bounce - even though once the first layer bounced the bio that made it
> > stable for everything underneath it.
> 
> The current design is that if you need stable pages for your device, set
> bdi capability BDI_CAP_STABLE_WRITES, fs then takes care of not scribbling
> over your page while it is under writeback or uses BIO_SNAP_STABLE if it
> cannot.

But if I need stable pages, I still have to bounce because that _does not_
guarantee stable pages, it only gives me stable pages for some of the IOs and in
the lower layers you can't tell which is which.

Do you see the problem? What good is BDI_CAP_STABLE_WRITES if it's not a
guarantee and I can't tell if I need to bounce or not?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Proposal for annotating _unstable_ pages
  2015-05-21 18:09         ` Kent Overstreet
@ 2015-05-21 19:21           ` Jan Kara
  2015-05-22 18:17             ` [dm-devel] " Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Kara @ 2015-05-21 19:21 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: Jan Kara, Darrick J. Wong, Linux FS Devel, linux-scsi,
	device-mapper development, linux-btrfs, axboe, zab, neilb

On Thu 21-05-15 11:09:55, Kent Overstreet wrote:
> On Thu, May 21, 2015 at 06:54:53PM +0200, Jan Kara wrote:
> > On Wed 20-05-15 18:04:40, Kent Overstreet wrote:
> > > > Yeah.  I never figured out a sane way to migrate pages and keep everything
> > > > else happy.  Daniel Phillips is having a go at page forking for tux3; let's
> > > > see if the questions about that get resolved.
> > > 
> > > That would be great, we need something.
> > > 
> > > I'd also be really curious what btrfs is doing today - is it just bouncing
> > > everything internally, or did they come up with something more clever?
> > 
> > Btrfs is just waiting for IO to complete.
> > 
> > > > > Also, there's probably always going to be situations where we're reading or
> > > > > writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> > > > > to annotate this - "if you need this to be stable you have to bounce it".
> > > > > Otherwise either filesystems/block drivers are going to be stuck bouncing
> > > > > everything, or it'll just (continue to be) buggy.
> > > > 
> > > > Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
> > > > but right now ext3 is the last user of it, and afaict btrfs is the only other
> > > > FS that takes care of stable pages on its own.
> > > 
> > > I have no idea what BIO_SNAP_STABLE was supposed to be for, but I don't see how
> > > it's useful for anything sane.
> > 
> > It's for the case where lower layer requests it needs stable pages but
> > upper layer isn't able to provide them (as is the case of ext3). Then block
> > layer bounces the data for the caller.
> > 
> > > But that's the complete opposite of the problem stable pages are supposed to
> > > solve: stable pages are for when the _lower_ layer (be it filesystem, bcache,
> > > md, lvm) needs the memory being either read to or written from (both, it's not
> > > just writes) to not be diddled over while the IO is in flight.
> > > 
> > > Now, a point that I think has been missed is that stable pages are _not_ a
> > > complete solution, at least for consumers in the block layer.
> > > 
> > > The situation today is that if I'm in the block layer, and I get a handed a read
> > > or write bio, I _don't know_ if it's from something that's going to diddle over
> > > those pages or not. So if I require stable pages - be it for data checksumming
> > > or for other things - I've just got to bounce the bio myself.
> > > 
> > > And then the really annoying thing is that if you've got stacked things that all
> > > need stable pages (maybe btrfs on top of bcache on top of md) - they _all_ have
> > > to assume the pages aren't going to be stable, so if they need them they _all_
> > > have to bounce - even though once the first layer bounced the bio that made it
> > > stable for everything underneath it.
> > 
> > The current design is that if you need stable pages for your device, set
> > bdi capability BDI_CAP_STABLE_WRITES, fs then takes care of not scribbling
> > over your page while it is under writeback or uses BIO_SNAP_STABLE if it
> > cannot.
> 
> But if I need stable pages, I still have to bounce because that _does not_
> guarantee stable pages, it only gives me stable pages for some of the IOs and in
> the lower layers you can't tell which is which.
> 
> Do you see the problem? What good is BDI_CAP_STABLE_WRITES if it's not a
> guarantee and I can't tell if I need to bounce or not?
  So fix the upper layers to make it a guarantee? You mentioned direct IO
needs fixing. Anything else?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] Proposal for annotating _unstable_ pages
  2015-05-21 19:21           ` Jan Kara
@ 2015-05-22 18:17             ` Darrick J. Wong
  2015-05-22 18:33               ` Kent Overstreet
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2015-05-22 18:17 UTC (permalink / raw)
  To: device-mapper development
  Cc: Kent Overstreet, Jan Kara, linux-scsi, axboe, Linux FS Devel,
	zab, linux-btrfs

On Thu, May 21, 2015 at 09:21:12PM +0200, Jan Kara wrote:
> On Thu 21-05-15 11:09:55, Kent Overstreet wrote:
> > On Thu, May 21, 2015 at 06:54:53PM +0200, Jan Kara wrote:
> > > On Wed 20-05-15 18:04:40, Kent Overstreet wrote:
> > > > > Yeah.  I never figured out a sane way to migrate pages and keep everything
> > > > > else happy.  Daniel Phillips is having a go at page forking for tux3; let's
> > > > > see if the questions about that get resolved.
> > > > 
> > > > That would be great, we need something.
> > > > 
> > > > I'd also be really curious what btrfs is doing today - is it just bouncing
> > > > everything internally, or did they come up with something more clever?
> > > 
> > > Btrfs is just waiting for IO to complete.
> > > 
> > > > > > Also, there's probably always going to be situations where we're reading or
> > > > > > writing to pages user space can stomp on (dio) - IMO we need to add a bio flag
> > > > > > to annotate this - "if you need this to be stable you have to bounce it".
> > > > > > Otherwise either filesystems/block drivers are going to be stuck bouncing
> > > > > > everything, or it'll just (continue to be) buggy.
> > > > > 
> > > > > Well, for now there's BIO_SNAP_STABLE that forces the block layer to bounce it,
> > > > > but right now ext3 is the last user of it, and afaict btrfs is the only other
> > > > > FS that takes care of stable pages on its own.
> > > > 
> > > > I have no idea what BIO_SNAP_STABLE was supposed to be for, but I don't see how
> > > > it's useful for anything sane.
> > > 
> > > It's for the case where lower layer requests it needs stable pages but
> > > upper layer isn't able to provide them (as is the case of ext3). Then block
> > > layer bounces the data for the caller.
> > > 
> > > > But that's the complete opposite of the problem stable pages are supposed to
> > > > solve: stable pages are for when the _lower_ layer (be it filesystem, bcache,
> > > > md, lvm) needs the memory being either read to or written from (both, it's not
> > > > just writes) to not be diddled over while the IO is in flight.
> > > > 
> > > > Now, a point that I think has been missed is that stable pages are _not_ a
> > > > complete solution, at least for consumers in the block layer.
> > > > 
> > > > The situation today is that if I'm in the block layer, and I get a handed a read
> > > > or write bio, I _don't know_ if it's from something that's going to diddle over
> > > > those pages or not. So if I require stable pages - be it for data checksumming
> > > > or for other things - I've just got to bounce the bio myself.
> > > > 
> > > > And then the really annoying thing is that if you've got stacked things that all
> > > > need stable pages (maybe btrfs on top of bcache on top of md) - they _all_ have
> > > > to assume the pages aren't going to be stable, so if they need them they _all_
> > > > have to bounce - even though once the first layer bounced the bio that made it
> > > > stable for everything underneath it.
> > > 
> > > The current design is that if you need stable pages for your device, set
> > > bdi capability BDI_CAP_STABLE_WRITES, fs then takes care of not scribbling
> > > over your page while it is under writeback or uses BIO_SNAP_STABLE if it
> > > cannot.
> > 
> > But if I need stable pages, I still have to bounce because that _does not_
> > guarantee stable pages, it only gives me stable pages for some of the IOs and in
> > the lower layers you can't tell which is which.
> > 
> > Do you see the problem? What good is BDI_CAP_STABLE_WRITES if it's not a
> > guarantee and I can't tell if I need to bounce or not?
>   So fix the upper layers to make it a guarantee? You mentioned direct IO
> needs fixing. Anything else?

Back when I was writing the stable pages patches, I observed that some of the
filesystems didn't hold the pages containing their own metadata stable during
writeback on a stable-writes device.  The journalling filesystems were fine
because they had various means to take care of that.

ISTR ext2 and vfat were the biggest culprits, but both maintainers rejected
the patches to fix that behavior.  This might no longer be the case; those
patches were so long ago I can't find them in Google.

--D

> 
> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [dm-devel] Proposal for annotating _unstable_ pages
  2015-05-22 18:17             ` [dm-devel] " Darrick J. Wong
@ 2015-05-22 18:33               ` Kent Overstreet
  0 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2015-05-22 18:33 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: device-mapper development, Jan Kara, linux-scsi, axboe,
	Linux FS Devel, zab, linux-btrfs

On Fri, May 22, 2015 at 11:17:59AM -0700, Darrick J. Wong wrote:
> Back when I was writing the stable pages patches, I observed that some of the
> filesystems didn't hold the pages containing their own metadata stable during
> writeback on a stable-writes device.  The journalling filesystems were fine
> because they had various means to take care of that.
> 
> ISTR ext2 and vfat were the biggest culprits, but both maintainers rejected
> the patches to fix that behavior.  This might no longer be the case; those
> patches were so long ago I can't find them in Google.

Not at all surprised. Yeah, this would solve that problem - we just annotate
those bios so we don't bounce them until we hit a point where we need to.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-05-22 18:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-15 20:58 Let's get a File & Storage miniconf going at LPC2015! Darrick J. Wong
2015-05-19 15:42 ` Kent Overstreet
2015-05-19 20:10   ` Darrick J. Wong
2015-05-21  1:04     ` Proposal for annotating _unstable_ pages Kent Overstreet
2015-05-21 16:54       ` Jan Kara
2015-05-21 18:09         ` Kent Overstreet
2015-05-21 19:21           ` Jan Kara
2015-05-22 18:17             ` [dm-devel] " Darrick J. Wong
2015-05-22 18:33               ` Kent Overstreet
