All of lore.kernel.org
 help / color / mirror / Atom feed
* booked-page-flag.patch
@ 2007-02-15  8:45 Andrew Morton
  2007-02-15 14:03 ` booked-page-flag.patch Alex Tomas
  2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2007-02-15  8:45 UTC (permalink / raw)
  To: Theodore Ts'o, Alex Tomas; +Cc: linux-ext4


Sorry, we're seriously, seriously, seriously short on flags in the page
struct and this patch is going to need one heck of a good case for it to be
acceptable.

Even then, we should put a lot of effort into finding some way of avoiding
adding that page flag.  One option might be to add a new radix-tree tag,
and defining it as "for filesytem usage".  Or use PG_checked (which should
be renamed to to PG_fs_misc) (if that doesn't conflict with ext4's existing
use of PG_checked).  Or use !PageMappedToDisk()?

These patches seem to have a number of issues - we should get them properly
commented and properly changelogged then get them on the wire for decent
review before investing too much in them, please.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15  8:45 booked-page-flag.patch Andrew Morton
@ 2007-02-15 14:03 ` Alex Tomas
  2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
  1 sibling, 0 replies; 16+ messages in thread
From: Alex Tomas @ 2007-02-15 14:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Ts'o, Alex Tomas, linux-ext4

>>>>> Andrew Morton (AM) writes:

 AM> Sorry, we're seriously, seriously, seriously short on flags in the page
 AM> struct and this patch is going to need one heck of a good case for it to be
 AM> acceptable.

 AM> Even then, we should put a lot of effort into finding some way of avoiding
 AM> adding that page flag.  One option might be to add a new radix-tree tag,
 AM> and defining it as "for filesytem usage".  Or use PG_checked (which should
 AM> be renamed to to PG_fs_misc) (if that doesn't conflict with ext4's existing
 AM> use of PG_checked).  Or use !PageMappedToDisk()?

there is a difference between being mapped and "booked".
the latter means that page isn't allocated yet, but
space is reserved (including metadata) and we're sure
we'll be able to allocate space when the page is being
flushed.

 AM> These patches seem to have a number of issues - we should get them properly
 AM> commented and properly changelogged then get them on the wire for decent
 AM> review before investing too much in them, please.

I've been reworking the delayed allocation patch to move
part of it into VFS. I'll try to comment things better
this time.

thanks, Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15  8:45 booked-page-flag.patch Andrew Morton
  2007-02-15 14:03 ` booked-page-flag.patch Alex Tomas
@ 2007-02-15 17:18 ` Eric Sandeen
  2007-02-15 17:30   ` booked-page-flag.patch Alex Tomas
                     ` (2 more replies)
  1 sibling, 3 replies; 16+ messages in thread
From: Eric Sandeen @ 2007-02-15 17:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Ts'o, Alex Tomas, linux-ext4

Andrew Morton wrote:
> Sorry, we're seriously, seriously, seriously short on flags in the page
> struct and this patch is going to need one heck of a good case for it to be
> acceptable.

This was for the delayed allocation patchset, right; and by managing 
this at the page level that means we can't do this for block size < page 
size, I think...

There are already buffer head flags for delalloc (block to be allocated 
  on flush) and unwritten (actual block allocated to a file but not yet 
written) in the vfs - shouldn't we be looking at using those?

Unless there is a clear path from this patchset to one that supports 
blocks < page, I'm hesitant about it... but I know I'm being a bit of an 
armchair quarterback here, and I'll try to set aside some time to give 
this a better look.

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
@ 2007-02-15 17:30   ` Alex Tomas
  2007-02-15 20:56     ` booked-page-flag.patch Andrew Morton
  2007-02-15 20:21   ` booked-page-flag.patch Andrew Morton
  2007-02-16 12:17   ` booked-page-flag.patch Andreas Dilger
  2 siblings, 1 reply; 16+ messages in thread
From: Alex Tomas @ 2007-02-15 17:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Andrew Morton, Theodore Ts'o, Alex Tomas, linux-ext4

>>>>> Eric Sandeen (ES) writes:

 ES> Andrew Morton wrote:
 >> Sorry, we're seriously, seriously, seriously short on flags in the page
 >> struct and this patch is going to need one heck of a good case for it to be
 >> acceptable.

 ES> This was for the delayed allocation patchset, right; and by managing
 ES> this at the page level that means we can't do this for block size <
 ES> page size, I think...

I still think that having PG_booked and special code to handle
case when blocksize==pagesize is worthwhile -- we save 56 bytes
on 32bit and 104 bytes on 64bit for every written page.

thanks, Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
  2007-02-15 17:30   ` booked-page-flag.patch Alex Tomas
@ 2007-02-15 20:21   ` Andrew Morton
  2007-02-16 12:17   ` booked-page-flag.patch Andreas Dilger
  2 siblings, 0 replies; 16+ messages in thread
From: Andrew Morton @ 2007-02-15 20:21 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Ts'o, Alex Tomas, linux-ext4

On Thu, 15 Feb 2007 09:18:34 -0800 Eric Sandeen <sandeen@redhat.com> wrote:

> Andrew Morton wrote:
> > Sorry, we're seriously, seriously, seriously short on flags in the page
> > struct and this patch is going to need one heck of a good case for it to be
> > acceptable.
> 
> This was for the delayed allocation patchset, right; and by managing 
> this at the page level that means we can't do this for block size < page 
> size, I think...

If the page is partially mapped to disk then we can still reserve an entire
page's worth of blocks, as long as we unreserve an entire page's worth
once the page gets fully mapped to disk.

> There are already buffer head flags for delalloc (block to be allocated 
>   on flush) and unwritten (actual block allocated to a file but not yet 
> written) in the vfs - shouldn't we be looking at using those?

We could, but we have this no-buffer-head mode which really should be the
preferred method, for 4k pagesize systems, at least.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 17:30   ` booked-page-flag.patch Alex Tomas
@ 2007-02-15 20:56     ` Andrew Morton
  2007-02-15 21:07       ` booked-page-flag.patch Alex Tomas
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2007-02-15 20:56 UTC (permalink / raw)
  To: Alex Tomas; +Cc: Eric Sandeen, Theodore Ts'o, linux-ext4

On Thu, 15 Feb 2007 20:30:43 +0300 Alex Tomas <alex@clusterfs.com> wrote:

> >>>>> Eric Sandeen (ES) writes:
> 
>  ES> Andrew Morton wrote:
>  >> Sorry, we're seriously, seriously, seriously short on flags in the page
>  >> struct and this patch is going to need one heck of a good case for it to be
>  >> acceptable.
> 
>  ES> This was for the delayed allocation patchset, right; and by managing
>  ES> this at the page level that means we can't do this for block size <
>  ES> page size, I think...
> 
> I still think that having PG_booked and special code to handle
> case when blocksize==pagesize is worthwhile -- we save 56 bytes
> on 32bit and 104 bytes on 64bit for every written page.
> 

If the page doesn't have buffer-heads, set PG_private and clear page->private

If the page does have buffer_heads, use BH_Delay.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 20:56     ` booked-page-flag.patch Andrew Morton
@ 2007-02-15 21:07       ` Alex Tomas
  2007-02-15 23:23         ` booked-page-flag.patch Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Tomas @ 2007-02-15 21:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, Theodore Ts'o, linux-ext4

>>>>> Andrew Morton (AM) writes:

 AM> If the page doesn't have buffer-heads, set PG_private and clear page->private

 AM> If the page does have buffer_heads, use BH_Delay.

I did exactly this way in the first version, but later
got feeling that the community'd discard "ugly hack".

one more question: how much of it you want in VFS?

->get_block(with BH_Delay) can be used to signal
filesystem that no actual allocation is required.
so, aware filesystem can just reserve space. then
->writepages() should walk through the pages like
it does currently, collect contiugous sequences
and again call ->get_block(w/o BH_Delay) with b_size
covering all contiguous pages ...

thanks, Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 21:07       ` booked-page-flag.patch Alex Tomas
@ 2007-02-15 23:23         ` Andrew Morton
  2007-02-16  7:30           ` booked-page-flag.patch Alex Tomas
  2007-02-16 15:03           ` booked-page-flag.patch Theodore Tso
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2007-02-15 23:23 UTC (permalink / raw)
  To: Alex Tomas; +Cc: Eric Sandeen, Theodore Ts'o, linux-ext4

On Fri, 16 Feb 2007 00:07:55 +0300
Alex Tomas <alex@clusterfs.com> wrote:

> >>>>> Andrew Morton (AM) writes:
> 
>  AM> If the page doesn't have buffer-heads, set PG_private and clear page->private
> 
>  AM> If the page does have buffer_heads, use BH_Delay.
> 
> I did exactly this way in the first version, but later
> got feeling that the community'd discard "ugly hack".

These things can be wrapped in nice helper functions in header files.  that
allows us to easily reimplement if a better idea comes along.

And we already do all sorts of horrid things to save space in the page
struct.

> one more question: how much of it you want in VFS?

As much as possible, really.  It's probably too late for XFS to reuse any
new infrastructure, but if the infrastructure is well-designed it _should_
be possible to us it in new filesytems, and to convert ext2.

> ->get_block(with BH_Delay) can be used to signal
> filesystem that no actual allocation is required.
> so, aware filesystem can just reserve space. then
> ->writepages() should walk through the pages like
> it does currently, collect contiugous sequences
> and again call ->get_block(w/o BH_Delay) with b_size
> covering all contiguous pages ...
> 

That sounds like it'd work, yes.

Except for an address_space which is using delayed allocation, its
->prepare_write wouldn't call get_block at all, so perhaps that isn't
needed.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 23:23         ` booked-page-flag.patch Andrew Morton
@ 2007-02-16  7:30           ` Alex Tomas
  2007-02-16  7:46             ` booked-page-flag.patch Andrew Morton
  2007-02-16 15:03           ` booked-page-flag.patch Theodore Tso
  1 sibling, 1 reply; 16+ messages in thread
From: Alex Tomas @ 2007-02-16  7:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, Theodore Ts'o, linux-ext4

>>>>> Andrew Morton (AM) writes:

 -> get_block(with BH_Delay) can be used to signal
 >> filesystem that no actual allocation is required.
 >> so, aware filesystem can just reserve space. then
 -> writepages() should walk through the pages like
 >> it does currently, collect contiugous sequences
 >> and again call ->get_block(w/o BH_Delay) with b_size
 >> covering all contiguous pages ...
 >> 

 AM> That sounds like it'd work, yes.

 AM> Except for an address_space which is using delayed allocation, its
 -> prepare_write wouldn't call get_block at all, so perhaps that isn't
 AM> needed.

hmm. I thought it has to call get_block() at least to know whether
the block is already allocated. and I was going to reserve space
in prepare_write for which some fs-specific method is needed becase
only fs knows how much metadata it'd need.

thanks, Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-16  7:30           ` booked-page-flag.patch Alex Tomas
@ 2007-02-16  7:46             ` Andrew Morton
  2007-02-16  7:56               ` booked-page-flag.patch Alex Tomas
  2007-02-16 16:02               ` booked-page-flag.patch Dave Kleikamp
  0 siblings, 2 replies; 16+ messages in thread
From: Andrew Morton @ 2007-02-16  7:46 UTC (permalink / raw)
  To: Alex Tomas; +Cc: Eric Sandeen, Theodore Ts'o, linux-ext4

On Fri, 16 Feb 2007 10:30:39 +0300 Alex Tomas <alex@clusterfs.com> wrote:

> >>>>> Andrew Morton (AM) writes:
> 
>  -> get_block(with BH_Delay) can be used to signal
>  >> filesystem that no actual allocation is required.
>  >> so, aware filesystem can just reserve space. then
>  -> writepages() should walk through the pages like
>  >> it does currently, collect contiugous sequences
>  >> and again call ->get_block(w/o BH_Delay) with b_size
>  >> covering all contiguous pages ...
>  >> 
> 
>  AM> That sounds like it'd work, yes.
> 
>  AM> Except for an address_space which is using delayed allocation, its
>  -> prepare_write wouldn't call get_block at all, so perhaps that isn't
>  AM> needed.
> 
> hmm. I thought it has to call get_block() at least to know whether
> the block is already allocated. and I was going to reserve space
> in prepare_write for which some fs-specific method is needed becase
> only fs knows how much metadata it'd need.

Well, one could just assume that the page has no disk mapping and go and
make the space reservation.  Things will work out OK when we come to do
writepage().

Or one could do both: call get_block() only if the page was inside i_size.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-16  7:46             ` booked-page-flag.patch Andrew Morton
@ 2007-02-16  7:56               ` Alex Tomas
  2007-02-16  8:06                 ` booked-page-flag.patch Andrew Morton
  2007-02-16 16:02               ` booked-page-flag.patch Dave Kleikamp
  1 sibling, 1 reply; 16+ messages in thread
From: Alex Tomas @ 2007-02-16  7:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, Theodore Ts'o, linux-ext4

>>>>> Andrew Morton (AM) writes:

 AM> Well, one could just assume that the page has no disk mapping and go and
 AM> make the space reservation.  Things will work out OK when we come to do
 AM> writepage().

 AM> Or one could do both: call get_block() only if the page was inside i_size.

well, even so we need to reserve not that block ony, but
also needed metadata (for the worst case). probably this
is work for get_block or some different method? anyway,
we have to call it if the page is being written partial. 

thanks, Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-16  7:56               ` booked-page-flag.patch Alex Tomas
@ 2007-02-16  8:06                 ` Andrew Morton
  2007-02-16 12:14                   ` booked-page-flag.patch Andreas Dilger
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2007-02-16  8:06 UTC (permalink / raw)
  To: Alex Tomas; +Cc: Eric Sandeen, Theodore Ts'o, linux-ext4

On Fri, 16 Feb 2007 10:56:55 +0300 Alex Tomas <alex@clusterfs.com> wrote:

> >>>>> Andrew Morton (AM) writes:
> 
>  AM> Well, one could just assume that the page has no disk mapping and go and
>  AM> make the space reservation.  Things will work out OK when we come to do
>  AM> writepage().
> 
>  AM> Or one could do both: call get_block() only if the page was inside i_size.
> 
> well, even so we need to reserve not that block ony, but
> also needed metadata (for the worst case).

Right.  Reserve all the blocks for the page and all the metadata for a page
at that file offset.  Worst case.

> probably this
> is work for get_block or some different method?

umm, when I did this I think I added a new ->reservepage address_space op
and called that from within the VFS's delalloc_prepare_write().

> anyway,
> we have to call it if the page is being written partial. 

Not necessarily.  If we're operating in nobh mode that page is brought
uptodate in prepare_write().

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-16  8:06                 ` booked-page-flag.patch Andrew Morton
@ 2007-02-16 12:14                   ` Andreas Dilger
  0 siblings, 0 replies; 16+ messages in thread
From: Andreas Dilger @ 2007-02-16 12:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, Theodore Ts'o, linux-ext4

On Feb 16, 2007  00:06 -0800, Andrew Morton wrote:
> On Fri, 16 Feb 2007 10:56:55 +0300 Alex Tomas <alex@clusterfs.com> wrote:
> > well, even so we need to reserve not that block ony, but
> > also needed metadata (for the worst case).
> 
> Right.  Reserve all the blocks for the page and all the metadata for a page
> at that file offset.  Worst case.

This only really becomes an issue when the filesystem (+reservations)
is close to being full.  If we are close enough to worry about running
out of space we can also refuse to do delayed allocation.
 
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
  2007-02-15 17:30   ` booked-page-flag.patch Alex Tomas
  2007-02-15 20:21   ` booked-page-flag.patch Andrew Morton
@ 2007-02-16 12:17   ` Andreas Dilger
  2 siblings, 0 replies; 16+ messages in thread
From: Andreas Dilger @ 2007-02-16 12:17 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Andrew Morton, Theodore Ts'o, Alex Tomas, linux-ext4

On Feb 15, 2007  09:18 -0800, Eric Sandeen wrote:
> This was for the delayed allocation patchset, right; and by managing 
> this at the page level that means we can't do this for block size < page 
> size, I think...

We could always disable delayed allocation in the PAGE_SIZE != blocksize
case.  It would be no worse than what we have now, and for IO performance
you need larger blocksize anyways.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-15 23:23         ` booked-page-flag.patch Andrew Morton
  2007-02-16  7:30           ` booked-page-flag.patch Alex Tomas
@ 2007-02-16 15:03           ` Theodore Tso
  1 sibling, 0 replies; 16+ messages in thread
From: Theodore Tso @ 2007-02-16 15:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, linux-ext4

On Thu, Feb 15, 2007 at 03:23:52PM -0800, Andrew Morton wrote:
> > one more question: how much of it you want in VFS?
> 
> As much as possible, really.  It's probably too late for XFS to reuse any
> new infrastructure, but if the infrastructure is well-designed it _should_
> be possible to us it in new filesytems, and to convert ext2.

Actually, in a hallway conversation with David Chinner at the Linux
filesystem/storage workshop this week, it sounded like he was willing
to at least consider changing XFS to use a VFS-based infrastructure.
So when we put together a proposal for how to do this in the XFS
layer, I would suggest cc'ing the linux-fsdevel list.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: booked-page-flag.patch
  2007-02-16  7:46             ` booked-page-flag.patch Andrew Morton
  2007-02-16  7:56               ` booked-page-flag.patch Alex Tomas
@ 2007-02-16 16:02               ` Dave Kleikamp
  1 sibling, 0 replies; 16+ messages in thread
From: Dave Kleikamp @ 2007-02-16 16:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alex Tomas, Eric Sandeen, Theodore Ts'o, linux-ext4

On Thu, 2007-02-15 at 23:46 -0800, Andrew Morton wrote:
> On Fri, 16 Feb 2007 10:30:39 +0300 Alex Tomas <alex@clusterfs.com> wrote:
> > hmm. I thought it has to call get_block() at least to know whether
> > the block is already allocated. and I was going to reserve space
> > in prepare_write for which some fs-specific method is needed becase
> > only fs knows how much metadata it'd need.
> 
> Well, one could just assume that the page has no disk mapping and go and
> make the space reservation.  Things will work out OK when we come to do
> writepage().
> 
> Or one could do both: call get_block() only if the page was inside i_size.

Or call get_block() with create = 0.  Or replace the create argument
with a flags field that can take either GET_BLOCK_CREATE or
GET_BLOCK_RESERVE.
-- 
David Kleikamp
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2007-02-16 16:07 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-15  8:45 booked-page-flag.patch Andrew Morton
2007-02-15 14:03 ` booked-page-flag.patch Alex Tomas
2007-02-15 17:18 ` booked-page-flag.patch Eric Sandeen
2007-02-15 17:30   ` booked-page-flag.patch Alex Tomas
2007-02-15 20:56     ` booked-page-flag.patch Andrew Morton
2007-02-15 21:07       ` booked-page-flag.patch Alex Tomas
2007-02-15 23:23         ` booked-page-flag.patch Andrew Morton
2007-02-16  7:30           ` booked-page-flag.patch Alex Tomas
2007-02-16  7:46             ` booked-page-flag.patch Andrew Morton
2007-02-16  7:56               ` booked-page-flag.patch Alex Tomas
2007-02-16  8:06                 ` booked-page-flag.patch Andrew Morton
2007-02-16 12:14                   ` booked-page-flag.patch Andreas Dilger
2007-02-16 16:02               ` booked-page-flag.patch Dave Kleikamp
2007-02-16 15:03           ` booked-page-flag.patch Theodore Tso
2007-02-15 20:21   ` booked-page-flag.patch Andrew Morton
2007-02-16 12:17   ` booked-page-flag.patch Andreas Dilger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.