All of lore.kernel.org
 help / color / mirror / Atom feed
* ext4 not currently doing (much) multi-block allocation?
@ 2009-02-15  5:32 Theodore Tso
  2009-02-15 11:05 ` Aneesh Kumar K.V
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Tso @ 2009-02-15  5:32 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-ext4

So I was looking at the ext4 code to see how hard it would be to add a
function that would take a struct inode *, and make sure that all of
the pages in the page cache had been allocated a physical block on
disk (but not necessarily writing the I/O to disk).  The idea would be
to do this on close if the file had been truncated or opened with
O_TRUNC, and to also call this function if the inode had been renamed
and in the process a destination inode was freed.  That way if we have
data=ordered, the blocks would be allocated, and at the next commit,
we would force the data blocks to disk.

While I was looking at the code, it looks to me like we are currently
only allocating a page at a time; ext4_da_writepages() may end up
allocating a number of pages, but it's doing it one page at a time,
not an extent at a time.  So if the filesystem blocksize is 4k (and
the page size is 4k), the only time we will ever call the mballoc with
an allocation request greater than 1 is in the fallocate() system call
handler.   This seems... non-optimal.   Am I missing something?

	   		 		     - Ted



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ext4 not currently doing (much) multi-block allocation?
  2009-02-15  5:32 ext4 not currently doing (much) multi-block allocation? Theodore Tso
@ 2009-02-15 11:05 ` Aneesh Kumar K.V
  2009-02-15 13:36   ` Theodore Tso
  0 siblings, 1 reply; 6+ messages in thread
From: Aneesh Kumar K.V @ 2009-02-15 11:05 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

On Sun, Feb 15, 2009 at 12:32:06AM -0500, Theodore Tso wrote:
> So I was looking at the ext4 code to see how hard it would be to add a
> function that would take a struct inode *, and make sure that all of
> the pages in the page cache had been allocated a physical block on
> disk (but not necessarily writing the I/O to disk).  The idea would be
> to do this on close if the file had been truncated or opened with
> O_TRUNC, and to also call this function if the inode had been renamed
> and in the process a destination inode was freed.  That way if we have
> data=ordered, the blocks would be allocated, and at the next commit,
> we would force the data blocks to disk.
> 
> While I was looking at the code, it looks to me like we are currently
> only allocating a page at a time; ext4_da_writepages() may end up
> allocating a number of pages, but it's doing it one page at a time,
> not an extent at a time.  So if the filesystem blocksize is 4k (and
> the page size is 4k), the only time we will ever call the mballoc with
> an allocation request greater than 1 is in the fallocate() system call
> handler.   This seems... non-optimal.   Am I missing something?
> 

Here is how it works. During writepages we loop through the dirty pages
and build largest contiguous block extent (mpage_add_bh_to_extent). Then we call
mpage_da_map_blocks. mpage_da_map_blocks does the mutli block request.
Once we have the blocks allocated we map these blocks to the pages. And
then we writeback one page at a time using writepage callback.

-aneesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ext4 not currently doing (much) multi-block allocation?
  2009-02-15 11:05 ` Aneesh Kumar K.V
@ 2009-02-15 13:36   ` Theodore Tso
  2009-02-15 17:36     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Tso @ 2009-02-15 13:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-ext4

On Sun, Feb 15, 2009 at 04:35:28PM +0530, Aneesh Kumar K.V wrote:
> Here is how it works. During writepages we loop through the dirty pages
> and build largest contiguous block extent (mpage_add_bh_to_extent). Then we call
> mpage_da_map_blocks. mpage_da_map_blocks does the mutli block request.
> Once we have the blocks allocated we map these blocks to the pages. And
> then we writeback one page at a time using writepage callback.

mpage_da_map_blocks() calls mpd->get_block, which is set to
ext4_da_get_block_write(), which allocates a single block at a time
(max_blocks is set to bh->b_size >> inode->i_blkbits).

Put another way, where is the call to ext4_get_blocks_wrap() which
does the multi-block request?  I don't see it...

						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ext4 not currently doing (much) multi-block allocation?
  2009-02-15 13:36   ` Theodore Tso
@ 2009-02-15 17:36     ` Aneesh Kumar K.V
  2009-02-15 19:37       ` Eric Sandeen
  2009-02-15 21:12       ` Theodore Tso
  0 siblings, 2 replies; 6+ messages in thread
From: Aneesh Kumar K.V @ 2009-02-15 17:36 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

On Sun, Feb 15, 2009 at 08:36:18AM -0500, Theodore Tso wrote:
> On Sun, Feb 15, 2009 at 04:35:28PM +0530, Aneesh Kumar K.V wrote:
> > Here is how it works. During writepages we loop through the dirty pages
> > and build largest contiguous block extent (mpage_add_bh_to_extent). Then we call
> > mpage_da_map_blocks. mpage_da_map_blocks does the mutli block request.
> > Once we have the blocks allocated we map these blocks to the pages. And
> > then we writeback one page at a time using writepage callback.
> 
> mpage_da_map_blocks() calls mpd->get_block, which is set to
> ext4_da_get_block_write(), which allocates a single block at a time
> (max_blocks is set to bh->b_size >> inode->i_blkbits).


That bh>b_size indicate multiple blocks. 

we do the below in mpage_add_bh_to_extent

2024         if (logical == next && (bh->b_state & BH_FLAGS) == lbh->b_state) {
2025                 lbh->b_size += b_size;
2026                 return;
2027         }

> 
> Put another way, where is the call to ext4_get_blocks_wrap() which
> does the multi-block request?  I don't see it...
> 

-aneesh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ext4 not currently doing (much) multi-block allocation?
  2009-02-15 17:36     ` Aneesh Kumar K.V
@ 2009-02-15 19:37       ` Eric Sandeen
  2009-02-15 21:12       ` Theodore Tso
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Sandeen @ 2009-02-15 19:37 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: Theodore Tso, linux-ext4

Aneesh Kumar K.V wrote:
> On Sun, Feb 15, 2009 at 08:36:18AM -0500, Theodore Tso wrote:
>> On Sun, Feb 15, 2009 at 04:35:28PM +0530, Aneesh Kumar K.V wrote:
>>> Here is how it works. During writepages we loop through the dirty pages
>>> and build largest contiguous block extent (mpage_add_bh_to_extent). Then we call
>>> mpage_da_map_blocks. mpage_da_map_blocks does the mutli block request.
>>> Once we have the blocks allocated we map these blocks to the pages. And
>>> then we writeback one page at a time using writepage callback.
>> mpage_da_map_blocks() calls mpd->get_block, which is set to
>> ext4_da_get_block_write(), which allocates a single block at a time
>> (max_blocks is set to bh->b_size >> inode->i_blkbits).
> 
> 
> That bh>b_size indicate multiple blocks. 

I never did like this overloading of a buffer head for this sort of
purpose, for this (sometimes confusing) reason, but it's done throughout
the kernel... :)

But, maybe a comment would be in order just to make it clear  "This is a
mapping BH" or something.

-Eric

> we do the below in mpage_add_bh_to_extent
> 
> 2024         if (logical == next && (bh->b_state & BH_FLAGS) == lbh->b_state) {
> 2025                 lbh->b_size += b_size;
> 2026                 return;
> 2027         }
> 
>> Put another way, where is the call to ext4_get_blocks_wrap() which
>> does the multi-block request?  I don't see it...
>>
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: ext4 not currently doing (much) multi-block allocation?
  2009-02-15 17:36     ` Aneesh Kumar K.V
  2009-02-15 19:37       ` Eric Sandeen
@ 2009-02-15 21:12       ` Theodore Tso
  1 sibling, 0 replies; 6+ messages in thread
From: Theodore Tso @ 2009-02-15 21:12 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-ext4

On Sun, Feb 15, 2009 at 11:06:29PM +0530, Aneesh Kumar K.V wrote:
> 
> That bh>b_size indicate multiple blocks. 
> 
> we do the below in mpage_add_bh_to_extent
> 
> 2024         if (logical == next && (bh->b_state & BH_FLAGS) == lbh->b_state) {
> 2025                 lbh->b_size += b_size;
> 2026                 return;
> 2027         }
> 

Urgh, right.  mpd->lbh isn't a real struct buffer_head at all; the
only fields we use out of it is b_size, b_state, and b_blocknr.  I
really dislike this coding style; it's hard to tell what is a real
buffer_head, and what is a fake buffer_head.  It also wastes about 200
bytes of kernel stack space.

There's a similar problem with bh_result; in which functions is it a
real buffer head (and must be a real buffer head), and in which
functions is it a fake buffer head, and which functions is it
sometimes a real buffer_head, and when it is a fake buffer_head?  This
is one of those things which makes for hard-to-maintain code.

							- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-02-15 21:22 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-15  5:32 ext4 not currently doing (much) multi-block allocation? Theodore Tso
2009-02-15 11:05 ` Aneesh Kumar K.V
2009-02-15 13:36   ` Theodore Tso
2009-02-15 17:36     ` Aneesh Kumar K.V
2009-02-15 19:37       ` Eric Sandeen
2009-02-15 21:12       ` Theodore Tso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.