* [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
@ 2011-01-19  1:15 Shaohua Li
  2011-01-19 20:34 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2011-01-19  1:15 UTC
  To: linux-btrfs, linux-fsdevel
  Cc: Chris Mason, Christoph Hellwig, Andrew Morton, Arjan van de Ven,
	Yan, Zheng, Wu, Fengguang, linux-api, manpages

Hi,
  We have file readahead to read file data asynchronously, but no
metadata readahead. For a list of files, the metadata is scattered
across fragmented disk space and each metadata read is synchronous,
which badly hurts the efficiency of readahead. These patches add
metadata readahead for btrfs. It has two advantages: metadata reads
become asynchronous, and disk I/O seeks are significantly reduced.
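
To illustrate the seek argument, here is a toy model in Python, not taken from the patches. It assumes the head position is a single number and that the cost of a read is the distance moved; the offsets and the `head_travel` helper are made up for illustration. It compares visiting metadata blocks in the order the files are opened with visiting the same blocks as one sorted batch, which is roughly what becomes possible once the reads are issued ahead of time.

```python
def head_travel(offsets, start=0):
    """Total distance the head moves when visiting offsets in the given order."""
    travel = 0
    position = start
    for offset in offsets:
        travel += abs(offset - position)
        position = offset
    return travel


# Hypothetical block offsets of the metadata for five files, in the
# order the files would be opened during boot.
demand_order = [900, 100, 700, 200, 800]

# With readahead the whole batch is known up front, so this toy model
# assumes it can be issued in ascending order.
batched_order = sorted(demand_order)

print(head_travel(demand_order))   # 3400
print(head_travel(batched_order))  # 900
```

Real disks and I/O schedulers are more complicated than this, so the numbers only show the direction of the effect.
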
  In btrfs, metadata is stored in btree_inode. Ideally we would hook
that inode to an fd so that existing syscalls (readahead, mincore or
the upcoming fincore) could do the readahead, but the inode is hidden
and, as far as I understand, there is no easy way to do this. Another
problem is that we need to check the page referenced bit to tell
whether a page is valid, and doing that in fincore/mincore isn't
appropriate. Metadata readahead also needs filesystem-specific
checks, as in patch 4, and doing those checks in an existing API (for
example fadvise) would make a mess too. So we add two ioctls for
this: one is like the readahead syscall, the other is like the
mincore/fincore syscall.
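
For comparison, this is roughly what the existing fd-based interface looks like for ordinary file data, sketched in Python with `os.posix_fadvise` (available on Linux and other Unix systems). The helper name `hint_readahead` is hypothetical. The point is that the call needs an fd for the file, which is exactly what is missing for the hidden btree_inode. The proposed ioctls themselves are not shown, because their names and argument layout are not given in this message.

```python
import os


def hint_readahead(path):
    """Ask the kernel to start reading a regular file's data in the background.

    Returns the number of bytes covered by the hint.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # POSIX_FADV_WILLNEED is advisory: the call returns without
        # waiting for the data to arrive in the page cache.
        os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        return size
    finally:
        os.close(fd)
```

A boot-time readahead daemon would call something like this for each file on its list; the metadata needed to locate those files is still read synchronously, which is the gap the patches aim to close.
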
  On a hard-disk-based netbook running MeeGo, metadata readahead
reduced boot time by about 3.5s on average, out of a total of 16s.

v2->v3:
1. Fixed some issues Arnd pointed out
2. Rebased to latest git
3. Removed the 'updated' page flag check from patch 2, as suggested by
Fengguang

v1->v2:
1. Added more comments and fixed return values, as suggested by
Andrew Morton
2. Fixed a race condition pointed out by Yan Zheng

initial post:
http://marc.info/?l=linux-fsdevel&m=129222493406353&w=2

Thanks,
Shaohua

* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
  2011-01-19  1:15 [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs Shaohua Li
@ 2011-01-19 20:34 ` Andrew Morton
  2011-01-19 21:33   ` David Nicol
       [not found]   ` <20110119123451.75bb3c76.akpm@linux-foundation.org>
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2011-01-19 20:34 UTC
  To: Shaohua Li
  Cc: linux-btrfs, linux-fsdevel, Chris Mason, Christoph Hellwig,
	Arjan van de Ven, Yan, Zheng, Wu, Fengguang, linux-api, manpages

On Wed, 19 Jan 2011 09:15:15 +0800
Shaohua Li <shaohua.li@intel.com> wrote:

>   We have file readahead to read file data asynchronously, but no
> metadata readahead. For a list of files, the metadata is scattered
> across fragmented disk space and each metadata read is synchronous,
> which badly hurts the efficiency of readahead. These patches add
> metadata readahead for btrfs. It has two advantages: metadata reads
> become asynchronous, and disk I/O seeks are significantly reduced.
>   In btrfs, metadata is stored in btree_inode. Ideally we would hook
> that inode to an fd so that existing syscalls (readahead, mincore or
> the upcoming fincore) could do the readahead, but the inode is hidden
> and, as far as I understand, there is no easy way to do this. Another
> problem is that we need to check the page referenced bit to tell
> whether a page is valid, and doing that in fincore/mincore isn't
> appropriate. Metadata readahead also needs filesystem-specific
> checks, as in patch 4, and doing those checks in an existing API (for
> example fadvise) would make a mess too. So we add two ioctls for
> this: one is like the readahead syscall, the other is like the
> mincore/fincore syscall.

Has anyone looked at implementing this for filesystems other than
btrfs?  Have the ext4 guys taken a look?  Did they see any impediments
to implementing it for ext4?

>   On a hard-disk-based netbook running MeeGo, metadata readahead
> reduced boot time by about 3.5s on average, out of a total of 16s.

That's a respectable speedup.  And it *needs* to be a good speedup,
given how hacky all of this is!

But then..  reducing bootup time on a laptop/desktop/server by 3.5s
isn't exactly a world-shattering benefit, is it?  Is it worth all the
hacky code?

It would be much more valuable if those 3.5 seconds were available to
devices which really really care about bootup times, but very few of
those devices use rotating disks nowadays, I expect?


* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
  2011-01-19 20:34 ` Andrew Morton
@ 2011-01-19 21:33   ` David Nicol
  2011-01-20  2:27     ` Shaohua Li
       [not found]   ` <20110119123451.75bb3c76.akpm@linux-foundation.org>
  1 sibling, 1 reply; 7+ messages in thread
From: David Nicol @ 2011-01-19 21:33 UTC
  To: Andrew Morton
  Cc: Shaohua Li, linux-btrfs, linux-fsdevel, Chris Mason,
	Christoph Hellwig, Arjan van de Ven, Yan, Zheng, Wu, Fengguang,
	linux-api, manpages

On Wed, Jan 19, 2011 at 2:34 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:

> It would be much more valuable if those 3.5 seconds were available to
> devices which really really care about bootup times, but very few of
> those devices use rotating disks nowadays, I expect?

And don't rotating disk modules read and buffer whole tracks, doing
their own readahead, anymore, anyway? Isn't that part of what "on-disk
cache" does?

* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
  2011-01-19 21:33   ` David Nicol
@ 2011-01-20  2:27     ` Shaohua Li
  0 siblings, 0 replies; 7+ messages in thread
From: Shaohua Li @ 2011-01-20  2:27 UTC
  To: David Nicol
  Cc: Andrew Morton, linux-btrfs, linux-fsdevel, Chris Mason,
	Christoph Hellwig, Arjan van de Ven, Yan, Zheng, Wu, Fengguang,
	linux-api, manpages

On Thu, 2011-01-20 at 05:33 +0800, David Nicol wrote:
> On Wed, Jan 19, 2011 at 2:34 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> 
> > It would be much more valuable if those 3.5 seconds were available to
> > devices which really really care about bootup times, but very few of
> > those devices use rotating disks nowadays, I expect?
> 
> And don't rotating disk modules read and buffer whole tracks, doing
> their own readahead, anymore, anyway? Isn't that part of what "on-disk
> cache" does?
Disk readahead and metadata readahead are completely different; you
didn't even look at the patch or log before saying this.

* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
       [not found]   ` <20110119123451.75bb3c76.akpm@linux-foundation.org>
@ 2011-01-20  2:34     ` Shaohua Li
  2011-01-20  2:46       ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2011-01-20  2:34 UTC
  To: Andrew Morton
  Cc: linux-btrfs, linux-fsdevel, Chris Mason,
	Christoph Hellwig, Arjan van de Ven, Yan, Zheng, Wu, Fengguang,
	linux-api, manpages

On Thu, 2011-01-20 at 04:34 +0800, Andrew Morton wrote:
> On Wed, 19 Jan 2011 09:15:15 +0800
> Shaohua Li <shaohua.li@intel.com> wrote:
> 
> >   We have file readahead to read file data asynchronously, but no
> > metadata readahead. For a list of files, the metadata is scattered
> > across fragmented disk space and each metadata read is synchronous,
> > which badly hurts the efficiency of readahead. These patches add
> > metadata readahead for btrfs. It has two advantages: metadata reads
> > become asynchronous, and disk I/O seeks are significantly reduced.
> >   In btrfs, metadata is stored in btree_inode. Ideally we would hook
> > that inode to an fd so that existing syscalls (readahead, mincore or
> > the upcoming fincore) could do the readahead, but the inode is hidden
> > and, as far as I understand, there is no easy way to do this. Another
> > problem is that we need to check the page referenced bit to tell
> > whether a page is valid, and doing that in fincore/mincore isn't
> > appropriate. Metadata readahead also needs filesystem-specific
> > checks, as in patch 4, and doing those checks in an existing API (for
> > example fadvise) would make a mess too. So we add two ioctls for
> > this: one is like the readahead syscall, the other is like the
> > mincore/fincore syscall.
> 
> Has anyone looked at implementing this for filesystems other than
> btrfs?  Have the ext4 guys taken a look?  Did they see any impediments
> to implementing it for ext4?
Not yet. I do hope the ext4 guys can take a look. From my understanding,
it should be relatively easy to do in the ext filesystems.

> >   On a hard-disk-based netbook running MeeGo, metadata readahead
> > reduced boot time by about 3.5s on average, out of a total of 16s.
> 
> That's a respectable speedup.  And it *needs* to be a good speedup,
> given how hacky all of this is!
> 
> But then..  reducing bootup time on a laptop/desktop/server by 3.5s
> isn't exactly a world-shattering benefit, is it?  Is it worth all the
> hacky code?
A laptop/desktop/server needs to read more data from hard disk, so I
think this will save even more boot time there, though I haven't
tested it yet.

> It would be much more valuable if those 3.5 seconds were available to
> devices which really really care about bootup times, but very few of
> those devices use rotating disks nowadays, I expect?
Actually, most popular netbooks currently use rotating disks. And this
will benefit laptops and desktops too.

Thanks,
Shaohua

* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
  2011-01-20  2:34     ` Shaohua Li
@ 2011-01-20  2:46       ` Andrew Morton
       [not found]         ` <20110119184636.fed233a7.akpm@linux-foundation.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2011-01-20  2:46 UTC
  To: Shaohua Li
  Cc: linux-btrfs, linux-fsdevel, Chris Mason,
	Christoph Hellwig, Arjan van de Ven, Yan, Zheng, Wu, Fengguang,
	linux-api, manpages

On Thu, 20 Jan 2011 10:34:18 +0800 Shaohua Li <shaohua.li@intel.com> wrote:

> > >   On a hard-disk-based netbook running MeeGo, metadata readahead
> > > reduced boot time by about 3.5s on average, out of a total of 16s.
> > 
> > That's a respectable speedup.  And it *needs* to be a good speedup,
> > given how hacky all of this is!
> > 
> > But then..  reducing bootup time on a laptop/desktop/server by 3.5s
> > isn't exactly a world-shattering benefit, is it?  Is it worth all the
> > hacky code?
> A laptop/desktop/server needs to read more data from hard disk, so I
> think this will save even more boot time there, though I haven't
> tested it yet.

Well, the whole point of the patch is to improve boot times, so the
more boot-time testing you can do, the better that is!

> > It would be much more valuable if those 3.5 seconds were available to
> > devices which really really care about bootup times, but very few of
> > those devices use rotating disks nowadays, I expect?
> Actually, most popular netbooks currently use rotating disks. And this
> will benefit laptops and desktops too.

But my point is that three seconds boot-time improvement for a system
which has an uptime of days or months isn't terribly exciting.

What *would* be terribly exciting is a three-second improvement for
cameras, cellphones, etc.  But they don't use spinning disks.

Can we expect *any* benefit for flash-type storage devices?  If so, how
much?

* Re: [PATCH v3 0/5]add new ioctls to do metadata readahead in btrfs
       [not found]         ` <20110119184636.fed233a7.akpm@linux-foundation.org>
@ 2011-01-20  2:58           ` Shaohua Li
  0 siblings, 0 replies; 7+ messages in thread
From: Shaohua Li @ 2011-01-20  2:58 UTC
  To: Andrew Morton
  Cc: linux-btrfs, linux-fsdevel, Chris Mason,
	Christoph Hellwig, Arjan van de Ven, Yan, Zheng, Wu, Fengguang,
	linux-api, manpages

On Thu, 2011-01-20 at 10:46 +0800, Andrew Morton wrote:
> On Thu, 20 Jan 2011 10:34:18 +0800 Shaohua Li <shaohua.li@intel.com> wrote:
> 
> > > >   On a hard-disk-based netbook running MeeGo, metadata readahead
> > > > reduced boot time by about 3.5s on average, out of a total of 16s.
> > > 
> > > That's a respectable speedup.  And it *needs* to be a good speedup,
> > > given how hacky all of this is!
> > > 
> > > But then..  reducing bootup time on a laptop/desktop/server by 3.5s
> > > isn't exactly a world-shattering benefit, is it?  Is it worth all the
> > > hacky code?
> > A laptop/desktop/server needs to read more data from hard disk, so I
> > think this will save even more boot time there, though I haven't
> > tested it yet.
> 
> Well, the whole point of the patch is to improve boot times, so the
> more boot-time testing you can do, the better that is!
Each distribution uses its own (data) readahead daemon, and changing
the daemon takes time, but I'll see if I can get some data on a
desktop.

> > > It would be much more valuable if those 3.5 seconds were available to
> > > devices which really really care about bootup times, but very few of
> > > those devices use rotating disks nowadays, I expect?
> > Actually, most popular netbooks currently use rotating disks. And this
> > will benefit laptops and desktops too.
> 
> But my point is that three seconds boot-time improvement for a system
> which has an uptime of days or months isn't terribly exciting.
> 
> What *would* be terribly exciting is a three-second improvement for
> cameras, cellphones, etc.  But they don't use spinning disks.
> 
> Can we expect *any* benefit for flash-type storage devices?  If so, how
> much?
There should be no benefit for high-end SSDs, because they have high
throughput even for random I/O. For low-end flash-type storage devices
there should be a small benefit, but I wouldn't expect much. I can't
test a camera or cellphone, but I can test a USB disk in a desktop if
you like.

Thanks,
Shaohua
