From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Ronald Moesbergen <intercommit@gmail.com>,
	Vladislav Bolkhovitin <vst@vlnb.net>
Subject: Re: [RESEND] [PATCH] readahead:add blk_run_backing_dev
Date: Wed, 23 Sep 2009 09:48:02 +0800
Message-ID: <20090923014802.GA11491@localhost>
In-Reply-To: <20090922135838.33ebe36b.akpm@linux-foundation.org>

On Wed, Sep 23, 2009 at 04:58:38AM +0800, Andrew Morton wrote:
> On Fri, 29 May 2009 14:35:55 +0900
> Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> wrote:
> 
> > I added blk_run_backing_dev to page_cache_async_readahead
> > so that readahead I/O is unplugged, which improves throughput,
> > especially in RAID environments.
> 
> I still haven't sent this upstream.  It's unclear to me that we've
> decided that it merits merging?

Yes, if I remember right, the performance gain was later confirmed
by Ronald's independent testing on his RAID. (Ronald CC-ed)

Thanks,
Fengguang

> 
> 
> From: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> 
> I added blk_run_backing_dev to page_cache_async_readahead so that readahead
> I/O is unplugged, which improves throughput, especially in RAID environments.
> 
> In the normal case, if page N becomes uptodate at time T(N), then T(N) <=
> T(N+1) holds.  With RAID (and NFS to some degree) there is no strict
> ordering: the data arrival time depends on the runtime status of the
> individual disks, which breaks that relation.  So in do_generic_file_read(),
> just after the async readahead IO request is submitted, the current page may
> well already be uptodate, so the page won't be locked and the block device
> won't be implicitly unplugged:
> 
>                 if (PageReadahead(page))
>                         page_cache_async_readahead();
>                 if (!PageUptodate(page))
>                         goto page_not_up_to_date;
>                 //...
> page_not_up_to_date:
>                 lock_page_killable(page);
> 
> Therefore explicit unplugging can help.
> 
> The following are the test results with dd.
> 
> # dd if=testdir/testfile of=/dev/null bs=16384
> 
> -2.6.30-rc6
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 224.182 seconds, 76.6 MB/s
> 
> -2.6.30-rc6-patched
> 1048576+0 records in
> 1048576+0 records out
> 17179869184 bytes (17 GB) copied, 206.465 seconds, 83.2 MB/s
> 
> (7-disk RAID-0 array)
> 
> -2.6.30-rc6
> 1054976+0 records in
> 1054976+0 records out
> 17284726784 bytes (17 GB) copied, 212.233 seconds, 81.4 MB/s
> 
> -2.6.30-rc6-patched
> 1054976+0 records out
> 17284726784 bytes (17 GB) copied, 198.878 seconds, 86.9 MB/s
> 
> (7-disk RAID-5 array)
> 
> The patch was found to improve performance with the SCST scsi target
> driver.  See
> http://sourceforge.net/mailarchive/forum.php?thread_name=a0272b440906030714g67eabc5k8f847fb1e538cc62%40mail.gmail.com&forum_name=scst-devel
> 
> [akpm@linux-foundation.org: unbust comment layout]
> [akpm@linux-foundation.org: "fix" CONFIG_BLOCK=n]
> Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
> Acked-by: Wu Fengguang <fengguang.wu@intel.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Tested-by: Ronald <intercommit@gmail.com>
> Cc: Bart Van Assche <bart.vanassche@gmail.com>
> Cc: Vladislav Bolkhovitin <vst@vlnb.net>
> Cc: Randy Dunlap <randy.dunlap@oracle.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/readahead.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff -puN mm/readahead.c~readahead-add-blk_run_backing_dev mm/readahead.c
> --- a/mm/readahead.c~readahead-add-blk_run_backing_dev
> +++ a/mm/readahead.c
> @@ -547,5 +547,17 @@ page_cache_async_readahead(struct addres
>  
>  	/* do read-ahead */
>  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> +
> +#ifdef CONFIG_BLOCK
> +	/*
> +	 * Normally the current page is !uptodate and lock_page() will be
> +	 * immediately called to implicitly unplug the device. However this
> +	 * is not always true for RAID configurations, where data does not
> +	 * arrive strictly in submission order. In this case we need to
> +	 * explicitly kick off the IO.
> +	 */
> +	if (PageUptodate(page))
> +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
> +#endif
>  }
>  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> _
> 
