linux-kernel.vger.kernel.org archive mirror
* [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
@ 2014-02-18  7:25 Raghavendra K T
  2014-02-18  9:49 ` Jan Kara
  2014-02-18 22:23 ` David Rientjes
  0 siblings, 2 replies; 9+ messages in thread
From: Raghavendra K T @ 2014-02-18  7:25 UTC (permalink / raw)
  To: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, Jan Kara, rientjes, Linus, nacc
  Cc: linux-mm, linux-kernel, Raghavendra K T

Currently max_sane_readahead() returns zero on a CPU that has no local memory
node, which causes readahead to fail. Fix this by returning the minimum of
(requested pages, 512). Users running applications that depend on readahead,
such as streaming applications, on a memoryless CPU see a considerable
performance boost.

Result:
fadvise experiment with FADV_WILLNEED on a PPC machine with a memoryless CPU
and a 1GB test file (12 iterations) yielded around 46.66% improvement.

fadvise experiment with FADV_WILLNEED on an x240 NUMA machine (32GB* 4G RAM)
with a 1GB test file (12 iterations) showed no impact on the normal NUMA
case with the patch.

Kernel   Avg     Stddev
base     7.4975  3.92%
patched  7.4174  3.26%

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
[Andrew: making return value PAGE_SIZE independent]
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---
 I would like to thank Honza and David for their valuable suggestions and
 for patiently reviewing the patches.

 Changes in V6:
 - Just limit readahead to 2MB (512 pages on a 4k-page system), as suggested
   by Linus, and make the limit independent of PAGE_SIZE.

 Changes in V5:
 - Drop the 4k limit for normal readahead. (Jan Kara)

 Changes in V4:
 - Check total node memory to decide whether we have no
   local memory (Jan Kara)
 - Add a 4k-page limit on normal and remote readahead (Linus)
   (Linus's suggestion was a 16MB limit).

 Changes in V3:
 - Drop iterating over numa nodes that calculates total free pages (Linus)

 Agreed that we have no control over which NUMA node readahead pages are
 allocated on, and hence for remote readahead we cannot further sanitize
 based on the potential free pages of that node. We also do not want to
 iterate through all nodes to find the total free pages.

 Suggestions and comments welcome
 mm/readahead.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 0de2360..1fa0d6f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -233,14 +233,14 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	return 0;
 }
 
+#define MAX_READAHEAD   ((512*4096)/PAGE_CACHE_SIZE)
 /*
  * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
  * sensible upper limit.
  */
 unsigned long max_sane_readahead(unsigned long nr)
 {
-	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
-		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
+	return min(nr, MAX_READAHEAD);
 }
 
 /*
-- 
1.7.11.7
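
The behavioural change in the diff above can be modelled in userspace. This is a minimal sketch, not kernel code: `struct node_counters`, `old_limit()` and `new_limit()` are hypothetical names introduced here, and the counters stand in for what the kernel reads through node_page_state(). It only illustrates why the old expression yields zero on a node without memory while the patched one does not.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-node counters that the kernel reads
 * via node_page_state(); the field names are illustrative only. */
struct node_counters {
    unsigned long inactive_file; /* models NR_INACTIVE_FILE for the node */
    unsigned long free_pages;    /* models NR_FREE_PAGES for the node */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
    return a < b ? a : b;
}

/* Model of the pre-patch limit: half of the local node's inactive file
 * pages plus free pages. A memoryless node has both counters at zero. */
unsigned long old_limit(unsigned long nr, struct node_counters local)
{
    return min_ul(nr, (local.inactive_file + local.free_pages) / 2);
}

/* Model of the patched limit: a fixed 2MB budget, expressed in pages of
 * page_size bytes, with no dependence on any node's state. */
unsigned long new_limit(unsigned long nr, unsigned long page_size)
{
    assert(page_size > 0);
    return min_ul(nr, (512UL * 4096UL) / page_size);
}
```

With both counters at zero, `old_limit()` returns 0 for any request, which corresponds to the readahead failure described in the changelog. `new_limit()` returns the request itself up to 512 pages when `page_size` is 4096.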



* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18  7:25 [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages Raghavendra K T
@ 2014-02-18  9:49 ` Jan Kara
  2014-02-18 12:04   ` Raghavendra K T
  2014-02-18 22:23 ` David Rientjes
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Kara @ 2014-02-18  9:49 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, Jan Kara, rientjes, Linus, nacc, linux-mm,
	linux-kernel

On Tue 18-02-14 12:55:38, Raghavendra K T wrote:
> Currently max_sane_readahead() returns zero on the cpu having no local memory node
> which leads to readahead failure. Fix the readahead failure by returning
> minimum of (requested pages, 512). Users running application on a memory-less cpu
> which needs readahead such as streaming application see considerable boost in the
> performance.
> 
> Result:
> fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
> with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
> 
> fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
> 32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
> NUMA cases w/ patch.
  Can you try one more thing please? Compare startup time of some big
executable (Firefox or LibreOffice come to my mind) for the patched and
normal kernel on a machine which wasn't hit by this NUMA issue. And don't
forget to do "echo 3 >/proc/sys/vm/drop_caches" before each test to flush
the caches. If this doesn't show significant differences, I'm OK with the
patch.

								Honza

> Kernel     Avg  Stddev
> base	7.4975	3.92%
> patched	7.4174  3.26%
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> [Andrew: making return value PAGE_SIZE independent]
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
>  I would like to thank Honza, David for their valuable suggestions and 
>  patiently reviewing the patches.
> 
>  Changes in V6:
>   - Just limit the readahead to 2MB on 4k pages system as suggested by Linus.
>  and make it independent of PAGE_SIZE. 
> 
>  Changes in V5:
>  - Drop the 4k limit for normal readahead. (Jan Kara)
> 
>  Changes in V4:
>  - Check for total node memory to decide whether we don't
>    have local memory (jan Kara)
>  - Add 4k page limit on readahead for normal and remote readahead (Linus)
>    (Linus suggestion was 16MB limit).
> 
>  Changes in V3:
>  - Drop iterating over numa nodes that calculates total free pages (Linus)
> 
>  Agree that we do not have control on allocation for readahead on a
>  particular numa node and hence for remote readahead we can not further
>  sanitize based on potential free pages of that node. and also we do
>  not want to itererate through all nodes to find total free pages.
> 
>  Suggestions and comments welcome
>  mm/readahead.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 0de2360..1fa0d6f 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -233,14 +233,14 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
>  	return 0;
>  }
>  
> +#define MAX_READAHEAD   ((512*4096)/PAGE_CACHE_SIZE)
>  /*
>   * Given a desired number of PAGE_CACHE_SIZE readahead pages, return a
>   * sensible upper limit.
>   */
>  unsigned long max_sane_readahead(unsigned long nr)
>  {
> -	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> -		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> +	return min(nr, MAX_READAHEAD);
>  }
>  
>  /*
> -- 
> 1.7.11.7
> 
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18 12:04   ` Raghavendra K T
@ 2014-02-18 12:04     ` Jan Kara
  2014-03-17  2:07     ` Madper Xie
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Kara @ 2014-02-18 12:04 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Jan Kara, Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, rientjes, Linus, nacc, linux-mm, linux-kernel

On Tue 18-02-14 17:34:54, Raghavendra K T wrote:
> On 02/18/2014 03:19 PM, Jan Kara wrote:
> >On Tue 18-02-14 12:55:38, Raghavendra K T wrote:
> >>Currently max_sane_readahead() returns zero on the cpu having no local memory node
> >>which leads to readahead failure. Fix the readahead failure by returning
> >>minimum of (requested pages, 512). Users running application on a memory-less cpu
> >>which needs readahead such as streaming application see considerable boost in the
> >>performance.
> >>
> >>Result:
> >>fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
> >>with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
> >>
> >>fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
> >>32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
> >>NUMA cases w/ patch.
> >   Can you try one more thing please? Compare startup time of some big
> >executable (Firefox or LibreOffice come to my mind) for the patched and
> >normal kernel on a machine which wasn't hit by this NUMA issue. And don't
> >forget to do "echo 3 >/proc/sys/vm/drop_caches" before each test to flush
> >the caches. If this doesn't show significant differences, I'm OK with the
> >patch.
> >
> 
> Thanks Honza, I checked with firefox (starting to particular point)..
> I do not see any difference. Both the case took around 14sec.
  Good. You can add my:
Acked-by: Jan Kara <jack@suse.cz>

>  ( some time it is even faster.. may be because we do not do free
> page calculation?. )
  Hardly, that calculation is just a tiny amount of CPU time in the
startup of the application. If there is really a significant difference, it
might be because we don't preload stuff which isn't used in the end.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18  9:49 ` Jan Kara
@ 2014-02-18 12:04   ` Raghavendra K T
  2014-02-18 12:04     ` Jan Kara
  2014-03-17  2:07     ` Madper Xie
  0 siblings, 2 replies; 9+ messages in thread
From: Raghavendra K T @ 2014-02-18 12:04 UTC (permalink / raw)
  To: Jan Kara
  Cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, rientjes, Linus, nacc, linux-mm, linux-kernel

On 02/18/2014 03:19 PM, Jan Kara wrote:
> On Tue 18-02-14 12:55:38, Raghavendra K T wrote:
>> Currently max_sane_readahead() returns zero on the cpu having no local memory node
>> which leads to readahead failure. Fix the readahead failure by returning
>> minimum of (requested pages, 512). Users running application on a memory-less cpu
>> which needs readahead such as streaming application see considerable boost in the
>> performance.
>>
>> Result:
>> fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
>> with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
>>
>> fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
>> 32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
>> NUMA cases w/ patch.
>    Can you try one more thing please? Compare startup time of some big
> executable (Firefox or LibreOffice come to my mind) for the patched and
> normal kernel on a machine which wasn't hit by this NUMA issue. And don't
> forget to do "echo 3 >/proc/sys/vm/drop_caches" before each test to flush
> the caches. If this doesn't show significant differences, I'm OK with the
> patch.
>

Thanks Honza, I checked with Firefox (timing startup up to a particular
point). I do not see any difference: both cases took around 14 seconds.

(Sometimes it is even faster, maybe because we no longer do the free page
calculation?)



* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18  7:25 [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages Raghavendra K T
  2014-02-18  9:49 ` Jan Kara
@ 2014-02-18 22:23 ` David Rientjes
  2014-02-18 22:38   ` Andrew Morton
  1 sibling, 1 reply; 9+ messages in thread
From: David Rientjes @ 2014-02-18 22:23 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, Jan Kara, Linus, nacc, linux-mm, linux-kernel

On Tue, 18 Feb 2014, Raghavendra K T wrote:

> Currently max_sane_readahead() returns zero on the cpu having no local memory node
> which leads to readahead failure. Fix the readahead failure by returning
> minimum of (requested pages, 512). Users running application on a memory-less cpu
> which needs readahead such as streaming application see considerable boost in the
> performance.
> 
> Result:
> fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
> with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
> 
> fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
> 32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
> NUMA cases w/ patch.
> 
> Kernel     Avg  Stddev
> base	7.4975	3.92%
> patched	7.4174  3.26%
> 
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> [Andrew: making return value PAGE_SIZE independent]
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>

So this replaces 
mm-readaheadc-fix-readahead-fail-for-no-local-memory-and-limit-readahead-pages.patch 
in -mm correct?

> ---
>  I would like to thank Honza, David for their valuable suggestions and 
>  patiently reviewing the patches.
> 
>  Changes in V6:
>   - Just limit the readahead to 2MB on 4k pages system as suggested by Linus.
>  and make it independent of PAGE_SIZE. 
> 

I'm not sure I understand why we want to be independent of PAGE_SIZE since 
we're still relying on PAGE_CACHE_SIZE.  Don't you mean to do

#define MAX_READAHEAD	((512*PAGE_SIZE)/PAGE_CACHE_SIZE)

instead?


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18 22:23 ` David Rientjes
@ 2014-02-18 22:38   ` Andrew Morton
  2014-02-18 22:46     ` David Rientjes
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2014-02-18 22:38 UTC (permalink / raw)
  To: David Rientjes
  Cc: Raghavendra K T, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, Jan Kara, Linus, nacc, linux-mm, linux-kernel

On Tue, 18 Feb 2014 14:23:44 -0800 (PST) David Rientjes <rientjes@google.com> wrote:

> On Tue, 18 Feb 2014, Raghavendra K T wrote:
> 
> > Currently max_sane_readahead() returns zero on the cpu having no local memory node
> > which leads to readahead failure. Fix the readahead failure by returning
> > minimum of (requested pages, 512). Users running application on a memory-less cpu
> > which needs readahead such as streaming application see considerable boost in the
> > performance.
> > 
> > Result:
> > fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
> > with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
> > 
> > fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
> > 32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
> > NUMA cases w/ patch.
> > 
> > Kernel     Avg  Stddev
> > base	7.4975	3.92%
> > patched	7.4174  3.26%
> > 
> > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> > [Andrew: making return value PAGE_SIZE independent]
> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> 
> So this replaces 
> mm-readaheadc-fix-readahead-fail-for-no-local-memory-and-limit-readahead-pages.patch 
> in -mm correct?

yup.

> >  Changes in V6:
> >   - Just limit the readahead to 2MB on 4k pages system as suggested by Linus.
> >  and make it independent of PAGE_SIZE. 
> > 
> 
> I'm not sure I understand why we want to be independent of PAGE_SIZE since 
> we're still relying on PAGE_CACHE_SIZE.  Don't you mean to do
> 
> #define MAX_READAHEAD	((512*PAGE_SIZE)/PAGE_CACHE_SIZE)

MAX_READAHEAD is in units of "pages".

This:

+#define MAX_READAHEAD   ((512*4096)/PAGE_CACHE_SIZE)

means "two megabytes", and is implemented in a way to ensure that
MAX_READAHEAD=2mb on 4k pagesize as well as on 64k pagesize.  Because
we don't want variations in PAGE_SIZE to cause alterations in readahead
behavior.
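
The arithmetic behind the two candidate definitions can be checked with a small userspace sketch. The function names below are hypothetical, and the sketch assumes PAGE_SIZE and PAGE_CACHE_SIZE are equal, so a single `page_size` parameter stands in for both.

```c
#include <assert.h>

/* The patch's formula: a 2MB byte budget (512 * 4096) converted to a
 * count of pages that are page_size bytes each. */
unsigned long max_readahead_fixed_bytes(unsigned long page_size)
{
    assert(page_size > 0);
    return (512UL * 4096UL) / page_size;
}

/* The alternative raised in review, (512*PAGE_SIZE)/PAGE_CACHE_SIZE.
 * Assumption: both constants equal page_size, so this is always 512. */
unsigned long max_readahead_fixed_pages(unsigned long page_size)
{
    assert(page_size > 0);
    return (512UL * page_size) / page_size;
}

/* Convert a page count back to bytes for comparison. */
unsigned long pages_to_bytes(unsigned long pages, unsigned long page_size)
{
    return pages * page_size;
}
```

Under that assumption, the patch's formula gives 512 pages at a 4k page size and 32 pages at 64k, which is 2MB in both cases. The alternative gives 512 pages either way, which is 2MB at 4k but 32MB at 64k.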


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18 22:38   ` Andrew Morton
@ 2014-02-18 22:46     ` David Rientjes
  0 siblings, 0 replies; 9+ messages in thread
From: David Rientjes @ 2014-02-18 22:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Raghavendra K T, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, Jan Kara, Linus, nacc, linux-mm, linux-kernel

On Tue, 18 Feb 2014, Andrew Morton wrote:

> > I'm not sure I understand why we want to be independent of PAGE_SIZE since 
> > we're still relying on PAGE_CACHE_SIZE.  Don't you mean to do
> > 
> > #define MAX_READAHEAD	((512*PAGE_SIZE)/PAGE_CACHE_SIZE)
> 
> MAX_READAHEAD is in units of "pages".
> 
> This:
> 
> +#define MAX_READAHEAD   ((512*4096)/PAGE_CACHE_SIZE)
> 
> means "two megabytes", and is implemented in a way to ensure that
> MAX_READAHEAD=2mb on 4k pagesize as well as on 64k pagesize.  Because
> we don't want variations in PAGE_SIZE to cause alterations in readahead
> behavior.
> 

Ah, ok, so 2MB is the magic value that we limit readahead to on all 
architectures.  512 * 4096 is a strange way to write 2MB, but ok :)


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-02-18 12:04   ` Raghavendra K T
  2014-02-18 12:04     ` Jan Kara
@ 2014-03-17  2:07     ` Madper Xie
  2014-03-18  7:13       ` Raghavendra K T
  1 sibling, 1 reply; 9+ messages in thread
From: Madper Xie @ 2014-03-17  2:07 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Jan Kara, Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, rientjes, Linus, nacc, linux-mm, linux-kernel


Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> writes:

> On 02/18/2014 03:19 PM, Jan Kara wrote:
>> On Tue 18-02-14 12:55:38, Raghavendra K T wrote:
>>> Currently max_sane_readahead() returns zero on the cpu having no local memory node
>>> which leads to readahead failure. Fix the readahead failure by returning
>>> minimum of (requested pages, 512). Users running application on a memory-less cpu
>>> which needs readahead such as streaming application see considerable boost in the
>>> performance.
>>>
>>> Result:
>>> fadvise experiment with FADV_WILLNEED on a PPC machine having memoryless CPU
>>> with 1GB testfile ( 12 iterations) yielded around 46.66% improvement.
>>>
>>> fadvise experiment with FADV_WILLNEED on a x240 machine with 1GB testfile
>>> 32GB* 4G RAM  numa machine ( 12 iterations) showed no impact on the normal
>>> NUMA cases w/ patch.
>>    Can you try one more thing please? Compare startup time of some big
>> executable (Firefox or LibreOffice come to my mind) for the patched and
>> normal kernel on a machine which wasn't hit by this NUMA issue. And don't
>> forget to do "echo 3 >/proc/sys/vm/drop_caches" before each test to flush
>> the caches. If this doesn't show significant differences, I'm OK with the
>> patch.
>>
>
> Thanks Honza, I checked with firefox (starting to particular point)..
> I do not see any difference. Both the case took around 14sec.
>
>   ( some time it is even faster.. may be because we do not do free page 
> calculation?. )
Hi. Just a concern: will this reduce performance on some special storage
backends, e.g. tape?
Existing applications may be using readahead for userspace I/O scheduling
to decrease seek time.
-- 
Thanks,
Madper
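
For context, the kind of userspace hint in question might look like the sketch below. It assumes a POSIX system that provides posix_fadvise(); `hint_willneed()` is a hypothetical helper name, not something from the patch or the thread. How much of the requested range the kernel actually reads ahead is up to the kernel, so a cap such as the one in this patch could shorten what a single call prefetches.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: tell the kernel that the byte range
 * [offset, offset + len) of the file at path will be needed soon.
 * A len of 0 means "through the end of the file".
 * Returns 0 on success, -1 if the file cannot be opened, or the
 * positive error number reported by posix_fadvise(). */
int hint_willneed(const char *path, off_t offset, off_t len)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* posix_fadvise() returns the error number directly, not via errno. */
    int rc = posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);

    close(fd);
    return rc;
}
```

An application that relies on large hints for its own I/O scheduling could issue them in chunks and measure the effect on its storage backend, since the thread leaves the tape case untested.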


* Re: [PATCH V6 ] mm readahead: Fix readahead fail for memoryless cpu and limit readahead pages
  2014-03-17  2:07     ` Madper Xie
@ 2014-03-18  7:13       ` Raghavendra K T
  0 siblings, 0 replies; 9+ messages in thread
From: Raghavendra K T @ 2014-03-18  7:13 UTC (permalink / raw)
  To: Madper Xie
  Cc: Jan Kara, Andrew Morton, Fengguang Wu, David Cohen, Al Viro,
	Damien Ramonda, rientjes, Linus, nacc, linux-mm, linux-kernel

On 03/17/2014 07:37 AM, Madper Xie wrote:
>
> Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> writes:
>
>> On 02/18/2014 03:19 PM, Jan Kara wrote:
>>> On Tue 18-02-14 12:55:38, Raghavendra K T wrote:
> Hi. Just a concern. Will the performance reduce on some special storage
> backend? E.g. tape.
> The existent applications may using readahead for userspace I/O schedule
> to decrease seeking time.

Unfortunately I have not tested the patch on such systems yet :(.
Sequential reads of a huge file have not suffered on disk-based systems,
but I think I should be honest enough not to guess at the effect on tape.


