All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Chris Mason <chris.mason@oracle.com>
Subject: Re: [PATCH] mm: disallow direct reclaim page writeback
Date: Wed, 14 Apr 2010 13:12:05 +1000	[thread overview]
Message-ID: <20100414031205.GE2493@dastard> (raw)
In-Reply-To: <20100413143659.GA2493@dastard>

On Wed, Apr 14, 2010 at 12:36:59AM +1000, Dave Chinner wrote:
> On Tue, Apr 13, 2010 at 08:39:29PM +0900, KOSAKI Motohiro wrote:
> > > FWIW, the biggest problem here is that I have absolutely no clue on
> > > how to test what the impact on lumpy reclaim really is. Does anyone
> > > have a relatively simple test that can be run to determine what the
> > > impact is?
> > 
> > So, can you please run two workloads concurrently?
> >  - Normal IO workload (fio, iozone, etc..)
> >  - echo $NUM > /proc/sys/vm/nr_hugepages
> 
> What do I measure/observe/record that is meaningful?

So, a rough as guts first pass - just run a large dd (8 times the
size of memory - 8GB file vs 1GB RAM) and repeated try to allocate
the entire of memory in huge pages (500) every 5 seconds. The IO
rate is roughly 100MB/s, so it takes 75-85s to complete the dd.

The script:

$ cat t.sh
#!/bin/bash

echo 0 > /proc/sys/vm/nr_hugepages
echo 3 > /proc/sys/vm/drop_caches

dd if=/dev/zero of=/mnt/scratch/test bs=1024k count=8000 > /dev/null 2>&1 &

(
for i in `seq 1 1 20`; do
        sleep 5
        /usr/bin/time --format="wall %e" sh -c "echo 500 > /proc/sys/vm/nr_hugepages" 2>&1
        grep HugePages_Total /proc/meminfo
done
) | awk '
        /wall/ { wall += $2; cnt += 1 }
        /Pages/ { pages[cnt] = $2 }
        END { printf "average wall time %f\nPages step: ", wall / cnt ;
                for (i = 1; i <= cnt; i++) {
                        printf "%d ", pages[i];
                }
        }'
----

And the output looks like:

$ sudo ./t.sh
average wall time 0.954500
Pages step: 97 101 101 121 173 173 173 173 173 173 175 194 195 195 202 220 226 419 423 426
$

Run 50 times in a loop, and the outputs averaged, the existing lumpy
reclaim resulted in:

dave@test-1:~$ cat current.txt | awk -f av.awk
av. wall = 0.519385 secs
av Pages step: 192 228 242 255 265 272 279 284 289 294 298 303 307 322 342 366 383 401 412 420

And with my patch that disables ->writepage:

dave@test-1:~$ cat no-direct.txt | awk -f av.awk
av. wall = 0.554163 secs
av Pages step: 231 283 310 316 323 328 336 340 345 351 356 359 364 377 388 397 413 423 432 439

Basically, with my patch lumpy reclaim was *substantially* more
effective with only a slight increase in average allocation latency
with this test case.

I need to add a marker to the output that records when the dd
completes, but from monitoring the writeback rates via PCP, they
were in the balllpark of 85-100MB/s for the existing code, and
95-110MB/s with my patch.  Hence it improved both IO throughput and
the effectiveness of lumpy reclaim.

On the down side, I did have an OOM killer invocation with my patch
after about 150 iterations - dd failed an order zero allocation
because there were 455 huge pages allocated and there were only
_320_ available pages for IO, all of which were under IO. i.e. lumpy
reclaim worked so well that the machine got into order-0 page
starvation.

I know this is a simple test case, but it shows much better results
than I think anyone (even me) is expecting...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Chris Mason <chris.mason@oracle.com>
Subject: Re: [PATCH] mm: disallow direct reclaim page writeback
Date: Wed, 14 Apr 2010 13:12:05 +1000	[thread overview]
Message-ID: <20100414031205.GE2493@dastard> (raw)
In-Reply-To: <20100413143659.GA2493@dastard>

On Wed, Apr 14, 2010 at 12:36:59AM +1000, Dave Chinner wrote:
> On Tue, Apr 13, 2010 at 08:39:29PM +0900, KOSAKI Motohiro wrote:
> > > FWIW, the biggest problem here is that I have absolutely no clue on
> > > how to test what the impact on lumpy reclaim really is. Does anyone
> > > have a relatively simple test that can be run to determine what the
> > > impact is?
> > 
> > So, can you please run two workloads concurrently?
> >  - Normal IO workload (fio, iozone, etc..)
> >  - echo $NUM > /proc/sys/vm/nr_hugepages
> 
> What do I measure/observe/record that is meaningful?

So, a rough as guts first pass - just run a large dd (8 times the
size of memory - 8GB file vs 1GB RAM) and repeated try to allocate
the entire of memory in huge pages (500) every 5 seconds. The IO
rate is roughly 100MB/s, so it takes 75-85s to complete the dd.

The script:

$ cat t.sh
#!/bin/bash

echo 0 > /proc/sys/vm/nr_hugepages
echo 3 > /proc/sys/vm/drop_caches

dd if=/dev/zero of=/mnt/scratch/test bs=1024k count=8000 > /dev/null 2>&1 &

(
for i in `seq 1 1 20`; do
        sleep 5
        /usr/bin/time --format="wall %e" sh -c "echo 500 > /proc/sys/vm/nr_hugepages" 2>&1
        grep HugePages_Total /proc/meminfo
done
) | awk '
        /wall/ { wall += $2; cnt += 1 }
        /Pages/ { pages[cnt] = $2 }
        END { printf "average wall time %f\nPages step: ", wall / cnt ;
                for (i = 1; i <= cnt; i++) {
                        printf "%d ", pages[i];
                }
        }'
----

And the output looks like:

$ sudo ./t.sh
average wall time 0.954500
Pages step: 97 101 101 121 173 173 173 173 173 173 175 194 195 195 202 220 226 419 423 426
$

Run 50 times in a loop, and the outputs averaged, the existing lumpy
reclaim resulted in:

dave@test-1:~$ cat current.txt | awk -f av.awk
av. wall = 0.519385 secs
av Pages step: 192 228 242 255 265 272 279 284 289 294 298 303 307 322 342 366 383 401 412 420

And with my patch that disables ->writepage:

dave@test-1:~$ cat no-direct.txt | awk -f av.awk
av. wall = 0.554163 secs
av Pages step: 231 283 310 316 323 328 336 340 345 351 356 359 364 377 388 397 413 423 432 439

Basically, with my patch lumpy reclaim was *substantially* more
effective with only a slight increase in average allocation latency
with this test case.

I need to add a marker to the output that records when the dd
completes, but from monitoring the writeback rates via PCP, they
were in the balllpark of 85-100MB/s for the existing code, and
95-110MB/s with my patch.  Hence it improved both IO throughput and
the effectiveness of lumpy reclaim.

On the down side, I did have an OOM killer invocation with my patch
after about 150 iterations - dd failed an order zero allocation
because there were 455 huge pages allocated and there were only
_320_ available pages for IO, all of which were under IO. i.e. lumpy
reclaim worked so well that the machine got into order-0 page
starvation.

I know this is a simple test case, but it shows much better results
than I think anyone (even me) is expecting...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-04-14  3:12 UTC|newest]

Thread overview: 248+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-13  0:17 [PATCH] mm: disallow direct reclaim page writeback Dave Chinner
2010-04-13  0:17 ` Dave Chinner
2010-04-13  8:31 ` KOSAKI Motohiro
2010-04-13  8:31   ` KOSAKI Motohiro
2010-04-13 10:29   ` Dave Chinner
2010-04-13 10:29     ` Dave Chinner
2010-04-13 11:39     ` KOSAKI Motohiro
2010-04-13 11:39       ` KOSAKI Motohiro
2010-04-13 14:36       ` Dave Chinner
2010-04-13 14:36         ` Dave Chinner
2010-04-14  3:12         ` Dave Chinner [this message]
2010-04-14  3:12           ` Dave Chinner
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-14  6:52             ` KOSAKI Motohiro
2010-04-15  1:56             ` Dave Chinner
2010-04-15  1:56               ` Dave Chinner
2010-04-14  6:52         ` KOSAKI Motohiro
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-14  7:36           ` Dave Chinner
2010-04-14  7:36             ` Dave Chinner
2010-04-13  9:58 ` Mel Gorman
2010-04-13  9:58   ` Mel Gorman
2010-04-13 11:19   ` Dave Chinner
2010-04-13 11:19     ` Dave Chinner
2010-04-13 19:34     ` Mel Gorman
2010-04-13 19:34       ` Mel Gorman
2010-04-13 20:20       ` Chris Mason
2010-04-13 20:20         ` Chris Mason
2010-04-14  1:40         ` Dave Chinner
2010-04-14  1:40           ` Dave Chinner
2010-04-14  4:59           ` KAMEZAWA Hiroyuki
2010-04-14  4:59             ` KAMEZAWA Hiroyuki
2010-04-14  5:41             ` Dave Chinner
2010-04-14  5:41               ` Dave Chinner
2010-04-14  5:54               ` KOSAKI Motohiro
2010-04-14  5:54                 ` KOSAKI Motohiro
2010-04-14  6:13                 ` Minchan Kim
2010-04-14  7:19                   ` Minchan Kim
2010-04-14  7:19                     ` Minchan Kim
2010-04-14  9:42                     ` KAMEZAWA Hiroyuki
2010-04-14  9:42                       ` KAMEZAWA Hiroyuki
2010-04-14  9:42                       ` KAMEZAWA Hiroyuki
2010-04-14 10:01                       ` Minchan Kim
2010-04-14 10:01                         ` Minchan Kim
2010-04-14 10:07                         ` Mel Gorman
2010-04-14 10:07                           ` Mel Gorman
2010-04-14 10:07                           ` Mel Gorman
2010-04-14 10:16                           ` Minchan Kim
2010-04-14 10:16                             ` Minchan Kim
2010-04-14  7:06                 ` Dave Chinner
2010-04-14  7:06                   ` Dave Chinner
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-14  6:52             ` KOSAKI Motohiro
2010-04-14  7:28             ` Dave Chinner
2010-04-14  7:28               ` Dave Chinner
2010-04-14  8:51               ` Mel Gorman
2010-04-14  8:51                 ` Mel Gorman
2010-04-15  1:34                 ` Dave Chinner
2010-04-15  1:34                   ` Dave Chinner
2010-04-15  1:34                   ` Dave Chinner
2010-04-15  4:09                   ` KOSAKI Motohiro
2010-04-15  4:09                     ` KOSAKI Motohiro
2010-04-15  4:11                     ` [PATCH 1/4] vmscan: delegate pageout io to flusher thread if current is kswapd KOSAKI Motohiro
2010-04-15  4:11                       ` KOSAKI Motohiro
2010-04-15  4:11                       ` KOSAKI Motohiro
2010-04-15  8:05                       ` Suleiman Souhlal
2010-04-15  8:05                         ` Suleiman Souhlal
2010-04-15  8:17                         ` KOSAKI Motohiro
2010-04-15  8:17                           ` KOSAKI Motohiro
2010-04-15  8:26                           ` KOSAKI Motohiro
2010-04-15  8:26                             ` KOSAKI Motohiro
2010-04-15 10:30                             ` Johannes Weiner
2010-04-15 10:30                               ` Johannes Weiner
2010-04-15 17:24                               ` Suleiman Souhlal
2010-04-15 17:24                                 ` Suleiman Souhlal
2010-04-20  2:56                               ` Ying Han
2010-04-20  2:56                                 ` Ying Han
2010-04-15  9:32                         ` Dave Chinner
2010-04-15  9:32                           ` Dave Chinner
2010-04-15  9:41                           ` KOSAKI Motohiro
2010-04-15  9:41                             ` KOSAKI Motohiro
2010-04-15 17:27                           ` Suleiman Souhlal
2010-04-15 17:27                             ` Suleiman Souhlal
2010-04-15 23:33                             ` Dave Chinner
2010-04-15 23:33                               ` Dave Chinner
2010-04-15 23:41                               ` Suleiman Souhlal
2010-04-15 23:41                                 ` Suleiman Souhlal
2010-04-16  9:50                               ` Alan Cox
2010-04-16  9:50                                 ` Alan Cox
2010-04-17  3:06                                 ` Dave Chinner
2010-04-17  3:06                                   ` Dave Chinner
2010-04-15  8:18                       ` KOSAKI Motohiro
2010-04-15  8:18                         ` KOSAKI Motohiro
2010-04-15  8:18                         ` KOSAKI Motohiro
2010-04-15 10:31                       ` Mel Gorman
2010-04-15 10:31                         ` Mel Gorman
2010-04-15 11:26                         ` KOSAKI Motohiro
2010-04-15 11:26                           ` KOSAKI Motohiro
2010-04-15  4:13                     ` [PATCH 2/4] vmscan: kill prev_priority completely KOSAKI Motohiro
2010-04-15  4:13                       ` KOSAKI Motohiro
2010-04-15  4:13                       ` KOSAKI Motohiro
2010-04-15  4:14                     ` [PATCH 3/4] vmscan: move priority variable into scan_control KOSAKI Motohiro
2010-04-15  4:14                       ` KOSAKI Motohiro
2010-04-15  4:14                       ` KOSAKI Motohiro
2010-04-15  4:15                     ` [PATCH 4/4] vmscan: delegate page cleaning io to flusher thread if VM pressure is low KOSAKI Motohiro
2010-04-15  4:15                       ` KOSAKI Motohiro
2010-04-15  4:15                       ` KOSAKI Motohiro
2010-04-15  4:35                     ` [PATCH] mm: disallow direct reclaim page writeback KOSAKI Motohiro
2010-04-15  4:35                       ` KOSAKI Motohiro
2010-04-15  6:32                       ` Dave Chinner
2010-04-15  6:32                         ` Dave Chinner
2010-04-15  6:44                         ` KOSAKI Motohiro
2010-04-15  6:44                           ` KOSAKI Motohiro
2010-04-15  6:58                           ` Dave Chinner
2010-04-15  6:58                             ` Dave Chinner
2010-04-15  6:20                     ` Dave Chinner
2010-04-15  6:20                       ` Dave Chinner
2010-04-15  6:35                       ` KOSAKI Motohiro
2010-04-15  6:35                         ` KOSAKI Motohiro
2010-04-15  8:54                         ` Dave Chinner
2010-04-15  8:54                           ` Dave Chinner
2010-04-15 10:21                           ` KOSAKI Motohiro
2010-04-15 10:21                             ` KOSAKI Motohiro
2010-04-15 10:23                             ` [PATCH 1/4] vmscan: simplify shrink_inactive_list() KOSAKI Motohiro
2010-04-15 10:23                               ` KOSAKI Motohiro
2010-04-15 13:15                               ` Mel Gorman
2010-04-15 13:15                                 ` Mel Gorman
2010-04-15 15:01                                 ` Andi Kleen
2010-04-15 15:01                                   ` Andi Kleen
2010-04-15 15:01                                   ` Andi Kleen
2010-04-15 15:44                                   ` Mel Gorman
2010-04-15 15:44                                     ` Mel Gorman
2010-04-15 16:54                                     ` Andi Kleen
2010-04-15 16:54                                       ` Andi Kleen
2010-04-15 23:40                                       ` Dave Chinner
2010-04-15 23:40                                         ` Dave Chinner
2010-04-16  7:13                                         ` Andi Kleen
2010-04-16  7:13                                           ` Andi Kleen
2010-04-16 14:57                                         ` Mel Gorman
2010-04-16 14:57                                           ` Mel Gorman
2010-04-17  2:37                                           ` Dave Chinner
2010-04-17  2:37                                             ` Dave Chinner
2010-04-16 14:55                                       ` Mel Gorman
2010-04-16 14:55                                         ` Mel Gorman
2010-04-15 18:22                                 ` Valdis.Kletnieks
2010-04-16  9:39                                   ` Mel Gorman
2010-04-16  9:39                                     ` Mel Gorman
2010-04-15 10:24                             ` [PATCH 2/4] [cleanup] mm: introduce free_pages_prepare KOSAKI Motohiro
2010-04-15 10:24                               ` KOSAKI Motohiro
2010-04-15 10:24                               ` KOSAKI Motohiro
2010-04-15 13:33                               ` Mel Gorman
2010-04-15 13:33                                 ` Mel Gorman
2010-04-15 10:24                             ` [PATCH 3/4] mm: introduce free_pages_bulk KOSAKI Motohiro
2010-04-15 10:24                               ` KOSAKI Motohiro
2010-04-15 10:24                               ` KOSAKI Motohiro
2010-04-15 13:46                               ` Mel Gorman
2010-04-15 13:46                                 ` Mel Gorman
2010-04-15 10:26                             ` [PATCH 4/4] vmscan: replace the pagevec in shrink_inactive_list() with list KOSAKI Motohiro
2010-04-15 10:26                               ` KOSAKI Motohiro
2010-04-15 10:28                   ` [PATCH] mm: disallow direct reclaim page writeback Mel Gorman
2010-04-15 10:28                     ` Mel Gorman
2010-04-15 13:42                     ` Chris Mason
2010-04-15 13:42                       ` Chris Mason
2010-04-15 17:50                       ` tytso
2010-04-15 17:50                       ` tytso
2010-04-15 17:50                         ` tytso
2010-04-16 15:05                       ` Mel Gorman
2010-04-16 15:05                         ` Mel Gorman
2010-04-19 15:15                         ` Mel Gorman
2010-04-19 15:15                         ` Mel Gorman
2010-04-19 15:15                           ` Mel Gorman
2010-04-19 17:38                           ` Chris Mason
2010-04-16 15:05                       ` Mel Gorman
2010-04-16  4:14                     ` Dave Chinner
2010-04-16  4:14                       ` Dave Chinner
2010-04-16 15:14                       ` Mel Gorman
2010-04-16 15:14                         ` Mel Gorman
2010-04-18  0:32                         ` Andrew Morton
2010-04-18  0:32                           ` Andrew Morton
2010-04-18 19:05                           ` Christoph Hellwig
2010-04-18 19:05                             ` Christoph Hellwig
2010-04-18 16:31                             ` Andrew Morton
2010-04-18 16:31                               ` Andrew Morton
2010-04-18 19:35                               ` Christoph Hellwig
2010-04-18 19:35                                 ` Christoph Hellwig
2010-04-18 19:11                             ` Sorin Faibish
2010-04-18 19:11                               ` Sorin Faibish
2010-04-18 19:11                               ` Sorin Faibish
2010-04-18 19:10                           ` Sorin Faibish
2010-04-18 19:10                             ` Sorin Faibish
2010-04-18 19:10                             ` Sorin Faibish
2010-04-18 21:30                             ` James Bottomley
2010-04-18 21:30                               ` James Bottomley
2010-04-18 23:34                               ` Sorin Faibish
2010-04-18 23:34                                 ` Sorin Faibish
2010-04-18 23:34                                 ` Sorin Faibish
2010-04-19  3:08                               ` tytso
2010-04-19  3:08                                 ` tytso
2010-04-19  0:35                           ` Dave Chinner
2010-04-19  0:35                             ` Dave Chinner
2010-04-19  0:49                             ` Arjan van de Ven
2010-04-19  0:49                               ` Arjan van de Ven
2010-04-19  1:08                               ` Dave Chinner
2010-04-19  1:08                                 ` Dave Chinner
2010-04-19  4:32                                 ` Arjan van de Ven
2010-04-19  4:32                                   ` Arjan van de Ven
2010-04-19 15:20                         ` Mel Gorman
2010-04-19 15:20                           ` Mel Gorman
2010-04-23  1:06                           ` Dave Chinner
2010-04-23  1:06                             ` Dave Chinner
2010-04-23 10:50                             ` Mel Gorman
2010-04-23 10:50                               ` Mel Gorman
2010-04-15 14:57                   ` Andi Kleen
2010-04-15 14:57                     ` Andi Kleen
2010-04-15  2:37                 ` Johannes Weiner
2010-04-15  2:37                   ` Johannes Weiner
2010-04-15  2:43                   ` KOSAKI Motohiro
2010-04-15  2:43                     ` KOSAKI Motohiro
2010-04-16 23:56                     ` Johannes Weiner
2010-04-16 23:56                       ` Johannes Weiner
2010-04-14  6:52         ` KOSAKI Motohiro
2010-04-14  6:52           ` KOSAKI Motohiro
2010-04-14 10:06         ` Andi Kleen
2010-04-14 10:06           ` Andi Kleen
2010-04-14 10:06           ` Andi Kleen
2010-04-14 11:20           ` Chris Mason
2010-04-14 11:20             ` Chris Mason
2010-04-14 12:15             ` Andi Kleen
2010-04-14 12:15               ` Andi Kleen
2010-04-14 12:15               ` Andi Kleen
2010-04-14 12:32               ` Alan Cox
2010-04-14 12:32                 ` Alan Cox
2010-04-14 12:34                 ` Andi Kleen
2010-04-14 12:34                   ` Andi Kleen
2010-04-14 13:23             ` Mel Gorman
2010-04-14 13:23               ` Mel Gorman
2010-04-14 14:07               ` Chris Mason
2010-04-14 14:07                 ` Chris Mason
2010-04-14  0:24 ` Minchan Kim
2010-04-14  0:24   ` Minchan Kim
2010-04-14  4:44   ` Dave Chinner
2010-04-14  4:44     ` Dave Chinner
2010-04-14  7:54     ` Minchan Kim
2010-04-14  7:54       ` Minchan Kim
2010-04-16  1:13 ` KAMEZAWA Hiroyuki
2010-04-16  1:13   ` KAMEZAWA Hiroyuki
2010-04-16  4:18   ` KAMEZAWA Hiroyuki
2010-04-16  4:18     ` KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100414031205.GE2493@dastard \
    --to=david@fromorbit.com \
    --cc=chris.mason@oracle.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.