linux-kernel.vger.kernel.org archive mirror
* [PATCH][5/?] count writeback pages in nr_scanned
@ 2005-01-03 17:25 Rik van Riel
  2005-01-05 10:08 ` Andrew Morton
  2005-01-05 23:26 ` Andrew Morton
  0 siblings, 2 replies; 43+ messages in thread
From: Rik van Riel @ 2005-01-03 17:25 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Andrea Arcangeli, Marcelo Tosatti

Still untested, but posting the concept here anyway, since this
could explain a lot...

OOM kills have been observed with 70% of the pages in lowmem being
in the writeback state.  If we count those pages in sc->nr_scanned,
the VM should throttle and wait for IO completion, instead of OOM
killing.

Signed-off-by: Rik van Riel <riel@redhat.com>

--- linux-2.6.9/mm/vmscan.c.screclaim	2005-01-03 12:17:56.547148905 -0500
+++ linux-2.6.9/mm/vmscan.c	2005-01-03 12:18:16.855965416 -0500
@@ -376,10 +376,10 @@

  		BUG_ON(PageActive(page));

+		sc->nr_scanned++;
  		if (PageWriteback(page))
  			goto keep_locked;

-		sc->nr_scanned++;
  		/* Double the slab pressure for mapped and swapcache pages */
  		if (page_mapped(page) || PageSwapCache(page))
  			sc->nr_scanned++;
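The feedback loop this patch restores can be modelled in a few lines of plain userspace C. This is a sketch, not kernel code: only the nr_scanned accounting mirrors the patch, while the threshold and control flow are deliberately simplified stand-ins for the try_to_free_pages() throttling logic.

```c
/* Userspace simulation (not kernel code) of the throttling feedback the
 * patch restores.  Names mirror mm/vmscan.c, but the threshold and
 * control flow are a simplified model. */
struct scan_control { int nr_scanned; };

enum scan_result { SCAN_OOM, SCAN_THROTTLED };

/* Scan nr_pages pages, of which nr_writeback are under writeback.
 * With count_writeback == 0 (the pre-patch behaviour), writeback pages
 * are skipped before nr_scanned is bumped. */
enum scan_result scan_model(int nr_pages, int nr_writeback, int count_writeback)
{
    struct scan_control sc = { 0 };
    int throttle_threshold = nr_pages / 2;   /* model value only */
    int i;

    for (i = 0; i < nr_pages; i++) {
        int under_writeback = (i < nr_writeback);

        if (count_writeback)
            sc.nr_scanned++;
        if (under_writeback)
            continue;                        /* "goto keep_locked" */
        if (!count_writeback)
            sc.nr_scanned++;
    }
    /*
     * The caller throttles (waits for I/O completion) only when enough
     * pages were counted as scanned; otherwise it concludes no progress
     * can be made and falls through toward the OOM killer.
     */
    return sc.nr_scanned >= throttle_threshold ? SCAN_THROTTLED : SCAN_OOM;
}
```

With 70% of pages under writeback, as in the reported OOM kills, the pre-patch accounting counts too few scanned pages and heads for OOM, while counting writeback pages trips the throttle instead.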

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-03 17:25 [PATCH][5/?] count writeback pages in nr_scanned Rik van Riel
@ 2005-01-05 10:08 ` Andrew Morton
  2005-01-05 18:06   ` Andrea Arcangeli
  2005-01-05 23:26 ` Andrew Morton
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-05 10:08 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, andrea, marcelo.tosatti

Rik van Riel <riel@redhat.com> wrote:
>
> Still untested, but posting the concept here anyway, since this
>  could explain a lot...
> 
>  OOM kills have been observed with 70% of the pages in lowmem being
>  in the writeback state.  If we count those pages in sc->nr_scanned,
>  the VM should throttle and wait for IO completion, instead of OOM
>  killing.
> 
>  Signed-off-by: Rik van Riel <riel@redhat.com>
> 
>  --- linux-2.6.9/mm/vmscan.c.screclaim	2005-01-03 12:17:56.547148905 -0500
>  +++ linux-2.6.9/mm/vmscan.c	2005-01-03 12:18:16.855965416 -0500
>  @@ -376,10 +376,10 @@
> 
>    		BUG_ON(PageActive(page));
> 
>  +		sc->nr_scanned++;
>    		if (PageWriteback(page))
>    			goto keep_locked;
> 
>  -		sc->nr_scanned++;

Patch looks very sane.  It in fact restores that which we were doing until
12 June 2004, when the rampant `struct scan_control' depredations violated
the tree.


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 18:50     ` Rik van Riel
@ 2005-01-05 17:49       ` Marcelo Tosatti
  2005-01-05 21:44         ` Andrew Morton
  0 siblings, 1 reply; 43+ messages in thread
From: Marcelo Tosatti @ 2005-01-05 17:49 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Andrea Arcangeli, Andrew Morton, linux-kernel

On Wed, Jan 05, 2005 at 01:50:51PM -0500, Rik van Riel wrote:
> On Wed, 5 Jan 2005, Andrea Arcangeli wrote:
> 
> >Another unrelated problem I have in this same area and that can explain
> >VM troubles at least theoretically, is that blk_congestion_wait is
> >broken by design. First we cannot wait on random I/O not related to
> >write back. Second blk_congestion_wait gets trivially fooled by
> >direct-io for example. Plus the timeout may cause it to return too early
> >with slow blkdev.
> 
> Or the IO that just finished, finished for pages in
> another memory zone, or pages we won't scan again in
> our current go-around through the VM...

The thing is, at the block level there is no distinction between the
purposes for which pages were written out.

One could imagine the following: a per-zone waitqueue, awakened from
end_page_writeback() (for PG_reclaim pages only, of course), and a function
to wait on the per-zone waitqueue:

 wait_vm_writeback (zone, timeout);

This would replace the current blk_congestion_wait() in try_to_free_pages()/balance_pgdat().

It will probably slow the reclaiming procedure in general, though, and has
other side effects, but it's the only way of strictly following VM writeback
progress from the VM reclaiming routines.

Does it make any sense?

> >blk_congestion_wait is a fundamental piece to get oom detection right
> >during writeback and unfortunately it's fundamentally fragile in 2.6
> >(this as usual wasn't the case in 2.4).
> 
> Indeed ;(
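The per-zone waitqueue proposed above can be modelled in userspace with pthreads. All names here (zone_waitqueue, zone_writeback_done, the last_seen parameter) are hypothetical illustrations, not the kernel's API; zone_writeback_done() stands in for end_page_writeback() on a PG_reclaim page.

```c
/* Userspace model (pthreads; all names hypothetical) of a per-zone
 * writeback waitqueue: reclaim waits for a completion in *this* zone
 * only, with an absolute deadline. */
#include <pthread.h>
#include <errno.h>
#include <time.h>

struct zone_waitqueue {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             completed;   /* completions seen in this zone */
};

/* Stand-in for end_page_writeback() on a PG_reclaim page. */
void zone_writeback_done(struct zone_waitqueue *zq)
{
    pthread_mutex_lock(&zq->lock);
    zq->completed++;
    pthread_cond_broadcast(&zq->cond);
    pthread_mutex_unlock(&zq->lock);
}

/* Returns 0 once zq->completed moves past last_seen, ETIMEDOUT at the
 * deadline.  Unlike blk_congestion_wait(), a completion in some other
 * zone cannot wake this sleeper. */
int wait_vm_writeback(struct zone_waitqueue *zq, int last_seen,
                      const struct timespec *deadline)
{
    int err = 0;
    pthread_mutex_lock(&zq->lock);
    while (zq->completed == last_seen && err == 0)
        err = pthread_cond_timedwait(&zq->cond, &zq->lock, deadline);
    pthread_mutex_unlock(&zq->lock);
    return err == ETIMEDOUT ? ETIMEDOUT : 0;
}
```

The trade-off noted above still shows through: the sleeper only sees completions for its own zone, so reclaim may wait longer overall even though the wakeup is precise.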


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 10:08 ` Andrew Morton
@ 2005-01-05 18:06   ` Andrea Arcangeli
  2005-01-05 18:50     ` Rik van Riel
  0 siblings, 1 reply; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-05 18:06 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Rik van Riel, linux-kernel, marcelo.tosatti

On Wed, Jan 05, 2005 at 02:08:59AM -0800, Andrew Morton wrote:
> Rik van Riel <riel@redhat.com> wrote:
> >
> > Still untested, but posting the concept here anyway, since this
> >  could explain a lot...
> > 
> >  OOM kills have been observed with 70% of the pages in lowmem being
> >  in the writeback state.  If we count those pages in sc->nr_scanned,
> >  the VM should throttle and wait for IO completion, instead of OOM
> >  killing.
> > 
> >  Signed-off-by: Rik van Riel <riel@redhat.com>
> > 
> >  --- linux-2.6.9/mm/vmscan.c.screclaim	2005-01-03 12:17:56.547148905 -0500
> >  +++ linux-2.6.9/mm/vmscan.c	2005-01-03 12:18:16.855965416 -0500
> >  @@ -376,10 +376,10 @@
> > 
> >    		BUG_ON(PageActive(page));
> > 
> >  +		sc->nr_scanned++;
> >    		if (PageWriteback(page))
> >    			goto keep_locked;
> > 
> >  -		sc->nr_scanned++;
> 
> Patch looks very sane.  It in fact restores that which we were doing until
> 12 June 2004, when the rampant `struct scan_control' depredations violated
> the tree.

Agreed.

Another unrelated problem I have in this same area and that can explain
VM troubles at least theoretically, is that blk_congestion_wait is
broken by design. First we cannot wait on random I/O not related to
write back. Second blk_congestion_wait gets trivially fooled by
direct-io for example. Plus the timeout may cause it to return too early
with slow blkdev.

blk_congestion_wait is a fundamental piece to get oom detection right
during writeback and unfortunately it's fundamentally fragile in 2.6
(this as usual wasn't the case in 2.4).


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 18:06   ` Andrea Arcangeli
@ 2005-01-05 18:50     ` Rik van Riel
  2005-01-05 17:49       ` Marcelo Tosatti
  0 siblings, 1 reply; 43+ messages in thread
From: Rik van Riel @ 2005-01-05 18:50 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Andrew Morton, linux-kernel, marcelo.tosatti

On Wed, 5 Jan 2005, Andrea Arcangeli wrote:

> Another unrelated problem I have in this same area and that can explain
> VM troubles at least theoretically, is that blk_congestion_wait is
> broken by design. First we cannot wait on random I/O not related to
> write back. Second blk_congestion_wait gets trivially fooled by
> direct-io for example. Plus the timeout may cause it to return too early
> with slow blkdev.

Or the IO that just finished, finished for pages in
another memory zone, or pages we won't scan again in
our current go-around through the VM...

> blk_congestion_wait is a fundamental piece to get oom detection right
> during writeback and unfortunately it's fundamentally fragile in 2.6
> (this as usual wasn't the case in 2.4).

Indeed ;(

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 21:44         ` Andrew Morton
@ 2005-01-05 20:32           ` Marcelo Tosatti
  2005-01-05 23:51             ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Marcelo Tosatti @ 2005-01-05 20:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: riel, andrea, linux-kernel


> The caller would need to wait on all the zones which can satisfy the
> caller's allocation request.  A bit messy, although not rocket science. 
> One would have to be careful to avoid additional CPU consumption due to
> delivery of multiple wakeups at each I/O completion.
> 
> We should be able to demonstrate that such a change really fixes some
> problem though.  Otherwise, why bother?

Agreed. The current scheme works well enough; we don't have spurious OOM kills
anymore, which is the only "problem" such a change ought to fix.

You might see a performance increase in some situations, I believe (because
of the per-zone waitqueues), but I agree it does not seem to be worth the
trouble.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 17:49       ` Marcelo Tosatti
@ 2005-01-05 21:44         ` Andrew Morton
  2005-01-05 20:32           ` Marcelo Tosatti
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-05 21:44 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: riel, andrea, linux-kernel

Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> On Wed, Jan 05, 2005 at 01:50:51PM -0500, Rik van Riel wrote:
> > On Wed, 5 Jan 2005, Andrea Arcangeli wrote:
> > 
> > >Another unrelated problem I have in this same area and that can explain
> > >VM troubles at least theoretically, is that blk_congestion_wait is
> > >broken by design. First we cannot wait on random I/O not related to
> > >write back. Second blk_congestion_wait gets trivially fooled by
> > >direct-io for example. Plus the timeout may cause it to return too early
> > >with slow blkdev.

That's true, as we discussed a couple of months back.  But the current code
is nice and simple and has been there for a couple of years with no
observed problems.


> > Or the IO that just finished, finished for pages in
> > another memory zone, or pages we won't scan again in
> > our current go-around through the VM...
> 
> The thing is, at the block level there is no distinction between the
> purposes for which pages were written out.
> 
> One could imagine the following: a per-zone waitqueue, awakened from
> end_page_writeback() (for PG_reclaim pages only, of course), and a function
> to wait on the per-zone waitqueue:
> 
>  wait_vm_writeback (zone, timeout);
> 
> This would replace the current blk_congestion_wait() in try_to_free_pages()/balance_pgdat().
> 

The caller would need to wait on all the zones which can satisfy the
caller's allocation request.  A bit messy, although not rocket science. 
One would have to be careful to avoid additional CPU consumption due to
delivery of multiple wakeups at each I/O completion.

We should be able to demonstrate that such a change really fixes some
problem though.  Otherwise, why bother?




* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-03 17:25 [PATCH][5/?] count writeback pages in nr_scanned Rik van Riel
  2005-01-05 10:08 ` Andrew Morton
@ 2005-01-05 23:26 ` Andrew Morton
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Morton @ 2005-01-05 23:26 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, andrea, marcelo.tosatti

Rik van Riel <riel@redhat.com> wrote:
>
> OOM kills have been observed with 70% of the pages in lowmem being
>  in the writeback state.  If we count those pages in sc->nr_scanned,
>  the VM should throttle and wait for IO completion, instead of OOM
>  killing.

I'll queue this up:



From: Rik van Riel <riel@redhat.com>

OOM kills have been observed with 70% of the pages in lowmem being in the
writeback state.  If we count those pages in sc->nr_scanned, the VM should
throttle and wait for IO completion, instead of OOM killing.

(akpm: this is how the code was designed to work - we broke it six months
ago).

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/mm/vmscan.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff -puN mm/vmscan.c~vmscan-count-writeback-pages-in-nr_scanned mm/vmscan.c
--- 25/mm/vmscan.c~vmscan-count-writeback-pages-in-nr_scanned	2005-01-05 15:24:53.730874336 -0800
+++ 25-akpm/mm/vmscan.c	2005-01-05 15:25:48.286580608 -0800
@@ -369,14 +369,14 @@ static int shrink_list(struct list_head 
 
 		BUG_ON(PageActive(page));
 
-		if (PageWriteback(page))
-			goto keep_locked;
-
 		sc->nr_scanned++;
 		/* Double the slab pressure for mapped and swapcache pages */
 		if (page_mapped(page) || PageSwapCache(page))
 			sc->nr_scanned++;
 
+		if (PageWriteback(page))
+			goto keep_locked;
+
 		referenced = page_referenced(page, 1, sc->priority <= 0);
 		/* In active use or really unfreeable?  Activate it. */
 		if (referenced && page_mapping_inuse(page))
_



* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 20:32           ` Marcelo Tosatti
@ 2005-01-05 23:51             ` Nick Piggin
  2005-01-06  1:27               ` Rik van Riel
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-05 23:51 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Andrew Morton, riel, andrea, linux-kernel

Marcelo Tosatti wrote:
>>The caller would need to wait on all the zones which can satisfy the
>>caller's allocation request.  A bit messy, although not rocket science. 
>>One would have to be careful to avoid additional CPU consumption due to
>>delivery of multiple wakeups at each I/O completion.
>>
>>We should be able to demonstrate that such a change really fixes some
>>problem though.  Otherwise, why bother?
> 
> 
> Agreed. The current scheme works well enough; we don't have spurious OOM kills
> anymore, which is the only "problem" such a change ought to fix.
> 
> You might see a performance increase in some situations, I believe (because
> of the per-zone waitqueues), but I agree it does not seem to be worth the
> trouble.

I think what Andrea is worried about is that blk_congestion_wait is
fairly vague, and can be a source of instability in the scanning
implementation.

For example, if you have a heavy IO workload that is saturating your
disks, blk_congestion_wait may do the right thing and sleep until
they become uncongested and writeout can continue.

But at 2:00 am, when your backup job is trickling writes into another
block device, blk_congestion_wait returns much earlier, and before
many pages have been cleaned.

Bad example? Yeah maybe, but I think this is what Andrea is getting
at. Would it be a problem to replace those blk_congestion_waits with
unconditional io_schedule_timeout()s? That would be the
dumb-but-more-deterministic solution.
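The 2:00 am scenario above can be reduced to a toy model. This is an illustration in plain C with abstract time units, not the kernel's implementation; the function names are invented stand-ins for the two primitives being compared.

```c
/* Toy model of the two throttling primitives (times are abstract
 * "ticks"; an illustration, not the kernel's implementation). */

/* blk_congestion_wait-like: wakes when the timeout expires OR when any
 * write request completes -- even one from an unrelated device, such as
 * a backup job trickling writes elsewhere. */
int congestion_wait_model(int timeout, int next_unrelated_completion)
{
    return next_unrelated_completion < timeout
            ? next_unrelated_completion      /* woken early */
            : timeout;
}

/* io_schedule_timeout-like: always sleeps the full timeout, so the
 * reclaim loop gets a deterministic breather regardless of other I/O. */
int schedule_timeout_model(int timeout)
{
    return timeout;
}
```

With an unrelated device completing a request every 2 ticks, the congestion wait returns almost at once even though we asked for 100; the plain timeout always sleeps the full 100.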


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-05 23:51             ` Nick Piggin
@ 2005-01-06  1:27               ` Rik van Riel
  2005-01-06  1:33                 ` Nick Piggin
  2005-01-06  1:36                 ` Andrew Morton
  0 siblings, 2 replies; 43+ messages in thread
From: Rik van Riel @ 2005-01-06  1:27 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Marcelo Tosatti, Andrew Morton, andrea, linux-kernel

On Thu, 6 Jan 2005, Nick Piggin wrote:

> I think what Andrea is worried about is that blk_congestion_wait is 
> fairly vague, and can be a source of instability in the scanning 
> implementation.

The recent OOM kill problem has been happening:
1) with cache pressure on lowmem only, due to a block device write
2) with no block congestion at all
3) with pretty much all pageable lowmem pages in writeback state

It appears the VM has trouble dealing with the situation where
there is no block congestion to wait on...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:27               ` Rik van Riel
@ 2005-01-06  1:33                 ` Nick Piggin
  2005-01-06  1:37                   ` Andrew Morton
  2005-01-06  1:36                 ` Andrew Morton
  1 sibling, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  1:33 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Marcelo Tosatti, Andrew Morton, andrea, linux-kernel

Rik van Riel wrote:
> On Thu, 6 Jan 2005, Nick Piggin wrote:
> 
>> I think what Andrea is worried about is that blk_congestion_wait is 
>> fairly vague, and can be a source of instability in the scanning 
>> implementation.
> 
> 
> The recent OOM kill problem has been happening:
> 1) with cache pressure on lowmem only, due to a block device write
> 2) with no block congestion at all
> 3) with pretty much all pageable lowmem pages in writeback state
> 
> It appears the VM has trouble dealing with the situation where
> there is no block congestion to wait on...
> 

Try, together with your nr_scanned patch, to replace blk_congestion_wait
with io_schedule_timeout.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:27               ` Rik van Riel
  2005-01-06  1:33                 ` Nick Piggin
@ 2005-01-06  1:36                 ` Andrew Morton
  2005-01-06  3:42                   ` Rik van Riel
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  1:36 UTC (permalink / raw)
  To: Rik van Riel; +Cc: nickpiggin, marcelo.tosatti, andrea, linux-kernel

Rik van Riel <riel@redhat.com> wrote:
>
> On Thu, 6 Jan 2005, Nick Piggin wrote:
> 
> > I think what Andrea is worried about is that blk_congestion_wait is 
> > fairly vague, and can be a source of instability in the scanning 
> > implementation.
> 
> The recent OOM kill problem has been happening:
> 1) with cache pressure on lowmem only, due to a block device write
> 2) with no block congestion at all
> 3) with pretty much all pageable lowmem pages in writeback state

You must have a wild number of requests configured in the queue.  Is this
CFQ?

I've done testing with "all of memory under writeback" before and it went
OK.  It's certainly a design objective to handle this well.  But that
testing was before we broke it.

> It appears the VM has trouble dealing with the situation where
> there is no block congestion to wait on...

It's misnamed.  We don't "wait for the queue to come out of congestion". 
We simply throttle until a write completes, or, rarely, timeout.

The bug which you fixed would cause the VM to scan itself to death without
throttling.  Did the fix fix things?


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:33                 ` Nick Piggin
@ 2005-01-06  1:37                   ` Andrew Morton
  2005-01-06  1:40                     ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  1:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: riel, marcelo.tosatti, andrea, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> Rik van Riel wrote:
> > On Thu, 6 Jan 2005, Nick Piggin wrote:
> > 
> >> I think what Andrea is worried about is that blk_congestion_wait is 
> >> fairly vague, and can be a source of instability in the scanning 
> >> implementation.
> > 
> > 
> > The recent OOM kill problem has been happening:
> > 1) with cache pressure on lowmem only, due to a block device write
> > 2) with no block congestion at all
> > 3) with pretty much all pageable lowmem pages in writeback state
> > 
> > It appears the VM has trouble dealing with the situation where
> > there is no block congestion to wait on...
> > 
> 
> Try, together with your nr_scanned patch, to replace blk_congestion_wait
> with io_schedule_timeout.

Why?  Is the nr_scanned fix insufficient?


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:37                   ` Andrew Morton
@ 2005-01-06  1:40                     ` Nick Piggin
  2005-01-06  1:52                       ` Andrea Arcangeli
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  1:40 UTC (permalink / raw)
  To: Andrew Morton; +Cc: riel, marcelo.tosatti, andrea, linux-kernel

Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>>Rik van Riel wrote:
>>
>>>On Thu, 6 Jan 2005, Nick Piggin wrote:
>>>
>>>
>>>>I think what Andrea is worried about is that blk_congestion_wait is 
>>>>fairly vague, and can be a source of instability in the scanning 
>>>>implementation.
>>>
>>>
>>>The recent OOM kill problem has been happening:
>>>1) with cache pressure on lowmem only, due to a block device write
>>>2) with no block congestion at all
>>>3) with pretty much all pageable lowmem pages in writeback state
>>>
>>>It appears the VM has trouble dealing with the situation where
>>>there is no block congestion to wait on...
>>>
>>
>>Try, together with your nr_scanned patch, to replace blk_congestion_wait
>>with io_schedule_timeout.
> 
> 
> Why?  Is the nr_scanned fix insufficient?
> 

I thought it sounded like he implied that nr_scanned was insufficient
(otherwise he might have said "to wait on ... but my patch fixes it").




* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:40                     ` Nick Piggin
@ 2005-01-06  1:52                       ` Andrea Arcangeli
  0 siblings, 0 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  1:52 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

On Thu, Jan 06, 2005 at 12:40:47PM +1100, Nick Piggin wrote:
> I thought it sounded like he implied that nr_scanned was insufficient
> (otherwise he might have said "to wait on ... but my patch fixes it").

BTW, for my part I can't reproduce it (even without the nr_scanned
patch, which I'm going to apply anyway, just in case), but my hardware
may have different timings, I don't know.

All I could reproduce were the swap-token issues (I had a flood of
reports about those too, all fixed by now with my 1-6/4 patches).  The
other patches agreed on here look good, but I don't have actual pending
bugs on that side, and I can't reproduce problems in basic testing
either, so they're applied now for correctness.

Thanks.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  1:36                 ` Andrew Morton
@ 2005-01-06  3:42                   ` Rik van Riel
  2005-01-06  3:50                     ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Rik van Riel @ 2005-01-06  3:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nickpiggin, marcelo.tosatti, andrea, linux-kernel

On Wed, 5 Jan 2005, Andrew Morton wrote:
> Rik van Riel <riel@redhat.com> wrote:

>> The recent OOM kill problem has been happening:
>> 1) with cache pressure on lowmem only, due to a block device write
>> 2) with no block congestion at all
>> 3) with pretty much all pageable lowmem pages in writeback state
>
> You must have a wild number of requests configured in the queue.  Is 
> this CFQ?

Yes, it is with CFQ.  Around 650MB of lowmem is in the writeback
state, which is over 99% of the active and inactive lowmem
pages...

> I've done testing with "all of memory under writeback" before and it 
> went OK.  It's certainly a design objective to handle this well.  But 
> that testing was before we broke it.

I suspect something might still be broken.  It may take a few
days of continuous testing to trigger the bug, though ...

> The bug which you fixed would cause the VM to scan itself to death 
> without throttling.  Did the fix fix things?

Mostly fixed.  Things are down to the point where it often
takes over a day to bring out the bug.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  3:42                   ` Rik van Riel
@ 2005-01-06  3:50                     ` Nick Piggin
  2005-01-06  4:26                       ` Andrew Morton
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  3:50 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Andrew Morton, marcelo.tosatti, andrea, linux-kernel

Rik van Riel wrote:
> On Wed, 5 Jan 2005, Andrew Morton wrote:
> 
>> Rik van Riel <riel@redhat.com> wrote:
> 
> 
>>> The recent OOM kill problem has been happening:
>>> 1) with cache pressure on lowmem only, due to a block device write
>>> 2) with no block congestion at all
>>> 3) with pretty much all pageable lowmem pages in writeback state
>>
>>
>> You must have a wild number of requests configured in the queue.  Is 
>> this CFQ?
> 
> 
> Yes, it is with CFQ.  Around 650MB of lowmem is in the writeback
> state, which is over 99% of the active and inactive lowmem
> pages...
> 
>> I've done testing with "all of memory under writeback" before and it 
>> went OK.  It's certainly a design objective to handle this well.  But 
>> that testing was before we broke it.
> 
> 
> I suspect something might still be broken.  It may take a few
> days of continuous testing to trigger the bug, though ...
> 

It is possible that it's those blk_congestion_wait paths, because
the queue simply won't be congested.  So doing io_schedule_timeout
might help.

I wonder if reducing the size of the write queue in CFQ would help
too? IIRC, it only really wants a huge read queue.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  3:50                     ` Nick Piggin
@ 2005-01-06  4:26                       ` Andrew Morton
  2005-01-06  4:35                         ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  4:26 UTC (permalink / raw)
  To: Nick Piggin; +Cc: riel, marcelo.tosatti, andrea, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> > 
> > I suspect something might still be broken.  It may take a few
> > days of continuous testing to trigger the bug, though ...
> > 
> 
> It is possible that it's those blk_congestion_wait paths, because
> the queue simply won't be congested.  So doing io_schedule_timeout
> might help.

If the queue is not congested, blk_congestion_wait() will still sleep.  See
freed_request().

> I wonder if reducing the size of the write queue in CFQ would help
> too? IIRC, it only really wants a huge read queue.

Surely it will help - but we need to be able to handle the situation
because memory can still become full of PageWriteback pages if there are
many disks.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:26                       ` Andrew Morton
@ 2005-01-06  4:35                         ` Nick Piggin
  2005-01-06  4:47                           ` Andrew Morton
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  4:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: riel, marcelo.tosatti, andrea, linux-kernel

Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>>>I suspect something might still be broken.  It may take a few
>>>days of continuous testing to trigger the bug, though ...
>>>
>>
>>It is possible that it's those blk_congestion_wait paths, because
>>the queue simply won't be congested.  So doing io_schedule_timeout
>>might help.
> 
> 
> If the queue is not congested, blk_congestion_wait() will still sleep.  See
> freed_request().
> 

Hmm... doesn't look like it to me:

         if (rl->count[rw] < queue_congestion_off_threshold(q))
                 clear_queue_congested(q, rw);

And clear_queue_congested does an unconditional wakeup (if there
is someone sleeping on the congestion queue).

> 
>>I wonder if reducing the size of the write queue in CFQ would help
>>too? IIRC, it only really wants a huge read queue.
> 
> 
> Surely it will help - but we need to be able to handle the situation
> because memory can still become full of PageWriteback pages if there are
> many disks.
> 

Yep.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:35                         ` Nick Piggin
@ 2005-01-06  4:47                           ` Andrew Morton
  2005-01-06  4:55                             ` Nick Piggin
  2005-01-06  4:59                             ` [PATCH][5/?] count writeback pages in nr_scanned Andrea Arcangeli
  0 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  4:47 UTC (permalink / raw)
  To: Nick Piggin; +Cc: riel, marcelo.tosatti, andrea, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
>  > If the queue is not congested, blk_congestion_wait() will still sleep.  See
>  > freed_request().
>  > 
> 
>  Hmm... doesn't look like it to me:
> 
>           if (rl->count[rw] < queue_congestion_off_threshold(q))
>                   clear_queue_congested(q, rw);
> 
>  And clear_queue_congested does an unconditional wakeup (if there
>  is someone sleeping on the congestion queue).

That's my point.  blk_congestion_wait() will always sleep, regardless of
the queue's congestion state.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:47                           ` Andrew Morton
@ 2005-01-06  4:55                             ` Nick Piggin
  2005-01-06  5:03                               ` Andrea Arcangeli
  2005-01-06  8:06                               ` Jens Axboe
  2005-01-06  4:59                             ` [PATCH][5/?] count writeback pages in nr_scanned Andrea Arcangeli
  1 sibling, 2 replies; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  4:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: riel, marcelo.tosatti, andrea, linux-kernel, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 1052 bytes --]

Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>> > If the queue is not congested, blk_congestion_wait() will still sleep.  See
>> > freed_request().
>> > 
>>
>> Hmm... doesn't look like it to me:
>>
>>          if (rl->count[rw] < queue_congestion_off_threshold(q))
>>                  clear_queue_congested(q, rw);
>>
>> And clear_queue_congested does an unconditional wakeup (if there
>> is someone sleeping on the congestion queue).
> 
> 
> That's my point.  blk_congestion_wait() will always sleep, regardless of
> the queue's congestion state.
> 

Oh yes, but it will return as soon as a single request is finished.
Which is probably a couple of milliseconds, rather than the 100ms we
had hoped for.  So the allocators will wake up again, go around
the loop, and still make no progress.

However, if you had a plain io_schedule_timeout there, at least you
would sleep for the full extent of the specified timeout.
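This interaction can be compressed into a userspace model. The names echo ll_rw_blk.c but the bodies are simplified guesses at the flow under discussion, with completions and the timeout both measured in abstract iterations rather than real time.

```c
/* Userspace model -- not the real ll_rw_blk.c -- of the behaviour under
 * discussion: the congestion sleeper parks unconditionally, and every
 * request retired below the off-threshold delivers a wakeup. */
struct model_queue {
    int count;       /* in-flight write requests */
    int congested;
    int wakeups;     /* wakeups delivered to congestion sleepers */
};

#define MODEL_CONGESTION_OFF_THRESHOLD 4

/* freed_request(): retire one request and, once below the threshold,
 * clear congestion and wake sleepers (clear_queue_congested()). */
void model_freed_request(struct model_queue *q)
{
    q->count--;
    if (q->count < MODEL_CONGESTION_OFF_THRESHOLD) {
        q->congested = 0;
        q->wakeups++;
    }
}

/* blk_congestion_wait(): "sleeps" even when q->congested is already 0,
 * returning 1 as soon as any completion arrives, 0 on timeout. */
int model_blk_congestion_wait(struct model_queue *q, int timeout)
{
    int before = q->wakeups;
    int t;

    for (t = 0; t < timeout; t++) {
        if (q->count > 0)
            model_freed_request(q);   /* background I/O completing */
        if (q->wakeups != before)
            return 1;                 /* woken after one completion */
    }
    return 0;                         /* slept the full timeout */
}
```

Even with the queue uncongested from the start, the sleeper parks and is woken by the very first retirement; only with no I/O in flight at all does it sleep out the timeout, which is the rare case mentioned above.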

BTW. Jens, now that I look at freed_request, is the memory barrier
required? If so, what is it protecting?


[-- Attachment #2: blk-no-mb.patch --]
[-- Type: text/plain, Size: 1275 bytes --]



This memory barrier is not needed because the waitqueue will only get
waiters on it in the following situations:

rl->count has exceeded the threshold - however all manipulations of ->count
are performed under the queue lock, and so we will correctly pick up any
waiter.

Memory allocation for the request fails. In this case, there is no additional
help provided by the memory barrier. We are guaranteed to eventually wake
up waiters because the request allocation mempool guarantees that if the mem
allocation for a request fails, there must be some requests in flight. They
will wake up waiters when they are retired.



---

 linux-2.6-npiggin/drivers/block/ll_rw_blk.c |    1 -
 1 files changed, 1 deletion(-)

diff -puN drivers/block/ll_rw_blk.c~blk-no-mb drivers/block/ll_rw_blk.c
--- linux-2.6/drivers/block/ll_rw_blk.c~blk-no-mb	2005-01-06 15:37:05.000000000 +1100
+++ linux-2.6-npiggin/drivers/block/ll_rw_blk.c	2005-01-06 15:37:12.000000000 +1100
@@ -1630,7 +1630,6 @@ static void freed_request(request_queue_
 	if (rl->count[rw] < queue_congestion_off_threshold(q))
 		clear_queue_congested(q, rw);
 	if (rl->count[rw]+1 <= q->nr_requests) {
-		smp_mb();
 		if (waitqueue_active(&rl->wait[rw]))
 			wake_up(&rl->wait[rw]);
 		blk_clear_queue_full(q, rw);

_


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:47                           ` Andrew Morton
  2005-01-06  4:55                             ` Nick Piggin
@ 2005-01-06  4:59                             ` Andrea Arcangeli
  2005-01-06  5:05                               ` Andrew Morton
  2005-01-06  5:06                               ` Nick Piggin
  1 sibling, 2 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  4:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, riel, marcelo.tosatti, linux-kernel

On Wed, Jan 05, 2005 at 08:47:06PM -0800, Andrew Morton wrote:
> That's my point.  blk_congestion_wait() will always sleep, regardless of

Since I've no pending bugs at all with the mainline codebase I rate this
just a theoretical issue: but even sleeping for a fixed amount of time
is unreliable there, for example if the storage is very slow. That's why
using io_schedule_timeout for that isn't going to be a fix.

The fix is very simple and it is to call wait_on_page_writeback on one
of the pages under writeback. That guarantees some _writeback_ progress
has been made before retrying. That way some random direct-io or a
too-short timeout can't cause a malfunction.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:55                             ` Nick Piggin
@ 2005-01-06  5:03                               ` Andrea Arcangeli
  2005-01-06  8:06                               ` Jens Axboe
  1 sibling, 0 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:03 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel, Jens Axboe

On Thu, Jan 06, 2005 at 03:55:34PM +1100, Nick Piggin wrote:
> However, if you had a plain io_schedule_timeout there, at least you
> would sleep for the full extent of the specified timeout.

I agree it sure would be safer, but OTOH it may screw up performance by
waiting for unnecessarily long times on fast storage.

So it's ok for a test, but still it wouldn't be a final fix since the
timeout may still be too short in some cases.

Waiting on one (or more) PG_writeback bitflags to go away should fix it
completely. This is how 2.4 throttles on the writeback I/O too of
course. How we choose which random page to pick may vary though.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:59                             ` [PATCH][5/?] count writeback pages in nr_scanned Andrea Arcangeli
@ 2005-01-06  5:05                               ` Andrew Morton
  2005-01-06  5:17                                 ` Andrea Arcangeli
  2005-01-06  5:06                               ` Nick Piggin
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  5:05 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli <andrea@suse.de> wrote:
>
> The fix is very simple and it is to call wait_on_page_writeback on one
>  of the pages under writeback.

eek, no.  That was causing waits of five seconds or more.  Fixing this
caused the single greatest improvement in page allocator latency in early
2.5.  We're totally at the mercy of the elevator algorithm this way.

If we're to improve things in there we want to wait on _any_ eligible page
becoming reclaimable, not on a particular page.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:59                             ` [PATCH][5/?] count writeback pages in nr_scanned Andrea Arcangeli
  2005-01-06  5:05                               ` Andrew Morton
@ 2005-01-06  5:06                               ` Nick Piggin
  2005-01-06  5:21                                 ` Andrea Arcangeli
  1 sibling, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  5:06 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli wrote:
> On Wed, Jan 05, 2005 at 08:47:06PM -0800, Andrew Morton wrote:
> 
>>That's my point.  blk_congestion_wait() will always sleep, regardless of
> 
> 
> Since I've no pending bugs at all with the mainline codebase I rate this
> just a theoretical issue: but even sleeping for a fixed amount of time
> is unreliable there, for example if the storage is very slow. That's why
> using io_schedule_timeout for that isn't going to be a fix.
> 
> The fix is very simple and it is to call wait_on_page_writeback on one
> of the pages under writeback. That guarantees some _writeback_ progress
> has been made before retrying. That way some random direct-io or a too
> short timeout, can't cause malfunction.
> 

Yeah I guess that isn't a bad idea.

Doesn't that have the theoretical problem of slowing down reclaim
of dirty pages backed by fast devices when you have slow devices
writing?

But presumably you'd usually have pdflush working away anyway.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:05                               ` Andrew Morton
@ 2005-01-06  5:17                                 ` Andrea Arcangeli
  2005-01-06  5:19                                   ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

On Wed, Jan 05, 2005 at 09:05:39PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> >
> > The fix is very simple and it is to call wait_on_page_writeback on one
> >  of the pages under writeback.
> 
> eek, no.  That was causing waits of five seconds or more.  Fixing this
> caused the single greatest improvement in page allocator latency in early
> 2.5.  We're totally at the mercy of the elevator algorithm this way.
> 
> If we're to improve things in there we want to wait on _any_ eligible page
> becoming reclaimable, not on a particular page.

I told you one way to fix it. I didn't guarantee it was the most
efficient one.

I sure agree waiting on any page to complete writeback is going to fix
it too. Exactly because this page was a "random" page anyway.

Still my point is that this is a bug, and I'd rather be slow and safe
like 2.4 than fast and unreliable like 2.6.

The slight improvement you suggested of waiting on _any_ random
PG_writeback to go away (instead of one particular one as I did in 2.4)
is going to fix the write throttling just as well as the 2.4 logic, but
without introducing the slowdown that 2.4 had.

It's easy to demonstrate: exactly because the page we pick is random
anyway, we can pick the first random one that has seen PG_writeback
transitioning from 1 to 0. The guarantee we get is the same in terms of
safety of the write throttling, but we also guarantee the best possible
latency this way. And the HZ/x hacks to avoid deadlocks will magically
go away too.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:17                                 ` Andrea Arcangeli
@ 2005-01-06  5:19                                   ` Nick Piggin
  2005-01-06  5:25                                     ` Andrea Arcangeli
  2005-01-06  5:32                                     ` Andrew Morton
  0 siblings, 2 replies; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  5:19 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli wrote:
> On Wed, Jan 05, 2005 at 09:05:39PM -0800, Andrew Morton wrote:
> 
>>Andrea Arcangeli <andrea@suse.de> wrote:
>>
>>>The fix is very simple and it is to call wait_on_page_writeback on one
>>> of the pages under writeback.
>>
>>eek, no.  That was causing waits of five seconds or more.  Fixing this
>>caused the single greatest improvement in page allocator latency in early
>>2.5.  We're totally at the mercy of the elevator algorithm this way.
>>
>>If we're to improve things in there we want to wait on _any_ eligible page
>>becoming reclaimable, not on a particular page.
> 
> 
> I told you one way to fix it. I didn't guarantee it was the most
> efficient one.
> 
> I sure agree waiting on any page to complete writeback is going to fix
> it too. Exactly because this page was a "random" page anyway.
> 
> Still my point is that this is a bug, and I prefer to be slow and safe
> like 2.4, than fast and unreliable like 2.6.
> 
> The slight improvement you suggested of waiting on _any_ random
> PG_writeback to go away (instead of one particular one as I did in 2.4)
> is going to fix the write throttling equally too as well as the 2.4
> logic, but without introducing slowdown that 2.4 had.
> 
> It's easy to demonstrate: exactly because the page we pick is random
> anyway, we can pick the first random one that has seen PG_writeback
> transitioning from 1 to 0. The guarantee we get is the same in terms of
> safety of the write throttling, but we also guarantee the best possible
> latency this way. And the HZ/x hacks to avoid deadlocks will magically
> go away too.
> 

This is practically what blk_congestion_wait does when the queue
isn't congested though, isn't it?


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:06                               ` Nick Piggin
@ 2005-01-06  5:21                                 ` Andrea Arcangeli
  0 siblings, 0 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:21 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

On Thu, Jan 06, 2005 at 04:06:34PM +1100, Nick Piggin wrote:
> Yeah I guess that isn't a bad idea.
> 
> Doesn't that have the theoretical problem of slowing down reclaim
> of dirty pages backed by fast devices when you have slow devices
> writing?

Andrew just suggested to wait on any random page. That's a minor
modification to that logic. Effectively it's stupid to wait on a
specific random page, when we can wait on _any_ random page to
complete the writeback I/O.

With locking done right by setting the state to uninterruptible and
then registering in the new waitqueue when we know at least one page has
PG_writeback set, we can then drop the HZ anti-deadlock hacks too.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:19                                   ` Nick Piggin
@ 2005-01-06  5:25                                     ` Andrea Arcangeli
  2005-01-06  5:36                                       ` Nick Piggin
  2005-01-06  5:37                                       ` Andrew Morton
  2005-01-06  5:32                                     ` Andrew Morton
  1 sibling, 2 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:25 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

On Thu, Jan 06, 2005 at 04:19:36PM +1100, Nick Piggin wrote:
> This is practically what blk_congestion_wait does when the queue
> isn't congested though, isn't it?

The fundamental difference that makes it reliable is that:

1) only the I/O we're throttling against will be considered for the
   wakeup event, which means only clearing PG_writeback will be
   considered eligible for wakeup
   Currently _all_ unrelated write I/O was considered eligible
   for wakeup events and that could cause spurious oom kills.
2) we won't need unreliable anti-deadlock timeouts anymore


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:19                                   ` Nick Piggin
  2005-01-06  5:25                                     ` Andrea Arcangeli
@ 2005-01-06  5:32                                     ` Andrew Morton
  2005-01-06  5:46                                       ` Andrea Arcangeli
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  5:32 UTC (permalink / raw)
  To: Nick Piggin; +Cc: andrea, riel, marcelo.tosatti, linux-kernel

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> Andrea Arcangeli wrote:
> > On Wed, Jan 05, 2005 at 09:05:39PM -0800, Andrew Morton wrote:
> > 
> >>Andrea Arcangeli <andrea@suse.de> wrote:
> >>
> >>>The fix is very simple and it is to call wait_on_page_writeback on one
> >>> of the pages under writeback.
> >>
> >>eek, no.  That was causing waits of five seconds or more.  Fixing this
> >>caused the single greatest improvement in page allocator latency in early
> >>2.5.  We're totally at the mercy of the elevator algorithm this way.
> >>
> >>If we're to improve things in there we want to wait on _any_ eligible page
> >>becoming reclaimable, not on a particular page.
> > 
> > 
> > I told you one way to fix it. I didn't guarantee it was the most
> > efficient one.
> > 

And I've already described the efficient way to "fix" it.  Twice.

> > I sure agree waiting on any page to complete writeback is going to fix
> > it too. Exactly because this page was a "random" page anyway.
> > 
> > Still my point is that this is a bug, and I prefer to be slow and safe
> > like 2.4, than fast and unreliable like 2.6.
> > 
> > The slight improvement you suggested of waiting on _any_ random
> > PG_writeback to go away (instead of one particular one as I did in 2.4)

It's a HUGE improvement.

  "Example: with `mem=512m', running 4 instances of `dbench 100', 2.5.34
   took 35 minutes to compile a kernel.  With this patch, it took three
   minutes, 45 seconds."

Plus this was the change which precipitated all the I/O scheduler
development, because it caused us to keep the queues full all the time and
the old I/O scheduler collapsed.

> > is going to fix the write throttling equally too as well as the 2.4
> > logic, but without introducing slowdown that 2.4 had.
> > 
> > It's easy to demonstrate: exactly because the page we pick is random
> > anyway, we can pick the first random one that has seen PG_writeback
> > transitioning from 1 to 0. The guarantee we get is the same in terms of
> > safety of the write throttling, but we also guarantee the best possible
> > latency this way. And the HZ/x hacks to avoid deadlocks will magically
> > go away too.
> > 
> 
> This is practically what blk_congestion_wait does when the queue
> isn't congested though, isn't it?

Pretty much.  Except:

- Doing a wakeup when a write request is retired corresponds to releasing
  a batch of pages, not a single page.  Usually.

- direct-io writes could confuse it.

For the third time: "fixing" this involves delivering a wakeup to all zones
in the page's classzone in end_page_writeback(), and passing the zone* into
blk_congestion_wait().  Only deliver the wakeup on every Nth page to get a
bit of batching and to reduce CPU consumption.  Then demonstrating that the
change actually improves something.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:25                                     ` Andrea Arcangeli
@ 2005-01-06  5:36                                       ` Nick Piggin
  2005-01-06  5:44                                         ` Nick Piggin
  2005-01-06  5:37                                       ` Andrew Morton
  1 sibling, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  5:36 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli wrote:
> On Thu, Jan 06, 2005 at 04:19:36PM +1100, Nick Piggin wrote:
> 
>>This is practically what blk_congestion_wait does when the queue
>>isn't congested though, isn't it?
> 
> 
> The fundamental difference that makes it reliable is that:
> 
> 1) only the I/O we're throttling against will be considered for the
>    wakeup event, which means only clearing PG_writeback will be
>    considered eligible for wakeup
>    Currently _all_ unrelated write I/O was considered eligible
>    for wakeup events and that could cause spurious oom kills.

I'm not entirely convinced. In Rik's case it didn't matter, because
all his writeout was in the same zone that reclaim was happening
against (ZONE_NORMAL), so in that case, PG_writeback throttling
will do exactly the same thing as blk_congestion_wait.

I do like your PG_writeback throttling idea for the other reason
that it should behave better on NUMA systems with lots of zones
and lots of disks.


> 2) we won't need unreliable anti-deadlock timeouts anymore
> 

Well I think you do need *something*. If you wake up each time a
single page (or request) has completed, you only complete what,
12 (DEF_PRIORITY) requests before going OOM? In the worst possible
case scenario, which looks like what Rik's running into.

Nick


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:25                                     ` Andrea Arcangeli
  2005-01-06  5:36                                       ` Nick Piggin
@ 2005-01-06  5:37                                       ` Andrew Morton
  2005-01-06  5:59                                         ` Andrea Arcangeli
  2005-01-06 13:28                                         ` Rik van Riel
  1 sibling, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  5:37 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli <andrea@suse.de> wrote:
>
> 2) we won't need unreliable anti-deadlock timeouts anymore

The timeouts are for:

a) A fallback for backing stores which don't wake up waiters in
   blk_congestion_wait() (eg: NFS).

b) handling the race case where the request queue suddenly goes empty
   before the sleeper gets onto the waitqueue.

It can probably be removed with some work, and additional locking.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:36                                       ` Nick Piggin
@ 2005-01-06  5:44                                         ` Nick Piggin
  0 siblings, 0 replies; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  5:44 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Andrew Morton, riel, marcelo.tosatti, linux-kernel

Nick Piggin wrote:
> Andrea Arcangeli wrote:
> 
>> On Thu, Jan 06, 2005 at 04:19:36PM +1100, Nick Piggin wrote:
>>
>>> This is practically what blk_congestion_wait does when the queue
>>> isn't congested though, isn't it?
>>
>>
>>
>> The fundamental difference that makes it reliable is that:
>>
>> 1) only the I/O we're throttling against will be considered for the
>>    wakeup event, which means only clearing PG_writeback will be
>>    considered eligible for wakeup
>>    Currently _all_ unrelated write I/O was considered eligible
>>    for wakeup events and that could cause spurious oom kills.
> 
> 
> I'm not entirely convinced. In Rik's case it didn't matter, because
> all his writeout was in the same zone that reclaim was happening
> against (ZONE_NORMAL), so in that case, PG_writeback throttling
> will do exactly the same thing as blk_congestion_wait.
> 
> I do like your PG_writeback throttling idea for the other reason
> that it should behave better on NUMA systems with lots of zones
> and lots of disks.
> 

... or Andrew's described fix. I think both would result in pretty
similar behaviour, but Andrew's is probably a bit nicer because it
doesn't require the scanner to have initiated the write.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:32                                     ` Andrew Morton
@ 2005-01-06  5:46                                       ` Andrea Arcangeli
  2005-01-06  5:59                                         ` Andrew Morton
  0 siblings, 1 reply; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, riel, marcelo.tosatti, linux-kernel

On Wed, Jan 05, 2005 at 09:32:07PM -0800, Andrew Morton wrote:
> > > The slight improvement you suggested of waiting on _any_ random
> > > PG_writeback to go away (instead of one particular one as I did in 2.4)
> 
> It's a HUGE improvement.

I didn't want to question the improvement in wall clock time terms.

> For the third time: "fixing" this involves delivering a wakeup to all zones
> in the page's classzone in end_page_writeback(), and passing the zone* into
> blk_congestion_wait().  Only deliver the wakeup on every Nth page to get a
> bit of batching and to reduce CPU consumption.  Then demonstrating that the
> change actually improves something.

Since I cannot reproduce oom kills with writeback, I sure can't
demonstrate it on bare hardware with an unmodified kernel.

But I dislike code that works by luck, and sure I could demonstrate it
if I bothered to write an artificial testcase on simulated hardware.
This is the only reason I mentioned this bug in the first place, not
because I'm reproducing it.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:37                                       ` Andrew Morton
@ 2005-01-06  5:59                                         ` Andrea Arcangeli
  2005-01-06 13:28                                         ` Rik van Riel
  1 sibling, 0 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  5:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

On Wed, Jan 05, 2005 at 09:37:04PM -0800, Andrew Morton wrote:
> Andrea Arcangeli <andrea@suse.de> wrote:
> >
> > 2) we won't need unreliable anti-deadlock timeouts anymore
> 
> The timeouts are for:
> 
> a) A fallback for backing stores which don't wake up waiters in
>    blk_congestion_wait() (eg: NFS).

that anti-deadlock will be unnecessary too with the new logic.

> b) handling the race case where the request queue suddenly goes empty
>    before the sleeper gets onto the waitqueue.

As I mentioned, with proper locking (setting the task to uninterruptible
and then registering into the new per-classzone waitqueue), the timeout
will be unnecessary even for this.

> It can probably be removed with some work, and additional locking.

The additional locking will then remove the current locking in
blk_congestion_wait, so while it is new locking, it replaces the current
locking rather than adding to it. But I agree registering in the
waitqueue inside blk_congestion_wait was simpler. It's just that I have
a hard time liking the timeout. A timeout is always wrong when it
triggers: it triggers either too late (wasted resources) or too early
(early oom kills). So unless it messes everything up, it'd be nice to
keep the locking strict. Anyway, points 1 and 2 can be implemented
separately; at first we can leave the timeout in if the race is too hard
to handle.

Ideally, if we keep the total number of outstanding writebacks
per-classzone (not sure if we account for it already somewhere; I guess
if anything we have the global number and not the per-classzone one), we
could remove the timeout without having to expose the locking outside
blk_congestion_wait. With the number of outstanding writebacks
per-classzone we could truly fix the race and obsolete the timeout in a
self-contained manner. Though it will require a proper set of memory
barriers around the accounting increment/decrement/read.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:46                                       ` Andrea Arcangeli
@ 2005-01-06  5:59                                         ` Andrew Morton
  2005-01-06  6:16                                           ` Andrea Arcangeli
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2005-01-06  5:59 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

Andrea Arcangeli <andrea@suse.de> wrote:
>
> But I dislike code that works by luck,

yup, it is code which works by luck.  And it's a bit inaccurate - there is
potential for (small) further latency improvement in there.

But we do need to watch the CPU consumption in there - end_page_writeback()
already figures fairly high sometimes.

> and sure I could demonstrate it
>  if I bothered to write an artificial testcase on simulated hardware.

Could well be so.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:59                                         ` Andrew Morton
@ 2005-01-06  6:16                                           ` Andrea Arcangeli
  0 siblings, 0 replies; 43+ messages in thread
From: Andrea Arcangeli @ 2005-01-06  6:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nickpiggin, riel, marcelo.tosatti, linux-kernel

On Wed, Jan 05, 2005 at 09:59:09PM -0800, Andrew Morton wrote:
> But we do need to watch the CPU consumption in there - end_page_writeback()
> already figures fairly high sometimes.

The check for waitqueue_active will avoid all wake_up calls if nobody is
waiting in blk_congestion_wait. Plus it won't need to wake up all
classzones, if we have the other side register on all the zones
composing the classzone.

The wakeup is going to be more frequent than the waiter registration.

So we can register the waiter on _all_ the "interested" zones that
compose the classzone, instead of the other way around that you
suggested in the previous email.

That will reduce the wakeup to a single waitqueue_active, and it
shouldn't be too bad.

Same goes for the number of outstanding writebacks: we can collect that
number per zone and have blk_congestion_wait return immediately if it
went to zero on all zones composing the classzone after registering.


* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  4:55                             ` Nick Piggin
  2005-01-06  5:03                               ` Andrea Arcangeli
@ 2005-01-06  8:06                               ` Jens Axboe
  2005-01-06  8:16                                 ` memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned) Nick Piggin
  1 sibling, 1 reply; 43+ messages in thread
From: Jens Axboe @ 2005-01-06  8:06 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, andrea, linux-kernel

On Thu, Jan 06 2005, Nick Piggin wrote:
> Andrew Morton wrote:
> >Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >
> >>> If the queue is not congested, blk_congestion_wait() will still sleep.  
> >>See
> >>> freed_request().
> >>> 
> >>
> >>Hmm... doesn't look like it to me:
> >>
> >>         if (rl->count[rw] < queue_congestion_off_threshold(q))
> >>                 clear_queue_congested(q, rw);
> >>
> >>And clear_queue_congested does an unconditional wakeup (if there
> >>is someone sleeping on the congestion queue).
> >
> >
> >That's my point.  blk_congestion_wait() will always sleep, regardless of
> >the queue's congestion state.
> >
> 
> Oh yes, but it will return as soon as a single request is finished.
> Which is probably a couple of milliseconds, rather than the 100 we
> had hoped for. So the allocators will wake up again and go around
> the loop and still make no progress.
> 
> However, if you had a plain io_schedule_timeout there, at least you
> would sleep for the full extent of the specified timeout.
> 
> BTW. Jens, now that I look at freed_request, is the memory barrier
> required? If so, what is it protecting?
> 

> 
> 
> This memory barrier is not needed because the waitqueue will only get
> waiters on it in the following situations:
> 
> rq->count has exceeded the threshold - however all manipulations of ->count
> are performed under the runqueue lock, and so we will correctly pick up any
> waiter.
> 
> Memory allocation for the request fails. In this case, there is no additional
> help provided by the memory barrier. We are guaranteed to eventually wake
> up waiters because the request allocation mempool guarantees that if the mem
> allocation for a request fails, there must be some requests in flight. They
> will wake up waiters when they are retired.

Not sure I agree completely. Yes it will work, but only because it tests
<= q->nr_requests and I don't think that 'eventually' is good enough :-)

The actual waitqueue manipulation doesn't happen under the queue lock,
so the memory barrier is needed to pickup the change on SMP. So I'd like
to keep the barrier.

I'd prefer to add smp_mb() to waitqueue_active() actually!

-- 
Jens Axboe



* memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned)
  2005-01-06  8:06                               ` Jens Axboe
@ 2005-01-06  8:16                                 ` Nick Piggin
  2005-01-06  8:32                                   ` Jens Axboe
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  8:16 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, riel, marcelo.tosatti, andrea, linux-kernel

Jens Axboe wrote:
> On Thu, Jan 06 2005, Nick Piggin wrote:

>>
>>This memory barrier is not needed because the waitqueue will only get
>>waiters on it in the following situations:
>>
>>rq->count has exceeded the threshold - however all manipulations of ->count
>>are performed under the runqueue lock, and so we will correctly pick up any
>>waiter.
>>
>>Memory allocation for the request fails. In this case, there is no additional
>>help provided by the memory barrier. We are guaranteed to eventually wake
>>up waiters because the request allocation mempool guarantees that if the mem
>>allocation for a request fails, there must be some requests in flight. They
>>will wake up waiters when they are retired.
> 
> 
> Not sure I agree completely. Yes it will work, but only because it tests
> <= q->nr_requests and I don't think that 'eventually' is good enough :-)
> 
> The actual waitqueue manipulation doesn't happen under the queue lock,
> so the memory barrier is needed to pickup the change on SMP. So I'd like
> to keep the barrier.
> 

No that's right... but between the prepare_to_wait and the io_schedule,
get_request takes the lock and checks nr_requests. I think we are safe?

> I'd prefer to add smp_mb() to waitqueue_active() actually!
> 

That may be a good idea (I haven't really taken much notice of how other
code uses it).

I'm not worried about any possible performance advantage of removing it;
it's rather that a memory barrier without comments can be perplexing.


* Re: memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned)
  2005-01-06  8:16                                 ` memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned) Nick Piggin
@ 2005-01-06  8:32                                   ` Jens Axboe
  2005-01-06  8:53                                     ` Nick Piggin
  0 siblings, 1 reply; 43+ messages in thread
From: Jens Axboe @ 2005-01-06  8:32 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, andrea, linux-kernel

On Thu, Jan 06 2005, Nick Piggin wrote:
> Jens Axboe wrote:
> >On Thu, Jan 06 2005, Nick Piggin wrote:
> 
> >>
> >>This memory barrier is not needed because the waitqueue will only get
> >>waiters on it in the following situations:
> >>
> >>rq->count has exceeded the threshold - however all manipulations of 
> >>->count
> >>are performed under the runqueue lock, and so we will correctly pick up 
> >>any
> >>waiter.
> >>
> >>Memory allocation for the request fails. In this case, there is no 
> >>additional
> >>help provided by the memory barrier. We are guaranteed to eventually wake
> >>up waiters because the request allocation mempool guarantees that if the 
> >>mem
> >>allocation for a request fails, there must be some requests in flight. 
> >>They
> >>will wake up waiters when they are retired.
> >
> >
> >Not sure I agree completely. Yes it will work, but only because it tests
> ><= q->nr_requests and I don't think that 'eventually' is good enough :-)
> >
> >The actual waitqueue manipulation doesn't happen under the queue lock,
> >so the memory barrier is needed to pickup the change on SMP. So I'd like
> >to keep the barrier.
> >
> 
> No that's right... but between the prepare_to_wait and the io_schedule,
> get_request takes the lock and checks nr_requests. I think we are safe?

It looks like it, yes you are right. But it looks to be needed a few
lines further down instead, though :-)

===== drivers/block/ll_rw_blk.c 1.281 vs edited =====
--- 1.281/drivers/block/ll_rw_blk.c     2004-12-01 09:13:57 +01:00
+++ edited/drivers/block/ll_rw_blk.c    2005-01-06 09:32:19 +01:00
@@ -1630,11 +1630,11 @@
        if (rl->count[rw] < queue_congestion_off_threshold(q))
                clear_queue_congested(q, rw);
        if (rl->count[rw]+1 <= q->nr_requests) {
-               smp_mb();
                if (waitqueue_active(&rl->wait[rw]))
                        wake_up(&rl->wait[rw]);
                blk_clear_queue_full(q, rw);
        }
+       smp_mb();
        if (unlikely(waitqueue_active(&rl->drain)) &&
            !rl->count[READ] && !rl->count[WRITE])
                wake_up(&rl->drain);

> >I'd prefer to add smp_mb() to waitqueue_active() actually!
> >
> 
> That may be a good idea (I haven't really taken much notice of how other
> code uses it).
> 
> I'm not worried about any possible performance advantages of removing it,
> rather that having a memory barrier without comments can be perplexing.

I fully agree, subtle things like that should always be commented.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned)
  2005-01-06  8:32                                   ` Jens Axboe
@ 2005-01-06  8:53                                     ` Nick Piggin
  2005-01-06 12:00                                       ` Jens Axboe
  0 siblings, 1 reply; 43+ messages in thread
From: Nick Piggin @ 2005-01-06  8:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, riel, marcelo.tosatti, andrea, linux-kernel

Jens Axboe wrote:
> On Thu, Jan 06 2005, Nick Piggin wrote:
> 

>>No that's right... but between the prepare_to_wait and the io_schedule,
>>get_request takes the lock and checks nr_requests. I think we are safe?
> 
> 
> It looks like it, yes you are right. But it looks to be needed a few
> lines further down instead, though :-)
> 
> ===== drivers/block/ll_rw_blk.c 1.281 vs edited =====
> --- 1.281/drivers/block/ll_rw_blk.c     2004-12-01 09:13:57 +01:00
> +++ edited/drivers/block/ll_rw_blk.c    2005-01-06 09:32:19 +01:00
> @@ -1630,11 +1630,11 @@
>         if (rl->count[rw] < queue_congestion_off_threshold(q))
>                 clear_queue_congested(q, rw);
>         if (rl->count[rw]+1 <= q->nr_requests) {
> -               smp_mb();
>                 if (waitqueue_active(&rl->wait[rw]))
>                         wake_up(&rl->wait[rw]);
>                 blk_clear_queue_full(q, rw);
>         }
> +       smp_mb();
>         if (unlikely(waitqueue_active(&rl->drain)) &&
>             !rl->count[READ] && !rl->count[WRITE])
>                 wake_up(&rl->drain);
> 

Yes, looks like you're right there.

Any point in doing it like this

	if (!rl->count[READ] && !rl->count[WRITE]) {
		smp_mb();
		if (unlikely(waitqueue_active(...)))
			wake_up()
	}

I wonder? I don't have any feeling of how memory barriers impact performance
on a very parallel system with CPUs that do lots of memory reordering like
POWER5.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned)
  2005-01-06  8:53                                     ` Nick Piggin
@ 2005-01-06 12:00                                       ` Jens Axboe
  0 siblings, 0 replies; 43+ messages in thread
From: Jens Axboe @ 2005-01-06 12:00 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, riel, marcelo.tosatti, andrea, linux-kernel

On Thu, Jan 06 2005, Nick Piggin wrote:
> Jens Axboe wrote:
> >On Thu, Jan 06 2005, Nick Piggin wrote:
> >
> 
> >>No that's right... but between the prepare_to_wait and the io_schedule,
> >>get_request takes the lock and checks nr_requests. I think we are safe?
> >
> >
> >It looks like it, yes you are right. But it looks to be needed a few
> >lines further down instead, though :-)
> >
> >===== drivers/block/ll_rw_blk.c 1.281 vs edited =====
> >--- 1.281/drivers/block/ll_rw_blk.c     2004-12-01 09:13:57 +01:00
> >+++ edited/drivers/block/ll_rw_blk.c    2005-01-06 09:32:19 +01:00
> >@@ -1630,11 +1630,11 @@
> >        if (rl->count[rw] < queue_congestion_off_threshold(q))
> >                clear_queue_congested(q, rw);
> >        if (rl->count[rw]+1 <= q->nr_requests) {
> >-               smp_mb();
> >                if (waitqueue_active(&rl->wait[rw]))
> >                        wake_up(&rl->wait[rw]);
> >                blk_clear_queue_full(q, rw);
> >        }
> >+       smp_mb();
> >        if (unlikely(waitqueue_active(&rl->drain)) &&
> >            !rl->count[READ] && !rl->count[WRITE])
> >                wake_up(&rl->drain);
> >
> 
> Yes, looks like you're right there.
> 
> Any point in doing it like this
> 
> 	if (!rl->count[READ] && !rl->count[WRITE]) {
> 		smp_mb();
> 		if (unlikely(waitqueue_active(...)))
> 			wake_up()
> 	}
> 
> I wonder? I don't have any feeling of how memory barriers impact performance
> on a very parallel system with CPUs that do lots of memory reordering like
> POWER5.

I had the same idea - I think it's a good idea, it's definitely an
optimization worth doing.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH][5/?] count writeback pages in nr_scanned
  2005-01-06  5:37                                       ` Andrew Morton
  2005-01-06  5:59                                         ` Andrea Arcangeli
@ 2005-01-06 13:28                                         ` Rik van Riel
  1 sibling, 0 replies; 43+ messages in thread
From: Rik van Riel @ 2005-01-06 13:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andrea Arcangeli, nickpiggin, marcelo.tosatti, linux-kernel

On Wed, 5 Jan 2005, Andrew Morton wrote:

> The timeouts are for:
>
> a) A fallback for backing stores which don't wake up waiters in
>   blk_congestion_wait() (eg: NFS).

This is buggy, btw.  NFS has a "fun" deadlock where it can
send so many writes over the wire at once that the PF_MEMALLOC
allocations eat up all memory and there's no memory left to
receive ACKs from the NFS server ...

NFS will need to act congested when memory is low and there
are already some writeouts underway.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2005-01-06 13:29 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-03 17:25 [PATCH][5/?] count writeback pages in nr_scanned Rik van Riel
2005-01-05 10:08 ` Andrew Morton
2005-01-05 18:06   ` Andrea Arcangeli
2005-01-05 18:50     ` Rik van Riel
2005-01-05 17:49       ` Marcelo Tosatti
2005-01-05 21:44         ` Andrew Morton
2005-01-05 20:32           ` Marcelo Tosatti
2005-01-05 23:51             ` Nick Piggin
2005-01-06  1:27               ` Rik van Riel
2005-01-06  1:33                 ` Nick Piggin
2005-01-06  1:37                   ` Andrew Morton
2005-01-06  1:40                     ` Nick Piggin
2005-01-06  1:52                       ` Andrea Arcangeli
2005-01-06  1:36                 ` Andrew Morton
2005-01-06  3:42                   ` Rik van Riel
2005-01-06  3:50                     ` Nick Piggin
2005-01-06  4:26                       ` Andrew Morton
2005-01-06  4:35                         ` Nick Piggin
2005-01-06  4:47                           ` Andrew Morton
2005-01-06  4:55                             ` Nick Piggin
2005-01-06  5:03                               ` Andrea Arcangeli
2005-01-06  8:06                               ` Jens Axboe
2005-01-06  8:16                                 ` memory barrier in ll_rw_blk.c (was Re: [PATCH][5/?] count writeback pages in nr_scanned) Nick Piggin
2005-01-06  8:32                                   ` Jens Axboe
2005-01-06  8:53                                     ` Nick Piggin
2005-01-06 12:00                                       ` Jens Axboe
2005-01-06  4:59                             ` [PATCH][5/?] count writeback pages in nr_scanned Andrea Arcangeli
2005-01-06  5:05                               ` Andrew Morton
2005-01-06  5:17                                 ` Andrea Arcangeli
2005-01-06  5:19                                   ` Nick Piggin
2005-01-06  5:25                                     ` Andrea Arcangeli
2005-01-06  5:36                                       ` Nick Piggin
2005-01-06  5:44                                         ` Nick Piggin
2005-01-06  5:37                                       ` Andrew Morton
2005-01-06  5:59                                         ` Andrea Arcangeli
2005-01-06 13:28                                         ` Rik van Riel
2005-01-06  5:32                                     ` Andrew Morton
2005-01-06  5:46                                       ` Andrea Arcangeli
2005-01-06  5:59                                         ` Andrew Morton
2005-01-06  6:16                                           ` Andrea Arcangeli
2005-01-06  5:06                               ` Nick Piggin
2005-01-06  5:21                                 ` Andrea Arcangeli
2005-01-05 23:26 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).