From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: John Stoffel <john@stoffel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	miklos@szeredi.hu, akpm@linux-foundation.org, neilb@suse.de,
	dgc@sgi.com, tomoki.sekiyama.qu@hitachi.com,
	nikita@clusterfs.com, trond.myklebust@fys.uio.no,
	yingchao.zhou@gmail.com, richard@rsk.demon.co.uk,
	torvalds@linux-foundation.org
Subject: Re: [PATCH 21/23] mm: per device dirty threshold
Date: Wed, 12 Sep 2007 10:45:57 +0200
Message-ID: <1189586757.21778.96.camel@twins>
In-Reply-To: <18151.20636.425784.226044@stoffel.org>

On Tue, 2007-09-11 at 22:36 -0400, John Stoffel wrote:
> Peter> Scale writeback cache per backing device, proportional to its
> Peter> writeout speed.  By decoupling the BDI dirty thresholds a
> Peter> number of problems we currently have will go away, namely:
> 
> Ah, this clarifies my questions!  Thanks!
> 
> Peter>  - mutual interference starvation (for any number of BDIs);
> Peter>  - deadlocks with stacked BDIs (loop, FUSE and local NFS mounts).
> 
> Peter> It might be that all dirty pages are for a single BDI while
> Peter> other BDIs are idling. By giving each BDI a 'fair' share of the
> Peter> dirty limit, each one can have dirty pages outstanding and make
> Peter> progress.
> 
> Question, can you change (shrink) the limit on a BDI while it has IO
> in flight?  And what will that do to the system?  I.e. if you have one
> device doing IO, so that it has a majority of the dirty limit.  Then
> another device starts IO, and it's a *faster* device, how
> quickly/slowly does the BDI dirty limits change for both the old and
> new device?  

Yes, it can change while in use. A rough measure of how quickly it can
change: it can halve in a dirty_limit's worth of writeout.
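
To make that figure concrete, here is a toy model in Python. It is a
sketch under assumptions, not the code from the patch: the class name,
the method names and the choice to halve every count once per `period`
completions are all illustrative. It only shows that a device which
stops completing writeout loses half of its weight for every period's
worth of writeout completed elsewhere.

```python
class FloatingProportion:
    """Toy model (hypothetical names): per-device completion counts that
    are all halved once every `period` completions, system-wide."""

    def __init__(self, period):
        self.period = period      # stands in for a dirty_limit worth of writeout
        self.counts = {}          # device name -> decayed completion count
        self.since_decay = 0      # completions seen since the last halving

    def complete(self, dev, n=1):
        """Record n writeback completions for device `dev`."""
        for _ in range(n):
            self.counts[dev] = self.counts.get(dev, 0.0) + 1.0
            self.since_decay += 1
            if self.since_decay == self.period:
                # One period has elapsed: age every device's history.
                for d in self.counts:
                    self.counts[d] /= 2.0
                self.since_decay = 0

    def share(self, dev):
        """Fraction of the global dirty limit this device would get."""
        total = sum(self.counts.values())
        return self.counts.get(dev, 0.0) / total if total else 0.0


fp = FloatingProportion(period=1000)
fp.complete("slow", 10000)        # only the slow device is active
before = fp.share("slow")         # it owns the whole limit
fp.complete("fast", 1000)         # one period of writeout on another device
after = fp.share("slow")          # about 0.5
print(before, after)
```

In this model the slow device's share drops from 1.0 to about one half
after a single period in which only the fast device completes writes.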

What will happen is that the processes doing heavy IO on the slower
device will get throttled more aggressively until the device is below
its new threshold again. All that time, however, the device will keep
on writing at (full) speed because it has this backlog to get rid of,
and by doing that it completes writeouts, which ensures it keeps part
of the dirty limit for itself and thus can always make progress.
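
The behaviour described above can be sketched as a small simulation.
Everything here is an assumption made for illustration: the function
name, the tick-based loop, the `floor` that lets a device with no
history dirty a few pages, and the rule that writers fill a device up
to its threshold and are then throttled. It is not the patch's
algorithm, and it lets the total dirty count exceed the limit slightly
while the backlog drains.

```python
def simulate(speeds, dirty_limit, ticks, owner, floor=8.0):
    """Toy model: shares follow decayed writeout completions.

    speeds      -- pages each device can write out per tick
    dirty_limit -- global dirty limit, also used as the decay period
    owner       -- device that initially holds the whole dirty limit
    floor       -- hypothetical minimum per-device threshold
    """
    counts = {dev: 0.0 for dev in speeds}   # decayed completion counts
    dirty = {dev: 0.0 for dev in speeds}    # dirty pages per device
    written = {dev: 0.0 for dev in speeds}  # total pages written out
    counts[owner] = float(dirty_limit)
    dirty[owner] = float(dirty_limit)
    since_decay = 0.0

    for _ in range(ticks):
        total = sum(counts.values())
        shares = {dev: counts[dev] / total for dev in counts}
        for dev, speed in speeds.items():
            thresh = max(shares[dev] * dirty_limit, floor)
            if dirty[dev] < thresh:
                # Writers dirty pages up to the threshold; above it
                # they are throttled and add nothing.
                dirty[dev] = thresh
            done = min(speed, dirty[dev])   # writeback never stops
            dirty[dev] -= done
            written[dev] += done
            counts[dev] += done
            since_decay += done
        while since_decay >= dirty_limit:
            # A dirty_limit worth of writeout completed: age history.
            for dev in counts:
                counts[dev] /= 2.0
            since_decay -= dirty_limit

    total = sum(counts.values())
    return {dev: counts[dev] / total for dev in counts}, written


shares, written = simulate({"slow": 1.0, "fast": 4.0},
                           dirty_limit=1000, ticks=5000, owner="slow")
print(shares, written)
```

With these made-up numbers the slow device writes one page on every
tick while its writers are throttled, and the shares settle near the
1:4 ratio of the writeout speeds.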

You can monitor this by looking at /sys/block/sd*/queue/cache_size while
doing such a thing. It should stabilise quite 'quickly'.
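
A minimal way to watch those files from userspace, assuming a kernel
with this patch set applied so that the cache_size files exist. The
function names are made up for this sketch, and the file contents are
printed as raw text since no particular format is assumed.

```python
import glob
import os
import time

DEFAULT_PATTERN = "/sys/block/sd*/queue/cache_size"


def read_cache_sizes(pattern=DEFAULT_PATTERN):
    """Return {device name: raw file contents} for every matching file."""
    sizes = {}
    for path in sorted(glob.glob(pattern)):
        # .../<dev>/queue/cache_size -> <dev>
        dev = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as f:
                sizes[dev] = f.read().strip()
        except OSError:
            # The device may have gone away between glob and open.
            continue
    return sizes


def watch(samples=10, interval=1.0, pattern=DEFAULT_PATTERN):
    """Print one snapshot per interval so the values can be seen settling."""
    for _ in range(samples):
        print(read_cache_sizes(pattern))
        time.sleep(interval)
```

Calling watch() while starting IO on a second device prints one
snapshot per second.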

> Peter> A global threshold also creates a deadlock for stacked BDIs;
> Peter> when A writes to B, and A generates enough dirty pages to get
> Peter> throttled, B will never start writeback until the dirty pages
> Peter> go away. Again, by giving each BDI its own 'independent' dirty
> Peter> limit, this problem is avoided.
> 
> Peter> So the problem is to determine how to distribute the total
> Peter> dirty limit across the BDIs fairly and efficiently. A DBI that
> 
> You mean BDI here, not DBI.  

Uhh, yeah, obviously :-)

> Peter> has a large dirty limit but does not have any dirty pages
> Peter> outstanding is a waste.
> 
> Peter> What is done is to keep a floating proportion between the DBIs
> Peter> based on writeback completions. This way faster/more active
> Peter> devices get a larger share than slower/idle devices.
> 
> Does a slower device get a BDI which is calculated to keep it's limit
> under a certain number of seconds of outstanding IO?  This way no
> device can build up more than say 15 seconds of outstanding IO to
> flush at any one time.  

Perhaps already answered above: as long as there is dirty stuff to
write out, the device will keep completing writes and thus gain a
stable share of the dirty limit.


