linux-kernel.vger.kernel.org archive mirror
From: Daniel Phillips <phillips@phunq.net>
To: Bill Davidsen <davidsen@tmr.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC] [PATCH] A clean approach to writeout throttling
Date: Thu, 6 Dec 2007 16:04:41 -0800
Message-ID: <200712061604.41490.phillips@phunq.net>
In-Reply-To: <47586F59.5090507@tmr.com>

On Thursday 06 December 2007 13:53, Bill Davidsen wrote:
> Daniel Phillips wrote:
> The problem is that you (a) may or may not know just how bad a worst
> case can be, and (b) may block unnecessarily by being pessimistic.

True, but after a moment's reflection I realized that this issue (it's 
really a single issue) is no worse than the way I planned to wave 
my hands at the issue of programmers constructing their metrics wrongly 
and thereby breaking the throttling assumptions.

Which is to say that I am now entirely convinced by Andrew's argument and 
am prepared to reroll the patch along the lines he suggests.  The 
result will be somewhat bigger.  Only a minor change is required to the 
main mechanism: we will now account things entirely in units of pages 
instead of abstract units, eliminating a whole class of things to go 
wrong.  I like that.  Accounting variables get shifted to a new home, 
maybe.  Must try a few ideas and see what works.

Anyway, the key idea is that task struct will gain a field pointing at a 
handle for the "block device stack", whatever that is (this is sure to 
evolve over time) and alloc_pages will know how to account pages to 
that object.  The submit_bio and bio->endio bits change hardly at all.
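
To make the accounting idea concrete, here is a rough userspace sketch
and nothing more.  The names struct block_stack, struct task,
stack_charge_pages and stack_uncharge_pages are all invented for
illustration; none of them exist in the kernel or in the posted patch.
It assumes the charge happens where alloc_pages would account a page
and the uncharge happens on the bio completion path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handle for a "block device stack"; not a kernel type. */
struct block_stack {
	long pages_in_flight;	/* pages currently charged to this stack */
	long pages_limit;	/* reservation for this stack, in pages */
};

/* Stand-in for the field that task struct would gain. */
struct task {
	struct block_stack *stack;	/* NULL when not on a block IO path */
};

/*
 * Where alloc_pages would account an allocation.  Returns false when
 * the charge would exceed the limit, which is where a real
 * implementation would make the caller wait instead.
 */
static bool stack_charge_pages(struct task *t, long nr_pages)
{
	struct block_stack *s = t->stack;

	assert(nr_pages > 0);
	if (!s)
		return true;	/* task is not tied to any block stack */
	if (s->pages_in_flight + nr_pages > s->pages_limit)
		return false;
	s->pages_in_flight += nr_pages;
	return true;
}

/* Where the completion path would give the pages back. */
static void stack_uncharge_pages(struct block_stack *s, long nr_pages)
{
	assert(nr_pages > 0 && nr_pages <= s->pages_in_flight);
	s->pages_in_flight -= nr_pages;
}
```

Everything is counted in pages, so there is no abstract unit for a
driver author to get wrong.  Locking and the actual wait queue are
left out.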

The runner-up key idea is that we will gain a notion of "block device 
stack" (or block stack for short, so that we may implement block 
stackers) which for the time being will simply be Device Mapper's 
notion of device stack, however many warts that may have.  It's there 
now and we use it for ddsnap.

The other player in this is Peterz's swap over network use case, which 
does not involve a device mapper device.  Maybe it should?  Otherwise 
we will need a variant notion of block device stack, and the two 
threads of work should merge eventually.  There is little harm in 
starting this effort in two different places, quite the contrary.

In the meantime we do have a strategy that works, posted at the head of 
this thread, for anybody who needs it now.

> The dummy transaction would be nice, but it would be perfect if you
> could send the real transaction down with a max memory limit and a
> flag, have each level check and decrement the max by what's actually
> needed, and then return some pass/fail status for that particular
> transaction. Clearly every level in the stack would have to know how
> to do that. It would seem that once excess memory use was detected
> the transaction could be failed without deadlock.

The function of the dummy transaction will be to establish roughly what 
kind of footprint for a single transaction we see on that block IO 
path.  Then we will make the reservation _hugely_ greater than that, to 
accommodate 1000 or so of those.  A transaction blocks if it actually 
tries to use more than that.  We close the "many partial submissions 
all deadlock together" hole by ... <insert handwaving here>.  Can't go 
wrong, right?
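
As a sketch of the arithmetic only: measure the page footprint of one
dummy transaction, multiply by roughly 1000, and block anything that
would push usage past that.  The names struct reservation,
reserve_from_probe and would_block are made up for this example, and
the multiplier is just the "1000 or so" figure from above.  This does
nothing about the partial submission hole.

```c
#include <assert.h>
#include <stdbool.h>

/* "1000 or so" single-transaction footprints; an illustrative value. */
#define RESERVE_MULTIPLIER 1000

/* Hypothetical per-path reservation, counted in pages. */
struct reservation {
	long limit_pages;	/* size of the reservation */
	long used_pages;	/* pages charged by transactions in flight */
};

/* Size the reservation from the footprint the dummy transaction saw. */
static void reserve_from_probe(struct reservation *r, long probe_pages)
{
	assert(probe_pages > 0);
	r->limit_pages = probe_pages * RESERVE_MULTIPLIER;
	r->used_pages = 0;
}

/* True if a transaction wanting want_pages more would have to block. */
static bool would_block(const struct reservation *r, long want_pages)
{
	return r->used_pages + want_pages > r->limit_pages;
}
```

With a probe footprint of 8 pages the reservation comes to 8000 pages,
so a transaction only blocks once the path as a whole is far beyond
what a single submission normally needs.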

Agreed that drivers should pay special attention to our dummy 
transaction and try to use the maximum possible resources when they see 
one.  More handwaving, but this is progress.

Regards,

Daniel

