From: Andrew Morton <akpm@zip.com.au>
To: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <andrea@suse.de>
Subject: Re: 2.4.14-pre6
Date: Thu, 01 Nov 2001 12:55:41 -0800
Message-ID: <3BE1B6CD.7DA43A6C@zip.com.au>
In-Reply-To: message from Linus Torvalds on Wednesday October 31, <Pine.LNX.4.33.0110310809200.32460-100000@penguin.transmeta.com> <15329.8658.642254.284398@notabene.cse.unsw.edu.au>

Neil Brown wrote:
> 
> ...
> What I would like is that as soon as a buffer was marked "dirty", it
> would get passed down to the driver (or at least to the
> block-device-layer) with something like
>     submit_bh(WRITEA, bh);
> i.e. a write ahead. (or is it write-behind...)
> The device handler (the elevator algorithm for normal disks, other
> code for other devices) could keep them ordered in whatever way it
> chooses, and feed them into the queues at some appropriate time.
> 

Sounds sensible to me.

In many ways, it's similar to the current scheme when it's used
with an enormous request queue - all writeable blocks in the
system are candidates for request merging.  But your proposal
is a whole lot smarter.

In particular, the current kupdate scheme of writing the
dirty block list out in six chunks, five seconds apart
does potentially miss out on a very large number of merging
opportunities.  Your proposal would fix that.
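To make the lost-merging point concrete, here is a toy userspace model, not kernel code. It assumes that blocks can only be merged with others flushed in the same chunk, that each chunk is sorted before submission, and that block numbers are unique. The names sorted_runs() and requests_when_chunked() are made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for unsigned long block numbers. */
static int cmp_ulong(const void *a, const void *b)
{
    unsigned long x = *(const unsigned long *)a;
    unsigned long y = *(const unsigned long *)b;
    return (x > y) - (x < y);
}

/* Sort blocks in place and count contiguous runs; each run stands
 * for one merged request in this toy model. */
size_t sorted_runs(unsigned long *blocks, size_t n)
{
    size_t runs = 0;

    qsort(blocks, n, sizeof *blocks, cmp_ulong);
    for (size_t i = 0; i < n; i++)
        if (i == 0 || blocks[i] != blocks[i - 1] + 1)
            runs++;
    return runs;
}

/* Split the dirty list (in the order blocks were dirtied) into
 * nchunks pieces, flush each piece separately, and return the total
 * number of requests issued.  Returns 0 if allocation fails. */
size_t requests_when_chunked(const unsigned long *dirty, size_t n,
                             size_t nchunks)
{
    unsigned long *tmp;
    size_t total = 0, start = 0;

    assert(nchunks > 0);
    tmp = malloc(n * sizeof *tmp);
    if (!tmp)
        return 0;
    memcpy(tmp, dirty, n * sizeof *tmp);

    for (size_t c = 0; c < nchunks; c++) {
        size_t end = (c + 1) * n / nchunks;   /* chunk is [start, end) */
        total += sorted_runs(tmp + start, end - start);
        start = end;
    }
    free(tmp);
    return total;
}
```

With the dirty list 1,7,2,8,3,9,4,10,5,11,6,12, flushing in six chunks gives twelve single-block requests in this model, while flushing the whole list at once gives one twelve-block request.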

Another potential microoptimisation would be to write out
clean blocks if that helps merging.  So if we see a write
for blocks 1,2,3,5,6,7 and block 4 is known to be in memory,
then write it out too.  I suspect this would be a win for
ATA but a loss for SCSI.  Not sure.
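A minimal sketch of that hole-filling idea, again as a userspace model rather than anything from the kernel. It assumes the dirty block numbers arrive sorted and unique, and that an array of flags (here called cached, a hypothetical stand-in for "is this block in memory?") can be indexed by block number. Only single-block holes are filled, as in the 1,2,3,5,6,7 example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copy the sorted dirty block list to out, inserting any one-block
 * hole between two dirty neighbours when that block is in memory.
 * out needs room for 2*n - 1 entries.  Returns the entry count. */
size_t fill_single_gaps(const unsigned long *dirty, size_t n,
                        const bool *cached, size_t cached_len,
                        unsigned long *out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        if (i > 0) {
            /* Precondition: input is strictly ascending. */
            assert(dirty[i] > dirty[i - 1]);
            unsigned long hole = dirty[i] - 1;
            if (dirty[i] - dirty[i - 1] == 2 &&
                hole < cached_len && cached[hole])
                out[k++] = hole;   /* clean block, written to aid merging */
        }
        out[k++] = dirty[i];
    }
    return k;
}

/* Count contiguous runs in a sorted list; one run per request. */
size_t count_runs(const unsigned long *blocks, size_t n)
{
    size_t runs = 0;

    for (size_t i = 0; i < n; i++)
        if (i == 0 || blocks[i] != blocks[i - 1] + 1)
            runs++;
    return runs;
}
```

For dirty blocks 1,2,3,5,6,7 with block 4 cached, the output list is 1 through 7, so two requests become one. Whether the extra block written is worth the saved request is the ATA-versus-SCSI question above, which this sketch does not answer.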

But I have a gut feeling that all this is in the noise floor
compared to The Big Problem (TBP).  It's just a matter of identifying
and fixing TBP.  Fixing the fdatasync() thing didn't help,
because ext2_write_inode() for a new file has to read the
inode block from disk.  Fixing that, by doing an async preread
of the inode's block in ext2_new_inode(), didn't help either.
I suspect that is because my working set was so large that the VM
tossed out my preread before I got to use it.  A few more days
of poking are needed.



Oh.  I have a gripe concerning prune_icache().  The design
idea behind keventd is that it's a "process context bottom
half handler".  It's used for things like cardbus hotplug
interrupt handlers, handling tty hangups, etc.  It should
probably run SCHED_FIFO.

Using keventd to synchronously flush large amounts of 
data out to disk constitutes gross abuse - it's being blocked
from performing its designed duties for many seconds.  Can we
please not do that?  We already have kswapd, kupdate, bdflush,
which should be sufficient.
