linux-kernel.vger.kernel.org archive mirror
From: Andries.Brouwer@cwi.nl
To: Andries.Brouwer@cwi.nl, akpm@zip.com.au
Cc: linux-kernel@vger.kernel.org
Subject: Re: readahead
Date: Tue, 16 Apr 2002 21:10:59 +0200 (MEST)
Message-ID: <UTC200204161910.g3GJAx009370.aeb@smtp.cwi.nl>

    From: Andrew Morton <akpm@zip.com.au>

    > In the good old days we had tunable readahead.
    > Very good, especially for special purposes.

    readahead is tunable, but the window size is stored
    at the request queue layer.  If it has never been
    set, or if the device doesn't have a request queue,
    you get the defaults.

    Do these cards not have a request queue?

The kernel views them as SCSI disks.
So yes, I can do

   blockdev --setra 0 /dev/sdc

Unfortunately that does not help in the least.
Indeed, the only user of the readahead info is
readahead.c: get_max_readahead() and it does

        blk_ra_kbytes = blk_get_readahead(inode->i_dev) / 2;
        if (blk_ra_kbytes < VM_MIN_READAHEAD)
                blk_ra_kbytes = VM_MAX_READAHEAD;

We need to distinguish between undefined and explicitly zero.
Also, overriding the value explicitly given by the user
is a bad idea.
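
A minimal sketch of that distinction, written as a user-space model
and not as a patch against readahead.c. RA_UNSET, the value chosen
for VM_MAX_READAHEAD, and the function name are assumptions made for
illustration only. The point is that only a never-set window falls
back to the default, while an explicit 0 is passed through unchanged.

```c
#include <assert.h>

/* Hypothetical sentinel: the window was never configured. */
#define RA_UNSET (-1L)
/* Default window in kbytes; the value is assumed for this sketch. */
#define VM_MAX_READAHEAD 128L

/*
 * ra_sectors is either RA_UNSET or the number of 512-byte sectors
 * the user asked for (for example via blockdev --setra).
 * Returns the readahead window in kbytes.
 */
static long effective_readahead_kbytes(long ra_sectors)
{
        assert(ra_sectors >= RA_UNSET);

        if (ra_sectors == RA_UNSET)
                return VM_MAX_READAHEAD;  /* nothing set: use default */

        return ra_sectors / 2;            /* honour the user, even 0 */
}
```

With this shape, effective_readahead_kbytes(0) yields 0, so a
"--setra 0" really would turn readahead off.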
     
    > I recall the days where I tried to get something off
    > a bad SCSI disk, and the kernel would die in the retries
    > trying to read a bad block, while the data I needed was
    > not in the block but just before. Set readahead to zero
    > and all was fine.

    Yes, but things should be OK as-is.  If the readahead attempt
    gets an I/O error, do_generic_file_read will notice the non-uptodate
    page and will issue a single-page read.  So everything up to
    a page's distance from the bad block should be recoverable.
    That's the theory; can't say that I've tested it.

It is really important to be able to tell the kernel to read and
write only the blocks it has been asked to read and write and
not to touch anything else.

In my SCSI example you easily get well past "an I/O error": what
this driver would do is retry a few times, reset the device,
retry again, reset the SCSI bus, and then the kernel would crash
or hang forever. Maybe things are better today, but one does
not want to depend on complicated subsystems recovering
from their errors. There must simply not be any errors.

In my situation last night entirely different things played a role.
This card has a mapping from logical to physical blocks, but a
logical block only has a corresponding physical block when it has
been written at least once. So readahead will ask for blocks that
do not exist yet. (The driver that I put on ftp now recognizes this
situation and returns an all zero block, instead of an error.)
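
Roughly, the driver-side handling looks like the sketch below. This
is a user-space model, not the code from the driver on ftp: struct
card, CARD_UNMAPPED, CARD_BLOCK_SIZE and card_read_block() are
hypothetical names, and the real mapping table is certainly laid out
differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CARD_BLOCK_SIZE 512            /* assumed block size */
#define CARD_UNMAPPED   UINT32_MAX     /* hypothetical "never written" mark */

struct card {
        const uint32_t *log2phys;      /* logical -> physical block table */
        uint32_t nblocks;              /* number of logical blocks */
        const uint8_t *media;          /* physical blocks, back to back */
};

/*
 * Read one logical block into buf (CARD_BLOCK_SIZE bytes).
 * Returns 0 on success, -1 if lblock is out of range.
 * A logical block that was never written has no physical block;
 * it reads back as all zeroes instead of failing.
 */
static int card_read_block(const struct card *c, uint32_t lblock,
                           uint8_t *buf)
{
        uint32_t pblock;

        assert(c != NULL && buf != NULL);

        if (lblock >= c->nblocks)
                return -1;

        pblock = c->log2phys[lblock];
        if (pblock == CARD_UNMAPPED) {
                memset(buf, 0, CARD_BLOCK_SIZE);
                return 0;
        }

        memcpy(buf, c->media + (size_t)pblock * CARD_BLOCK_SIZE,
               CARD_BLOCK_SIZE);
        return 0;
}
```

This keeps readahead from producing errors on such a card, but it
does not stop the kernel from asking for blocks nobody requested.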

There are other situations where reading something has side effects.
A very common side effect is time delay.

So, for some devices I want to be able to kill read-ahead, even
before the kernel looks at the partition table.
Fortunately, I think that 2.5 will include the code that moves
partition table reading out of the kernel, so this will
really be possible.

    If the driver is actually dying over the bad block, well, foo.

    Yup.  Permitting a window size of zero is on my todo list,
    but it would require that the device have a request queue.

It has.

Andries
