From: Neil Brown <neilb@cse.unsw.edu.au>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: Christoph Hellwig <hch@ns.caldera.de>,
Ben LaHaise <bcrl@redhat.com>, Ingo Molnar <mingo@elte.hu>,
"Stephen C. Tweedie" <sct@redhat.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
Manfred Spraul <manfred@colorfullife.com>,
Steve Lord <lord@sgi.com>,
Linux Kernel List <linux-kernel@vger.kernel.org>,
kiobuf-io-devel@lists.sourceforge.net,
Ingo Molnar <mingo@redhat.com>
Subject: Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait
Date: Thu, 8 Feb 2001 11:34:46 +1100 (EST)
Message-ID: <14977.59814.532773.466631@notabene.cse.unsw.edu.au>
In-Reply-To: message from Linus Torvalds on Wednesday February 7
In-Reply-To: <20010207192622.A23859@caldera.de> <Pine.LNX.4.10.10102071032390.4623-100000@penguin.transmeta.com>
On Wednesday February 7, torvalds@transmeta.com wrote:
>
>
> On Wed, 7 Feb 2001, Christoph Hellwig wrote:
>
> > On Tue, Feb 06, 2001 at 12:59:02PM -0800, Linus Torvalds wrote:
> > >
> > > Actually, they really aren't.
> > >
> > > They kind of _used_ to be, but more and more they've moved away from that
> > > historical use. Check in particular the page cache, and as a really
> > > extreme case the swap cache version of the page cache.
> >
> > Yes. And that exactly why I think it's ugly to have the left-over
> > caching stuff in the same data sctruture as the IO buffer.
>
> I do agree.
>
> I would not be opposed to factoring out the "pure block IO" part from the
> bh struct. It should not even be very hard. You'd do something like
>
> 	struct block_io {
> 		.. here is the stuff needed for block IO ..
> 	};
>
> 	struct buffer_head {
> 		struct block_io io;
> 		.. here is the stuff needed for hashing etc ..
> 	}
>
> and then you make "generic_make_request()" and everything lower down take
> just the "struct block_io".
>
I was just thinking the same, or a similar thing.
I wanted to do

	struct io_head {
		stuff
	};
	struct buffer_head {
		struct io_head;
		more stuff;
	}

so that, as an unnamed substructure, the content of the struct io_head
would automagically be promoted to appear to be content of
buffer_head.
However I then remembered (when it didn't work) that unnamed
substructures are a feature of the Plan-9 C compiler, not the GNU
Compiler Collection. (Any gcc coders out there think this would be a
good thing to add?
http://plan9.bell-labs.com/sys/doc/compiler.html
)
Anyway, I produced the same result in a rather ugly way with #defines
and modified raid5 to use 32-byte io_head structures instead of the
80+ byte buffer_heads, and it ... doesn't quite work :-( It boots
fine, but raid5 dies and the Oops message is a few kilometers away.
Anyway, I think the concept is fine.
Patch is below for your inspection.
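
For anyone who wants to try the #define trick outside the kernel, here
is a minimal userspace sketch. It is an illustration under assumptions,
not the patch itself: the field list is cut down, and the names
IO_HEAD_FIELDS, io_head_layout_matches() and io_head_from_bh() are
hypothetical. It only shows that both structures start with the same
fields at the same offsets, so the small one can be treated as a prefix
of the large one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down field list; the real patch shares nine b_* fields. */
#define IO_HEAD_FIELDS \
	unsigned long b_state;    /* buffer state bitmap */ \
	unsigned short b_size;    /* block size in bytes */ \
	unsigned long b_rsector;  /* sector on the real device */ \
	char *b_data;             /* pointer to the data block */

/* The part a device driver needs. */
struct io_head {
	IO_HEAD_FIELDS
};

/* The same fields first, followed by cache-only bookkeeping. */
struct buffer_head {
	IO_HEAD_FIELDS
	unsigned long b_blocknr;  /* block number */
	int b_count;              /* users of this block */
};

/* Returns 1 when every shared field sits at the same offset in both structs. */
static int io_head_layout_matches(void)
{
	return offsetof(struct io_head, b_state) == offsetof(struct buffer_head, b_state)
	    && offsetof(struct io_head, b_size) == offsetof(struct buffer_head, b_size)
	    && offsetof(struct io_head, b_rsector) == offsetof(struct buffer_head, b_rsector)
	    && offsetof(struct io_head, b_data) == offsetof(struct buffer_head, b_data);
}

/*
 * Copy the shared prefix out of a buffer_head. memcpy is used instead of a
 * pointer cast so the sketch does not depend on -fno-strict-aliasing.
 */
static struct io_head io_head_from_bh(const struct buffer_head *bh)
{
	struct io_head io;

	assert(io_head_layout_matches());
	memcpy(&io, bh, sizeof(io));
	return io;
}
```

The kernel patch goes further than this sketch: raid5 allocates only
sizeof(struct io_head) bytes and still accesses them through a
struct buffer_head pointer, which is only safe as long as nothing
touches a field outside the shared prefix.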
It occurs to me that Stephen's desire to pass lots of requests through
make_request all at once isn't a bad idea and could be done by simply
linking the io_heads together with b_reqnext.
This would require:
1/ all callers of generic_make_request (there are 3) to initialise
b_reqnext
2/ all registered make_request_fn functions (there are again 3 I
think) to cope with following b_reqnext
It shouldn't be too hard to make the elevator code take advantage of
any ordering that it finds in the list.
I don't have a patch which does this.
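
To make the b_reqnext idea concrete, a rough userspace sketch of what
"following b_reqnext" might look like is given here. Everything in it is
an assumption for illustration: the cut-down struct io_head, the
one_request_fn type, and the submit_chain(), submit_one() and
count_sectors() helpers are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the io_head in the patch. */
struct io_head {
	struct io_head *b_reqnext;  /* next request in the batch, or NULL */
	unsigned long b_rsector;    /* start sector on the real device */
	unsigned short b_size;      /* size in bytes */
};

/* Hypothetical handler type: stands in for a make_request_fn taking one io_head. */
typedef int (*one_request_fn)(int rw, struct io_head *io, void *ctx);

/*
 * Walk a batch linked through b_reqnext and hand each entry to fn.
 * Returns the number of requests that fn accepted (returned 0 for).
 */
static int submit_chain(int rw, struct io_head *head, one_request_fn fn, void *ctx)
{
	int submitted = 0;

	assert(fn != NULL);
	while (head) {
		/* Save the link first: fn may reuse b_reqnext for its own queueing. */
		struct io_head *next = head->b_reqnext;

		head->b_reqnext = NULL;
		if (fn(rw, head, ctx) == 0)
			submitted++;
		head = next;
	}
	return submitted;
}

/* A caller with a single request must initialise b_reqnext (point 1/ above). */
static int submit_one(int rw, struct io_head *io, one_request_fn fn, void *ctx)
{
	io->b_reqnext = NULL;
	return submit_chain(rw, io, fn, ctx);
}

/* Example handler: totals the 512-byte sectors it is asked to transfer. */
static int count_sectors(int rw, struct io_head *io, void *ctx)
{
	unsigned long *total = ctx;

	(void)rw;
	*total += io->b_size >> 9;
	return 0;
}
```

A caller would link its io_heads in sector order and call
submit_chain() once. An elevator that wanted to exploit that ordering
would do so inside the handler, which this sketch does not attempt.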
NeilBrown
--- ./include/linux/fs.h 2001/02/07 22:45:37 1.1
+++ ./include/linux/fs.h 2001/02/07 23:09:05
@@ -207,6 +207,7 @@
#define BH_Protected 6 /* 1 if the buffer is protected */
/*
+ * THIS COMMENT NO-LONGER CORRECT.
* Try to keep the most commonly used fields in single cache lines (16
* bytes) to improve performance. This ordering should be
* particularly beneficial on 32-bit processors.
@@ -217,31 +218,43 @@
* The second 16 bytes we use for lru buffer scans, as used by
* sync_buffers() and refill_freelist(). -- sct
*/
+
+/*
+ * io_head is all that is needed by device drivers.
+ */
+#define io_head_fields \
+ unsigned long b_state; /* buffer state bitmap (see above) */ \
+ struct buffer_head *b_reqnext; /* request queue */ \
+ unsigned short b_size; /* block size */ \
+ kdev_t b_rdev; /* Real device */ \
+ unsigned long b_rsector; /* Real buffer location on disk */ \
+ char * b_data; /* pointer to data block (512 byte) */ \
+ void (*b_end_io)(struct buffer_head *bh, int uptodate); /* I/O completion */ \
+ void *b_private; /* reserved for b_end_io */ \
+ struct page *b_page; /* the page this bh is mapped to */ \
+ /* this line intensionally left blank */
+struct io_head {
+ io_head_fields
+};
+
+/* buffer_head adds all the stuff needed by the buffer cache */
struct buffer_head {
- /* First cache line: */
+ io_head_fields
+
struct buffer_head *b_next; /* Hash queue list */
unsigned long b_blocknr; /* block number */
- unsigned short b_size; /* block size */
unsigned short b_list; /* List that this buffer appears */
kdev_t b_dev; /* device (B_FREE = free) */
atomic_t b_count; /* users using this block */
- kdev_t b_rdev; /* Real device */
- unsigned long b_state; /* buffer state bitmap (see above) */
unsigned long b_flushtime; /* Time when (dirty) buffer should be written */
struct buffer_head *b_next_free;/* lru/free list linkage */
struct buffer_head *b_prev_free;/* doubly linked list of buffers */
struct buffer_head *b_this_page;/* circular list of buffers in one page */
- struct buffer_head *b_reqnext; /* request queue */
struct buffer_head **b_pprev; /* doubly linked list of hash-queue */
- char * b_data; /* pointer to data block (512 byte) */
- struct page *b_page; /* the page this bh is mapped to */
- void (*b_end_io)(struct buffer_head *bh, int uptodate); /* I/O completion */
- void *b_private; /* reserved for b_end_io */
- unsigned long b_rsector; /* Real buffer location on disk */
wait_queue_head_t b_wait;
struct inode * b_inode;
--- ./drivers/md/raid5.c 2001/02/06 05:43:31 1.2
+++ ./drivers/md/raid5.c 2001/02/07 23:15:36
@@ -151,18 +151,16 @@
for (i=0; i<num; i++) {
struct page *page;
- bh = kmalloc(sizeof(struct buffer_head), priority);
+ bh = kmalloc(sizeof(struct io_head), priority);
if (!bh)
return 1;
- memset(bh, 0, sizeof (struct buffer_head));
- init_waitqueue_head(&bh->b_wait);
+ memset(bh, 0, sizeof (struct io_head));
page = alloc_page(priority);
bh->b_data = page_address(page);
if (!bh->b_data) {
kfree(bh);
return 1;
}
- atomic_set(&bh->b_count, 0);
bh->b_page = page;
sh->bh_cache[i] = bh;
@@ -412,7 +410,7 @@
spin_lock_irqsave(&conf->device_lock, flags);
}
} else {
- md_error(mddev_to_kdev(conf->mddev), bh->b_dev);
+ md_error(mddev_to_kdev(conf->mddev), conf->disks[i].dev);
clear_bit(BH_Uptodate, &bh->b_state);
}
clear_bit(BH_Lock, &bh->b_state);
@@ -440,7 +438,7 @@
md_spin_lock_irqsave(&conf->device_lock, flags);
if (!uptodate)
- md_error(mddev_to_kdev(conf->mddev), bh->b_dev);
+ md_error(mddev_to_kdev(conf->mddev), conf->disks[i].dev);
clear_bit(BH_Lock, &bh->b_state);
set_bit(STRIPE_HANDLE, &sh->state);
__release_stripe(conf, sh);
@@ -456,12 +454,10 @@
unsigned long block = sh->sector / (sh->size >> 9);
init_buffer(bh, raid5_end_read_request, sh);
- bh->b_dev = conf->disks[i].dev;
bh->b_blocknr = block;
bh->b_state = (1 << BH_Req) | (1 << BH_Mapped);
bh->b_size = sh->size;
- bh->b_list = BUF_LOCKED;
return bh;
}
@@ -1085,15 +1081,14 @@
else
bh->b_end_io = raid5_end_write_request;
if (conf->disks[i].operational)
- bh->b_dev = conf->disks[i].dev;
+ bh->b_rdev = conf->disks[i].dev;
else if (conf->spare && action[i] == WRITE+1)
- bh->b_dev = conf->spare->dev;
+ bh->b_rdev = conf->spare->dev;
else skip=1;
if (!skip) {
PRINTK("for %ld schedule op %d on disc %d\n", sh->sector, action[i]-1, i);
atomic_inc(&sh->count);
- bh->b_rdev = bh->b_dev;
- bh->b_rsector = bh->b_blocknr * (bh->b_size>>9);
+ bh->b_rsector = sh->sector;
generic_make_request(action[i]-1, bh);
} else {
PRINTK("skip op %d on disc %d for sector %ld\n", action[i]-1, i, sh->sector);
@@ -1502,7 +1497,7 @@
}
memory = conf->max_nr_stripes * (sizeof(struct stripe_head) +
- conf->raid_disks * ((sizeof(struct buffer_head) + PAGE_SIZE))) / 1024;
+ conf->raid_disks * ((sizeof(struct io_head) + PAGE_SIZE))) / 1024;
if (grow_stripes(conf, conf->max_nr_stripes, GFP_KERNEL)) {
printk(KERN_ERR "raid5: couldn't allocate %dkB for buffers\n", memory);
shrink_stripes(conf, conf->max_nr_stripes);