linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions about queue handler's capabilities
@ 2002-06-06  0:04 Adam J. Richter
  2002-06-06  0:22 ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Adam J. Richter @ 2002-06-06  0:04 UTC (permalink / raw)
  To: akpm, axboe; +Cc: linux-kernel

Arg!  I forgot to attach the patch.  Here it is.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

--- linux-2.5.20/fs/bio.c	2002-06-02 18:44:40.000000000 -0700
+++ linux/fs/bio.c	2002-06-05 15:46:16.000000000 -0700
@@ -338,10 +338,12 @@
  **/
 void ll_rw_kio(int rw, struct kiobuf *kio, struct block_device *bdev, sector_t sector)
 {
-	int i, offset, size, err, map_i, total_nr_pages, nr_pages;
+	int offset, size, err, map_i, total_nr_pages, nr_pages;
 	struct bio_vec *bvec;
 	struct bio *bio;
 	kdev_t dev = to_kdev_t(bdev->bd_dev);
+	request_queue_t *q = bdev->bd_queue;
+	int max_bytes, max_pages;
 
 	err = 0;
 	if ((rw & WRITE) && is_read_only(dev)) {
@@ -355,6 +357,7 @@
 		goto out;
 	}
 
+	max_bytes = q->max_sectors << 9;
 	/*
 	 * maybe kio is bigger than the max we can easily map into a bio.
 	 * if so, split it up in appropriately sized chunks.
@@ -367,8 +370,14 @@
 
 	map_i = 0;
 
+	max_pages = (max_bytes + PAGE_MASK) >> PAGE_SHIFT;
+	if (max_pages > q->max_phys_segments)
+		max_pages = q->max_phys_segments;
+	if (max_pages > q->max_hw_segments)
+		max_pages = q->max_hw_segments;
+
 next_chunk:
-	nr_pages = BIO_MAX_SECTORS >> (PAGE_SHIFT - 9);
+	nr_pages = max_pages;
 	if (nr_pages > total_nr_pages)
 		nr_pages = total_nr_pages;
 
@@ -389,15 +398,17 @@
 	bio->bi_private = kio;
 
 	bvec = bio->bi_io_vec;
-	for (i = 0; i < nr_pages; i++, bvec++, map_i++) {
+	for (;total_nr_pages > 0; bvec++) {
 		int nbytes = PAGE_SIZE - offset;
 
 		if (nbytes > size)
 			nbytes = size;
+		if (nbytes > max_bytes)
+			nbytes = max_bytes;
 
 		BUG_ON(kio->maplist[map_i] == NULL);
 
-		if (bio->bi_size + nbytes > (BIO_MAX_SECTORS << 9))
+		if (bio->bi_size + nbytes > max_bytes)
 			goto queue_io;
 
 		bio->bi_vcnt++;
@@ -407,14 +418,14 @@
 		bvec->bv_len = nbytes;
 		bvec->bv_offset = offset;
 
-		/*
-		 * kiobuf only has an offset into the first page
-		 */
-		offset = 0;
+		offset = (offset + nbytes) & PAGE_MASK;
+		if (offset == 0) {
+			total_nr_pages--;
+			map_i++;
+		}
 
 		sector += nbytes >> 9;
 		size -= nbytes;
-		total_nr_pages--;
 		kio->offset += nbytes;
 	}
 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions  about queue handler's capabilities
  2002-06-06  0:04 Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions about queue handler's capabilities Adam J. Richter
@ 2002-06-06  0:22 ` Andrew Morton
  0 siblings, 0 replies; 4+ messages in thread
From: Andrew Morton @ 2002-06-06  0:22 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: axboe, linux-kernel

"Adam J. Richter" wrote:
> 
> ..
> +       max_bytes = q->max_sectors << 9;

Does this not also need to be done in fs/mpage.c?  It's
just using BIO_MAX_SIZE.

What particular problem are you trying to solve here?

>         /*
>          * maybe kio is bigger than the max we can easily map into a bio.
>          * if so, split it up in appropriately sized chunks.
> @@ -367,8 +370,14 @@
> 
>         map_i = 0;
> 
> +       max_pages = (max_bytes + PAGE_MASK) >> PAGE_SHIFT;
> +       if (max_pages > q->max_phys_segments)
> +               max_pages = q->max_phys_segments;
> +       if (max_pages > q->max_hw_segments)
> +               max_pages = q->max_hw_segments;
> +

I think probably this should be implemented as a block API
function.

This is going to drag us back into the BIO splitting quagmire.

>  next_chunk:
> -       nr_pages = BIO_MAX_SECTORS >> (PAGE_SHIFT - 9);
> +       nr_pages = max_pages;

hmm.  So BIO is based on PAGE_SIZE pages.  Not PAGE_CACHE_SIZE.
I currently have:

                unsigned nr_bvecs = MPAGE_BIO_MAX_SIZE / PAGE_CACHE_SIZE;

Which is about the only sane way in which the pagecache BIO
assembly code can go from "bytes" to "number of pages".
It's going to get interesting if someone makes PAGE_SIZE != PAGE_CACHE_SIZE.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions about queue handler's capabilities
@ 2002-06-06  1:02 Adam J. Richter
  0 siblings, 0 replies; 4+ messages in thread
From: Adam J. Richter @ 2002-06-06  1:02 UTC (permalink / raw)
  To: akpm; +Cc: axboe, linux-kernel

Andrew Morton <akpm@zip.com.au> writes:
>What particular problem are you trying to solve here?

	In 2.5.20, ll_rw_kio could submit bio's that are too
big for the underlying queue to handle (they might have more
sectors than q->max_sectors, more iovec's than q->max_phys_segments
or q->max_hw_segments).

	As a hypothetical example, suppose that ll_rw_kio is
called to copy 148kB of data (37 4kB pages on x86) to a Compaq
Smart2 raid controller.  Assume also that the pages are not
contiguous in physical memory and there is no iommu to make them
look contiguous to a DMA device, so there will be no merging.

	In this example, ll_rw_kio in linux-2.5.20 would handle this
call by building a single bio with 37 iovec's and pass it in a single
call to submit_bio.  Nothing in the request merging code in
drivers/block/ll_rw_blk.c will break up this bio.  blk_recount_segments
will fill in bio->bi_phys_segments and bio->bi_hw_segments as 37
(because there is no merging to make it any smaller).
blk_recalc_rq_segments will then set rq->nr_phys_segments to 37
(the request will contain only this one bio, becasue bio->bi_phys_segments
already excceds q->max_phys_segments).

	Eventually, do_cciss_request (in drivers/block/cciss.c) will
extract the request containing the bio with bio->bi_phys_segments == 37.
request and generate a kernel panic because the number of segements
exceeds 32.  In the case of linux-2.5.20/drivers/block/cciss.c, this
happens around line 1799:

        creq = elv_next_request(q);
        if (creq->nr_phys_segments > MAXSGENTRIES)
                BUG();


Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions about queue handler's capabilities
@ 2002-06-05 23:02 Adam J. Richter
  0 siblings, 0 replies; 4+ messages in thread
From: Adam J. Richter @ 2002-06-05 23:02 UTC (permalink / raw)
  To: axboe; +Cc: linux-kernel

	I have not tested this patch, and I am not sure how to.
Perhaps you or someone else can suggest a program that exercises
ll_rw_kio.

	It seems to me that ll_rw_kio in linux-2.5.20 can generate
bio's that have more segments or more sectors than their queues declare
that they can handle.  For example, you could have block driver
that has max_hw_segments == 1, max_phys_segments == 1, max_sectors == 1.

	The intermediate bio merging code in drivers/block/ll_rw_blk.c
obeys these constraints when it comes to merging requests that
also are all within these constraints already, but it does break
up bio's that were to big for the underlying queue at the time they
were submitted.  As far as I can tell, it is the responsibility of
anything that calls submit_bio or generic_make_request to ensure that the
bio that is submitted is within the request_queue_t's limits
if the driver that pulls requests off of the queue is going to
be able to use max_hw_segments, max_phys_segments and max_sectors
as guarantees.

	I have attached an improved version of the patch that I
posted to linux-kernel yesterday.  There have been no comments
about that posting.  This new version also handles the case
where the queue's maximum number of sectors is too little to
hold even one full page.  I am cc'ing this version to linux-kernel
as well, in case anyone wants to comment.

	Jens, does this patch look OK to you?  If it does look OK,
is there anything else that I should do (e.g., is there some test I should
run?), or are do you want to take it from here and handle submitting
it to Linus?

-- 
Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2002-06-06  1:02 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-06-06  0:04 Patch: linux-2.5.20/fs/bio.c - ll_rw_kio made incorrect assumptions about queue handler's capabilities Adam J. Richter
2002-06-06  0:22 ` Andrew Morton
  -- strict thread matches above, loose matches on Subject: below --
2002-06-06  1:02 Adam J. Richter
2002-06-05 23:02 Adam J. Richter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).