* [RFC] block layer support for DMA IOMMU bypass mode
@ 2003-07-01 16:46 James Bottomley
  2003-07-01 17:09 ` Andi Kleen
                   ` (3 more replies)
  0 siblings, 4 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-01 16:46 UTC (permalink / raw)
  To: Jens Axboe, Grant Grundler, ak, davem, suparna
  Cc: Linux Kernel, Alex Williamson, Bjorn Helgaas

[-- Attachment #1: Type: text/plain, Size: 2014 bytes --]

Background:

Quite a few servers on the market today include an IOMMU to try to
patch up bus-to-memory accessibility issues.  However, a fair number
come with the caveat that actually using the IOMMU is expensive.

Some IOMMUs come with a "bypass" mode, where the IOMMU won't try to
translate the physical address coming from the device but will instead
place it directly on the memory bus. For some machines (ia-64, and
possibly x86_64) any address not programmed into the IOMMU for
translation is viewed as a bypass.  For others (parisc SBA) you have to
assert specific address bus bits to get the bypass.

All IOMMUs supporting the bypass mode will allow it to be used
selectively, so a single DMA transfer's SG list may include both bypass
and mapped segments.

The Problem:

At the moment, the block layer assumes segments may be virtually
mergeable (i.e. two physically discontiguous pages may be treated as a
single SG entity for DMA because the IOMMU will patch up the
discontinuity) if an IOMMU is present in the system.  This effectively
stymies using bypass mode, because segments may not be virtually merged
in a bypass operation.

The Solution:

Is to teach the block layer not to virtually merge segments if either
segment may be bypassed.  For that, the block layer has to know what
the physical dma mask is (not the bounce limit, which is different) and
it must also know the address bits that must be asserted in bypass
mode.  To that end, I've introduced a new #define for asm/io.h

BIO_VMERGE_BYPASS_MASK

which is set either to the physical bits that have to be asserted, or
simply to a value (like 0x1) that will always pass the device's
dma_mask.
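
For illustration only, an architecture's asm/io.h definition might look
like one of these (both values are hypothetical sketches, not taken
from an actual port):

	/*
	 * sketch of a parisc-style setting: the device must assert
	 * a specific high address bit to get the bypass
	 */
	#define BIO_VMERGE_BYPASS_MASK	(1ULL << 63)

or:

	/*
	 * sketch of an ia64-style setting: anything the device can
	 * address directly may be bypassed, so any value that always
	 * passes the device's dma_mask will do
	 */
	#define BIO_VMERGE_BYPASS_MASK	0x1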

I've also introduced a new block layer callback

blk_queue_dma_mask(q, dma_mask)

whose job is to set the physical dma_mask of the queue (it defaults to
0xffffffff).

You can see how this works in the attached patch (block layer only; the
DMA engines of platforms wishing to take advantage of bypassing would
also have to be altered).

Comments?

James


[-- Attachment #2: tmp.diff --]
[-- Type: text/plain, Size: 7368 bytes --]

===== drivers/block/DAC960.c 1.60 vs edited =====
--- 1.60/drivers/block/DAC960.c	Fri Jun  6 01:37:51 2003
+++ edited/drivers/block/DAC960.c	Tue Jul  1 10:51:21 2003
@@ -2475,6 +2475,8 @@
   RequestQueue = &Controller->RequestQueue;
   blk_init_queue(RequestQueue, DAC960_RequestFunction, &Controller->queue_lock);
   blk_queue_bounce_limit(RequestQueue, Controller->BounceBufferLimit);
+  blk_queue_dma_mask(RequestQueue, Controller->BounceBufferLimit);
+
   RequestQueue->queuedata = Controller;
   blk_queue_max_hw_segments(RequestQueue,
 			    Controller->DriverScatterGatherLimit);
===== drivers/block/cciss.c 1.82 vs edited =====
--- 1.82/drivers/block/cciss.c	Thu Jun  5 08:17:28 2003
+++ edited/drivers/block/cciss.c	Tue Jul  1 10:49:56 2003
@@ -2541,6 +2541,7 @@
 	spin_lock_init(&hba[i]->lock);
         blk_init_queue(q, do_cciss_request, &hba[i]->lock);
 	blk_queue_bounce_limit(q, hba[i]->pdev->dma_mask);
+	blk_queue_dma_mask(q, hba[i]->pdev->dma_mask);
 
 	/* This is a hardware imposed limit. */
 	blk_queue_max_hw_segments(q, MAXSGENTRIES);
===== drivers/block/cpqarray.c 1.77 vs edited =====
--- 1.77/drivers/block/cpqarray.c	Wed Jun 11 01:33:24 2003
+++ edited/drivers/block/cpqarray.c	Tue Jul  1 10:51:52 2003
@@ -389,6 +389,7 @@
 		spin_lock_init(&hba[i]->lock);
 		blk_init_queue(q, do_ida_request, &hba[i]->lock);
 		blk_queue_bounce_limit(q, hba[i]->pci_dev->dma_mask);
+		blk_queue_dma_mask(q, hba[i]->pci_dev->dma_mask);
 
 		/* This is a hardware imposed limit. */
 		blk_queue_max_hw_segments(q, SG_MAX);
===== drivers/block/ll_rw_blk.c 1.174 vs edited =====
--- 1.174/drivers/block/ll_rw_blk.c	Mon Jun  2 20:32:46 2003
+++ edited/drivers/block/ll_rw_blk.c	Tue Jul  1 10:39:46 2003
@@ -213,12 +213,32 @@
 	 * by default assume old behaviour and bounce for any highmem page
 	 */
 	blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
+	/*
+	 * and assume a 32 bit dma mask 
+	 */
+	blk_queue_dma_mask(q, 0xffffffff);
 
 	init_waitqueue_head(&q->queue_wait);
 	INIT_LIST_HEAD(&q->plug_list);
 }
 
 /**
+ * blk_queue_dma_mask - set queue dma mask
+ * @q:	the request queue for the device
+ * @dma_mask:	the physical dma mask for the device
+ *
+ * Description:
+ *    This will set the device physical DMA mask.  This is used by
+ *    the bio layer to arrange the segments correctly for IOMMUs that
+ *    can be programmed in bypass mode.  Note: setting this does *not*
+ *    change whether the device goes through an IOMMU or not
+ **/
+void blk_queue_dma_mask(request_queue_t *q, u64 dma_mask)
+{
+	q->dma_mask = dma_mask;
+}
+
+/**
  * blk_queue_bounce_limit - set bounce buffer limit for queue
  * @q:  the request queue for the device
  * @dma_addr:   bus address limit
@@ -746,7 +766,7 @@
 			continue;
 		}
 new_segment:
-		if (!bvprv || !BIOVEC_VIRT_MERGEABLE(bvprv, bv))
+		if (!bvprv || !BIOVEC_VIRT_MERGEABLE(q, bvprv, bv))
 			nr_hw_segs++;
 
 		nr_phys_segs++;
@@ -787,7 +807,7 @@
 	if (!(q->queue_flags & (1 << QUEUE_FLAG_CLUSTER)))
 		return 0;
 
-	if (!BIOVEC_VIRT_MERGEABLE(__BVEC_END(bio), __BVEC_START(nxt)))
+	if (!BIOVEC_VIRT_MERGEABLE(q, __BVEC_END(bio), __BVEC_START(nxt)))
 		return 0;
 	if (bio->bi_size + nxt->bi_size > q->max_segment_size)
 		return 0;
@@ -909,7 +929,7 @@
 		return 0;
 	}
 
-	if (BIOVEC_VIRT_MERGEABLE(__BVEC_END(req->biotail), __BVEC_START(bio)))
+	if (BIOVEC_VIRT_MERGEABLE(q, __BVEC_END(req->biotail), __BVEC_START(bio)))
 		return ll_new_mergeable(q, req, bio);
 
 	return ll_new_hw_segment(q, req, bio);
@@ -924,7 +944,7 @@
 		return 0;
 	}
 
-	if (BIOVEC_VIRT_MERGEABLE(__BVEC_END(bio), __BVEC_START(req->bio)))
+	if (BIOVEC_VIRT_MERGEABLE(q, __BVEC_END(bio), __BVEC_START(req->bio)))
 		return ll_new_mergeable(q, req, bio);
 
 	return ll_new_hw_segment(q, req, bio);
===== drivers/ide/ide-lib.c 1.8 vs edited =====
--- 1.8/drivers/ide/ide-lib.c	Thu Mar  6 17:27:52 2003
+++ edited/drivers/ide/ide-lib.c	Tue Jul  1 10:49:14 2003
@@ -406,6 +406,9 @@
 			addr = HWIF(drive)->pci_dev->dma_mask;
 	}
 
+	if((HWIF(drive)->pci_dev))
+		blk_queue_dma_mask(&drive->queue,
+				   HWIF(drive)->pci_dev->dma_mask);
 	blk_queue_bounce_limit(&drive->queue, addr);
 }
 
===== drivers/scsi/scsi_lib.c 1.99 vs edited =====
--- 1.99/drivers/scsi/scsi_lib.c	Sun Jun 29 20:14:44 2003
+++ edited/drivers/scsi/scsi_lib.c	Tue Jul  1 10:54:19 2003
@@ -1256,6 +1256,8 @@
 	blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
 	blk_queue_max_sectors(q, shost->max_sectors);
 	blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost));
+	if(scsi_get_device(shost) && scsi_get_device(shost)->dma_mask)
+		blk_queue_dma_mask(q, *scsi_get_device(shost)->dma_mask);
 
 	if (!shost->use_clustering)
 		clear_bit(QUEUE_FLAG_CLUSTER, &q->queue_flags);
===== include/linux/bio.h 1.32 vs edited =====
--- 1.32/include/linux/bio.h	Tue Jun 10 07:54:25 2003
+++ edited/include/linux/bio.h	Tue Jul  1 11:20:26 2003
@@ -30,6 +30,11 @@
 #define BIO_VMERGE_BOUNDARY	0
 #endif
 
+/* Can the IOMMU (if any) work in bypass mode */
+#ifndef BIO_VMERGE_BYPASS_MASK
+#define BIO_VMERGE_BYPASS_MASK	0
+#endif
+
 #define BIO_DEBUG
 
 #ifdef BIO_DEBUG
@@ -156,6 +161,13 @@
 
 #define __bio_kunmap_atomic(addr, kmtype) kunmap_atomic(addr, kmtype)
 
+/* can the bio vec be directly physically addressed by the device */
+#define __BVEC_PHYS_DIRECT_OK(q, vec) \
+	((bvec_to_phys(vec) & (q)->dma_mask) == bvec_to_phys(vec))
+/* Is the queue dma_mask eligible to be bypassed */
+#define __BIO_CAN_BYPASS(q) \
+	((BIO_VMERGE_BYPASS_MASK) && ((q)->dma_mask & (BIO_VMERGE_BYPASS_MASK)) == (BIO_VMERGE_BYPASS_MASK))
+
 /*
  * merge helpers etc
  */
@@ -164,8 +176,10 @@
 #define __BVEC_START(bio)	bio_iovec_idx((bio), 0)
 #define BIOVEC_PHYS_MERGEABLE(vec1, vec2)	\
 	((bvec_to_phys((vec1)) + (vec1)->bv_len) == bvec_to_phys((vec2)))
-#define BIOVEC_VIRT_MERGEABLE(vec1, vec2)	\
-	((((bvec_to_phys((vec1)) + (vec1)->bv_len) | bvec_to_phys((vec2))) & (BIO_VMERGE_BOUNDARY - 1)) == 0)
+#define BIOVEC_VIRT_MERGEABLE(q, vec1, vec2)	\
+	(((((bvec_to_phys((vec1)) + (vec1)->bv_len) | bvec_to_phys((vec2))) & (BIO_VMERGE_BOUNDARY - 1)) == 0) \
+	&& !( __BIO_CAN_BYPASS(q) && (__BVEC_PHYS_DIRECT_OK(q, vec1) \
+				      || __BVEC_PHYS_DIRECT_OK(q, vec2))))
 #define __BIO_SEG_BOUNDARY(addr1, addr2, mask) \
 	(((addr1) | (mask)) == (((addr2) - 1) | (mask)))
 #define BIOVEC_SEG_BOUNDARY(q, b1, b2) \
===== include/linux/blkdev.h 1.109 vs edited =====
--- 1.109/include/linux/blkdev.h	Wed Jun 11 20:17:55 2003
+++ edited/include/linux/blkdev.h	Tue Jul  1 10:39:18 2003
@@ -255,6 +255,12 @@
 	unsigned long		bounce_pfn;
 	int			bounce_gfp;
 
+	/*
+	 * The physical dma_mask for the queue (used to make IOMMU
+	 * bypass decisions)
+	 */
+	u64			dma_mask;
+
 	struct list_head	plug_list;
 
 	/*
@@ -458,6 +464,7 @@
 extern void blk_cleanup_queue(request_queue_t *);
 extern void blk_queue_make_request(request_queue_t *, make_request_fn *);
 extern void blk_queue_bounce_limit(request_queue_t *, u64);
+extern void blk_queue_dma_mask(request_queue_t *, u64);
 extern void blk_queue_max_sectors(request_queue_t *, unsigned short);
 extern void blk_queue_max_phys_segments(request_queue_t *, unsigned short);
 extern void blk_queue_max_hw_segments(request_queue_t *, unsigned short);


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 16:46 [RFC] block layer support for DMA IOMMU bypass mode James Bottomley
@ 2003-07-01 17:09 ` Andi Kleen
  2003-07-01 17:28   ` James Bottomley
  2003-07-01 19:19 ` Grant Grundler
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2003-07-01 17:09 UTC (permalink / raw)
  To: James Bottomley
  Cc: axboe, grundler, davem, suparna, linux-kernel, alex_williamson,
	bjorn_helgaas

On 01 Jul 2003 11:46:12 -0500
James Bottomley <James.Bottomley@steeleye.com> wrote:

> 
> Some IOMMUs come with a "bypass" mode, where the IOMMU won't try to
> translate the physical address coming from the device but will instead
> place it directly on the memory bus. For some machines (ia-64, and
> possibly x86_64) any address not programmed into the IOMMU for

That's the case on x86_64 yes.


> The Problem:
> 
> At the moment, the block layer assumes segments may be virtually
> mergeable (i.e. two physically discontiguous pages may be treated as a
> single SG entity for DMA because the IOMMU will patch up the
> discontinuity) if an IOMMU is present in the system.  This effectively
> stymies using bypass mode, because segments may not be virtually merged
> in a bypass operation.

I assume only 2.5 has this problem, not 2.4, right?

> 
> The Solution:
> 
> Is to teach the block layer not to virtually merge segments if either
> segment may be bypassed.  For that, the block layer has to know what
> the physical dma mask is (not the bounce limit, which is different) and
> it must also know the address bits that must be asserted in bypass
> mode.  To that end, I've introduced a new #define for asm/io.h
> 
> BIO_VMERGE_BYPASS_MASK

But a mask is not good for AMD64 because there is no guarantee 
that the bypass/iommu address is checkable using a mask
(K8 uses a memory hole for IOMMU purposes, and for various
reasons the hole can be anywhere in the address space).

This means x86_64 needs a function.  Also the name is quite weird and
the issue is not really BIO specific.  How about just calling it
iommu_address()?


-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 17:09 ` Andi Kleen
@ 2003-07-01 17:28   ` James Bottomley
  2003-07-01 17:42     ` Andi Kleen
  2003-07-01 17:54     ` H. Peter Anvin
  0 siblings, 2 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-01 17:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jens Axboe, Grant Grundler, davem, suparna, Linux Kernel,
	alex_williamson, bjorn_helgaas

On Tue, 2003-07-01 at 12:09, Andi Kleen wrote:
> I assume only 2.5 has this problem, not 2.4, right?

Yes, sorry, I'm so focussed on 2.5 I keep forgetting 2.4.

> But a mask is not good for AMD64 because there is no guarantee 
> that the bypass/iommu address is checkable using a mask
> (K8 uses a memory hole for IOMMU purposes, and for various
> reasons the hole can be anywhere in the address space).
> 
> This means x86_64 needs a function.  Also the name is quite weird and
> the issue is not really BIO specific.  How about just calling it
> iommu_address()?

The name was simply to be consistent with BIO_VMERGE_BOUNDARY which is
another asm/io.h setting for this.

Could you elaborate more on the amd64 IOMMU window?  Is this a window
where IOMMU mapping always takes place?

I'm a bit reluctant to put a function like this in because the block
layer does a very good job of being separate from the dma layer. 
Maintaining this separation is one of the reasons I added a dma_mask to
the request_queue, not a generic device pointer.

James




* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 17:28   ` James Bottomley
@ 2003-07-01 17:42     ` Andi Kleen
  2003-07-01 19:22       ` Grant Grundler
  2003-07-01 19:56       ` James Bottomley
  2003-07-01 17:54     ` H. Peter Anvin
  1 sibling, 2 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-01 17:42 UTC (permalink / raw)
  To: James Bottomley
  Cc: axboe, grundler, davem, suparna, linux-kernel, alex_williamson,
	bjorn_helgaas

On 01 Jul 2003 12:28:47 -0500
James Bottomley <James.Bottomley@steeleye.com> wrote:


> Could you elaborate more on the amd64 IOMMU window.  Is this a window
> where IOMMU mapping always takes place?

Yes.

K8 doesn't have a real IOMMU. Instead it extended the AGP aperture to work
for PCI devices too.  The AGP aperture is a hole in memory configured 
at boot, normally mapped directly below 4GB, but it can be elsewhere
(it's actually a BIOS option on machines without an AGP chip, and when 
the BIOS option is off Linux allocates some memory and puts the hole
on top of it; this allocated hole can be anywhere in the first 4GB).
Inside the AGP aperture memory is always remapped; you get a bus abort
when you access an area in there that is not mapped.

In short, detecting it requires testing against an address range;
a mask is not enough.

> 
> I'm a bit reluctant to put a function like this in because the block
> layer does a very good job of being separate from the dma layer. 
> Maintaining this separation is one of the reasons I added a dma_mask to
> the request_queue, not a generic device pointer.

Not sure I understand why you want to do this in the block layer.
It's a generic extension of the PCI DMA API. The block devices/layer itself
has no business knowing such intimate details about the pci dma 
implementation; it should just ask.

-Andi



* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 17:28   ` James Bottomley
  2003-07-01 17:42     ` Andi Kleen
@ 2003-07-01 17:54     ` H. Peter Anvin
  1 sibling, 0 replies; 52+ messages in thread
From: H. Peter Anvin @ 2003-07-01 17:54 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <1057080529.2003.62.camel@mulgrave>
By author:    James Bottomley <James.Bottomley@steeleye.com>
In newsgroup: linux.dev.kernel
> 
> The name was simply to be consistent with BIO_VMERGE_BOUNDARY which is
> another asm/io.h setting for this.
> 
> Could you elaborate more on the amd64 IOMMU window?  Is this a window
> where IOMMU mapping always takes place?
> 

It's a window (in the form of a BAR - base and mask) within which
IOMMU mapping always takes place.  Outside the window, everything is
bypassed.

This applies to all x86-64 machines and some i386 machines, in
particular those i386 chipsets with "full GART" support as opposed to
"AGP only GART" (my terminology.)

Andi likes to say this isn't a real IOMMU (mostly because it doesn't
solve the legacy region problem), but I disagree with that view.  It
still would be nicer if it covered more address space, though.

I don't know if it would be worthwhile to support "full GART" on the
i386 systems which support it.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 16:46 [RFC] block layer support for DMA IOMMU bypass mode James Bottomley
  2003-07-01 17:09 ` Andi Kleen
@ 2003-07-01 19:19 ` Grant Grundler
  2003-07-01 19:59   ` Alex Williamson
  2003-07-01 20:03   ` James Bottomley
  2003-07-01 22:51 ` David S. Miller
  2003-07-01 23:57 ` [RFC] block layer support for DMA IOMMU bypass mode II Andi Kleen
  3 siblings, 2 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-01 19:19 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, ak, davem, suparna, Linux Kernel, Alex Williamson,
	Bjorn Helgaas

On Tue, Jul 01, 2003 at 11:46:12AM -0500, James Bottomley wrote:
...
> However, a fair number come with
> the caveat that actually using the IOMMU is expensive.

Clarification:
IOMMU mapping is slightly more expensive than direct physical on HP boxes.
(yes davem, you've told me how wonderful sparc IOMMU is ;^)
But obviously a lot less expensive than bounce buffers.

> The Problem:
> 
> At the moment, the block layer assumes segments may be virtually
> mergeable (i.e. two physically discontiguous pages may be treated as a
> single SG entity for DMA because the IOMMU will patch up the
> discontinuity) if an IOMMU is present in the system.

The symptom is that drivers which have limits on DMA entries will return
errors (or crash) when the IOMMU code doesn't actually merge as much
as the BIO code expected.

Specifically, sym53c8xx_2 only takes 96 DMA entries per IO and davidm
hit that pretty easily on ia64.
MPT/Fusion (LSI u32) doesn't seem to have a limit.
IDE limit is PAGE_SIZE/8 (or 16k/8=2k for ia64).
I haven't checked other drivers.

...
> +/* Is the queue dma_mask eligible to be bypassed */
> +#define __BIO_CAN_BYPASS(q) \
> +	((BIO_VMERGE_BYPASS_MASK) && ((q)->dma_mask & (BIO_VMERGE_BYPASS_MASK)) == (BIO_VMERGE_BYPASS_MASK))

Like Andi, I had suggested a callback into IOMMU code here.
But I'm pretty sure James's proposal will work for ia64 and parisc.

Ideally, I don't like to see two separate chunks of code performing
the "let's see what I can merge now" loops. Instead, BIO could merge
"intra-page" segments and call the IOMMU code to "merge" remaining
"inter-page" segments. IOMMU code needs to know how many physical entries
are allowed (when to stop processing) and could return the number of
sg list entries it was able to merge.
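
Purely to illustrate that suggestion, such a hook might look roughly
like this (name and signature invented here, not proposed code):

	/*
	 * hypothetical sketch: BIO hands the platform IOMMU code an
	 * sg list plus the device's limit on physical entries; the
	 * IOMMU code reports how many sg entries it could fold into
	 * that many DMA mappings
	 */
	int iommu_sg_merge_estimate(struct scatterlist *sg, int nents,
				    int max_phys_entries);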

thanks james!
grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 17:42     ` Andi Kleen
@ 2003-07-01 19:22       ` Grant Grundler
  2003-07-01 19:56       ` James Bottomley
  1 sibling, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-01 19:22 UTC (permalink / raw)
  To: Andi Kleen
  Cc: James Bottomley, axboe, grundler, davem, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Tue, Jul 01, 2003 at 07:42:41PM +0200, Andi Kleen wrote:
> K8 doesn't have a real IOMMU. Instead it extended the AGP aperture to work
> for PCI devices too.

*gag*...sounds like exactly the opposite of what HP ZX1 workstations do.
They used part of the SBA IOMMU for AGP GART.

thanks,
grant



* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 17:42     ` Andi Kleen
  2003-07-01 19:22       ` Grant Grundler
@ 2003-07-01 19:56       ` James Bottomley
  1 sibling, 0 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-01 19:56 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jens Axboe, Grant Grundler, davem, suparna, Linux Kernel,
	alex_williamson, bjorn_helgaas

On Tue, 2003-07-01 at 12:42, Andi Kleen wrote:
> K8 doesn't have a real IOMMU. Instead it extended the AGP aperture to work
> for PCI devices too.  The AGP aperture is a hole in memory configured 
> at boot, normally mapped directly below 4GB, but it can be elsewhere
> (it's actually a BIOS option on machines without an AGP chip, and when 
> the BIOS option is off Linux allocates some memory and puts the hole
> on top of it; this allocated hole can be anywhere in the first 4GB).
> Inside the AGP aperture memory is always remapped; you get a bus abort
> when you access an area in there that is not mapped.
> 
> In short, detecting it requires testing against an address range;
> a mask is not enough.

It sounds like basically anything not physically in the window is
bypassable, so you just set BIO_VMERGE_BYPASS_MASK to 1.  Thus, any
segment within the device's dma_mask gets bypassed, and anything that's
not has to be remapped within the window.
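
In asm/io.h terms that would just be (a sketch):

	/*
	 * sketch: K8 has no specific bypass bits; any segment the
	 * device's dma_mask already covers may bypass the aperture
	 */
	#define BIO_VMERGE_BYPASS_MASK	0x1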

I don't see where you need to put extra information into the virtual
merging process.

> > I'm a bit reluctant to put a function like this in because the block
> > layer does a very good job of being separate from the dma layer. 
> > Maintaining this separation is one of the reasons I added a dma_mask to
> > the request_queue, not a generic device pointer.
> 
> Not sure I understand why you want to do this in the block layer.
> It's a generic extension of the PCI DMA API. The block devices/layer itself
> has no business knowing such intimate details about the pci dma 
> implementation; it should just ask.

Virtual merging is already part of the block layer.  It actually
interferes with the ability to bypass the IOMMU because you can't merge
virtually if you want to do a bypass.

James




* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 19:19 ` Grant Grundler
@ 2003-07-01 19:59   ` Alex Williamson
  2003-07-01 20:11     ` James Bottomley
  2003-07-01 20:03   ` James Bottomley
  1 sibling, 1 reply; 52+ messages in thread
From: Alex Williamson @ 2003-07-01 19:59 UTC (permalink / raw)
  To: Grant Grundler
  Cc: James Bottomley, Jens Axboe, ak, davem, suparna, Linux Kernel,
	Bjorn Helgaas

Grant Grundler wrote:
> 
> But I'm pretty sure james proposal will work for ia64 and parisc.
> 

   The thing that's got me concerned about this is that it allows
for sg lists that contain both entries that the block layer
expects will be mapped into the iommu and ones that it expects
to bypass.  I don't like the implications of parsing through
sg lists looking for bypass-able and non-bypass-able groupings.
This seems like a lot more overhead than we have now and the
complexity of merging partially bypass-able scatterlists seems
time consuming.

   The current ia64 sba_iommu does a quick and dirty sg bypass
check.  If the device can dma to any memory address, the entire
sg list is bypassed.  If not, the entire list is coalesced and
mapped by the iommu.  The idea being that true performance
devices will have 64bit dma masks and be able to quickly bypass.
Everything else will at least get the benefit of coalescing
entries to make more efficient dma.  The coalescing is a bit
simpler since it's the entire list as well.  With this proposal,
we'd have to add a lot of complexity to partially bypass sg
lists.  I don't necessarily see that as a benefit.  Thanks,

	Alex

-- 
Alex Williamson                             HP Linux & Open Source Lab


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 19:19 ` Grant Grundler
  2003-07-01 19:59   ` Alex Williamson
@ 2003-07-01 20:03   ` James Bottomley
  2003-07-01 23:01     ` Grant Grundler
  1 sibling, 1 reply; 52+ messages in thread
From: James Bottomley @ 2003-07-01 20:03 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Jens Axboe, ak, davem, suparna, Linux Kernel, Alex Williamson,
	Bjorn Helgaas

On Tue, 2003-07-01 at 14:19, Grant Grundler wrote:
> On Tue, Jul 01, 2003 at 11:46:12AM -0500, James Bottomley wrote:
> > +/* Is the queue dma_mask eligible to be bypassed */
> > +#define __BIO_CAN_BYPASS(q) \
> > +	((BIO_VMERGE_BYPASS_MASK) && ((q)->dma_mask & (BIO_VMERGE_BYPASS_MASK)) == (BIO_VMERGE_BYPASS_MASK))
> 
> Like Andi, I had suggested a callback into IOMMU code here.
> But I'm pretty sure james proposal will work for ia64 and parisc.

OK, the core of my objection to this is that at the moment there's no
entangling of the bio layer and the DMA layer.  The bio layer works with
a nice finite list of generic or per-queue constraints; it doesn't care
currently what the underlying device or IOMMU does.  Putting such a
callback in would add this entanglement.

It could be that the bio people will be OK with this, and I'm just
worrying about nothing, but in that case, they need to say so...

James




* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 19:59   ` Alex Williamson
@ 2003-07-01 20:11     ` James Bottomley
  0 siblings, 0 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-01 20:11 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Grant Grundler, Jens Axboe, ak, davem, suparna, Linux Kernel,
	Bjorn Helgaas

On Tue, 2003-07-01 at 14:59, Alex Williamson wrote:
>    The thing that's got me concerned about this is that it allows
> for sg lists that contain both entries that the block layer
> expects will be mapped into the iommu and ones that it expects
> to bypass.  I don't like the implications of parsing through
> sg lists looking for bypass-able and non-bypass-able groupings.
> This seems like a lot more overhead than we have now and the
> complexity of merging partially bypass-able scatterlists seems
> time consuming.
> 
>    The current ia64 sba_iommu does a quick and dirty sg bypass
> check.  If the device can dma to any memory address, the entire
> sg list is bypassed.  If not, the entire list is coalesced and
> mapped by the iommu.  The idea being that true performance
> devices will have 64bit dma masks and be able to quickly bypass.
> Everything else will at least get the benefit of coalescing
> entries to make more efficient dma.  The coalescing is a bit
> simpler since it's the entire list as well.  With this proposal,
> we'd have to add a lot of complexity to partially bypass sg
> lists.  I don't necessarily see that as a benefit.  Thanks,

But if that's all you want, you simply set BIO_VMERGE_BYPASS_MASK to
the fully set u64 bitmask.  Then it will only turn off virtual merging
for devices that have a fully set dma_mask, and your simple test will
work.
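
I.e., something like this (a sketch, not code from the ia64 port):

	/*
	 * sketch: only devices with a fully set 64-bit dma_mask are
	 * treated as bypass candidates; everything else keeps
	 * virtual merging through the IOMMU
	 */
	#define BIO_VMERGE_BYPASS_MASK	(~(u64)0)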

James




* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 16:46 [RFC] block layer support for DMA IOMMU bypass mode James Bottomley
  2003-07-01 17:09 ` Andi Kleen
  2003-07-01 19:19 ` Grant Grundler
@ 2003-07-01 22:51 ` David S. Miller
  2003-07-01 23:57 ` [RFC] block layer support for DMA IOMMU bypass mode II Andi Kleen
  3 siblings, 0 replies; 52+ messages in thread
From: David S. Miller @ 2003-07-01 22:51 UTC (permalink / raw)
  To: James.Bottomley
  Cc: axboe, grundler, ak, suparna, linux-kernel, alex_williamson,
	bjorn_helgaas


I personally don't care how this is done, as long as I can make
all the overhead from the checks go away on my platform by
defining the interface macro to do nothing :-)


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 20:03   ` James Bottomley
@ 2003-07-01 23:01     ` Grant Grundler
  2003-07-02 15:52       ` James Bottomley
  0 siblings, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-01 23:01 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jens Axboe, ak, davem, suparna, Linux Kernel, Alex Williamson,
	Bjorn Helgaas

On Tue, Jul 01, 2003 at 03:03:45PM -0500, James Bottomley wrote:
> OK, the core of my objection to this is that at the moment there's no
> entangling of the bio layer and the DMA layer.

I agree this is a good thing.

> The bio layer works with
> a nice finite list of generic or per-queue constraints; it doesn't care
> currently what the underlying device or IOMMU does.

I don't agree. This whole discussion revolves around getting BIO code and
IOMMU code to agree on how block merging works for a given platform.
Using a callback into IOMMU code means the BIO code truly doesn't have to know.
The platform specific IOMMU could just tell BIO code what it wants to
know (how many SG entries would fit into a limited number of physical
mappings).

> Putting such a callback in would add this entanglement.

yes, sort of. But I think this entanglement is present even for machines
that don't have an IOMMU, because of bounce buffers.  But if ia64's swiotlb
were made generic to cover buffer bouncing....

> It could be that the bio people will be OK with this, and I'm just
> worrying about nothing, but in that case, they need to say so...

Would that be Jens Axboe/Dave Miller/et al?

thanks,
grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-01 16:46 [RFC] block layer support for DMA IOMMU bypass mode James Bottomley
                   ` (2 preceding siblings ...)
  2003-07-01 22:51 ` David S. Miller
@ 2003-07-01 23:57 ` Andi Kleen
  2003-07-02  0:03   ` David S. Miller
  2003-07-02 16:55   ` Grant Grundler
  3 siblings, 2 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-01 23:57 UTC (permalink / raw)
  To: James Bottomley
  Cc: axboe, grundler, davem, suparna, linux-kernel, alex_williamson,
	bjorn_helgaas

On 01 Jul 2003 11:46:12 -0500
James Bottomley <James.Bottomley@steeleye.com> wrote:

On further thought about the issue:

The K8 IOMMU cannot support this virtually contiguous thing. The reason
is that there is no guarantee that an entry in a sglist is a multiple
of page size. And the aperture can only map 4K sized chunks, like 
a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
no way to get them contiguous in IOMMU space (short of copying).

This means I just need a flag to turn this assumption off in the block layer.

Currently it isn't even guaranteed that pci_map_sg is contiguous for page
sized chunks - pci_map_sg is essentially just a loop that calls
pci_map_single, and it is quite possible that all the entries are spread
over the IOMMU hole.

Also, James, do you remember when these changes were added to the block layer?
We have a weird IDE corruption here and I'm wondering if it is related
to this.

-Andi




* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-01 23:57 ` [RFC] block layer support for DMA IOMMU bypass mode II Andi Kleen
@ 2003-07-02  0:03   ` David S. Miller
  2003-07-02  0:22     ` Andi Kleen
  2003-07-02 16:55   ` Grant Grundler
  1 sibling, 1 reply; 52+ messages in thread
From: David S. Miller @ 2003-07-02  0:03 UTC (permalink / raw)
  To: ak
  Cc: James.Bottomley, axboe, grundler, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

   From: Andi Kleen <ak@suse.de>
   Date: Wed, 2 Jul 2003 01:57:01 +0200
   
   The K8 IOMMU cannot support this virtually contiguous thing. The reason
   is that there is no guarantee that an entry in a sglist is a multiple
   of page size. And the aperture can only map 4K sized chunks, like 
   a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
What do you mean?  You map only one 4K chunk, and this is used
for all the sub-1K mappings.

I can only map 8K sized chunks on the sparc64 IOMMU and this
works perfectly fine.


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02  0:22     ` Andi Kleen
@ 2003-07-02  0:21       ` David S. Miller
  2003-07-02 16:53       ` Grant Grundler
  1 sibling, 0 replies; 52+ messages in thread
From: David S. Miller @ 2003-07-02  0:21 UTC (permalink / raw)
  To: ak
  Cc: James.Bottomley, axboe, grundler, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

   From: Andi Kleen <ak@suse.de>
   Date: Wed, 2 Jul 2003 02:22:44 +0200

   On Tue, 01 Jul 2003 17:03:23 -0700 (PDT)
   "David S. Miller" <davem@redhat.com> wrote:
   
   > What do you mean?  You map only one 4K chunk, and this is used
   > for all the sub-1K mappings.
   
   How should this work when the 1K mappings are spread all over memory?
   
   Maybe I'm missing something but from James's description it sounds like the 
   block layer assumes that it can pass in a sglist with arbitrary elements 
   and get it back remapped to contiguous DMA addresses.
   
It assumes it can pass in an sglist with arbitrary "virtually
contiguous" elements and get back a contiguous DMA address.

The BIO_VMERGE_BOUNDARY defines the IOMMU page size and therefore
what "virtually contiguous" means.
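
A worked example of the boundary half of that check (assuming
BIO_VMERGE_BOUNDARY == 4096; the addresses are made up for
illustration):

	/*
	 * vec1 and vec2 are virtually mergeable when the end of vec1
	 * and the start of vec2 both sit on an IOMMU page boundary:
	 *
	 *   (((bvec_to_phys(vec1) + vec1->bv_len) | bvec_to_phys(vec2))
	 *		& (BIO_VMERGE_BOUNDARY - 1)) == 0
	 *
	 * e.g. a 1K vec1 at phys 0x10c00 ends at 0x11000 and can merge
	 * with a vec2 starting at 0x84000: the IOMMU maps the two pages
	 * back to back.  A 1K vec1 ending at 0x10800 cannot, since no
	 * 4K-granular mapping hides the gap.
	 */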


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02  0:03   ` David S. Miller
@ 2003-07-02  0:22     ` Andi Kleen
  2003-07-02  0:21       ` David S. Miller
  2003-07-02 16:53       ` Grant Grundler
  0 siblings, 2 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-02  0:22 UTC (permalink / raw)
  To: David S. Miller
  Cc: James.Bottomley, axboe, grundler, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Tue, 01 Jul 2003 17:03:23 -0700 (PDT)
"David S. Miller" <davem@redhat.com> wrote:

>    From: Andi Kleen <ak@suse.de>
>    Date: Wed, 2 Jul 2003 01:57:01 +0200
>    
>    The K8 IOMMU cannot support this virtually contiguous thing. The reason
>    is that there is no guarantee that an entry in a sglist is a multiple
>    of page size. And the aperture can only map 4K sized chunks, like 
>    a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
> What do you mean?  You map only one 4K chunk, and this is used
> for all the sub-1K mappings.

How should this work when the 1K mappings are spread all over memory?

Maybe I'm missing something but from James's description it sounds like the 
block layer assumes that it can pass in a sglist with arbitrary elements 
and get it back remapped to contiguous DMA addresses.

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode
  2003-07-01 23:01     ` Grant Grundler
@ 2003-07-02 15:52       ` James Bottomley
  0 siblings, 0 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-02 15:52 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Jens Axboe, ak, davem, suparna, Linux Kernel, Alex Williamson,
	Bjorn Helgaas

On Tue, 2003-07-01 at 18:01, Grant Grundler wrote:
> > The bio layer works with
> > a nice finite list of generic or per-queue constraints; it doesn't care
> > currently what the underlying device or IOMMU does.
> 
> I don't agree. This whole discussion revolves around getting BIO code and
> IOMMU code to agree on how block merging works for a given platform.
> Using a callback into IOMMU code means the BIO truly doesn't have to know.
> The platform specific IOMMU could just tell BIO code what it wants to
> know (how many SG entries would fit into a limited number of physical
> mappings).

Ah, but the point is that currently the only inputs the IOMMU has to the
bio layer are parameters.  I'd like to keep it this way unless there's a
really, really good reason not to.  At the moment it seems that the
proposed parameter covers all of IA64's needs and may cover AMD64's as
well.

> > Putting such a callback in would add this entanglement.
> 
> yes, sort of. But I think this entanglement is present even for machines
> that don't have an IOMMU because of bounce buffers.  But if ia64's swiotlb
> would be made generic to cover buffer bouncing....

Well, not to get into the "where should ZONE_NORMAL end" argument again,
but I was hoping that GFP_DMA32 would eliminate the IA64 platform's need
for this.  __blk_queue_bounce() strikes me as being much more heavily
exercised than the swiotlb, so I think it should be the one to remain. 
It also has more context information to fail gracefully.

James




* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02  0:22     ` Andi Kleen
  2003-07-02  0:21       ` David S. Miller
@ 2003-07-02 16:53       ` Grant Grundler
  2003-07-02 17:19         ` Andi Kleen
  1 sibling, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-02 16:53 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David S. Miller, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Wed, Jul 02, 2003 at 02:22:44AM +0200, Andi Kleen wrote:
> > What do you mean?  You map only one 4K chunk, and this is used
> > for all the sub-1K mappings.
> 
> How should this work when the 1K mappings are spread all over memory?

It couldn't merge in this case.

> > Maybe I'm missing something but from James's description it sounds like the 
> > block layer assumes that it can pass in a sglist with arbitrary elements 
> > and get it back remapped to contiguous DMA addresses.

In the x86-64 case, if the 1k elements are not physically contiguous,
I think most of them would get their own mapping.

For x86-64, if an entry ends on a 4k alignment and the next one starts
on a 4k alignment, could those be merged into one DMA segment that uses
two adjacent mapping entries?

Anyway, using a 4k FS block size (eg ext2) would be more efficient
by allowing a 1:1 mapping of SG elements to DMA mappings.

grant



* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-01 23:57 ` [RFC] block layer support for DMA IOMMU bypass mode II Andi Kleen
  2003-07-02  0:03   ` David S. Miller
@ 2003-07-02 16:55   ` Grant Grundler
  2003-07-02 17:20     ` Andi Kleen
  2003-07-02 21:16     ` Alan Cox
  1 sibling, 2 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-02 16:55 UTC (permalink / raw)
  To: Andi Kleen
  Cc: James Bottomley, axboe, grundler, davem, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Wed, Jul 02, 2003 at 01:57:01AM +0200, Andi Kleen wrote:
> The K8 IOMMU cannot support this virtually contiguous thing. The reason
> is that there is no guarantee that an entry in a sglist is a multiple
> of page size.  And the aperture can only map 4K sized chunks, like 
> a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
> no way to get them contiguous in IOMMU space (short of copying).

Can two adjacent IOMMU entries be used to map two 1K buffers?
Assume the 1st buffer ends on a 4k alignment and the next one
starts on a 4k alignment.

grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 16:53       ` Grant Grundler
@ 2003-07-02 17:19         ` Andi Kleen
  0 siblings, 0 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-02 17:19 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, David S. Miller, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Wed, Jul 02, 2003 at 10:53:33AM -0600, Grant Grundler wrote:
> 
> > Maybe I'm missing something but from James description it sounds like the 
> > block layer assumes that it can pass in a sglist with arbitary elements 
> > and get it back remapped to continuous DMA addresses.
> 
> In the x86-64 case, If the 1k elements are not physically contigous,
> I think most of them would get their own mapping.

Yes, but it won't be contiguous in bus space.

> 
> For x86-64, if an entry ends on a 4k alignment and the next one starts
> on a 4k alignment, could those be merged into one DMA segment that uses
> two adjacent mapping entries?

Yes, it is now in the version I wrote last night, but not in the 
previous code that's in the tree.

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 16:55   ` Grant Grundler
@ 2003-07-02 17:20     ` Andi Kleen
  2003-07-02 17:37       ` Grant Grundler
  2003-07-02 21:16     ` Alan Cox
  1 sibling, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2003-07-02 17:20 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, James Bottomley, axboe, davem, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Wed, Jul 02, 2003 at 10:55:10AM -0600, Grant Grundler wrote:
> On Wed, Jul 02, 2003 at 01:57:01AM +0200, Andi Kleen wrote:
> > The K8 IOMMU cannot support this virtually contiguous thing. The reason
> > is that there is no guarantee that an entry in a sglist is a multiple
> > of page size.  And the aperture can only map 4K sized chunks, like 
> > a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
> > no way to get them contiguous in IOMMU space (short of copying).
> 
> Can two adjacent IOMMU entries be used to map two 1K buffers?
> Assume the 1st buffer ends on a 4k alignment and the next one
> starts on a 4k alignment.

Yes, it could.  But is that situation likely/worth handling?

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 17:20     ` Andi Kleen
@ 2003-07-02 17:37       ` Grant Grundler
  0 siblings, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-02 17:37 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, James Bottomley, axboe, davem, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Wed, Jul 02, 2003 at 07:20:26PM +0200, Andi Kleen wrote:
...
> > Can two adjacent IOMMU entries be used to map two 1K buffers?
> > Assume the 1st buffer ends on a 4k alignment and the next one
> > starts on a 4k alignment.
> 
> Yes, it could. But is that situation likely/worth to handle? 

Probably.  It would reduce the number of mappings by 25% (3 instead of 4).
My assumption is that two adjacent IOMMU entries have contiguous bus addresses.

I was trying to figure out if x86-64 should be setting
BIO_VMERGE_BOUNDARY to 0 or 4k.

It sounds like x86-64 could support "#define BIO_VMERGE_BOUNDARY 4096"
if the IOMMU code will return one DMA address for two SG list entries
in the above example.

hth,
grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 16:55   ` Grant Grundler
  2003-07-02 17:20     ` Andi Kleen
@ 2003-07-02 21:16     ` Alan Cox
  2003-07-02 23:56       ` Andi Kleen
  1 sibling, 1 reply; 52+ messages in thread
From: Alan Cox @ 2003-07-02 21:16 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, James Bottomley, axboe, davem, suparna,
	Linux Kernel Mailing List, alex_williamson, bjorn_helgaas

On Mer, 2003-07-02 at 17:55, Grant Grundler wrote:
> On Wed, Jul 02, 2003 at 01:57:01AM +0200, Andi Kleen wrote:
> > The K8 IOMMU cannot support this virtually contiguous thing. The reason
> > is that there is no guarantee that an entry in a sglist is a multiple
> > of page size.  And the aperture can only map 4K sized chunks, like 
> > a CPU MMU. So e.g. when you have an sglist with multiple 1K entries there is 
> > no way to get them contiguous in IOMMU space (short of copying).
> 
> Can two adjacent IOMMU entries be used to map two 1K buffers?
> Assume the 1st buffer ends on a 4k alignment and the next one
> starts on a 4k alignment.

When I played with optimising merging on some 2.4 I2O and aacraid
controller stuff I found three things

1.	We allocate pages in reverse order so most merges can't occur
2.	If you use a 4K fs as most people do now the issue is irrelevant
3.	Almost every 1K mergeable segment was part of the same 4K page anyway



* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 21:16     ` Alan Cox
@ 2003-07-02 23:56       ` Andi Kleen
  2003-07-03 20:26         ` Alan Cox
  0 siblings, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2003-07-02 23:56 UTC (permalink / raw)
  To: Alan Cox
  Cc: Grant Grundler, Andi Kleen, James Bottomley, axboe, davem,
	suparna, Linux Kernel Mailing List, alex_williamson,
	bjorn_helgaas

> 1.	We allocate pages in reverse order so most merges can't occur

I added a printk and I get quite a lot of merges during bootup
with normal IDE.

(sometimes 12+ segments) 

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-02 23:56       ` Andi Kleen
@ 2003-07-03 20:26         ` Alan Cox
  2003-07-03 21:24           ` Andi Kleen
  0 siblings, 1 reply; 52+ messages in thread
From: Alan Cox @ 2003-07-03 20:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, James Bottomley, axboe, davem, suparna,
	Linux Kernel Mailing List, alex_williamson, bjorn_helgaas

On Iau, 2003-07-03 at 00:56, Andi Kleen wrote:
> > 1.	We allocate pages in reverse order so most merges can't occur
> 
> I added a printk and I get quite a lot of merges during bootup
> with normal IDE.
> 
> (sometimes 12+ segments) 

That's merging adjacent blocks with non-adjacent page targets using the
IOMMU, right?  I was doing merging without an IOMMU, which is a little
different and turns out to be a waste of CPU.



* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-03 20:26         ` Alan Cox
@ 2003-07-03 21:24           ` Andi Kleen
  2003-07-03 22:19             ` Grant Grundler
  2003-07-08  2:14             ` David S. Miller
  0 siblings, 2 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-03 21:24 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andi Kleen, Grant Grundler, James Bottomley, axboe, davem,
	suparna, Linux Kernel Mailing List, alex_williamson,
	bjorn_helgaas

On Thu, Jul 03, 2003 at 09:26:29PM +0100, Alan Cox wrote:
> On Iau, 2003-07-03 at 00:56, Andi Kleen wrote:
> > > 1.	We allocate pages in reverse order so most merges can't occur
> > 
> > I added a printk and I get quite a lot of merges during bootup
> > with normal IDE.
> > 
> > (sometimes 12+ segments) 
> 
> That's merging adjacent blocks with non-adjacent page targets using the
> IOMMU, right?  I was doing merging without an IOMMU, which is a little

Yep. 

> different and turns out to be a waste of CPU.

Understandable. Especially when memory fragments after some uptime.

But of course it doesn't help much in practice because all the interesting
block devices support DAC anyway, and the IOMMU is disabled for that.

Also it's likely cheaper just to submit more segments than to take the IOMMU
overhead (at least for sane devices; if not, it may be worth artificially
limiting the dma mask of the device to force the IOMMU on IA64 and x86-64).
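
For that forced-IOMMU experiment the driver probe path would just do
something like this (a sketch; pdev stands for the device being set up):

	/*
	 * sketch: cap a DAC-capable device at 32 bits so that its
	 * mappings of memory above 4GB go through the IOMMU instead
	 * of bypassing it
	 */
	if (pci_set_dma_mask(pdev, 0xffffffffULL))
		return -EIO;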

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-03 21:24           ` Andi Kleen
@ 2003-07-03 22:19             ` Grant Grundler
  2003-07-08  2:14             ` David S. Miller
  1 sibling, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-03 22:19 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Alan Cox, Grant Grundler, James Bottomley, axboe, davem, suparna,
	Linux Kernel Mailing List, alex_williamson, bjorn_helgaas

On Thu, Jul 03, 2003 at 11:24:15PM +0200, Andi Kleen wrote:
...
> Also it's likely cheaper just to submit more segments than to take the IOMMU
> overhead

It depends on the device. If using something like an 8237A to master DMA cycles,
then the CPU cost of merging is relatively cheap. If sending the SG list is
just a sequence of MMIO space writes, then passing the raw list is cheaper.
ZX1 and PARISC IOMMUs clearly add some overhead both in terms of CPU
utilization (manage IOMMU) and DMA latency (IOMMU TLB misses sometimes).

...
> (at least for sane devices; if not, it may be worth artificially
> limiting the dma mask of the device to force the IOMMU on IA64 and x86-64).

Agreed. We are only doing that until BIO code and IOMMU code can
agree on how merging works without requiring the IOMMU.

thanks,
grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-03 21:24           ` Andi Kleen
  2003-07-03 22:19             ` Grant Grundler
@ 2003-07-08  2:14             ` David S. Miller
  2003-07-08 19:34               ` Andi Kleen
  1 sibling, 1 reply; 52+ messages in thread
From: David S. Miller @ 2003-07-08  2:14 UTC (permalink / raw)
  To: ak
  Cc: alan, grundler, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

   From: Andi Kleen <ak@suse.de>
   Date: Thu, 3 Jul 2003 23:24:15 +0200

   But of course it doesn't help much in practice because all the interesting
   block devices support DAC anyway, and the IOMMU is disabled for that.
   
Platform dependent.  SAC DMA transfers are faster on sparc64, so
we only allow the device to specify a 32-bit DMA mask successfully.

And actually, I would recommend other platforms that have an IOMMU do
this too (unless there is some other reason not to), since virtual
merging causes fewer scatter-gather entries to be used in the device
and thus you can stuff more requests into it.


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08  2:14             ` David S. Miller
@ 2003-07-08 19:34               ` Andi Kleen
  2003-07-08 19:47                 ` Jeff Garzik
  2003-07-08 22:04                 ` David S. Miller
  0 siblings, 2 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-08 19:34 UTC (permalink / raw)
  To: David S. Miller
  Cc: alan, grundler, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Mon, 07 Jul 2003 19:14:38 -0700 (PDT)
"David S. Miller" <davem@redhat.com> wrote:

>    From: Andi Kleen <ak@suse.de>
>    Date: Thu, 3 Jul 2003 23:24:15 +0200
> 
>    But of course it doesn't help much in practice because all the interesting
>    block devices support DAC anyway, and the IOMMU is disabled for that.
>    
> Platform dependent.  SAC DMA transfers are faster on sparc64, so
> we only allow the device to specify a 32-bit DMA mask successfully.
> 
> And actually, I would recommend other platforms that have an IOMMU do
> this too (unless there is some other reason not to), since virtual
> merging causes fewer scatter-gather entries to be used in the device
> and thus you can stuff more requests into it.

Do you know a common PCI block device that would benefit from this (performs significantly
better with short sg lists)? It would be interesting to test.

I don't want to use the IOMMU for production for SAC on AMD64 because
on some of the boxes the available IOMMU area is quite small, e.g. the single
processor boxes typically only have a 128MB aperture set up, which means
the IOMMU hole is only 64MB (the other 64MB is for AGP).  And some of them
do not even have a BIOS option to enlarge it (I can allocate a bigger one
myself, but it costs memory).  The boxes that have more than 4GB memory at
least typically support enlarging it.

Overflow is typically deadly because the API does not allow proper
error handling and most drivers don't check for it.  That's especially
risky for block devices: while pci_map_sg can at least return an error,
not everybody checks for it, and when you get an overflow the next
superblock write with such an unchecked error will destroy the file
system.
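
The check itself is cheap; a sketch of what a careful driver would do
(pdev, sg and nents being whatever the driver already has at hand):

	/*
	 * sketch: pci_map_sg returns the number of DMA segments it
	 * actually mapped, so a 0 return (overflow) can fail the
	 * request instead of letting it write through a bogus mapping
	 */
	count = pci_map_sg(pdev, sg, nents, PCI_DMA_FROMDEVICE);
	if (count == 0)
		return -ENOMEM;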

Also networking tests have shown that it costs around 10% performance.
These are old numbers and some optimizations have been done since then
so it may be better now.
 
-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 19:34               ` Andi Kleen
@ 2003-07-08 19:47                 ` Jeff Garzik
  2003-07-08 20:10                   ` Andi Kleen
  2003-07-08 20:11                   ` Grant Grundler
  2003-07-08 22:04                 ` David S. Miller
  1 sibling, 2 replies; 52+ messages in thread
From: Jeff Garzik @ 2003-07-08 19:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David S. Miller, alan, grundler, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 08, 2003 at 09:34:27PM +0200, Andi Kleen wrote:
> Overflow is typically deadly because the API does not allow proper
> error handling and most drivers don't check for it.  That's especially
> risky for block devices: while pci_map_sg can at least return an error,
> not everybody checks for it, and when you get an overflow the next
> superblock write with such an unchecked error will destroy the file
> system.

Personally, I've always thought we were kidding ourselves by not doing
the error checking you describe.  From my somewhat-narrow perspective of
network drivers and the libata storage driver, you have to deal with
atomic allocations _anyway_ ...  so why not make sure IOMMU overflow
properly fails at the pci_map_foo level?

	Jeff





* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 19:47                 ` Jeff Garzik
@ 2003-07-08 20:10                   ` Andi Kleen
  2003-07-08 20:11                   ` Grant Grundler
  1 sibling, 0 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-08 20:10 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: davem, alan, grundler, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, 8 Jul 2003 15:47:44 -0400
Jeff Garzik <jgarzik@pobox.com> wrote:

> Personally, I've always thought we were kidding ourselves by not doing
> the error checking you describe.  From my somewhat-narrow perspective of
> network drivers and the libata storage driver, you have to deal with
> atomic allocations _anyway_ ...  so why not make sure IOMMU overflow
> properly fails at the pci_map_foo level?

pci_map_single currently has no defined error return, but if you could 
persuade all your driver colleagues to fix their drivers to check
this I'm sure things would be better.

(on AMD64 the check could be trivially implemented as a macro because errors
always return a well-defined address)
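
Something along these lines, that is (a hypothetical sketch; the macro
does not exist yet, and bad_dma_address stands for whatever sentinel
the arch's mapping code hands back on overflow):

	/* sketch: error check as a simple sentinel comparison */
	#define pci_dma_mapping_error(addr)	((addr) == bad_dma_address)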

-Andi


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 19:47                 ` Jeff Garzik
  2003-07-08 20:10                   ` Andi Kleen
@ 2003-07-08 20:11                   ` Grant Grundler
  1 sibling, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-08 20:11 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Andi Kleen, David S. Miller, alan, grundler, James.Bottomley,
	axboe, suparna, linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 08, 2003 at 03:47:44PM -0400, Jeff Garzik wrote:
> Personally, I've always thought we were kidding ourselves by not doing
> the error checking you describe. 

Amen. When I pointed this out a few years back, it was made clear this
was a design choice: not providing a return value was simpler for driver
writers. I agree it's simpler and don't pretend to know what's best
for other driver writers. Sounds like a topic for 2.7 though.

grant


* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 19:34               ` Andi Kleen
  2003-07-08 19:47                 ` Jeff Garzik
@ 2003-07-08 22:04                 ` David S. Miller
  2003-07-08 22:25                   ` Grant Grundler
  1 sibling, 1 reply; 52+ messages in thread
From: David S. Miller @ 2003-07-08 22:04 UTC (permalink / raw)
  To: ak
  Cc: alan, grundler, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

   From: Andi Kleen <ak@suse.de>
   Date: Tue, 8 Jul 2003 21:34:27 +0200

   Do you know a common PCI block device that would benefit from this
   (performs significantly better with short sg lists)? It would be
   interesting to test.
   
10% to 15% on sym53c8xx devices found on sparc64 boxes.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 22:25                   ` Grant Grundler
@ 2003-07-08 22:23                     ` David S. Miller
  2003-07-09 18:55                       ` Andi Kleen
                                         ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: David S. Miller @ 2003-07-08 22:23 UTC (permalink / raw)
  To: grundler
  Cc: ak, alan, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

   From: Grant Grundler <grundler@parisc-linux.org>
   Date: Tue, 8 Jul 2003 16:25:45 -0600

   On Tue, Jul 08, 2003 at 03:04:33PM -0700, David S. Miller wrote:
   >    Do you know a common PCI block device that would benefit from this
   >    (performs significantly better with short sg lists)? It would be
   >    interesting to test.
   >    
   > 10% to 15% on sym53c8xx devices found on sparc64 boxes.
   
   Which workload?

dbench type stuff, but that's a hard thing to test these days with
the block I/O schedulers changing so much.  Try to keep that part
constant in the with-vs-without BIO_VMERGE != 0 testing :)


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 22:04                 ` David S. Miller
@ 2003-07-08 22:25                   ` Grant Grundler
  2003-07-08 22:23                     ` David S. Miller
  0 siblings, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-08 22:25 UTC (permalink / raw)
  To: David S. Miller
  Cc: ak, alan, grundler, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 08, 2003 at 03:04:33PM -0700, David S. Miller wrote:
>    Do you know a common PCI block device that would benefit from this
>    (performs significantly better with short sg lists)? It would be
>    interesting to test.
>    
> 10% to 15% on sym53c8xx devices found on sparc64 boxes.

Which workload?
I'd like to test this on our ia64 boxes.

thanks,
grant

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 22:23                     ` David S. Miller
@ 2003-07-09 18:55                       ` Andi Kleen
  2003-07-23 11:40                       ` Grant Grundler
  2003-07-23 13:20                       ` Grant Grundler
  2 siblings, 0 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-09 18:55 UTC (permalink / raw)
  To: David S. Miller
  Cc: grundler, alan, James.Bottomley, axboe, suparna, linux-kernel,
	alex_williamson, bjorn_helgaas

On Tue, 08 Jul 2003 15:23:14 -0700 (PDT)
"David S. Miller" <davem@redhat.com> wrote:

>    From: Grant Grundler <grundler@parisc-linux.org>
>    Date: Tue, 8 Jul 2003 16:25:45 -0600
> 
>    On Tue, Jul 08, 2003 at 03:04:33PM -0700, David S. Miller wrote:
>    >    Do you know a common PCI block device that would benefit from this
>    >    (performs significantly better with short sg lists)? It would be
>    >    interesting to test.
>    >    
>    > 10% to 15% on sym53c8xx devices found on sparc64 boxes.
>    
>    Which workload?
> 
> dbench type stuff, but that's a hard thing to test these days with
> the block I/O schedulers changing so much.  Try to keep that part
> constant in the with-vs-without BIO_VMERGE != 0 testing :)

With MPT-Fusion and reaim "new dbase" load it seems to be slightly faster
with forced IOMMU merging on Opteron, but the differences are quite small (~4%) and could
be measurement errors.

-Andi

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 22:23                     ` David S. Miller
  2003-07-09 18:55                       ` Andi Kleen
@ 2003-07-23 11:40                       ` Grant Grundler
  2003-07-28 11:15                         ` Andi Kleen
  2003-07-23 13:20                       ` Grant Grundler
  2 siblings, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-23 11:40 UTC (permalink / raw)
  To: David S. Miller
  Cc: grundler, ak, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 08, 2003 at 03:23:14PM -0700, David S. Miller wrote:
>    From: Grant Grundler <grundler@parisc-linux.org>
>    Date: Tue, 8 Jul 2003 16:25:45 -0600
> 
>    On Tue, Jul 08, 2003 at 03:04:33PM -0700, David S. Miller wrote:
>    >    Do you know a common PCI block device that would benefit from this
>    >    (performs significantly better with short sg lists)? It would be
>    >    interesting to test.
>    >    
>    > 10% to 15% on sym53c8xx devices found on sparc64 boxes.
>    
>    Which workload?
> 
> dbench type stuff, 

Without more specific guidance, dbench looks like a load of crap.
"dbench 50" is claiming 850MB/s throughput on a system with 1 disk @
u320 and 2 disks on a separate 40 MB/s (Ultra Wide SE SCSI). More details
are appended below.  I'll try again with lmbench or bonnie.

Andi, if you could pass me details about the "reaim new dbase" (i.e. how
many devices I need, where to get it) I could make time to try that in
the next couple of weeks.

> but that's a hard thing to test these days with
> the block I/O schedulers changing so much.  Try to keep that part
> constant in the with-vs-without BIO_VMERGE != 0 testing :)

yes - James Bottomley was asking for the same info.

thanks,
grant

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-08 22:23                     ` David S. Miller
  2003-07-09 18:55                       ` Andi Kleen
  2003-07-23 11:40                       ` Grant Grundler
@ 2003-07-23 13:20                       ` Grant Grundler
  2003-07-23 15:30                         ` Jens Axboe
  2 siblings, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-23 13:20 UTC (permalink / raw)
  To: David S. Miller
  Cc: grundler, ak, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

[-- Attachment #1: Type: text/plain, Size: 1675 bytes --]

On Tue, Jul 08, 2003 at 03:23:14PM -0700, David S. Miller wrote:
> dbench type stuff,

Realizing dbench is blissfully ignorant of the system (2GB RAM),
for grins I ran "dbench 500" to see what would happen. The throughput
rate dbench reported continued to decline to ~20MB/s. This is about what
I would expect for one disk on a 40MB/s SCSI bus.

Then dbench started spewing errors:
...
(7) ERROR: handle 13781 was not found
(6) open clients/client428 failed for handle 13781 (No such file or
directory)
(7) ERROR: handle 13781 was not found
(6) open clients/client423 failed for handle 13781 (No such file or directory)
(7) ERROR: handle 13781 was not found
(6) open clients/client48 failed for handle 13781 (No such file or directory)
(7) ERROR: handle 13781 was not found
(6) open clients/client55 failed for handle 13781 (No such file or directory)
(7) ERROR: handle 13781 was not found
(6) open clients/client419 failed for handle 13781 (No such file or directory)
(7) ERROR: handle 13781 was not found
(6) open clients/client415 failed for handle 13781 (No such file or directory)
...
write failed on handle 13783
write failed on handle 13707
write failed on handle 13808
write failed on handle 13117
write failed on handle 13850
write failed on handle 14000
write failed on handle 13767
write failed on handle 13787
...

NFC what that's all about. sorry - I have to punt on digging deeper.
I really need more guidance on
	(a) how much memory I should be testing with
	(b) how many spindles would be useful (I've got ~15 on each box)
	(c) how to tell dbench to use the FS mounted on the target disks.

I've attached the iommu stats in case anyone finds that useful.

grant

[-- Attachment #2: dbench-zx1-01 --]
[-- Type: text/plain, Size: 560 bytes --]

Hewlett Packard zx1 IOC rev 2.2
IO PDIR size    : 524288 bytes (65536 entries)
IO PDIR entries : 65224 free  312 used (0%)
Resource bitmap : 8192 bytes (65536 pages)
  Bitmap search : 63/106/605 (min/avg/max CPU Cycles)
pci_map_single():       139846 calls        139881 pages (avg 1000/1000)
pci_unmap_single:       473108 calls        736788 pages (avg 1557/1000)
pci_map_sg()    :        51256 calls        597211 pages (avg 11651/1000)
pci_map_sg()    : 734319 entries 333551 filled
pci_unmap_sg()  :       189496 calls        735471 pages (avg 3881/1000)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-23 13:20                       ` Grant Grundler
@ 2003-07-23 15:30                         ` Jens Axboe
  0 siblings, 0 replies; 52+ messages in thread
From: Jens Axboe @ 2003-07-23 15:30 UTC (permalink / raw)
  To: Grant Grundler
  Cc: David S. Miller, ak, alan, James.Bottomley, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Wed, Jul 23 2003, Grant Grundler wrote:
> On Tue, Jul 08, 2003 at 03:23:14PM -0700, David S. Miller wrote:
> > dbench type stuff,
> 
> Realizing dbench is blissfully ignorant of the system (2GB RAM),
> for grins I ran "dbench 500" to see what would happen. The throughput
> rate dbench reported continued to decline to ~20MB/s. This is about what
> I would expect for one disk on a 40MB/s SCSI bus.
> 
> Then dbench started spewing errors:
> ..
> (7) ERROR: handle 13781 was not found
> (6) open clients/client428 failed for handle 13781 (No such file or
> directory)
> (7) ERROR: handle 13781 was not found
> (6) open clients/client423 failed for handle 13781 (No such file or directory)
> (7) ERROR: handle 13781 was not found
> (6) open clients/client48 failed for handle 13781 (No such file or directory)
> (7) ERROR: handle 13781 was not found
> (6) open clients/client55 failed for handle 13781 (No such file or directory)
> (7) ERROR: handle 13781 was not found
> (6) open clients/client419 failed for handle 13781 (No such file or directory)
> (7) ERROR: handle 13781 was not found
> (6) open clients/client415 failed for handle 13781 (No such file or directory)
> ..
> write failed on handle 13783
> write failed on handle 13707
> write failed on handle 13808
> write failed on handle 13117
> write failed on handle 13850
> write failed on handle 14000
> write failed on handle 13767
> write failed on handle 13787
> ..
> 
> NFC what that's all about. sorry - I have to punt on digging deeper.

You are running out of disk space, most likely :-)

> I really need more guidance on
> 	(a) how much memory I should be testing with

With 2G of RAM, you need lots of clients. Would be much saner to just
boot with 256M, or something like that.

> 	(b) how many spindles would be useful (I've got ~15 on each box)
> 	(c) how to tell dbench to use the FS mounted on the target disks.
> 
> I've attached the iommu stats in case anyone finds that useful.

To be honest, I don't think dbench is terribly useful for this. It often
suffers from the butterfly effect, so the small improvements virtual
merging should show will most likely be lost in the noise.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-23 11:40                       ` Grant Grundler
@ 2003-07-28 11:15                         ` Andi Kleen
  2003-07-28 14:59                           ` Grant Grundler
                                             ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Andi Kleen @ 2003-07-28 11:15 UTC (permalink / raw)
  To: Grant Grundler
  Cc: davem, grundler, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Wed, 23 Jul 2003 05:40:06 -0600
Grant Grundler <grundler@parisc-linux.org> wrote:


> 
> Andi, if you could pass me details about the "reaim new dbase" (i.e. how
> many devices I need, where to get it) I could make time to try that in
> the next couple of weeks.

Download reaim from sourceforge

Use the workfile.new_dbase test

Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 

I tested with ext3 on a single SCSI disk.


-Andi


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-28 11:15                         ` Andi Kleen
@ 2003-07-28 14:59                           ` Grant Grundler
  2003-07-30  2:31                           ` Grant Grundler
  2003-07-30  4:42                           ` Grant Grundler
  2 siblings, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-28 14:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, davem, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> Download reaim from sourceforge

http://lwn.net/Articles/20733/
	"(couldn't think of a better name, sorry)"

I was happy when "apt-get install reaim" just worked... *sigh*
But figured out "reaim" != "re-aim-7".
Debian doesn't know anything about re-aim-7. :^(

http://sourceforge.org/projects/re-aim-7
	We're Sorry.
	The SourceForge.net Website is currently down for maintenance.
	We will be back shortly

willy mentioned it's on OSDL too. Will look for that next.

> Use the workfile.new_dbase test
> Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 
> I tested with ext3 on a single SCSI disk.

thanks Andi - hopefully I can generate results this afternoon
when I've got connectivity again.

grant

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-28 11:15                         ` Andi Kleen
  2003-07-28 14:59                           ` Grant Grundler
@ 2003-07-30  2:31                           ` Grant Grundler
  2003-08-01 21:51                             ` Cliff White
  2003-07-30  4:42                           ` Grant Grundler
  2 siblings, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-30  2:31 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, davem, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 
> I tested with ext3 on a single SCSI disk.

andi, davem, jens,
sorry for the long delay. Here's the data for ZX1 using u320 HBA
(LSI 53c1030) and ST373453LC disk (running U160 IIRC).
If you need more runs on this config, please ask.

Executive summary: < 1% difference for this config.

I'd still like to try a 53c1010 but don't have any installed right now.
I suspect 53c1010 is a lot less efficient at retrieving SG lists
and will see a bigger difference in perf.


One minor issue when starting re-aim-7, but not during the run:

reaim(343): floating-point assist fault at ip 4000000000017a81, isr 0000020000000008                                                                            
reaim(343): floating-point assist fault at ip 4000000000017a61, isr 0000020000000008


For the record, I retrieved source from:
    http://umn.dl.sourceforge.net/sourceforge/re-aim-7/reaim-0.1.8.tar.gz

This should be renamed to osdl-aim-7 (or something like that).
The namespace collision is unfortunate and annoying.

hth,
grant




#define BIO_VMERGE_BOUNDARY     0 /* (ia64_max_iommu_merge_mask + 1) */
iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100

REAIM Workload
Times are in seconds - Child times from tms.cstime and tms.cutime

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
100     110.25   12.67   206.35  5605.19    56.05      3.48     3.29     96   
200     219.07   25.56   411.93  5642.06    28.21      7.83     3.73     96   
300     327.59   38.23   615.83  5659.46    18.86      12.79    4.09     95   
400     436.66   51.19   821.30  5661.19    14.15      18.34    4.42     95   
500     548.21   65.76   1029.15 5636.54    11.27      23.34    4.47     95   
Max Jobs per Minute 5661.19
iota:/mnt#



#define BIO_VMERGE_BOUNDARY     (ia64_max_iommu_merge_mask + 1)                 

iota:/mnt# PATH=$PATH:/mnt/usr/local/bin       
iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100

REAIM Workload
Times are in seconds - Child times from tms.cstime and tms.cutime

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
100     108.72   12.47   203.78  5684.17    56.84      4.46     4.32     95   
200     217.64   25.59   408.90  5679.16    28.40      8.65     4.16     95   
300     326.48   37.88   613.62  5678.81    18.93      13.80    4.44     95   
400     434.87   50.53   817.64  5684.46    14.21      17.40    4.18     95   
500     544.69   65.23   1022.92 5672.92    11.35      21.53    4.12     95   
Max Jobs per Minute 5684.46

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-28 11:15                         ` Andi Kleen
  2003-07-28 14:59                           ` Grant Grundler
  2003-07-30  2:31                           ` Grant Grundler
@ 2003-07-30  4:42                           ` Grant Grundler
  2003-07-30  4:51                             ` David S. Miller
  2003-07-30 14:20                             ` James Bottomley
  2 siblings, 2 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-30  4:42 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, davem, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 

jejb was wondering if 4k pages would cause different behavior because
of file system vs page size (4k vs 16k).  ia64 uses 16k by default.
I've rebuilt the kernel with 4k page size and VMERGE != 0.
The substantially worse performance feels like a rat hole because
of 4x pressure on CPU TLB.

Ideally, we need a workload to test BIO code without a file system.
Any suggestions?

grant


iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100
REAIM Workload                                                                  
Times are in seconds - Child times from tms.cstime and tms.cutime               

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI     
Forked  Time     SysTime UTime   Minute     Child      Time     Percent         
100     118.90   21.17   214.03  5197.78    51.98      4.52     3.98     96     
200     236.75   42.54   429.94  5220.63    26.10      9.43     4.16     95     
300     354.94   64.47   644.80  5223.34    17.41      14.47    4.27     95     
400     474.50   87.01   861.09  5209.66    13.02      24.76    5.59     94     
500     594.26   109.80  1077.00 5199.78    10.40      25.36    4.49     95     
Max Jobs per Minute 5223.34                                                     
iota:/mnt# 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30  4:42                           ` Grant Grundler
@ 2003-07-30  4:51                             ` David S. Miller
  2003-07-30 13:06                               ` Grant Grundler
  2003-07-30 16:02                               ` Grant Grundler
  2003-07-30 14:20                             ` James Bottomley
  1 sibling, 2 replies; 52+ messages in thread
From: David S. Miller @ 2003-07-30  4:51 UTC (permalink / raw)
  To: Grant Grundler
  Cc: ak, grundler, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, 29 Jul 2003 22:42:56 -0600
Grant Grundler <grundler@parisc-linux.org> wrote:

> On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> > Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 
> 
> jejb was wondering if 4k pages would cause different behavior because
> of file system vs page size (4k vs 16k).  ia64 uses 16k by default.
> I've rebuilt the kernel with 4k page size and VMERGE != 0.
> The substantially worse performance feels like a rat hole because
> of 4x pressure on CPU TLB.

Make an ext2 filesystem with 16K blocks :-)

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30  4:51                             ` David S. Miller
@ 2003-07-30 13:06                               ` Grant Grundler
  2003-07-30 16:02                               ` Grant Grundler
  1 sibling, 0 replies; 52+ messages in thread
From: Grant Grundler @ 2003-07-30 13:06 UTC (permalink / raw)
  To: David S. Miller
  Cc: Grant Grundler, ak, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 29, 2003 at 09:51:18PM -0700, David S. Miller wrote:
> Make an ext2 filesystem with 16K blocks :-)

heh - right. I thought you were going to tell me I needed to
install DIMMs that support 4k pages :^)

grant

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30  4:42                           ` Grant Grundler
  2003-07-30  4:51                             ` David S. Miller
@ 2003-07-30 14:20                             ` James Bottomley
  1 sibling, 0 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-30 14:20 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, davem, Alan Cox, Jens Axboe, suparna, Linux Kernel,
	alex_williamson, bjorn_helgaas

On Tue, 2003-07-29 at 23:42, Grant Grundler wrote:
> On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> > Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 
> 
> jejb was wondering if 4k pages would cause different behavior because
> of file system vs page size (4k vs 16k).  ia64 uses 16k by default.
> I've rebuilt the kernel with 4k page size and VMERGE != 0.
> The substantially worse performance feels like a rat hole because
> of 4x pressure on CPU TLB.

OK, I admit it, it was a rat hole.  Provided reaim uses large files, we
should only get block<->page fragmentation at the edges; and obviously
reaim has to use large files, otherwise it's not testing the virtual
merging properly...

James



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30  4:51                             ` David S. Miller
  2003-07-30 13:06                               ` Grant Grundler
@ 2003-07-30 16:02                               ` Grant Grundler
  2003-07-30 16:36                                 ` Andi Kleen
  1 sibling, 1 reply; 52+ messages in thread
From: Grant Grundler @ 2003-07-30 16:02 UTC (permalink / raw)
  To: David S. Miller
  Cc: Grant Grundler, ak, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Tue, Jul 29, 2003 at 09:51:18PM -0700, David S. Miller wrote:
> Make an ext2 filesystem with 16K blocks :-)

Executive summary:
	looks the same as previous 4k block/16k page w/VMERGE enabled.

davem, I thought you were joking... I've submitted a one-liner to
Ted Ts'o to increase EXT2_MAX_BLOCK_LOG_SIZE so ext2 can allow
64k blocks.  Kudos to willy for quickly digging this up.
16k block size Works For Me (tm).
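
A hedged sketch of the shape of that one-liner (the old value of 12,
i.e. 4k blocks, is assumed here; only the constant's name comes from
the text above):

--- a/include/linux/ext2_fs.h
+++ b/include/linux/ext2_fs.h
-#define EXT2_MAX_BLOCK_LOG_SIZE		12	/* 4k blocks */
+#define EXT2_MAX_BLOCK_LOG_SIZE		16	/* allow up to 64k blocks */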

appended is the re-aim-7 results for 16k page/block on ext2.

grant


iota:/mnt# PATH=$PATH:/mnt/usr/local/bin
iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100
REAIM Workload
Times are in seconds - Child times from tms.cstime and tms.cutime

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
Forked  Time     SysTime UTime   Minute     Child      Time     Percent
100     108.62   12.46   203.84  5689.35    56.89      4.59     4.46     95
200     217.51   25.29   408.11  5682.60    28.41      9.06     4.36     95
300     325.57   38.05   612.20  5694.63    18.98      12.01    3.85     96
400     434.89   50.67   817.90  5684.16    14.21      15.60    3.75     96
500     545.89   65.74   1024.75 5660.51    11.32      29.45    5.72     94
Max Jobs per Minute 5694.63
iota:/mnt#

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30 16:02                               ` Grant Grundler
@ 2003-07-30 16:36                                 ` Andi Kleen
  2003-07-30 17:18                                   ` James Bottomley
  0 siblings, 1 reply; 52+ messages in thread
From: Andi Kleen @ 2003-07-30 16:36 UTC (permalink / raw)
  To: Grant Grundler
  Cc: David S. Miller, ak, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas

On Wed, Jul 30, 2003 at 10:02:50AM -0600, Grant Grundler wrote:
> On Tue, Jul 29, 2003 at 09:51:18PM -0700, David S. Miller wrote:
> > Make an ext2 filesystem with 16K blocks :-)
> 
> Executive summary:
> 	looks the same as previous 4k block/16k page w/VMERGE enabled.
> 
> davem, I thought you were joking...I've submitted a oneliner to
> Ted Tyso to increase EXT2_MAX_BLOCK_LOG_SIZE to 64k.
> kudos to willy for quickly digging this up.
> 16k block size Works For Me (tm).
> 
> appended is the re-aim-7 results for 16k page/block on ext2.

The differences were greater with the MPT-Fusion driver; maybe it has
more overhead, or your I/O subsystem is significantly different.

-Andi

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30 16:36                                 ` Andi Kleen
@ 2003-07-30 17:18                                   ` James Bottomley
  0 siblings, 0 replies; 52+ messages in thread
From: James Bottomley @ 2003-07-30 17:18 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Grant Grundler, David S. Miller, Alan Cox, Jens Axboe, suparna,
	Linux Kernel, alex_williamson, bjorn_helgaas

On Wed, 2003-07-30 at 11:36, Andi Kleen wrote:
> The differences were greater with the mpt fusion driver, maybe it has
> more overhead. Or your IO subsystem is significantly different.

By and large, these results are more like what I expect.

As I've said before, getting SG tables to work efficiently is a core
part of getting an I/O board to function.

There are two places vmerging can help:

1. Reducing the size of the SG table
2. Increasing the length of the I/O for devices with fixed (but small)
SG table lengths.

However, it's important to remember that vmerging comes virtually for
free in the BIO layer, so the only added cost is the programming of the
IOMMU.  This isn't an issue on SPARC, PA-RISC and the like where IOMMU
programming is required to do I/O, but it may be something the
IOMMU-optional architectures (like IA-64 and AMD-64) should consider,
which is where I entered with the IOMMU bypass patch.
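
For reference, here's a stripped-down sketch of the virtual-merge test
itself -- simplified from the BIO layer's boundary check, with the
boundary value assumed for illustration:

/* Two adjacent segments may share one SG entry iff the first ends,
 * and the second starts, on an IOMMU page boundary -- the IOMMU can
 * then paper over the physical discontinuity between them. */
#define BIO_VMERGE_BOUNDARY	0x1000UL	/* assumed: IOMMU page size */

static int virt_mergeable(unsigned long prev_end, unsigned long next_start)
{
	return ((prev_end | next_start) & (BIO_VMERGE_BOUNDARY - 1)) == 0;
}

With BIO_VMERGE_BOUNDARY defined as 0, as in the ia64 runs earlier in
the thread, the mask becomes all-ones and the test can never pass for
real addresses, so virtual merging is switched off entirely.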

James



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC] block layer support for DMA IOMMU bypass mode II
  2003-07-30  2:31                           ` Grant Grundler
@ 2003-08-01 21:51                             ` Cliff White
  2003-08-01 23:18                               ` reaim now available as osdl-aim-7 - " Cliff White
  0 siblings, 1 reply; 52+ messages in thread
From: Cliff White @ 2003-08-01 21:51 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Andi Kleen, davem, alan, James.Bottomley, axboe, suparna,
	linux-kernel, alex_williamson, bjorn_helgaas, cliffw

> On Mon, Jul 28, 2003 at 01:15:13PM +0200, Andi Kleen wrote:
> > Run it with 100-500 users (reaim -f workfile... -s 100 -e 500 -i 100) 
> > I tested with ext3 on a single SCSI disk.
> 
> andi, davem, jens,
> sorry for the long delay. Here's the data for ZX1 using u320 HBA
> (LSI 53c1030) and ST373453LC disk (running U160 IIRC).
> If you need more runs on this config, please ask.
> 
> Executive summary: < 1% difference for this config.
> 
> I'd still like to try a 53c1010 but don't have any installed right now.
> I suspect 53c1010 is a lot less efficient at retrieving SG lists
> and will see a bigger difference in perf.
> 
> 
> One minor issue when starting re-aim-7, but not during the run:
> 
> reaim(343): floating-point assist fault at ip 4000000000017a81, isr 0000020000000008                                                                            
> reaim(343): floating-point assist fault at ip 4000000000017a61, isr 0000020000000008
> 
> 
> For the record, I retrieved source from:
>     http://umn.dl.sourceforge.net/sourceforge/re-aim-7/reaim-0.1.8.tar.gz
> 
> This should be renamed to osdl-aim-7 (or something like that).
> The namespace collision is unfortunate and annoying.

Apologies, the naming was not good.
I'll attempt to change it.  (I like osdl-aim-7.)
In the meantime, you can also get the source from

bk://developer.osdl.org/reaim
cliffw

> 
> hth,
> grant
> 
> 
> 
> 
> #define BIO_VMERGE_BOUNDARY     0 /* (ia64_max_iommu_merge_mask + 1) */
> iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100
> 
> REAIM Workload
> Times are in seconds - Child times from tms.cstime and tms.cutime
> 
> Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
> Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
> 100     110.25   12.67   206.35  5605.19    56.05      3.48     3.29     96   
> 200     219.07   25.56   411.93  5642.06    28.21      7.83     3.73     96   
> 300     327.59   38.23   615.83  5659.46    18.86      12.79    4.09     95   
> 400     436.66   51.19   821.30  5661.19    14.15      18.34    4.42     95   
> 500     548.21   65.76   1029.15 5636.54    11.27      23.34    4.47     95   
> Max Jobs per Minute 5661.19
> iota:/mnt#
> 
> 
> 
> #define BIO_VMERGE_BOUNDARY     (ia64_max_iommu_merge_mask + 1)                 
> 
> iota:/mnt# PATH=$PATH:/mnt/usr/local/bin       
> iota:/mnt# reaim -f /mnt/usr/local/share/reaim/workfile.new_dbase -s100 -e 500 -i 100
> 
> REAIM Workload
> Times are in seconds - Child times from tms.cstime and tms.cutime
> 
> Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
> Forked  Time     SysTime UTime   Minute     Child      Time     Percent 
> 100     108.72   12.47   203.78  5684.17    56.84      4.46     4.32     95   
> 200     217.64   25.59   408.90  5679.16    28.40      8.65     4.16     95   
> 300     326.48   37.88   613.62  5678.81    18.93      13.80    4.44     95   
> 400     434.87   50.53   817.64  5684.46    14.21      17.40    4.18     95   
> 500     544.69   65.23   1022.92 5672.92    11.35      21.53    4.12     95   
> Max Jobs per Minute 5684.46



^ permalink raw reply	[flat|nested] 52+ messages in thread

* reaim now available as osdl-aim-7  - Re: [RFC] block layer support  for DMA IOMMU bypass mode II
  2003-08-01 21:51                             ` Cliff White
@ 2003-08-01 23:18                               ` Cliff White
  0 siblings, 0 replies; 52+ messages in thread
From: Cliff White @ 2003-08-01 23:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Grant Grundler, Andi Kleen, davem, alan, James.Bottomley, axboe,
	suparna, alex_williamson, bjorn_helgaas, cliffw


In response to this comment:
--
> For the record, I retrieved source from:
>     http://umn.dl.sourceforge.net/sourceforge/re-aim-7/reaim-0.1.8.tar.gz
> 
> This should be renamed to osdl-aim-7 (or something like that).
> The namespace collision is unfortunate and annoying.
----------
We're deprecating SourceForge; I'll keep tarballs there, but CVS will
not be good.

The reaim code is now available at

bk://developer.osdl.org/osdl-aim-7

email me if you have problems.

(The symlink trick works, so far :)

cliffw


^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2003-08-01 23:18 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-07-01 16:46 [RFC] block layer support for DMA IOMMU bypass mode James Bottomley
2003-07-01 17:09 ` Andi Kleen
2003-07-01 17:28   ` James Bottomley
2003-07-01 17:42     ` Andi Kleen
2003-07-01 19:22       ` Grant Grundler
2003-07-01 19:56       ` James Bottomley
2003-07-01 17:54     ` H. Peter Anvin
2003-07-01 19:19 ` Grant Grundler
2003-07-01 19:59   ` Alex Williamson
2003-07-01 20:11     ` James Bottomley
2003-07-01 20:03   ` James Bottomley
2003-07-01 23:01     ` Grant Grundler
2003-07-02 15:52       ` James Bottomley
2003-07-01 22:51 ` David S. Miller
2003-07-01 23:57 ` [RFC] block layer support for DMA IOMMU bypass mode II Andi Kleen
2003-07-02  0:03   ` David S. Miller
2003-07-02  0:22     ` Andi Kleen
2003-07-02  0:21       ` David S. Miller
2003-07-02 16:53       ` Grant Grundler
2003-07-02 17:19         ` Andi Kleen
2003-07-02 16:55   ` Grant Grundler
2003-07-02 17:20     ` Andi Kleen
2003-07-02 17:37       ` Grant Grundler
2003-07-02 21:16     ` Alan Cox
2003-07-02 23:56       ` Andi Kleen
2003-07-03 20:26         ` Alan Cox
2003-07-03 21:24           ` Andi Kleen
2003-07-03 22:19             ` Grant Grundler
2003-07-08  2:14             ` David S. Miller
2003-07-08 19:34               ` Andi Kleen
2003-07-08 19:47                 ` Jeff Garzik
2003-07-08 20:10                   ` Andi Kleen
2003-07-08 20:11                   ` Grant Grundler
2003-07-08 22:04                 ` David S. Miller
2003-07-08 22:25                   ` Grant Grundler
2003-07-08 22:23                     ` David S. Miller
2003-07-09 18:55                       ` Andi Kleen
2003-07-23 11:40                       ` Grant Grundler
2003-07-28 11:15                         ` Andi Kleen
2003-07-28 14:59                           ` Grant Grundler
2003-07-30  2:31                           ` Grant Grundler
2003-08-01 21:51                             ` Cliff White
2003-08-01 23:18                               ` reaim now available as osdl-aim-7 - " Cliff White
2003-07-30  4:42                           ` Grant Grundler
2003-07-30  4:51                             ` David S. Miller
2003-07-30 13:06                               ` Grant Grundler
2003-07-30 16:02                               ` Grant Grundler
2003-07-30 16:36                                 ` Andi Kleen
2003-07-30 17:18                                   ` James Bottomley
2003-07-30 14:20                             ` James Bottomley
2003-07-23 13:20                       ` Grant Grundler
2003-07-23 15:30                         ` Jens Axboe

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).