All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-07 15:35 Berkley Shands
  2004-01-07 19:19 ` badari
  0 siblings, 1 reply; 28+ messages in thread
From: Berkley Shands @ 2004-01-07 15:35 UTC (permalink / raw)
  To: gibbs, linux-kernel, linux-scsi, pbadari

	Running with the force segment merge OFF panics the processor after
about 1000 scsi retries. the error given, also in pci-gart.c, is
pci_map_area overflow 4096 bytes
So a brain dead repair kills the kernel. Someone clearly needs to figure
out where to correct the merge of the sg lists. A bit of doc on the iommu
and the 4096 byte limit would be nice too :-)

I see that is is the aborting of an SCB that causes the sg list halt.

Jan  7 09:18:32 typhoon kernel: DevQ(0:6:0): 0 waiting
Jan  7 09:18:32 typhoon kernel: (scsi0:A:2:0): SCB 0x46 - timed out
Jan  7 09:18:32 typhoon kernel: Recovery SCB completes
Jan  7 09:18:32 typhoon kernel: scsi0: Issued Channel A Bus Reset. 3 SCBs aborted
Jan  7 09:18:46 typhoon kernel: Did it again, boss 0000:01:03.0

Since the sg list merges into one i/o list, simply adding s->length = 4096
back into the list seems to keep the kernel up. a better if slightly less
stupid fix is to add up the remaining sg list lengths, and ajust
the sg[0] entry to sum to the correct value.

/*   		BUG_ON(s->length == 0); */
if (! s->length)
   {
   unsigned long zero = sg[0].length;
   unsigned long remain = 0;
   int t = 0;
   
   BUG_ON(i != 1);		/* some other error here */
   
   for (t = i + 1; t < nents; t++)
      remain += sg[t].length;	/* collect remaining sizes */
   zero -= remain;		/* deduct what is left on the list */
   sg[0].length = zero / 2;
   sg[1].length = zero / 2;	/* allocate uniformly */
   size = zero / 2;		/* reduce oversize first entry */
   printk(KERN_WARNING "Did it again, boss %s\n", dev->slot_name);
   }

The better solution is to have the upper layer fix the sg list, or
have some marker that the list was diddled, and save the old entries
to put it back.

berkley

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-07 15:35 [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps Berkley Shands
@ 2004-01-07 19:19 ` badari
  0 siblings, 0 replies; 28+ messages in thread
From: badari @ 2004-01-07 19:19 UTC (permalink / raw)
  To: Berkley Shands; +Cc: gibbs, linux-kernel, linux-scsi

Hi,

Andi Kleen reworked pci-gart code.
Would you try Andi's x86-64 patch and see if the problems still exist ?

ftp://ftp.x86-64.org/pub/linux/v2.6/x86_64-2.6.1rc2-1.bz2

And also, Can you try with "iommu=noforce,nomerge" ?

Fixing the sg-list in the upperlayer (by re-creating) it in case of retry
worked for me. I am still not sure why you are running into "len==0"
panics.

Thanks,
Badari

Berkley Shands wrote:

>         Running with the force segment merge OFF panics the processor after
> about 1000 scsi retries. the error given, also in pci-gart.c, is
> pci_map_area overflow 4096 bytes
> So a brain dead repair kills the kernel. Someone clearly needs to figure
> out where to correct the merge of the sg lists. A bit of doc on the iommu
> and the 4096 byte limit would be nice too :-)
>
> I see that is is the aborting of an SCB that causes the sg list halt.
>
> Jan  7 09:18:32 typhoon kernel: DevQ(0:6:0): 0 waiting
> Jan  7 09:18:32 typhoon kernel: (scsi0:A:2:0): SCB 0x46 - timed out
> Jan  7 09:18:32 typhoon kernel: Recovery SCB completes
> Jan  7 09:18:32 typhoon kernel: scsi0: Issued Channel A Bus Reset. 3 SCBs aborted
> Jan  7 09:18:46 typhoon kernel: Did it again, boss 0000:01:03.0
>
> Since the sg list merges into one i/o list, simply adding s->length = 4096
> back into the list seems to keep the kernel up. a better if slightly less
> stupid fix is to add up the remaining sg list lengths, and ajust
> the sg[0] entry to sum to the correct value.
>
> /*              BUG_ON(s->length == 0); */
> if (! s->length)
>    {
>    unsigned long zero = sg[0].length;
>    unsigned long remain = 0;
>    int t = 0;
>
>    BUG_ON(i != 1);              /* some other error here */
>
>    for (t = i + 1; t < nents; t++)
>       remain += sg[t].length;   /* collect remaining sizes */
>    zero -= remain;              /* deduct what is left on the list */
>    sg[0].length = zero / 2;
>    sg[1].length = zero / 2;     /* allocate uniformly */
>    size = zero / 2;             /* reduce oversize first entry */
>    printk(KERN_WARNING "Did it again, boss %s\n", dev->slot_name);
>    }
>
> The better solution is to have the upper layer fix the sg list, or
> have some marker that the list was diddled, and save the old entries
> to put it back.
>
> berkley


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-07 16:33 Berkley Shands
  0 siblings, 0 replies; 28+ messages in thread
From: Berkley Shands @ 2004-01-07 16:33 UTC (permalink / raw)
  To: berkley, davem; +Cc: gibbs, linux-kernel, linux-scsi

	Here is a somewhat more robust (and trivial) fix for the sg iommu
problem on the x86_64. Still does not correct the sg list modification logic though :-)

--- /raid0/kernels/linux/arch/x86_64/kernel/pci-gart.c  2004-01-05 10:16:11.000000000 -0600
+++ pci-gart.c  2004-01-07 10:14:47.000000000 -0600
@@ -446,6 +446,16 @@
                return 0;
        out = 0;
        start = 0;
+
+       /* see if we are recovering from an error */
+
+       if (sg[0].oldlength) {
+               printk(KERN_WARNING "Recovering sg list for %s\n", dev->slot_name);
+               for (i = 0; i < nents; i++) {
+                       sg[i].length = sg[i].oldlength;
+                       }
+               }
+
        for (i = 0; i < nents; i++) {
                struct scatterlist *s = &sg[i];
                dma_addr_t addr = page_to_phys(s->page) + s->offset;
@@ -453,6 +463,7 @@
                BUG_ON(s->length == 0); 
 
                size += s->length; 
+               s->oldlength = s->length;
 
                /* Handle the previous not yet processed entries */
                if (i > start) {

--- /raid0/kernels/linux/include/asm-x86_64/scatterlist.h       2003-12-17 20:59:39.000000000 -0600
+++ scatterlist.h       2004-01-07 10:01:06.000000000 -0600
@@ -5,6 +5,7 @@
     struct page                *page;
     unsigned int       offset;
     unsigned int       length;
+    unsigned int       oldlength;
     dma_addr_t         dma_address;
 };


berkley

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-06  3:04               ` David S. Miller
@ 2004-01-06  3:14                 ` James Bottomley
  -1 siblings, 0 replies; 28+ messages in thread
From: James Bottomley @ 2004-01-06  3:14 UTC (permalink / raw)
  To: David S. Miller; +Cc: Andi Kleen, Linux Kernel, SCSI Mailing List, gibbs

On Mon, 2004-01-05 at 21:04, David S. Miller wrote:
> On Tue, 6 Jan 2004 04:06:40 +0100
> Andi Kleen <ak@suse.de> wrote:
> 
> > It's ok. I fixed the code now[1] If you have other undocumented requirements
> > you should document them though, otherwise there may be more problems.
> > Since merging is disabled by default now it won't trigger anyways.
> 
> I totally agree with you Andi, not documenting this properly was an
> oversight and not intentional.

I'll go over the SCSI code and see if I spot any other lacunae in the
API.

James



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-06  3:14                 ` James Bottomley
  0 siblings, 0 replies; 28+ messages in thread
From: James Bottomley @ 2004-01-06  3:14 UTC (permalink / raw)
  To: David S. Miller; +Cc: Andi Kleen, Linux Kernel, SCSI Mailing List, gibbs

On Mon, 2004-01-05 at 21:04, David S. Miller wrote:
> On Tue, 6 Jan 2004 04:06:40 +0100
> Andi Kleen <ak@suse.de> wrote:
> 
> > It's ok. I fixed the code now[1] If you have other undocumented requirements
> > you should document them though, otherwise there may be more problems.
> > Since merging is disabled by default now it won't trigger anyways.
> 
> I totally agree with you Andi, not documenting this properly was an
> oversight and not intentional.

I'll go over the SCSI code and see if I spot any other lacunae in the
API.

James



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-06  0:05           ` James Bottomley
@ 2004-01-06  3:06             ` Andi Kleen
  -1 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-06  3:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: davem, linux-kernel, linux-scsi, gibbs

On 05 Jan 2004 18:05:47 -0600
James Bottomley <James.Bottomley@steeleye.com> wrote:

> On Mon, 2004-01-05 at 15:31, Andi Kleen wrote:
> > For the sake of bug-to-bug compatibility to the SCSI layer this patch may
> > work. I haven't tested it so no guarantees if it won't eat your file systems.
> > Feedback welcome anyways.
> 
> This isn't a bug in SCSI, it's a deliberate design feature.  SCSI has
> certain events, like QUEUE full that cause us to re-queue the pending
> I/O.  Other block layer drivers that can get these EAGAIN type queueing
> problems from the device also follow this model.

It's ok. I fixed the code now[1] If you have other undocumented requirements
you should document them though, otherwise there may be more problems.
Since merging is disabled by default now it won't trigger anyways.

[1] will be in next merge after testing, the last patch i posted still
had one bug.

> As to the idempotence of map/unmap: I'm ambivalent.  If it's going to be
> a performance hit to return the sg list to its prior state in unmap,
> then it does seem a waste given that for most of our I/O transactions we
> simply free the sg list after the unmap.

With the evil dma_length trick it is actually near zero cost.

> 1. Fix the x86_64 mapping layer as your patch proposes (how much of a
> performance hit on every transaction will this be)?

I don't expect a significant performance hit.

-Andi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-06  3:06             ` Andi Kleen
  0 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-06  3:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: davem, linux-kernel, linux-scsi, gibbs

On 05 Jan 2004 18:05:47 -0600
James Bottomley <James.Bottomley@steeleye.com> wrote:

> On Mon, 2004-01-05 at 15:31, Andi Kleen wrote:
> > For the sake of bug-to-bug compatibility to the SCSI layer this patch may
> > work. I haven't tested it so no guarantees if it won't eat your file systems.
> > Feedback welcome anyways.
> 
> This isn't a bug in SCSI, it's a deliberate design feature.  SCSI has
> certain events, like QUEUE full that cause us to re-queue the pending
> I/O.  Other block layer drivers that can get these EAGAIN type queueing
> problems from the device also follow this model.

It's ok. I fixed the code now[1] If you have other undocumented requirements
you should document them though, otherwise there may be more problems.
Since merging is disabled by default now it won't trigger anyways.

[1] will be in next merge after testing, the last patch i posted still
had one bug.

> As to the idempotence of map/unmap: I'm ambivalent.  If it's going to be
> a performance hit to return the sg list to its prior state in unmap,
> then it does seem a waste given that for most of our I/O transactions we
> simply free the sg list after the unmap.

With the evil dma_length trick it is actually near zero cost.

> 1. Fix the x86_64 mapping layer as your patch proposes (how much of a
> performance hit on every transaction will this be)?

I don't expect a significant performance hit.

-Andi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-06  3:06             ` Andi Kleen
@ 2004-01-06  3:04               ` David S. Miller
  -1 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-06  3:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: James.Bottomley, linux-kernel, linux-scsi, gibbs

On Tue, 6 Jan 2004 04:06:40 +0100
Andi Kleen <ak@suse.de> wrote:

> It's ok. I fixed the code now[1] If you have other undocumented requirements
> you should document them though, otherwise there may be more problems.
> Since merging is disabled by default now it won't trigger anyways.

I totally agree with you Andi, not documenting this properly was an
oversight and not intentional.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-06  3:04               ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-06  3:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: James.Bottomley, linux-kernel, linux-scsi, gibbs

On Tue, 6 Jan 2004 04:06:40 +0100
Andi Kleen <ak@suse.de> wrote:

> It's ok. I fixed the code now[1] If you have other undocumented requirements
> you should document them though, otherwise there may be more problems.
> Since merging is disabled by default now it won't trigger anyways.

I totally agree with you Andi, not documenting this properly was an
oversight and not intentional.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-05 21:31         ` Andi Kleen
@ 2004-01-06  0:05           ` James Bottomley
  -1 siblings, 0 replies; 28+ messages in thread
From: James Bottomley @ 2004-01-06  0:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, Linux Kernel, SCSI Mailing List, gibbs

On Mon, 2004-01-05 at 15:31, Andi Kleen wrote:
> For the sake of bug-to-bug compatibility to the SCSI layer this patch may
> work. I haven't tested it so no guarantees if it won't eat your file systems.
> Feedback welcome anyways.

This isn't a bug in SCSI, it's a deliberate design feature.  SCSI has
certain events, like QUEUE full that cause us to re-queue the pending
I/O.  Other block layer drivers that can get these EAGAIN type queueing
problems from the device also follow this model.

The reason for doing this is the prep/request model for block drivers
(although the behaviour pre-dates the bio model).  As part of the prep
function, we prepare the mapping list that is later passed to
dma_map_sg().  Requeueing is done from the request function; by design,
we don't re-prepare the requests (and hence don't free and rebuild the
sg list).

Like Dave says, we rely on the sequence map/unmap/map to produce a
correct SG list.  This does give you slightly more leeway, since we
never look at the sg list after the unmap, for SCSI it doesn't have to
be returned to the pre-map state as long as re-mapping it is correct.

As to the idempotence of map/unmap: I'm ambivalent.  If it's going to be
a performance hit to return the sg list to its prior state in unmap,
then it does seem a waste given that for most of our I/O transactions we
simply free the sg list after the unmap.

It looks like we're down to a choice of three

1. Fix the x86_64 mapping layer as your patch proposes (how much of a
performance hit on every transaction will this be)?
2. Change SCSI (and every other block driver) to re-prepare requeued
I/O.  This will incur a free/setup overhead penalty, but only in the
requeue path.
3. Don't unmap the I/O in the requeue case.  I like this least because
the LLD is responsible for map/unmap and the mid-layer is responsible
for requeueing, so it's a layering violation (as well as a waste of
potentially valuable mapping resources).

On the whole, I prefer 1. Partly because it doesn't involve extra work
for me ;-) but also because idempotence is an appealing property from a
layering point of view.

If we're agreed on this, I can add it to the DMA docs.

James



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-06  0:05           ` James Bottomley
  0 siblings, 0 replies; 28+ messages in thread
From: James Bottomley @ 2004-01-06  0:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: David S. Miller, Linux Kernel, SCSI Mailing List, gibbs

On Mon, 2004-01-05 at 15:31, Andi Kleen wrote:
> For the sake of bug-to-bug compatibility to the SCSI layer this patch may
> work. I haven't tested it so no guarantees if it won't eat your file systems.
> Feedback welcome anyways.

This isn't a bug in SCSI, it's a deliberate design feature.  SCSI has
certain events, like QUEUE full that cause us to re-queue the pending
I/O.  Other block layer drivers that can get these EAGAIN type queueing
problems from the device also follow this model.

The reason for doing this is the prep/request model for block drivers
(although the behaviour pre-dates the bio model).  As part of the prep
function, we prepare the mapping list that is later passed to
dma_map_sg().  Requeueing is done from the request function; by design,
we don't re-prepare the requests (and hence don't free and rebuild the
sg list).

Like Dave says, we rely on the sequence map/unmap/map to produce a
correct SG list.  This does give you slightly more leeway, since we
never look at the sg list after the unmap, for SCSI it doesn't have to
be returned to the pre-map state as long as re-mapping it is correct.

As to the idempotence of map/unmap: I'm ambivalent.  If it's going to be
a performance hit to return the sg list to its prior state in unmap,
then it does seem a waste given that for most of our I/O transactions we
simply free the sg list after the unmap.

It looks like we're down to a choice of three

1. Fix the x86_64 mapping layer as your patch proposes (how much of a
performance hit on every transaction will this be)?
2. Change SCSI (and every other block driver) to re-prepare requeued
I/O.  This will incur a free/setup overhead penalty, but only in the
requeue path.
3. Don't unmap the I/O in the requeue case.  I like this least because
the LLD is responsible for map/unmap and the mid-layer is responsible
for requeueing, so it's a layering violation (as well as a waste of
potentially valuable mapping resources).

On the whole, I prefer 1. Partly because it doesn't involve extra work
for me ;-) but also because idempotence is an appealing property from a
layering point of view.

If we're agreed on this, I can add it to the DMA docs.

James

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-05 21:01       ` David S. Miller
@ 2004-01-05 21:31         ` Andi Kleen
  -1 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 21:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, linux-scsi, gibbs

On Mon, 5 Jan 2004 13:01:18 -0800
"David S. Miller" <davem@redhat.com> wrote:

> On 05 Jan 2004 22:02:19 +0100
> Andi Kleen <ak@suse.de> wrote:
> 
> > It sets length to zero to terminate the list when entries were merged.
> > It doesn't have a dma_length.
> 
> I understand, and you are defining dma_length to just use the
> normal sg->length field, and I'm trying to explain to you that this
> is not allowed.  If you want to modify the length field to zero terminate
> the DMA chunks, you must have a seperate dma_length field in your
> platforms scatterlist structure.

DMA-mapping.txt does not ever mention a dma_length or that remapping
is legal. 

> Again, for the 3rd time, see what sparc64 is doing here.

Sorry Dave, I'm not going to reverse engineer your cryptic mess.
If you have any more such undocumented requirements you should definitely
add them to the Spec.

For the sake of bug-to-bug compatibility to the SCSI layer this patch may
work. I haven't tested it so no guarantees if it won't eat your file systems.
Feedback welcome anyways.

-Andi

diff -u linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c-o linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c
--- linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c-o	2004-01-01 06:40:28.000000000 +0100
+++ linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c	2004-01-05 22:17:49.000000000 +0100
@@ -384,6 +395,7 @@
 			}
 		}
 		s->dma_address = addr;
+		s->dma_length = s->length;
 	}
 	flush_gart(dev);
 	return nents;
@@ -410,8 +422,9 @@
 			*sout = *s; 
 			sout->dma_address = iommu_bus_base;
 			sout->dma_address += iommu_page*PAGE_SIZE + s->offset;
+			sout->dma_length = s->length;
 		} else { 
-			sout->length += s->length; 
+			sout->dma_length += s->length; 
 		}
 
 		addr = phys_addr;
@@ -538,9 +551,9 @@
 	int i;
 	for (i = 0; i < nents; i++) { 
 		struct scatterlist *s = &sg[i];
-		if (!s->length) 
+		if (!s->dma_length || !s->length) 
 			break;
-		pci_unmap_single(dev, s->dma_address, s->length, dir);
+		pci_unmap_single(dev, s->dma_address, s->dma_length, dir);
 	}
 }
 
diff -u linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h-o linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h
--- linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h-o	2003-07-18 02:40:01.000000000 +0200
+++ linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h	2004-01-05 22:10:15.000000000 +0100
@@ -4,8 +4,9 @@
 struct scatterlist {
     struct page		*page;
     unsigned int	offset;
-    unsigned int	length;
+    unsigned int	length;  
     dma_addr_t		dma_address;
+    unsigned int        dma_length;
 };
 
 #define ISA_DMA_THRESHOLD (0x00ffffff)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 21:31         ` Andi Kleen
  0 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 21:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, linux-scsi, gibbs

On Mon, 5 Jan 2004 13:01:18 -0800
"David S. Miller" <davem@redhat.com> wrote:

> On 05 Jan 2004 22:02:19 +0100
> Andi Kleen <ak@suse.de> wrote:
> 
> > It sets length to zero to terminate the list when entries were merged.
> > It doesn't have a dma_length.
> 
> I understand, and you are defining dma_length to just use the
> normal sg->length field, and I'm trying to explain to you that this
> is not allowed.  If you want to modify the length field to zero terminate
> the DMA chunks, you must have a seperate dma_length field in your
> platforms scatterlist structure.

DMA-mapping.txt does not ever mention a dma_length or that remapping
is legal. 

> Again, for the 3rd time, see what sparc64 is doing here.

Sorry Dave, I'm not going to reverse engineer your cryptic mess.
If you have any more such undocumented requirements you should definitely
add them to the Spec.

For the sake of bug-to-bug compatibility to the SCSI layer this patch may
work. I haven't tested it so no guarantees if it won't eat your file systems.
Feedback welcome anyways.

-Andi

diff -u linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c-o linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c
--- linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c-o	2004-01-01 06:40:28.000000000 +0100
+++ linux-2.6.1rc1-amd64/arch/x86_64/kernel/pci-gart.c	2004-01-05 22:17:49.000000000 +0100
@@ -384,6 +395,7 @@
 			}
 		}
 		s->dma_address = addr;
+		s->dma_length = s->length;
 	}
 	flush_gart(dev);
 	return nents;
@@ -410,8 +422,9 @@
 			*sout = *s; 
 			sout->dma_address = iommu_bus_base;
 			sout->dma_address += iommu_page*PAGE_SIZE + s->offset;
+			sout->dma_length = s->length;
 		} else { 
-			sout->length += s->length; 
+			sout->dma_length += s->length; 
 		}
 
 		addr = phys_addr;
@@ -538,9 +551,9 @@
 	int i;
 	for (i = 0; i < nents; i++) { 
 		struct scatterlist *s = &sg[i];
-		if (!s->length) 
+		if (!s->dma_length || !s->length) 
 			break;
-		pci_unmap_single(dev, s->dma_address, s->length, dir);
+		pci_unmap_single(dev, s->dma_address, s->dma_length, dir);
 	}
 }
 
diff -u linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h-o linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h
--- linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h-o	2003-07-18 02:40:01.000000000 +0200
+++ linux-2.6.1rc1-amd64/include/asm-x86_64/scatterlist.h	2004-01-05 22:10:15.000000000 +0100
@@ -4,8 +4,9 @@
 struct scatterlist {
     struct page		*page;
     unsigned int	offset;
-    unsigned int	length;
+    unsigned int	length;  
     dma_addr_t		dma_address;
+    unsigned int        dma_length;
 };
 
 #define ISA_DMA_THRESHOLD (0x00ffffff)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 20:35   ` David S. Miller
@ 2004-01-05 21:10     ` Andi Kleen
  0 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 21:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: Andi Kleen, gibbs, linux-kernel, linux-scsi

On Mon, Jan 05, 2004 at 12:35:09PM -0800, David S. Miller wrote:
> On Mon, 05 Jan 2004 20:47:19 +0100
> Andi Kleen <ak@muc.de> wrote:
> 
> > Actually I disabled merging by default in the latest x86-64 code,
> > but it can be still enabled by the user using options (it makes some
> > adapters run several percent faster). I would appreciate if you could
> > fix the problem anyways.
> > 
> > I was actually planning to add a BUG() for this. Should do that.
> > There is already one that triggers often when the problem occurs.
> 
> Andi, you must not modify sg->length in any way shape or form.
> 
> The following is legal:
> 
> 	pci_map_sg(..&sg);
> 	pci_unmap_sg(...&sg);
> 	pci_map_sg(..&sg);

Well, on x86-64 it is not legal.

> If you must modify the length field for DMA, you must have a seperate
> dma_length member of the scatterlist structure on your platform, see what
> sparc64 does here.
> 
> If the documentation states this wrongly, it's a doc bug.

The documentation doesn't allow it so I didn't implement it.

-Andi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
       [not found] ` <20040105112800.7a9f240b.davem@redhat.com.suse.lists.linux.kernel>
@ 2004-01-05 21:02   ` Andi Kleen
  2004-01-05 21:01       ` David S. Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 21:02 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel, linux-scsi, gibbs

"David S. Miller" <davem@redhat.com> writes:

> On Mon, 5 Jan 2004 13:29:54 -0600 (CST)
> Berkley Shands <berkley@cs.wustl.edu> wrote:
> 
> > 	The pci layer is modifying the sg list, and then placing a zero
> > in the length field. pci-gart.c at line 453 (2.6.0 sources) checks this length field
> > after a retry, sees that it is zero, and bughalts.
> 
> Oh that's a bug.  It is allowed to modify the dma_length field but not
> the physical length field.

It sets length to zero to terminate the list when entries were merged.
It doesn't have a dma_length.

It tripping over remapped lists is an side effect, but an useful one 
because remapping is not supported (merging destroys information that
cannot be reconstructed). If the bug didn't exist you would get data
corruption.

-Andi

P.S.: The x86-64 IOMMU code in 2.6.0 was buggy. Use current -bk*.
It will avoid the problem because merging is turned off by default, 
but it should be still fixed.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple  map/unmaps
  2004-01-05 21:02   ` Andi Kleen
@ 2004-01-05 21:01       ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-05 21:01 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux-scsi, gibbs

On 05 Jan 2004 22:02:19 +0100
Andi Kleen <ak@suse.de> wrote:

> It sets length to zero to terminate the list when entries were merged.
> It doesn't have a dma_length.

I understand, and you are defining dma_length to just use the
normal sg->length field, and I'm trying to explain to you that this
is not allowed.  If you want to modify the length field to zero terminate
the DMA chunks, you must have a seperate dma_length field in your
platforms scatterlist structure.

Again, for the 3rd time, see what sparc64 is doing here.

> It tripping over remapped lists is an side effect, but an useful one 
> because remapping is not supported (merging destroys information that
> cannot be reconstructed). If the bug didn't exist you would get data
> corruption.

You should not be modifying any portion of the non-DMA fields.
Therefore, if the SG is unmapped, then passed into your IOMMU code for
a future map call, it should just work.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 21:01       ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-05 21:01 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, linux-scsi, gibbs

On 05 Jan 2004 22:02:19 +0100
Andi Kleen <ak@suse.de> wrote:

> It sets length to zero to terminate the list when entries were merged.
> It doesn't have a dma_length.

I understand, and you are defining dma_length to just use the
normal sg->length field, and I'm trying to explain to you that this
is not allowed.  If you want to modify the length field to zero terminate
the DMA chunks, you must have a seperate dma_length field in your
platforms scatterlist structure.

Again, for the 3rd time, see what sparc64 is doing here.

> It tripping over remapped lists is an side effect, but an useful one 
> because remapping is not supported (merging destroys information that
> cannot be reconstructed). If the bug didn't exist you would get data
> corruption.

You should not be modifying any portion of the non-DMA fields.
Therefore, if the SG is unmapped, then passed into your IOMMU code for
a future map call, it should just work.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
       [not found]   ` <2997092704.1073333041@aslan.btc.adaptec.com.suse.lists.linux.kernel>
@ 2004-01-05 20:58     ` Andi Kleen
  0 siblings, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 20:58 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, linux-scsi

"Justin T. Gibbs" <gibbs@scsiguy.com> writes:

> > "Justin T. Gibbs" <gibbs@scsiguy.com> writes:
> > 
> >> Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
> >> code for x86_64 modifies the passed in S/G list to compact it for mapping
> >> by the GART.  This modification is not reversed when pci_unmap_sg is
> >> called.  In the case of a retried SCSI command, this causes any attempt
> > 
> > Qlogic has the same bug.
> 
> Which bug is this?  Not updating use_sg, or mapping the command the
> second time when it is retried?  If the latter, I don't think this is an
> HBA bug.  The HBA driver doesn't know that the command has been mapped in
> the past, so it will blindly map/unmap it again.

In some cases x86-64 pci_map_sg causes a BUG() when you pass it an already
mapped list. I've got reports of that happening with Qlogic.

I haven't analyzed it in detail, whether the mid layer or the driver
or someone else is to blame.
 
> ...
> 
> >> DMA-API.txt doesn't seem to cover this issue.
> > 
> > It does actually, but not very clearly. I've suggested changes of wording
> > recently to make this more clear, but it hasn't gotten in.
> 
> Can you point to the section of the document you believe applies?

It documents that pci_map_sg rewrites the sg list and never guarantees
that the rewriting is undone on pci_unmap_sg.

> > You should never remap an already mapped sg list.
> 
> But the sg list is no longer mapped.  The HBA driver did call pci_unmap_sg
> on it.  Did you mean to say, "Never map an sg list again that has been
> mapped in the past"?

Yep. While it would be in theory possible to reconstruct the list at
unmap time it would be extremly costly (require lots of uncached
memory access on x86-64) and is not done.

> 
> > Either reuse the already mapped list or keep a copy of the original list
> > around. First is better because the later may have problems with the page
> > reference counts.
> 
> The mid-layer doesn't map the list.  The HBA drivers do.  So you're saying
> that either the mid-layer or the HBA drivers need to copy the list so it
> can be restored just in case the command will be retried?

I'm just pointing out the requirements of pci_map_sg.

Either you can keep a flag around somewhere that says if the list is
mapped or not and skip the remapping and don't do a unmap in
between. Or copy and make sure the page reference counts are correctly
maintained.  I'm not very intimate with the SCSI/block code so you
guys have to decide what is more convenient.

-Andi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 19:47 ` Andi Kleen
  2004-01-05 20:04   ` Justin T. Gibbs
@ 2004-01-05 20:35   ` David S. Miller
  2004-01-05 21:10     ` Andi Kleen
  1 sibling, 1 reply; 28+ messages in thread
From: David S. Miller @ 2004-01-05 20:35 UTC (permalink / raw)
  To: Andi Kleen; +Cc: gibbs, linux-kernel, linux-scsi

On Mon, 05 Jan 2004 20:47:19 +0100
Andi Kleen <ak@muc.de> wrote:

> Actually I disabled merging by default in the latest x86-64 code,
> but it can be still enabled by the user using options (it makes some
> adapters run several percent faster). I would appreciate if you could
> fix the problem anyways.
> 
> I was actually planning to add a BUG() for this. Should do that.
> There is already one that triggers often when the problem occurs.

Andi, you must not modify sg->length in any way shape or form.

The following is legal:

	pci_map_sg(..&sg);
	pci_unmap_sg(...&sg);
	pci_map_sg(..&sg);

If you must modify the length field for DMA, you must have a seperate
dma_length member of the scatterlist structure on your platform, see what
sparc64 does here.

If the documentation states this wrongly, it's a doc bug.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 19:47 ` Andi Kleen
@ 2004-01-05 20:04   ` Justin T. Gibbs
  2004-01-05 20:35   ` David S. Miller
  1 sibling, 0 replies; 28+ messages in thread
From: Justin T. Gibbs @ 2004-01-05 20:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, davem, linux-scsi

> "Justin T. Gibbs" <gibbs@scsiguy.com> writes:
> 
>> Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
>> code for x86_64 modifies the passed in S/G list to compact it for mapping
>> by the GART.  This modification is not reversed when pci_unmap_sg is
>> called.  In the case of a retried SCSI command, this causes any attempt
> 
> Qlogic has the same bug.

Which bug is this?  Not updating use_sg, or mapping the command the
second time when it is retried?  If the latter, I don't think this is an
HBA bug.  The HBA driver doesn't know that the command has been mapped in
the past, so it will blindly map/unmap it again.

...

>> DMA-API.txt doesn't seem to cover this issue.
> 
> It does actually, but not very clearly. I've suggested changes of wording
> recently to make this more clear, but it hasn't gotten in.

Can you point to the section of the document you believe applies?

> You should never remap an already mapped sg list.

But the sg list is no longer mapped.  The HBA driver did call pci_unmap_sg
on it.  Did you mean to say, "Never map an sg list again that has been
mapped in the past"?

> Either reuse the already mapped list or keep a copy of the original list
> around. First is better because the later may have problems with the page
> reference counts.

The mid-layer doesn't map the list.  The HBA drivers do.  So you're saying
that either the mid-layer or the HBA drivers need to copy the list so it
can be restored just in case the command will be retried?

--
Justin


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 20:00 Berkley Shands
  0 siblings, 0 replies; 28+ messages in thread
From: Berkley Shands @ 2004-01-05 20:00 UTC (permalink / raw)
  To: berkley, davem; +Cc: gibbs, linux-kernel, linux-scsi

	   		BUG_ON(s->length == 0);

		size += s->length; 

		/* Handle the previous not yet processed entries */

		if (i > start) {
			struct scatterlist *ps = &sg[i-1];
			/* Can only merge when the last chunk ends on a page 
			   boundary. */
			if (1 || !force_iommu || !need || (i-1 > start && ps->offset) ||
			    (ps->offset + ps->length) % PAGE_SIZE) { 
				if (pci_map_cont(sg, start, i, sg+out, pages, 
						 need) < 0)
					goto error;

If I totally disable coalescing, in arch/x86_64/pci-gart.c by adding the if (1 || ...
then the bughalt goes away. So might some performance, but a system that runs is
MUCH better than one that hard faults on a regular basis.

The shipped scsi driver still needs to be updated to NOT "offline" the drives.
the 2.0.5 driver does this quite well.

Thank you.

berkley

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 17:57 Justin T. Gibbs
  2004-01-05 19:22 ` David S. Miller
  2004-01-05 19:41 ` Badari Pulavarty
@ 2004-01-05 19:47 ` Andi Kleen
  2004-01-05 20:04   ` Justin T. Gibbs
  2004-01-05 20:35   ` David S. Miller
  2 siblings, 2 replies; 28+ messages in thread
From: Andi Kleen @ 2004-01-05 19:47 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, davem, linux-scsi

"Justin T. Gibbs" <gibbs@scsiguy.com> writes:

> Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
> code for x86_64 modifies the passed in S/G list to compact it for mapping
> by the GART.  This modification is not reversed when pci_unmap_sg is
> called.  In the case of a retried SCSI command, this causes any attempt

Qlogic has the same bug.

Actually I disabled merging by default in the latest x86-64 code,
but it can be still enabled by the user using options (it makes some
adapters run several percent faster). I would appreciate if you could
fix the problem anyways.

I was actually planning to add a BUG() for this. Should do that.
There is already one that triggers often when the problem occurs.

> to map the command a second time to fail with a BUG assertion since the
> nseg parameter passed into the second map call is state.  nseg comes from
> the "use_sg" field in the SCSI command structure which is never touched
> by the HBA drivers invoking pci_map_sg.
>
> DMA-API.txt doesn't seem to cover this issue.  Should the low-level DMA

It does actually, but not very clearly. I've suggested changes of wording
recently to make this more clear, but it hasn't gotten in.

> code restore the S/G list to its original state on unmap or should the
> SCSI HBA drivers be changed to update "use_sg" with the segment count
> reported by the pci_map_sg() API?  If the latter, this seems to contradict

You should never remap an already mapped sg list. Either reuse the already mapped
list or keep a copy of the original list around. First is better because the later
may have problems with the page reference counts.

-Andi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 17:57 Justin T. Gibbs
  2004-01-05 19:22 ` David S. Miller
@ 2004-01-05 19:41 ` Badari Pulavarty
  2004-01-05 19:47 ` Andi Kleen
  2 siblings, 0 replies; 28+ messages in thread
From: Badari Pulavarty @ 2004-01-05 19:41 UTC (permalink / raw)
  To: Justin T. Gibbs, Kernel Mailinglist, linux-scsi; +Cc: Berkley Shands

On Monday 05 January 2004 09:57 am, Justin T. Gibbs wrote:
> Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
> code for x86_64 modifies the passed in S/G list to compact it for mapping
> by the GART.  This modification is not reversed when pci_unmap_sg is
> called.  In the case of a retried SCSI command, this causes any attempt
> to map the command a second time to fail with a BUG assertion since the
> nseg parameter passed into the second map call is state.  nseg comes from
> the "use_sg" field in the SCSI command structure which is never touched
> by the HBA drivers invoking pci_map_sg.
>
> DMA-API.txt doesn't seem to cover this issue.  Should the low-level DMA
> code restore the S/G list to its original state on unmap or should the
> SCSI HBA drivers be changed to update "use_sg" with the segment count
> reported by the pci_map_sg() API?  If the latter, this seems to contradict
> the mandate in DMA-API that the nseg parameter passed into the unmap call
> be the same as that passed into the map call.  Most of the kernel assumes
> that an S/G list can be mapped an unmapped multiple times using the same
> arguments.  This doesn't seem to me to be an unreasonable expectation.

Hi,

I ran into the same thing a month ago. I talked to Andi Kleen about this.
He seems to think that, it is okay to modify the sg-list in pci_map_sg().
pci_unmap_sg() can't restore it back, since we lost lot of information.
One option is to recreate entire sg-list in case of a retry. I used a patch 
similar to following to do this. Is this acceptable ?

Thanks,
Badari

--- scsi_lib.c  2004-01-04 19:53:56.000000000 -0800
+++ scsi_lib.c.new      2004-01-04 19:57:35.000000000 -0800
@@ -308,6 +308,13 @@ void scsi_setup_cmd_retry(struct scsi_cm
        cmd->cmd_len = cmd->old_cmd_len;
        cmd->sc_data_direction = cmd->sc_old_data_direction;
        cmd->underflow = cmd->old_underflow;
+       if (cmd->use_sg) {
+               struct request     *req = cmd->request;
+               int count;
+
+               count = blk_rq_map_sg(req->q, req, cmd->request_buffer);
+               BUG_ON(count != cmd->use_sg);
+       }
 }



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 19:29 Berkley Shands
  2004-01-05 19:28   ` David S. Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Berkley Shands @ 2004-01-05 19:29 UTC (permalink / raw)
  To: davem, gibbs; +Cc: berkley, linux-kernel, linux-scsi

	The pci layer is modifying the sg list, and then placing a zero
in the length field. pci-gart.c at line 453 (2.6.0 sources) checks this length field
after a retry, sees that it is zero, and bughalts.
	As Justin points out, SOMEBODY needs to reset the sg list, or kick the error
back up far enough to recreate it completely. Or just don't bother to
combine the entries. A machine with 8GB or more can stand to lose a few bytes.
How many hardware devices can handle a 1MB+ I/O buffer being passed to them anyway?

berkley

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 19:29 Berkley Shands
@ 2004-01-05 19:28   ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-05 19:28 UTC (permalink / raw)
  To: Berkley Shands; +Cc: gibbs, berkley, linux-kernel, linux-scsi

On Mon, 5 Jan 2004 13:29:54 -0600 (CST)
Berkley Shands <berkley@cs.wustl.edu> wrote:

> 	The pci layer is modifying the sg list, and then placing a zero
> in the length field. pci-gart.c at line 453 (2.6.0 sources) checks this length field
> after a retry, sees that it is zero, and bughalts.

Oh that's a bug.  It is allowed to modify the dma_length field but not
the physical length field.

I imagine x86_64 is doing this so that there need not be a seperate dma_length
field in the scatter_gather struct defined for that platform, and that's too bad it will
definitely need such a seperate field if it wants to implement coalescing.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 19:28   ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-05 19:28 UTC (permalink / raw)
  Cc: gibbs, berkley, linux-kernel, linux-scsi

On Mon, 5 Jan 2004 13:29:54 -0600 (CST)
Berkley Shands <berkley@cs.wustl.edu> wrote:

> 	The pci layer is modifying the sg list, and then placing a zero
> in the length field. pci-gart.c at line 453 (2.6.0 sources) checks this length field
> after a retry, sees that it is zero, and bughalts.

Oh that's a bug.  It is allowed to modify the dma_length field but not
the physical length field.

I imagine x86_64 is doing this so that there need not be a seperate dma_length
field in the scatter_gather struct defined for that platform, and that's too bad it will
definitely need such a seperate field if it wants to implement coalescing.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
  2004-01-05 17:57 Justin T. Gibbs
@ 2004-01-05 19:22 ` David S. Miller
  2004-01-05 19:41 ` Badari Pulavarty
  2004-01-05 19:47 ` Andi Kleen
  2 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2004-01-05 19:22 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: linux-kernel, linux-scsi, berkley

On Mon, 05 Jan 2004 10:57:35 -0700
"Justin T. Gibbs" <gibbs@scsiguy.com> wrote:

> DMA-API.txt doesn't seem to cover this issue.  Should the low-level DMA
> code restore the S/G list to its original state on unmap or should the
> SCSI HBA drivers be changed to update "use_sg" with the segment count
> reported by the pci_map_sg() API?  If the latter, this seems to contradict
> the mandate in DMA-API that the nseg parameter passed into the unmap call
> be the same as that passed into the map call.  Most of the kernel assumes
> that an S/G list can be mapped an unmapped multiple times using the same
> arguments.  This doesn't seem to me to be an unreasonable expectation.

No, the PCI layer is not required at all to restore the SG to it's original state
if it does DMA page coalescing as sparc64 and x86_64 do.

But I don't see what the real problem is, all the PCI layer is doing is setting up
the DMA address/length pairs in the entries that are needed, then returning
the number of such slots that exist.  You, in your driver, can remember the original
total number of physical entries and use that value to pass things back in.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 17:57 Justin T. Gibbs
  2004-01-05 19:22 ` David S. Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Justin T. Gibbs @ 2004-01-05 17:57 UTC (permalink / raw)
  To: Kernel Mailinglist, linux-scsi; +Cc: Berkley Shands

Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
code for x86_64 modifies the passed in S/G list to compact it for mapping
by the GART.  This modification is not reversed when pci_unmap_sg is
called.  In the case of a retried SCSI command, this causes any attempt
to map the command a second time to fail with a BUG assertion since the
nseg parameter passed into the second map call is state.  nseg comes from
the "use_sg" field in the SCSI command structure which is never touched
by the HBA drivers invoking pci_map_sg.

DMA-API.txt doesn't seem to cover this issue.  Should the low-level DMA
code restore the S/G list to its original state on unmap or should the
SCSI HBA drivers be changed to update "use_sg" with the segment count
reported by the pci_map_sg() API?  If the latter, this seems to contradict
the mandate in DMA-API that the nseg parameter passed into the unmap call
be the same as that passed into the map call.  Most of the kernel assumes
that an S/G list can be mapped an unmapped multiple times using the same
arguments.  This doesn't seem to me to be an unreasonable expectation.

--
Justin


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2004-01-07 19:22 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-01-07 15:35 [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps Berkley Shands
2004-01-07 19:19 ` badari
  -- strict thread matches above, loose matches on Subject: below --
2004-01-07 16:33 Berkley Shands
     [not found] <200401051929.i05JTsM0000014248@mudpuddle.cs.wustl.edu.suse.lists.linux.kernel>
     [not found] ` <20040105112800.7a9f240b.davem@redhat.com.suse.lists.linux.kernel>
2004-01-05 21:02   ` Andi Kleen
2004-01-05 21:01     ` David S. Miller
2004-01-05 21:01       ` David S. Miller
2004-01-05 21:31       ` Andi Kleen
2004-01-05 21:31         ` Andi Kleen
2004-01-06  0:05         ` James Bottomley
2004-01-06  0:05           ` James Bottomley
2004-01-06  3:06           ` Andi Kleen
2004-01-06  3:06             ` Andi Kleen
2004-01-06  3:04             ` David S. Miller
2004-01-06  3:04               ` David S. Miller
2004-01-06  3:14               ` James Bottomley
2004-01-06  3:14                 ` James Bottomley
     [not found] <2938942704.1073325455@aslan.btc.adaptec.com.suse.lists.linux.kernel>
     [not found] ` <m3brpi41q0.fsf@averell.firstfloor.org.suse.lists.linux.kernel>
     [not found]   ` <2997092704.1073333041@aslan.btc.adaptec.com.suse.lists.linux.kernel>
2004-01-05 20:58     ` Andi Kleen
2004-01-05 20:00 Berkley Shands
2004-01-05 19:29 Berkley Shands
2004-01-05 19:28 ` David S. Miller
2004-01-05 19:28   ` David S. Miller
2004-01-05 17:57 Justin T. Gibbs
2004-01-05 19:22 ` David S. Miller
2004-01-05 19:41 ` Badari Pulavarty
2004-01-05 19:47 ` Andi Kleen
2004-01-05 20:04   ` Justin T. Gibbs
2004-01-05 20:35   ` David S. Miller
2004-01-05 21:10     ` Andi Kleen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.