* mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E  fails with intel_iommu=on
       [not found] ` <20090911163933.GC2907@lackof.org>
@ 2009-09-15 13:07   ` Isabelle, Francois
  2009-09-16 16:04   ` Isabelle, Francois
  1 sibling, 0 replies; 4+ messages in thread
From: Isabelle, Francois @ 2009-09-15 13:07 UTC (permalink / raw)
  To: linux-scsi, iommu; +Cc: linux-pci, DL-MPTFusionLinux, Grant Grundler

Hi.

An issue has been raised with the SAS1064E when used on an Intel 5520 platform: PCIe hot-plug of the controller does not work.

Since I can't access the archive for 'DL-MPTFusionLinux@lsi.com' and haven't received a reply from the MPTFusion list yet, I'm posting the thread here. It seems to be the right place to discuss the MPT Fusion driver anyway.

It appears that the MPT driver misbehaves; I'd like to know if there are any success stories when mixing VT-d technology with PCIe hot plug on a similar platform.

Instead of duplicating the whole thread I'm sending these links to the original posts.

http://marc.info/?l=linux-pci&m=125252723005873&w=2 [original post]
http://marc.info/?l=linux-pci&m=125268717610600&w=2 [reply]
http://marc.info/?l=linux-pci&m=125268984516738&w=2 [follow up post]

I'm also reposting to the IOMMU list with a new subject line that might be more meaningful.

Thank you.

François Isabelle | Software Designer | Kontron Canada | T 450 437 5682 |F 450 437 8053 | E francois.isabelle@ca.kontron.com
 


* RE: mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E  fails with intel_iommu=on
       [not found] ` <20090911163933.GC2907@lackof.org>
  2009-09-15 13:07   ` mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E fails with intel_iommu=on Isabelle, Francois
@ 2009-09-16 16:04   ` Isabelle, Francois
  2009-09-18  4:49     ` Grant Grundler
  1 sibling, 1 reply; 4+ messages in thread
From: Isabelle, Francois @ 2009-09-16 16:04 UTC (permalink / raw)
  To: linux-scsi, iommu; +Cc: linux-pci, DL-MPTFusionLinux, Grant Grundler

>>PrimeIocFifos calls pci_alloc_consistent() and has debug code to dump
>>the DMA resource allocated (both virtual and DMA addresses). Off hand
>>I don't know how to enable that but it would be the next step.

With mpt_debug_level=0x8060, I was able to make the allocation code more verbose; see below.

>>> DRHD: handling fault status reg 2
>>> DMAR:[DMA Read] Request device [06:00.0] fault addr fffc2000

>>fffc2000 seems to be an unusual address to DMA from/to.
>>Is fffc2000 reserved address space for the IOMMU?
>> (ACPI DMAR info should tell us this)

It seems to be the actual address (see below); however I think there might be some kind of page overlap between the 'reply buffer' and 'request buffer' ...

mptbase: ioc0: ReplyBuffers @ ffff88003b180000[00000000fffc0000]
mptbase: ioc0: RequestBuffers @ ffff88003b182800[00000000fffc2800]

> > > > DMAR:[fault reason 06] PTE Read access is not set
> > > It's also odd that "Read Access is not set" for something (ioc_init)
> > > that I think should be bi-directional.

Wouldn't that depend on the IOMMU driver? Or is the device driver supposed to set protections?

Comparing the two init logs, the 'mem_phys' entries differ, which I find odd, especially since the value should match the PCI memory region of the device:
>>      Region 1: Memory at dbffc000 (64-bit, non-prefetchable) [size=16K]
>>      Region 3: Memory at dbfe0000 (64-bit, non-prefetchable) [size=64K]

mptbase: ioc0: mem = ffffc90012e40000, mem_phys = dbffc000
mptbase: ioc0: facts @ ffff88003dc2541c, pfacts[0] @ ffff88003dc2546c
...
mptbase: ioc1: mem = ffffc90012e48000, mem_phys = d8010000
mptbase: ioc1: facts @ ffff88003d21041c, pfacts[0] @ ffff88003d21046c

..and we see differences in the chain buffers..

mptbase: ioc0: ReqToChain alloc  @ ffff88003d4d8800, sz=1108 bytes
mptbase: ioc0: RequestNB alloc  @ ffff88003e3f3000, sz=1108 bytes
mptbase: ioc0: ChainToChain alloc @ ffff88003db14000, sz=4572 bytes

mptbase: ioc1: ReqToChain alloc  @ ffff88003d2aa000, sz=1108 bytes
mptbase: ioc1: RequestNB alloc  @ ffff88003d268800, sz=1108 bytes
mptbase: ioc1: ChainToChain alloc @ ffff88003dfda000, sz=4572 bytes


The other thing that is now obvious is that the device name has changed (ioc1 instead of ioc0). This makes me think that the whole problem is more closely related to resource deallocation, maybe a missing 'unlink' operation of some kind. I'm looking into this, but I could still use some help if anyone has an idea.

Following is the complete MPTSAS init log.

==== First time init  ===

mptsas 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
mptbase: ioc0: : 32 BIT PCI BUS DMA ADDRESSING SUPPORTED
mptbase: ioc0: mem = ffffc90012e40000, mem_phys = dbffc000
mptbase: ioc0: facts @ ffff88003dc2541c, pfacts[0] @ ffff88003dc2546c
mptbase: ioc0: Initiating bringup
mptbase: ioc0: IOC is in READY state
mptbase: ioc0: Sending get IocFacts request req_sz=12 reply_sz=80
mptbase: ioc0: NB_for_64_byte_frame=2 NBShiftFactor=5 BlockSize=8
mptbase: ioc0: reply_sz= 80, reply_depth= 128
mptbase: ioc0: req_sz  =128, req_depth  = 277
mptbase: ioc0: Sending get PortFacts(0) request
ioc0: LSISAS1064E B3: Capabilities={Initiator}
mptsas 0000:06:00.0: setting latency timer to 64
mptbase: ioc0: installed at interrupt 16
mptbase: ioc0: PrimeIocFifos
mptbase: ioc0: ReqToChain alloc  @ ffff88003d4d8800, sz=1108 bytes
mptbase: ioc0: RequestNB alloc  @ ffff88003e3f3000, sz=1108 bytes
mptbase: ioc0: num_sge=25 numSGE=520
mptbase: ioc0: Now numSGE=128 num_sge=130 num_chain=9
mptbase: ioc0: ChainToChain alloc @ ffff88003db14000, sz=4572 bytes
mptbase: ioc0: ReplyBuffer sz=80 bytes, ReplyDepth=128
mptbase: ioc0: ReplyBuffer sz=10240[2800] bytes
mptbase: ioc0: RequestBuffer sz=128 bytes, RequestDepth=277
mptbase: ioc0: RequestBuffer sz=35456[8a80] bytes
mptbase: ioc0: ChainBuffer sz=128 bytes, ChainDepth=1143
mptbase: ioc0: ChainBuffer sz=146304[23b80] bytes num_chain=1143
mptbase: ioc0: Total alloc @ ffff88003b180000[00000000fffc0000], sz=192000[2ee00] bytes
mptbase: ioc0: ReplyBuffers @ ffff88003b180000[00000000fffc0000]
mptbase: ioc0: RequestBuffers @ ffff88003b182800[00000000fffc2800]
mptbase: ioc0: ChainBuffers @ ffff88003b18b280(00000000fffcb280)
mptbase: ioc0: SenseBuffers @ ffff88003d2f8000[00000000fffb8000]
mptbase: ioc0: ReplyBuffers @ ffff88003b180000[00000000fffc0000]
mptbase: ioc0: SendIocInit
mptbase: ioc0: facts.MsgVersion=105
mptbase: ioc0: Sending Port(0)Enable (req @ ffff88003cc63a98)
mptbase: ioc0: Wait IOC_OPERATIONAL state (cnt=0)
mptbase: ioc0: SendEventNotification
scsi15 : ioc0: LSISAS1064E B3, FwRev=011b0000h, Ports=1, MaxQ=277, IRQ=16

==== Reinit after PCIe Hotplug  === 

mptsas 0000:06:00.0: enabling device (0000 -> 0002)
mptsas 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
mptbase: ioc1: : 32 BIT PCI BUS DMA ADDRESSING SUPPORTED
mptbase: ioc1: mem = ffffc90012e48000, mem_phys = d8010000
mptbase: ioc1: facts @ ffff88003d21041c, pfacts[0] @ ffff88003d21046c
mptbase: ioc1: Initiating bringup
mptbase: ioc1: IOC is in READY state
mptbase: ioc1: Sending get IocFacts request req_sz=12 reply_sz=80
mptbase: ioc1: NB_for_64_byte_frame=2 NBShiftFactor=5 BlockSize=8
mptbase: ioc1: reply_sz= 80, reply_depth= 128
mptbase: ioc1: req_sz  =128, req_depth  = 277
mptbase: ioc1: Sending get PortFacts(0) request
ioc1: LSISAS1064E B3: Capabilities={Initiator}
mptsas 0000:06:00.0: setting latency timer to 64
mptbase: ioc1: installed at interrupt 16
mptbase: ioc1: PrimeIocFifos
mptbase: ioc1: ReqToChain alloc  @ ffff88003d2aa000, sz=1108 bytes
mptbase: ioc1: RequestNB alloc  @ ffff88003d268800, sz=1108 bytes
mptbase: ioc1: num_sge=25 numSGE=520
mptbase: ioc1: Now numSGE=128 num_sge=130 num_chain=9
mptbase: ioc1: ChainToChain alloc @ ffff88003dfda000, sz=4572 bytes
mptbase: ioc1: ReplyBuffer sz=80 bytes, ReplyDepth=128
mptbase: ioc1: ReplyBuffer sz=10240[2800] bytes                  ==========> (2800h bytes)
mptbase: ioc1: RequestBuffer sz=128 bytes, RequestDepth=277
mptbase: ioc1: RequestBuffer sz=35456[8a80] bytes
mptbase: ioc1: ChainBuffer sz=128 bytes, ChainDepth=1143
mptbase: ioc1: ChainBuffer sz=146304[23b80] bytes num_chain=1143
mptbase: ioc1: Total alloc @ ffff88003b180000[00000000fffc0000], sz=192000[2ee00] bytes
mptbase: ioc1: ReplyBuffers @ ffff88003b180000[00000000fffc0000]
mptbase: ioc1: RequestBuffers @ ffff88003b182800[00000000fffc2800]
mptbase: ioc1: ChainBuffers @ ffff88003b18b280(00000000fffcb280)
mptbase: ioc1: SenseBuffers @ ffff88003d2f8000[00000000fffb8000]
mptbase: ioc1: ReplyBuffers @ ffff88003b180000[00000000fffc0000] ==========>  (2800h bytes)
mptbase: ioc1: SendIocInit
mptbase: ioc1: facts.MsgVersion=105
mptbase: ioc1: Sending Port(0)Enable (req @ ffff88003e28d880)
deb64:~#


deb64:~# mptbase: ioc1: WARNING - Issuing Reset from mpt_config!!
mptbase: ioc1: Initiating recovery
mptbase: ioc1: WARNING - IOC is in FAULT state!!!
mptbase: ioc1: WARNING -            FAULT code = 2000h
mptbase: ioc1: Recovered from IOC FAULT
mptbase: ioc1: PrimeIocFifos
mptbase: ioc1: SendIocInit
mptbase: ioc1: SendEventNotification
mptbase: ioc1: Attempting Retry Config request type 0x1, page 0x2, action 0
DRHD: handling fault status reg 102
DMAR:[DMA Read] Request device [06:00.0] fault addr fffc2000
DMAR:[fault reason 06] PTE Read access is not set


Thank you

François Isabelle | Software Designer | Kontron Canada | T 450 437 5682 |F 450 437 8053 | E francois.isabelle@ca.kontron.com


* Re: mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E  fails with intel_iommu=on
  2009-09-16 16:04   ` Isabelle, Francois
@ 2009-09-18  4:49     ` Grant Grundler
  2009-09-18 20:53       ` mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E fails with intel_iommu=on Isabelle, Francois
  0 siblings, 1 reply; 4+ messages in thread
From: Grant Grundler @ 2009-09-18  4:49 UTC (permalink / raw)
  To: Isabelle, Francois
  Cc: linux-scsi, iommu, linux-pci, DL-MPTFusionLinux, Grant Grundler

On Wed, Sep 16, 2009 at 12:04:27PM -0400, Isabelle, Francois wrote:
> >>PrimeIocFifos calls pci_alloc_consistent() and has debug code to dump
> >>the DMA resource allocated (both virtual and DMA addresses). Off hand
> >>I don't know how to enable that but it would be the next step.
> 
> With mpt_debug_level=0x8060 , I was able to get the allocation code to be more verbose, see below.

Cool - noted.

> >>> DRHD: handling fault status reg 2
> >>> DMAR:[DMA Read] Request device [06:00.0] fault addr fffc2000
> 
> >>fffc2000 seems to be an unusual address to DMA from/to.
> >>Is fffc2000 reserved address space for the IOMMU?
> >> (ACPI DMAR info should tell us this)
> 
> It seems to be the actual address (see below); however I think there might be some kind of page overlap between the 'reply buffer' and 'request buffer' ...
> 
> mptbase: ioc0: ReplyBuffers @ ffff88003b180000[00000000fffc0000]
> mptbase: ioc0: RequestBuffers @ ffff88003b182800[00000000fffc2800]

Excellent. This corresponds with:
                dinitprintk(ioc, printk(MYIOC_s_DEBUG_FMT "ReplyBuffers @ %p[%p]\n",
                        ioc->name, ioc->reply_frames, (void *)(ulong)alloc_dma));

alloc_dma is the DMA Mapped address.

I don't believe the two ranges overlap based on this output:
> mptbase: ioc0: ReplyBuffer sz=80 bytes, ReplyDepth=128
> mptbase: ioc0: ReplyBuffer sz=10240[2800] bytes

that you included at the bottom of the email.
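
The non-overlap can be checked directly from the DMA addresses and sizes in the log. The following Python sketch is only an illustration: the helper names and the 4 KiB page granularity are assumptions, not taken from the driver or the IOMMU code. It suggests the two byte ranges are adjacent rather than overlapping, although both touch the page at 0xfffc2000.

```python
PAGE_SIZE = 4096  # assumed 4 KiB IOMMU page granularity

def byte_ranges_overlap(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a+size_a) and [start_b, start_b+size_b) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a

def pages_touched(start, size):
    """Set of page base addresses covered by [start, start+size)."""
    first = start & ~(PAGE_SIZE - 1)
    last = (start + size - 1) & ~(PAGE_SIZE - 1)
    return set(range(first, last + 1, PAGE_SIZE))

# Values copied from the mptbase init log in this thread.
REPLY_DMA, REPLY_SZ = 0xfffc0000, 0x2800      # ReplyBuffer sz=10240[2800]
REQUEST_DMA, REQUEST_SZ = 0xfffc2800, 0x8a80  # RequestBuffer sz=35456[8a80]

overlap = byte_ranges_overlap(REPLY_DMA, REPLY_SZ, REQUEST_DMA, REQUEST_SZ)
shared = pages_touched(REPLY_DMA, REPLY_SZ) & pages_touched(REQUEST_DMA, REQUEST_SZ)

print("byte ranges overlap:", overlap)
print("shared pages:", [hex(p) for p in sorted(shared)])
```

Sharing a page is expected when sub-buffers are carved out of one coherent allocation; that the shared page matches the reported fault address is an observation, not a diagnosis.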


> 
> > > > > DMAR:[fault reason 06] PTE Read access is not set
> > > > It's also odd that "Read Access is not set" for something (ioc_init)
> > > > that I think should be bi-directional.
> 
> Wouldn't that depend on the IOMMU driver? Or is the device driver supposed to set protections?

Yes - IOMMU support owns that. Anything allocated/mapped with
pci_alloc_consistent() needs to have both read and write permissions
for both Host and PCI target. This is definitely not a device driver
problem. Just need to verify the access was to that range.
I've lost track of the original address that was the problem.

> When we compare the 2 'init' we see differences in the 'mem_phys' entry, which I think is weird, especially if it should match the PCI memory region of the device
> >>      Region 1: Memory at dbffc000 (64-bit, non-prefetchable) [size=16K]
> >>      Region 3: Memory at dbfe0000 (64-bit, non-prefetchable) [size=64K]
> 
> mptbase: ioc0: mem = ffffc90012e40000, mem_phys = dbffc000
> mptbase: ioc0: facts @ ffff88003dc2541c, pfacts[0] @ ffff88003dc2546c
> ...
> mptbase: ioc1: mem = ffffc90012e48000, mem_phys = d8010000
> mptbase: ioc1: facts @ ffff88003d21041c, pfacts[0] @ ffff88003d21046c

The Region 1 and Region 3 are for the same controller (ioc0).
Not for two different controllers. The second controller should
have its own pair of Region 1 and Region 3 configuration space registers.

> ..and we see differences in the chain buffers..
> 
> mptbase: ioc0: ReqToChain alloc  @ ffff88003d4d8800, sz=1108 bytes
> mptbase: ioc0: RequestNB alloc  @ ffff88003e3f3000, sz=1108 bytes
> mptbase: ioc0: ChainToChain alloc @ ffff88003db14000, sz=4572 bytes
> 
> mptbase: ioc1: ReqToChain alloc  @ ffff88003d2aa000, sz=1108 bytes
> mptbase: ioc1: RequestNB alloc  @ ffff88003d268800, sz=1108 bytes
> mptbase: ioc1: ChainToChain alloc @ ffff88003dfda000, sz=4572 bytes

I'm not sure what you are comparing with the previous 6 lines of output.
The two controllers (ioc0 and ioc1) are required to have different
chain buffers.

> The other thing that is now obvious is that the device name has
> changed (ioc1 instead of ioc0);

Yes - good catch.

> this makes me think that the whole
> problem is probably more closely related to the resources unallocation,
> maybe a missing 'unlink' operation of some kind... I'm looking into
> this, but I could still use some help if anyone has an idea.

I'd be looking for why the ioc number changed. It should be resuming
with the same instance number since it's the same physical device.

hth,
grant



* RE: mptsas/iommu/pciehp : PCIe hotplug of LSISAS1064E  fails with intel_iommu=on
  2009-09-18  4:49     ` Grant Grundler
@ 2009-09-18 20:53       ` Isabelle, Francois
  0 siblings, 0 replies; 4+ messages in thread
From: Isabelle, Francois @ 2009-09-18 20:53 UTC (permalink / raw)
  To: Grant Grundler; +Cc: linux-scsi, iommu, linux-pci, DL-MPTFusionLinux


> Yes - IOMMU support owns that. Anything allocated/mapped with
> pci_alloc_consistent() needs to have both read and write permissions
> for both Host and PCI target. This definitely not a device driver
> problem. Just need to verify the access was to that range.
> I've lost track of the original address that was the problem.

It was 0xfffc2000, which is within the allocated range (alloc_dma).
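
To make the range check concrete, the layout of the coherent allocation can be rebuilt from the frame sizes and depths printed in the init log. This is a rough Python sketch: the function names are hypothetical, and packing the three buffers back to back is an assumption that happens to reproduce the addresses in the log.

```python
# Frame sizes and depths as printed by mptbase in the init log.
REPLY_SZ, REPLY_DEPTH = 80, 128
REQ_SZ, REQ_DEPTH = 128, 277
CHAIN_SZ, CHAIN_DEPTH = 128, 1143
ALLOC_DMA = 0xfffc0000  # "Total alloc @ ...[00000000fffc0000]"

def build_layout(base):
    """Return {name: (start, end)}, end exclusive, buffers packed back to back."""
    layout = {}
    cursor = base
    for name, size in (
        ("ReplyBuffers", REPLY_SZ * REPLY_DEPTH),
        ("RequestBuffers", REQ_SZ * REQ_DEPTH),
        ("ChainBuffers", CHAIN_SZ * CHAIN_DEPTH),
    ):
        layout[name] = (cursor, cursor + size)
        cursor += size
    return layout

def locate(addr, layout):
    """Name of the buffer containing addr, or None if outside the allocation."""
    for name, (start, end) in layout.items():
        if start <= addr < end:
            return name
    return None

layout = build_layout(ALLOC_DMA)
total = layout["ChainBuffers"][1] - ALLOC_DMA
for name, (start, end) in layout.items():
    print(f"{name:15s} {start:#x}..{end:#x}")
print("total bytes:", total)
print("0xfffc2000 is in:", locate(0xfffc2000, layout))
```

If the reported fault address is page-aligned, as it appears to be here, the faulting access could be anywhere in that 4 KiB page, so the page also covers the start of the request buffers at 0xfffc2800.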

> I'd be looking for why the ioc number changed. It should be resuming
> with the same instance number since it's the same physical device.

Yes, that's what I thought as well.

I tried to trace the expected call sequence; it seems the PCI resources are correctly unassigned.

The following sequence is executed on removal:

pci_remove_bus_device
pci_stop_dev
MPT:detach
pci_destroy_devs
pci_free_resources

I figured out that the ioc 'id' is not decremented in mpt_detach() and that it's used for some kind of reverse lookup by mptsas.c and by mpt_verify_adapter(), mostly used for mptctl. But it appears to be almost purely cosmetic...
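
The rename from ioc0 to ioc1 is what one would expect from an id counter that only ever increases. The following toy model is plain Python, not driver code, and the class and method names are hypothetical; it only illustrates the effect described above.

```python
import itertools

class AdapterRegistry:
    """Toy stand-in for a driver-wide adapter list with a monotonic id counter."""

    def __init__(self):
        self._next_id = itertools.count()  # never rewound on detach
        self.adapters = {}                 # id -> PCI address

    def attach(self, pci_addr):
        ioc_id = next(self._next_id)
        self.adapters[ioc_id] = pci_addr
        return f"ioc{ioc_id}"

    def detach(self, name):
        # Removes the adapter but does not give its id back to the counter.
        del self.adapters[int(name[3:])]

registry = AdapterRegistry()
first = registry.attach("0000:06:00.0")   # initial probe
registry.detach(first)                    # hot-unplug
second = registry.attach("0000:06:00.0")  # hot-plug of the same device
print(first, second)
```

Under this model the new name by itself says nothing about leaked resources; it only shows that the counter was not reset between the two probes.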

...
On the IOMMU side, I see that it sets up a mapping (CONTEXT_TT_MULTI_LEVEL) for the SAS controller.

#ipmitool  picmg policy set 2 1 0
deb64:~# pciehp 0000:00:04.0:pcie04: Card present on Slot(52)
pciehp 0000:00:04.0:pcie04: Latch close on Slot(52)
pciehp 0000:00:04.0:pcie04: Button pressed on Slot(52)
pciehp 0000:00:04.0:pcie04: PCI slot #52 - powering on due to button press.
Fusion MPT base driver 3.04.12
Copyright (c) 1999-2008 LSI Corporation
Fusion MPT SAS Host driver 3.04.12
mptsas 0000:06:00.0: enabling device (0000 -> 0002)
mptsas 0000:06:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
mptbase: ioc0: Initiating bringup
ioc0: LSISAS1064E B3: Capabilities={Initiator}
IOMMU: get valid domain for 0000:06:00.0
IOMMU: mapping 0000:06:00.0
IOMMU: lsi 06:00.0
IOMMU: continue mapping for 06:00.0
IOMMU: mapping for 06:00.0 completed
IOMMU: mapping no upstream  0000:06:00.0
scsi7 : ioc0: LSISAS1064E B3, FwRev=011b0000h, Ports=1, MaxQ=277, IRQ=16

...
I looked at the driver's correctness for alloc/free of DMA memory, and it seems to play by the book as far as DMA goes. With some extra debugging I get:


mptbase: ioc0: ChainBuffer sz=128 bytes, ChainDepth=1143
mptbase: ioc0: ChainBuffer sz=146304[23b80] bytes num_chain=1143
IOMMU:alloc coherent
IOMMU:succeeded 000000003bc00000(ffff88003bc00000) 192512 ffffffff
mptbase: ioc0: Total alloc @ ffff88003bc00000[00000000fffc0000], sz=192000[2ee00] bytes
mptbase: ioc0: ReplyBuffers @ ffff88003bc00000[00000000fffc0000]
mptbase: ioc0: RequestBuffers @ ffff88003bc02800[00000000fffc2800]
mptbase: ioc0: ChainBuffers @ ffff88003bc0b280(00000000fffcb280)
IOMMU:alloc coherent
IOMMU:succeeded 000000003d1f8000(ffff88003d1f8000) 20480 ffffffff
mptbase: ioc0: SenseBuffers @ ffff88003d1f8000[00000000fffb8000]
mptbase: ioc0: ReplyBuffers @ ffff88003bc00000[00000000fffc0000]
mptbase: ioc0: SendIocInit
mptbase: ioc0: facts.MsgVersion=105
mptbase: ioc0: Sending Port(0)Enable (req @ ffff88003da6da98)
mptbase: ioc0: Wait IOC_OPERATIONAL state (cnt=0)

The DMA memory is freed on removal:

mptbase: ioc0: free  @ ffff88003bc00000, sz=192000 bytes
IOMMU:free coherent
IOMMU:free coherent
mptsas 0000:06:00.0: PCI INT A disabled

However, I see that the 'free' called at mptbase.c:4596 uses the requested size (192000) rather than the page-padded size (192512) actually allocated by the IOMMU engine, but I doubt that could be the cause of this problem.
...
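
The difference between the two sizes is just rounding up to a page boundary. A minimal Python sketch, assuming 4 KiB pages (the helper name is hypothetical):

```python
PAGE_SIZE = 4096  # assumed IOMMU page size

def page_align(size, page_size=PAGE_SIZE):
    """Round size up to the next multiple of page_size (a power of two)."""
    return (size + page_size - 1) & ~(page_size - 1)

requested = 192000               # size passed by mptbase
padded = page_align(requested)   # size the IOMMU layer would map
print(requested, "->", padded, f"({padded // PAGE_SIZE} pages)")
print("SenseBuffers:", page_align(20480))
```

The 20480-byte sense buffer is already a whole number of pages, so its size is unchanged, which matches the second 'IOMMU:succeeded' line in the log above.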

So I'm starting to think the IOMMU driver messes up the mapping tables in some way on deallocation/reallocation operations.

I must admit I don't fully understand the impact of this change: http://kerneltrap.org/mailarchive/git-commits-head/2009/7/3/6134073

But seeing these lines 
-			dma_set_pte_readable(pte);
-			dma_set_pte_writable(pte);

I can't believe it's not related to this message:

DMAR:[DMA Read] Request device [06:00.0] fault addr fffc3000
DMAR:[fault reason 06] PTE Read access is not set

... this still needs some investigation.
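
For further investigation it may help to pull the fault records out of the kernel log and compare them with the allocation mechanically. A small Python sketch, assuming the message format stays exactly as in the lines quoted in this thread (the regular expression and the function name are illustrative only):

```python
import re

# Pattern modelled on the DMAR lines quoted in this thread.
FAULT_RE = re.compile(
    r"DMAR:\[DMA (?P<kind>Read|Write)\] Request device "
    r"\[(?P<dev>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr (?P<addr>[0-9a-f]+)"
)

def parse_dmar_fault(line):
    """Return a dict describing the fault, or None for unrelated lines."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    return {"kind": m["kind"], "device": m["dev"], "addr": int(m["addr"], 16)}

log = """\
DRHD: handling fault status reg 102
DMAR:[DMA Read] Request device [06:00.0] fault addr fffc2000
DMAR:[fault reason 06] PTE Read access is not set
DMAR:[DMA Read] Request device [06:00.0] fault addr fffc3000
"""
faults = [f for f in map(parse_dmar_fault, log.splitlines()) if f]

ALLOC_START, ALLOC_SIZE = 0xfffc0000, 192000  # "Total alloc" line in the init log
for f in faults:
    inside = ALLOC_START <= f["addr"] < ALLOC_START + ALLOC_SIZE
    print(f["device"], f["kind"], hex(f["addr"]), "inside allocation:", inside)
```

Both fault addresses seen so far (fffc2000 and fffc3000) fall inside the driver's coherent allocation.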

Thank you

François Isabelle | Software Designer | Kontron Canada | T 450 437 5682 |F 450 437 8053 | E francois.isabelle@ca.kontron.com
