* IO_PAGE_FAULT from SATA card during boot
@ 2011-01-29 11:24 Chris Webb
2011-01-29 16:41 ` Robert Hancock
0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-01-29 11:24 UTC (permalink / raw)
To: linux-ide
I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
IO_PAGE_FAULT during boot corresponding to this card:
IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
The card later times out when the kernel tries to access the drives:
ata6.00: qc timeout (cmd 0xec)
ata12.00: qc timeout (cmd 0xa1)
ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata12.00: qc timeout (cmd 0xa1)
ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata12: limiting SATA link speed to 1.5 Gbps
ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata6.00: qc timeout (cmd 0xec)
ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata6: limiting SATA link speed to 1.5 Gbps
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
ata12.00: qc timeout (cmd 0xa1)
ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata6.00: qc timeout (cmd 0xec)
ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
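For anyone reading the log above, here is a minimal sketch (not part of libata) that decodes the SStatus/SControl values and the err_mask. It assumes the standard SATA register layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and the AC_ERR_* bit values as I remember them from include/linux/libata.h; the function and table names are made up for illustration.

```python
# Sketch only: decode the SStatus/SControl values libata prints in hex.
# Assumed layout (SATA SCR registers): DET = bits 0-3, SPD = bits 4-7, IPM = bits 8-11.
NEGOTIATED_SPEED = {0: "none", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}
SPEED_LIMIT = {0: "no limit", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}

# Assumed AC_ERR_* bits (recalled from include/linux/libata.h).
AC_ERR_NAMES = {
    0x01: "DEV", 0x02: "HSM", 0x04: "TIMEOUT", 0x08: "MEDIA",
    0x10: "ATA_BUS", 0x20: "HOST_BUS", 0x40: "SYSTEM", 0x80: "INVALID",
}


def split_scr(value_hex):
    """Split a hex string such as '123' into its DET, SPD and IPM fields."""
    value = int(value_hex, 16)
    return {"det": value & 0xF, "spd": (value >> 4) & 0xF, "ipm": (value >> 8) & 0xF}


def describe_link(sstatus_hex, scontrol_hex):
    """Summarise what one 'SATA link up' line says about the link."""
    status = split_scr(sstatus_hex)
    control = split_scr(scontrol_hex)
    return {
        "phy_established": status["det"] == 3,  # DET=3: device present, PHY up
        "negotiated": NEGOTIATED_SPEED.get(status["spd"], "unknown"),
        "limit": SPEED_LIMIT.get(control["spd"], "unknown"),
    }


def decode_err_mask(mask):
    """Return the names of the AC_ERR_* bits set in an err_mask value."""
    return [name for bit, name in AC_ERR_NAMES.items() if mask & bit]


if __name__ == "__main__":
    print(describe_link("123", "310"))  # ata6 after the speed limit was requested
    print(decode_err_mask(0x4))
```

Read this way, err_mask=0x4 is a plain timeout, and the last ata6 lines show a link still negotiated at 3.0 Gbps although SControl asks for a 1.5 Gbps limit.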
I first saw this with a 2.6.32.25 kernel, but get identical behaviour with
the latest 2.6.37. The kernel config I'm using with 2.6.37 is here:
http://cdw.me.uk/tmp/sata-fault.config
with a full dmesg and dmidecode output here:
http://cdw.me.uk/tmp/sata-fault.dmesg
http://cdw.me.uk/tmp/sata-fault.dmi
Because I initially believed this might be a problem with the IOMMU ACPI
table, as similar issues (with very similar symptoms) have come up with other
boards recently, I've added amd_iommu_dump to the kernel command line, so
there's dump info in that dmesg. However, Joerg Roedel, the IOMMU
driver maintainer, tells me that the IOMMU ACPI table is fine in this case and
the problem is a different one:
Joerg Roedel <Joerg.Roedel@amd.com> writes:
> The flags indicate that the device tried to read an address which is
> only mapped writable for the device.
> It is at least no BIOS issue, both devices (3:00.0 and 3:00.1) are
> listed in the ACPI table as indicated by these messages:
>
> AMD-Vi: DEV_SELECT devid: 03:00.0 flags: 00
> AMD-Vi: DEV_SELECT devid: 03:00.1 flags: 00
>
> This looks like a bug in the driver for your SATA add-on card. It
> probably requests a DMA buffer with the wrong direction parameter.
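To make the reading of the flags concrete, here is a rough sketch that parses the fault line and decodes flags=0x0050. The bit names and positions (I=0x008, PR=0x010, RW=0x020, PE=0x040, and so on) are my assumption, based on my recollection of the AMD IOMMU event log format; the regular expression and function name are hypothetical and only match the line format quoted in this thread.

```python
import re

# Matches the fault line as it appears in the dmesg quoted above.
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT device=(?P<bus>[0-9a-f]{2}):(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7]) "
    r"domain=0x(?P<domain>[0-9a-f]+) address=0x(?P<address>[0-9a-f]+) "
    r"flags=0x(?P<flags>[0-9a-f]+)"
)

# Assumed bit positions within the printed flags field (not taken from this thread).
FLAG_BITS = {
    "GN": 0x001, "NX": 0x002, "US": 0x004, "I": 0x008, "PR": 0x010,
    "RW": 0x020, "PE": 0x040, "RZ": 0x080, "TR": 0x100,
}


def parse_fault(line):
    """Return the fields of an IO_PAGE_FAULT line, or None if it does not match."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    flags = int(match.group("flags"), 16)
    return {
        "device": "{}:{}.{}".format(match.group("bus"), match.group("dev"), match.group("fn")),
        "domain": int(match.group("domain"), 16),
        "address": int(match.group("address"), 16),
        "set_flags": [name for name, bit in FLAG_BITS.items() if flags & bit],
        # RW clear means the faulting access was a read (under the assumed layout).
        "access": "write" if flags & FLAG_BITS["RW"] else "read",
    }


if __name__ == "__main__":
    line = ("IO_PAGE_FAULT device=03:00.1 domain=0x0000 "
            "address=0x00000000000403c0 flags=0x0050]")
    print(parse_fault(line))
```

Under that assumed layout, 0x0050 is PR plus PE with RW clear: a read of a page that is present but not readable by the device, which agrees with the description above.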
I think both the onboard cards (which work) and the PCI Express card (which
doesn't) use the ahci driver in this case. Any advice would be very gratefully
received!
Best wishes,
Chris.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: IO_PAGE_FAULT from SATA card during boot
From: Robert Hancock @ 2011-01-29 16:41 UTC (permalink / raw)
To: Chris Webb; +Cc: linux-ide
On 01/29/2011 05:24 AM, Chris Webb wrote:
> I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
> Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
> IO_PAGE_FAULT during boot corresponding to this card:
>
> IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
What's that device 03:00.1? The AHCI controller itself seems to be
03:00.0. Can you post the lspci -v output? Though it could be that maybe
the IOMMU can't tell functions on the same device apart, not sure.
Given that the onboard AHCI controller is working but this one is not,
it could be that the controller is doing some unexpected read request
somewhere..
>
> The card later times out when the kernel tries to access the drives:
>
> ata6.00: qc timeout (cmd 0xec)
> ata12.00: qc timeout (cmd 0xa1)
> ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> ata12.00: qc timeout (cmd 0xa1)
> ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata12: limiting SATA link speed to 1.5 Gbps
> ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata6.00: qc timeout (cmd 0xec)
> ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata6: limiting SATA link speed to 1.5 Gbps
> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
> ata12.00: qc timeout (cmd 0xa1)
> ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata6.00: qc timeout (cmd 0xec)
> ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
>
> I first saw this with a 2.6.32.25 kernel, but get identical behaviour with
> the latest 2.6.37. The kernel config I'm using with 2.6.37 is here:
>
> http://cdw.me.uk/tmp/sata-fault.config
>
> with a full dmesg and dmidecode output here:
>
> http://cdw.me.uk/tmp/sata-fault.dmesg
> http://cdw.me.uk/tmp/sata-fault.dmi
>
> Because I initially believed this might be a problem with the ACPI table on the
> IOMMU driver, as similar issues have come up with other boards (and very
> similar symptoms) recently, I've added amd_iommu_dump to the kernel command
> line, so there's dump info in that dmesg. However, Joerg Roedel, the IOMMU
> driver maintainer, tells me that the IOMMU ACPI table is fine in this case and
> the problem is a different one:
>
> Joerg Roedel<Joerg.Roedel@amd.com> writes:
>
>> The flags indicate that the device tried to read an address which is
>> only mapped writable for the device.
>> It is at least no BIOS issue, both devices (3:00.0 and 3:00.1) are
>> listed in the ACPI table as indicated by these messages:
>>
>> AMD-Vi: DEV_SELECT devid: 03:00.0 flags: 00
>> AMD-Vi: DEV_SELECT devid: 03:00.1 flags: 00
>>
>> This looks like a bug in the driver for your SATA add-on card. It
>> probably requests a DMA buffer with the wrong direction parameter.
>
> I think both the onboard cards (which work) and the PCI Express card (which
> doesn't) use the ahci driver in this case. Any advice would be very gratefully
> received!
>
> Best wishes,
>
> Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
From: Chris Webb @ 2011-01-30 1:54 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-ide
Robert Hancock <hancockrwd@gmail.com> writes:
> On 01/29/2011 05:24 AM, Chris Webb wrote:
> >I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
> >Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
> >IO_PAGE_FAULT during boot corresponding to this card:
> >
> > IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
>
> What's that device 03:00.1? The AHCI controller itself seems to be
> 03:00.0. Can you post the lspci -v output? Though it could be that
> maybe the IOMMU can't tell functions on the same device apart, not
> sure.
Thanks for the reply. Sure, no problem---here's the lspci -v output in full:
00:00.0 Class 0600: Device 1002:5a12 (rev 02)
Subsystem: Device 15d9:aa11
Flags: fast devsel
Capabilities: [f0] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [c4] HyperTransport: Slave or Primary Interface
Capabilities: [40] HyperTransport: Retry Mode
Capabilities: [54] HyperTransport: UnitID Clumping
Capabilities: [9c] HyperTransport: #1a
Capabilities: [70] MSI: Enable- Count=1/4 Maskable- 64bit-
00:00.2 Class 0806: Device 1002:5a23
Subsystem: Device 1002:5a23
Flags: bus master, fast devsel, latency 0, IRQ 40
Capabilities: [40] Secure device <?>
Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+
00:02.0 Class 0604: Device 1002:5a16
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 0000d000-0000efff
Memory behind bridge: feb00000-febfffff
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Port (Slot+), MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
Capabilities: [b0] Subsystem: Device 15d9:aa11
Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [190] Access Control Services
00:04.0 Class 0604: Device 1002:5a18
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 0000c000-0000cfff
Memory behind bridge: fea00000-feafffff
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Root Port (Slot+), MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
Capabilities: [b0] Subsystem: Device 15d9:aa11
Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities: [190] Access Control Services
00:11.0 Class 0106: Device 1002:4391 (prog-if 01)
Subsystem: Device 1002:4391
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
I/O ports at b000 [size=8]
I/O ports at a000 [size=4]
I/O ports at 9000 [size=8]
I/O ports at 8000 [size=4]
I/O ports at 7000 [size=16]
Memory at fe9fa400 (32-bit, non-prefetchable) [size=1K]
Capabilities: [60] Power Management version 2
Capabilities: [70] SATA HBA v1.0
Kernel driver in use: ahci
00:12.0 Class 0c03: Device 1002:4397 (prog-if 10)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 10
Memory at fe9f6000 (32-bit, non-prefetchable) [size=4K]
00:12.1 Class 0c03: Device 1002:4398 (prog-if 10)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 10
Memory at fe9f7000 (32-bit, non-prefetchable) [size=4K]
00:12.2 Class 0c03: Device 1002:4396 (prog-if 20)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 17
Memory at fe9fa800 (32-bit, non-prefetchable) [size=256]
Capabilities: [c0] Power Management version 2
Capabilities: [e4] Debug port: BAR=1 offset=00e0
Kernel driver in use: ehci_hcd
00:13.0 Class 0c03: Device 1002:4397 (prog-if 10)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
Memory at fe9f8000 (32-bit, non-prefetchable) [size=4K]
00:13.1 Class 0c03: Device 1002:4398 (prog-if 10)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
Memory at fe9f9000 (32-bit, non-prefetchable) [size=4K]
00:13.2 Class 0c03: Device 1002:4396 (prog-if 20)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
Memory at fe9fac00 (32-bit, non-prefetchable) [size=256]
Capabilities: [c0] Power Management version 2
Capabilities: [e4] Debug port: BAR=1 offset=00e0
Kernel driver in use: ehci_hcd
00:14.0 Class 0c05: Device 1002:4385 (rev 3d)
Subsystem: Device 15d9:aa11
Flags: 66MHz, medium devsel
Capabilities: [b0] HyperTransport: MSI Mapping Enable- Fixed+
00:14.1 Class 0101: Device 1002:439c (prog-if 8a [Master SecP PriP])
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 0
I/O ports at 01f0 [size=8]
I/O ports at 03f4 [size=1]
I/O ports at 0170 [size=8]
I/O ports at 0374 [size=1]
I/O ports at ff00 [size=16]
Capabilities: [70] MSI: Enable- Count=1/2 Maskable- 64bit-
00:14.3 Class 0601: Device 1002:439d
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 0
00:14.4 Class 0604: Device 1002:4384 (prog-if 01)
Flags: bus master, 66MHz, medium devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
Memory behind bridge: fdf00000-fe7fffff
Prefetchable memory behind bridge: fc000000-fcffffff
00:14.5 Class 0c03: Device 1002:4399 (prog-if 10)
Subsystem: Device 15d9:aa11
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
Memory at fe9fb000 (32-bit, non-prefetchable) [size=4K]
00:18.0 Class 0600: Device 1022:1200
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [a0] HyperTransport: Host or Secondary Interface
Capabilities: [c0] HyperTransport: Host or Secondary Interface
Capabilities: [e0] HyperTransport: Host or Secondary Interface
00:18.1 Class 0600: Device 1022:1201
Flags: fast devsel
00:18.2 Class 0600: Device 1022:1202
Flags: fast devsel
00:18.3 Class 0600: Device 1022:1203
Flags: fast devsel
Capabilities: [f0] Secure device <?>
00:18.4 Class 0600: Device 1022:1204
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
00:19.0 Class 0600: Device 1022:1200
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [a0] HyperTransport: Host or Secondary Interface
Capabilities: [c0] HyperTransport: Host or Secondary Interface
00:19.1 Class 0600: Device 1022:1201
Flags: fast devsel
00:19.2 Class 0600: Device 1022:1202
Flags: fast devsel
00:19.3 Class 0600: Device 1022:1203
Flags: fast devsel
Capabilities: [f0] Secure device <?>
00:19.4 Class 0600: Device 1022:1204
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
00:1a.0 Class 0600: Device 1022:1200
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [a0] HyperTransport: Host or Secondary Interface
Capabilities: [c0] HyperTransport: Host or Secondary Interface
00:1a.1 Class 0600: Device 1022:1201
Flags: fast devsel
00:1a.2 Class 0600: Device 1022:1202
Flags: fast devsel
00:1a.3 Class 0600: Device 1022:1203
Flags: fast devsel
Capabilities: [f0] Secure device <?>
00:1a.4 Class 0600: Device 1022:1204
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
00:1b.0 Class 0600: Device 1022:1200
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [a0] HyperTransport: Host or Secondary Interface
Capabilities: [c0] HyperTransport: Host or Secondary Interface
Capabilities: [e0] HyperTransport: Host or Secondary Interface
00:1b.1 Class 0600: Device 1022:1201
Flags: fast devsel
00:1b.2 Class 0600: Device 1022:1202
Flags: fast devsel
00:1b.3 Class 0600: Device 1022:1203
Flags: fast devsel
Capabilities: [f0] Secure device <?>
00:1b.4 Class 0600: Device 1022:1204
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [e0] HyperTransport: Host or Secondary Interface
01:04.0 Class 0300: Device 102b:0532 (rev 0a)
Subsystem: Device 15d9:aa11
Flags: bus master, medium devsel, latency 64, IRQ 7
Memory at fc000000 (32-bit, prefetchable) [size=16M]
Memory at fdffc000 (32-bit, non-prefetchable) [size=16K]
Memory at fe000000 (32-bit, non-prefetchable) [size=8M]
Expansion ROM at <unassigned> [disabled]
Capabilities: [dc] Power Management version 1
02:00.0 Class 0200: Device 8086:10c9 (rev 01)
Subsystem: Device 15d9:10c9
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at fea60000 (32-bit, non-prefetchable) [size=128K]
Memory at fea40000 (32-bit, non-prefetchable) [size=128K]
I/O ports at c400 [size=32]
Memory at fea1c000 (32-bit, non-prefetchable) [size=16K]
Expansion ROM at fea20000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 00-25-90-ff-ff-13-c3-26
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: igb
02:00.1 Class 0200: Device 8086:10c9 (rev 01)
Subsystem: Device 15d9:10c9
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at feae0000 (32-bit, non-prefetchable) [size=128K]
Memory at feac0000 (32-bit, non-prefetchable) [size=128K]
I/O ports at c800 [size=32]
Memory at fea9c000 (32-bit, non-prefetchable) [size=16K]
Expansion ROM at feaa0000 [disabled] [size=128K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Device Serial Number 00-25-90-ff-ff-13-c3-26
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
Kernel driver in use: igb
03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
Subsystem: Device 18ab:9115
Flags: bus master, fast devsel, latency 0, IRQ 41
I/O ports at d800 [size=8]
I/O ports at d400 [size=4]
I/O ports at e800 [size=8]
I/O ports at e400 [size=4]
I/O ports at e000 [size=16]
Memory at febef800 (32-bit, non-prefetchable) [size=2K]
Expansion ROM at febf0000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [70] Express Legacy Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: ahci
Best wishes,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
From: Robert Hancock @ 2011-01-30 15:37 UTC (permalink / raw)
To: Chris Webb; +Cc: linux-ide
On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
> Robert Hancock <hancockrwd@gmail.com> writes:
>
>> On 01/29/2011 05:24 AM, Chris Webb wrote:
>> >I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
>> >Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
>> >IO_PAGE_FAULT during boot corresponding to this card:
>> >
>> > IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
>>
>> What's that device 03:00.1? The AHCI controller itself seems to be
>> 03:00.0. Can you post the lspci -v output? Though it could be that
>> maybe the IOMMU can't tell functions on the same device apart, not
>> sure.
>
> Thanks for the reply. Sure, no problem---here's the lspci -v output in full:
>
...
> 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
> Subsystem: Device 18ab:9115
> Flags: bus master, fast devsel, latency 0, IRQ 41
> I/O ports at d800 [size=8]
> I/O ports at d400 [size=4]
> I/O ports at e800 [size=8]
> I/O ports at e400 [size=4]
> I/O ports at e000 [size=16]
> Memory at febef800 (32-bit, non-prefetchable) [size=2K]
> Expansion ROM at febf0000 [disabled] [size=64K]
> Capabilities: [40] Power Management version 3
> Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
> Capabilities: [70] Express Legacy Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting
> Kernel driver in use: ahci
Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
that these Marvell 9123 cards have a PATA controller integrated as
well, maybe that has something to do with the issue?
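A quick way to check which functions of a slot are actually enumerated is to scan the lspci output for its device headers. This is only a sketch: the helper name is made up, and it assumes headers of the form "bb:dd.f " at the start of a line, as in the output quoted above (newer lspci versions may prefix a domain such as "0000:").

```python
import re

# Device header lines in the lspci output above start with "bb:dd.f ".
HEADER_RE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]) ", re.MULTILINE)


def functions_on_slot(lspci_text, slot):
    """Return the sorted function numbers lspci lists for a slot such as '03:00'."""
    found = set()
    for bus, dev, fn in HEADER_RE.findall(lspci_text):
        if "{}:{}".format(bus, dev) == slot:
            found.add(int(fn))
    return sorted(found)


if __name__ == "__main__":
    # Header lines copied from the lspci -v output earlier in this thread.
    sample = (
        "02:00.0 Class 0200: Device 8086:10c9 (rev 01)\n"
        "02:00.1 Class 0200: Device 8086:10c9 (rev 01)\n"
        "03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)\n"
    )
    print(functions_on_slot(sample, "03:00"))
```

For the output posted here this lists only function 0 on 03:00, so 03:00.1 is the source of the fault but is not enumerated as a PCI function.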
* Re: IO_PAGE_FAULT from SATA card during boot
From: Chris Webb @ 2011-02-02 13:56 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-ide
Robert Hancock <hancockrwd@gmail.com> writes:
> On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
> > 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
> >         Subsystem: Device 18ab:9115
> >         Flags: bus master, fast devsel, latency 0, IRQ 41
> >         I/O ports at d800 [size=8]
> >         I/O ports at d400 [size=4]
> >         I/O ports at e800 [size=8]
> >         I/O ports at e400 [size=4]
> >         I/O ports at e000 [size=16]
> >         Memory at febef800 (32-bit, non-prefetchable) [size=2K]
> >         Expansion ROM at febf0000 [disabled] [size=64K]
> >         Capabilities: [40] Power Management version 3
> >         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
> >         Capabilities: [70] Express Legacy Endpoint, MSI 00
> >         Capabilities: [100] Advanced Error Reporting
> >         Kernel driver in use: ahci
>
> Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
> that these Marvell 9123 cards have a PATA controller integrated as
> well, maybe that has something to do with the issue?
Yes, strange isn't it? There are no other add-on cards in this system, for
what it's worth, and the SATA card is definitely failing, so it does look
like it might be implicated. I have four such boxes, all of which show
identical symptoms.
Is there anything I can do to debug this better?
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
From: Robert Hancock @ 2011-02-03 0:49 UTC (permalink / raw)
To: Chris Webb; +Cc: linux-ide
On 02/02/2011 07:56 AM, Chris Webb wrote:
> Robert Hancock<hancockrwd@gmail.com> writes:
>
>> On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb<chris.webb@elastichosts.com> wrote:
>>> 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
>>>         Subsystem: Device 18ab:9115
>>>         Flags: bus master, fast devsel, latency 0, IRQ 41
>>>         I/O ports at d800 [size=8]
>>>         I/O ports at d400 [size=4]
>>>         I/O ports at e800 [size=8]
>>>         I/O ports at e400 [size=4]
>>>         I/O ports at e000 [size=16]
>>>         Memory at febef800 (32-bit, non-prefetchable) [size=2K]
>>>         Expansion ROM at febf0000 [disabled] [size=64K]
>>>         Capabilities: [40] Power Management version 3
>>>         Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
>>>         Capabilities: [70] Express Legacy Endpoint, MSI 00
>>>         Capabilities: [100] Advanced Error Reporting
>>>         Kernel driver in use: ahci
>>
>> Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
>> that these Marvell 9123 cards have a PATA controller integrated as
>> well, maybe that has something to do with the issue?
>
> Yes, strange isn't it? There are no other add-on cards in this system, for
> what it's worth, and the SATA card is definitely failing, so it does look
> like it might be implicated. I have four such boxes, all of which show
> identical symptoms.
Is it the same model of add-on card?
This controller apparently has both a SATA and PATA controller on it.
The PATA portion doesn't seem to be showing up in lspci, but obviously
the BIOS saw some sign of it - and if the IOMMU reporting can be
believed, it's trying to do a read request for some reason..
>
> Is there anything I can do to debug this better?
* Re: IO_PAGE_FAULT from SATA card during boot
From: Chris Webb @ 2011-02-03 8:56 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-ide
Robert Hancock <hancockrwd@gmail.com> writes:
> Is it the same model of add-on card?
Hi. Yes, they're four identical machines with four identical SATA
controllers.
> This controller apparently has both a SATA and PATA controller on
> it. The PATA portion doesn't seem to be showing up in lspci, but
> obviously the BIOS saw some sign of it - and if the IOMMU reporting
> can be believed, it's trying to do a read request for some reason..
Since SATA cards are cheap, I've ordered a different AHCI card to try
swapping out, so will be able to confirm if it's specific to this SATA card.
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
From: Chris Webb @ 2011-02-07 17:48 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-ide
Chris Webb <chris@arachsys.com> writes:
> Robert Hancock <hancockrwd@gmail.com> writes:
>
> > Is it the same model of add-on card?
>
> Hi. Yes, they're four identical machines with four identical SATA
> controllers.
>
> > This controller apparently has both a SATA and PATA controller on
> > it. The PATA portion doesn't seem to be showing up in lspci, but
> > obviously the BIOS saw some sign of it - and if the IOMMU reporting
> > can be believed, it's trying to do a read request for some reason..
>
> Since SATA cards are cheap, I've ordered a different AHCI card to try
> swapping out, so will be able to confirm if it's specific to this SATA card.
I've now done this, swapping in a Highpoint R620. I get the same
IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
device 03:00.1. I've put the new dmesg and lspci output at
http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
http://cdw.me.uk/tmp/sata-fault-hpt.lspci
Again, problem is present both with 2.6.32.25 and 2.6.37.
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
From: Robert Hancock @ 2011-02-08 2:04 UTC (permalink / raw)
To: Chris Webb; +Cc: linux-ide, Joerg.Roedel
On 02/07/2011 11:48 AM, Chris Webb wrote:
> Chris Webb<chris@arachsys.com> writes:
>
>> Robert Hancock<hancockrwd@gmail.com> writes:
>>
>>> Is it the same model of add-on card?
>>
>> Hi. Yes, they're four identical machines with four identical SATA
>> controllers.
>>
>>> This controller apparently has both a SATA and PATA controller on
>>> it. The PATA portion doesn't seem to be showing up in lspci, but
>>> obviously the BIOS saw some sign of it - and if the IOMMU reporting
>>> can be believed, it's trying to do a read request for some reason..
>>
>> Since SATA cards are cheap, I've ordered a different AHCI card to try
>> swapping out, so will be able to confirm if it's specific to this SATA card.
>
> I've now done this, swapping in a Highpoint R620. I get the same
> IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> device 03:00.1. I've put the new dmesg and lspci output at
>
> http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
> http://cdw.me.uk/tmp/sata-fault-hpt.lspci
>
> Again, problem is present both with 2.6.32.25 and 2.6.37.
Curious.. We don't even have a driver loaded for the PATA device on that
chip so I don't see how we could be telling it to do anything. As far as
I can see there are a few possible causes: Either the device is
generating read requests which appear to come from the PATA function
rather than the SATA one for some reason, the IOMMU is picking up the
wrong device function for requests from that device, or something in the
platform is somehow misconfiguring the device to cause this error. It
may not be easy to figure out which one is the cause, however.
Putting Joerg Roedel from AMD on the CC list to see if he has any more
insight..
* Re: IO_PAGE_FAULT from SATA card during boot
From: Roedel, Joerg @ 2011-02-08 10:41 UTC (permalink / raw)
To: Robert Hancock; +Cc: Chris Webb, linux-ide
On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
> On 02/07/2011 11:48 AM, Chris Webb wrote:
> > I've now done this, swapping in a Highpoint R620. I get the same
> > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> > device 03:00.1. I've put the new dmesg and lspci output at
> >
> > http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
> > http://cdw.me.uk/tmp/sata-fault-hpt.lspci
> >
> > Again, problem is present both with 2.6.32.25 and 2.6.37.
>
> Curious.. We don't even have a driver loaded for the PATA device on that
> chip so I don't see how we could be telling it to do anything. As far as
> I can see there are a few possible causes: Either the device is
> generating read requests which appear to come from the PATA function
> rather than the SATA one for some reason, the IOMMU is picking up the
> wrong device function for requests from that device, or something in the
> platform is somehow misconfiguring the device to cause this error. It
> may not be easy to figure out which one is the cause, however.
The most likely reason for this is that the add-on card uses both
request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
controller. The address in the page-fault looks like an address the IOMMU
driver would assign, but from a device which has no driver loaded.
If this is a known feature of the card, the BIOS should detect it and
report it in the IVRS table with an alias-range. The driver would handle
it in this situation. Otherwise it looks like a problem with the
add-on card.
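As a toy illustration of the alias idea (not the actual AMD IOMMU driver logic), the sketch below packs a bus:device.function into a 16-bit request-id and shows how an alias entry lets DMA tagged as 03:00.1 use the mappings set up for 03:00.0. The class and method names are hypothetical, and 4 KiB pages are assumed.

```python
def requester_id(bdf):
    """Pack 'bb:dd.f' into a 16-bit PCI request-id: bus<<8 | device<<3 | function."""
    bus, rest = bdf.split(":")
    dev, fn = rest.split(".")
    return (int(bus, 16) << 8) | (int(dev, 16) << 3) | int(fn, 16)


class ToyIommu:
    """Toy model: per-request-id page mappings plus an optional alias table."""

    PAGE_SHIFT = 12  # assumes 4 KiB pages

    def __init__(self):
        self.mappings = {}  # request-id -> set of mapped page numbers
        self.aliases = {}   # aliased request-id -> owning request-id

    def add_alias(self, alias_bdf, owner_bdf):
        """Treat requests tagged with alias_bdf as coming from owner_bdf."""
        self.aliases[requester_id(alias_bdf)] = requester_id(owner_bdf)

    def map_page(self, bdf, address):
        """Record that the page containing address is mapped for the device."""
        rid = requester_id(bdf)
        self.mappings.setdefault(rid, set()).add(address >> self.PAGE_SHIFT)

    def dma_allowed(self, bdf, address):
        """Return True if a request tagged with bdf may access address."""
        rid = requester_id(bdf)
        rid = self.aliases.get(rid, rid)  # resolve the alias, if one is registered
        return (address >> self.PAGE_SHIFT) in self.mappings.get(rid, set())


if __name__ == "__main__":
    iommu = ToyIommu()
    iommu.map_page("03:00.0", 0x403c0)           # buffer mapped for the SATA function
    print(iommu.dma_allowed("03:00.1", 0x403c0))  # faults: no mapping for 03:00.1
    iommu.add_alias("03:00.1", "03:00.0")         # what an alias entry would provide
    print(iommu.dma_allowed("03:00.1", 0x403c0))
```

Without the alias, the request tagged 03:00.1 finds no mapping and faults; with it, the lookup resolves to the mappings of 03:00.0.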
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
* Re: IO_PAGE_FAULT from SATA card during boot
From: Chris Webb @ 2011-02-08 11:00 UTC (permalink / raw)
To: Roedel, Joerg; +Cc: Robert Hancock, linux-ide
"Roedel, Joerg" <Joerg.Roedel@amd.com> writes:
> On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
> > On 02/07/2011 11:48 AM, Chris Webb wrote:
> > > I've now done this, swapping in a Highpoint R620. I get the same
> > > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> > > device 03:00.1. I've put the new dmesg and lspci output at
> > >
> > > http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
> > > http://cdw.me.uk/tmp/sata-fault-hpt.lspci
> > >
> > > Again, problem is present both with 2.6.32.25 and 2.6.37.
> >
> > Curious.. We don't even have a driver loaded for the PATA device on that
> > chip so I don't see how we could be telling it to do anything. As far as
> > I can see there are a few possible causes: Either the device is
> > generating read requests which appear to come from the PATA function
> > rather than the SATA one for some reason, the IOMMU is picking up the
> > wrong device function for requests from that device, or something in the
> > platform is somehow misconfiguring the device to cause this error. It
> > may not be easy to figure out which one is the cause, however.
>
> The most likely reason for this is that the add-on card uses both
> request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
> controller. The address in the page-fault looks like an address the IOMMU
> driver would assign, but from a device which has no driver loaded.
> If this is a known feature of the card, the BIOS should detect it and
> report it in the IVRS table with an alias-range. The driver would handle
> it in this situation. Otherwise it looks like a problem with the
> add-on card.
Hi Joerg. What's particularly puzzling here is that the symptoms are pretty
much the same with two completely different AHCI SATA cards. I expected to
be able to work around the problem by swapping in a different SATA card with
a different chipset, but it seems to be a problem with both.
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
2011-02-08 11:00 ` Chris Webb
@ 2011-02-08 14:43 ` Robert Hancock
2011-02-08 14:48 ` Chris Webb
0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-08 14:43 UTC (permalink / raw)
To: Chris Webb; +Cc: Roedel, Joerg, linux-ide
On Tue, Feb 8, 2011 at 5:00 AM, Chris Webb <chris.webb@elastichosts.com> wrote:
> "Roedel, Joerg" <Joerg.Roedel@amd.com> writes:
>
>> On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
>> > On 02/07/2011 11:48 AM, Chris Webb wrote:
>> > > I've now done this, swapping in a Highpoint R620. I get the same
>> > > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
>> > > device 03:00.1. I've put the new dmesg and lspci output at
>> > >
>> > > http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
>> > > http://cdw.me.uk/tmp/sata-fault-hpt.lspci
>> > >
>> > > Again, problem is present both with 2.6.32.25 and 2.6.37.
>> >
>> > Curious.. We don't even have a driver loaded for the PATA device on that
>> > chip so I don't see how we could be telling it to do anything. As far as
>> > I can see there are a few possible causes: Either the device is
>> > generating read requests which appear to come from the PATA function
>> > rather than the SATA one for some reason, the IOMMU is picking up the
>> > wrong device function for requests from that device, or something in the
>> > platform is somehow misconfiguring the device to cause this error. It
>> > may not be easy to figure out which one is the cause, however.
>>
>> The most likely reason for this is that the add-on card uses both
>> request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
>> controller. The address in the page-fault looks like an address the IOMMU
>> driver would assign but from a device which has no driver loaded.
>> If this is a known feature of the card the BIOS should detect it and
>> report it in the IVRS table with an alias-range. The driver would handle
>> it in this situation. Otherwise it looks like a problem with the
>> add-on card.
>
> Hi Joerg. What's particularly puzzling here is that the symptoms are pretty
> much the same with two completely different AHCI SATA cards. I expected to
> be able to work around the problem by swapping in a different SATA card with
> a different chipset, but it seems to be a problem with both.
Did you try different chipsets? The two sets of output you posted are
both from Marvell 88SE9123-based cards.
* Re: IO_PAGE_FAULT from SATA card during boot
2011-02-08 14:43 ` Robert Hancock
@ 2011-02-08 14:48 ` Chris Webb
2011-02-17 9:40 ` Chris Webb
0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-08 14:48 UTC (permalink / raw)
To: Robert Hancock; +Cc: Roedel, Joerg, linux-ide
Robert Hancock <hancockrwd@gmail.com> writes:
> Did you try different chipsets? The two sets of output you posted are
> both from Marvell 88SE9123-based cards.
Oh, I may have been a muppet here then. I assumed the HPT cards would be a
different chipset. I'll find something completely different!
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
2011-02-08 14:48 ` Chris Webb
@ 2011-02-17 9:40 ` Chris Webb
2011-02-18 0:22 ` Robert Hancock
0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-17 9:40 UTC (permalink / raw)
To: Robert Hancock; +Cc: Roedel, Joerg, linux-ide
Chris Webb <chris.webb@elastichosts.com> writes:
> Robert Hancock <hancockrwd@gmail.com> writes:
>
> > Did you try different chipsets? The two sets of output you posted are
> > both from Marvell 88SE9123-based cards.
>
> Oh, I may have been a muppet here then. I assumed the HPT cards would be a
> different chipset. I'll find something completely different!
A SIL-based AHCI controller card turns out to work fine in the same setup,
so it does look like a problem specific to Marvell 88SE9123-based cards.
Cheers,
Chris.
* Re: IO_PAGE_FAULT from SATA card during boot
2011-02-17 9:40 ` Chris Webb
@ 2011-02-18 0:22 ` Robert Hancock
2011-02-21 9:39 ` Roedel, Joerg
0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-18 0:22 UTC (permalink / raw)
To: Chris Webb; +Cc: Roedel, Joerg, linux-ide
On 02/17/2011 03:40 AM, Chris Webb wrote:
> Chris Webb <chris.webb@elastichosts.com> writes:
>
>> Robert Hancock <hancockrwd@gmail.com> writes:
>>
>>> Did you try different chipsets? The two sets of output you posted are
>>> both from Marvell 88SE9123-based cards.
>>
>> Oh, I may have been a muppet here then. I assumed the HPT cards would be a
>> different chipset. I'll find something completely different!
>
> A SIL-based AHCI controller card turns out to work fine in the same setup,
> so it does look like a problem specific to Marvell 88SE9123-based cards.
That would tend to indicate some kind of issue with that particular
chip. Does anyone have any contacts at Marvell that could look into this?
Also wonder if we could quirk this in the kernel somehow so that IOMMUs
wouldn't freak out..
* Re: IO_PAGE_FAULT from SATA card during boot
2011-02-18 0:22 ` Robert Hancock
@ 2011-02-21 9:39 ` Roedel, Joerg
0 siblings, 0 replies; 16+ messages in thread
From: Roedel, Joerg @ 2011-02-21 9:39 UTC (permalink / raw)
To: Robert Hancock; +Cc: Chris Webb, linux-ide
On Thu, Feb 17, 2011 at 07:22:41PM -0500, Robert Hancock wrote:
> Also wonder if we could quirk this in the kernel somehow so that IOMMUs
> wouldn't freak out..
Some kind of quirk will be required for the cards out there. I'll think
about it and post some patches for testing.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
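
As a rough illustration of the request-ID aliasing discussed in this thread, the Python sketch below packs a PCI address such as 03:00.1 into the 16-bit request-id an IOMMU sees, and maps an aliased function back to the function that owns the DMA mappings. The names `requester_id`, `format_bdf`, `ALIASES` and `resolve` are hypothetical and chosen only for this example; this is not the AMD IOMMU driver's code, and the alias table is an assumption standing in for what an IVRS alias-range or a kernel quirk might provide.

```python
def requester_id(bdf: str) -> int:
    """Pack 'bus:dev.fn' (hex fields) into a 16-bit PCI request-id."""
    bus_s, rest = bdf.split(":")
    dev_s, fn_s = rest.split(".")
    bus, dev, fn = int(bus_s, 16), int(dev_s, 16), int(fn_s, 16)
    if not (0 <= bus <= 0xFF and 0 <= dev <= 0x1F and 0 <= fn <= 0x7):
        raise ValueError(f"invalid PCI address: {bdf}")
    # Layout: bus in bits 15..8, device in bits 7..3, function in bits 2..0.
    return (bus << 8) | (dev << 3) | fn


def format_bdf(rid: int) -> str:
    """Turn a 16-bit request-id back into 'bus:dev.fn' notation."""
    return f"{rid >> 8:02x}:{(rid >> 3) & 0x1F:02x}.{rid & 0x7:x}"


# Hypothetical alias table: requests tagged with function 1 are treated as
# if they came from function 0, which is the function the AHCI driver binds.
ALIASES = {requester_id("03:00.1"): requester_id("03:00.0")}


def resolve(rid: int) -> int:
    """Return the request-id whose mappings should be used for this request."""
    return ALIASES.get(rid, rid)


if __name__ == "__main__":
    faulting = requester_id("03:00.1")
    print(f"{faulting:#06x} -> {format_bdf(resolve(faulting))}")
```

Without such an alias, a lookup for 03:00.1 finds no mappings because no driver has set any up for that function, which is consistent with the IO_PAGE_FAULT reported for device=03:00.1.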
Thread overview: 16+ messages
2011-01-29 11:24 IO_PAGE_FAULT from SATA card during boot Chris Webb
2011-01-29 16:41 ` Robert Hancock
2011-01-30 1:54 ` Chris Webb
2011-01-30 15:37 ` Robert Hancock
2011-02-02 13:56 ` Chris Webb
2011-02-03 0:49 ` Robert Hancock
2011-02-03 8:56 ` Chris Webb
2011-02-07 17:48 ` Chris Webb
2011-02-08 2:04 ` Robert Hancock
2011-02-08 10:41 ` Roedel, Joerg
2011-02-08 11:00 ` Chris Webb
2011-02-08 14:43 ` Robert Hancock
2011-02-08 14:48 ` Chris Webb
2011-02-17 9:40 ` Chris Webb
2011-02-18 0:22 ` Robert Hancock
2011-02-21 9:39 ` Roedel, Joerg