* IO_PAGE_FAULT from SATA card during boot
@ 2011-01-29 11:24 Chris Webb
  2011-01-29 16:41 ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-01-29 11:24 UTC (permalink / raw)
  To: linux-ide

I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
IO_PAGE_FAULT during boot corresponding to this card:

  IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]

The card later times out when the kernel tries to access the drives:

  ata6.00: qc timeout (cmd 0xec)
  ata12.00: qc timeout (cmd 0xa1)
  ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
  ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
  ata12.00: qc timeout (cmd 0xa1)
  ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata12: limiting SATA link speed to 1.5 Gbps
  ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
  ata6.00: qc timeout (cmd 0xec)
  ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata6: limiting SATA link speed to 1.5 Gbps
  ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
  ata12.00: qc timeout (cmd 0xa1)
  ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
  ata6.00: qc timeout (cmd 0xec)
  ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
  ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
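
The SStatus values in the log above can be read field by field. The sketch
below is illustrative only: it assumes the hex value printed by libata is the
raw SStatus register, with DET in bits 3:0, SPD in bits 7:4 and IPM in bits
11:8 as laid out in the SATA specification. The function and table names are
made up for this example.

```python
# Field layout of the SATA SStatus register, as printed in hex by libata.
# The dictionaries cover the commonly documented encodings only.
DET = {
    0x0: "no device detected",
    0x1: "device present, no PHY communication",
    0x3: "device present, PHY communication established",
    0x4: "PHY offline",
}
SPD = {0x0: "no negotiated speed", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps", 0x3: "6.0 Gbps"}
IPM = {0x0: "not present", 0x1: "active", 0x2: "partial", 0x6: "slumber"}


def decode_sstatus(text):
    """Split an SStatus value such as '113' (hex) into its three fields."""
    value = int(text, 16)
    det = value & 0xF          # bits 3:0, device detection
    spd = (value >> 4) & 0xF   # bits 7:4, negotiated speed
    ipm = (value >> 8) & 0xF   # bits 11:8, interface power state
    return {
        "det": DET.get(det, "reserved"),
        "spd": SPD.get(spd, "reserved"),
        "ipm": IPM.get(ipm, "reserved"),
    }


if __name__ == "__main__":
    for raw in ("113", "123"):
        print(raw, decode_sstatus(raw))
```

Read this way, both ports report an established link (DET = 3) in the active
power state, which suggests the timeouts are not a link-level problem.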

I first saw this with a 2.6.32.25 kernel, but get identical behaviour with
the latest 2.6.37. The kernel config I'm using with 2.6.37 is here:

  http://cdw.me.uk/tmp/sata-fault.config

with a full dmesg and dmidecode output here:

  http://cdw.me.uk/tmp/sata-fault.dmesg
  http://cdw.me.uk/tmp/sata-fault.dmi

Because I initially believed this might be a problem with the ACPI table on the
IOMMU driver, as similar issues have come up with other boards (and very
similar symptoms) recently, I've added amd_iommu_dump to the kernel command
line, so there's dump info in that dmesg. However, Joerg Roedel, the IOMMU
driver maintainer, tells me that the IOMMU ACPI table is fine in this case and
the problem is a different one:

Joerg Roedel <Joerg.Roedel@amd.com> writes:

> The flags indicate that the device tried to read an address which is
> only mapped writable for the device.
> It is at least no BIOS issue, both devices (3:00.0 and 3:00.1) are
> listed in the ACPI table as indicated by these messages:
>
>         AMD-Vi:   DEV_SELECT                     devid: 03:00.0 flags: 00
>         AMD-Vi:   DEV_SELECT                     devid: 03:00.1 flags: 00
> 
> This looks like a bug in the driver for your SATA add-on card. It
> probably requests a DMA buffer with the wrong direction parameter.
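
For reference, a rough sketch of how flags=0x0050 leads to the reading quoted
above. The bit positions are an assumption based on the IO_PAGE_FAULT event
layout in the AMD IOMMU specification (I, PR, RW, PE, RZ, TR); check them
against the spec revision for your hardware. The helper names are hypothetical.

```python
# Assumed bit positions inside the 12-bit flags value printed by the driver.
FLAG_BITS = {
    "I": 0x008,   # fault came from an interrupt request, not a memory access
    "PR": 0x010,  # page was marked present
    "RW": 0x020,  # 1 = write access, 0 = read access
    "PE": 0x040,  # device lacked permission for the access
    "RZ": 0x080,  # reserved bit was set in the translation entry
    "TR": 0x100,  # fault came from a translation request
}


def decode_flags(flags):
    """Return a name -> bool mapping for the assumed flag bits."""
    return {name: bool(flags & mask) for name, mask in FLAG_BITS.items()}


def describe(flags):
    """Turn the flag bits into a one-line summary."""
    bits = decode_flags(flags)
    access = "write" if bits["RW"] else "read"
    present = "present" if bits["PR"] else "not present"
    reason = "permission violation" if bits["PE"] else "no permission violation flagged"
    return f"{access} to a {present} page, {reason}"


if __name__ == "__main__":
    print(describe(0x0050))
```

Under these assumptions 0x0050 sets only PR and PE with RW clear, i.e. a read
of a page that is mapped but not readable by the device.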

I think both the onboard cards (which work) and the PCI Express card (which
doesn't) use the ahci driver in this case. Any advice would be very gratefully
received!

Best wishes,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-01-29 11:24 IO_PAGE_FAULT from SATA card during boot Chris Webb
@ 2011-01-29 16:41 ` Robert Hancock
  2011-01-30  1:54   ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-01-29 16:41 UTC (permalink / raw)
  To: Chris Webb; +Cc: linux-ide

On 01/29/2011 05:24 AM, Chris Webb wrote:
> I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
> Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
> IO_PAGE_FAULT during boot corresponding to this card:
>
>    IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]

What's that device 03:00.1? The AHCI controller itself seems to be 
03:00.0. Can you post the lspci -v output? Though it could be that maybe 
the IOMMU can't tell functions on the same device apart, not sure.

Given that the onboard AHCI controller is working but this one is not, 
it could be that the controller is doing some unexpected read request 
somewhere..
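
As background on why 03:00.0 and 03:00.1 are distinct to the IOMMU: every PCIe
request carries a 16-bit requester ID built from bus, device and function, and
the device= field in the fault is that ID. A minimal sketch of the encoding
(the function names are invented for this example):

```python
def parse_bdf(text):
    """Parse 'bb:dd.f' (hex fields, as shown by lspci) into integers."""
    bus, rest = text.split(":")
    dev, fn = rest.split(".")
    return int(bus, 16), int(dev, 16), int(fn, 16)


def requester_id(bus, dev, fn):
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into 16 bits."""
    return (bus << 8) | (dev << 3) | fn


if __name__ == "__main__":
    for bdf in ("03:00.0", "03:00.1"):
        print(bdf, hex(requester_id(*parse_bdf(bdf))))
```

The two functions differ only in the low three bits of the ID, so a request
tagged with function 1 is looked up separately from the AHCI function 0.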

>
> The card later times out when the kernel tries to access the drives:
>
>    ata6.00: qc timeout (cmd 0xec)
>    ata12.00: qc timeout (cmd 0xa1)
>    ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>    ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>    ata12.00: qc timeout (cmd 0xa1)
>    ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata12: limiting SATA link speed to 1.5 Gbps
>    ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>    ata6.00: qc timeout (cmd 0xec)
>    ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata6: limiting SATA link speed to 1.5 Gbps
>    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
>    ata12.00: qc timeout (cmd 0xa1)
>    ata12.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata12: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
>    ata6.00: qc timeout (cmd 0xec)
>    ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
>    ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
>
> I first saw this with a 2.6.32.25 kernel, but get identical behaviour with
> the latest 2.6.37. The kernel config I'm using with 2.6.37 is here:
>
>    http://cdw.me.uk/tmp/sata-fault.config
>
> with a full dmesg and dmidecode output here:
>
>    http://cdw.me.uk/tmp/sata-fault.dmesg
>    http://cdw.me.uk/tmp/sata-fault.dmi
>
> Because I initially believed this might be a problem with the ACPI table on the
> IOMMU driver, as similar issues have come up with other boards (and very
> similar symptoms) recently, I've added amd_iommu_dump to the kernel command
> line, so there's dump info in that dmesg. However, Joerg Roedel, the IOMMU
> driver maintainer, tells me that the IOMMU ACPI table is fine in this case and
> the problem is a different one:
>
> Joerg Roedel<Joerg.Roedel@amd.com>  writes:
>
>> The flags indicate that the device tried to read an address which is
>> only mapped writable for the device.
>> It is at least no BIOS issue, both devices (3:00.0 and 3:00.1) are
>> listed in the ACPI table as indicated by these messages:
>>
>>          AMD-Vi:   DEV_SELECT                     devid: 03:00.0 flags: 00
>>          AMD-Vi:   DEV_SELECT                     devid: 03:00.1 flags: 00
>>
>> This looks like a bug in the driver for your SATA add-on card. It
>> probably requests a DMA buffer with the wrong direction parameter.
>
> I think both the onboard cards (which work) and the PCI Express card (which
> doesn't) use the ahci driver in this case. Any advice would be very gratefully
> received!
>
> Best wishes,
>
> Chris.
>


* Re: IO_PAGE_FAULT from SATA card during boot
  2011-01-29 16:41 ` Robert Hancock
@ 2011-01-30  1:54   ` Chris Webb
  2011-01-30 15:37     ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-01-30  1:54 UTC (permalink / raw)
  To: Robert Hancock; +Cc: linux-ide

Robert Hancock <hancockrwd@gmail.com> writes:

> On 01/29/2011 05:24 AM, Chris Webb wrote:
> >I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
> >Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
> >IO_PAGE_FAULT during boot corresponding to this card:
> >
> >   IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
> 
> What's that device 03:00.1? The AHCI controller itself seems to be
> 03:00.0. Can you post the lspci -v output? Though it could be that
> maybe the IOMMU can't tell functions on the same device apart, not
> sure.

Thanks for the reply. Sure, no problem---here's the lspci -v output in full:

00:00.0 Class 0600: Device 1002:5a12 (rev 02)
	Subsystem: Device 15d9:aa11
	Flags: fast devsel
	Capabilities: [f0] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [c4] HyperTransport: Slave or Primary Interface
	Capabilities: [40] HyperTransport: Retry Mode
	Capabilities: [54] HyperTransport: UnitID Clumping
	Capabilities: [9c] HyperTransport: #1a
	Capabilities: [70] MSI: Enable- Count=1/4 Maskable- 64bit-

00:00.2 Class 0806: Device 1002:5a23
	Subsystem: Device 1002:5a23
	Flags: bus master, fast devsel, latency 0, IRQ 40
	Capabilities: [40] Secure device <?>
	Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+

00:02.0 Class 0604: Device 1002:5a16
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: 0000d000-0000efff
	Memory behind bridge: feb00000-febfffff
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Root Port (Slot+), MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [b0] Subsystem: Device 15d9:aa11
	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [190] Access Control Services

00:04.0 Class 0604: Device 1002:5a18
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: fea00000-feafffff
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Root Port (Slot+), MSI 00
	Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit-
	Capabilities: [b0] Subsystem: Device 15d9:aa11
	Capabilities: [b8] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [190] Access Control Services

00:11.0 Class 0106: Device 1002:4391 (prog-if 01)
	Subsystem: Device 1002:4391
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 22
	I/O ports at b000 [size=8]
	I/O ports at a000 [size=4]
	I/O ports at 9000 [size=8]
	I/O ports at 8000 [size=4]
	I/O ports at 7000 [size=16]
	Memory at fe9fa400 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [60] Power Management version 2
	Capabilities: [70] SATA HBA v1.0
	Kernel driver in use: ahci

00:12.0 Class 0c03: Device 1002:4397 (prog-if 10)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 10
	Memory at fe9f6000 (32-bit, non-prefetchable) [size=4K]

00:12.1 Class 0c03: Device 1002:4398 (prog-if 10)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 10
	Memory at fe9f7000 (32-bit, non-prefetchable) [size=4K]

00:12.2 Class 0c03: Device 1002:4396 (prog-if 20)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 17
	Memory at fe9fa800 (32-bit, non-prefetchable) [size=256]
	Capabilities: [c0] Power Management version 2
	Capabilities: [e4] Debug port: BAR=1 offset=00e0
	Kernel driver in use: ehci_hcd

00:13.0 Class 0c03: Device 1002:4397 (prog-if 10)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
	Memory at fe9f8000 (32-bit, non-prefetchable) [size=4K]

00:13.1 Class 0c03: Device 1002:4398 (prog-if 10)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
	Memory at fe9f9000 (32-bit, non-prefetchable) [size=4K]

00:13.2 Class 0c03: Device 1002:4396 (prog-if 20)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 19
	Memory at fe9fac00 (32-bit, non-prefetchable) [size=256]
	Capabilities: [c0] Power Management version 2
	Capabilities: [e4] Debug port: BAR=1 offset=00e0
	Kernel driver in use: ehci_hcd

00:14.0 Class 0c05: Device 1002:4385 (rev 3d)
	Subsystem: Device 15d9:aa11
	Flags: 66MHz, medium devsel
	Capabilities: [b0] HyperTransport: MSI Mapping Enable- Fixed+

00:14.1 Class 0101: Device 1002:439c (prog-if 8a [Master SecP PriP])
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 0
	I/O ports at 01f0 [size=8]
	I/O ports at 03f4 [size=1]
	I/O ports at 0170 [size=8]
	I/O ports at 0374 [size=1]
	I/O ports at ff00 [size=16]
	Capabilities: [70] MSI: Enable- Count=1/2 Maskable- 64bit-

00:14.3 Class 0601: Device 1002:439d
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 0

00:14.4 Class 0604: Device 1002:4384 (prog-if 01)
	Flags: bus master, 66MHz, medium devsel, latency 64
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
	Memory behind bridge: fdf00000-fe7fffff
	Prefetchable memory behind bridge: fc000000-fcffffff

00:14.5 Class 0c03: Device 1002:4399 (prog-if 10)
	Subsystem: Device 15d9:aa11
	Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
	Memory at fe9fb000 (32-bit, non-prefetchable) [size=4K]

00:18.0 Class 0600: Device 1022:1200
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface
	Capabilities: [a0] HyperTransport: Host or Secondary Interface
	Capabilities: [c0] HyperTransport: Host or Secondary Interface
	Capabilities: [e0] HyperTransport: Host or Secondary Interface

00:18.1 Class 0600: Device 1022:1201
	Flags: fast devsel

00:18.2 Class 0600: Device 1022:1202
	Flags: fast devsel

00:18.3 Class 0600: Device 1022:1203
	Flags: fast devsel
	Capabilities: [f0] Secure device <?>

00:18.4 Class 0600: Device 1022:1204
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface

00:19.0 Class 0600: Device 1022:1200
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface
	Capabilities: [a0] HyperTransport: Host or Secondary Interface
	Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:19.1 Class 0600: Device 1022:1201
	Flags: fast devsel

00:19.2 Class 0600: Device 1022:1202
	Flags: fast devsel

00:19.3 Class 0600: Device 1022:1203
	Flags: fast devsel
	Capabilities: [f0] Secure device <?>

00:19.4 Class 0600: Device 1022:1204
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface

00:1a.0 Class 0600: Device 1022:1200
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface
	Capabilities: [a0] HyperTransport: Host or Secondary Interface
	Capabilities: [c0] HyperTransport: Host or Secondary Interface

00:1a.1 Class 0600: Device 1022:1201
	Flags: fast devsel

00:1a.2 Class 0600: Device 1022:1202
	Flags: fast devsel

00:1a.3 Class 0600: Device 1022:1203
	Flags: fast devsel
	Capabilities: [f0] Secure device <?>

00:1a.4 Class 0600: Device 1022:1204
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface

00:1b.0 Class 0600: Device 1022:1200
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface
	Capabilities: [a0] HyperTransport: Host or Secondary Interface
	Capabilities: [c0] HyperTransport: Host or Secondary Interface
	Capabilities: [e0] HyperTransport: Host or Secondary Interface

00:1b.1 Class 0600: Device 1022:1201
	Flags: fast devsel

00:1b.2 Class 0600: Device 1022:1202
	Flags: fast devsel

00:1b.3 Class 0600: Device 1022:1203
	Flags: fast devsel
	Capabilities: [f0] Secure device <?>

00:1b.4 Class 0600: Device 1022:1204
	Flags: fast devsel
	Capabilities: [80] HyperTransport: Host or Secondary Interface
	Capabilities: [e0] HyperTransport: Host or Secondary Interface

01:04.0 Class 0300: Device 102b:0532 (rev 0a)
	Subsystem: Device 15d9:aa11
	Flags: bus master, medium devsel, latency 64, IRQ 7
	Memory at fc000000 (32-bit, prefetchable) [size=16M]
	Memory at fdffc000 (32-bit, non-prefetchable) [size=16K]
	Memory at fe000000 (32-bit, non-prefetchable) [size=8M]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [dc] Power Management version 1

02:00.0 Class 0200: Device 8086:10c9 (rev 01)
	Subsystem: Device 15d9:10c9
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at fea60000 (32-bit, non-prefetchable) [size=128K]
	Memory at fea40000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at c400 [size=32]
	Memory at fea1c000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at fea20000 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-25-90-ff-ff-13-c3-26
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: igb

02:00.1 Class 0200: Device 8086:10c9 (rev 01)
	Subsystem: Device 15d9:10c9
	Flags: bus master, fast devsel, latency 0, IRQ 17
	Memory at feae0000 (32-bit, non-prefetchable) [size=128K]
	Memory at feac0000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at c800 [size=32]
	Memory at fea9c000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at feaa0000 [disabled] [size=128K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-25-90-ff-ff-13-c3-26
	Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
	Kernel driver in use: igb

03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
	Subsystem: Device 18ab:9115
	Flags: bus master, fast devsel, latency 0, IRQ 41
	I/O ports at d800 [size=8]
	I/O ports at d400 [size=4]
	I/O ports at e800 [size=8]
	I/O ports at e400 [size=4]
	I/O ports at e000 [size=16]
	Memory at febef800 (32-bit, non-prefetchable) [size=2K]
	Expansion ROM at febf0000 [disabled] [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [70] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: ahci

Best wishes,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-01-30  1:54   ` Chris Webb
@ 2011-01-30 15:37     ` Robert Hancock
  2011-02-02 13:56       ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-01-30 15:37 UTC (permalink / raw)
  To: Chris Webb; +Cc: linux-ide

On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
> Robert Hancock <hancockrwd@gmail.com> writes:
>
>> On 01/29/2011 05:24 AM, Chris Webb wrote:
>> >I have several Supermicro H8DGT-HF motherboards (BIOS version 1.1) with Star
>> >Tech PEXSAT32 PCI Express SATA cards attached, and am seeing an
>> >IO_PAGE_FAULT during boot corresponding to this card:
>> >
>> >   IO_PAGE_FAULT device=03:00.1 domain=0x0000 address=0x00000000000403c0 flags=0x0050]
>>
>> What's that device 03:00.1? The AHCI controller itself seems to be
>> 03:00.0. Can you post the lspci -v output? Though it could be that
>> maybe the IOMMU can't tell functions on the same device apart, not
>> sure.
>
> Thanks for the reply. Sure, no problem---here's the lspci -v output in full:
>

...

> 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
>        Subsystem: Device 18ab:9115
>        Flags: bus master, fast devsel, latency 0, IRQ 41
>        I/O ports at d800 [size=8]
>        I/O ports at d400 [size=4]
>        I/O ports at e800 [size=8]
>        I/O ports at e400 [size=4]
>        I/O ports at e000 [size=16]
>        Memory at febef800 (32-bit, non-prefetchable) [size=2K]
>        Expansion ROM at febf0000 [disabled] [size=64K]
>        Capabilities: [40] Power Management version 3
>        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
>        Capabilities: [70] Express Legacy Endpoint, MSI 00
>        Capabilities: [100] Advanced Error Reporting
>        Kernel driver in use: ahci

Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
that these Marvell 9123 cards have a PATA controller integrated as
well, maybe that has something to do with the issue?
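
A quick way to confirm that a faulting function is absent from the listing is
to collect the bus:device.function prefixes from the lspci -v text and test
for membership. This is only a sketch; the sample text is a shortened excerpt
of the output earlier in the thread, and the helper name is hypothetical.

```python
import re

# Shortened excerpt of the lspci -v output posted earlier in the thread.
SAMPLE = """\
02:00.1 Class 0200: Device 8086:10c9 (rev 01)
\tSubsystem: Device 15d9:10c9
03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
\tSubsystem: Device 18ab:9115
"""

# Device header lines start in column 0 with bb:dd.f followed by a space;
# attribute lines are indented, so they never match.
BDF_LINE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ", re.MULTILINE)


def listed_functions(lspci_text):
    """Return the set of bb:dd.f identifiers that head a device entry."""
    return set(BDF_LINE.findall(lspci_text))


if __name__ == "__main__":
    functions = listed_functions(SAMPLE)
    for bdf in ("03:00.0", "03:00.1"):
        print(bdf, "listed" if bdf in functions else "not listed")
```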

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-01-30 15:37     ` Robert Hancock
@ 2011-02-02 13:56       ` Chris Webb
  2011-02-03  0:49         ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-02 13:56 UTC (permalink / raw)
  To: Robert Hancock; +Cc: linux-ide

Robert Hancock <hancockrwd@gmail.com> writes:

> On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
> > 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
> >        Subsystem: Device 18ab:9115
> >        Flags: bus master, fast devsel, latency 0, IRQ 41
> >        I/O ports at d800 [size=8]
> >        I/O ports at d400 [size=4]
> >        I/O ports at e800 [size=8]
> >        I/O ports at e400 [size=4]
> >        I/O ports at e000 [size=16]
> >        Memory at febef800 (32-bit, non-prefetchable) [size=2K]
> >        Expansion ROM at febf0000 [disabled] [size=64K]
> >        Capabilities: [40] Power Management version 3
> >        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
> >        Capabilities: [70] Express Legacy Endpoint, MSI 00
> >        Capabilities: [100] Advanced Error Reporting
> >        Kernel driver in use: ahci
> 
> Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
> that these Marvell 9123 cards have a PATA controller integrated as
> well, maybe that has something to do with the issue?

Yes, strange isn't it? There are no other add-on cards in this system, for
what it's worth, and the SATA card is definitely failing, so it does look
like it might be implicated. I have four such boxes, all of which show
identical symptoms.

Is there anything I can do to debug this better?

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-02 13:56       ` Chris Webb
@ 2011-02-03  0:49         ` Robert Hancock
  2011-02-03  8:56           ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-03  0:49 UTC (permalink / raw)
  To: Chris Webb; +Cc: linux-ide

On 02/02/2011 07:56 AM, Chris Webb wrote:
> Robert Hancock<hancockrwd@gmail.com>  writes:
>
>> On Sat, Jan 29, 2011 at 7:54 PM, Chris Webb<chris.webb@elastichosts.com>  wrote:
>>> 03:00.0 Class 0106: Device 1b4b:9123 (rev 11) (prog-if 01)
>>>        Subsystem: Device 18ab:9115
>>>        Flags: bus master, fast devsel, latency 0, IRQ 41
>>>        I/O ports at d800 [size=8]
>>>        I/O ports at d400 [size=4]
>>>        I/O ports at e800 [size=8]
>>>        I/O ports at e400 [size=4]
>>>        I/O ports at e000 [size=16]
>>>        Memory at febef800 (32-bit, non-prefetchable) [size=2K]
>>>        Expansion ROM at febf0000 [disabled] [size=64K]
>>>        Capabilities: [40] Power Management version 3
>>>        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
>>>        Capabilities: [70] Express Legacy Endpoint, MSI 00
>>>        Capabilities: [100] Advanced Error Reporting
>>>        Kernel driver in use: ahci
>>
>> Hmm, so that 03:00.1 device doesn't seem to show up at all? I think
>> that these Marvell 9123 cards have a PATA controller integrated as
>> well, maybe that has something to do with the issue?
>
> Yes, strange isn't it? There are no other add-on cards in this system, for
> what it's worth, and the SATA card is definitely failing, so it does look
> like it might be implicated. I have four such boxes, all of which show
> identical symptoms.

Is it the same model of add-on card?

This controller apparently has both a SATA and PATA controller on it. 
The PATA portion doesn't seem to be showing up in lspci, but obviously 
the BIOS saw some sign of it - and if the IOMMU reporting can be 
believed, it's trying to do a read request for some reason..

>
> Is there anything I can do to debug this better?


* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-03  0:49         ` Robert Hancock
@ 2011-02-03  8:56           ` Chris Webb
  2011-02-07 17:48             ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-03  8:56 UTC (permalink / raw)
  To: Robert Hancock; +Cc: linux-ide

Robert Hancock <hancockrwd@gmail.com> writes:

> Is it the same model of add-on card?

Hi. Yes, they're four identical machines with four identical SATA
controllers.

> This controller apparently has both a SATA and PATA controller on
> it. The PATA portion doesn't seem to be showing up in lspci, but
> obviously the BIOS saw some sign of it - and if the IOMMU reporting
> can be believed, it's trying to do a read request for some reason..

Since SATA cards are cheap, I've ordered a different AHCI card to try
swapping out, so will be able to confirm if it's specific to this SATA card.

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-03  8:56           ` Chris Webb
@ 2011-02-07 17:48             ` Chris Webb
  2011-02-08  2:04               ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-07 17:48 UTC (permalink / raw)
  To: Robert Hancock; +Cc: linux-ide

Chris Webb <chris@arachsys.com> writes:

> Robert Hancock <hancockrwd@gmail.com> writes:
> 
> > Is it the same model of add-on card?
> 
> Hi. Yes, they're four identical machines with four identical SATA
> controllers.
> 
> > This controller apparently has both a SATA and PATA controller on
> > it. The PATA portion doesn't seem to be showing up in lspci, but
> > obviously the BIOS saw some sign of it - and if the IOMMU reporting
> > can be believed, it's trying to do a read request for some reason..
> 
> Since SATA cards are cheap, I've ordered a different AHCI card to try
> swapping out, so will be able to confirm if it's specific to this SATA card.

I've now done this, swapping in a Highpoint R620. I get the same
IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
device 03:00.1. I've put the new dmesg and lspci output at

  http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
  http://cdw.me.uk/tmp/sata-fault-hpt.lspci

Again, problem is present both with 2.6.32.25 and 2.6.37.

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-07 17:48             ` Chris Webb
@ 2011-02-08  2:04               ` Robert Hancock
  2011-02-08 10:41                 ` Roedel, Joerg
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-08  2:04 UTC (permalink / raw)
  To: Chris Webb; +Cc: linux-ide, Joerg.Roedel

On 02/07/2011 11:48 AM, Chris Webb wrote:
> Chris Webb<chris@arachsys.com>  writes:
>
>> Robert Hancock<hancockrwd@gmail.com>  writes:
>>
>>> Is it the same model of add-on card?
>>
>> Hi. Yes, they're four identical machines with four identical SATA
>> controllers.
>>
>>> This controller apparently has both a SATA and PATA controller on
>>> it. The PATA portion doesn't seem to be showing up in lspci, but
>>> obviously the BIOS saw some sign of it - and if the IOMMU reporting
>>> can be believed, it's trying to do a read request for some reason..
>>
>> Since SATA cards are cheap, I've ordered a different AHCI card to try
>> swapping out, so will be able to confirm if it's specific to this SATA card.
>
> I've now done this, swapping in a Highpoint R620. I get the same
> IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> device 03:00.1. I've put the new dmesg and lspci output at
>
>    http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
>    http://cdw.me.uk/tmp/sata-fault-hpt.lspci
>
> Again, problem is present both with 2.6.32.25 and 2.6.37.

Curious.. We don't even have a driver loaded for the PATA device on that 
chip so I don't see how we could be telling it to do anything. As far as 
I can see there are a few possible causes: Either the device is 
generating read requests which appear to come from the PATA function 
rather than the SATA one for some reason, the IOMMU is picking up the 
wrong device function for requests from that device, or something in the 
platform is somehow misconfiguring the device to cause this error. It 
may not be easy to figure out which one is the cause, however.

Putting Joerg Roedel from AMD on the CC list to see if he has any more 
insight..

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-08  2:04               ` Robert Hancock
@ 2011-02-08 10:41                 ` Roedel, Joerg
  2011-02-08 11:00                   ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Roedel, Joerg @ 2011-02-08 10:41 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Chris Webb, linux-ide

On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
> On 02/07/2011 11:48 AM, Chris Webb wrote:
> > I've now done this, swapping in a Highpoint R620. I get the same
> > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> > device 03:00.1. I've put the new dmesg and lspci output at
> >
> >    http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
> >    http://cdw.me.uk/tmp/sata-fault-hpt.lspci
> >
> > Again, problem is present both with 2.6.32.25 and 2.6.37.
> 
> Curious.. We don't even have a driver loaded for the PATA device on that 
> chip so I don't see how we could be telling it to do anything. As far as 
> I can see there are a few possible causes: Either the device is 
> generating read requests which appear to come from the PATA function 
> rather than the SATA one for some reason, the IOMMU is picking up the 
> wrong device function for requests from that device, or something in the 
> platform is somehow misconfiguring the device to cause this error. It 
> may not be easy to figure out which one is the cause, however.

The most likely reason for this is that the add-on card uses both
request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
controller. The address in the page-fault looks like an address the IOMMU
driver would assign, but it comes from a device which has no driver loaded.
If this is a known feature of the card, the BIOS should detect it and
report it in the IVRS table with an alias-range. The driver would handle
it in this situation. Otherwise it looks like a problem with the
add-on card.
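
To make the alias idea concrete, here is a toy model, not the kernel's actual
data structures: mappings are stored per request-id, and an alias entry makes
a second request-id resolve to the first before the lookup. The class, its
methods and the 4 KiB page size are all assumptions made for illustration.

```python
PAGE_SHIFT = 12  # assume 4 KiB pages for this illustration


class ToyIommu:
    """Very small model of per-device DMA mappings with request-id aliases."""

    def __init__(self):
        self.alias = {}     # request-id -> request-id it should be treated as
        self.mappings = {}  # (request-id, page number) -> permission string

    def resolve(self, devid):
        # Without an alias entry, a request-id only ever refers to itself.
        return self.alias.get(devid, devid)

    def add_alias(self, devid, alias_to):
        self.alias[devid] = alias_to

    def map(self, devid, address, perms):
        self.mappings[(self.resolve(devid), address >> PAGE_SHIFT)] = perms

    def access(self, devid, address, kind):
        perms = self.mappings.get((self.resolve(devid), address >> PAGE_SHIFT), "")
        return "ok" if kind in perms else "IO_PAGE_FAULT"


if __name__ == "__main__":
    iommu = ToyIommu()
    iommu.map(0x0300, 0x403C0, "rw")            # buffer set up for 03:00.0
    print(iommu.access(0x0301, 0x403C0, "r"))   # request tagged as 03:00.1
    iommu.add_alias(0x0301, 0x0300)             # what an alias entry would provide
    print(iommu.access(0x0301, 0x403C0, "r"))
```

In this model the request tagged 03:00.1 faults until the alias is registered,
which is the role an IVRS alias-range would play if the BIOS supplied one.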

		Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632


* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-08 10:41                 ` Roedel, Joerg
@ 2011-02-08 11:00                   ` Chris Webb
  2011-02-08 14:43                     ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-08 11:00 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Robert Hancock, linux-ide

"Roedel, Joerg" <Joerg.Roedel@amd.com> writes:

> On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
> > On 02/07/2011 11:48 AM, Chris Webb wrote:
> > > I've now done this, swapping in a Highpoint R620. I get the same
> > > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
> > > device 03:00.1. I've put the new dmesg and lspci output at
> > >
> > >    http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
> > >    http://cdw.me.uk/tmp/sata-fault-hpt.lspci
> > >
> > > Again, problem is present both with 2.6.32.25 and 2.6.37.
> > 
> > Curious.. We don't even have a driver loaded for the PATA device on that 
> > chip so I don't see how we could be telling it to do anything. As far as 
> > I can see there are a few possible causes: Either the device is 
> > generating read requests which appear to come from the PATA function 
> > rather than the SATA one for some reason, the IOMMU is picking up the 
> > wrong device function for requests from that device, or something in the 
> > platform is somehow misconfiguring the device to cause this error. It 
> > may not be easy to figure out which one is the cause, however.
> 
> The most likely reason for this is that the add-on card uses both
> request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
> controller. The address in the page-fault looks like an address the IOMMU
> driver would assign, but it comes from a device which has no driver loaded.
> If this is a known feature of the card, the BIOS should detect it and
> report it in the IVRS table with an alias-range. The driver would handle
> it in this situation. Otherwise it looks like a problem with the
> add-on card.

Hi Joerg. What's particularly puzzling here is that the symptoms are pretty
much the same with two completely different AHCI SATA cards. I expected to
be able to work around the problem by swapping in a different SATA card with
a different chipset, but it seems to be a problem with both.

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-08 11:00                   ` Chris Webb
@ 2011-02-08 14:43                     ` Robert Hancock
  2011-02-08 14:48                       ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-08 14:43 UTC (permalink / raw)
  To: Chris Webb; +Cc: Roedel, Joerg, linux-ide

On Tue, Feb 8, 2011 at 5:00 AM, Chris Webb <chris.webb@elastichosts.com> wrote:
> "Roedel, Joerg" <Joerg.Roedel@amd.com> writes:
>
>> On Mon, Feb 07, 2011 at 09:04:40PM -0500, Robert Hancock wrote:
>> > On 02/07/2011 11:48 AM, Chris Webb wrote:
>> > > I've now done this, swapping in a Highpoint R620. I get the same
>> > > IO_PAGE_FAULT, same timeouts on the sata card, but lspci now shows up the
>> > > device 03:00.1. I've put the new dmesg and lspci output at
>> > >
>> > >    http://cdw.me.uk/tmp/sata-fault-hpt.dmesg
>> > >    http://cdw.me.uk/tmp/sata-fault-hpt.lspci
>> > >
>> > > Again, problem is present both with 2.6.32.25 and 2.6.37.
>> >
>> > Curious.. We don't even have a driver loaded for the PATA device on that
>> > chip so I don't see how we could be telling it to do anything. As far as
>> > I can see there are a few possible causes: Either the device is
>> > generating read requests which appear to come from the PATA function
>> > rather than the SATA one for some reason, the IOMMU is picking up the
>> > wrong device function for requests from that device, or something in the
>> > platform is somehow misconfiguring the device to cause this error. It
>> > may not be easy to figure out which one is the cause, however.
>>
>> The most likely reason for this is that the add-on card uses both
>> request-ids (03:00.0 and 03:00.1) for requests originating from the SATA
>> controller. The address in the page-fault looks like an address the IOMMU
>> driver would assign, but it comes from a device which has no driver loaded.
>> If this is a known feature of the card, the BIOS should detect it and
>> report it in the IVRS table with an alias-range. The driver would handle
>> it in this situation. Otherwise it looks like a problem with the
>> add-on card.
>
> Hi Joerg. What's particularly puzzling here is that the symptoms are pretty
> much the same with two completely different AHCI SATA cards. I expected to
> be able to work around the problem by swapping in a different SATA card with
> a different chipset, but it seems to be a problem with both.

Did you try different chipsets? The two sets of output you posted are
both from Marvell 88SE9123-based cards.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-08 14:43                     ` Robert Hancock
@ 2011-02-08 14:48                       ` Chris Webb
  2011-02-17  9:40                         ` Chris Webb
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-08 14:48 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Roedel, Joerg, linux-ide

Robert Hancock <hancockrwd@gmail.com> writes:

> Did you try different chipsets? The two sets of output you posted are
> both from Marvell 88SE9123-based cards.

Oh, I may have been a muppet here then. I assumed the HPT cards would be a
different chipset. I'll find something completely different!

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-08 14:48                       ` Chris Webb
@ 2011-02-17  9:40                         ` Chris Webb
  2011-02-18  0:22                           ` Robert Hancock
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Webb @ 2011-02-17  9:40 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Roedel, Joerg, linux-ide

Chris Webb <chris.webb@elastichosts.com> writes:

> Robert Hancock <hancockrwd@gmail.com> writes:
> 
> > Did you try different chipsets? The two sets of output you posted are
> > both from Marvell 88SE9123-based cards.
> 
> Oh, I may have been a muppet here then. I assumed the HPT cards would be a
> different chipset. I'll find something completely different!

A SIL-based AHCI controller card turns out to work fine in the same setup,
so it does look like a problem specific to Marvell 88SE9123-based cards.

Cheers,

Chris.

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-17  9:40                         ` Chris Webb
@ 2011-02-18  0:22                           ` Robert Hancock
  2011-02-21  9:39                             ` Roedel, Joerg
  0 siblings, 1 reply; 16+ messages in thread
From: Robert Hancock @ 2011-02-18  0:22 UTC (permalink / raw)
  To: Chris Webb; +Cc: Roedel, Joerg, linux-ide

On 02/17/2011 03:40 AM, Chris Webb wrote:
> Chris Webb <chris.webb@elastichosts.com> writes:
>
>> Robert Hancock <hancockrwd@gmail.com> writes:
>>
>>> Did you try different chipsets? The two sets of output you posted are
>>> both from Marvell 88SE9123-based cards.
>>
>> Oh, I may have been a muppet here then. I assumed the HPT cards would be a
>> different chipset. I'll find something completely different!
>
> A SIL-based AHCI controller card turns out to work fine in the same setup,
> so it does look like a problem specific to Marvell 88SE9123-based cards.

That would tend to indicate some kind of issue with that particular 
chip. Does anyone have any contacts at Marvell that could look into this?

Also wonder if we could quirk this in the kernel somehow so that IOMMUs 
wouldn't freak out..

* Re: IO_PAGE_FAULT from SATA card during boot
  2011-02-18  0:22                           ` Robert Hancock
@ 2011-02-21  9:39                             ` Roedel, Joerg
  0 siblings, 0 replies; 16+ messages in thread
From: Roedel, Joerg @ 2011-02-21  9:39 UTC (permalink / raw)
  To: Robert Hancock; +Cc: Chris Webb, linux-ide

On Thu, Feb 17, 2011 at 07:22:41PM -0500, Robert Hancock wrote:
> Also wonder if we could quirk this in the kernel somehow so that IOMMUs 
> wouldn't freak out..

Some kind of quirk will be required for the cards out there. I'll think
about it and post some patches for testing.

	Joerg



Thread overview: 16+ messages
2011-01-29 11:24 IO_PAGE_FAULT from SATA card during boot Chris Webb
2011-01-29 16:41 ` Robert Hancock
2011-01-30  1:54   ` Chris Webb
2011-01-30 15:37     ` Robert Hancock
2011-02-02 13:56       ` Chris Webb
2011-02-03  0:49         ` Robert Hancock
2011-02-03  8:56           ` Chris Webb
2011-02-07 17:48             ` Chris Webb
2011-02-08  2:04               ` Robert Hancock
2011-02-08 10:41                 ` Roedel, Joerg
2011-02-08 11:00                   ` Chris Webb
2011-02-08 14:43                     ` Robert Hancock
2011-02-08 14:48                       ` Chris Webb
2011-02-17  9:40                         ` Chris Webb
2011-02-18  0:22                           ` Robert Hancock
2011-02-21  9:39                             ` Roedel, Joerg
