* Regression between v3.5 and v3.6 in libata
@ 2013-02-26 20:03 Jan Sembera
  2013-02-26 22:07 ` Jan Sembera
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Sembera @ 2013-02-26 20:03 UTC
  To: linux-ide; +Cc: mjg, holger, ming.m.lin, 1126766

Hi,

It looks like commit 30dcf76 (libata: migrate ACPI code over to new
bindings) introduced a regression in libata that causes my AMD E450-based
system (Asus E45M1 motherboard) to fail to detect one of the connected
disk drives. It works fine in v3.5 but fails from v3.6 onwards (I've also
tested 3.8, and it is broken there too).

Four SATA ports are handled by the ahci driver, while the fifth port,
somewhat suspiciously, sits behind the pata_atiixp controller (although it
is in fact a SATA disk on a SATA connection):

00:14.1 IDE interface: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller (rev 40)

This works with 3.5:

[    1.972785] pata_acpi 0000:03:00.1: enabling device (0000 -> 0001)
[    1.973265] Fixed MDIO Bus: probed
...
[    2.877438] scsi6 : pata_atiixp
[    2.881369] scsi7 : pata_atiixp
[    2.881678] ata7: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf100 irq 14
[    2.881682] ata8: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf108 irq 15
...
[    3.046418] ata7.00: ATA-8: SAMSUNG HD204UI, 1AQ10001, max UDMA/133
[    3.046428] ata7.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
[    3.046436] ata7.00: limited to UDMA/33 due to 40-wire cable
[    3.059262] ata7.00: configured for UDMA/33

But fails with later kernels:

[    2.391535] scsi6 : pata_acpi
[    2.391994] scsi7 : pata_acpi
[    2.392281] ata7: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf100 irq 14
[    2.392371] ata8: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf108 irq 15
[    2.392453] pata_acpi 0000:03:00.1: enabling device (0000 -> 0001)
[    2.393013] libphy: Fixed MDIO Bus: probed
[    2.393060] ata7: prereset failed (errno=-19)
[    2.393063] ata7: reset failed, giving up
...
[    2.924032] ata8: prereset failed (errno=-19)
[    2.924126] ata8: reset failed, giving up

Full dmesg outputs:
https://launchpadlibrarian.net/131361456/dmesg-3.4-ok.log
https://launchpadlibrarian.net/131361470/dmesg-3.7-bad.log

The disk connected to that port remains undetected. I have bisected between
v3.5 and v3.6 and, with a clear reproduction case, narrowed it down to
30dcf76acc695cbd2fa919e294670fe9552e16e7. I would like to revert it on newer
kernels, for example to retest with 3.8, but that is not easy to do without
knowing the code, as there are several patches stacked on top of it.
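
The difference between the two boots quoted above can be pulled out of the
logs mechanically. What follows is a minimal Python sketch, not part of the
original report: the function name summarize() and the two regular
expressions are illustrative assumptions, and the sample text is copied from
the excerpts above. It lists which driver claimed each SCSI host and maps
the "prereset failed" errno to its symbolic name (19 is ENODEV on Linux).

```python
import errno
import re

# Hypothetical patterns for the two kinds of dmesg lines quoted above.
HOST_RE = re.compile(r"\]\s*scsi(\d+)\s*:\s*(\S+)")
FAIL_RE = re.compile(r"\]\s*(ata\d+): prereset failed \(errno=(-?\d+)\)")


def summarize(dmesg_text):
    """Return (host -> driver, port -> errno name) for a dmesg excerpt."""
    hosts, failures = {}, {}
    for line in dmesg_text.splitlines():
        m = HOST_RE.search(line)
        if m:
            hosts["scsi" + m.group(1)] = m.group(2)
            continue
        m = FAIL_RE.search(line)
        if m:
            code = abs(int(m.group(2)))
            # Fall back to the bare number if the platform has no name for it.
            failures[m.group(1)] = errno.errorcode.get(code, str(code))
    return hosts, failures


# Lines copied from the v3.5 (good) and later (bad) excerpts above.
GOOD = """\
[    2.877438] scsi6 : pata_atiixp
[    2.881369] scsi7 : pata_atiixp
"""

BAD = """\
[    2.391535] scsi6 : pata_acpi
[    2.391994] scsi7 : pata_acpi
[    2.393060] ata7: prereset failed (errno=-19)
[    2.924032] ata8: prereset failed (errno=-19)
"""

if __name__ == "__main__":
    for label, text in (("good", GOOD), ("bad", BAD)):
        print(label, summarize(text))
```

Run against the full dmesg files linked above, the same function would show
at a glance whether the host driver changed between two kernels.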

I'm willing to do any testing necessary to resolve this.

Thanks,
Jan

* Re: Regression between v3.5 and v3.6 in libata
  2013-02-26 20:03 Regression between v3.5 and v3.6 in libata Jan Sembera
@ 2013-02-26 22:07 ` Jan Sembera
  2013-02-27  6:36   ` Robert Hancock
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Sembera @ 2013-02-26 22:07 UTC
  To: linux-ide; +Cc: mjg, holger, ming.m.lin, 1126766

So I apparently missed the two most important differences between the good
and bad boots.

On Tue, Feb 26, 2013 at 09:03:48PM +0100, Jan Sembera wrote:
> [    2.877438] scsi6 : pata_atiixp
> [    2.881369] scsi7 : pata_atiixp
> 
> [    2.391535] scsi6 : pata_acpi
> [    2.391994] scsi7 : pata_acpi

pata_acpi doesn't play well with this controller. Disabling it in the
kernel config and rebooting (even with 3.8) gave a completely working kernel.
So either pata_acpi shouldn't bind to this controller, leaving it to
pata_atiixp as before (some kind of blacklisting needed?), or it needs fixing
to work properly with this controller.

As a workaround for now, I'll just not compile PATA_ACPI into the kernel.

00:14.1 IDE interface: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller (rev 40) (prog-if 8a [Master SecP PriP])
	Subsystem: ASUSTeK Computer Inc. Device 8496
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32
	Interrupt: pin B routed to IRQ 17
	Region 0: I/O ports at 01f0 [size=8]
	Region 1: I/O ports at 03f4 [size=1]
	Region 2: I/O ports at 0170 [size=8]
	Region 3: I/O ports at 0374 [size=1]
	Region 4: I/O ports at f100 [size=16]
	Kernel driver in use: pata_atiixp


* Re: Regression between v3.5 and v3.6 in libata
  2013-02-26 22:07 ` Jan Sembera
@ 2013-02-27  6:36   ` Robert Hancock
  2013-02-27  9:38     ` Jan Sembera
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Hancock @ 2013-02-27  6:36 UTC
  To: Jan Sembera; +Cc: linux-ide, mjg, holger, ming.m.lin, 1126766

On 02/26/2013 04:07 PM, Jan Sembera wrote:
> So I apparently missed two most important differences between good and bad
> boots.
>
> On Tue, Feb 26, 2013 at 09:03:48PM +0100, Jan Sembera wrote:
>> [    2.877438] scsi6 : pata_atiixp
>> [    2.881369] scsi7 : pata_atiixp
>>
>> [    2.391535] scsi6 : pata_acpi
>> [    2.391994] scsi7 : pata_acpi
>
> pata-acpi doesn't play very well with this controller. Disabling it in
> kernel and rebooting (even with 3.8) provided completely working kernel.
> So either this driver shouldn't bind pata-acpi and leave it on pata-atiixp
> as before (some kind of blacklisting needed?), or it needs some fixing to
> work nicely with this controller.
>
> As a workaround for now, I'll just not compile PATA_ACPI into the kernel.

What are your kernel config settings for these modules? The idea is that 
pata_acpi is only supposed to get loaded if no other driver is able to 
bind to the device.

>
> 00:14.1 IDE interface: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller (rev 40) (prog-if 8a [Master SecP PriP])
> 	Subsystem: ASUSTeK Computer Inc. Device 8496
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 32
> 	Interrupt: pin B routed to IRQ 17
> 	Region 0: I/O ports at 01f0 [size=8]
> 	Region 1: I/O ports at 03f4 [size=1]
> 	Region 2: I/O ports at 0170 [size=8]
> 	Region 3: I/O ports at 0374 [size=1]
> 	Region 4: I/O ports at f100 [size=16]
> 	Kernel driver in use: pata_atiixp
>

* Re: Regression between v3.5 and v3.6 in libata
  2013-02-27  6:36   ` Robert Hancock
@ 2013-02-27  9:38     ` Jan Sembera
  2013-02-27 12:42       ` Aaron Lu
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Sembera @ 2013-02-27  9:38 UTC
  To: Robert Hancock; +Cc: linux-ide, mjg, holger

On Wed, Feb 27, 2013 at 12:36:07AM -0600, Robert Hancock wrote:
> On 02/26/2013 04:07 PM, Jan Sembera wrote:
> > So I apparently missed two most important differences between good and bad
> > boots.
> >
> > On Tue, Feb 26, 2013 at 09:03:48PM +0100, Jan Sembera wrote:
> >> [    2.877438] scsi6 : pata_atiixp
> >> [    2.881369] scsi7 : pata_atiixp
> >>
> >> [    2.391535] scsi6 : pata_acpi
> >> [    2.391994] scsi7 : pata_acpi
> >
> > pata-acpi doesn't play very well with this controller. Disabling it in
> > kernel and rebooting (even with 3.8) provided completely working kernel.
> > So either this driver shouldn't bind pata-acpi and leave it on pata-atiixp
> > as before (some kind of blacklisting needed?), or it needs some fixing to
> > work nicely with this controller.
> >
> > As a workaround for now, I'll just not compile PATA_ACPI into the kernel.
> 
> What are your kernel config settings for these modules? The idea is that 
> pata_acpi is only supposed to get loaded if no other driver is able to 
> bind to the device.

This is based on the config that Ubuntu uses for building vanilla kernels,
which has PATA_ACPI=y and PATA_ATIIXP=m. That probably means pata_acpi
grabs the controller before pata_atiixp has any chance to do so, which is
probably bad; it should be set to PATA_ACPI=m instead.

Should this also be treated as a bug in pata_acpi? Or is it expected to
fail on some subset of motherboards/controllers, and not worth fixing,
especially when another driver handles the controller fine?
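
The risky combination described above (generic driver built in, vendor
driver modular) is easy to check for in a kernel .config. The following is
a rough Python sketch under stated assumptions: parse_config() and
generic_beats_vendor() are hypothetical helper names, and the sample text
only mirrors the two settings mentioned in this message using the usual
CONFIG_ prefix. It is not an official kernel tool.

```python
import re

# Matches lines such as CONFIG_PATA_ACPI=y; "is not set" lines are ignored.
OPTION_RE = re.compile(r"^CONFIG_(\w+)=([ym])$")


def parse_config(text):
    """Map option names (without the CONFIG_ prefix) to 'y' or 'm'."""
    options = {}
    for line in text.splitlines():
        m = OPTION_RE.match(line.strip())
        if m:
            options[m.group(1)] = m.group(2)
    return options


def generic_beats_vendor(options, generic="PATA_ACPI", vendor="PATA_ATIIXP"):
    """True when the generic driver is built in but the vendor driver is a module."""
    return options.get(generic) == "y" and options.get(vendor) == "m"


# Sample fragment reflecting the settings reported in this message.
REPORTED = """\
CONFIG_ATA=y
CONFIG_PATA_ACPI=y
CONFIG_PATA_ATIIXP=m
# CONFIG_PATA_LEGACY is not set
"""

if __name__ == "__main__":
    opts = parse_config(REPORTED)
    if generic_beats_vendor(opts):
        print("PATA_ACPI is built in while PATA_ATIIXP is a module")
```

With PATA_ACPI=m the check no longer fires, which corresponds to the
configuration change proposed above.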

* Re: Regression between v3.5 and v3.6 in libata
  2013-02-27  9:38     ` Jan Sembera
@ 2013-02-27 12:42       ` Aaron Lu
  2013-02-27 14:25         ` Jan Sembera
  0 siblings, 1 reply; 6+ messages in thread
From: Aaron Lu @ 2013-02-27 12:42 UTC
  To: Jan Sembera; +Cc: Robert Hancock, linux-ide, mjg, holger

On 02/27/2013 05:38 PM, Jan Sembera wrote:
> On Wed, Feb 27, 2013 at 12:36:07AM -0600, Robert Hancock wrote:
>> On 02/26/2013 04:07 PM, Jan Sembera wrote:
>>> So I apparently missed two most important differences between good and bad
>>> boots.
>>>
>>> On Tue, Feb 26, 2013 at 09:03:48PM +0100, Jan Sembera wrote:
>>>> [    2.877438] scsi6 : pata_atiixp
>>>> [    2.881369] scsi7 : pata_atiixp
>>>>
>>>> [    2.391535] scsi6 : pata_acpi
>>>> [    2.391994] scsi7 : pata_acpi
>>>
>>> pata-acpi doesn't play very well with this controller. Disabling it in
>>> kernel and rebooting (even with 3.8) provided completely working kernel.
>>> So either this driver shouldn't bind pata-acpi and leave it on pata-atiixp
>>> as before (some kind of blacklisting needed?), or it needs some fixing to
>>> work nicely with this controller.
>>>
>>> As a workaround for now, I'll just not compile PATA_ACPI into the kernel.
>>
>> What are your kernel config settings for these modules? The idea is that
>> pata_acpi is only supposed to get loaded if no other driver is able to
>> bind to the device.
>
> This is based on a config that Ubuntu uses for building vanilla kernels and
> has PATA_ACPI=y, PATA_ATIIXP=m. Which probably means that pata_acpi will
> grab the controller before pata_atiixp has any chance to do so. Which is
> probably bad and should be set to PATA_ACPI=m instead.
>
> Should this also be treated as a bug in pata_acpi, or is it expected that
> it's going to fail on some subset of motherboards/controllers and it's not
> worth bothering with fixing it, especially if there is some other driver
> that handles the controller fine?

The order here is important: the vendor driver should always be tried before
pata_acpi.

As for the bisected commit, it actually fixed a bug in pata_acpi so that it
now successfully probes the controller device, which leaves no other PATA
driver able to probe it. Because pata_acpi cannot always drive that
controller successfully (this depends on the ACPI tables and may not be a
bug in pata_acpi), the attached disks do not function properly. An
explanation is here:
https://bugzilla.kernel.org/show_bug.cgi?id=49151#c41

And a previously submitted bug report on this:
https://bugzilla.kernel.org/show_bug.cgi?id=48631
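
The ordering effect described above can be illustrated with a toy model.
This Python sketch is an assumption-laden simplification and not kernel
code: bind(), the probe functions, and the device dictionary are all
hypothetical. It only shows that when the first driver to probe
successfully claims the device, making pata_acpi's probe succeed changes
which driver wins whenever pata_acpi is tried first.

```python
def bind(device, drivers):
    """Return the name of the first driver whose probe accepts the device."""
    for name, probe in drivers:
        if probe(device):
            return name
    return None


def accepts_ide(device):
    """Stand-in probe that accepts any IDE-class controller."""
    return device["class"] == "IDE interface"


def always_fails(device):
    """Stand-in for a probe that never succeeds (pata_acpi before the fix)."""
    return False


# Hypothetical description of the controller from the lspci output.
DEVICE = {"slot": "00:14.1", "class": "IDE interface"}

# pata_acpi is tried first in each of the first two orderings (e.g. built in).
BEFORE_FIX = [("pata_acpi", always_fails), ("pata_atiixp", accepts_ide)]
AFTER_FIX = [("pata_acpi", accepts_ide), ("pata_atiixp", accepts_ide)]
# Vendor driver tried first, as recommended above.
VENDOR_FIRST = [("pata_atiixp", accepts_ide), ("pata_acpi", accepts_ide)]

if __name__ == "__main__":
    for label, order in (("before fix", BEFORE_FIX),
                         ("after fix", AFTER_FIX),
                         ("vendor first", VENDOR_FIRST)):
        print(label, "->", bind(DEVICE, order))
```

In this model the vendor driver wins again as soon as it is tried before
pata_acpi, regardless of whether pata_acpi's probe would succeed.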

Thanks,
Aaron

* Re: Regression between v3.5 and v3.6 in libata
  2013-02-27 12:42       ` Aaron Lu
@ 2013-02-27 14:25         ` Jan Sembera
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Sembera @ 2013-02-27 14:25 UTC
  To: Aaron Lu; +Cc: Robert Hancock, linux-ide, mjg, holger

On Wed, Feb 27, 2013 at 08:42:04PM +0800, Aaron Lu wrote:
> On 02/27/2013 05:38 PM, Jan Sembera wrote:
> > On Wed, Feb 27, 2013 at 12:36:07AM -0600, Robert Hancock wrote:
> >> On 02/26/2013 04:07 PM, Jan Sembera wrote:
> >>> So I apparently missed two most important differences between good and bad
> >>> boots.
> >>>
> >>> On Tue, Feb 26, 2013 at 09:03:48PM +0100, Jan Sembera wrote:
> >>>> [    2.877438] scsi6 : pata_atiixp
> >>>> [    2.881369] scsi7 : pata_atiixp
> >>>>
> >>>> [    2.391535] scsi6 : pata_acpi
> >>>> [    2.391994] scsi7 : pata_acpi
> >>>
> >>> pata-acpi doesn't play very well with this controller. Disabling it in
> >>> kernel and rebooting (even with 3.8) provided completely working kernel.
> >>> So either this driver shouldn't bind pata-acpi and leave it on pata-atiixp
> >>> as before (some kind of blacklisting needed?), or it needs some fixing to
> >>> work nicely with this controller.
> >>>
> >>> As a workaround for now, I'll just not compile PATA_ACPI into the kernel.
> >>
> >> What are your kernel config settings for these modules? The idea is that
> >> pata_acpi is only supposed to get loaded if no other driver is able to
> >> bind to the device.
> >
> > This is based on a config that Ubuntu uses for building vanilla kernels and
> > has PATA_ACPI=y, PATA_ATIIXP=m. Which probably means that pata_acpi will
> > grab the controller before pata_atiixp has any chance to do so. Which is
> > probably bad and should be set to PATA_ACPI=m instead.
> >
> > Should this also be treated as a bug in pata_acpi, or is it expected that
> > it's going to fail on some subset of motherboards/controllers and it's not
> > worth bothering with fixing it, especially if there is some other driver
> > that handles the controller fine?
> 
> The order here is important: vendor driver should always be used before
> pata_acpi.
> 
> And regarding the bisected commit, it actually fixed a bug in pata_acpi
> and made it successfully probed the controller device, so that no other
> pata driver is able to probe it; and due to pata_acpi can not always
> successfully drive that controller(this depends on ACPI table, it may
> not be a bug in pata_acpi), the disks attached will not function
> properly. Here is an explanation on this:
> https://bugzilla.kernel.org/show_bug.cgi?id=49151#c41
> 
> And a previously submitted bug report on this:
> https://bugzilla.kernel.org/show_bug.cgi?id=48631

Ok, thanks for the detailed explanation. I'll switch to PATA_ACPI=m for now,
but if there is some debugging you'd like me to do with pata_acpi, I'm
willing to help out with that as well.

Cheers,
Jan
