iommu.lists.linux-foundation.org archive mirror
* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
       [not found] <a79ea7f5-6a41-a6c9-cfec-ba01aa2a3cfa@leemhuis.info>
@ 2023-03-28  1:22 ` Christoph Hellwig
  2023-03-30 12:18   ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-03-28  1:22 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Christoph Hellwig, LKML, Linux PCI, iommu, baolu.lu


I finally found some real time to look into this:

On Tue, Mar 21, 2023 at 02:52:00PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> > The main errors I am getting are
> > 
> > dmar_fault 8 callbacks suppressed
> > DMAR : DRHD: handling fault status req 2
> > DMAR : [DMA Write NO_PASID] Request device [07.00.1] fault addr
> > 0xfffe0000 [fault reason 0x82] Present bit in context entry is clear

This clearly indicates that my original idea about the AMD gart was
completely bonkers, as we're obviously on an Intel platform.

And this indicates that the device is trying to do a DMA write to
something that isn't IOMMU mapped.  Getting this from an initialization
change (commit 78013eaadf6 ("x86: remove the IOMMU table infrastructure"))
feels very strange to me.

Can you maybe post the full dmesg?  I wonder if there is an interesting
initialization error in there.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-03-28  1:22 ` [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235 Christoph Hellwig
@ 2023-03-30 12:18   ` Robin Murphy
  2023-03-31  2:20     ` Jason Adriaanse
  2023-04-16  6:41     ` Christoph Hellwig
  0 siblings, 2 replies; 19+ messages in thread
From: Robin Murphy @ 2023-03-30 12:18 UTC (permalink / raw)
  To: Christoph Hellwig, Linux regressions mailing list
  Cc: LKML, Linux PCI, iommu, baolu.lu

On 2023-03-28 02:22, Christoph Hellwig wrote:
> 
> I finally found some real time to look into this:
> 
> On Tue, Mar 21, 2023 at 02:52:00PM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> The main errors I am getting are
>>>
>>> dmar_fault 8 callbacks suppressed
>>> DMAR : DRHD: handling fault status req 2
>>> DMAR : [DMA Write NO_PASID] Request device [07.00.1] fault addr
>>> 0xfffe0000 [fault reason 0x82] Present bit in context entry is clear
> 
> This clearly indicates that my original idea about the AMD gart was
> completely bonkers, as we're obviously on an Intel platform.
> 
> And this indicates that the device is trying to do a DMA write to
> something that isn't IOMMU mapped.  Getting this from an initialization
> change (commit 78013eaadf6 ("x86: remove the IOMMU table infrastructure"))
> feels very strange to me.
> 
> Can you maybe post the full dmesg?  I wonder if there is an interesting
> initialization error in there.

FWIW "Marvell SATA" instantly makes me suspect the phantom function 
quirk. What *should* happen is the IOMMU driver sees the PCI DMA aliases 
correctly and sets up context entries for both 07.00.0 and 07.00.1, but 
it looks like that may be what's gone awry.
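
To illustrate that failure mode, here is a rough userspace sketch in C. It
is not kernel code: it assumes a much-simplified per-bus context table with
one present bit per devfn, and struct ctx_table, ctx_setup() and
dma_allowed() are made-up names for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack slot (device) and function numbers into the 8-bit PCI devfn. */
static uint8_t make_devfn(unsigned int slot, unsigned int func)
{
	assert(slot < 32 && func < 8);
	return (uint8_t)((slot << 3) | func);
}

/* Hypothetical, simplified per-bus context table: one present bit per devfn. */
struct ctx_table {
	bool present[256];
};

/* Model of the IOMMU driver installing a context entry for one function. */
static void ctx_setup(struct ctx_table *t, uint8_t devfn)
{
	t->present[devfn] = true;
}

/* Model of the hardware check: DMA tagged with a devfn lacking an entry faults. */
static bool dma_allowed(const struct ctx_table *t, uint8_t devfn)
{
	return t->present[devfn];
}
```

If only make_devfn(0, 0) is passed to ctx_setup() for bus 07, then
dma_allowed() for make_devfn(0, 1) is false, which would correspond to the
"Present bit in context entry is clear" fault reported for 07.00.1.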

Robin.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-03-30 12:18   ` Robin Murphy
@ 2023-03-31  2:20     ` Jason Adriaanse
  2023-04-16  6:55       ` Christoph Hellwig
  2023-04-16  6:41     ` Christoph Hellwig
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Adriaanse @ 2023-03-31  2:20 UTC (permalink / raw)
  To: robin.murphy; +Cc: baolu.lu, hch, iommu, linux-kernel, linux-pci, regressions

Hi Christoph and Robin,

Christoph - I would like to send you more dmesg information but as my 
boot device cannot be detected that information is not being written to 
disk. Is there any way to specifically write boot debug information to 
say a USB device with some kernel parameters?

Alternatively, I could boot from a USB device.

Jason




* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-03-30 12:18   ` Robin Murphy
  2023-03-31  2:20     ` Jason Adriaanse
@ 2023-04-16  6:41     ` Christoph Hellwig
  2023-04-17 11:21       ` Robin Murphy
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-04-16  6:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Linux regressions mailing list, LKML,
	Linux PCI, iommu, baolu.lu

On Thu, Mar 30, 2023 at 01:18:45PM +0100, Robin Murphy wrote:
> FWIW "Marvell SATA" instantly makes me suspect the phantom function quirk. 
> What *should* happen is the IOMMU driver sees the PCI DMA aliases correctly 
> and sets up context entries for both 07.00.0 and 07.00.1, but it looks like 
> that may be what's gone awry.

Looking at the bug report it seems this is device 9235, which doesn't
need the DMA alias quirks.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-03-31  2:20     ` Jason Adriaanse
@ 2023-04-16  6:55       ` Christoph Hellwig
  2023-04-22  6:25         ` Jason Adriaanse
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-04-16  6:55 UTC (permalink / raw)
  To: Jason Adriaanse
  Cc: robin.murphy, baolu.lu, hch, iommu, linux-kernel, linux-pci, regressions

Hi Jason,

sorry for the late reply.  I had some email issues and am still
recovering from the backlog.

On Fri, Mar 31, 2023 at 10:20:37AM +0800, Jason Adriaanse wrote:
> Hi Christoph and Robin,
>
> Christoph - I would like to send you more dmesg information but as my boot 
> device cannot be detected that information is not being written to disk. Is 
> there any way to specifically write boot debug information to say a USB 
> device with some kernel parameters?

I don't know of any good way.  pstore has some ways to save kernel
messages, but it doesn't work too well with normal block devices in
case of crashes.

I'm a bit lost at the moment.

Two ideas I have would be to

 1) boot with the intel_iommu=off kernel command line
 2) build a kernel with CONFIG_INTEL_IOMMU

and see if that works and report the dmesg.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-16  6:41     ` Christoph Hellwig
@ 2023-04-17 11:21       ` Robin Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2023-04-17 11:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Linux regressions mailing list, LKML, Linux PCI, iommu, baolu.lu

On 2023-04-16 07:41, Christoph Hellwig wrote:
> On Thu, Mar 30, 2023 at 01:18:45PM +0100, Robin Murphy wrote:
>> FWIW "Marvell SATA" instantly makes me suspect the phantom function quirk.
>> What *should* happen is the IOMMU driver sees the PCI DMA aliases correctly
>> and sets up context entries for both 07.00.0 and 07.00.1, but it looks like
>> that may be what's gone awry.
> 
> Looking at the bug report it seems this is device 9235, which doesn't
> need the DMA alias quirks.

Indeed that one doesn't appear to be in the quirk list currently. 
However the symptom of DMA traffic from function 1 which the IOMMU 
clearly wasn't expecting firmly suggests that it *does* need the quirk. 
Digging up the original report, the lspci output there suggests that 
07:00.1 isn't a real function, which would further confirm it.

The other thing which catches my interest is the seemingly-conflicting 
"iommu=soft" and "intel_iommu=on" arguments - I could well believe that 
refactoring the x86 IOMMU detection stuff might have subtly changed the 
interaction there, such that previously it ended up not actually using 
the IOMMU for DMA ops, but now it is?

Robin.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-16  6:55       ` Christoph Hellwig
@ 2023-04-22  6:25         ` Jason Adriaanse
  2023-04-24 13:20           ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Adriaanse @ 2023-04-22  6:25 UTC (permalink / raw)
  To: hch
  Cc: baolu.lu, iommu, jason_a69, linux-kernel, linux-pci, regressions,
	robin.murphy

Hi Christoph,

Sorry for my late reply, I have been on the road.

So, if I boot with
intel_iommu=off
Then the server boots fine, although that is not a solution because I 
need Intel iommu for virtualisation.

Also, I build all my kernels with CONFIG_INTEL_IOMMU=y



* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-22  6:25         ` Jason Adriaanse
@ 2023-04-24 13:20           ` Robin Murphy
  2023-04-24 13:44             ` Jason Adriaanse
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2023-04-24 13:20 UTC (permalink / raw)
  To: Jason Adriaanse, hch
  Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

On 2023-04-22 07:25, Jason Adriaanse wrote:
> Hi Christoph,
> 
> Sorry for my late reply, I have been on the road.
> 
> So, if I boot with
> intel_iommu=off
> Then the server boots fine, although that is not a solution because I 
> need Intel iommu for virtualisation.
> 
> Also, I build all my kernels with CONFIG_INTEL_IOMMU=y
> 

If you boot 5.15 *without* the "iommu=soft" argument, just 
"intel_iommu=on", does that also break?

Robin.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-24 13:20           ` Robin Murphy
@ 2023-04-24 13:44             ` Jason Adriaanse
  2023-04-24 14:07               ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Adriaanse @ 2023-04-24 13:44 UTC (permalink / raw)
  To: Robin Murphy, hch; +Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

I took out "iommu=soft" and the server failed to boot, so yes it does break.

The first error was
ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-24 13:44             ` Jason Adriaanse
@ 2023-04-24 14:07               ` Robin Murphy
  2023-04-25  4:17                 ` Jason Adriaanse
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2023-04-24 14:07 UTC (permalink / raw)
  To: Jason Adriaanse, hch
  Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

On 2023-04-24 14:44, Jason Adriaanse wrote:
> I took out "iommu=soft" and the server failed to boot, so yes it does 
> break.
> 
> The first error was
> ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)

OK, great, that confirms the underlying issue existed all along, so the 
regression is only a change in who wins a fight between certain 
conflicting command-line arguments, which is arguably not so critical.

The rest of the evidence points to 88SE9235 wanting the same phantom 
function quirk as most other Marvell controllers, since although it's 
apparently been half-fixed such that DMA for two of the ports is being 
correctly emitted from function 0 - given that you say two of the disks 
*are* detected OK - the other two are still claiming to be function 1 
after all.

Thanks,
Robin.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-24 14:07               ` Robin Murphy
@ 2023-04-25  4:17                 ` Jason Adriaanse
  2023-04-25 11:37                   ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Adriaanse @ 2023-04-25  4:17 UTC (permalink / raw)
  To: Robin Murphy, hch; +Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

Ok great,

I take it a change needs to be made in drivers/pci/quirks.c?
I do not mind making the change locally here and letting you know if it 
works or not.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-25  4:17                 ` Jason Adriaanse
@ 2023-04-25 11:37                   ` Robin Murphy
  2023-04-25 13:58                     ` Jason Adriaanse
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2023-04-25 11:37 UTC (permalink / raw)
  To: Jason Adriaanse, hch
  Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

On 2023-04-25 05:17, Jason Adriaanse wrote:
> Ok great,
> 
> I take it a change needs to be made in drivers/pci/quirks.c?
> I do not mind making the change locally here and letting you know if it 
> works or not.

Indeed, something like this (make sure the IDs actually match what your
device reports, I'm just guessing):


diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44cab813bf95..a9166e886b75 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4161,6 +4161,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
 /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
 			 quirk_dma_func1_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
+			 quirk_dma_func1_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
 			 quirk_dma_func1_alias);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,


Marvell themselves seem to lump the 88SE92xx products together as a
closely-related family, so given that we do have quirks for 3 of the 4
already, this one does rather seem conspicuous by its absence...
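
For anyone following along, here is a rough userspace model in C of what a
quirk entry like this is understood to do: for a matching vendor/device ID
it records function 1 of the same slot as a DMA alias of the device. This is
a sketch under assumptions, not the kernel implementation; struct model_dev
and the model_*() functions are hypothetical, and the ID table only lists
the Marvell IDs visible in the diff (the real list is longer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_MARVELL_EXT 0x1b4b /* value of PCI_VENDOR_ID_MARVELL_EXT */

/* Hypothetical stand-in for struct pci_dev: IDs, devfn and an alias bitmap. */
struct model_dev {
	uint16_t vendor, device;
	uint8_t devfn;
	bool dma_alias[256];
};

/* Device IDs visible in the diff above; illustrative, not the full list. */
static const uint16_t func1_alias_ids[] = { 0x9220, 0x9230, 0x9235 };

/* Model of the quirk: also accept DMA tagged as function 1 of the same slot. */
static void model_quirk_dma_func1_alias(struct model_dev *dev)
{
	uint8_t slot = dev->devfn >> 3;

	if ((dev->devfn & 7) != 1)
		dev->dma_alias[(uint8_t)((slot << 3) | 1)] = true;
}

/* Apply the quirk only when vendor and device ID match the table. */
static void model_apply_fixups(struct model_dev *dev)
{
	size_t n = sizeof(func1_alias_ids) / sizeof(func1_alias_ids[0]);

	if (dev->vendor != VENDOR_MARVELL_EXT)
		return;
	for (size_t i = 0; i < n; i++)
		if (dev->device == func1_alias_ids[i])
			model_quirk_dma_func1_alias(dev);
}
```

For a device at 07:00.0 with ID 0x9235, model_apply_fixups() marks devfn 1
(i.e. 07:00.1) as an alias, so an IOMMU driver walking the aliases would set
up context entries for both functions. Without the 0x9235 entry in the table
nothing is marked, which matches the behaviour before the patch.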

Thanks,
Robin.


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-25 11:37                   ` Robin Murphy
@ 2023-04-25 13:58                     ` Jason Adriaanse
  2023-05-22 10:26                       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Adriaanse @ 2023-04-25 13:58 UTC (permalink / raw)
  To: Robin Murphy, hch; +Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

I am happy to report that the change worked. This is what 
drivers/pci/quirks.c looks like now:

/* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
                          quirk_dma_func1_alias);
/* https://bugzilla.kernel.org/show_bug.cgi?id=217218 */
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
                          quirk_dma_func1_alias);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
                          quirk_dma_func1_alias);

Relevant output of dmesg -T with the new kernel running

[Tue Apr 25 21:45:13 2023] scsi host0: ahci
[Tue Apr 25 21:45:13 2023] scsi host1: ahci
[Tue Apr 25 21:45:13 2023] scsi host2: ahci
[Tue Apr 25 21:45:13 2023] scsi host3: ahci
[Tue Apr 25 21:45:13 2023] ata1: SATA max UDMA/133 abar m2048@0xf7d06000 port 0xf7d06100 irq 40
[Tue Apr 25 21:45:13 2023] ata2: SATA max UDMA/133 abar m2048@0xf7d06000 port 0xf7d06180 irq 40
[Tue Apr 25 21:45:13 2023] ata3: DUMMY
[Tue Apr 25 21:45:13 2023] ata4: DUMMY
[Tue Apr 25 21:45:13 2023] igb 0000:05:00.0 enp5s0: renamed from eth0
[Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: AHCI 0001.0000 32 slots 4 ports 6 Gbps 0xf impl SATA mode
[Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: flags: 64bit ncq sntf led only pmp fbs pio slum part sxs
[Tue Apr 25 21:45:13 2023] scsi host4: ahci
[Tue Apr 25 21:45:13 2023] scsi host5: ahci
[Tue Apr 25 21:45:13 2023] scsi host6: ahci
[Tue Apr 25 21:45:13 2023] scsi host7: ahci
[Tue Apr 25 21:45:13 2023] ata5: SATA max UDMA/133 abar m2048@0xf7b10000 port 0xf7b10100 irq 41
[Tue Apr 25 21:45:13 2023] ata6: SATA max UDMA/133 abar m2048@0xf7b10000 port 0xf7b10180 irq 41
[Tue Apr 25 21:45:13 2023] ata7: SATA max UDMA/133 abar m2048@0xf7b10000 port 0xf7b10200 irq 41
[Tue Apr 25 21:45:13 2023] ata8: SATA max UDMA/133 abar m2048@0xf7b10000 port 0xf7b10280 irq 41
[Tue Apr 25 21:45:13 2023] usb 1-1: new high-speed USB device number 2 using ehci-pci
[Tue Apr 25 21:45:14 2023] usb 3-1: new high-speed USB device number 2 using ehci-pci
[Tue Apr 25 21:45:14 2023] ata8: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Apr 25 21:45:14 2023] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Apr 25 21:45:14 2023] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Apr 25 21:45:14 2023] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[Tue Apr 25 21:45:14 2023] ata7.00: ATA-9: WDC WD40EFRX-68WT0N0, 80.00A80, max UDMA/133
[Tue Apr 25 21:45:14 2023] ata6.00: ATA-9: WDC WD40EFRX-68WT0N0, 80.00A80, max UDMA/133
[Tue Apr 25 21:45:14 2023] ata8.00: ATA-9: WDC WD40EFRX-68WT0N0, 80.00A80, max UDMA/133
[Tue Apr 25 21:45:14 2023] ata5.00: ATA-10: CT2000BX500SSD1, M6CR030, max UDMA/133
[Tue Apr 25 21:45:14 2023] ata6.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[Tue Apr 25 21:45:14 2023] ata7.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[Tue Apr 25 21:45:14 2023] ata8.00: 7814037168 sectors, multi 0: LBA48 NCQ (depth 32), AA
[Tue Apr 25 21:45:14 2023] ata5.00: 3907029168 sectors, multi 1: LBA48 NCQ (depth 32), AA
[Tue Apr 25 21:45:14 2023] ata6.00: configured for UDMA/133
[Tue Apr 25 21:45:14 2023] ata7.00: configured for UDMA/133
[Tue Apr 25 21:45:14 2023] ata8.00: configured for UDMA/133
[Tue Apr 25 21:45:14 2023] ata1: SATA link down (SStatus 0 SControl 300)
[Tue Apr 25 21:45:14 2023] ata5.00: Features: Dev-Sleep
[Tue Apr 25 21:45:14 2023] ata5.00: configured for UDMA/133
[Tue Apr 25 21:45:14 2023] usb 1-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[Tue Apr 25 21:45:14 2023] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[Tue Apr 25 21:45:14 2023] hub 1-1:1.0: USB hub found
[Tue Apr 25 21:45:14 2023] hub 1-1:1.0: 4 ports detected
[Tue Apr 25 21:45:14 2023] usb 3-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[Tue Apr 25 21:45:14 2023] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[Tue Apr 25 21:45:14 2023] hub 3-1:1.0: USB hub found
[Tue Apr 25 21:45:14 2023] hub 3-1:1.0: 6 ports detected
[Tue Apr 25 21:45:14 2023] ata2: SATA link down (SStatus 0 SControl 300)
[Tue Apr 25 21:45:14 2023] scsi 4:0:0:0: Direct-Access ATA      CT2000BX500SSD1  030  PQ: 0 ANSI: 5
[Tue Apr 25 21:45:14 2023] scsi 5:0:0:0: Direct-Access ATA      WDC WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
[Tue Apr 25 21:45:14 2023] scsi 6:0:0:0: Direct-Access ATA      WDC WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
[Tue Apr 25 21:45:14 2023] scsi 7:0:0:0: Direct-Access ATA      WDC WD40EFRX-68W 0A80 PQ: 0 ANSI: 5

Thanks everyone for all your help.

Jason



* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-04-25 13:58                     ` Jason Adriaanse
@ 2023-05-22 10:26                       ` Linux regression tracking (Thorsten Leemhuis)
  2023-05-22 11:01                         ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-22 10:26 UTC (permalink / raw)
  To: Jason Adriaanse, Robin Murphy, hch
  Cc: baolu.lu, iommu, linux-kernel, linux-pci, regressions

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

I might be missing something, but it looks to me like this regression
was never fixed in mainline. Which is strange, as we apparently had a
patch from Robin that fixed the issue for the reporter.

Did it fall through the cracks or what am I missing?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 25.04.23 15:58, Jason Adriaanse wrote:
> I am happy to report that the change worked, this is what
> drivers/pci/quirks.c looks like
> 
> /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>                          quirk_dma_func1_alias);
> /* https://bugzilla.kernel.org/show_bug.cgi?id=217218 */
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>                          quirk_dma_func1_alias);
> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>                          quirk_dma_func1_alias);
> 
> Relevant output of dmesg -T with the new kernel running
> 
> Tue Apr 25 21:45:13 2023] scsi host0: ahci
> [Tue Apr 25 21:45:13 2023] scsi host1: ahci
> [Tue Apr 25 21:45:13 2023] scsi host2: ahci
> [Tue Apr 25 21:45:13 2023] scsi host3: ahci
> [Tue Apr 25 21:45:13 2023] ata1: SATA max UDMA/133 abar m2048@0xf7d06000
> port 0xf7d06100 irq 40
> [Tue Apr 25 21:45:13 2023] ata2: SATA max UDMA/133 abar m2048@0xf7d06000
> port 0xf7d06180 irq 40
> [Tue Apr 25 21:45:13 2023] ata3: DUMMY
> [Tue Apr 25 21:45:13 2023] ata4: DUMMY
> [Tue Apr 25 21:45:13 2023] igb 0000:05:00.0 enp5s0: renamed from eth0
> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: AHCI 0001.0000 32 slots 4
> ports 6 Gbps 0xf impl SATA mode
> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: flags: 64bit ncq sntf led
> only pmp fbs pio slum part sxs
> [Tue Apr 25 21:45:13 2023] scsi host4: ahci
> [Tue Apr 25 21:45:13 2023] scsi host5: ahci
> [Tue Apr 25 21:45:13 2023] scsi host6: ahci
> [Tue Apr 25 21:45:13 2023] scsi host7: ahci
> [Tue Apr 25 21:45:13 2023] ata5: SATA max UDMA/133 abar m2048@0xf7b10000
> port 0xf7b10100 irq 41
> [Tue Apr 25 21:45:13 2023] ata6: SATA max UDMA/133 abar m2048@0xf7b10000
> port 0xf7b10180 irq 41
> [Tue Apr 25 21:45:13 2023] ata7: SATA max UDMA/133 abar m2048@0xf7b10000
> port 0xf7b10200 irq 41
> [Tue Apr 25 21:45:13 2023] ata8: SATA max UDMA/133 abar m2048@0xf7b10000
> port 0xf7b10280 irq 41
> [Tue Apr 25 21:45:13 2023] usb 1-1: new high-speed USB device number 2
> using ehci-pci
> [Tue Apr 25 21:45:14 2023] usb 3-1: new high-speed USB device number 2
> using ehci-pci
> [Tue Apr 25 21:45:14 2023] ata8: SATA link up 6.0 Gbps (SStatus 133
> SControl 300)
> [Tue Apr 25 21:45:14 2023] ata6: SATA link up 6.0 Gbps (SStatus 133
> SControl 300)
> [Tue Apr 25 21:45:14 2023] ata7: SATA link up 6.0 Gbps (SStatus 133
> SControl 300)
> [Tue Apr 25 21:45:14 2023] ata5: SATA link up 6.0 Gbps (SStatus 133
> SControl 300)
> [Tue Apr 25 21:45:14 2023] ata7.00: ATA-9: WDC WD40EFRX-68WT0N0,
> 80.00A80, max UDMA/133
> [Tue Apr 25 21:45:14 2023] ata6.00: ATA-9: WDC WD40EFRX-68WT0N0,
> 80.00A80, max UDMA/133
> [Tue Apr 25 21:45:14 2023] ata8.00: ATA-9: WDC WD40EFRX-68WT0N0,
> 80.00A80, max UDMA/133
> [Tue Apr 25 21:45:14 2023] ata5.00: ATA-10: CT2000BX500SSD1, M6CR030,
> max UDMA/133
> [Tue Apr 25 21:45:14 2023] ata6.00: 7814037168 sectors, multi 0: LBA48
> NCQ (depth 32), AA
> [Tue Apr 25 21:45:14 2023] ata7.00: 7814037168 sectors, multi 0: LBA48
> NCQ (depth 32), AA
> [Tue Apr 25 21:45:14 2023] ata8.00: 7814037168 sectors, multi 0: LBA48
> NCQ (depth 32), AA
> [Tue Apr 25 21:45:14 2023] ata5.00: 3907029168 sectors, multi 1: LBA48
> NCQ (depth 32), AA
> [Tue Apr 25 21:45:14 2023] ata6.00: configured for UDMA/133
> [Tue Apr 25 21:45:14 2023] ata7.00: configured for UDMA/133
> [Tue Apr 25 21:45:14 2023] ata8.00: configured for UDMA/133
> [Tue Apr 25 21:45:14 2023] ata1: SATA link down (SStatus 0 SControl 300)
> [Tue Apr 25 21:45:14 2023] ata5.00: Features: Dev-Sleep
> [Tue Apr 25 21:45:14 2023] ata5.00: configured for UDMA/133
> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device found, idVendor=8087,
> idProduct=0024, bcdDevice= 0.00
> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device strings: Mfr=0,
> Product=0, SerialNumber=0
> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: USB hub found
> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: 4 ports detected
> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device found, idVendor=8087,
> idProduct=0024, bcdDevice= 0.00
> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device strings: Mfr=0,
> Product=0, SerialNumber=0
> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: USB hub found
> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: 6 ports detected
> [Tue Apr 25 21:45:14 2023] ata2: SATA link down (SStatus 0 SControl 300)
> [Tue Apr 25 21:45:14 2023] scsi 4:0:0:0: Direct-Access ATA     
> CT2000BX500SSD1  030  PQ: 0 ANSI: 5
> [Tue Apr 25 21:45:14 2023] scsi 5:0:0:0: Direct-Access ATA      WDC
> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
> [Tue Apr 25 21:45:14 2023] scsi 6:0:0:0: Direct-Access ATA      WDC
> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
> [Tue Apr 25 21:45:14 2023] scsi 7:0:0:0: Direct-Access ATA      WDC
> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
> 
> Thanks everyone for all your help.
> 
> Jason
> 
> 
> On 25/04/2023 19:37, Robin Murphy wrote:
>> On 2023-04-25 05:17, Jason Adriaanse wrote:
>>> Ok great,
>>>
>>> I take it a change needs to be made in
>>> drivers/pci/quirks.c
>>> ?
>>> I do not mind making the change locally here and letting you know if
>>> it works or not.
>>
>> Indeed, something like this (make sure the IDs actually match what your
>> device reports, I'm just guessing):
>>
>>
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 44cab813bf95..a9166e886b75 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -4161,6 +4161,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
>>  /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>               quirk_dma_func1_alias);
>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>> +             quirk_dma_func1_alias);
>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>               quirk_dma_func1_alias);
>>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
>>
>>
>> Marvell themselves seem to lump the 88SE92xx products together as a
>> closely-related family, so given that we do have quirks for 3 of the 4
>> already, this one does rather seem conspicuous by its absence...
>>
>> Thanks,
>> Robin.
>>
>>> On 24/04/2023 22:07, Robin Murphy wrote:
>>>> On 2023-04-24 14:44, Jason Adriaanse wrote:
>>>>> I took out "iommu=soft" and the server failed to boot, so yes it
>>>>> does break.
>>>>>
>>>>> The first error was
>>>>> ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)
>>>>
>>>> OK, great, that confirms the underlying issue existed all along, so
>>>> the regression is only a change in who wins a fight between certain
>>>> conflicting command-line arguments, which is arguably not so critical.
>>>>
>>>> The rest of the evidence points to 88SE9235 wanting the same phantom
>>>> function quirk as most other Marvell controllers, since although
>>>> it's apparently been half-fixed such that DMA for two of the ports
>>>> is being correctly emitted from function 0 - given that you say two
>>>> of the disks *are* detected OK - the other two are still claiming to
>>>> be function 1 after all.
>>>>
>>>> Thanks,
>>>> Robin.
>>>>
>>>>> On 24/04/2023 21:20, Robin Murphy wrote:
>>>>>> On 2023-04-22 07:25, Jason Adriaanse wrote:
>>>>>>> Hi Christoph,
>>>>>>>
>>>>>>> Sorry for my late reply, I have been on the road.
>>>>>>>
>>>>>>> So, if I boot with
>>>>>>> intel_iommu=off
>>>>>>> Then the server boots fine..although that is not a solution
>>>>>>> because I need Intel iommu for virtualisation.
>>>>>>>
>>>>>>> Also, I build all my kernels with CONFIG_INTEL_IOMMU=y
>>>>>>>
>>>>>>
>>>>>> If you boot 5.15 *without* the "iommu=soft" argument, just
>>>>>> "intel_iommu=on", does that also break?
>>>>>>
>>>>>> Robin.
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-05-22 10:26                       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-22 11:01                         ` Robin Murphy
  2023-05-22 11:33                           ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2023-05-22 11:01 UTC (permalink / raw)
  To: Linux regressions mailing list, Jason Adriaanse, hch
  Cc: baolu.lu, iommu, linux-kernel, linux-pci

On 2023-05-22 11:26, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
> 
> I might be missing something, but it looks to me like this regression
> was never fixed in mainline. Which is strange, as we apparently had a
> patch from Robin that fixed the issue for the reporter.
> 
> Did it fall through the cracks or what am I missing?

Strictly, the regression itself has not been fixed - I guess it does 
just about qualify since the rather-out-of-date 
Documentation/arch/x86/x86_64/boot-options.rst does still say that 
iommu=soft "can be used to prevent the usage of an available hardware 
IOMMU", and that seems to be what has stopped happening here.

What it exposed was a latent issue that this particular device has never 
been properly supported for use with an IOMMU, and that's what I guessed 
at a fix for.

Thanks,
Robin.

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
> 
> #regzbot poke
> 
> On 25.04.23 15:58, Jason Adriaanse wrote:
>> I am happy to report that the change worked, this is what
>> drivers/pci/quirks.c looks like
>>
>> /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>                           quirk_dma_func1_alias);
>> /* https://bugzilla.kernel.org/show_bug.cgi?id=217218 */
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>                           quirk_dma_func1_alias);
>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>                           quirk_dma_func1_alias);
>>
>> Relevant output of dmesg -T with the new kernel running
>>
>> [Tue Apr 25 21:45:13 2023] scsi host0: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host1: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host2: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host3: ahci
>> [Tue Apr 25 21:45:13 2023] ata1: SATA max UDMA/133 abar m2048@0xf7d06000
>> port 0xf7d06100 irq 40
>> [Tue Apr 25 21:45:13 2023] ata2: SATA max UDMA/133 abar m2048@0xf7d06000
>> port 0xf7d06180 irq 40
>> [Tue Apr 25 21:45:13 2023] ata3: DUMMY
>> [Tue Apr 25 21:45:13 2023] ata4: DUMMY
>> [Tue Apr 25 21:45:13 2023] igb 0000:05:00.0 enp5s0: renamed from eth0
>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: AHCI 0001.0000 32 slots 4
>> ports 6 Gbps 0xf impl SATA mode
>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: flags: 64bit ncq sntf led
>> only pmp fbs pio slum part sxs
>> [Tue Apr 25 21:45:13 2023] scsi host4: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host5: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host6: ahci
>> [Tue Apr 25 21:45:13 2023] scsi host7: ahci
>> [Tue Apr 25 21:45:13 2023] ata5: SATA max UDMA/133 abar m2048@0xf7b10000
>> port 0xf7b10100 irq 41
>> [Tue Apr 25 21:45:13 2023] ata6: SATA max UDMA/133 abar m2048@0xf7b10000
>> port 0xf7b10180 irq 41
>> [Tue Apr 25 21:45:13 2023] ata7: SATA max UDMA/133 abar m2048@0xf7b10000
>> port 0xf7b10200 irq 41
>> [Tue Apr 25 21:45:13 2023] ata8: SATA max UDMA/133 abar m2048@0xf7b10000
>> port 0xf7b10280 irq 41
>> [Tue Apr 25 21:45:13 2023] usb 1-1: new high-speed USB device number 2
>> using ehci-pci
>> [Tue Apr 25 21:45:14 2023] usb 3-1: new high-speed USB device number 2
>> using ehci-pci
>> [Tue Apr 25 21:45:14 2023] ata8: SATA link up 6.0 Gbps (SStatus 133
>> SControl 300)
>> [Tue Apr 25 21:45:14 2023] ata6: SATA link up 6.0 Gbps (SStatus 133
>> SControl 300)
>> [Tue Apr 25 21:45:14 2023] ata7: SATA link up 6.0 Gbps (SStatus 133
>> SControl 300)
>> [Tue Apr 25 21:45:14 2023] ata5: SATA link up 6.0 Gbps (SStatus 133
>> SControl 300)
>> [Tue Apr 25 21:45:14 2023] ata7.00: ATA-9: WDC WD40EFRX-68WT0N0,
>> 80.00A80, max UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata6.00: ATA-9: WDC WD40EFRX-68WT0N0,
>> 80.00A80, max UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata8.00: ATA-9: WDC WD40EFRX-68WT0N0,
>> 80.00A80, max UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata5.00: ATA-10: CT2000BX500SSD1, M6CR030,
>> max UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata6.00: 7814037168 sectors, multi 0: LBA48
>> NCQ (depth 32), AA
>> [Tue Apr 25 21:45:14 2023] ata7.00: 7814037168 sectors, multi 0: LBA48
>> NCQ (depth 32), AA
>> [Tue Apr 25 21:45:14 2023] ata8.00: 7814037168 sectors, multi 0: LBA48
>> NCQ (depth 32), AA
>> [Tue Apr 25 21:45:14 2023] ata5.00: 3907029168 sectors, multi 1: LBA48
>> NCQ (depth 32), AA
>> [Tue Apr 25 21:45:14 2023] ata6.00: configured for UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata7.00: configured for UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata8.00: configured for UDMA/133
>> [Tue Apr 25 21:45:14 2023] ata1: SATA link down (SStatus 0 SControl 300)
>> [Tue Apr 25 21:45:14 2023] ata5.00: Features: Dev-Sleep
>> [Tue Apr 25 21:45:14 2023] ata5.00: configured for UDMA/133
>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device found, idVendor=8087,
>> idProduct=0024, bcdDevice= 0.00
>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device strings: Mfr=0,
>> Product=0, SerialNumber=0
>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: USB hub found
>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: 4 ports detected
>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device found, idVendor=8087,
>> idProduct=0024, bcdDevice= 0.00
>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device strings: Mfr=0,
>> Product=0, SerialNumber=0
>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: USB hub found
>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: 6 ports detected
>> [Tue Apr 25 21:45:14 2023] ata2: SATA link down (SStatus 0 SControl 300)
>> [Tue Apr 25 21:45:14 2023] scsi 4:0:0:0: Direct-Access ATA
>> CT2000BX500SSD1  030  PQ: 0 ANSI: 5
>> [Tue Apr 25 21:45:14 2023] scsi 5:0:0:0: Direct-Access ATA      WDC
>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>> [Tue Apr 25 21:45:14 2023] scsi 6:0:0:0: Direct-Access ATA      WDC
>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>> [Tue Apr 25 21:45:14 2023] scsi 7:0:0:0: Direct-Access ATA      WDC
>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>
>> Thanks everyone for all your help.
>>
>> Jason
>>
>>
>> On 25/04/2023 19:37, Robin Murphy wrote:
>>> On 2023-04-25 05:17, Jason Adriaanse wrote:
>>>> Ok great,
>>>>
>>>> I take it a change needs to be made in
>>>> drivers/pci/quirks.c
>>>> ?
>>>> I do not mind making the change locally here and letting you know if
>>>> it works or not.
>>>
>>> Indeed, something like this (make sure the IDs actually match what your
>>> device reports, I'm just guessing):
>>>
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 44cab813bf95..a9166e886b75 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -4161,6 +4161,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
>>>   /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>>                quirk_dma_func1_alias);
>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>> +             quirk_dma_func1_alias);
>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>>                quirk_dma_func1_alias);
>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
>>>
>>>
>>> Marvell themselves seem to lump the 88SE92xx products together as a
>>> closely-related family, so given that we do have quirks for 3 of the 4
>>> already, this one does rather seem conspicuous by its absence...
>>>
>>> Thanks,
>>> Robin.
>>>
>>>> On 24/04/2023 22:07, Robin Murphy wrote:
>>>>> On 2023-04-24 14:44, Jason Adriaanse wrote:
>>>>>> I took out "iommu=soft" and the server failed to boot, so yes it
>>>>>> does break.
>>>>>>
>>>>>> The first error was
>>>>>> ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)
>>>>>
>>>>> OK, great, that confirms the underlying issue existed all along, so
>>>>> the regression is only a change in who wins a fight between certain
>>>>> conflicting command-line arguments, which is arguably not so critical.
>>>>>
>>>>> The rest of the evidence points to 88SE9235 wanting the same phantom
>>>>> function quirk as most other Marvell controllers, since although
>>>>> it's apparently been half-fixed such that DMA for two of the ports
>>>>> is being correctly emitted from function 0 - given that you say two
>>>>> of the disks *are* detected OK - the other two are still claiming to
>>>>> be function 1 after all.
>>>>>
>>>>> Thanks,
>>>>> Robin.
>>>>>
>>>>>> On 24/04/2023 21:20, Robin Murphy wrote:
>>>>>>> On 2023-04-22 07:25, Jason Adriaanse wrote:
>>>>>>>> Hi Christoph,
>>>>>>>>
>>>>>>>> Sorry for my late reply, I have been on the road.
>>>>>>>>
>>>>>>>> So, if I boot with
>>>>>>>> intel_iommu=off
>>>>>>>> Then the server boots fine..although that is not a solution
>>>>>>>> because I need Intel iommu for virtualisation.
>>>>>>>>
>>>>>>>> Also, I build all my kernels with CONFIG_INTEL_IOMMU=y
>>>>>>>>
>>>>>>>
>>>>>>> If you boot 5.15 *without* the "iommu=soft" argument, just
>>>>>>> "intel_iommu=on", does that also break?
>>>>>>>
>>>>>>> Robin.
>>
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-05-22 11:01                         ` Robin Murphy
@ 2023-05-22 11:33                           ` Linux regression tracking (Thorsten Leemhuis)
  2023-06-02 13:07                             ` Thorsten Leemhuis
  0 siblings, 1 reply; 19+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-22 11:33 UTC (permalink / raw)
  To: Robin Murphy, Linux regressions mailing list, Jason Adriaanse,
	hch, Alex Williamson, Bjorn Helgaas
  Cc: baolu.lu, iommu, linux-kernel, linux-pci

On 22.05.23 13:01, Robin Murphy wrote:
> On 2023-05-22 11:26, Linux regression tracking (Thorsten Leemhuis) wrote:
>>
>> I might be missing something, but it looks to me like this regression
>> was never fixed in mainline. Which is strange, as we apparently had a
>> patch from Robin that fixed the issue for the reporter.
>>
>> Did it fall through the cracks or what am I missing?
> 
> Strictly, the regression itself has not been fixed - I guess it does
> just about qualify since the rather-out-of-date
> Documentation/arch/x86/x86_64/boot-options.rst does still say that
> iommu=soft "can be used to prevent the usage of an available hardware
> IOMMU", and that seems to be what has stopped happening here.
> 
> What it exposed was a latent issue that this particular device has never
> been properly supported for use with an IOMMU, and that's what I guessed
> at a fix for.

Thx for the summary. This sounds a lot like you have no interest in
submitting the quirk entry yourself (please correct me if I'm wrong).
Jason from looking at lore doesn't seem to be involved in kernel
development regularly. And I try to stay out of such waters as well, as
I try to draw a line there. Which leads to the question:

Who will now submit the quirk entry?

From "git blame" it seems Bjorn and Alex added most of the other quirk
entries for the Marvell controllers (both CCed now). Could one of you add
this one that Robin suggested in [1] as well?

/me wonders if they'd need a "Signed-off-by" from Robin for a one-liner
that is mainly copy-n-paste

Ciao, Thorsten

[1]
https://lore.kernel.org/all/1539e760-392f-a33e-436e-bbf043e79bfc@arm.com/

>> On 25.04.23 15:58, Jason Adriaanse wrote:
>>> I am happy to report that the change worked, this is what
>>> drivers/pci/quirks.c looks like
>>>
>>> /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>>                           quirk_dma_func1_alias);
>>> /* https://bugzilla.kernel.org/show_bug.cgi?id=217218 */
>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>>                           quirk_dma_func1_alias);
>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>>                           quirk_dma_func1_alias);
>>>
>>> Relevant output of dmesg -T with the new kernel running
>>>
>>> [Tue Apr 25 21:45:13 2023] scsi host0: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host1: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host2: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host3: ahci
>>> [Tue Apr 25 21:45:13 2023] ata1: SATA max UDMA/133 abar m2048@0xf7d06000
>>> port 0xf7d06100 irq 40
>>> [Tue Apr 25 21:45:13 2023] ata2: SATA max UDMA/133 abar m2048@0xf7d06000
>>> port 0xf7d06180 irq 40
>>> [Tue Apr 25 21:45:13 2023] ata3: DUMMY
>>> [Tue Apr 25 21:45:13 2023] ata4: DUMMY
>>> [Tue Apr 25 21:45:13 2023] igb 0000:05:00.0 enp5s0: renamed from eth0
>>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: AHCI 0001.0000 32 slots 4
>>> ports 6 Gbps 0xf impl SATA mode
>>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: flags: 64bit ncq sntf led
>>> only pmp fbs pio slum part sxs
>>> [Tue Apr 25 21:45:13 2023] scsi host4: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host5: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host6: ahci
>>> [Tue Apr 25 21:45:13 2023] scsi host7: ahci
>>> [Tue Apr 25 21:45:13 2023] ata5: SATA max UDMA/133 abar m2048@0xf7b10000
>>> port 0xf7b10100 irq 41
>>> [Tue Apr 25 21:45:13 2023] ata6: SATA max UDMA/133 abar m2048@0xf7b10000
>>> port 0xf7b10180 irq 41
>>> [Tue Apr 25 21:45:13 2023] ata7: SATA max UDMA/133 abar m2048@0xf7b10000
>>> port 0xf7b10200 irq 41
>>> [Tue Apr 25 21:45:13 2023] ata8: SATA max UDMA/133 abar m2048@0xf7b10000
>>> port 0xf7b10280 irq 41
>>> [Tue Apr 25 21:45:13 2023] usb 1-1: new high-speed USB device number 2
>>> using ehci-pci
>>> [Tue Apr 25 21:45:14 2023] usb 3-1: new high-speed USB device number 2
>>> using ehci-pci
>>> [Tue Apr 25 21:45:14 2023] ata8: SATA link up 6.0 Gbps (SStatus 133
>>> SControl 300)
>>> [Tue Apr 25 21:45:14 2023] ata6: SATA link up 6.0 Gbps (SStatus 133
>>> SControl 300)
>>> [Tue Apr 25 21:45:14 2023] ata7: SATA link up 6.0 Gbps (SStatus 133
>>> SControl 300)
>>> [Tue Apr 25 21:45:14 2023] ata5: SATA link up 6.0 Gbps (SStatus 133
>>> SControl 300)
>>> [Tue Apr 25 21:45:14 2023] ata7.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>> 80.00A80, max UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata6.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>> 80.00A80, max UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata8.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>> 80.00A80, max UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata5.00: ATA-10: CT2000BX500SSD1, M6CR030,
>>> max UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata6.00: 7814037168 sectors, multi 0: LBA48
>>> NCQ (depth 32), AA
>>> [Tue Apr 25 21:45:14 2023] ata7.00: 7814037168 sectors, multi 0: LBA48
>>> NCQ (depth 32), AA
>>> [Tue Apr 25 21:45:14 2023] ata8.00: 7814037168 sectors, multi 0: LBA48
>>> NCQ (depth 32), AA
>>> [Tue Apr 25 21:45:14 2023] ata5.00: 3907029168 sectors, multi 1: LBA48
>>> NCQ (depth 32), AA
>>> [Tue Apr 25 21:45:14 2023] ata6.00: configured for UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata7.00: configured for UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata8.00: configured for UDMA/133
>>> [Tue Apr 25 21:45:14 2023] ata1: SATA link down (SStatus 0 SControl 300)
>>> [Tue Apr 25 21:45:14 2023] ata5.00: Features: Dev-Sleep
>>> [Tue Apr 25 21:45:14 2023] ata5.00: configured for UDMA/133
>>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device found, idVendor=8087,
>>> idProduct=0024, bcdDevice= 0.00
>>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device strings: Mfr=0,
>>> Product=0, SerialNumber=0
>>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: USB hub found
>>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: 4 ports detected
>>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device found, idVendor=8087,
>>> idProduct=0024, bcdDevice= 0.00
>>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device strings: Mfr=0,
>>> Product=0, SerialNumber=0
>>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: USB hub found
>>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: 6 ports detected
>>> [Tue Apr 25 21:45:14 2023] ata2: SATA link down (SStatus 0 SControl 300)
>>> [Tue Apr 25 21:45:14 2023] scsi 4:0:0:0: Direct-Access ATA
>>> CT2000BX500SSD1  030  PQ: 0 ANSI: 5
>>> [Tue Apr 25 21:45:14 2023] scsi 5:0:0:0: Direct-Access ATA      WDC
>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>> [Tue Apr 25 21:45:14 2023] scsi 6:0:0:0: Direct-Access ATA      WDC
>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>> [Tue Apr 25 21:45:14 2023] scsi 7:0:0:0: Direct-Access ATA      WDC
>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>>
>>> Thanks everyone for all your help.
>>>
>>> Jason
>>>
>>>
>>> On 25/04/2023 19:37, Robin Murphy wrote:
>>>> On 2023-04-25 05:17, Jason Adriaanse wrote:
>>>>> Ok great,
>>>>>
>>>>> I take it a change needs to be made in
>>>>> drivers/pci/quirks.c
>>>>> ?
>>>>> I do not mind making the change locally here and letting you know if
>>>>> it works or not.
>>>>
>>>> Indeed, something like this (make sure the IDs actually match what your
>>>> device reports, I'm just guessing):
>>>>
>>>>
>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>> index 44cab813bf95..a9166e886b75 100644
>>>> --- a/drivers/pci/quirks.c
>>>> +++ b/drivers/pci/quirks.c
>>>> @@ -4161,6 +4161,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
>>>>   /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>>>                quirk_dma_func1_alias);
>>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>>> +             quirk_dma_func1_alias);
>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>>>                quirk_dma_func1_alias);
>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
>>>>
>>>>
>>>> Marvell themselves seem to lump the 88SE92xx products together as a
>>>> closely-related family, so given that we do have quirks for 3 of the 4
>>>> already, this one does rather seem conspicuous by its absence...
>>>>
>>>> Thanks,
>>>> Robin.
>>>>
>>>>> On 24/04/2023 22:07, Robin Murphy wrote:
>>>>>> On 2023-04-24 14:44, Jason Adriaanse wrote:
>>>>>>> I took out "iommu=soft" and the server failed to boot, so yes it
>>>>>>> does break.
>>>>>>>
>>>>>>> The first error was
>>>>>>> ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)
>>>>>>
>>>>>> OK, great, that confirms the underlying issue existed all along, so
>>>>>> the regression is only a change in who wins a fight between certain
>>>>>> conflicting command-line arguments, which is arguably not so
>>>>>> critical.
>>>>>>
>>>>>> The rest of the evidence points to 88SE9235 wanting the same phantom
>>>>>> function quirk as most other Marvell controllers, since although
>>>>>> it's apparently been half-fixed such that DMA for two of the ports
>>>>>> is being correctly emitted from function 0 - given that you say two
>>>>>> of the disks *are* detected OK - the other two are still claiming to
>>>>>> be function 1 after all.
>>>>>>
>>>>>> Thanks,
>>>>>> Robin.
>>>>>>
>>>>>>> On 24/04/2023 21:20, Robin Murphy wrote:
>>>>>>>> On 2023-04-22 07:25, Jason Adriaanse wrote:
>>>>>>>>> Hi Christoph,
>>>>>>>>>
>>>>>>>>> Sorry for my late reply, I have been on the road.
>>>>>>>>>
>>>>>>>>> So, if I boot with
>>>>>>>>> intel_iommu=off
>>>>>>>>> Then the server boots fine..although that is not a solution
>>>>>>>>> because I need Intel iommu for virtualisation.
>>>>>>>>>
>>>>>>>>> Also, I build all my kernels with CONFIG_INTEL_IOMMU=y
>>>>>>>>>
>>>>>>>>
>>>>>>>> If you boot 5.15 *without* the "iommu=soft" argument, just
>>>>>>>> "intel_iommu=on", does that also break?
>>>>>>>>
>>>>>>>> Robin.
>>>
>>>
> 
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-05-22 11:33                           ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-02 13:07                             ` Thorsten Leemhuis
  2023-06-06  9:24                               ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Thorsten Leemhuis @ 2023-06-02 13:07 UTC (permalink / raw)
  To: Robin Murphy, Linux regressions mailing list, Jason Adriaanse,
	hch, Alex Williamson, Bjorn Helgaas
  Cc: baolu.lu, iommu, linux-kernel, linux-pci

Christoph, could you do me a favor and...

On 22.05.23 13:33, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 22.05.23 13:01, Robin Murphy wrote:
>> On 2023-05-22 11:26, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>
>>> I might be missing something, but it looks to me like this regression
>>> was never fixed in mainline. Which is strange, as we apparently had a
>>> patch from Robin that fixed the issue for the reporter.
>>>
>>> Did it fall through the cracks or what am I missing?
>>
>> Strictly, the regression itself has not been fixed - I guess it does
>> just about qualify since the rather-out-of-date
>> Documentation/arch/x86/x86_64/boot-options.rst does still say that
>> iommu=soft "can be used to prevent the usage of an available hardware
>> IOMMU", and that seems to be what has stopped happening here.
>>
>> What it exposed was a latent issue that this particular device has never
>> been properly supported for use with an IOMMU, and that's what I guessed
>> at a fix for.
> 
> Thx for the summary. This sounds a lot like you have no interest in
> submitting the quirk entry yourself (please correct me if I'm wrong).
> Jason from looking at lore doesn't seem to be involved in kernel
> development regularly. And I try to stay out of such waters as well, as
> I try to draw a line there. Which leads to the question:
> 
> Who will now submit the quirk entry?
> 
> From "git blame" it seems Bjorn and Alex added most of the other quirk
> entries for the Marvell controllers (both CCed now). Could one of you add
> this one that Robin suggested in [1] as well?

...submit that quirk, as Bjorn and Alex apparently didn't pick this up?
I could do so myself, but prefer to leave that to people that actually
know what they are doing -- and thus can also handle problems later, in
case any show up. And strictly speaking it apparently was you who caused
this regression with 78013eaadf6 ("x86: remove the IOMMU table
infrastructure").

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

> /me wonders if they'd need a "Signed-off-by" from Robin for a one-liner
> that is mainly copy-n-paste
> 
> Ciao, Thorsten
> 
> [1]
> https://lore.kernel.org/all/1539e760-392f-a33e-436e-bbf043e79bfc@arm.com/
> 
>>> On 25.04.23 15:58, Jason Adriaanse wrote:
>>>> I am happy to report that the change worked, this is what
>>>> drivers/pci/quirks.c looks like
>>>>
>>>> /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>>>                           quirk_dma_func1_alias);
>>>> /* https://bugzilla.kernel.org/show_bug.cgi?id=217218 */
>>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>>>                           quirk_dma_func1_alias);
>>>> DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>>>                           quirk_dma_func1_alias);
>>>>
>>>> Relevant output of dmesg -T with the new kernel running
>>>>
>>>> [Tue Apr 25 21:45:13 2023] scsi host0: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host1: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host2: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host3: ahci
>>>> [Tue Apr 25 21:45:13 2023] ata1: SATA max UDMA/133 abar m2048@0xf7d06000
>>>> port 0xf7d06100 irq 40
>>>> [Tue Apr 25 21:45:13 2023] ata2: SATA max UDMA/133 abar m2048@0xf7d06000
>>>> port 0xf7d06180 irq 40
>>>> [Tue Apr 25 21:45:13 2023] ata3: DUMMY
>>>> [Tue Apr 25 21:45:13 2023] ata4: DUMMY
>>>> [Tue Apr 25 21:45:13 2023] igb 0000:05:00.0 enp5s0: renamed from eth0
>>>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: AHCI 0001.0000 32 slots 4
>>>> ports 6 Gbps 0xf impl SATA mode
>>>> [Tue Apr 25 21:45:13 2023] ahci 0000:07:00.0: flags: 64bit ncq sntf led
>>>> only pmp fbs pio slum part sxs
>>>> [Tue Apr 25 21:45:13 2023] scsi host4: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host5: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host6: ahci
>>>> [Tue Apr 25 21:45:13 2023] scsi host7: ahci
>>>> [Tue Apr 25 21:45:13 2023] ata5: SATA max UDMA/133 abar m2048@0xf7b10000
>>>> port 0xf7b10100 irq 41
>>>> [Tue Apr 25 21:45:13 2023] ata6: SATA max UDMA/133 abar m2048@0xf7b10000
>>>> port 0xf7b10180 irq 41
>>>> [Tue Apr 25 21:45:13 2023] ata7: SATA max UDMA/133 abar m2048@0xf7b10000
>>>> port 0xf7b10200 irq 41
>>>> [Tue Apr 25 21:45:13 2023] ata8: SATA max UDMA/133 abar m2048@0xf7b10000
>>>> port 0xf7b10280 irq 41
>>>> [Tue Apr 25 21:45:13 2023] usb 1-1: new high-speed USB device number 2
>>>> using ehci-pci
>>>> [Tue Apr 25 21:45:14 2023] usb 3-1: new high-speed USB device number 2
>>>> using ehci-pci
>>>> [Tue Apr 25 21:45:14 2023] ata8: SATA link up 6.0 Gbps (SStatus 133
>>>> SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] ata6: SATA link up 6.0 Gbps (SStatus 133
>>>> SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] ata7: SATA link up 6.0 Gbps (SStatus 133
>>>> SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] ata5: SATA link up 6.0 Gbps (SStatus 133
>>>> SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] ata7.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>>> 80.00A80, max UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata6.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>>> 80.00A80, max UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata8.00: ATA-9: WDC WD40EFRX-68WT0N0,
>>>> 80.00A80, max UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata5.00: ATA-10: CT2000BX500SSD1, M6CR030,
>>>> max UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata6.00: 7814037168 sectors, multi 0: LBA48
>>>> NCQ (depth 32), AA
>>>> [Tue Apr 25 21:45:14 2023] ata7.00: 7814037168 sectors, multi 0: LBA48
>>>> NCQ (depth 32), AA
>>>> [Tue Apr 25 21:45:14 2023] ata8.00: 7814037168 sectors, multi 0: LBA48
>>>> NCQ (depth 32), AA
>>>> [Tue Apr 25 21:45:14 2023] ata5.00: 3907029168 sectors, multi 1: LBA48
>>>> NCQ (depth 32), AA
>>>> [Tue Apr 25 21:45:14 2023] ata6.00: configured for UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata7.00: configured for UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata8.00: configured for UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] ata1: SATA link down (SStatus 0 SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] ata5.00: Features: Dev-Sleep
>>>> [Tue Apr 25 21:45:14 2023] ata5.00: configured for UDMA/133
>>>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device found, idVendor=8087,
>>>> idProduct=0024, bcdDevice= 0.00
>>>> [Tue Apr 25 21:45:14 2023] usb 1-1: New USB device strings: Mfr=0,
>>>> Product=0, SerialNumber=0
>>>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: USB hub found
>>>> [Tue Apr 25 21:45:14 2023] hub 1-1:1.0: 4 ports detected
>>>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device found, idVendor=8087,
>>>> idProduct=0024, bcdDevice= 0.00
>>>> [Tue Apr 25 21:45:14 2023] usb 3-1: New USB device strings: Mfr=0,
>>>> Product=0, SerialNumber=0
>>>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: USB hub found
>>>> [Tue Apr 25 21:45:14 2023] hub 3-1:1.0: 6 ports detected
>>>> [Tue Apr 25 21:45:14 2023] ata2: SATA link down (SStatus 0 SControl 300)
>>>> [Tue Apr 25 21:45:14 2023] scsi 4:0:0:0: Direct-Access ATA
>>>> CT2000BX500SSD1  030  PQ: 0 ANSI: 5
>>>> [Tue Apr 25 21:45:14 2023] scsi 5:0:0:0: Direct-Access ATA      WDC
>>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>>> [Tue Apr 25 21:45:14 2023] scsi 6:0:0:0: Direct-Access ATA      WDC
>>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>>> [Tue Apr 25 21:45:14 2023] scsi 7:0:0:0: Direct-Access ATA      WDC
>>>> WD40EFRX-68W 0A80 PQ: 0 ANSI: 5
>>>>
>>>> Thanks everyone for all your help.
>>>>
>>>> Jason
>>>>
>>>>
>>>> On 25/04/2023 19:37, Robin Murphy wrote:
>>>>> On 2023-04-25 05:17, Jason Adriaanse wrote:
>>>>>> Ok great,
>>>>>>
>>>>>> I take it a change needs to be made in
>>>>>> drivers/pci/quirks.c
>>>>>> ?
>>>>>> I do not mind making the change locally here and letting you know if
>>>>>> it works or not.
>>>>>
>>>>> Indeed, something like this (make sure the IDs actually match what your
>>>>> device reports, I'm just guessing):
>>>>>
>>>>>
>>>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>>>> index 44cab813bf95..a9166e886b75 100644
>>>>> --- a/drivers/pci/quirks.c
>>>>> +++ b/drivers/pci/quirks.c
>>>>> @@ -4161,6 +4161,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9220,
>>>>>   /* https://bugzilla.kernel.org/show_bug.cgi?id=42679#c49 */
>>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
>>>>>                quirk_dma_func1_alias);
>>>>> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9235,
>>>>> +             quirk_dma_func1_alias);
>>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
>>>>>                quirk_dma_func1_alias);
>>>>>   DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
>>>>>
>>>>>
>>>>> Marvell themselves seem to lump the 88SE92xx products together as a
>>>>> closely-related family, so given that we do have quirks for 3 of the 4
>>>>> already, this one does rather seem conspicuous by its absence...
>>>>>
>>>>> Thanks,
>>>>> Robin.
>>>>>
>>>>>> On 24/04/2023 22:07, Robin Murphy wrote:
>>>>>>> On 2023-04-24 14:44, Jason Adriaanse wrote:
>>>>>>>> I took out "iommu=soft" and the server failed to boot, so yes it
>>>>>>>> does break.
>>>>>>>>
>>>>>>>> The first error was
>>>>>>>> ata7.00: Failed to IDENTIFY (INIT_DEV_PARAMS failed , err_mask=0x80)
>>>>>>>
>>>>>>> OK, great, that confirms the underlying issue existed all along, so
>>>>>>> the regression is only a change in who wins a fight between certain
>>>>>>> conflicting command-line arguments, which is arguably not so
>>>>>>> critical.
>>>>>>>
>>>>>>> The rest of the evidence points to 88SE9235 wanting the same phantom
>>>>>>> function quirk as most other Marvell controllers, since although
>>>>>>> it's apparently been half-fixed such that DMA for two of the ports
>>>>>>> is being correctly emitted from function 0 - given that you say two
>>>>>>> of the disks *are* detected OK - the other two are still claiming to
>>>>>>> be function 1 after all.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Robin.
>>>>>>>
>>>>>>>> On 24/04/2023 21:20, Robin Murphy wrote:
>>>>>>>>> On 2023-04-22 07:25, Jason Adriaanse wrote:
>>>>>>>>>> Hi Christoph,
>>>>>>>>>>
>>>>>>>>>> Sorry for my late reply, I have been on the road.
>>>>>>>>>>
>>>>>>>>>> So, if I boot with
>>>>>>>>>> intel_iommu=off
>>>>>>>>>> then the server boots fine, although that is not a solution
>>>>>>>>>> because I need the Intel IOMMU for virtualisation.
>>>>>>>>>>
>>>>>>>>>> Also, I build all my kernels with CONFIG_INTEL_IOMMU=y
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you boot 5.15 *without* the "iommu=soft" argument, just
>>>>>>>>> "intel_iommu=on", does that also break?
>>>>>>>>>
>>>>>>>>> Robin.
>>>>
>>>>
>>
>>
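
For anyone following along, here is a rough userspace sketch of what the
"function 1 alias" discussed above amounts to. It is not the kernel code:
the macros mirror the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC encoding, while
requester_id(), func1_alias_devfn() and format_rid() are names made up
purely for illustration. The 07:00.0 address used in the note below is an
assumption based on the DMAR fault quoted earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Same bit layout as the kernel's devfn macros: a 5-bit device (slot)
 * number in the upper bits and a 3-bit function number in the lower bits.
 */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)

/* Requester ID as the IOMMU sees it: bus in the high byte, devfn in the low. */
static uint16_t requester_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((unsigned int)bus << 8) | devfn);
}

/*
 * Hypothetical helper: the devfn that a function 1 alias would add for a
 * device, i.e. the same slot with the function number forced to 1.
 */
static uint8_t func1_alias_devfn(uint8_t devfn)
{
	return (uint8_t)PCI_DEVFN(PCI_SLOT(devfn), 1);
}

/* Hypothetical helper: format a requester ID as bus:dev.fn for comparison with logs. */
static void format_rid(uint16_t rid, char *buf, size_t len)
{
	unsigned int devfn = rid & 0xffu;

	snprintf(buf, len, "%02x:%02x.%u",
		 (unsigned int)(rid >> 8),
		 (unsigned int)PCI_SLOT(devfn),
		 (unsigned int)PCI_FUNC(devfn));
}
```

Assuming the controller enumerates as 07:00.0, func1_alias_devfn() gives
devfn 0x01, so the aliased requester ID is 0x0701, which format_rid()
renders as 07:00.1. That is the ID the IOMMU would also have to accept
for DMA issued by the ports that still claim to be function 1.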


* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-06-02 13:07                             ` Thorsten Leemhuis
@ 2023-06-06  9:24                               ` Christoph Hellwig
  2023-06-06 10:26                                 ` Jason Adriaanse
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2023-06-06  9:24 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Robin Murphy, Jason Adriaanse, hch, Alex Williamson,
	Bjorn Helgaas, baolu.lu, iommu, linux-kernel, linux-pci

On Fri, Jun 02, 2023 at 03:07:19PM +0200, Thorsten Leemhuis wrote:
> Christoph, could you do me a favor and...

> ...submit that quirk, as Bjorn and Alex apparently didn't pick this up?
> I could do so myself, but prefer to leave that to people that actually
> know what they are doing -- and thus can also handle problems later, in
> case any show up. And strictly speaking it apparently was you who caused
> this regression with 78013eaadf6 ("x86: remove the IOMMU table
> infrastructure").

Well, Robin posted it so I think he should also finish it up and get
the credit.  Robin, can you send the quirk with a formal signoff?



* Re: [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235
  2023-06-06  9:24                               ` Christoph Hellwig
@ 2023-06-06 10:26                                 ` Jason Adriaanse
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Adriaanse @ 2023-06-06 10:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: regressions, Robin Murphy, Alex Williamson, Bjorn Helgaas,
	baolu.lu, iommu, linux-kernel, linux-pci

I'll take a sub credit 🙂

On 6 Jun 2023, at 11:24 am, Christoph Hellwig <hch@lst.de> wrote:
>On Fri, Jun 02, 2023 at 03:07:19PM +0200, Thorsten Leemhuis wrote:
>> Christoph, could you do me a favor and...
>
>> ...submit that quirk, as Bjorn and Alex apparently didn't pick this up?
>> I could do so myself, but prefer to leave that to people that actually
>> know what they are doing -- and thus can also handle problems later, in
>> case any show up. And strictly speaking it apparently was you who caused
>> this regression with 78013eaadf6 ("x86: remove the IOMMU table
>> infrastructure").
>
>Well, Robin posted it so I think he should also finish it up and get
>the credit.  Robin, can you send the quirk with a formal signoff?



end of thread, other threads:[~2023-06-06 10:28 UTC | newest]

Thread overview: 19+ messages
     [not found] <a79ea7f5-6a41-a6c9-cfec-ba01aa2a3cfa@leemhuis.info>
2023-03-28  1:22 ` [regression] Bug 217218 - Trying to boot Linux version 6-2.2 kernel with Marvell SATA controller 88SE9235 Christoph Hellwig
2023-03-30 12:18   ` Robin Murphy
2023-03-31  2:20     ` Jason Adriaanse
2023-04-16  6:55       ` Christoph Hellwig
2023-04-22  6:25         ` Jason Adriaanse
2023-04-24 13:20           ` Robin Murphy
2023-04-24 13:44             ` Jason Adriaanse
2023-04-24 14:07               ` Robin Murphy
2023-04-25  4:17                 ` Jason Adriaanse
2023-04-25 11:37                   ` Robin Murphy
2023-04-25 13:58                     ` Jason Adriaanse
2023-05-22 10:26                       ` Linux regression tracking (Thorsten Leemhuis)
2023-05-22 11:01                         ` Robin Murphy
2023-05-22 11:33                           ` Linux regression tracking (Thorsten Leemhuis)
2023-06-02 13:07                             ` Thorsten Leemhuis
2023-06-06  9:24                               ` Christoph Hellwig
2023-06-06 10:26                                 ` Jason Adriaanse
2023-04-16  6:41     ` Christoph Hellwig
2023-04-17 11:21       ` Robin Murphy
