All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Walle <michael@walle.cc>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	Paul Menzel <pmenzel@molgen.mpg.de>
Subject: Re: [PATCH v2] PCI: Fix Intel i210 by avoiding overlapping of BARs
Date: Thu, 23 Dec 2021 10:27:15 +0100	[thread overview]
Message-ID: <39ecedffaf8e2ee931379de7e2f7924f@walle.cc> (raw)
In-Reply-To: <20211221174808.GA1094860@bhelgaas>

Hi Bjorn,

Am 2021-12-21 18:48, schrieb Bjorn Helgaas:
> [+to Jesse, Tony for Intel advice;
> beginning of thread:
> https://lore.kernel.org/all/20201230185317.30915-1-michael@walle.cc/]
> 
> On Mon, Dec 20, 2021 at 06:43:03PM +0100, Michael Walle wrote:
>> ...
>> ping #4
>> 
>> In a few days this is a year old. Please have a look at it and
>> either add my quirk patch or apply your patch. This is still
>> breaking i210 on my board.
>> 
>> TBH, this is really frustrating.
> 
> You are right to be frustrated.  I'm very sorry that I have dropped
> the ball on this.  Thanks for reminding me *again*.
> 
> I think we agree that this looks like an I210 defect.  I210 should
> ignore the ROM BAR contents unless PCI_ROM_ADDRESS_ENABLE is set.  It
> would be great if an Intel person could confirm/deny this and supply
> an erratum reference and verify the affected device IDs.
> 
> It seems that when the BARs are programmed like this:
> 
>   BAR 0: 0x40000000 (32-bit, non-prefetchable) [size=1M]
>   BAR 3: 0x40200000 (32-bit, non-prefetchable) [size=16K]
>   ROM:   0x40200000 (disabled) [size=1M]
> 
> networking doesn't work at all and the transmit queue times out.
> 
> Linux assigns non-overlapping address space to the ROM BAR, but
> pci_std_update_resource() currently doesn't update the BAR itself
> unless it is enabled.
> 
> My proposal [1] worked around the defect by always updating the BAR,
> but there's no clue that this covers up the I210 issue, so it remains
> as sort of a land mine.  A future change could re-expose the problem,
> so I don't think this was a good approach.
> 
> Your original patch [2] makes it clear that it's an issue with I210,
> but there's an implicit connection between the normal BAR update path
> (which skips the actual BAR write) and the quirk that does the BAR
> write:
> 
>   <enumeration resource assignment>
>     ...
>       pci_assign_resource
>         pci_update_resource
>           pci_std_update_resource
>             if (ROM && ROM-disabled)
>               return
>             pci_write_config_dword      # ROM BAR update (skipped)
> 
>   pci_fixup_write_rom_bar               # final fixup
>     pci_write_config_dword              # ROM BAR update
> 
> In the boot-time resource assignment path, this works fine, but if
> pci_assign_resource() is called from pci_map_rom(), the fixup will not
> happen, so we could still have problem.

I'm not sure I follow. pci_map_rom() should work fine. That was also
the workaround IIRC, that is, enable the rom via sysfs. pci_map_rom()
will call pci_enable_rom(), which should be the same as the fixup.
If memory serves correctly, that is where I shamelessly copied the
code from ;)

btw. the problem is not in pci_assign_resource() which assigns the
correct offsets, but that they are not written to the PCI card.
Eg. see lspci:

# lspci -s 2:1:0 -v
0002:01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network 
Connection (rev 03)
	Subsystem: Hewlett-Packard Company Ethernet I210-T1 GbE NIC
	Flags: bus master, fast devsel, latency 0, IRQ 34, IOMMU group 8
	Memory at 8840000000 (32-bit, non-prefetchable) [size=1M]
	Memory at 8840200000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at 8840100000 [disabled] [size=1M]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-de-ad-ff-ff-be-ef-04
	Capabilities: [1a0] Transaction Processing Hints
	Kernel driver in use: igb
lspci: Unable to load libkmod resources: error -2

# lspci -s 2:1:0 -xx
0002:01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network 
Connection (rev 03)
00: 86 80 33 15 06 04 10 00 03 00 00 02 08 00 00 00
10: 00 00 00 40 00 00 00 00 00 00 00 00 00 00 20 40
20: 00 00 00 00 00 00 00 00 00 00 00 00 3c 10 03 00
30: 00 00 20 40 40 00 00 00 00 00 00 00 22 01 00 00

Note the difference between "Expansion ROM at 8840100000" (assigned
by pci_assign_resource() I guess) and the actual value at offset
0x30: 0x40200000. The latter will be updated either by pci_enable_rom()
or my pci fixup quirk.

> If we tweaked pci_std_update_resource() to take account of this
> defect, I think we could cover that path, too.
> 
> Can you try the patch below?

I tried, but it doesn't work because the fixup function is called
after pci_std_update_resource(), thus dev->rom_bar_overlap is still
0.

Thanks,
-michael

WARNING: multiple messages have this Message-ID (diff)
From: Michael Walle <michael@walle.cc>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [PATCH v2] PCI: Fix Intel i210 by avoiding overlapping of BARs
Date: Thu, 23 Dec 2021 10:27:15 +0100	[thread overview]
Message-ID: <39ecedffaf8e2ee931379de7e2f7924f@walle.cc> (raw)
In-Reply-To: <20211221174808.GA1094860@bhelgaas>

Hi Bjorn,

Am 2021-12-21 18:48, schrieb Bjorn Helgaas:
> [+to Jesse, Tony for Intel advice;
> beginning of thread:
> https://lore.kernel.org/all/20201230185317.30915-1-michael at walle.cc/]
> 
> On Mon, Dec 20, 2021 at 06:43:03PM +0100, Michael Walle wrote:
>> ...
>> ping #4
>> 
>> In a few days this is a year old. Please have a look at it and
>> either add my quirk patch or apply your patch. This is still
>> breaking i210 on my board.
>> 
>> TBH, this is really frustrating.
> 
> You are right to be frustrated.  I'm very sorry that I have dropped
> the ball on this.  Thanks for reminding me *again*.
> 
> I think we agree that this looks like an I210 defect.  I210 should
> ignore the ROM BAR contents unless PCI_ROM_ADDRESS_ENABLE is set.  It
> would be great if an Intel person could confirm/deny this and supply
> an erratum reference and verify the affected device IDs.
> 
> It seems that when the BARs are programmed like this:
> 
>   BAR 0: 0x40000000 (32-bit, non-prefetchable) [size=1M]
>   BAR 3: 0x40200000 (32-bit, non-prefetchable) [size=16K]
>   ROM:   0x40200000 (disabled) [size=1M]
> 
> networking doesn't work at all and the transmit queue times out.
> 
> Linux assigns non-overlapping address space to the ROM BAR, but
> pci_std_update_resource() currently doesn't update the BAR itself
> unless it is enabled.
> 
> My proposal [1] worked around the defect by always updating the BAR,
> but there's no clue that this covers up the I210 issue, so it remains
> as sort of a land mine.  A future change could re-expose the problem,
> so I don't think this was a good approach.
> 
> Your original patch [2] makes it clear that it's an issue with I210,
> but there's an implicit connection between the normal BAR update path
> (which skips the actual BAR write) and the quirk that does the BAR
> write:
> 
>   <enumeration resource assignment>
>     ...
>       pci_assign_resource
>         pci_update_resource
>           pci_std_update_resource
>             if (ROM && ROM-disabled)
>               return
>             pci_write_config_dword      # ROM BAR update (skipped)
> 
>   pci_fixup_write_rom_bar               # final fixup
>     pci_write_config_dword              # ROM BAR update
> 
> In the boot-time resource assignment path, this works fine, but if
> pci_assign_resource() is called from pci_map_rom(), the fixup will not
> happen, so we could still have problem.

I'm not sure I follow. pci_map_rom() should work fine. That was also
the workaround IIRC, that is, enable the rom via sysfs. pci_map_rom()
will call pci_enable_rom(), which should be the same as the fixup.
If memory serves correctly, that is where I shamelessly copied the
code from ;)

btw. the problem is not in pci_assign_resource() which assigns the
correct offsets, but that they are not written to the PCI card.
Eg. see lspci:

# lspci -s 2:1:0 -v
0002:01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network 
Connection (rev 03)
	Subsystem: Hewlett-Packard Company Ethernet I210-T1 GbE NIC
	Flags: bus master, fast devsel, latency 0, IRQ 34, IOMMU group 8
	Memory at 8840000000 (32-bit, non-prefetchable) [size=1M]
	Memory at 8840200000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at 8840100000 [disabled] [size=1M]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 00-de-ad-ff-ff-be-ef-04
	Capabilities: [1a0] Transaction Processing Hints
	Kernel driver in use: igb
lspci: Unable to load libkmod resources: error -2

# lspci -s 2:1:0 -xx
0002:01:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network 
Connection (rev 03)
00: 86 80 33 15 06 04 10 00 03 00 00 02 08 00 00 00
10: 00 00 00 40 00 00 00 00 00 00 00 00 00 00 20 40
20: 00 00 00 00 00 00 00 00 00 00 00 00 3c 10 03 00
30: 00 00 20 40 40 00 00 00 00 00 00 00 22 01 00 00

Note the difference between "Expansion ROM at 8840100000" (assigned
by pci_assign_resource() I guess) and the actual value at offset
0x30: 0x40200000. The latter will be updated either by pci_enable_rom()
or my pci fixup quirk.

> If we tweaked pci_std_update_resource() to take account of this
> defect, I think we could cover that path, too.
> 
> Can you try the patch below?

I tried, but it doesn't work because the fixup function is called
after pci_std_update_resource(), thus dev->rom_bar_overlap is still
0.

Thanks,
-michael

  reply	other threads:[~2021-12-23  9:27 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-30 18:53 [PATCH v2] PCI: Fix Intel i210 by avoiding overlapping of BARs Michael Walle
2020-12-30 18:53 ` [Intel-wired-lan] " Michael Walle
2021-01-08 21:20 ` Bjorn Helgaas
2021-01-08 21:20   ` [Intel-wired-lan] " Bjorn Helgaas
2021-01-09 18:31   ` Michael Walle
2021-01-09 18:31     ` [Intel-wired-lan] " Michael Walle
2021-01-12 22:58     ` Bjorn Helgaas
2021-01-12 22:58       ` [Intel-wired-lan] " Bjorn Helgaas
2021-01-12 23:32       ` Michael Walle
2021-01-12 23:32         ` [Intel-wired-lan] " Michael Walle
2021-01-15 23:57         ` Bjorn Helgaas
2021-01-15 23:57           ` [Intel-wired-lan] " Bjorn Helgaas
2021-01-17 19:27           ` Michael Walle
2021-01-17 19:27             ` [Intel-wired-lan] " Michael Walle
2021-02-01 19:49             ` Michael Walle
2021-02-01 19:49               ` [Intel-wired-lan] " Michael Walle
2021-02-01 22:20               ` Bjorn Helgaas
2021-02-01 22:20                 ` [Intel-wired-lan] " Bjorn Helgaas
2021-03-15 21:51                 ` Michael Walle
2021-03-15 21:51                   ` [Intel-wired-lan] " Michael Walle
2021-08-20 15:12                   ` Michael Walle
2021-08-20 15:12                     ` [Intel-wired-lan] " Michael Walle
2021-12-20 17:43                     ` Michael Walle
2021-12-20 17:43                       ` [Intel-wired-lan] " Michael Walle
2021-12-21 17:48                       ` Bjorn Helgaas
2021-12-21 17:48                         ` [Intel-wired-lan] " Bjorn Helgaas
2021-12-23  9:27                         ` Michael Walle [this message]
2021-12-23  9:27                           ` Michael Walle
2021-12-23 16:37                           ` Bjorn Helgaas
2021-12-23 16:37                             ` [Intel-wired-lan] " Bjorn Helgaas
2021-12-23 18:12                             ` Michael Walle
2021-12-23 18:12                               ` [Intel-wired-lan] " Michael Walle
2022-01-12 14:50                               ` Bjorn Helgaas
2022-01-12 14:50                                 ` [Intel-wired-lan] " Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=39ecedffaf8e2ee931379de7e2f7924f@walle.cc \
    --to=michael@walle.cc \
    --cc=anthony.l.nguyen@intel.com \
    --cc=bhelgaas@google.com \
    --cc=helgaas@kernel.org \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jesse.brandeburg@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=pmenzel@molgen.mpg.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.