From: Bjorn Helgaas <bhelgaas@google.com>
To: Daniel J Blueman <daniel@numascale.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Jiang Liu <jiang.liu@linux.intel.com>,
	H Peter Anvin <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Steffen Persvold <sp@numascale.com>,
	"x86@kernel.org" <x86@kernel.org>,
	Yinghai Lu <yinghai@kernel.org>
Subject: Re: PCIe 32-bit MMIO exhaustion
Date: Thu, 29 Jan 2015 09:23:29 -0600
Message-ID: <CAErSpo4t94BXYhhE+bh5-3_PsdKTSa113beqqz14W+_emdCGMQ@mail.gmail.com>
In-Reply-To: <54C8A10B.3070207@numascale.com>

[+cc Yinghai]

Hi Daniel,

On Wed, Jan 28, 2015 at 2:42 AM, Daniel J Blueman <daniel@numascale.com> wrote:
> On systems with a large number of PCI devices, we're running out of 32-bit
> MMIO space, e.g. one quad-port NetXtreme-2 adapter takes 128MB of space [1].
>
> An erratum to the PCIe 2.1 spec provides guidance on limitations with 64-bit
> non-prefetchable BARs (since bridges have only 32-bit non-prefetchable
> ranges) stating that vendors can enable the prefetchable bit in BARs under
> certain circumstances to allow 64-bit allocation [2].
>
> The problem with that is that vendors can't know a priori what hosts their
> products will be in, so they can't just advertise prefetchable 64-bit BARs.
> What can be done is for system firmware to use the 64-bit prefetchable BAR
> in bridges and assign a 64-bit non-prefetchable device BAR into that area,
> where it is safe to do so (following the guidance).
>
> At present, Linux denies such allocations [3] and disables the BARs. It
> seems a practical solution to allow them if the firmware believes it is
> safe.
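
As a quick check on the 128MB figure, the BAR sizes in the lspci
listing quoted in [1] can be summed. This is only a sketch: the helper
name total_bar_bytes and the unit table are illustrative assumptions,
and the sample text holds just the four "Memory at" lines from [1].

```python
import re

# Unit suffixes as they appear in lspci "[size=...]" fields (assumed: K, M, G).
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
SIZE = re.compile(r"\[size=(\d+)([KMG]?)\]")

def total_bar_bytes(lspci_text):
    """Sum every [size=...] field found in the given lspci output."""
    total = 0
    for number, unit in SIZE.findall(lspci_text):
        total += int(number) * UNITS[unit]
    return total

# The four memory BARs from the listing in [1], one per port.
sample = """\
Memory at e6000000 (64-bit, non-prefetchable) [size=32M]
Memory at e8000000 (64-bit, non-prefetchable) [size=32M]
Memory at ea000000 (64-bit, non-prefetchable) [size=32M]
Memory at ec000000 (64-bit, non-prefetchable) [size=32M]
"""

total = total_bar_bytes(sample)
print(f"{total} bytes = {total >> 20} MiB")
```

Four ports at 32M each account for the 128MB that one adapter takes
out of the 32-bit MMIO space.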

This particular message ([3]):

> pci 0002:01:00.0: BAR 0: [mem size 0x00002000 64bit] conflicts with PCI Bus
> 0002:00 [mem 0x10020000000-0x10027ffffff pref]

is misleading at best and likely a symptom of a bug.  We printed the
*size* of BAR 0, not an address, which means we haven't assigned space
for the BAR.  That means it should not conflict with anything.
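
To make the size-versus-address distinction concrete, here is a rough
sketch that pulls both resources out of the line in [3] and reports
which one carries an address. The regular expression is an assumption
that covers only the two forms visible in that one message, and the
names RESOURCE and parse_resources are made up for illustration.

```python
import re

# Matches either "[mem size 0x... flags]" (no address assigned) or
# "[mem 0xSTART-0xEND flags]" (address range assigned).
RESOURCE = re.compile(
    r"\[mem (?:size 0x(?P<size>[0-9a-f]+)"
    r"|0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+))"
    r"(?P<flags>[^\]]*)\]"
)

def parse_resources(line):
    """Return one dict per memory resource found in a log line."""
    found = []
    for m in RESOURCE.finditer(line):
        flags = m.group("flags").split()
        if m.group("size") is not None:
            # Only a size was printed, so no address range is known.
            found.append({"assigned": False,
                          "size": int(m.group("size"), 16),
                          "flags": flags})
        else:
            start = int(m.group("start"), 16)
            end = int(m.group("end"), 16)
            found.append({"assigned": True, "start": start, "end": end,
                          "size": end - start + 1, "flags": flags})
    return found

line = ("pci 0002:01:00.0: BAR 0: [mem size 0x00002000 64bit] conflicts "
        "with PCI Bus 0002:00 [mem 0x10020000000-0x10027ffffff pref]")

resources = parse_resources(line)
for r in resources:
    state = "assigned" if r["assigned"] else "unassigned"
    print(state, hex(r["size"]), r["flags"])
```

For the line in [3], the BAR comes out as unassigned with size 0x2000,
while the bridge window is an assigned range of 0x8000000 bytes
(128MB) marked pref.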

We already do revert to firmware assignments in some situations when
Linux can't figure out how to assign things itself.  But apparently
not in *this* situation.

Without seeing the whole picture, it's hard for me to figure out
what's going on here.  Could you open a bug report at
http://bugzilla.kernel.org (category drivers/PCI) and attach a
complete dmesg and "lspci -vv" output?  Then we can look at what
firmware did and what Linux thought was wrong with it.

Bjorn

> --- [1]
>
> 0000:01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
>         Subsystem: Dell Device 1f26
>         Flags: bus master, fast devsel, latency 0, IRQ 24
>         Memory at e6000000 (64-bit, non-prefetchable) [size=32M]
>         Capabilities: [48] Power Management version 3
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
>         Capabilities: [a0] MSI-X: Enable+ Count=9 Masked-
>         Capabilities: [ac] Express Endpoint, MSI 00
>         Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-e8
>         Capabilities: [110] Advanced Error Reporting
>         Capabilities: [150] Power Budgeting <?>
>         Capabilities: [160] Virtual Channel
>         Kernel driver in use: bnx2
>
> 0000:01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
>         Subsystem: Dell Device 1f26
>         Flags: bus master, fast devsel, latency 0, IRQ 25
>         Memory at e8000000 (64-bit, non-prefetchable) [size=32M]
>         Capabilities: [48] Power Management version 3
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
>         Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
>         Capabilities: [ac] Express Endpoint, MSI 00
>         Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ea
>         Capabilities: [110] Advanced Error Reporting
>         Capabilities: [150] Power Budgeting <?>
>         Capabilities: [160] Virtual Channel
>         Kernel driver in use: bnx2
>
> 0000:02:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
>         Subsystem: Dell Device 1f26
>         Flags: bus master, fast devsel, latency 0, IRQ 28
>         Memory at ea000000 (64-bit, non-prefetchable) [size=32M]
>         Capabilities: [48] Power Management version 3
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
>         Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
>         Capabilities: [ac] Express Endpoint, MSI 00
>         Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ec
>         Capabilities: [110] Advanced Error Reporting
>         Capabilities: [150] Power Budgeting <?>
>         Capabilities: [160] Virtual Channel
>         Kernel driver in use: bnx2
>
> 0000:02:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709
> Gigabit Ethernet (rev 20)
>         Subsystem: Dell Device 1f26
>         Flags: bus master, fast devsel, latency 0, IRQ 29
>         Memory at ec000000 (64-bit, non-prefetchable) [size=32M]
>         Capabilities: [48] Power Management version 3
>         Capabilities: [50] Vital Product Data
>         Capabilities: [58] MSI: Enable- Count=1/16 Maskable- 64bit+
>         Capabilities: [a0] MSI-X: Enable- Count=9 Masked-
>         Capabilities: [ac] Express Endpoint, MSI 00
>         Capabilities: [100] Device Serial Number d4-ae-52-ff-fe-ea-5c-ee
>         Capabilities: [110] Advanced Error Reporting
>         Capabilities: [150] Power Budgeting <?>
>         Capabilities: [160] Virtual Channel
>         Kernel driver in use: bnx2
>
> -- [2] p13
>
> https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
>
> -- [3]
>
> pci 0002:01:00.0: BAR 0: [mem size 0x00002000 64bit] conflicts with PCI Bus
> 0002:00 [mem 0x10020000000-0x10027ffffff pref]
> --
> Daniel J Blueman
> Principal Software Engineer, Numascale
