linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hinko Kocevar <hinko.kocevar@ess.eu>
To: Sergei Miroshnichenko <s.miroshnichenko@yadro.com>,
	"helgaas@kernel.org" <helgaas@kernel.org>
Cc: "David.Laight@ACULAB.COM" <David.Laight@ACULAB.COM>,
	"rajatja@google.com" <rajatja@google.com>,
	"christian.koenig@amd.com" <christian.koenig@amd.com>,
	"andy.lavr@gmail.com" <andy.lavr@gmail.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"sr@denx.de" <sr@denx.de>, "lukas@wunner.de" <lukas@wunner.de>,
	"linux@yadro.com" <linux@yadro.com>
Subject: Re: [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug
Date: Thu, 4 Feb 2021 09:26:18 +0100	[thread overview]
Message-ID: <9fa281a7-a81b-a6e7-8e90-21e4ac058e33@ess.eu> (raw)
In-Reply-To: <de82694ed11f2c8a996d3701da04a8ee87fe6be5.camel@yadro.com>

Dear,

On 2/3/21 8:59 PM, Sergei Miroshnichenko wrote:
> Hi Bjorn,
> 
> On Thu, 2021-01-28 at 08:53 -0600, Bjorn Helgaas wrote:
>> I can certainly see scenarios where this functionality will be
>> useful,
>> but the series currently doesn't mention bug reports that it
>> fixes.  I
>> suspect there *are* some related bug reports, e.g., for Thunderbolt
>> hotplug.  We should dig them up, include pointers to them, and get
>> the
>> reporters to test the series and provide feedback.
> 
> It never occurred to me to actually search for reports, thanks.
> 
>> We had some review traffic on earlier versions, but as far as I can
>> see, nobody has stepped up with any actual signs of approval, e.g.,
>> Tested-by, Reviewed-by, or Acked-by tags.  That's a problem because
>> this touches a lot of sensitive code and I don't want to be stuck
>> fixing issues all by myself.  When issues crop up, I look to the
>> author and reviewers to help out.
> 
> Indeed, I had only a few informal personal responses [from outside of
> our company]:
>   - some machines have no regressions with the kernel built with Clang-
> 12+LTO+IAS;
>   - NVMe hotplug started working for AMD Epyc with a PEX switch;
>   - Another system with several PEX switches is under ongoing
> experiments.
> 


I reached out to Sergei a while ago about these patches on systems I 
work with. They involve a PEX switch on a separate card (MCH).

FWIW, I find this patchset really interesting and useful in the modular 
systems such as microTCA [1]. In those, up to 12 PCIe hotswappable 
mezzanine cards (AMCs) can be added/removed at anytime. Each can be a 
PCIe endpoint. The infrastructure usually involves a CPU card with a 
PCIe root complex. In addition, PCIe switch is located on the carrier 
hub (MCH) card which handles the PCIe links for AMCs.

At this time, if the system is booted without a certain slot being 
populated at CPU/Linux startup adding a card requires system power 
cycle. Today there are already (minor?) issues with the hotplug/hotswap 
of the cards that are present at Linux bootup hence folks that use these 
systems usually do not rely on any these features; as a result boxes are 
power cycled. I'm not saying it is all Linux;s fault, as the microTCA is 
inherently complex due to modularity and versatility of configurations 
one can have, even without hotplugging. It is also a matter of 
understanding how to configure and use the hotplugging that can be a 
challenge on its own.

Some preliminary tests of this patchset, I managed to run so far, 
suggest that newly added cards are nicely being recognized by the Linux 
and are ready for use without CPU reboot. I currently do not have access 
to the microTCA boxes I use. I hope to be in a better place in couple of 
weeks and look into these patches again!

Thanks!
//hinko

[1] 
https://www.picmg.org/wp-content/uploads/MicroTCA_Short_Form_Sept_2006.pdf

> Providing an ability to quickly return to the old BARs allocating is
> the reason why I keep wrapping the most intrusive code with "if
> (pci_can_move_bars)". Originally I wanted it to be disabled by default,
> available for those truly desperate. But there was an objection that
> the code will not be ever tested this way.
> 
>> Along that line, I'm a little concerned about how we will be able to
>> reproduce and debug problems.  Issues will likely depend on the
>> topology, the resources of the specific devices involved, and a
>> specific sequence of hotplug operations.  Those may be hard to
>> reproduce even for the reporter.  Maybe this is nothing more than a
>> grep pattern to extract relevant events from dmesg.  Ideal, but maybe
>> impractical, would be a way to reproduce things in a VM using qemu
>> and
>> a script to simulate hotplugs.
> 
> I haven't yet tried the qemu for PCI debugging, definitely need to take
> a look. To help with that, the patchset is supplied with additional
> prints, but CONFIG_PCI_DEBUG=y is still preferred. A simulation will
> reveal if additional debug messages are needed.
> 
>>> The feature is usable not only for hotplug, but also for booting
>>> with a
>>> complete set of BARs assigned, and for Resizable BARs.
>>
>> I'm interested in more details about both of these.  What does "a
>> complete set of BARs assigned" mean?  On x86, the BIOS usually
>> assigns
>> all the BARs ahead of time, but I know that's not the case for all
>> arches.
> 
> Usually yes, but we have at least one x86 PC that skips some BARs (need
> to check out again its particular model and version, IIRC that was
> Z370-F) -- not the crucial ones like BAR0, though. That really got me
> by surprise, so now it is one more case covered.
> 
>> For Resizable BARs, is the point that this allows more flexibility in
>> resizing BARs because we can now move things out of the way?
> 
> Not only things out of the way, but most importantly the bridge windows
> are now floating as well. So it's more freedom for a new BAR to
> "choose" a place.
> 
> An awfully synthetic example:
> 
> |                       RC Address Space                       |
> | orig GPU BAR | fixed BAR |                                   |
> |              | fixed BAR |     Resized GPU BAR        |      |
> 
>>> Tested on a number of x86_64 machines without any special kernel
>>> command
>>> line arguments (some of them -- with older v8 revision of this
>>> patchset):
>>>   - PC: i7-5930K + ASUS X99-A;
>>>   - PC: i5-8500 + ASUS Z370-F;
>>>   - PC: AMD Ryzen 7 3700X + ASRock X570 + AMD 5700 XT (Resizable
>>> BARs);
>>>   - Supermicro Super Server/H11SSL-i: AMD EPYC 7251;
>>>   - HP ProLiant DL380 G5: Xeon X5460;
>>>   - Dell Inspiron N5010: i5 M 480;
>>>   - Dell Precision M6600: i7-2920XM.
>>
>> Does this testing show no change in behavior and no regressions, or
>> does it show that this series fixes cases that previously did not
>> work?  If the latter, some bugzilla reports with before/after dmesg
>> logs would give some insight into how this works and also some good
>> justification.
> 
> Both, actually, but the dmesg logs were never published, so they will
> be the next step -- combined with qemu scripts, if it works.
> 
>>> This patchset is a part of our work on adding support for hot-
>>> adding
>>> chains of chassis full of other bridges, NVME drives, SAS HBAs,
>>> GPUs, etc.
>>> without special requirements such as Hot-Plug Controller,
>>> reservation of
>>> bus numbers or memory regions by firmware, etc.
>>
>> This is a little bit of a chicken and egg situation.  I suspect many
>> people would like this functionality, but currently we either avoid
>> it
>> because it's "known not to work" or we hack around it with the
>> firmware reservations you mention.
> 
> The current (v9) patchset became more symbiotic with other hotplug
> facilities: it started to respect pci=hpmemsize etc, and only drops it
> if such allocation is impossible.
> 
> Attention Button for Assisted Hotplug (pciehp driver) was always
> supported. It is also fully compatible with DPC.
> 
> The goal is to squeeze as much as possible even without special
> hw+fw+sw support, when the only available tool is a manual rescan via
> sysfs.
> 
> Serge
> 

      reply	other threads:[~2021-02-04  8:27 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-18 17:39 [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 01/26] PCI: Fix race condition in pci_enable/disable_device() Sergei Miroshnichenko
2020-12-28 15:37   ` Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 02/26] PCI: Ensure a bridge has I/O and MEM access for hot-added devices Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 03/26] PCI: hotplug: Initial support of the movable BARs feature Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 04/26] PCI: Add version of release_child_resources() aware of fixed BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 05/26] PCI: hotplug: Fix reassigning the released BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 06/26] PCI: hotplug: Recalculate every bridge window during rescan Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 07/26] PCI: hotplug: Don't allow hot-added devices to steal resources Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 08/26] PCI: Reassign BARs if BIOS/bootloader had assigned not all of them Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 09/26] PCI: Movable BARs: Make just a single attempt to assign bridge resources Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 10/26] PCI: hotplug: Calculate fixed parts of bridge windows Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 11/26] PCI: Include fixed BARs into the bus size calculating Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 12/26] PCI: Make sure bridge windows include their fixed BARs Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 13/26] PCI: hotplug: Add support of fixed BARs to pci_assign_resource() Sergei Miroshnichenko
2020-12-18 17:39 ` [PATCH v9 14/26] PCI: hotplug: Sort fixed BARs before assignment Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 15/26] x86/PCI/ACPI: Fix up PCIBIOS_MIN_MEM if value computed from e820 is invalid Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 16/26] PCI: hotplug: Configure MPS after manual bus rescan Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 17/26] PCI: hotplug: Don't disable the released bridge windows immediately Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 18/26] PCI: pciehp: Trigger a domain rescan on hp events when enabled movable BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 19/26] PCI: Don't claim fixed BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 20/26] PCI: hotplug: Retry bus assignment without reserved space Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 21/26] PCI: Rescan the bus to resize a BAR Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 22/26] PCI: hotplug: Enable the movable BARs feature by default Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 23/26] PCI/portdrv: Declare support of movable BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 24/26] nvme-pci: Handle " Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 25/26] PCI: Add a message for updating BARs Sergei Miroshnichenko
2020-12-18 17:40 ` [PATCH v9 26/26] resource: increase max nesting level for /proc/iomem Sergei Miroshnichenko
2021-01-28 14:53 ` [PATCH v9 00/26] PCI: Allow BAR movement during boot and hotplug Bjorn Helgaas
2021-01-28 20:39   ` Lukas Wunner
2021-02-01 12:55     ` Mika Westerberg
2021-02-03 20:17       ` Sergei Miroshnichenko
2021-02-04 10:49         ` mika.westerberg
2021-02-10 19:40           ` Sergei Miroshnichenko
2021-02-10 21:46             ` Lukas Wunner
2021-02-11 11:39             ` mika.westerberg
2021-02-11 17:45               ` Sergei Miroshnichenko
2021-02-12 12:52                 ` mika.westerberg
2021-02-12 20:54                   ` Sergei Miroshnichenko
2021-02-15 14:56                     ` mika.westerberg
2021-02-03 20:01     ` Sergei Miroshnichenko
2021-02-04  9:34       ` David Laight
2021-02-03 19:59   ` Sergei Miroshnichenko
2021-02-04  8:26     ` Hinko Kocevar [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9fa281a7-a81b-a6e7-8e90-21e4ac058e33@ess.eu \
    --to=hinko.kocevar@ess.eu \
    --cc=David.Laight@ACULAB.COM \
    --cc=andy.lavr@gmail.com \
    --cc=christian.koenig@amd.com \
    --cc=helgaas@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux@yadro.com \
    --cc=lukas@wunner.de \
    --cc=rajatja@google.com \
    --cc=s.miroshnichenko@yadro.com \
    --cc=sr@denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).