linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
@ 2022-04-21 18:26 Dexuan Cui
  2022-04-26 16:44 ` Lorenzo Pieralisi
  0 siblings, 1 reply; 9+ messages in thread
From: Dexuan Cui @ 2022-04-21 18:26 UTC (permalink / raw)
  To: Jake Oshins, Bjorn Helgaas, lorenzo.pieralisi, bhelgaas, Alex Williamson
  Cc: wei.liu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	linux-hyperv, linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

> From: Jake Oshins <jakeo@microsoft.com>
> Sent: Wednesday, April 20, 2022 9:01 AM
> To: Bjorn Helgaas <helgaas@kernel.org>; Dexuan Cui <decui@microsoft.com>
> Cc: wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> lorenzo.pieralisi@arm.com; bhelgaas@google.com;
> linux-hyperv@vger.kernel.org; linux-pci@vger.kernel.org;
> linux-kernel@vger.kernel.org; Michael Kelley (LINUX)
> <mikelley@microsoft.com>; robh@kernel.org; kw@linux.com; Alex Williamson
> <alex.williamson@redhat.com>; .kvm@vger.kernel.org
> Subject: RE: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set
> PCI_COMMAND_MEMORY to reduce VM boot time

I removed the "[EXTERNAL]" tag as it looks like that prevents the email from being
archived properly. See this link for the archived email thread:
https://lwn.net/ml/linux-kernel/20220419220007.26550-1-decui%40microsoft.com/

> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: Wednesday, April 20, 2022 8:36 AM
> > To: Dexuan Cui <decui@microsoft.com>
> > Cc: wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger
> <sthemmin@microsoft.com>;
> > lorenzo.pieralisi@arm.com; bhelgaas@google.com; linux-
> > hyperv@vger.kernel.org; linux-pci@vger.kernel.org; linux-
> > kernel@vger.kernel.org; Michael Kelley (LINUX) <mikelley@microsoft.com>;
> > robh@kernel.org; kw@linux.com; Jake Oshins <jakeo@microsoft.com>; Alex
> > Williamson <alex.williamson@redhat.com>; .kvm@vger.kernel.org

I removed the period from ".kvm@" so the KVM list is correctly Cc'd.

> > Subject: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set
> PCI_COMMAND_MEMORY
> > to reduce VM boot time
> >
> > [+cc Alex, kvm in case this idea is applicable for more than just Hyper-V]

Alex, your comments are welcome!

> > On Tue, Apr 19, 2022 at 03:00:07PM -0700, Dexuan Cui wrote:
> > > A VM on Azure can have 14 GPUs, and each GPU may have a huge MMIO
> > BAR,
> > > e.g. 128 GB. Currently the boot time of such a VM can be 4+ minutes,
> > > and most of the time is used by the host to unmap/map the vBAR from/to
> > > pBAR when the VM clears and sets the PCI_COMMAND_MEMORY bit: each
> > > unmap/map operation for a 128GB BAR needs about 1.8 seconds, and the
> > > pci-hyperv driver and the Linux PCI subsystem flip the
> > > PCI_COMMAND_MEMORY bit eight times (see pci_setup_device() ->
> > > pci_read_bases() and pci_std_update_resource()), increasing the boot
> > > time by 1.8 * 8 = 14.4 seconds per GPU, i.e. 14.4 * 14 = 201.6 seconds in
> total.
> > >
> > > Fix the slowness by not turning on the PCI_COMMAND_MEMORY in
> > > pci-hyperv.c, so the bit stays in the off state before the PCI device
> > > driver calls
> > > pci_enable_device(): when the bit is off, pci_read_bases() and
> > > pci_std_update_resource() don't cause Hyper-V to unmap/map the vBARs.
> > > With this change, the boot time of such a VM is reduced by
> > > 1.8 * (8-1) * 14 = 176.4 seconds.
> > >
> > > Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > > Cc: Jake Oshins <jakeo@microsoft.com>
> > > ---
> > >  drivers/pci/controller/pci-hyperv.c | 17 +++++++++++------
> > >  1 file changed, 11 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > b/drivers/pci/controller/pci-hyperv.c
> > > index d270a204324e..f9fbbd8d94db 100644
> > > --- a/drivers/pci/controller/pci-hyperv.c
> > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > @@ -2082,12 +2082,17 @@ static void prepopulate_bars(struct
> > hv_pcibus_device *hbus)
> > >  				}
> > >  			}
> > >  			if (high_size <= 1 && low_size <= 1) {
> > > -				/* Set the memory enable bit. */
> > > -				_hv_pcifront_read_config(hpdev,
> > PCI_COMMAND, 2,
> > > -							 &command);
> > > -				command |= PCI_COMMAND_MEMORY;
> > > -				_hv_pcifront_write_config(hpdev,
> > PCI_COMMAND, 2,
> > > -							  command);
> > > +				/*
> > > +				 * No need to set the
> > PCI_COMMAND_MEMORY bit as
> > > +				 * the core PCI driver doesn't require the bit
> > > +				 * to be pre-set. Actually here we intentionally
> > > +				 * keep the bit off so that the PCI BAR probing
> > > +				 * in the core PCI driver doesn't cause Hyper-V
> > > +				 * to unnecessarily unmap/map the virtual
> > BARs
> > > +				 * from/to the physical BARs multiple times.
> > > +				 * This reduces the VM boot time significantly
> > > +				 * if the BAR sizes are huge.
> > > +				 */
> > >  				break;
> > >  			}
> > >  		}
> > > --
> > > 2.17.1
> > >
> 
> My question about this, directed at the people who know a lot more about the
> PCI subsystem in Linux than I do (Bjorn, Alex, etc.), is whether this change can
> create a problem.

Bjorn, Lorenzo, Alex, it would be nice to have your insights!

> In a physical computer, you might sometimes want to move
> a device from one part of address space to another, typically when another
> device is being hot-added to the system.  Imagine a scenario where you have,
> in this case a VM, and there are perhaps 15 NVMe SSDs passed through to the
> VM.  One of them dies and so it is removed.  The replacement SSD has a
> larger BAR than the one that was removed (because it's a different SKU) and
> now it doesn't fit neatly into the hole that was vacated by the previous SSD.
> 
> In this case, will the kernel ever attempt to move some of the existing SSDs to
> make room for the new one?  And if so, does this come with the requirement
> that they actually get mapped out of the address space while they are being
> moved?  (This was the scenario that prompted me to write the code above as
> it was, originally.)
> 
> Thanks,
> Jake Oshins

Sorry, I don't quite follow. pci-hyperv allocates MMIO for the bridge window
in hv_pci_allocate_bridge_windows() and registers the MMIO ranges with the core
PCI driver via pci_add_resource(); later the core PCI driver probes the
bus/device(s), validates the BAR sizes and the pre-initialized BAR values, and
uses the BAR configuration. IMO the whole process doesn't require the
PCI_COMMAND_MEMORY bit to be pre-set, and there should be no issue with
delaying setting the bit until a PCI device driver's .probe() -> pci_enable_device().

When a device is removed, hv_eject_device_work() -> 
pci_stop_and_remove_bus_device() triggers the PCI device driver's .remove(),
which calls pci_disable_device(), which instructs the hypervisor to unmap
the vBAR from the pBAR, so there should not be any issue when a new device
is added later. 

With the patch, if no PCI device driver is loaded, the PCI_COMMAND_MEMORY
bit stays in the off state, and the hypervisor doesn't map the vBAR to the pBAR.
I don't see any issue with this.

hv_pci_allocate_bridge_windows() -> vmbus_allocate_mmio() makes sure
that there is enough MMIO space for the devices on the bus, so the core
PCI driver doesn't have to worry about any MMIO conflict issue.

Please let me know if I understood and answered the questions.

Thanks,
-- Dexuan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-21 18:26 [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time Dexuan Cui
@ 2022-04-26 16:44 ` Lorenzo Pieralisi
  2022-04-26 18:31   ` Dexuan Cui
  0 siblings, 1 reply; 9+ messages in thread
From: Lorenzo Pieralisi @ 2022-04-26 16:44 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Jake Oshins, Bjorn Helgaas, bhelgaas, Alex Williamson, wei.liu,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-hyperv,
	linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

On Thu, Apr 21, 2022 at 06:26:36PM +0000, Dexuan Cui wrote:
> > From: Jake Oshins <jakeo@microsoft.com>
> > Sent: Wednesday, April 20, 2022 9:01 AM
> > To: Bjorn Helgaas <helgaas@kernel.org>; Dexuan Cui <decui@microsoft.com>
> > Cc: wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > lorenzo.pieralisi@arm.com; bhelgaas@google.com;
> > linux-hyperv@vger.kernel.org; linux-pci@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Michael Kelley (LINUX)
> > <mikelley@microsoft.com>; robh@kernel.org; kw@linux.com; Alex Williamson
> > <alex.williamson@redhat.com>; .kvm@vger.kernel.org
> > Subject: RE: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set
> > PCI_COMMAND_MEMORY to reduce VM boot time
> 
> I removed the "[EXTERNAL]" tag as it looks like that prevents the email from being
> archived properly. See this link for the archived email thread:
> https://lwn.net/ml/linux-kernel/20220419220007.26550-1-decui%40microsoft.com/
> 
> > > -----Original Message-----
> > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > Sent: Wednesday, April 20, 2022 8:36 AM
> > > To: Dexuan Cui <decui@microsoft.com>
> > > Cc: wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > > <haiyangz@microsoft.com>; Stephen Hemminger
> > <sthemmin@microsoft.com>;
> > > lorenzo.pieralisi@arm.com; bhelgaas@google.com; linux-
> > > hyperv@vger.kernel.org; linux-pci@vger.kernel.org; linux-
> > > kernel@vger.kernel.org; Michael Kelley (LINUX) <mikelley@microsoft.com>;
> > > robh@kernel.org; kw@linux.com; Jake Oshins <jakeo@microsoft.com>; Alex
> > > Williamson <alex.williamson@redhat.com>; .kvm@vger.kernel.org
> 
> I removed the period from ".kvm@" so the KVM list is correctly Cc'd.
> 
> > > Subject: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set
> > PCI_COMMAND_MEMORY
> > > to reduce VM boot time
> > >
> > > [+cc Alex, kvm in case this idea is applicable for more than just Hyper-V]
> 
> Alex, your comments are welcome!
> 
> > > On Tue, Apr 19, 2022 at 03:00:07PM -0700, Dexuan Cui wrote:
> > > > A VM on Azure can have 14 GPUs, and each GPU may have a huge MMIO
> > > BAR,
> > > > e.g. 128 GB. Currently the boot time of such a VM can be 4+ minutes,
> > > > and most of the time is used by the host to unmap/map the vBAR from/to
> > > > pBAR when the VM clears and sets the PCI_COMMAND_MEMORY bit: each
> > > > unmap/map operation for a 128GB BAR needs about 1.8 seconds, and the
> > > > pci-hyperv driver and the Linux PCI subsystem flip the
> > > > PCI_COMMAND_MEMORY bit eight times (see pci_setup_device() ->
> > > > pci_read_bases() and pci_std_update_resource()), increasing the boot
> > > > time by 1.8 * 8 = 14.4 seconds per GPU, i.e. 14.4 * 14 = 201.6 seconds in
> > total.
> > > >
> > > > Fix the slowness by not turning on the PCI_COMMAND_MEMORY in
> > > > pci-hyperv.c, so the bit stays in the off state before the PCI device
> > > > driver calls
> > > > pci_enable_device(): when the bit is off, pci_read_bases() and
> > > > pci_std_update_resource() don't cause Hyper-V to unmap/map the vBARs.
> > > > With this change, the boot time of such a VM is reduced by
> > > > 1.8 * (8-1) * 14 = 176.4 seconds.
> > > >
> > > > Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> > > > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > > > Cc: Jake Oshins <jakeo@microsoft.com>
> > > > ---
> > > >  drivers/pci/controller/pci-hyperv.c | 17 +++++++++++------
> > > >  1 file changed, 11 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > > b/drivers/pci/controller/pci-hyperv.c
> > > > index d270a204324e..f9fbbd8d94db 100644
> > > > --- a/drivers/pci/controller/pci-hyperv.c
> > > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > > @@ -2082,12 +2082,17 @@ static void prepopulate_bars(struct
> > > hv_pcibus_device *hbus)
> > > >  				}
> > > >  			}
> > > >  			if (high_size <= 1 && low_size <= 1) {
> > > > -				/* Set the memory enable bit. */
> > > > -				_hv_pcifront_read_config(hpdev,
> > > PCI_COMMAND, 2,
> > > > -							 &command);
> > > > -				command |= PCI_COMMAND_MEMORY;
> > > > -				_hv_pcifront_write_config(hpdev,
> > > PCI_COMMAND, 2,
> > > > -							  command);
> > > > +				/*
> > > > +				 * No need to set the
> > > PCI_COMMAND_MEMORY bit as
> > > > +				 * the core PCI driver doesn't require the bit
> > > > +				 * to be pre-set. Actually here we intentionally
> > > > +				 * keep the bit off so that the PCI BAR probing
> > > > +				 * in the core PCI driver doesn't cause Hyper-V
> > > > +				 * to unnecessarily unmap/map the virtual
> > > BARs
> > > > +				 * from/to the physical BARs multiple times.
> > > > +				 * This reduces the VM boot time significantly
> > > > +				 * if the BAR sizes are huge.
> > > > +				 */
> > > >  				break;
> > > >  			}
> > > >  		}
> > > > --
> > > > 2.17.1
> > > >
> > 
> > My question about this, directed at the people who know a lot more about the
> > PCI subsystem in Linux than I do (Bjorn, Alex, etc.), is whether this change can
> > create a problem.
> 
> Bjorn, Lorenzo, Alex, it would be nice to have your insights!
> 
> > In a physical computer, you might sometimes want to move
> > a device from one part of address space to another, typically when another
> > device is being hot-added to the system.  Imagine a scenario where you have,
> > in this case a VM, and there are perhaps 15 NVMe SSDs passed through to the
> > VM.  One of them dies and so it is removed.  The replacement SSD has a
> > larger BAR than the one that was removed (because it's a different SKU) and
> > now it doesn't fit neatly into the hole that was vacated by the previous SSD.
> > 
> > In this case, will the kernel ever attempt to move some of the existing SSDs to
> > make room for the new one?  And if so, does this come with the requirement
> > that they actually get mapped out of the address space while they are being
> > moved?  (This was the scenario that prompted me to write the code above as
> > it was, originally.)
> > 
> > Thanks,
> > Jake Oshins
> 
> Sorry I don't quite follow. pci-hyperv allocates MMIO for the bridge
> window in hv_pci_allocate_bridge_windows() and registers the MMIO
> ranges to the core PCI driver via pci_add_resource(), and later the
> core PCI driver probes the bus/device(s), validates the BAR sizes and
> the pre-initialized BAR values, and uses the BAR configuration. IMO
> the whole process doesn't require the bit PCI_COMMAND_MEMORY to be
> pre-set, and there should be no issue with delaying setting the bit until a
> PCI device driver's .probe() -> pci_enable_device().

IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
(otherwise PCI core would toggle it on and off for eg BAR sizing).

Is that correct ?

If I read the PCI core correctly, PCI_COMMAND_MEMORY is obviously cleared
only if it is set in the first place, and that's what your patch is
changing, namely you bootstrap your devices with PCI_COMMAND_MEMORY
clear so that the PCI core does not touch it.

I am not familiar with the Hyper-V code, so forgive the question.

Thanks,
Lorenzo

> 
> When a device is removed, hv_eject_device_work() -> 
> pci_stop_and_remove_bus_device() triggers the PCI device driver's .remove(),
> which calls pci_disable_device(), which instructs the hypervisor to unmap
> the vBAR from the pBAR, so there should not be any issue when a new device
> is added later. 
> 
> With the patch, if no PCI device driver is loaded, the PCI_COMMAND_MEMORY
> bit stays in the off state, and the hypervisor doesn't map the vBAR to the pBAR.
> I don't see any issue with this.
> 
> hv_pci_allocate_bridge_windows() -> vmbus_allocate_mmio() makes sure
> that there is enough MMIO space for the devices on the bus, so the core
> PCI driver doesn't have to worry about any MMIO conflict issue.
> 
> Please let me know if I understood and answered the questions.
> 
> Thanks,
> -- Dexuan
> 


* RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-26 16:44 ` Lorenzo Pieralisi
@ 2022-04-26 18:31   ` Dexuan Cui
  2022-04-26 19:25     ` Jake Oshins
  0 siblings, 1 reply; 9+ messages in thread
From: Dexuan Cui @ 2022-04-26 18:31 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: Jake Oshins, Bjorn Helgaas, bhelgaas, Alex Williamson, wei.liu,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-hyperv,
	linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

> From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Sent: Tuesday, April 26, 2022 9:45 AM
> > ...
> > Sorry I don't quite follow. pci-hyperv allocates MMIO for the bridge
> > window in hv_pci_allocate_bridge_windows() and registers the MMIO
> > ranges to the core PCI driver via pci_add_resource(), and later the
> > core PCI driver probes the bus/device(s), validates the BAR sizes and
> > the pre-initialized BAR values, and uses the BAR configuration. IMO
> > the whole process doesn't require the bit PCI_COMMAND_MEMORY to be
> > pre-set, and there should be no issue with delaying setting the bit until a
> > PCI device driver's .probe() -> pci_enable_device().
> 
> IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
> (otherwise PCI core would toggle it on and off for eg BAR sizing).
> 
> Is that correct ?

Yes, that's the exact purpose of this patch.

Do you see any potential architectural issue with the patch? 
From my reading of the core PCI code, it looks like this should be safe.

Jake has some concerns that I don't quite follow. 
@Jake, could you please explain the concerns with more details?

> If I read PCI core correctly PCI_COMMAND_MEMORY is obviously cleared
> only if it is set in the first place and that's what your patch is
> changing, namely you bootstrap your devices with PCI_COMMAND_MEMORY
> clear so that PCI core does not touch it.

Yes, this is exactly what the patch is doing.

Thanks,
-- Dexuan


* RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-26 18:31   ` Dexuan Cui
@ 2022-04-26 19:25     ` Jake Oshins
  2022-04-27 22:41       ` Alex Williamson
  2022-04-28 19:12       ` Bjorn Helgaas
  0 siblings, 2 replies; 9+ messages in thread
From: Jake Oshins @ 2022-04-26 19:25 UTC (permalink / raw)
  To: Dexuan Cui, Lorenzo Pieralisi
  Cc: Bjorn Helgaas, bhelgaas, Alex Williamson, wei.liu, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, linux-hyperv, linux-pci,
	linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

> -----Original Message-----
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Tuesday, April 26, 2022 11:32 AM
> To: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Jake Oshins <jakeo@microsoft.com>; Bjorn Helgaas <helgaas@kernel.org>;
> bhelgaas@google.com; Alex Williamson <alex.williamson@redhat.com>;
> wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> linux-hyperv@vger.kernel.org; linux-pci@vger.kernel.org; linux-
> kernel@vger.kernel.org; Michael Kelley (LINUX) <mikelley@microsoft.com>;
> robh@kernel.org; kw@linux.com; kvm@vger.kernel.org
> Subject: RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce
> VM boot time
> 
> > From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Sent: Tuesday, April 26, 2022 9:45 AM
> > > ...
> > > Sorry I don't quite follow. pci-hyperv allocates MMIO for the bridge
> > > window in hv_pci_allocate_bridge_windows() and registers the MMIO
> > > ranges to the core PCI driver via pci_add_resource(), and later the
> > > core PCI driver probes the bus/device(s), validates the BAR sizes
> > > and the pre-initialized BAR values, and uses the BAR configuration.
> > > IMO the whole process doesn't require the bit PCI_COMMAND_MEMORY to
> > > be pre-set, and there should be no issue with delaying setting the bit
> > > until a PCI device driver's .probe() -> pci_enable_device().
> >
> > IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
> > (otherwise PCI core would toggle it on and off for eg BAR sizing).
> >
> > Is that correct ?
> 
> Yes, that's the exact purpose of this patch.
> 
> Do you see any potential architectural issue with the patch?
> From my reading of the core PCI code, it looks like this should be safe.
> 
> Jake has some concerns that I don't quite follow.
> @Jake, could you please explain the concerns with more details?
> 

First, let me say that I really don't know whether this is an issue.  I know it's an issue with other operating system kernels.  I'm curious whether the Linux kernel / Linux PCI driver would behave in a way that has an issue here.

The VM has a window of address space into which it chooses to put PCI devices' BARs.  The guest OS will generally keep the value that is already programmed into the BAR by default, but it can theoretically place the device in any free address space.  The subset of the VM's memory address space which can be populated by devices' BARs is finite, and generally not particularly large.

Imagine a VM that is configured with 25 NVMe controllers, each of which requires 64KiB of address space.  (This is just an example.)  At first boot, all of these NVMe controllers are packed into address space, one after the other.

While that VM is running, one of the 25 NVMe controllers fails and is replaced with an NVMe controller from a separate manufacturer, but this one requires 128KiB of memory, for some reason.  Perhaps it implements the "controller buffer" feature of NVMe.  It doesn't fit in the hole that was vacated by the failed NVMe controller, so it needs to be placed somewhere else in address space.  This process continues over months, with several more failures and replacements.  Eventually, the address space is very fragmented.

At some point, there is an attempt to place an NVMe controller into the VM but there is no contiguous block of address space free which would allow that NVMe controller to operate.  There is, however, enough total address space if the other, currently functioning, NVMe controllers are moved from the address space that they are using to other ranges, consolidating their usage and reducing fragmentation.  Let's call this a rebalancing of memory resources.

When the NVMe controllers are moved, a new value is written into their BAR.  In general, the PCI spec would require that you clear the memory enable bit in the command register (PCI_COMMAND_MEMORY) during this move operation, both so that there's never a moment when two devices are occupying the same address space and because writing a 64-bit BAR atomically isn't possible.  This is the reason that I originally wrote the code in this driver to unmap the device from the VM's address space when the memory enable bit is cleared.

What I don't know is whether this sequence of operations can ever happen in Linux, or perhaps in a VM running Linux.  Will it rebalance resources in order to consolidate address space?  If it will, will this involve clearing the memory enable bit to ensure that two devices never overlap?

Thanks,
Jake Oshins


* Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-26 19:25     ` Jake Oshins
@ 2022-04-27 22:41       ` Alex Williamson
  2022-04-28 19:12       ` Bjorn Helgaas
  1 sibling, 0 replies; 9+ messages in thread
From: Alex Williamson @ 2022-04-27 22:41 UTC (permalink / raw)
  To: Jake Oshins
  Cc: Dexuan Cui, Lorenzo Pieralisi, Bjorn Helgaas, bhelgaas, wei.liu,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-hyperv,
	linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

On Tue, 26 Apr 2022 19:25:43 +0000
Jake Oshins <jakeo@microsoft.com> wrote:

> > -----Original Message-----
> > From: Dexuan Cui <decui@microsoft.com>
> > Sent: Tuesday, April 26, 2022 11:32 AM
> > To: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Cc: Jake Oshins <jakeo@microsoft.com>; Bjorn Helgaas <helgaas@kernel.org>;
> > bhelgaas@google.com; Alex Williamson <alex.williamson@redhat.com>;
> > wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > linux-hyperv@vger.kernel.org; linux-pci@vger.kernel.org; linux-
> > kernel@vger.kernel.org; Michael Kelley (LINUX) <mikelley@microsoft.com>;
> > robh@kernel.org; kw@linux.com; kvm@vger.kernel.org
> > Subject: RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce
> > VM boot time
> >   
> > > From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > Sent: Tuesday, April 26, 2022 9:45 AM  
> > > > ...
> > > > Sorry I don't quite follow. pci-hyperv allocates MMIO for the bridge
> > > > window in hv_pci_allocate_bridge_windows() and registers the MMIO
> > > > ranges to the core PCI driver via pci_add_resource(), and later the
> > > > core PCI driver probes the bus/device(s), validates the BAR sizes
> > > > and the pre-initialized BAR values, and uses the BAR configuration.
> > > > IMO the whole process doesn't require the bit PCI_COMMAND_MEMORY to
> > > > be pre-set, and there should be no issue with delaying setting the bit
> > > > until a PCI device driver's .probe() -> pci_enable_device().  
> > >
> > > IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
> > > (otherwise PCI core would toggle it on and off for eg BAR sizing).
> > >
> > > Is that correct ?  
> > 
> > Yes, that's the exact purpose of this patch.
> > 
> > Do you see any potential architectural issue with the patch?
> > From my reading of the core PCI code, it looks like this should be safe.
> > 
> > Jake has some concerns that I don't quite follow.
> > @Jake, could you please explain the concerns with more details?
> >   
> 
> First, let me say that I really don't know whether this is an issue.
> I know it's an issue with other operating system kernels.  I'm
> curious whether the Linux kernel / Linux PCI driver would behave in a
> way that has an issue here.
> 
> The VM has a window of address space into which it chooses to put PCI
> device's BARs.  The guest OS will generally pick the value that is
> within the BAR, by default, but it can theoretically place the device
> in any free address space.  The subset of the VM's memory address
> space which can be populated by devices' BARs is finite, and
> generally not particularly large.

AIUI, if the firmware has programmed the BAR addresses, Linux will
generally try to leave them alone; it's only for unprogrammed devices, or
when using the pci=realloc option, that we'll try to shuffle things
around.

If you talk to bare metal system firmware developers, you might find
disagreement regarding whether the OS or system firmware owns the
device address space, which I believe also factors into our handling of
the memory space enable bit of the command register.  Minimally, system
firmware is required to allocate resources and enable boot devices, and
often these are left enabled after the hand-off to the OS.  This might
include some peripherals, for instance legacy emulation on a USB
keyboard might leave the USB host controller enabled.  There are also
more devious use cases, where there might be device monitoring running
across the bus under the OS, perhaps via SMI or other means; if
we start moving devices around, that could theoretically break.

However, I don't really see any obvious problems with your proposal
that we simply leave the memory enable bit in the hand-off state.

> Imagine a VM that is configured with 25 NVMe controllers, each of
> which requires 64KiB of address space.  (This is just an example.)
> At first boot, all of these NVMe controllers are packed into address
> space, one after the other.
> 
> While that VM is running, one of the 25 NVMe controllers fails and is
> replaced with an NVMe controller from a separate manufacturer, but
> this one requires 128KiB of memory, for some reason.  Perhaps it
> implements the "controller buffer" feature of NVMe.  It doesn't fit
> in the hole that was vacated by the failed NVMe controller, so it
> needs to be placed somewhere else in address space.  This process
> continues over months, with several more failures and replacements.
> Eventually, the address space is very fragmented.
> 
> At some point, there is an attempt to place an NVMe controller into
> the VM but there is no contiguous block of address space free which
> would allow that NVMe controller to operate.  There is, however,
> enough total address space if the other, currently functioning, NVMe
> controllers are moved from the address space that they are using to
> other ranges, consolidating their usage and reducing fragmentation.
> Let's call this a rebalancing of memory resources.
> 
> When the NVMe controllers are moved, a new value is written into
> their BAR.  In general, the PCI spec would require that you clear the
> memory enable bit in the command register (PCI_COMMAND_MEMORY) during
> this move operation, both so that there's never a moment when two
> devices are occupying the same address space and because writing a
> 64-bit BAR atomically isn't possible.  This is the reason that I
> originally wrote the code in this driver to unmap the device from the
> VM's address space when the memory enable bit is cleared.
> 
> What I don't know is whether this sequence of operations can ever
> happen in Linux, or perhaps in a VM running Linux.  Will it rebalance
> resources in order to consolidate address space?  If it will, will
> this involve clearing the memory enable bit to ensure that two
> devices never overlap?

Once the OS is running and drivers are attached to devices, any
reshuffling of resources for those devices would require coordination
of the driver to release the resources and reprogram them.  Even if an
atomic update of the BAR were possible, that can't account for possible
in-flight use cases, such as p2p DMA.

There were a couple sessions from the 2019 Linux Plumbers conference
that might be useful to review regarding these issues.  IIRC the
first[1] was specifically looking at whether we could do partial BAR
allocations for NVMe devices, where we might have functionality but
reduced performance or features with a partial mapping.  In your
example, perhaps we're replacing a device with one that has twice the
BAR space, but is functional with only half that, so we can slide it
into the same slot as the previous device.  This would likely mean
enlightening the PCI core with device or class specific information.
I've not followed whether anything occurred here.

The second[2] (next session, same recording) discusses problems around
resource allocation and dynamic reallocation.  Again, I haven't
followed further discussions here, but I don't expect much has changed.
Thanks,

Alex

[1]https://youtu.be/ozlQ1XQreac?list=PLVsQ_xZBEyN1PDehCCAiztGf45K_D6txS&t=6481
[2]https://youtu.be/ozlQ1XQreac?list=PLVsQ_xZBEyN1PDehCCAiztGf45K_D6txS&t=7980



* Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-26 19:25     ` Jake Oshins
  2022-04-27 22:41       ` Alex Williamson
@ 2022-04-28 19:12       ` Bjorn Helgaas
  2022-04-28 19:21         ` [EXTERNAL] " Jake Oshins
  1 sibling, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2022-04-28 19:12 UTC (permalink / raw)
  To: Jake Oshins
  Cc: Dexuan Cui, Lorenzo Pieralisi, bhelgaas, Alex Williamson,
	wei.liu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	linux-hyperv, linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

On Tue, Apr 26, 2022 at 07:25:43PM +0000, Jake Oshins wrote:
> > -----Original Message-----
> > From: Dexuan Cui <decui@microsoft.com>
> > Sent: Tuesday, April 26, 2022 11:32 AM
> > To: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Cc: Jake Oshins <jakeo@microsoft.com>; Bjorn Helgaas <helgaas@kernel.org>;
> > bhelgaas@google.com; Alex Williamson <alex.williamson@redhat.com>;
> > wei.liu@kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> > <haiyangz@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>;
> > linux-hyperv@vger.kernel.org; linux-pci@vger.kernel.org; linux-
> > kernel@vger.kernel.org; Michael Kelley (LINUX) <mikelley@microsoft.com>;
> > robh@kernel.org; kw@linux.com; kvm@vger.kernel.org
> > Subject: RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce
> > VM boot time
> > 
> > > From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > Sent: Tuesday, April 26, 2022 9:45 AM
> > > > ...
> > > > Sorry I don't quite follow. pci-hyperv allocates MMIO for the bridge
> > > > window in hv_pci_allocate_bridge_windows() and registers the MMIO
> > > > ranges to the core PCI driver via pci_add_resource(), and later the
> > > > core PCI driver probes the bus/device(s), validates the BAR sizes
> > > > and the pre-initialized BAR values, and uses the BAR configuration.
> > > > IMO the whole process doesn't require the bit PCI_COMMAND_MEMORY to
> > > > be pre-set, and there should be no issue to delay setting the bit to
> > > > a PCI device's .probe() -> pci_enable_device().
> > >
> > > IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
> > > (otherwise PCI core would toggle it on and off for eg BAR sizing).
> > >
> > > Is that correct ?
> > 
> > Yes, that's the exact purpose of this patch.
> > 
> > Do you see any potential architectural issue with the patch?
> > From my reading of the core PCI code, it looks like this should be safe.

I don't know much about Hyper-V, but in general I don't think the PCI
core should turn on PCI_COMMAND_MEMORY at all unless a driver requests
it.  I assume that if a guest OS depends on PCI_COMMAND_MEMORY being
set, guest firmware would take care of setting it.

> > Jake has some concerns that I don't quite follow.
> > @Jake, could you please explain the concerns with more details?
> 
> First, let me say that I really don't know whether this is an issue.
> I know it's an issue with other operating system kernels.  I'm
> curious whether the Linux kernel / Linux PCI driver would behave in
> a way that has an issue here.
> 
> The VM has a window of address space into which it chooses to put
> PCI device's BARs.  The guest OS will generally pick the value that
> is within the BAR, by default, but it can theoretically place the
> device in any free address space.  The subset of the VM's memory
> address space which can be populated by devices' BARs is finite, and
> generally not particularly large.
> 
> Imagine a VM that is configured with 25 NVMe controllers, each of
> which requires 64KiB of address space.  (This is just an example.)
> At first boot, all of these NVMe controllers are packed into address
> space, one after the other.
> 
> While that VM is running, one of the 25 NVMe controllers fails and
> is replaced with an NVMe controller from a separate manufacturer,
> but this one requires 128KiB of memory, for some reason.  Perhaps it
> implements the "controller buffer" feature of NVMe.  It doesn't fit
> in the hole that was vacated by the failed NVMe controller, so it
> needs to be placed somewhere else in address space.  This process
> continues over months, with several more failures and replacements.
> Eventually, the address space is very fragmented.
> 
> At some point, there is an attempt to place an NVMe controller into
> the VM but there is no contiguous block of address space free which
> would allow that NVMe controller to operate.  There is, however,
> enough total address space if the other, currently functioning, NVMe
> controllers are moved from the address space that they are using to
> other ranges, consolidating their usage and reducing fragmentation.
> Let's call this a rebalancing of memory resources.
> 
> When the NVMe controllers are moved, a new value is written into
> their BAR.  In general, the PCI spec would require that you clear
> the memory enable bit in the command register (PCI_COMMAND_MEMORY)
> during this move operation, both so that there's never a moment when
> two devices are occupying the same address space and because writing
> a 64-bit BAR atomically isn't possible.  This is the reason that I
> originally wrote the code in this driver to unmap the device from
> the VM's address space when the memory enable bit is cleared.
> 
> What I don't know is whether this sequence of operations can ever
> happen in Linux, or perhaps in a VM running Linux.  Will it
> rebalance resources in order to consolidate address space?  If it
> will, will this involve clearing the memory enable bit to ensure
> that two devices never overlap?

This sequence definitely can occur in Linux, but it hasn't yet become
a real priority.  But we do already have issues with assigning space
for hot-added devices in general, especially if firmware hasn't
assigned large windows to things like Thunderbolt controllers.  I
suspect that we have or will soon have issues where resource
assignment starts failing after a few hotplugs, e.g., dock/undock
events.

There have been patches posted to rebalance resources (quiesce
drivers, reassign, restart drivers), but they haven't gone anywhere
yet for lack of interest and momentum.  I do feel like we're on the
tracks and the train is coming, though ;)

Bjorn

* RE: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-28 19:12       ` Bjorn Helgaas
@ 2022-04-28 19:21         ` Jake Oshins
  2022-04-29  1:11           ` Dexuan Cui
  0 siblings, 1 reply; 9+ messages in thread
From: Jake Oshins @ 2022-04-28 19:21 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Dexuan Cui, Lorenzo Pieralisi, bhelgaas, Alex Williamson,
	wei.liu, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	linux-hyperv, linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: Thursday, April 28, 2022 12:12 PM
> To: Jake Oshins <jakeo@microsoft.com>
> Cc: Dexuan Cui <decui@microsoft.com>; Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com>; bhelgaas@google.com; Alex Williamson
> <alex.williamson@redhat.com>; wei.liu@kernel.org; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen
> Hemminger <sthemmin@microsoft.com>; linux-hyperv@vger.kernel.org; linux-
> pci@vger.kernel.org; linux-kernel@vger.kernel.org; Michael Kelley (LINUX)
> <mikelley@microsoft.com>; robh@kernel.org; kw@linux.com;
> kvm@vger.kernel.org
> Subject: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY
> to reduce VM boot time
> 
> On Tue, Apr 26, 2022 at 07:25:43PM +0000, Jake Oshins wrote:
> > > -----Original Message-----
> > > From: Dexuan Cui <decui@microsoft.com>
> > > Sent: Tuesday, April 26, 2022 11:32 AM
> > > To: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > Cc: Jake Oshins <jakeo@microsoft.com>; Bjorn Helgaas
> > > <helgaas@kernel.org>; bhelgaas@google.com; Alex Williamson
> > > <alex.williamson@redhat.com>; wei.liu@kernel.org; KY Srinivasan
> > > <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Stephen
> > > Hemminger <sthemmin@microsoft.com>; linux-hyperv@vger.kernel.org;
> > > linux-pci@vger.kernel.org; linux- kernel@vger.kernel.org; Michael
> > > Kelley (LINUX) <mikelley@microsoft.com>; robh@kernel.org;
> > > kw@linux.com; kvm@vger.kernel.org
> > > Subject: RE: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to
> > > reduce VM boot time
> > >
> > > > From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > > > Sent: Tuesday, April 26, 2022 9:45 AM
> > > > > ...
> > > > > Sorry I don't quite follow. pci-hyperv allocates MMIO for the
> > > > > bridge window in hv_pci_allocate_bridge_windows() and registers
> > > > > the MMIO ranges to the core PCI driver via pci_add_resource(),
> > > > > and later the core PCI driver probes the bus/device(s),
> > > > > validates the BAR sizes and the pre-initialized BAR values, and uses the
> BAR configuration.
> > > > > IMO the whole process doesn't require the bit
> PCI_COMMAND_MEMORY
> > > > > to be pre-set, and there should be no issue to delay setting the
> > > > > bit to a PCI device's .probe() -> pci_enable_device().
> > > >
> > > > IIUC you want to bootstrap devices with PCI_COMMAND_MEMORY clear
> > > > (otherwise PCI core would toggle it on and off for eg BAR sizing).
> > > >
> > > > Is that correct ?
> > >
> > > Yes, that's the exact purpose of this patch.
> > >
> > > Do you see any potential architectural issue with the patch?
> > > From my reading of the core PCI code, it looks like this should be safe.
> 
> I don't know much about Hyper-V, but in general I don't think the PCI core
> should turn on PCI_COMMAND_MEMORY at all unless a driver requests it.  I
> assume that if a guest OS depends on PCI_COMMAND_MEMORY being set,
> guest firmware would take care of setting it.
> 
> > > Jake has some concerns that I don't quite follow.
> > > @Jake, could you please explain the concerns with more details?
> >
> > First, let me say that I really don't know whether this is an issue.
> > I know it's an issue with other operating system kernels.  I'm curious
> > whether the Linux kernel / Linux PCI driver would behave in a way that
> > has an issue here.
> >
> > The VM has a window of address space into which it chooses to put PCI
> > device's BARs.  The guest OS will generally pick the value that is
> > within the BAR, by default, but it can theoretically place the device
> > in any free address space.  The subset of the VM's memory address
> > space which can be populated by devices' BARs is finite, and generally
> > not particularly large.
> >
> > Imagine a VM that is configured with 25 NVMe controllers, each of
> > which requires 64KiB of address space.  (This is just an example.) At
> > first boot, all of these NVMe controllers are packed into address
> > space, one after the other.
> >
> > While that VM is running, one of the 25 NVMe controllers fails and is
> > replaced with an NVMe controller from a separate manufacturer, but
> > this one requires 128KiB of memory, for some reason.  Perhaps it
> > implements the "controller buffer" feature of NVMe.  It doesn't fit in
> > the hole that was vacated by the failed NVMe controller, so it needs
> > to be placed somewhere else in address space.  This process continues
> > over months, with several more failures and replacements.
> > Eventually, the address space is very fragmented.
> >
> > At some point, there is an attempt to place an NVMe controller into
> > the VM but there is no contiguous block of address space free which
> > would allow that NVMe controller to operate.  There is, however,
> > enough total address space if the other, currently functioning, NVMe
> > controllers are moved from the address space that they are using to
> > other ranges, consolidating their usage and reducing fragmentation.
> > Let's call this a rebalancing of memory resources.
> >
> > When the NVMe controllers are moved, a new value is written into their
> > BAR.  In general, the PCI spec would require that you clear the memory
> > enable bit in the command register (PCI_COMMAND_MEMORY) during this
> > move operation, both so that there's never a moment when two devices
> > are occupying the same address space and because writing a 64-bit BAR
> > atomically isn't possible.  This is the reason that I originally wrote
> > the code in this driver to unmap the device from the VM's address
> > space when the memory enable bit is cleared.
> >
> > What I don't know is whether this sequence of operations can ever
> > happen in Linux, or perhaps in a VM running Linux.  Will it rebalance
> > resources in order to consolidate address space?  If it will, will
> > this involve clearing the memory enable bit to ensure that two devices
> > never overlap?
> 
> This sequence definitely can occur in Linux, but it hasn't yet become a real
> priority.  But we do already have issues with assigning space for hot-added
> devices in general, especially if firmware hasn't assigned large windows to
> things like Thunderbolt controllers.  I suspect that we have or will soon have
> issues where resource assignment starts failing after a few hotplugs, e.g.,
> dock/undock events.
> 
> There have been patches posted to rebalance resources (quiesce drivers,
> reassign, restart drivers), but they haven't gone anywhere yet for lack of
> interest and momentum.  I do feel like we're on the tracks and the train is coming,
> though ;)
> 
> Bjorn

Thanks everybody for responding to my questions.

Bjorn, from your response, it sounds like this change is safe until some possible future in which new functionality is introduced for rebalancing resources.

Dexuan, I don't have any further objection to the patch.

-- Jake Oshins

* RE: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-28 19:21         ` [EXTERNAL] " Jake Oshins
@ 2022-04-29  1:11           ` Dexuan Cui
  2022-04-29  9:43             ` Lorenzo Pieralisi
  0 siblings, 1 reply; 9+ messages in thread
From: Dexuan Cui @ 2022-04-29  1:11 UTC (permalink / raw)
  To: Jake Oshins, bhelgaas, Bjorn Helgaas, Lorenzo Pieralisi, wei.liu
  Cc: Alex Williamson, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	linux-hyperv, linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

> From: Jake Oshins <jakeo@microsoft.com>
> Sent: Thursday, April 28, 2022 12:21 PM
> ... 
> Thanks everybody for responding to my questions.
> 
> Bjorn, from your response, it sounds like this change is safe until some possible
> future in which new functionality is introduced for rebalancing resources.
> 
> Dexuan, I don't have any further objection to the patch.
> 
> -- Jake Oshins

Thanks all for the informative discussion!

@Bjorn, Lorenzo: can this patch go through the hyperv tree?
I see some recent pci-hyperv commits in the hyperv tree:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git

* Re: [EXTERNAL] Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time
  2022-04-29  1:11           ` Dexuan Cui
@ 2022-04-29  9:43             ` Lorenzo Pieralisi
  0 siblings, 0 replies; 9+ messages in thread
From: Lorenzo Pieralisi @ 2022-04-29  9:43 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: Jake Oshins, bhelgaas, Bjorn Helgaas, wei.liu, Alex Williamson,
	KY Srinivasan, Haiyang Zhang, Stephen Hemminger, linux-hyperv,
	linux-pci, linux-kernel, Michael Kelley (LINUX),
	robh, kw, kvm

On Fri, Apr 29, 2022 at 01:11:29AM +0000, Dexuan Cui wrote:
> > From: Jake Oshins <jakeo@microsoft.com>
> > Sent: Thursday, April 28, 2022 12:21 PM
> > ... 
> > Thanks everybody for responding to my questions.
> > 
> > Bjorn, from your response, it sounds like this change is safe until some possible
> > future in which new functionality is introduced for rebalancing resources.
> > 
> > Dexuan, I don't have any further objection to the patch.
> > 
> > -- Jake Oshins
> 
> Thanks all for the informative discussion!
> 
> @Bjorn, Lorenzo: can this patch go through the hyperv tree?
> I see some recent pci-hyperv commits in the hyperv tree:
> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git

For this patch it should be fine as long as it does not become a rule;
we still want to be in the review loop for PCI Hyper-V changes.

I will comment and ACK shortly.

Lorenzo

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-21 18:26 [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time Dexuan Cui
2022-04-26 16:44 ` Lorenzo Pieralisi
2022-04-26 18:31   ` Dexuan Cui
2022-04-26 19:25     ` Jake Oshins
2022-04-27 22:41       ` Alex Williamson
2022-04-28 19:12       ` Bjorn Helgaas
2022-04-28 19:21         ` [EXTERNAL] " Jake Oshins
2022-04-29  1:11           ` Dexuan Cui
2022-04-29  9:43             ` Lorenzo Pieralisi
