linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pci: call _cond_resched() after pci_bus_write_config
@ 2021-10-13 12:55 menglong8.dong
  2021-10-13 19:00 ` Bjorn Helgaas
  0 siblings, 1 reply; 3+ messages in thread
From: menglong8.dong @ 2021-10-13 12:55 UTC (permalink / raw)
  To: bhelgaas; +Cc: linux-pci, linux-kernel, Menglong Dong

From: Menglong Dong <imagedong@tencent.com>

While the system is running in KVM, pci config writing for virtio devices
may cost long time(about 1-2ms), as it causes VM-exit. During
__pci_bus_assign_resources(), pci_setup_bridge, which can do pci config
writing up to 10 times, can be called many times without any
_cond_resched(). So __pci_bus_assign_resources can cause 25+ms scheduling
latency with !CONFIG_PREEMPT.

To solve this problem, call _cond_resched() after pci config writing.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
---
 drivers/pci/access.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 46935695cfb9..babed43702df 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -57,6 +57,7 @@ int noinline pci_bus_write_config_##size \
 	pci_lock_config(flags);						\
 	res = bus->ops->write(bus, devfn, pos, len, value);		\
 	pci_unlock_config(flags);					\
+	_cond_resched();						\
 	return res;							\
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] pci: call _cond_resched() after pci_bus_write_config
  2021-10-13 12:55 [PATCH] pci: call _cond_resched() after pci_bus_write_config menglong8.dong
@ 2021-10-13 19:00 ` Bjorn Helgaas
  2021-10-14  2:34   ` Menglong Dong
  0 siblings, 1 reply; 3+ messages in thread
From: Bjorn Helgaas @ 2021-10-13 19:00 UTC (permalink / raw)
  To: menglong8.dong; +Cc: bhelgaas, linux-pci, linux-kernel, Menglong Dong

Match previous subject lines (use "git log --oneline
drivers/pci/access.c" to see them).

On Wed, Oct 13, 2021 at 08:55:42PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <imagedong@tencent.com>
> 
> While the system is running in KVM, pci config writing for virtio devices
> may cost long time(about 1-2ms), as it causes VM-exit. During
> __pci_bus_assign_resources(), pci_setup_bridge, which can do pci config
> writing up to 10 times, can be called many times without any
> _cond_resched(). So __pci_bus_assign_resources can cause 25+ms scheduling
> latency with !CONFIG_PREEMPT.
> 
> To solve this problem, call _cond_resched() after pci config writing.

s/pci/PCI/ above.
Add space before "(".
Add "()" after function names consistently (some have it, some don't).

What exactly is the problem?  I expect __pci_bus_assign_resources() to
be used mostly during boot-time enumeration.  How much of a problem is
the latency at that point?  Why is this particularly a problem in the
KVM environment?  Or is it also a problem on bare metal?

Are there other config write paths that should have a similar change?

_cond_resched() only appears here:

  $ git grep "\<_cond_resched\>"
  include/linux/sched.h:static __always_inline int _cond_resched(void)
  include/linux/sched.h:static inline int _cond_resched(void)
  include/linux/sched.h:static inline int _cond_resched(void) { return 0; }
  include/linux/sched.h:  _cond_resched();

so I don't believe PCI is so special that this needs to be the only
other use.  Maybe a different resched interface is more appropriate?

> Signed-off-by: Menglong Dong <imagedong@tencent.com>
> ---
>  drivers/pci/access.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/access.c b/drivers/pci/access.c
> index 46935695cfb9..babed43702df 100644
> --- a/drivers/pci/access.c
> +++ b/drivers/pci/access.c
> @@ -57,6 +57,7 @@ int noinline pci_bus_write_config_##size \
>  	pci_lock_config(flags);						\
>  	res = bus->ops->write(bus, devfn, pos, len, value);		\
>  	pci_unlock_config(flags);					\
> +	_cond_resched();						\
>  	return res;							\
>  }
>  
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] pci: call _cond_resched() after pci_bus_write_config
  2021-10-13 19:00 ` Bjorn Helgaas
@ 2021-10-14  2:34   ` Menglong Dong
  0 siblings, 0 replies; 3+ messages in thread
From: Menglong Dong @ 2021-10-14  2:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Bjorn Helgaas, linux-pci, LKML, Menglong Dong

Hello,

On Thu, Oct 14, 2021 at 3:00 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
[...]
>
> s/pci/PCI/ above.
> Add space before "(".
> Add "()" after function names consistently (some have it, some don't).

Thanks, get it!

>
> What exactly is the problem?  I expect __pci_bus_assign_resources() to
> be used mostly during boot-time enumeration.  How much of a problem is
> the latency at that point?  Why is this particularly a problem in the
> KVM environment?  Or is it also a problem on bare metal?
>

In fact, this is a problem on KVM when hotplug virtual devices. The
initialization
of this devices will be done in a workqueue, which can be seen from the call
stack:

62485   62485   kworker/u8:0    pcibios_resource_survey_bus
        b'pcibios_resource_survey_bus+0x1 [kernel]'
        b'acpiphp_check_bridge.part.13+0x11c [kernel]'
        b'acpiphp_hotplug_notify+0x14b [kernel]'
        b'acpi_device_hotplug+0xe0 [kernel]'
        b'acpi_hotplug_work_fn+0x1e [kernel]'
        b'process_one_work+0x19f [kernel]'
        b'worker_thread+0x37 [kernel]'
        b'kthread+0x117 [kernel]'
        b'ret_from_fork+0x24 [kernel]'

And __pci_bus_assign_resources() will be called too. However, as the device
is virtual, it is simulated by qemu, which makes its pci config can't be written
directly. During pci config writing, it will cause KVM exit guest mode and qemu
in HOST can process the pci config writing. Therefore, the kworker in KVM will
block the CPU.

It is't a problem on bare metal.

The latency can be different in different machines. With 4-core CPU and 2G
memory, single pci config writing can cost 1-2ms, and
__pci_bus_assign_resources()
can cost up to 30ms.


> Are there other config write paths that should have a similar change?
>
> _cond_resched() only appears here:
>
>   $ git grep "\<_cond_resched\>"
>   include/linux/sched.h:static __always_inline int _cond_resched(void)
>   include/linux/sched.h:static inline int _cond_resched(void)
>   include/linux/sched.h:static inline int _cond_resched(void) { return 0; }
>   include/linux/sched.h:  _cond_resched();
>
> so I don't believe PCI is so special that this needs to be the only
> other use.  Maybe a different resched interface is more appropriate?

Seems _cond_resched() is not directly used any more. cond_resched()
should be used here.

Thanks!
Menglong Dong

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-14  2:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-13 12:55 [PATCH] pci: call _cond_resched() after pci_bus_write_config menglong8.dong
2021-10-13 19:00 ` Bjorn Helgaas
2021-10-14  2:34   ` Menglong Dong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).