All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-pci@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Rob Herring <robh@kernel.org>, Clint Sbisa <csbisa@amazon.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Sunil Muthuswamy <sunilmut@microsoft.com>,
	Arnd Bergmann <arnd@arndb.de>, Marc Zyngier <maz@kernel.org>
Subject: Re: [RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata
Date: Fri, 19 Mar 2021 16:12:46 -0500	[thread overview]
Message-ID: <20210319211246.GA250618@bjorn-Precision-5520> (raw)
In-Reply-To: <20210319161956.2838291-2-boqun.feng@gmail.com>

[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
pci_register_host_bridge() interface"), which I think would make my
idea below possible), Marc (IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent 
data.

> However, for a virtualized PCI bus, there might be no enough data in of
> or acpi table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no apci table lookup is required.
> 
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
> 
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>  
>  extern int isa_dma_bridge_buggy;
>  
> +struct pci_sysdata {
> +	int domain;	/* PCI domain */
> +	int node;	/* NUMA Node */
> +#ifdef CONFIG_ACPI
> +	struct acpi_device *companion;	/* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +	void *fwnode;			/* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr().
I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> +	if (bus->ops->use_arch_sysdata)
> +		return pci_domain_nr(bus);
>  	return 1;

I don't understand this.  pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00            or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".  After these patches, pci_proc_domain()
returns 0 if CONFIG_PCI_DOMAINS_GENERIC=y and "bus" is in domain 0,
so buses in domain 0 will be "00" instead of "0000:00".

This doesn't make sense to me, but at the very least, this
user-visible change needs to be explained.

>  }
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> +	struct pci_sysdata *sd = bus->sysdata;
> +
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->fwnode;
> +
> +	/*
> +	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +	 * be queried from of/acpi.
> +	 */
> +	return NULL;
> +}
> +#define pci_root_bus_fwnode	_pci_root_bus_fwnode

Ugh.  pci_root_bus_fwnode() is another callback to find the
irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
itself is only called from pci_set_bus_msi_domain().  This feels like
another case where we could simplify things by having the host bridge
driver figure out the irq_domain explicitly when it creates the
pci_host_bridge.  It seems like that's where we have the most
information about how to find the irq_domain.

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>  
>  #endif  /* __ASM_PCI_H */
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg = bus->sysdata;
> -	struct acpi_device *adev = to_acpi_device(cfg->parent);
> -	struct acpi_pci_root *root = acpi_driver_data(adev);
> +	struct pci_sysdata *sd = bus->sysdata;
> +	struct acpi_device *adev;
> +	struct acpi_pci_root *root;
> +
> +	/* struct pci_sysdata has domain nr in it */
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->domain;
> +
> +	/* or pci_config_window is used as sysdata */
> +	adev = to_acpi_device(cfg->parent);
> +	root = acpi_driver_data(adev);

My comments above are a lot of hand-waving without a very clear way
forward.  Would it simplify things to just add a "struct
pci_config_window *ecam_info" to pci_host_bridge, so we wouldn't have
to overload sysdata?

>  	return root->segment;
>  }
>  
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  {
> -	if (!acpi_disabled) {
> +	if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
>  		struct pci_config_window *cfg = bridge->bus->sysdata;
>  		struct acpi_device *adev = to_acpi_device(cfg->parent);
>  		struct device *bus_dev = &bridge->bus->dev;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..4036aac40361 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -740,6 +740,9 @@ struct pci_ops {
>  	void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
>  	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
>  	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +	int use_arch_sysdata;	/* ->sysdata is arch-specific */
> +#endif
>  };
>  
>  /*
> -- 
> 2.30.2
> 

WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Boqun Feng <boqun.feng@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-pci@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Rob Herring <robh@kernel.org>, Clint Sbisa <csbisa@amazon.com>,
	Ard Biesheuvel <ardb@kernel.org>,
	Sunil Muthuswamy <sunilmut@microsoft.com>,
	Arnd Bergmann <arnd@arndb.de>, Marc Zyngier <maz@kernel.org>
Subject: Re: [RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata
Date: Fri, 19 Mar 2021 16:12:46 -0500	[thread overview]
Message-ID: <20210319211246.GA250618@bjorn-Precision-5520> (raw)
In-Reply-To: <20210319161956.2838291-2-boqun.feng@gmail.com>

[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
pci_register_host_bridge() interface"), which I think would make my
idea below possible), Marc (IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent 
data.

> However, for a virtualized PCI bus, there might be no enough data in of
> or acpi table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no apci table lookup is required.
> 
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
> 
> Originally-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>  
>  extern int isa_dma_bridge_buggy;
>  
> +struct pci_sysdata {
> +	int domain;	/* PCI domain */
> +	int node;	/* NUMA Node */
> +#ifdef CONFIG_ACPI
> +	struct acpi_device *companion;	/* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +	void *fwnode;			/* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr().
I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> +	if (bus->ops->use_arch_sysdata)
> +		return pci_domain_nr(bus);
>  	return 1;

I don't understand this.  pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00            or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".  After these patches, pci_proc_domain()
returns 0 if CONFIG_PCI_DOMAINS_GENERIC=y and "bus" is in domain 0,
so buses in domain 0 will be "00" instead of "0000:00".

This doesn't make sense to me, but at the very least, this
user-visible change needs to be explained.

>  }
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> +	struct pci_sysdata *sd = bus->sysdata;
> +
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->fwnode;
> +
> +	/*
> +	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +	 * be queried from of/acpi.
> +	 */
> +	return NULL;
> +}
> +#define pci_root_bus_fwnode	_pci_root_bus_fwnode

Ugh.  pci_root_bus_fwnode() is another callback to find the
irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
itself is only called from pci_set_bus_msi_domain().  This feels like
another case where we could simplify things by having the host bridge
driver figure out the irq_domain explicitly when it creates the
pci_host_bridge.  It seems like that's where we have the most
information about how to find the irq_domain.

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>  
>  #endif  /* __ASM_PCI_H */
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg = bus->sysdata;
> -	struct acpi_device *adev = to_acpi_device(cfg->parent);
> -	struct acpi_pci_root *root = acpi_driver_data(adev);
> +	struct pci_sysdata *sd = bus->sysdata;
> +	struct acpi_device *adev;
> +	struct acpi_pci_root *root;
> +
> +	/* struct pci_sysdata has domain nr in it */
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->domain;
> +
> +	/* or pci_config_window is used as sysdata */
> +	adev = to_acpi_device(cfg->parent);
> +	root = acpi_driver_data(adev);

My comments above are a lot of hand-waving without a very clear way
forward.  Would it simplify things to just add a "struct
pci_config_window *ecam_info" to pci_host_bridge, so we wouldn't have
to overload sysdata?

>  	return root->segment;
>  }
>  
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  {
> -	if (!acpi_disabled) {
> +	if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
>  		struct pci_config_window *cfg = bridge->bus->sysdata;
>  		struct acpi_device *adev = to_acpi_device(cfg->parent);
>  		struct device *bus_dev = &bridge->bus->dev;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..4036aac40361 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -740,6 +740,9 @@ struct pci_ops {
>  	void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
>  	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
>  	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +	int use_arch_sysdata;	/* ->sysdata is arch-specific */
> +#endif
>  };
>  
>  /*
> -- 
> 2.30.2
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-19 21:13 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-19 16:19 [RFC 0/2] PCI: Introduce pci_ops::use_arch_sysdata Boqun Feng
2021-03-19 16:19 ` Boqun Feng
2021-03-19 16:19 ` [RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata Boqun Feng
2021-03-19 16:19   ` Boqun Feng
2021-03-19 21:12   ` Bjorn Helgaas [this message]
2021-03-19 21:12     ` Bjorn Helgaas
2021-03-20 12:54     ` Marc Zyngier
2021-03-20 12:54       ` Marc Zyngier
2021-03-20 13:03       ` Arnd Bergmann
2021-03-20 13:03         ` Arnd Bergmann
2021-03-20 13:23         ` Marc Zyngier
2021-03-20 13:23           ` Marc Zyngier
2021-03-20 14:24           ` Arnd Bergmann
2021-03-20 14:24             ` Arnd Bergmann
2021-03-20 17:14             ` Marc Zyngier
2021-03-20 17:14               ` Marc Zyngier
2021-03-20 12:54     ` Arnd Bergmann
2021-03-20 12:54       ` Arnd Bergmann
2021-03-20 16:09       ` Arnd Bergmann
2021-03-20 16:09         ` Arnd Bergmann
2021-03-29 14:32         ` Boqun Feng
2021-03-29 14:32           ` Boqun Feng
2021-03-29 14:43           ` Arnd Bergmann
2021-03-29 14:43             ` Arnd Bergmann
2021-03-20 12:52   ` Arnd Bergmann
2021-03-20 12:52     ` Arnd Bergmann
2021-03-19 16:19 ` [RFC 2/2] PCI: hv: Tell PCI core arch-specific sysdata is used Boqun Feng
2021-03-19 16:19   ` Boqun Feng
2021-03-19 19:04 ` [RFC 0/2] PCI: Introduce pci_ops::use_arch_sysdata Bjorn Helgaas
2021-03-19 19:04   ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210319211246.GA250618@bjorn-Precision-5520 \
    --to=helgaas@kernel.org \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=bhelgaas@google.com \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=csbisa@amazon.com \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=maz@kernel.org \
    --cc=robh@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=sunilmut@microsoft.com \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.