Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Sinan Kaya <okaya@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH v2] arm64: acpi/pci: invoke _DSM whether to preserve firmware PCI setup
Date: Fri, 14 Jun 2019 07:59:20 +1000
Message-ID: <2b5302fead4075fd73e7821af27ccd16a0e487ae.camel@kernel.crashing.org> (raw)
In-Reply-To: <20190613190248.GH13533@google.com>

On Thu, 2019-06-13 at 14:02 -0500, Bjorn Helgaas wrote:
> > +	/*
> > +	 * Invoke the PCI device specific method (_DSM) #5 'Ignore PCI Boot
> > +	 * Configuration', on the host bridge. This tells us whether the
> > +	 * firmware wants us to preserve or reassign the configuration of
> > +	 * the PCI resource tree for this root bridge.
> > +	 *
> > +	 * There are three possible outcomes here:
> > +	 *
> > +	 *  - _DSM #5 is absent. This is the default. Currently it will be
> > +	 *    architecture specific in order to maintain existing behaviours
> > +	 *    but the plan is to move arm64 into the fold: x86 and ia64 will
> > +	 *    claim the existing config, and reassign if needed. arm64 will
> > +	 *    always reassign.
> 
> The spec (PCI FW r3.2) says a _DSM that returns 0 means "OS must not
> ignore config done by firmware".  The ECR in the works changes the
> wording to something like "OS must preserve config done by firmware",
> which is equivalent but clearer.
>
> The r3.2 spec goes on to suggest that a missing _DSM means the same
> thing ("OS must not ignore firmware config").  I think that part is
> crap, and the ECR removes that wording.
> 
> I don't accept the _DSM #5 section in the PCI FW spec as being
> normative about what a missing _DSM #5 means.  This section didn't
> even exist until r3.2, and all it says is "this situation is the same
> as the legacy situation where this _DSM is not provided".  That's just
> hand-waving; it's not a requirement.
> 
> I'm not aware of any spec that says the OS can't change PCI resources
> (if there is such a spec, please cite it).
> 
> So my opinion is that a missing _DSM means nothing, and the default
> situation is that the OS can change PCI resources as necessary.

Sure, but what do you think of my wording in the patch ? This is pretty
much what I've done: Keep the current (default) behaviour, which on
x86/ia64 is to claim what's there if it looks sensible and change thing
as deemed fit (reassigned what's not sensible, extent HP bridge
resources etc...).

This patch doesn't change the arm64 behaviour in that case which is to
currently reallocate everything but Lorenzo seems to agree that this
isn't great and we should change towards a more x86-like default, which
I'm working on, but I want to keep that change separate from this
patch, if anything for bisectability of potential regressions.

> The ECR input from Windows was that in the absence of a _DSM #5, they
> keep the boot configuration unless a FW bug causes an overlap, a
> hot-add requires rebalancing, or the system includes external (e.g.,
> Thunderbolt) devices.  That's what I think Linux should do, too: keep
> the config from firmware unless we have a reason to change it.

Right, and this is what x86 does, and this is what we'll eventually do
on arm64 as above.

> > +	 *  - _DSM #5 exists and is 1. This is the FW telling us to ignore
> > +	 *    the configuration it performed. This is currently only supported
> > +	 *    on arm64.
> 
> The r3.2 spec actually says "the OS *may* ignore config done by
> firmware".  There's no *requirement* that the OS change anything.
> 
> IMHO *this* is the same as the case where there's no _DSM at all.

May or may not be. Seattle platforms here tells us explicitely that we
should ignore the configuration, unless I'm mistaken. Lorenzo do you
know better here ?

> > +	 *  - _DSM #5 exists and is 0. This should be the same as the default
> > +	 *    (_DSM #5 absent). However there are some assumptions flying around
> > +	 *    that this means we must keep the FW configuration intact. So we
> > +	 *    treat that as "probe only" for the time being. This is currently
> > +	 *    only supported on arm64.
> 
> PCI FW r3.2 says 0 means "the OS must not ignore config done by
> firmware."  That means we must keep the FW configuration intact.

So "must not ignore" doesn't mean "keep intact" in my book. In fact, to
be honest, the wording in the spec pretty much means the same for all 3
cases, ie, you "may ignore" or "must not ignore" is bullshit. I can
perfectly "not ignore but decide it's bad and change it anyway". This
is all a terrible wording.

Like all firmware specs, it's designed for people to misinterpret it
and get things wrong :-)

What we know is that there is one platform that implements _DSM #5
today and returns 1 with the intention, unless I misread what I was
told, to tell us to explicitly ignore what FW did.

> > +	obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1,
> > +	                        IGNORE_PCI_BOOT_CONFIG_DSM, NULL);
> > +	if (obj && obj->type == ACPI_TYPE_INTEGER) {
> > +		if (obj->integer.value == 1)
> > +			host_bridge->rsrc_policy = pci_rsrc_reassign;
> > +		else
> > +			host_bridge->rsrc_policy = pci_rsrc_probe_only;
> > +	} else {
> > +		/* Default is arch specific ... for now */
> > +#ifdef CONFIG_ARM64
> > +		host_bridge->rsrc_policy = pci_rsrc_reassign;
> > +#else
> > +		host_bridge->rsrc_policy = pci_rsrc_normal;
> > +#endif
> > +	}
> 
> I think this needs to be a single bit, not a 3-choice thing. 

Well, no :-)

The enum in host_bridge needs to be 3 choices because it's meant to
represent the 3 policies that Linux has today sprinkled accross
architectures in rather inconsistent ways, but I'm not going to break
that in one patch and there are reasons for all 3 to exist. We can
discuss that separately if you want.

I plan to start cleaning up code to use that field to represent the
policies properly accross the board as I also consolidate code accross
archs/platforms.

However, what can possibly be debated here is how _DSM #5 maps to those
3. You think it should map to only two of them, I tend to disagree with
your arguments above but short of sitting on the spec commitee and
clarifying the wording, it's a matter of how has the pointiest hat and
the rounder crystal ball here.

>  I don't
> think it's possible to clearly explain how pci_rsrc_normal is
> different from pci_rsrc_reassign.  We either need to preserve the
> config or we don't.

No. "reassign" is "must reassign" or "must ignore FW settings". It may
not be what we want to do with ACPI _DSM #5 = 1, but it's definitely
what all current ARM32, powerpc4xx, and in general most embedded
platforms want today. In fact we have more platforms in the tree who
want just that.

Now you may want to make the argument that those platforms shouldn't
want that, but changing it would probably break a good half of them in
one go, that's what they've always done and I'm not here to mess around
with this.

Once I've consolidated more code, it might become easier on a per
platform basis to chose to switch to the "use what's there and reassign
what's broken or missing" model (ie, "pci_rsrc_normal"), but that will
have to be done on a case by case basis with appropriate testing etc..

> A middle ground of "we don't need to preserve the config and in fact
> we *must* reassign resources" is pointless because we don't know *why*
> we have to reassign things, and we don't know what sort of change
> would be correct.

For the ACPI case, maybe ... for all the other cases this is pretty
moot. Reassign all is what they do today, and changing it will probably
break them.

Cheers,
Ben.


> > +	ACPI_FREE(obj);
> > +
> >  	pci_scan_child_bus(bus);
> >  	pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
> >  				    info);
> > diff --git a/include/linux/pci-acpi.h b/include/linux/pci-acpi.h
> > index 8082b612f561..62b7fdcc661c 100644
> > --- a/include/linux/pci-acpi.h
> > +++ b/include/linux/pci-acpi.h
> > @@ -107,9 +107,10 @@ static inline void acpiphp_check_host_bridge(struct acpi_device *adev) { }
> >  #endif
> >  
> >  extern const guid_t pci_acpi_dsm_guid;
> > -#define DEVICE_LABEL_DSM	0x07
> > -#define RESET_DELAY_DSM		0x08
> > -#define FUNCTION_DELAY_DSM	0x09
> > +#define IGNORE_PCI_BOOT_CONFIG_DSM	0x05
> > +#define DEVICE_LABEL_DSM		0x07
> > +#define RESET_DELAY_DSM			0x08
> > +#define FUNCTION_DELAY_DSM		0x09
> >  
> >  #else	/* CONFIG_ACPI */
> >  static inline void acpi_pci_add_bus(struct pci_bus *bus) { }
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index dd436da7eccc..7ff5cedb30cf 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -486,6 +486,12 @@ static inline int pci_channel_offline(struct pci_dev *pdev)
> >  	return (pdev->error_state != pci_channel_io_normal);
> >  }
> >  
> > +enum pci_host_rsrc_policy {
> > +	pci_rsrc_normal,	/* Probe and (re)assign what's missing/broken */
> > +	pci_rsrc_probe_only,	/* Probe only */
> > +	pci_rsrc_reassign,	/* Reassign resources */
> > +};
> > +
> >  struct pci_host_bridge {
> >  	struct device	dev;
> >  	struct pci_bus	*bus;		/* Root bus */
> > @@ -506,6 +512,10 @@ struct pci_host_bridge {
> >  	unsigned int	native_shpc_hotplug:1;	/* OS may use SHPC hotplug */
> >  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
> >  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
> > +
> > +	/* Resource assignment/allocation policy */
> > +	enum pci_host_rsrc_policy rsrc_policy;
> > +
> >  	/* Resource alignment requirements */
> >  	resource_size_t (*align_resource)(struct pci_dev *dev,
> >  			const struct resource *res,
> > 
> > 


  reply index

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-13  7:54 Benjamin Herrenschmidt
2019-06-13 19:02 ` Bjorn Helgaas
2019-06-13 21:59   ` Benjamin Herrenschmidt [this message]
2019-06-13 23:07   ` Benjamin Herrenschmidt
2019-06-14  7:42     ` Ard Biesheuvel
2019-06-14  8:36       ` Benjamin Herrenschmidt
2019-06-14  9:57         ` Lorenzo Pieralisi
2019-06-14 10:36           ` Benjamin Herrenschmidt
2019-06-14 10:43             ` Benjamin Herrenschmidt
2019-06-14 13:12               ` Bjorn Helgaas
2019-06-14 13:48                 ` Benjamin Herrenschmidt
2019-06-14 20:13                   ` Bjorn Helgaas
2019-06-15  1:18                     ` Benjamin Herrenschmidt
2019-06-14 13:09         ` Bjorn Helgaas
2019-06-14 13:46           ` Benjamin Herrenschmidt

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2b5302fead4075fd73e7821af27ccd16a0e487ae.camel@kernel.crashing.org \
    --to=benh@kernel.crashing.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=helgaas@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=okaya@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org linux-pci@archiver.kernel.org
	public-inbox-index linux-pci


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/ public-inbox