All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT
@ 2022-09-17  9:09 Mateusz Jończyk
  2022-09-27 19:28 ` Bjorn Helgaas
  2022-11-12  0:20 ` Bjorn Helgaas
  0 siblings, 2 replies; 9+ messages in thread
From: Mateusz Jończyk @ 2022-09-17  9:09 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linux-acpi
  Cc: Mateusz Jończyk, Jonathan Corbet, Bjorn Helgaas,
	Rafael J. Wysocki, Len Brown, Borislav Petkov

On some platforms, the ACPI _PRT function returns duplicate interrupt
routing entries. Linux uses the first matching entry, but sometimes the
second matching entry contains the correct interrupt vector.

This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
SMBus controller. This controller was nonfunctional unless its interrupt
usage was disabled (using the "disable_features=0x10" module parameter).

After investigation, it turned out that the driver was using an
incorrect interrupt vector: in lspci output for this device there was:
        Interrupt: pin B routed to IRQ 19
but after running i2cdetect (without using any i2c-i801 module
parameters) the following was logged to dmesg:

        [...]
        [  132.248657] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        [  132.248669] i801_smbus 0000:00:1f.3: Transaction timeout
        [  132.452649] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        [  132.452662] i801_smbus 0000:00:1f.3: Transaction timeout
        [  132.467682] irq 17: nobody cared (try booting with the "irqpoll" option)

Existence of duplicate entries in a table returned by the _PRT method
was confirmed by disassembling the ACPI DSTD table.

Linux used the first matching entry, which was incorrect. In order not
to disrupt existing systems, use the first matching entry unless the
pci=prtlast kernel parameter is used or a Dell Latitude E6500 laptop is
detected.

Disclaimer: there is nothing really interesting connected to the SMBus
controller on this laptop, but this change may help other systems.

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Borislav Petkov <bp@suse.de>

---
v2: do not quote the disassembled ACPI DSDT table - for copyright reasons.
---
 .../admin-guide/kernel-parameters.txt         |  8 ++
 drivers/acpi/pci_irq.c                        | 89 ++++++++++++++++++-
 drivers/pci/pci.c                             |  9 ++
 3 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 426fa892d311..2ff351db10b8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4190,6 +4190,14 @@
 				bridge windows. This is the default on modern
 				hardware. If you need to use this, please report
 				a bug to <linux-pci@vger.kernel.org>.
+		prtlast		If the _PRT ACPI method returns duplicate
+				IRQ routing entries, use the last matching entry
+				for a given device. If the platform may be
+				affected by this problem, an error message is
+				printed to dmesg - this parameter is
+				ineffective otherwise. If you need to use this,
+				please report a bug to
+				<linux-pci@vger.kernel.org>.
 		routeirq	Do IRQ routing for all PCI devices.
 				This is normally done in pci_enable_device(),
 				so this option is a temporary workaround
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index 08e15774fb9f..5cead840de0b 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -196,12 +196,73 @@ static int acpi_pci_irq_check_entry(acpi_handle handle, struct pci_dev *dev,
 	return 0;
 }
 
+extern bool pci_prtlast;
+
+static const struct dmi_system_id pci_prtlast_dmi[] = {
+	{
+		.ident = "Dell Latitude E6500",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude E6500"),
+		},
+	},
+	{ }
+};
+
+static bool acpi_pci_prt_use_last(struct acpi_prt_entry *curr,
+				  const char *current_source,
+				  const char *previous_match_source,
+				  int previous_match_index)
+{
+	bool ret;
+	const struct dmi_system_id *id;
+	const int msg_bufsize = 512;
+	char *msg = kmalloc(msg_bufsize, GFP_KERNEL);
+
+	if (!msg)
+		return false;
+
+	snprintf(msg, msg_bufsize,
+		 FW_BUG
+		 "ACPI _PRT returned duplicate IRQ routing entries for PCI device "
+		 "%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d]. ",
+		 curr->id.segment, curr->id.bus, curr->id.device,
+		 pin_name(curr->pin),
+		 previous_match_source, previous_match_index,
+		 current_source, curr->index);
+
+	id = dmi_first_match(pci_prtlast_dmi);
+
+	if (id) {
+		pr_warn("%s%s detected, using last entry.\n",
+			msg, id->ident);
+
+		ret = true;
+	} else if (pci_prtlast) {
+		pr_err(
+"%sUsing last entry, as directed on the command line. If this helps, report a bug.\n",
+		       msg);
+
+		ret = true;
+	} else {
+		pr_err("%sIf necessary, use \"pci=prtlast\" and report a bug.\n",
+		       msg);
+
+		ret = false;
+	}
+
+	kfree(msg);
+	return ret;
+}
+
 static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
-			  int pin, struct acpi_prt_entry **entry_ptr)
+			  int pin, struct acpi_prt_entry **entry_ptr_out)
 {
 	acpi_status status;
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_pci_routing_table *entry;
+	struct acpi_prt_entry *match = NULL;
+	const char *match_source = NULL;
 	acpi_handle handle = NULL;
 
 	if (dev->bus->bridge)
@@ -219,13 +280,33 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 
 	entry = buffer.pointer;
 	while (entry && (entry->length > 0)) {
-		if (!acpi_pci_irq_check_entry(handle, dev, pin,
-						 entry, entry_ptr))
-			break;
+		struct acpi_prt_entry *curr;
+
+		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
+			if (!match) {
+				// first match
+				match = curr;
+				match_source = entry->source;
+			} else if (!acpi_pci_prt_use_last(curr,
+							  entry->source,
+							  match_source,
+							  match->index)) {
+				// duplicates found, use first entry
+				kfree(curr);
+			} else {
+				// duplicates found, use last entry
+				kfree(match);
+				match = curr;
+				match_source = entry->source;
+			}
+		}
+
 		entry = (struct acpi_pci_routing_table *)
 		    ((unsigned long)entry + entry->length);
 	}
 
+	*entry_ptr_out = match;
+
 	kfree(buffer.pointer);
 	return 0;
 }
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 95bc329e74c0..a14a2e4e4197 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -155,6 +155,11 @@ static bool pci_bridge_d3_disable;
 /* Force bridge_d3 for all PCIe ports */
 static bool pci_bridge_d3_force;
 
+#ifdef CONFIG_ACPI
+/* Use the last matching entry from the table returned by the _PRT ACPI method. */
+bool pci_prtlast;
+#endif
+
 static int __init pcie_port_pm_setup(char *str)
 {
 	if (!strcmp(str, "off"))
@@ -6896,6 +6901,10 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+#ifdef CONFIG_ACPI
+			} else if (!strncmp(str, "prtlast", 7)) {
+				pci_prtlast = true;
+#endif
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}

base-commit: 7e18e42e4b280c85b76967a9106a13ca61c16179
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT
  2022-09-17  9:09 [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT Mateusz Jończyk
@ 2022-09-27 19:28 ` Bjorn Helgaas
  2022-09-29 18:03   ` Mateusz Jończyk
  2022-11-12  0:20 ` Bjorn Helgaas
  1 sibling, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2022-09-27 19:28 UTC (permalink / raw)
  To: Mateusz Jończyk
  Cc: linux-kernel, linux-pci, linux-acpi, Jonathan Corbet,
	Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Borislav Petkov

On Sat, Sep 17, 2022 at 11:09:44AM +0200, Mateusz Jończyk wrote:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller was nonfunctional unless its interrupt
> usage was disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         [  132.248657] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         [  132.248669] i801_smbus 0000:00:1f.3: Transaction timeout
>         [  132.452649] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         [  132.452662] i801_smbus 0000:00:1f.3: Transaction timeout
>         [  132.467682] irq 17: nobody cared (try booting with the "irqpoll" option)

Drop the timestamps; they add clutter but not useful information.

> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSTD table.
> 
> Linux used the first matching entry, which was incorrect. In order not
> to disrupt existing systems, use the first matching entry unless the
> pci=prtlast kernel parameter is used or a Dell Latitude E6500 laptop is
> detected.

Do we have a reason to believe that in general, using the first
matching entry is incorrect?  I don't see anything in the ACPI spec
(r6.5, sec 6.2.13) that sheds light on this.

Presumably this works on Windows, and I doubt Windows would have a
platform quirk for this, so I hypothesize that Windows treats _PRT
entries as assignments, and the last one rules.  Maybe Linux should
adopt that rule?

Bjorn

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT
  2022-09-27 19:28 ` Bjorn Helgaas
@ 2022-09-29 18:03   ` Mateusz Jończyk
  0 siblings, 0 replies; 9+ messages in thread
From: Mateusz Jończyk @ 2022-09-29 18:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, linux-pci, linux-acpi, Jonathan Corbet,
	Bjorn Helgaas, Rafael J. Wysocki, Len Brown, Borislav Petkov

W dniu 27.09.2022 o 21:28, Bjorn Helgaas pisze:
> On Sat, Sep 17, 2022 at 11:09:44AM +0200, Mateusz Jończyk wrote:
>> On some platforms, the ACPI _PRT function returns duplicate interrupt
>> routing entries. Linux uses the first matching entry, but sometimes the
>> second matching entry contains the correct interrupt vector.
>>
>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
>> SMBus controller. This controller was nonfunctional unless its interrupt
>> usage was disabled (using the "disable_features=0x10" module parameter).
[...]
>> Existence of duplicate entries in a table returned by the _PRT method
>> was confirmed by disassembling the ACPI DSTD table.
>>
>> Linux used the first matching entry, which was incorrect. In order not
>> to disrupt existing systems, use the first matching entry unless the
>> pci=prtlast kernel parameter is used or a Dell Latitude E6500 laptop is
>> detected.
> Do we have a reason to believe that in general, using the first
> matching entry is incorrect?  I don't see anything in the ACPI spec
> (r6.5, sec 6.2.13) that sheds light on this.
I meant that the entry was incorrect, not that Linux behaviour was incorrect.

I have also searched and browsed the ACPI spec, but have found no general rule
that the OS should use the first or the last matching element from a list (in a general case,
not just _PRT).

> Presumably this works on Windows, and I doubt Windows would have a
> platform quirk for this, so I hypothesize that Windows treats _PRT
> entries as assignments, and the last one rules.  Maybe Linux should
> adopt that rule?

I don't know whether this works on Windows, or just the laptop OEM did not care
about the i2c bus on this model.

As a start, we may just print a warning when the _PRT table contains multiple matching
entries for a given device - to see if there are any other devices that are affected
(and which of the interrupt vectors for them is the correct one). This would be simpler
then my proposed patch.

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT
  2022-09-17  9:09 [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT Mateusz Jończyk
  2022-09-27 19:28 ` Bjorn Helgaas
@ 2022-11-12  0:20 ` Bjorn Helgaas
  2022-11-12 20:07   ` Mateusz Jończyk
  1 sibling, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2022-11-12  0:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-kernel, linux-pci, linux-acpi, Jonathan Corbet,
	Bjorn Helgaas, Len Brown, Borislav Petkov, Mateusz Jończyk,
	Jean Delvare, linux-i2c

[+cc Jean, linux-i2c]

On Sat, Sep 17, 2022 at 11:09:44AM +0200, Mateusz Jończyk wrote:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.

Rafael, Jean, what do you think about this?  It seems like kind of a
lot of infrastructure to deal with this oddness, but I'm not really
opposed to it.

This is in i2c-i801.c, which seems to have some support for polling;
maybe it could make smart enough to complain and automatically switch
to polling if a timeout occurs.

Or maybe we scan the entire _PRT and let the match win (instead of the
first as we do today).

Or ...?

Google finds a lot of hits for "i801_smbus" "timeout waiting for
interrupt", but I can't tell whether they're a similar _PRT issue or
something else.

> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller was nonfunctional unless its interrupt
> usage was disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         [  132.248657] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         [  132.248669] i801_smbus 0000:00:1f.3: Transaction timeout
>         [  132.452649] i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         [  132.452662] i801_smbus 0000:00:1f.3: Transaction timeout
>         [  132.467682] irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSTD table.
> 
> Linux used the first matching entry, which was incorrect. In order not
> to disrupt existing systems, use the first matching entry unless the
> pci=prtlast kernel parameter is used or a Dell Latitude E6500 laptop is
> detected.
> 
> Disclaimer: there is nothing really interesting connected to the SMBus
> controller on this laptop, but this change may help other systems.
> 
> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> 
> ---
> v2: do not quote the disassembled ACPI DSDT table - for copyright reasons.
> ---
>  .../admin-guide/kernel-parameters.txt         |  8 ++
>  drivers/acpi/pci_irq.c                        | 89 ++++++++++++++++++-
>  drivers/pci/pci.c                             |  9 ++
>  3 files changed, 102 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 426fa892d311..2ff351db10b8 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4190,6 +4190,14 @@
>  				bridge windows. This is the default on modern
>  				hardware. If you need to use this, please report
>  				a bug to <linux-pci@vger.kernel.org>.
> +		prtlast		If the _PRT ACPI method returns duplicate
> +				IRQ routing entries, use the last matching entry
> +				for a given device. If the platform may be
> +				affected by this problem, an error message is
> +				printed to dmesg - this parameter is
> +				ineffective otherwise. If you need to use this,
> +				please report a bug to
> +				<linux-pci@vger.kernel.org>.
>  		routeirq	Do IRQ routing for all PCI devices.
>  				This is normally done in pci_enable_device(),
>  				so this option is a temporary workaround
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index 08e15774fb9f..5cead840de0b 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -196,12 +196,73 @@ static int acpi_pci_irq_check_entry(acpi_handle handle, struct pci_dev *dev,
>  	return 0;
>  }
>  
> +extern bool pci_prtlast;
> +
> +static const struct dmi_system_id pci_prtlast_dmi[] = {
> +	{
> +		.ident = "Dell Latitude E6500",
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
> +			DMI_MATCH(DMI_PRODUCT_NAME, "Latitude E6500"),
> +		},
> +	},
> +	{ }
> +};
> +
> +static bool acpi_pci_prt_use_last(struct acpi_prt_entry *curr,
> +				  const char *current_source,
> +				  const char *previous_match_source,
> +				  int previous_match_index)
> +{
> +	bool ret;
> +	const struct dmi_system_id *id;
> +	const int msg_bufsize = 512;
> +	char *msg = kmalloc(msg_bufsize, GFP_KERNEL);
> +
> +	if (!msg)
> +		return false;
> +
> +	snprintf(msg, msg_bufsize,
> +		 FW_BUG
> +		 "ACPI _PRT returned duplicate IRQ routing entries for PCI device "
> +		 "%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d]. ",
> +		 curr->id.segment, curr->id.bus, curr->id.device,
> +		 pin_name(curr->pin),
> +		 previous_match_source, previous_match_index,
> +		 current_source, curr->index);
> +
> +	id = dmi_first_match(pci_prtlast_dmi);
> +
> +	if (id) {
> +		pr_warn("%s%s detected, using last entry.\n",
> +			msg, id->ident);
> +
> +		ret = true;
> +	} else if (pci_prtlast) {
> +		pr_err(
> +"%sUsing last entry, as directed on the command line. If this helps, report a bug.\n",
> +		       msg);
> +
> +		ret = true;
> +	} else {
> +		pr_err("%sIf necessary, use \"pci=prtlast\" and report a bug.\n",
> +		       msg);
> +
> +		ret = false;
> +	}
> +
> +	kfree(msg);
> +	return ret;
> +}
> +
>  static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
> -			  int pin, struct acpi_prt_entry **entry_ptr)
> +			  int pin, struct acpi_prt_entry **entry_ptr_out)
>  {
>  	acpi_status status;
>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	struct acpi_pci_routing_table *entry;
> +	struct acpi_prt_entry *match = NULL;
> +	const char *match_source = NULL;
>  	acpi_handle handle = NULL;
>  
>  	if (dev->bus->bridge)
> @@ -219,13 +280,33 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>  
>  	entry = buffer.pointer;
>  	while (entry && (entry->length > 0)) {
> -		if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -						 entry, entry_ptr))
> -			break;
> +		struct acpi_prt_entry *curr;
> +
> +		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> +			if (!match) {
> +				// first match
> +				match = curr;
> +				match_source = entry->source;
> +			} else if (!acpi_pci_prt_use_last(curr,
> +							  entry->source,
> +							  match_source,
> +							  match->index)) {
> +				// duplicates found, use first entry
> +				kfree(curr);
> +			} else {
> +				// duplicates found, use last entry
> +				kfree(match);
> +				match = curr;
> +				match_source = entry->source;
> +			}
> +		}
> +
>  		entry = (struct acpi_pci_routing_table *)
>  		    ((unsigned long)entry + entry->length);
>  	}
>  
> +	*entry_ptr_out = match;
> +
>  	kfree(buffer.pointer);
>  	return 0;
>  }
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 95bc329e74c0..a14a2e4e4197 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -155,6 +155,11 @@ static bool pci_bridge_d3_disable;
>  /* Force bridge_d3 for all PCIe ports */
>  static bool pci_bridge_d3_force;
>  
> +#ifdef CONFIG_ACPI
> +/* Use the last matching entry from the table returned by the _PRT ACPI method. */
> +bool pci_prtlast;
> +#endif
> +
>  static int __init pcie_port_pm_setup(char *str)
>  {
>  	if (!strcmp(str, "off"))
> @@ -6896,6 +6901,10 @@ static int __init pci_setup(char *str)
>  				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
>  			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
>  				disable_acs_redir_param = str + 18;
> +#ifdef CONFIG_ACPI
> +			} else if (!strncmp(str, "prtlast", 7)) {
> +				pci_prtlast = true;
> +#endif
>  			} else {
>  				pr_err("PCI: Unknown option `%s'\n", str);
>  			}
> 
> base-commit: 7e18e42e4b280c85b76967a9106a13ca61c16179
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT
  2022-11-12  0:20 ` Bjorn Helgaas
@ 2022-11-12 20:07   ` Mateusz Jończyk
  2022-11-12 20:09     ` [PATCH] acpi,pci: warn about " Mateusz Jończyk
  0 siblings, 1 reply; 9+ messages in thread
From: Mateusz Jończyk @ 2022-11-12 20:07 UTC (permalink / raw)
  To: Bjorn Helgaas, Rafael J. Wysocki
  Cc: linux-kernel, linux-pci, linux-acpi, Jonathan Corbet,
	Bjorn Helgaas, Len Brown, Borislav Petkov, Jean Delvare,
	linux-i2c

W dniu 12.11.2022 o 01:20, Bjorn Helgaas pisze:
> [+cc Jean, linux-i2c]
>
> On Sat, Sep 17, 2022 at 11:09:44AM +0200, Mateusz Jończyk wrote:
>> On some platforms, the ACPI _PRT function returns duplicate interrupt
>> routing entries. Linux uses the first matching entry, but sometimes the
>> second matching entry contains the correct interrupt vector.
> Rafael, Jean, what do you think about this?  It seems like kind of a
> lot of infrastructure to deal with this oddness, but I'm not really
> opposed to it.
>
> This is in i2c-i801.c, which seems to have some support for polling;
> maybe it could make smart enough to complain and automatically switch
> to polling if a timeout occurs.
>
> Or maybe we scan the entire _PRT and let the match win (instead of the
> first as we do today).
>
> Or ...?
>
> Google finds a lot of hits for "i801_smbus" "timeout waiting for
> interrupt", but I can't tell whether they're a similar _PRT issue or
> something else.
>
>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
>> SMBus controller. This controller was nonfunctional unless its interrupt
>> usage was disabled (using the "disable_features=0x10" module parameter).

Hello,

I have prepared a lean patch that only prints a warning when there are
two matching entries in the table returned from _PRT (I will send it in the
next e-mail). Perhaps it could be merged and then after a release or two
it will be known how widespread this problem is.

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT
  2022-11-12 20:07   ` Mateusz Jończyk
@ 2022-11-12 20:09     ` Mateusz Jończyk
  2022-11-13 17:34       ` [PATCH v2] " Mateusz Jończyk
  0 siblings, 1 reply; 9+ messages in thread
From: Mateusz Jończyk @ 2022-11-12 20:09 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linux-acpi, linux-i2c
  Cc: Mateusz Jończyk, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Borislav Petkov, Jean Delvare

On some platforms, the ACPI _PRT function returns duplicate interrupt
routing entries. Linux uses the first matching entry, but sometimes the
second matching entry contains the correct interrupt vector.

To check how widespread this problem is, print a warning to dmesg if
this is the case.

This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
SMBus controller. This controller was nonfunctional unless its interrupt
usage was disabled (using the "disable_features=0x10" module parameter).

After investigation, it turned out that the driver was using an
incorrect interrupt vector: in lspci output for this device there was:
        Interrupt: pin B routed to IRQ 19
but after running i2cdetect (without using any i2c-i801 module
parameters) the following was logged to dmesg:

	[...]
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        irq 17: nobody cared (try booting with the "irqpoll" option)

Existence of duplicate entries in the table returned by the _PRT method
was confirmed by disassembling the ACPI DSDT table.

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Delvare <jdelvare@suse.com>
---
 drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index 08e15774fb9f..c3168927446c 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_pci_routing_table *entry;
 	acpi_handle handle = NULL;
+	struct acpi_prt_entry *match = NULL;
+	const char *match_int_source = NULL;
 
 	if (dev->bus->bridge)
 		handle = ACPI_HANDLE(dev->bus->bridge);
@@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 
 	entry = buffer.pointer;
 	while (entry && (entry->length > 0)) {
-		if (!acpi_pci_irq_check_entry(handle, dev, pin,
-						 entry, entry_ptr))
-			break;
+		struct acpi_prt_entry *curr;
+
+		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
+			if (match == NULL) {
+				match = curr;
+				match_int_source = entry->source;
+			} else {
+				pr_warn(FW_BUG
+				"ACPI _PRT returned duplicate IRQ routing entries for device "
+					"%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d]. ",
+					curr->id.segment, curr->id.bus, curr->id.device,
+					pin_name(curr->pin),
+					match_int_source, match->index,
+					entry->source, curr->index);
+				// we use the first matching entry nonetheless
+			}
+		}
+
 		entry = (struct acpi_pci_routing_table *)
 		    ((unsigned long)entry + entry->length);
 	}
 
+	*entry_ptr = match;
+
 	kfree(buffer.pointer);
 	return 0;
 }

base-commit: f0c4d9fc9cc9462659728d168387191387e903cc
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT
  2022-11-12 20:09     ` [PATCH] acpi,pci: warn about " Mateusz Jończyk
@ 2022-11-13 17:34       ` Mateusz Jończyk
  2022-11-15  8:36         ` Jean Delvare
  0 siblings, 1 reply; 9+ messages in thread
From: Mateusz Jończyk @ 2022-11-13 17:34 UTC (permalink / raw)
  To: linux-kernel, linux-pci, linux-acpi, linux-i2c
  Cc: Mateusz Jończyk, Bjorn Helgaas, Rafael J. Wysocki,
	Len Brown, Borislav Petkov, Jean Delvare

On some platforms, the ACPI _PRT function returns duplicate interrupt
routing entries. Linux uses the first matching entry, but sometimes the
second matching entry contains the correct interrupt vector.

Print a warning to dmesg if duplicate interrupt routing entries are
present, so that we could check how many models are affected.

This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
SMBus controller. This controller was nonfunctional unless its interrupt
usage was disabled (using the "disable_features=0x10" module parameter).

After investigation, it turned out that the driver was using an
incorrect interrupt vector: in lspci output for this device there was:
        Interrupt: pin B routed to IRQ 19
but after running i2cdetect (without using any i2c-i801 module
parameters) the following was logged to dmesg:

        [...]
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
        i801_smbus 0000:00:1f.3: Transaction timeout
        irq 17: nobody cared (try booting with the "irqpoll" option)

Existence of duplicate entries in a table returned by the _PRT method
was confirmed by disassembling the ACPI DSDT table.

Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jean Delvare <jdelvare@suse.com>

--
v2: - add a newline at the end of the kernel log message,
    - replace: "if (match == NULL)" -> "if (!match)"
    - patch description tweaks.

Tested on two computers, including the affected Dell Latitude E6500 laptop.

 drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
index 08e15774fb9f..a4e41b7b71ed 100644
--- a/drivers/acpi/pci_irq.c
+++ b/drivers/acpi/pci_irq.c
@@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
 	struct acpi_pci_routing_table *entry;
 	acpi_handle handle = NULL;
+	struct acpi_prt_entry *match = NULL;
+	const char *match_int_source = NULL;
 
 	if (dev->bus->bridge)
 		handle = ACPI_HANDLE(dev->bus->bridge);
@@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
 
 	entry = buffer.pointer;
 	while (entry && (entry->length > 0)) {
-		if (!acpi_pci_irq_check_entry(handle, dev, pin,
-						 entry, entry_ptr))
-			break;
+		struct acpi_prt_entry *curr;
+
+		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
+			if (!match) {
+				match = curr;
+				match_int_source = entry->source;
+			} else {
+				pr_warn(FW_BUG
+				"ACPI _PRT returned duplicate IRQ routing entries for device "
+					"%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d].\n",
+					curr->id.segment, curr->id.bus, curr->id.device,
+					pin_name(curr->pin),
+					match_int_source, match->index,
+					entry->source, curr->index);
+				// we use the first matching entry nonetheless
+			}
+		}
+
 		entry = (struct acpi_pci_routing_table *)
 		    ((unsigned long)entry + entry->length);
 	}
 
+	*entry_ptr = match;
+
 	kfree(buffer.pointer);
 	return 0;
 }

base-commit: f0c4d9fc9cc9462659728d168387191387e903cc
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT
  2022-11-13 17:34       ` [PATCH v2] " Mateusz Jończyk
@ 2022-11-15  8:36         ` Jean Delvare
  2022-11-23 20:28           ` Mateusz Jończyk
  0 siblings, 1 reply; 9+ messages in thread
From: Jean Delvare @ 2022-11-15  8:36 UTC (permalink / raw)
  To: Mateusz Jończyk
  Cc: linux-kernel, linux-pci, linux-acpi, linux-i2c, Bjorn Helgaas,
	Rafael J. Wysocki, Len Brown, Borislav Petkov

Hi Mateusz,

On Sun, 13 Nov 2022 18:34:42 +0100, Mateusz Jończyk wrote:
> On some platforms, the ACPI _PRT function returns duplicate interrupt
> routing entries. Linux uses the first matching entry, but sometimes the
> second matching entry contains the correct interrupt vector.
> 
> Print a warning to dmesg if duplicate interrupt routing entries are
> present, so that we could check how many models are affected.

Excellent idea. We want hardware manufacturers to fix such bugs in the
firmware, and the best way for this to happen is to report them
whenever they are encountered.

> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
> SMBus controller. This controller was nonfunctional unless its interrupt
> usage was disabled (using the "disable_features=0x10" module parameter).
> 
> After investigation, it turned out that the driver was using an
> incorrect interrupt vector: in lspci output for this device there was:
>         Interrupt: pin B routed to IRQ 19
> but after running i2cdetect (without using any i2c-i801 module
> parameters) the following was logged to dmesg:
> 
>         [...]
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>         i801_smbus 0000:00:1f.3: Transaction timeout
>         irq 17: nobody cared (try booting with the "irqpoll" option)
> 
> Existence of duplicate entries in a table returned by the _PRT method
> was confirmed by disassembling the ACPI DSDT table.

Excuse a probably stupid question, but what would happen if we would
plain ignore the IRQ routing information from ACPI in this case? Would
we fallback to some pure-PCI routing logic which may have a chance to
find the right IRQ routing (matching the second ACPI routing entry in
this case)?

> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Jean Delvare <jdelvare@suse.com>
> 
> --
> v2: - add a newline at the end of the kernel log message,
>     - replace: "if (match == NULL)" -> "if (!match)"
>     - patch description tweaks.
> 
> Tested on two computers, including the affected Dell Latitude E6500 laptop.
> 
>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
> index 08e15774fb9f..a4e41b7b71ed 100644
> --- a/drivers/acpi/pci_irq.c
> +++ b/drivers/acpi/pci_irq.c
> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>  	struct acpi_pci_routing_table *entry;
>  	acpi_handle handle = NULL;
> +	struct acpi_prt_entry *match = NULL;
> +	const char *match_int_source = NULL;
>  
>  	if (dev->bus->bridge)
>  		handle = ACPI_HANDLE(dev->bus->bridge);
> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>  
>  	entry = buffer.pointer;
>  	while (entry && (entry->length > 0)) {
> -		if (!acpi_pci_irq_check_entry(handle, dev, pin,
> -						 entry, entry_ptr))
> -			break;
> +		struct acpi_prt_entry *curr;
> +
> +		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
> +			if (!match) {
> +				match = curr;
> +				match_int_source = entry->source;
> +			} else {
> +				pr_warn(FW_BUG
> +				"ACPI _PRT returned duplicate IRQ routing entries for device "
> +					"%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d].\n",

The beginning of the string should be aligned with the opening
parenthesis, and the string should be on a single line (this is a
encouraged exception to the 80-column rule). I would also omit the
tailing dot for consistency.

> +					curr->id.segment, curr->id.bus, curr->id.device,

Is the IRQ per PCI device, or per PCI function? If the latter, then you
should print "%02x.%x" instead of just "%02x", with the extra element
being curr->id.function.

> +					pin_name(curr->pin),
> +					match_int_source, match->index,
> +					entry->source, curr->index);
> +				// we use the first matching entry nonetheless

The rest of the file uses /* C89-style comments */ so I would stick to
that for consistency.

> +			}
> +		}
> +
>  		entry = (struct acpi_pci_routing_table *)
>  		    ((unsigned long)entry + entry->length);
>  	}
>  
> +	*entry_ptr = match;
> +
>  	kfree(buffer.pointer);
>  	return 0;
>  }

Reviewed-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jean Delvare <jdelvare@suse.de>

(Tested on a Dell OptiPlex 9020 not affected by the problem.)

-- 
Jean Delvare
SUSE L3 Support

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] acpi,pci: warn about duplicate IRQ routing entries returned from _PRT
  2022-11-15  8:36         ` Jean Delvare
@ 2022-11-23 20:28           ` Mateusz Jończyk
  0 siblings, 0 replies; 9+ messages in thread
From: Mateusz Jończyk @ 2022-11-23 20:28 UTC (permalink / raw)
  To: Jean Delvare
  Cc: linux-kernel, linux-pci, linux-acpi, linux-i2c, Bjorn Helgaas,
	Rafael J. Wysocki, Len Brown, Borislav Petkov

Hello,

W dniu 15.11.2022 o 09:36, Jean Delvare pisze:
> Hi Mateusz,
>
> On Sun, 13 Nov 2022 18:34:42 +0100, Mateusz Jończyk wrote:
>> On some platforms, the ACPI _PRT function returns duplicate interrupt
>> routing entries. Linux uses the first matching entry, but sometimes the
>> second matching entry contains the correct interrupt vector.
>>
>> Print a warning to dmesg if duplicate interrupt routing entries are
>> present, so that we could check how many models are affected.
> Excellent idea. We want hardware manufacturers to fix such bugs in the
> firmware, and the best way for this to happen is to report them
> whenever they are encountered.
>
>> This happens on a Dell Latitude E6500 laptop with the i2c-i801 Intel
>> SMBus controller. This controller was nonfunctional unless its interrupt
>> usage was disabled (using the "disable_features=0x10" module parameter).
>>
>> After investigation, it turned out that the driver was using an
>> incorrect interrupt vector: in lspci output for this device there was:
>>         Interrupt: pin B routed to IRQ 19
>> but after running i2cdetect (without using any i2c-i801 module
>> parameters) the following was logged to dmesg:
>>
>>         [...]
>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>         i801_smbus 0000:00:1f.3: Timeout waiting for interrupt!
>>         i801_smbus 0000:00:1f.3: Transaction timeout
>>         irq 17: nobody cared (try booting with the "irqpoll" option)
>>
>> Existence of duplicate entries in a table returned by the _PRT method
>> was confirmed by disassembling the ACPI DSDT table.
> Excuse a probably stupid question, but what would happen if we would
> plain ignore the IRQ routing information from ACPI in this case? Would
> we fallback to some pure-PCI routing logic which may have a chance to
> find the right IRQ routing (matching the second ACPI routing entry in
> this case)?

From what I understand, the PCI IRQ routing information is not discoverable
by probing the hardware (in the general case), it has to be obtained from
the ACPI tables (or perhaps from the obsolete MP tables, also provided by
firmware). See https://docs.kernel.org/PCI/acpi-info.html :

> For example, there’s no standard hardware mechanism for enumerating PCI
> host bridges, so the ACPI namespace must describe each host bridge,
> the method for accessing PCI config space below it, the address space
> windows the host bridge forwards to PCI (using _CRS), and the routing
> of legacy INTx interrupts (using _PRT).

(a PCI host bridge connects the CPU cores to the PCI bus, it is the root of the PCI
device tree. This patch concerns the "legacy INTx interrupts" as above).

In the case of this particular laptop, however, it should be possible to obtain
the information by reading chipset registers, which are documented at
https://www.intel.com/content/www/us/en/io/io-controller-hub-9-datasheet.html
But this is difficult to implement in every case.

>> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
>> Cc: Len Brown <lenb@kernel.org>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Jean Delvare <jdelvare@suse.com>
>>
>> --
>> v2: - add a newline at the end of the kernel log message,
>>     - replace: "if (match == NULL)" -> "if (!match)"
>>     - patch description tweaks.
>>
>> Tested on two computers, including the affected Dell Latitude E6500 laptop.
>>
>>  drivers/acpi/pci_irq.c | 25 ++++++++++++++++++++++---
>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c
>> index 08e15774fb9f..a4e41b7b71ed 100644
>> --- a/drivers/acpi/pci_irq.c
>> +++ b/drivers/acpi/pci_irq.c
>> @@ -203,6 +203,8 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>>  	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
>>  	struct acpi_pci_routing_table *entry;
>>  	acpi_handle handle = NULL;
>> +	struct acpi_prt_entry *match = NULL;
>> +	const char *match_int_source = NULL;
>>  
>>  	if (dev->bus->bridge)
>>  		handle = ACPI_HANDLE(dev->bus->bridge);
>> @@ -219,13 +221,30 @@ static int acpi_pci_irq_find_prt_entry(struct pci_dev *dev,
>>  
>>  	entry = buffer.pointer;
>>  	while (entry && (entry->length > 0)) {
>> -		if (!acpi_pci_irq_check_entry(handle, dev, pin,
>> -						 entry, entry_ptr))
>> -			break;
>> +		struct acpi_prt_entry *curr;
>> +
>> +		if (!acpi_pci_irq_check_entry(handle, dev, pin, entry, &curr)) {
>> +			if (!match) {
>> +				match = curr;
>> +				match_int_source = entry->source;
>> +			} else {
>> +				pr_warn(FW_BUG
>> +				"ACPI _PRT returned duplicate IRQ routing entries for device "
>> +					"%04x:%02x:%02x[INT%c]: %s[%d] and %s[%d].\n",
> The beginning of the string should be aligned with the opening
> parenthesis, and the string should be on a single line (this is a
> encouraged exception to the 80-column rule). I would also omit the
> tailing dot for consistency.
OK
>> +					curr->id.segment, curr->id.bus, curr->id.device,
> Is the IRQ per PCI device, or per PCI function? If the latter, then you
> should print "%02x.%x" instead of just "%02x", with the extra element
> being curr->id.function.

This is per PCI device.

[snip]

> Reviewed-by: Jean Delvare <jdelvare@suse.de>
> Tested-by: Jean Delvare <jdelvare@suse.de>
>
> (Tested on a Dell OptiPlex 9020 not affected by the problem.)
>
Thank you for reviewing.

Greetings,

Mateusz


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-11-23 20:32 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-17  9:09 [PATCH v2] acpi,pci: handle duplicate IRQ routing entries returned from _PRT Mateusz Jończyk
2022-09-27 19:28 ` Bjorn Helgaas
2022-09-29 18:03   ` Mateusz Jończyk
2022-11-12  0:20 ` Bjorn Helgaas
2022-11-12 20:07   ` Mateusz Jończyk
2022-11-12 20:09     ` [PATCH] acpi,pci: warn about " Mateusz Jończyk
2022-11-13 17:34       ` [PATCH v2] " Mateusz Jończyk
2022-11-15  8:36         ` Jean Delvare
2022-11-23 20:28           ` Mateusz Jończyk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.