* [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
@ 2013-05-08 17:15 Lance Ortiz
2013-05-08 19:10 ` Borislav Petkov
2013-05-08 22:01 ` Rafael J. Wysocki
0 siblings, 2 replies; 6+ messages in thread
From: Lance Ortiz @ 2013-05-08 17:15 UTC (permalink / raw)
To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
mchehab, linux-acpi, linux-pci, linux-kernel
The following warning was seen on 3.9 when a corrected PCIe error was being
handled by the AER subsystem.
WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()
This occurred because code was added to the function cper_print_pcie() that
calls the pci_get_domain_bus_and_slot() function. cper_print_pcie() is called
in an interrupt context and pci_get* functions are not supposed to be called
in that context hence the warning.
The solution is to move the call to cper_print_aer() out of the interrupt
context and into aer_recover_queue() to avoid any warnings when calling
pci_get* functions.
Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
---
drivers/acpi/apei/cper.c | 18 ------------------
drivers/acpi/apei/ghes.c | 3 ++-
drivers/pci/pcie/aer/aerdrv_core.c | 6 +++++-
include/linux/aer.h | 2 +-
4 files changed, 8 insertions(+), 21 deletions(-)
diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
index 1e5d8a4..8713229 100644
--- a/drivers/acpi/apei/cper.c
+++ b/drivers/acpi/apei/cper.c
@@ -250,10 +250,6 @@ static const char *cper_pcie_port_type_strs[] = {
static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
const struct acpi_hest_generic_data *gdata)
{
-#ifdef CONFIG_ACPI_APEI_PCIEAER
- struct pci_dev *dev;
-#endif
-
if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
@@ -285,20 +281,6 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
printk(
"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
pfx, pcie->bridge.secondary_status, pcie->bridge.control);
-#ifdef CONFIG_ACPI_APEI_PCIEAER
- dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
- pcie->device_id.bus, pcie->device_id.function);
- if (!dev) {
- pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
- pcie->device_id.segment, pcie->device_id.bus,
- pcie->device_id.slot, pcie->device_id.function);
- return;
- }
- if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
- cper_print_aer(pfx, dev, gdata->error_severity,
- (struct aer_capability_regs *) pcie->aer_info);
- pci_dev_put(dev);
-#endif
}
static const char *apei_estatus_section_flag_strs[] = {
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index d668a8a..f2084b5 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -454,7 +454,8 @@ static void ghes_do_proc(struct ghes *ghes,
aer_severity = cper_severity_to_aer(sev);
aer_recover_queue(pcie_err->device_id.segment,
pcie_err->device_id.bus,
- devfn, aer_severity);
+ devfn, aer_severity,
+ pcie_err->aer_info);
}
}
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 564d97f..26aec0f 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -582,6 +582,7 @@ struct aer_recover_entry
u8 devfn;
u16 domain;
int severity;
+ u8 *regs;
};
static DEFINE_KFIFO(aer_recover_ring, struct aer_recover_entry,
@@ -595,7 +596,7 @@ static DEFINE_SPINLOCK(aer_recover_ring_lock);
static DECLARE_WORK(aer_recover_work, aer_recover_work_func);
void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity)
+ int severity, u8 *aer_regs)
{
unsigned long flags;
struct aer_recover_entry entry = {
@@ -603,6 +604,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
.devfn = devfn,
.domain = domain,
.severity = severity,
+ .regs = aer_regs,
};
spin_lock_irqsave(&aer_recover_ring_lock, flags);
@@ -629,6 +631,8 @@ static void aer_recover_work_func(struct work_struct *work)
PCI_SLOT(entry.devfn), PCI_FUNC(entry.devfn));
continue;
}
+ cper_print_aer("", pdev, entry.severity,
+ (struct aer_capability_regs *)entry.regs);
do_recovery(pdev, entry.severity);
pci_dev_put(pdev);
}
diff --git a/include/linux/aer.h b/include/linux/aer.h
index ec10e1b..a5c1583 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -53,6 +53,6 @@ extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
int cper_severity, struct aer_capability_regs *aer);
extern int cper_severity_to_aer(int cper_severity);
extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
- int severity);
+ int severity, u8 *aer_regs);
#endif //_AER_H_
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
2013-05-08 17:15 [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context Lance Ortiz
@ 2013-05-08 19:10 ` Borislav Petkov
2013-05-08 22:01 ` Rafael J. Wysocki
1 sibling, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2013-05-08 19:10 UTC (permalink / raw)
To: Lance Ortiz
Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
linux-acpi, linux-pci, linux-kernel
On Wed, May 08, 2013 at 11:15:19AM -0600, Lance Ortiz wrote:
> The following warning was seen on 3.9 when a corrected PCIe error was being
> handled by the AER subsystem.
>
> WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()
>
> This occurred because code was added to the function cper_print_pcie() that
> calls the pci_get_domain_bus_and_slot() function. cper_print_pcie() is called
> in an interrupt context and pci_get* functions are not supposed to be called
> in that context hence the warning.
>
> The solution is to move the call to cper_print_aer() out of the interrupt
> context and into aer_recover_queue() to avoid any warnings when calling
> pci_get* functions.
>
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
Looks good.
Acked-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
2013-05-08 17:15 [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context Lance Ortiz
2013-05-08 19:10 ` Borislav Petkov
@ 2013-05-08 22:01 ` Rafael J. Wysocki
2013-05-08 22:28 ` Borislav Petkov
2013-05-09 19:10 ` Ortiz, Lance E
1 sibling, 2 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2013-05-08 22:01 UTC (permalink / raw)
To: Lance Ortiz
Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
mchehab, linux-acpi, linux-pci, linux-kernel
On Wednesday, May 08, 2013 11:15:19 AM Lance Ortiz wrote:
> The following warning was seen on 3.9 when a corrected PCIe error was being
> handled by the AER subsystem.
>
> WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()
>
> This occurred because code was added to the function cper_print_pcie() that
> calls the pci_get_domain_bus_and_slot() function.
Do you know which commit added that code?
> cper_print_pcie() is called
> in an interrupt context and pci_get* functions are not supposed to be called
> in that context hence the warning.
>
> The solution is to move the call to cper_print_aer() out of the interrupt
> context and into aer_recover_queue() to avoid any warnings when calling
> pci_get* functions.
The way the changes are described here isn't particularly clear to me. I'd say
something like
If cper_print_aer() is called by aer_recover_work_func(), there won't be any
reason to call it from cper_print_pcie() any more, in which case all of the
problematic code needed only to prepare for the cper_print_aer() call,
including the invocation of pci_get_domain_bus_and_slot() causing the warning
to be printed, may be removed from there. Make that happen."
Also, since aer_recover_work_func() is going to be the only existing caller of
cper_print_aer() after this change, as far as I can say, and it doesn't use the
function's first argument, that argument should be dropped entirely.
Thanks,
Rafael
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
> ---
>
> drivers/acpi/apei/cper.c | 18 ------------------
> drivers/acpi/apei/ghes.c | 3 ++-
> drivers/pci/pcie/aer/aerdrv_core.c | 6 +++++-
> include/linux/aer.h | 2 +-
> 4 files changed, 8 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
> index 1e5d8a4..8713229 100644
> --- a/drivers/acpi/apei/cper.c
> +++ b/drivers/acpi/apei/cper.c
> @@ -250,10 +250,6 @@ static const char *cper_pcie_port_type_strs[] = {
> static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
> const struct acpi_hest_generic_data *gdata)
> {
> -#ifdef CONFIG_ACPI_APEI_PCIEAER
> - struct pci_dev *dev;
> -#endif
> -
> if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
> printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
> pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
> @@ -285,20 +281,6 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
> printk(
> "%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
> pfx, pcie->bridge.secondary_status, pcie->bridge.control);
> -#ifdef CONFIG_ACPI_APEI_PCIEAER
> - dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
> - pcie->device_id.bus, pcie->device_id.function);
> - if (!dev) {
> - pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
> - pcie->device_id.segment, pcie->device_id.bus,
> - pcie->device_id.slot, pcie->device_id.function);
> - return;
> - }
> - if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
> - cper_print_aer(pfx, dev, gdata->error_severity,
> - (struct aer_capability_regs *) pcie->aer_info);
> - pci_dev_put(dev);
> -#endif
> }
>
> static const char *apei_estatus_section_flag_strs[] = {
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index d668a8a..f2084b5 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -454,7 +454,8 @@ static void ghes_do_proc(struct ghes *ghes,
> aer_severity = cper_severity_to_aer(sev);
> aer_recover_queue(pcie_err->device_id.segment,
> pcie_err->device_id.bus,
> - devfn, aer_severity);
> + devfn, aer_severity,
> + pcie_err->aer_info);
> }
>
> }
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..26aec0f 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -582,6 +582,7 @@ struct aer_recover_entry
> u8 devfn;
> u16 domain;
> int severity;
> + u8 *regs;
> };
>
> static DEFINE_KFIFO(aer_recover_ring, struct aer_recover_entry,
> @@ -595,7 +596,7 @@ static DEFINE_SPINLOCK(aer_recover_ring_lock);
> static DECLARE_WORK(aer_recover_work, aer_recover_work_func);
>
> void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> - int severity)
> + int severity, u8 *aer_regs)
> {
> unsigned long flags;
> struct aer_recover_entry entry = {
> @@ -603,6 +604,7 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> .devfn = devfn,
> .domain = domain,
> .severity = severity,
> + .regs = aer_regs,
> };
>
> spin_lock_irqsave(&aer_recover_ring_lock, flags);
> @@ -629,6 +631,8 @@ static void aer_recover_work_func(struct work_struct *work)
> PCI_SLOT(entry.devfn), PCI_FUNC(entry.devfn));
> continue;
> }
> + cper_print_aer("", pdev, entry.severity,
> + (struct aer_capability_regs *)entry.regs);
> do_recovery(pdev, entry.severity);
> pci_dev_put(pdev);
> }
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index ec10e1b..a5c1583 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -53,6 +53,6 @@ extern void cper_print_aer(const char *prefix, struct pci_dev *dev,
> int cper_severity, struct aer_capability_regs *aer);
> extern int cper_severity_to_aer(int cper_severity);
> extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
> - int severity);
> + int severity, u8 *aer_regs);
> #endif //_AER_H_
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
2013-05-08 22:01 ` Rafael J. Wysocki
@ 2013-05-08 22:28 ` Borislav Petkov
2013-05-09 19:10 ` Ortiz, Lance E
1 sibling, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2013-05-08 22:28 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Lance Ortiz, bhelgaas, lance_ortiz, jiang.liu, tony.luck,
rostedt, mchehab, linux-acpi, linux-pci, linux-kernel
On Thu, May 09, 2013 at 12:01:21AM +0200, Rafael J. Wysocki wrote:
> On Wednesday, May 08, 2013 11:15:19 AM Lance Ortiz wrote:
> > The following warning was seen on 3.9 when a corrected PCIe error was being
> > handled by the AER subsystem.
> >
> > WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()
> >
> > This occurred because code was added to the function cper_print_pcie() that
> > calls the pci_get_domain_bus_and_slot() function.
>
> Do you know which commit added that code?
1d5210008bd3a26daf4b06aed9d6c330dd4c83e2
> > cper_print_pcie() is called
> > in an interrupt context and pci_get* functions are not supposed to be called
> > in that context hence the warning.
> >
> > The solution is to move the call to cper_print_aer() out of the interrupt
> > context and into aer_recover_queue() to avoid any warnings when calling
> > pci_get* functions.
>
> The way the changes are described here isn't particularly clear to me. I'd say
> something like
>
> If cper_print_aer() is called by aer_recover_work_func(), there won't be any
> reason to call it from cper_print_pcie() any more, in which case all of the
> problematic code needed only to prepare for the cper_print_aer() call,
> including the invocation of pci_get_domain_bus_and_slot() causing the warning
> to be printed, may be removed from there. Make that happen."
>
> Also, since aer_recover_work_func() is going to be the only existing caller of
> cper_print_aer() after this change, as far as I can say, and it doesn't use the
> function's first argument, that argument should be dropped entirely.
Hmm, that needs more diddling: AFAICT __ghes_print_estatus() figures out
what the prefix is depending on the ->error_severity coming from the
acpi_hest_generic_status thing so it probably needs to be handed down or
similar...
Fun.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
2013-05-08 22:01 ` Rafael J. Wysocki
2013-05-08 22:28 ` Borislav Petkov
@ 2013-05-09 19:10 ` Ortiz, Lance E
2013-05-09 19:24 ` Rafael J. Wysocki
1 sibling, 1 reply; 6+ messages in thread
From: Ortiz, Lance E @ 2013-05-09 19:10 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
mchehab, linux-acpi, linux-pci, linux-kernel
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1120 bytes --]
Rafael,
Thanks for your feedback.
> The way the changes are described here isn't particularly clear to me.
I will try to make it a little more clear.
> Also, since aer_recover_work_func() is going to be the only existing
> caller of
> cper_print_aer() after this change, as far as I can say, and it doesn't
> use the
> function's first argument, that argument should be dropped entirely.
The truth is, the function cper_print_aer() really needs to be re-written so it is consistent with aer_print_error() in how it outputs information. Right now, the output is formatted very differently. I was planning on doing that at a later date, but fix the warning now. I might add a TODO comment in the code for this.
The reason I did not remove the argument in cper_print_aer() is because 'prefix' is used in the call to cper_print_bits(), and I passed through an empty string to make sure that function worked correctly. I can try to clean it up.
Lance
ÿôèº{.nÇ+·®+%Ëÿ±éݶ\x17¥wÿº{.nÇ+·¥{±þG«éÿ{ayº\x1dÊÚë,j\a¢f£¢·hïêÿêçz_è®\x03(éÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?¨èÚ&£ø§~á¶iOæ¬z·vØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?I¥
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context
2013-05-09 19:10 ` Ortiz, Lance E
@ 2013-05-09 19:24 ` Rafael J. Wysocki
0 siblings, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2013-05-09 19:24 UTC (permalink / raw)
To: Ortiz, Lance E
Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
mchehab, linux-acpi, linux-pci, linux-kernel
On Thursday, May 09, 2013 07:10:36 PM Ortiz, Lance E wrote:
> Rafael,
>
> Thanks for your feedback.
>
> > The way the changes are described here isn't particularly clear to me.
>
> I will try to make it a little more clear.
Cool, thanks!
> > Also, since aer_recover_work_func() is going to be the only existing
> > caller of
> > cper_print_aer() after this change, as far as I can say, and it doesn't
> > use the
> > function's first argument, that argument should be dropped entirely.
>
> The truth is, the function cper_print_aer() really needs to be re-written so
> it is consistent with aer_print_error() in how it outputs information.
> Right now, the output is formatted very differently. I was planning on doing
> that at a later date, but fix the warning now. I might add a TODO comment in
> the code for this.
Yes, I think that would be OK, depending on the amount of changes actually
needed to rework it (if that's not too much, I'd just go straight for the
rework, honestly).
> The reason I did not remove the argument in cper_print_aer() is because
> 'prefix' is used in the call to cper_print_bits(), and I passed through an
> empty string to make sure that function worked correctly. I can try to clean
> it up.
Please do.
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-05-09 19:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-08 17:15 [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context Lance Ortiz
2013-05-08 19:10 ` Borislav Petkov
2013-05-08 22:01 ` Rafael J. Wysocki
2013-05-08 22:28 ` Borislav Petkov
2013-05-09 19:10 ` Ortiz, Lance E
2013-05-09 19:24 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).