linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 1/3] aerdrv: Trace Event for AER
@ 2012-12-04 21:03 Lance Ortiz
  2012-12-04 21:03 ` [PATCH v7 2/3] aerdrv: Enhanced AER logging Lance Ortiz
  2012-12-04 21:03 ` [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER Lance Ortiz
  0 siblings, 2 replies; 10+ messages in thread
From: Lance Ortiz @ 2012-12-04 21:03 UTC (permalink / raw)
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
	mchehab, linux-acpi, linux-pci, linux-kernel

This header file will define a new trace event that will be triggered when
a AER event occurs.  The following data will be provided to the trace
event.

char * dev_name - The name of the slot where the device resides
                  ([domain:]bus:device.function).

u32 status - Either the correctable or uncorrectable register
             indicating what error or errors have been see.

u8 severity - error severity 0:NONFATAL 1:FATAL 2:CORRECTED

The trace event will also provide a trace string that may look like:

"0000:05:00.0 PCIe Bus Error:severity=Uncorrected (Non-Fatal), Poisoned
TLP"

v1-v2 Move header from include/ras/aer_event.h to
include/trace/events/ras.h
v3-v4 Cleaned up comments and commit header
v4-v5 More cleanup remove () from if statement in print.
      Renamed string define to be more specific.
v5-v6 change TRACE_SYSTEM define to be ras and not aer.
Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
---

 include/trace/events/ras.h |   77 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 77 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/ras.h

diff --git a/include/trace/events/ras.h b/include/trace/events/ras.h
new file mode 100644
index 0000000..88b8783
--- /dev/null
+++ b/include/trace/events/ras.h
@@ -0,0 +1,77 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM ras
+
+#if !defined(_TRACE_AER_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_AER_H
+
+#include <linux/tracepoint.h>
+#include <linux/edac.h>
+
+
+/*
+ * PCIe AER Trace event
+ *
+ * These events are generated when hardware detects a corrected or
+ * uncorrected event on a PCIe device. The event report has
+ * the following structure:
+ *
+ * char * dev_name -	The name of the slot where the device resides
+ *			([domain:]bus:device.function).
+ * u32 status -		Either the correctable or uncorrectable register
+ *			indicating what error or errors have been seen
+ * u8 severity -	error severity 0:NONFATAL 1:FATAL 2:CORRECTED
+ */
+
+#define aer_correctable_errors		\
+	{BIT(0),	"Receiver Error"},		\
+	{BIT(6),	"Bad TLP"},			\
+	{BIT(7),	"Bad DLLP"},			\
+	{BIT(8),	"RELAY_NUM Rollover"},		\
+	{BIT(12),	"Replay Timer Timeout"},	\
+	{BIT(13),	"Advisory Non-Fatal"}
+
+#define aer_uncorrectable_errors		\
+	{BIT(4),	"Data Link Protocol"},		\
+	{BIT(12),	"Poisoned TLP"},		\
+	{BIT(13),	"Flow Control Protocol"},	\
+	{BIT(14),	"Completion Timeout"},		\
+	{BIT(15),	"Completer Abort"},		\
+	{BIT(16),	"Unexpected Completion"},	\
+	{BIT(17),	"Receiver Overflow"},		\
+	{BIT(18),	"Malformed TLP"},		\
+	{BIT(19),	"ECRC"},			\
+	{BIT(20),	"Unsupported Request"}
+
+TRACE_EVENT(aer_event,
+	TP_PROTO(const char *dev_name,
+		 const u32 status,
+		 const u8 severity),
+
+	TP_ARGS(dev_name, status, severity),
+
+	TP_STRUCT__entry(
+		__string(	dev_name,	dev_name	)
+		__field(	u32,		status		)
+		__field(	u8,		severity	)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name);
+		__entry->status		= status;
+		__entry->severity	= severity;
+	),
+
+	TP_printk("%s PCIe Bus Error: severity=%s, %s\n",
+		__get_str(dev_name),
+		__entry->severity == HW_EVENT_ERR_CORRECTED ? "Corrected" :
+			__entry->severity == HW_EVENT_ERR_FATAL ?
+			"Fatal" : "Uncorrected",
+		__entry->severity == HW_EVENT_ERR_CORRECTED ?
+		__print_flags(__entry->status, "|", aer_correctable_errors) :
+		__print_flags(__entry->status, "|", aer_uncorrectable_errors))
+);
+
+#endif /* _TRACE_AER_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v7 2/3] aerdrv: Enhanced AER logging
  2012-12-04 21:03 [PATCH v7 1/3] aerdrv: Trace Event for AER Lance Ortiz
@ 2012-12-04 21:03 ` Lance Ortiz
  2012-12-05 14:32   ` Borislav Petkov
  2012-12-04 21:03 ` [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER Lance Ortiz
  1 sibling, 1 reply; 10+ messages in thread
From: Lance Ortiz @ 2012-12-04 21:03 UTC (permalink / raw)
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
	mchehab, linux-acpi, linux-pci, linux-kernel

This patch will provide a more reliable and easy way for user-space
applications to have access to AER logs rather than reading them from the
message buffer. It also provides a way to notify user-space when an AER
event occurs.

The aer driver is updated to generate a trace event of function 'aer_event'
when a PCIe error is reported over the AER interface.  The trace event was
added to both the interrupt based aer path and the firmware first path.

v1-v2 fix compile errors in ifdefs.
v2-v3 Update to new location of trace header. Update print to remove
warning.
v3-v4 Reworked logic when getting ready to call cper_print_aer
v6-v7 Change print from pr_info to pr_err if !dev
Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
---

 drivers/acpi/apei/cper.c               |   19 ++++++++++++++++---
 drivers/pci/pcie/aer/aerdrv_errprint.c |   10 +++++++++-
 include/linux/aer.h                    |    2 +-
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
index e6defd8..b2f695b 100644
--- a/drivers/acpi/apei/cper.c
+++ b/drivers/acpi/apei/cper.c
@@ -29,6 +29,7 @@
 #include <linux/time.h>
 #include <linux/cper.h>
 #include <linux/acpi.h>
+#include <linux/pci.h>
 #include <linux/aer.h>
 
 /*
@@ -249,6 +250,10 @@ static const char *cper_pcie_port_type_strs[] = {
 static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 			    const struct acpi_hest_generic_data *gdata)
 {
+#ifdef CONFIG_ACPI_APEI_PCIEAER
+	struct pci_dev *dev;
+#endif
+
 	if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
 		printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
 		       pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
@@ -281,10 +286,18 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 	"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
 	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
 #ifdef CONFIG_ACPI_APEI_PCIEAER
-	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
-		struct aer_capability_regs *aer_regs = (void *)pcie->aer_info;
-		cper_print_aer(pfx, gdata->error_severity, aer_regs);
+	dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
+			pcie->device_id.bus, pcie->device_id.function);
+	if (!dev) {
+		pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
+			pcie->device_id.segment, pcie->device_id.bus,
+			pcie->device_id.slot, pcie->device_id.function);
+		return;
 	}
+	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
+		cper_print_aer(dev, gdata->error_severity,
+				(struct aer_capability_regs *) pcie->aer_info);
+	pci_dev_put(dev);
 #endif
 }
 
diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 3ea5173..34d96e4 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -23,6 +23,9 @@
 
 #include "aerdrv.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/ras.h>
+
 #define AER_AGENT_RECEIVER		0
 #define AER_AGENT_REQUESTER		1
 #define AER_AGENT_COMPLETER		2
@@ -194,6 +197,8 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
 	if (info->id && info->error_dev_num > 1 && info->id == id)
 		printk("%s""  Error of this Agent(%04x) is reported first\n",
 			prefix, id);
+	trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
+			info->severity);
 }
 
 void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
@@ -217,12 +222,13 @@ int cper_severity_to_aer(int cper_severity)
 }
 EXPORT_SYMBOL_GPL(cper_severity_to_aer);
 
-void cper_print_aer(const char *prefix, int cper_severity,
+void cper_print_aer(struct pci_dev *dev, int cper_severity,
 		    struct aer_capability_regs *aer)
 {
 	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
 	u32 status, mask;
 	const char **status_strs;
+	char *prefix = NULL;
 
 	aer_severity = cper_severity_to_aer(cper_severity);
 	if (aer_severity == AER_CORRECTABLE) {
@@ -259,5 +265,7 @@ void cper_print_aer(const char *prefix, int cper_severity,
 			*(tlp + 8), *(tlp + 15), *(tlp + 14),
 			*(tlp + 13), *(tlp + 12));
 	}
+	trace_aer_event(dev_name(&dev->dev), (status & ~mask),
+			aer_severity);
 }
 #endif
diff --git a/include/linux/aer.h b/include/linux/aer.h
index 544abdb..7b86dc6 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -49,7 +49,7 @@ static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
 }
 #endif
 
-extern void cper_print_aer(const char *prefix, int cper_severity,
+extern void cper_print_aer(struct pci_dev *dev, int cper_severity,
 			   struct aer_capability_regs *aer);
 extern int cper_severity_to_aer(int cper_severity);
 extern void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER
  2012-12-04 21:03 [PATCH v7 1/3] aerdrv: Trace Event for AER Lance Ortiz
  2012-12-04 21:03 ` [PATCH v7 2/3] aerdrv: Enhanced AER logging Lance Ortiz
@ 2012-12-04 21:03 ` Lance Ortiz
  2012-12-05 14:41   ` Mauro Carvalho Chehab
  2012-12-05 14:50   ` Borislav Petkov
  1 sibling, 2 replies; 10+ messages in thread
From: Lance Ortiz @ 2012-12-04 21:03 UTC (permalink / raw)
  To: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
	mchehab, linux-acpi, linux-pci, linux-kernel

These changes make cper_print_aer more consistent with aer_print_error
which is called in the AER interrupt case. The string in the variable
'prefix' is printed at the beginning of each print statement in
cper_print_aer(). The prefix is a string containing the driver name
and the device's slot location.  From the callers the value of prefix
is never assigned and is NULL, so when cper_print_aer prints data the
initial string does not get printed.  This string is important because
it identifies the device that the error is on.

v1-v2 fix some compile errors withinn the #ifdef
v3-v4 remove agent id stuff and kept print the same to avoid
compatibility issues

Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
---

 drivers/pci/pcie/aer/aerdrv_errprint.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 34d96e4..58ff4c0 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -228,9 +228,14 @@ void cper_print_aer(struct pci_dev *dev, int cper_severity,
 	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
 	u32 status, mask;
 	const char **status_strs;
-	char *prefix = NULL;
+	char prefix[44];
 
 	aer_severity = cper_severity_to_aer(cper_severity);
+	snprintf(prefix, sizeof(prefix), "%s%s %s: ",
+		(aer_severity == AER_CORRECTABLE) ?
+		KERN_WARNING : KERN_ERR,
+		dev_driver_string(&dev->dev), dev_name(&dev->dev));
+
 	if (aer_severity == AER_CORRECTABLE) {
 		status = aer->cor_status;
 		mask = aer->cor_mask;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 2/3] aerdrv: Enhanced AER logging
  2012-12-04 21:03 ` [PATCH v7 2/3] aerdrv: Enhanced AER logging Lance Ortiz
@ 2012-12-05 14:32   ` Borislav Petkov
  2012-12-05 16:14     ` Ortiz, Lance E
  0 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2012-12-05 14:32 UTC (permalink / raw)
  To: Lance Ortiz
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

On Tue, Dec 04, 2012 at 02:03:12PM -0700, Lance Ortiz wrote:
> This patch will provide a more reliable and easy way for user-space
> applications to have access to AER logs rather than reading them from the
> message buffer. It also provides a way to notify user-space when an AER
> event occurs.
> 
> The aer driver is updated to generate a trace event of function 'aer_event'
> when a PCIe error is reported over the AER interface.  The trace event was
> added to both the interrupt based aer path and the firmware first path.
> 
> v1-v2 fix compile errors in ifdefs.
> v2-v3 Update to new location of trace header. Update print to remove
> warning.
> v3-v4 Reworked logic when getting ready to call cper_print_aer
> v6-v7 Change print from pr_info to pr_err if !dev
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
> ---
> 
>  drivers/acpi/apei/cper.c               |   19 ++++++++++++++++---
>  drivers/pci/pcie/aer/aerdrv_errprint.c |   10 +++++++++-
>  include/linux/aer.h                    |    2 +-
>  3 files changed, 26 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
> index e6defd8..b2f695b 100644
> --- a/drivers/acpi/apei/cper.c
> +++ b/drivers/acpi/apei/cper.c
> @@ -29,6 +29,7 @@
>  #include <linux/time.h>
>  #include <linux/cper.h>
>  #include <linux/acpi.h>
> +#include <linux/pci.h>
>  #include <linux/aer.h>
>  
>  /*
> @@ -249,6 +250,10 @@ static const char *cper_pcie_port_type_strs[] = {
>  static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>  			    const struct acpi_hest_generic_data *gdata)
>  {
> +#ifdef CONFIG_ACPI_APEI_PCIEAER
> +	struct pci_dev *dev;
> +#endif
> +
>  	if (pcie->validation_bits & CPER_PCIE_VALID_PORT_TYPE)
>  		printk("%s""port_type: %d, %s\n", pfx, pcie->port_type,
>  		       pcie->port_type < ARRAY_SIZE(cper_pcie_port_type_strs) ?
> @@ -281,10 +286,18 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>  	"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
>  	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> -	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
> -		struct aer_capability_regs *aer_regs = (void *)pcie->aer_info;
> -		cper_print_aer(pfx, gdata->error_severity, aer_regs);
> +	dev = pci_get_domain_bus_and_slot(pcie->device_id.segment,
> +			pcie->device_id.bus, pcie->device_id.function);
> +	if (!dev) {
> +		pr_err("PCI AER Cannot get PCI device %04x:%02x:%02x.%d\n",
> +			pcie->device_id.segment, pcie->device_id.bus,
> +			pcie->device_id.slot, pcie->device_id.function);
> +		return;
>  	}
> +	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO)
> +		cper_print_aer(dev, gdata->error_severity,
> +				(struct aer_capability_regs *) pcie->aer_info);
> +	pci_dev_put(dev);
>  #endif
>  }
>  
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index 3ea5173..34d96e4 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -23,6 +23,9 @@
>  
>  #include "aerdrv.h"
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/ras.h>
> +
>  #define AER_AGENT_RECEIVER		0
>  #define AER_AGENT_REQUESTER		1
>  #define AER_AGENT_COMPLETER		2
> @@ -194,6 +197,8 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>  	if (info->id && info->error_dev_num > 1 && info->id == id)
>  		printk("%s""  Error of this Agent(%04x) is reported first\n",
>  			prefix, id);
> +	trace_aer_event(dev_name(&dev->dev), (info->status & ~info->mask),
> +			info->severity);
>  }
>  
>  void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
> @@ -217,12 +222,13 @@ int cper_severity_to_aer(int cper_severity)
>  }
>  EXPORT_SYMBOL_GPL(cper_severity_to_aer);
>  
> -void cper_print_aer(const char *prefix, int cper_severity,
> +void cper_print_aer(struct pci_dev *dev, int cper_severity,
>  		    struct aer_capability_regs *aer)
>  {
>  	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
>  	u32 status, mask;
>  	const char **status_strs;
> +	char *prefix = NULL;

What are you doing here? You dropped the 'prefix' argument being passed
down in this function and are defining a local variable of the same name
which is used in the function later:

        printk("%s""aer_status: 0x%08x, aer_mask: 0x%08x\n",
               prefix, status, mask);

WTF?

Also, this function cper_print_aer() is too crammed and could use a
little spacing out between lines belonging logically together.

Finally, please wait a couple of days before resending the patchset
so that you can collect *all* feedback and can save yourself needless
iterations.

Thanks.

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER
  2012-12-04 21:03 ` [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER Lance Ortiz
@ 2012-12-05 14:41   ` Mauro Carvalho Chehab
  2012-12-05 14:50   ` Borislav Petkov
  1 sibling, 0 replies; 10+ messages in thread
From: Mauro Carvalho Chehab @ 2012-12-05 14:41 UTC (permalink / raw)
  To: Lance Ortiz
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, bp, rostedt,
	linux-acpi, linux-pci, linux-kernel

Em Tue, 04 Dec 2012 14:03:18 -0700
Lance Ortiz <lance.ortiz@hp.com> escreveu:

Hmm... I did a reply to v6 of this patch, but i pressed the wrong button
and it went only to Joe. Excuse-me for that. Let me resend the
comments to everybody:

On Tue, 2012-12-04 at 16:33 -0200, Mauro Carvalho Chehab wrote:
> Em Tue, 04 Dec 2012 09:39:23 -0800
> Joe Perches <joe@perches.com> escreveu:
>   
> > On Tue, 2012-12-04 at 10:04 -0700, Lance Ortiz wrote:  
> > > These changes make cper_print_aer more consistent with aer_print_error
> > > which is called in the AER interrupt case. The string in the variable
> > > 'prefix' is printed at the beginning of each print statement in
> > > cper_print_aer(). The prefix is a string containing the driver name
> > > and the device's slot location.  From the callers the value of prefix
> > > is never assigned and is NULL, so when cper_print_aer prints data the
> > > initial string does not get printed.  This string is important because
> > > it identifies the device that the error is on.  
> > 
> > Hi again Lance.
> >   
> > > diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c  
> > []  
> > > @@ -228,9 +228,14 @@ void cper_print_aer(struct pci_dev *dev, int cper_severity,
> > >  	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
> > >  	u32 status, mask;
> > >  	const char **status_strs;
> > > -	char *prefix = NULL;
> > > +	char prefix[44];
> > >  
> > >  	aer_severity = cper_severity_to_aer(cper_severity);
> > > +	snprintf(prefix, sizeof(prefix), "%s%s %s: ",
> > > +		(aer_severity == AER_CORRECTABLE) ?
> > > +		KERN_WARNING : KERN_ERR,
> > > +		dev_driver_string(&dev->dev), dev_name(&dev->dev));
> > > +
> > >  	if (aer_severity == AER_CORRECTABLE) {
> > >  		status = aer->cor_status;
> > >  		mask = aer->cor_mask;  
> > 
> > Perhaps this would be better just using dev_printk
> > instead of snprintf into a prefix with printk to
> > emulate dev_printk.
> > 
> > Also, perhaps KERN_NOTICE is preferable to KERN_WARNING
> > in the CORRECTABLE case.  
> 
> Hmm... IMHO, it should be KERN_ERR, even for a correctable one ;) 
> 
> This is an error and probably means some hardware problem or a bad 
> contact. In any case, a preventive action is likely required.
>   
> > 
> > Maybe something like:
> > 
> > 	const char *level = KERN_ERR;
> > 	if (aer_severity == AER_CORRECTABLE)
> > 		level = KERN_NOTICE;
> > 
> > 	...
> > 
> > 	dev_printk(level, &dev->dev, etc...);
> > 
> > Maybe do this after this series of patches is accepted.
> > Enough with the revisions for awhile...
> >   
> 
>   

Regards,
Mauro

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER
  2012-12-04 21:03 ` [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER Lance Ortiz
  2012-12-05 14:41   ` Mauro Carvalho Chehab
@ 2012-12-05 14:50   ` Borislav Petkov
  2012-12-05 16:16     ` Ortiz, Lance E
  1 sibling, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2012-12-05 14:50 UTC (permalink / raw)
  To: Lance Ortiz
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

On Tue, Dec 04, 2012 at 02:03:18PM -0700, Lance Ortiz wrote:
> These changes make cper_print_aer more consistent with aer_print_error
> which is called in the AER interrupt case. The string in the variable
> 'prefix' is printed at the beginning of each print statement in
> cper_print_aer(). The prefix is a string containing the driver name
> and the device's slot location.  From the callers the value of prefix
> is never assigned and is NULL, so when cper_print_aer prints data the
> initial string does not get printed.  This string is important because
> it identifies the device that the error is on.
> 
> v1-v2 fix some compile errors withinn the #ifdef
> v3-v4 remove agent id stuff and kept print the same to avoid
> compatibility issues
> 
> Signed-off-by: Lance Ortiz <lance.ortiz@hp.com>
> ---
> 
>  drivers/pci/pcie/aer/aerdrv_errprint.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index 34d96e4..58ff4c0 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -228,9 +228,14 @@ void cper_print_aer(struct pci_dev *dev, int cper_severity,
>  	int aer_severity, layer, agent, status_strs_size, tlp_header_valid = 0;
>  	u32 status, mask;
>  	const char **status_strs;
> -	char *prefix = NULL;
> +	char prefix[44];
>  
>  	aer_severity = cper_severity_to_aer(cper_severity);
> +	snprintf(prefix, sizeof(prefix), "%s%s %s: ",
> +		(aer_severity == AER_CORRECTABLE) ?
> +		KERN_WARNING : KERN_ERR,
> +		dev_driver_string(&dev->dev), dev_name(&dev->dev));

Here it is, you're initializing 'prefix' here although it is being
used in the previous patch. You should concentrate the whole prefix
initialization and passing in one patch so that there are no breakages.

Also, why is it exactly 44 chars long, is that some magic number
that always works? At least this is what aer_print_error does
too. I'm guessing someone chose it because it is large enough for
dev_driver_string() and dev_name() and a couple more formatting
characters. Oh well.

Thanks.

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v7 2/3] aerdrv: Enhanced AER logging
  2012-12-05 14:32   ` Borislav Petkov
@ 2012-12-05 16:14     ` Ortiz, Lance E
  2012-12-05 16:30       ` Borislav Petkov
  0 siblings, 1 reply; 10+ messages in thread
From: Ortiz, Lance E @ 2012-12-05 16:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 836 bytes --]

> > +	char *prefix = NULL;
> 
> What are you doing here? You dropped the 'prefix' argument being passed
> down in this function and are defining a local variable of the same
> name
> which is used in the function later:
> 
>         printk("%s""aer_status: 0x%08x, aer_mask: 0x%08x\n",
>                prefix, status, mask);
> 

Boris,

I removed the prefix argument because it was never used by its caller and never set.  The reason I added the prefix variable and set it to NULL was to help in breaking up the patch and adding it would help the intermittent patch build without changing too much code.   I knew I was actually going to use the variable in patch 3. 

Lance

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER
  2012-12-05 14:50   ` Borislav Petkov
@ 2012-12-05 16:16     ` Ortiz, Lance E
  0 siblings, 0 replies; 10+ messages in thread
From: Ortiz, Lance E @ 2012-12-05 16:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 918 bytes --]

> 
> Here it is, you're initializing 'prefix' here although it is being
> used in the previous patch. You should concentrate the whole prefix
> initialization and passing in one patch so that there are no breakages.

Yes I probably could have broken up the patches in a cleaner way.  Thanks.

> 
> Also, why is it exactly 44 chars long, is that some magic number
> that always works? At least this is what aer_print_error does
> too. I'm guessing someone chose it because it is large enough for
> dev_driver_string() and dev_name() and a couple more formatting
> characters. Oh well.
> 

Yes, I was using the same code as in aer_print_error.  This was brought up in a previous mail with Joe and we discussed cleaning all of this up in a later patch.

Lance

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 2/3] aerdrv: Enhanced AER logging
  2012-12-05 16:14     ` Ortiz, Lance E
@ 2012-12-05 16:30       ` Borislav Petkov
  2012-12-27  1:33         ` Bjorn Helgaas
  0 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2012-12-05 16:30 UTC (permalink / raw)
  To: Ortiz, Lance E
  Cc: bhelgaas, lance_ortiz, jiang.liu, tony.luck, rostedt, mchehab,
	linux-acpi, linux-pci, linux-kernel

On Wed, Dec 05, 2012 at 04:14:14PM +0000, Ortiz, Lance E wrote:
> I removed the prefix argument because it was never used by its caller
> and never set. The reason I added the prefix variable and set it to
> NULL was to help in breaking up the patch and adding it would help the
> intermittent patch build without changing too much code. I knew I was
> actually going to use the variable in patch 3.

No, the correct way to do that is to keep all changes that belong
logically together in a single patch for ease of reviewing and avoid
breakages. And in your case this should be pretty easy: simply move all
the 'prefix' touching code to patch #3 and you're done, AFAICT.

Thanks.

-- 
Regards/Gruss,
    Boris.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v7 2/3] aerdrv: Enhanced AER logging
  2012-12-05 16:30       ` Borislav Petkov
@ 2012-12-27  1:33         ` Bjorn Helgaas
  0 siblings, 0 replies; 10+ messages in thread
From: Bjorn Helgaas @ 2012-12-27  1:33 UTC (permalink / raw)
  To: Borislav Petkov, Ortiz, Lance E, bhelgaas, lance_ortiz,
	jiang.liu, tony.luck, rostedt, mchehab, linux-acpi, linux-pci,
	linux-kernel

On Wed, Dec 5, 2012 at 9:30 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Dec 05, 2012 at 04:14:14PM +0000, Ortiz, Lance E wrote:
>> I removed the prefix argument because it was never used by its caller
>> and never set. The reason I added the prefix variable and set it to
>> NULL was to help in breaking up the patch and adding it would help the
>> intermittent patch build without changing too much code. I knew I was
>> actually going to use the variable in patch 3.
>
> No, the correct way to do that is to keep all changes that belong
> logically together in a single patch for ease of reviewing and avoid
> breakages. And in your case this should be pretty easy: simply move all
> the 'prefix' touching code to patch #3 and you're done, AFAICT.

Lance, you didn't add all the "prefix" stuff in AER, but since you're
touching it, I think things will make more sense if you clean it up at
the same time.  There are a lot of printk() calls there that should be
converted to dev_printk().

I think I see why it was done that way -- it sticks either
KERN_WARNING or KERN_ERR at the beginning of the prefix, then uses
plain printk() later.  I guess that means you only have to pass around
the prefix argument, rather than both a "level" and a "struct pci_dev
*".  But I think it will be simpler overall to pass both and take
advantage of dev_printk() and stop emulating it.

Bjorn

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2012-12-27  1:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-04 21:03 [PATCH v7 1/3] aerdrv: Trace Event for AER Lance Ortiz
2012-12-04 21:03 ` [PATCH v7 2/3] aerdrv: Enhanced AER logging Lance Ortiz
2012-12-05 14:32   ` Borislav Petkov
2012-12-05 16:14     ` Ortiz, Lance E
2012-12-05 16:30       ` Borislav Petkov
2012-12-27  1:33         ` Bjorn Helgaas
2012-12-04 21:03 ` [PATCH v7 3/3] aerdrv: Cleanup log output for CPER based AER Lance Ortiz
2012-12-05 14:41   ` Mauro Carvalho Chehab
2012-12-05 14:50   ` Borislav Petkov
2012-12-05 16:16     ` Ortiz, Lance E

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).