* [PATCH] PCI/AER: update AER status string print to match other AER logs
@ 2017-10-17 15:42 Tyler Baicar
  2017-10-17 16:00   ` David Laight
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Tyler Baicar @ 2017-10-17 15:42 UTC (permalink / raw)
  To: bhelgaas, helgaas, linux-pci, linux-kernel; +Cc: Tyler Baicar

Currently the AER driver uses cper_print_bits() to print the AER status
string. This causes the status string to not include the proper PCI device
name prefix that the other AER prints include. Also, it has a different
print level than all the other AER prints.

Update the AER driver to print the AER status string with the proper string
prefix and proper print level.

Previous log example:

e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
Receiver Error, Bad TLP
e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
Replay Timer Timeout
pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

New log:

e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
e1000e 0003:01:00.1: Receiver Error
e1000e 0003:01:00.1: Bad TLP
e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
pcieport 0003:00:00.0: Replay Timer Timeout
pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
---
 drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
index 54c4b69..b718daa 100644
--- a/drivers/pci/pcie/aer/aerdrv_errprint.c
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
@@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
 }
 
 #ifdef CONFIG_ACPI_APEI_PCIEAER
+void dev_print_bits(struct pci_dev *dev, unsigned int bits,
+		    const char * const strs[], unsigned int strs_size)
+{
+	unsigned int i;
+
+	for (i = 0; i < strs_size; i++) {
+		if (!(bits & (1U << i)))
+			continue;
+		if (strs[i])
+			dev_err(&dev->dev, "%s\n", strs[i]);
+	}
+}
+
 int cper_severity_to_aer(int cper_severity)
 {
 	switch (cper_severity) {
@@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
 	agent = AER_GET_AGENT(aer_severity, status);
 
 	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
-	cper_print_bits("", status, status_strs, status_strs_size);
+	dev_print_bits(dev, status, status_strs, status_strs_size);
 	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
 		aer_error_layer[layer], aer_agent_string[agent]);
 
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* RE: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-17 15:42 [PATCH] PCI/AER: update AER status string print to match other AER logs Tyler Baicar
@ 2017-10-17 16:00   ` David Laight
  2017-10-20 23:55 ` Bjorn Helgaas
  2017-11-15 14:47 ` Tyler Baicar
  2 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2017-10-17 16:00 UTC (permalink / raw)
  To: 'Tyler Baicar', bhelgaas, helgaas, linux-pci, linux-kernel

From: Tyler Baicar
> Sent: 17 October 2017 16:42
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
> 
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
> 
> Previous log example:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> Receiver Error, Bad TLP
...
> New log:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> e1000e 0003:01:00.1: Receiver Error
> e1000e 0003:01:00.1: Bad TLP

Wouldn't it be better to manage to print the above all on 1 line?

...
> index 54c4b69..b718daa 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  }
> 
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
> +		    const char * const strs[], unsigned int strs_size)

static and rename to aer_print_bits since this isn't a generic 'dev'
function.

	David
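
As a rough user-space illustration of the two review comments above (make the
helper static and name it aer_print_bits), the one-line-per-error decoding can
be sketched as follows. This is not the kernel code: dev_err() is replaced by
appending to a caller-supplied buffer so the result can be inspected, and
demo_status_strs, the buffer parameters, and the prefix argument are
assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the driver's status_strs[] table; only the
 * bits that appear in the example logs are named. */
static const char * const demo_status_strs[] = {
	[0]  = "Receiver Error",
	[6]  = "Bad TLP",
	[12] = "Replay Timer Timeout",
};

/*
 * Append one "<prefix>: <name>\n" line to buf for every set bit in bits
 * that has a name. Returns the number of lines written; stops early
 * instead of emitting a truncated line when buf is full.
 */
static unsigned int aer_print_bits(char *buf, size_t buf_size,
				   const char *prefix, unsigned int bits,
				   const char * const strs[],
				   unsigned int strs_size)
{
	size_t used = 0;
	unsigned int i, lines = 0;

	assert(buf && buf_size > 0);
	buf[0] = '\0';

	for (i = 0; i < strs_size && i < 32; i++) {
		size_t need;

		if (!(bits & (1U << i)) || !strs[i])
			continue;
		/* Length of "<prefix>: <name>\n" without the trailing NUL. */
		need = strlen(prefix) + 2 + strlen(strs[i]) + 1;
		if (need >= buf_size - used)
			break;
		snprintf(buf + used, buf_size - used, "%s: %s\n",
			 prefix, strs[i]);
		used += need;
		lines++;
	}
	return lines;
}
```

With status 0x00000041 (bits 0 and 6 set) and the prefix
"e1000e 0003:01:00.1", this yields the two lines shown in the "New log"
example of the patch description.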


* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-17 16:00   ` David Laight
  (?)
@ 2017-10-17 17:13   ` Tyler Baicar
  2017-10-18 10:14       ` David Laight
  -1 siblings, 1 reply; 13+ messages in thread
From: Tyler Baicar @ 2017-10-17 17:13 UTC (permalink / raw)
  To: David Laight, bhelgaas, helgaas, linux-pci, linux-kernel

On 10/17/2017 12:00 PM, David Laight wrote:
> From: Tyler Baicar
>> Sent: 17 October 2017 16:42
>> Currently the AER driver uses cper_print_bits() to print the AER status
>> string. This causes the status string to not include the proper PCI device
>> name prefix that the other AER prints include. Also, it has a different
>> print level than all the other AER prints.
>>
>> Update the AER driver to print the AER status string with the proper string
>> prefix and proper print level.
>>
>> Previous log example:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> Receiver Error, Bad TLP
> ...
>> New log:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> e1000e 0003:01:00.1: Receiver Error
>> e1000e 0003:01:00.1: Bad TLP
> Wouldn't it be better to manage to print the above all on 1 line?
Hello David,

I broke them up into separate lines to simplify the code. If you look at
cper_print_bits(), it is not a clean solution and involves some hard-coded
values to try to limit the lines to 80 characters.

http://elixir.free-electrons.com/linux/v4.14-rc5/source/drivers/firmware/efi/cper.c#L85

I think printing one error per line is the better solution in this case,
since the code is much cleaner. If you would like me to print them as a
list and limit the lines to 80 characters, I can add that, though.
>
> ...
>> index 54c4b69..b718daa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>>   }
>>
>>   #ifdef CONFIG_ACPI_APEI_PCIEAER
>> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
>> +		    const char * const strs[], unsigned int strs_size)
> static and rename to aer_print_bits since this isn't a generic 'dev'
> function.
Will do.

Thanks,
Tyler

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* RE: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-17 17:13   ` Tyler Baicar
@ 2017-10-18 10:14       ` David Laight
  0 siblings, 0 replies; 13+ messages in thread
From: David Laight @ 2017-10-18 10:14 UTC (permalink / raw)
  To: 'Tyler Baicar', bhelgaas, helgaas, linux-pci, linux-kernel

From: Tyler Baicar [mailto:tbaicar@codeaurora.org]
> Sent: 17 October 2017 18:14
> On 10/17/2017 12:00 PM, David Laight wrote:
> > From: Tyler Baicar
> >> Sent: 17 October 2017 16:42
> >> Currently the AER driver uses cper_print_bits() to print the AER status
> >> string. This causes the status string to not include the proper PCI device
> >> name prefix that the other AER prints include. Also, it has a different
> >> print level than all the other AER prints.
> >>
> >> Update the AER driver to print the AER status string with the proper string
> >> prefix and proper print level.
> >>
> >> Previous log example:
> >>
> >> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> >> Receiver Error, Bad TLP
> > ...
> >> New log:
> >>
> >> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> >> e1000e 0003:01:00.1: Receiver Error
> >> e1000e 0003:01:00.1: Bad TLP

> > Wouldn't it be better to manage to print the above all on 1 line?
 
> I broke them up into separate lines to simplify the code. If you look at
> cper_print_bits(),
> it is not a clean solution and involves some hard coded values to try to limit
> the lines to 80 characters.

I'm not sure the 80 char limit is needed.


How about:
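#include <stdio.h>

#define MAX_STR 32
void pr_bits(unsigned int val, const char * const strs[], unsigned int num_str)
{
        /* Unused slots stay NULL; printf() never reads them (see below). */
        const char *str[MAX_STR] = { NULL };
        unsigned int i, num;

        if (num_str > MAX_STR)
                num_str = MAX_STR;
        /* Collect the names of the set bits, lowest bit first. */
        for (i = 0, num = 0; i < num_str; i++) {
                /* 1U avoids undefined behaviour when i == 31. */
                if (!(val & (1U << i)) || !strs[i])
                        continue;
                str[num++] = strs[i];
        }
        /*
         * Each " %s" is 3 characters, so starting (MAX_STR - num) * 3
         * characters into the format leaves exactly num conversions.
         */
        printf(" %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s\n" + (MAX_STR - num) * 3,
                str[0], str[1], str[2], str[3],
                str[4], str[5], str[6], str[7],
                str[8], str[9], str[10], str[11],
                str[12], str[13], str[14], str[15],
                str[16], str[17], str[18], str[19],
                str[20], str[21], str[22], str[23],
                str[24], str[25], str[26], str[27],
                str[28], str[29], str[30], str[31]);
}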

For kernel use you'd probably want to pass in 'dev' and a printf list
and use %pV to put the fixed text on the front of the line.

All rather begging for a new %p? feature that is passed the value, strings
and separator.

	David
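
The "value, strings and separator" idea can also be sketched without the
32-argument printf(), by joining the names into a bounded buffer first. This
is a minimal user-space sketch, not a proposed kernel API: join_bit_names and
its parameters are hypothetical names, and the caller is assumed to print the
resulting buffer with a single call so that the device prefix and all names
land on one line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Join the names of all set bits in val into buf, separated by sep.
 * Bits without a name (NULL entry) are skipped. Returns the number of
 * names written, or -1 if buf is too small (buf then holds only the
 * names that fit, still NUL-terminated).
 */
static int join_bit_names(char *buf, size_t size, unsigned int val,
			  const char * const strs[], unsigned int num_str,
			  const char *sep)
{
	size_t used = 0, sep_len = strlen(sep);
	unsigned int i;
	int count = 0;

	assert(buf && size > 0);
	buf[0] = '\0';

	for (i = 0; i < num_str && i < 32; i++) {
		size_t len, lead;

		if (!(val & (1U << i)) || !strs[i])
			continue;
		len = strlen(strs[i]);
		lead = count ? sep_len : 0;	/* no separator before the first name */
		if (used + lead + len >= size)
			return -1;
		memcpy(buf + used, sep, lead);
		memcpy(buf + used + lead, strs[i], len);
		used += lead + len;
		buf[used] = '\0';
		count++;
	}
	return count;
}
```

For val 0x41, a table naming bit 0 "Receiver Error" and bit 6 "Bad TLP", and
the separator ", ", the buffer ends up holding "Receiver Error, Bad TLP",
which a caller could then emit behind the device prefix in one print call.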


* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-18 10:14       ` David Laight
  (?)
@ 2017-10-18 18:23       ` Tyler Baicar
  -1 siblings, 0 replies; 13+ messages in thread
From: Tyler Baicar @ 2017-10-18 18:23 UTC (permalink / raw)
  To: David Laight, bhelgaas, helgaas, linux-pci, linux-kernel

On 10/18/2017 6:14 AM, David Laight wrote:
> From: Tyler Baicar [mailto:tbaicar@codeaurora.org]
>> Sent: 17 October 2017 18:14
>> On 10/17/2017 12:00 PM, David Laight wrote:
>>> From: Tyler Baicar
>>>> Sent: 17 October 2017 16:42
>>>> Currently the AER driver uses cper_print_bits() to print the AER status
>>>> string. This causes the status string to not include the proper PCI device
>>>> name prefix that the other AER prints include. Also, it has a different
>>>> print level than all the other AER prints.
>>>>
>>>> Update the AER driver to print the AER status string with the proper string
>>>> prefix and proper print level.
>>>>
>>>> Previous log example:
>>>>
>>>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>>>> Receiver Error, Bad TLP
>>> ...
>>>> New log:
>>>>
>>>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>>>> e1000e 0003:01:00.1: Receiver Error
>>>> e1000e 0003:01:00.1: Bad TLP
>>> Wouldn't it be better to manage to print the above all on 1 line?
>   
>> I broke them up into separate lines to simplify the code. If you look at
>> cper_print_bits(),
>> it is not a clean solution and involves some hard coded values to try to limit
>> the lines to 80 characters.
> I'm not sure the 80 char limit is needed.
>
>
> How about:
> #define MAX_STR 32
> void pr_bits(unsigned int val, const char *strs[], unsigned int num_str)
> {
>          const char *str[MAX_STR] = {};
>          unsigned int i, num;
>
>          if (num_str > MAX_STR)
>                  num_str = MAX_STR;
>          for (i = 0, num = 0; i < num_str; i++) {
>                  if (!(val & (1 << i)))
>                          continue;
>                  str[num++] = strs[i];
>          }
>          printf(" %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s\n" + (MAX_STR - num) * 3,
>                  str[0], str[1], str[2], str[3],
>                  str[4], str[5], str[6], str[7],
>                  str[8], str[9], str[10], str[11],
>                  str[12], str[13], str[14], str[15],
>                  str[16], str[17], str[18], str[19],
>                  str[20], str[21], str[22], str[23],
>                  str[24], str[25], str[26], str[27],
>                  str[28], str[29], str[30], str[31]);
> }
>
> For kernel use you'd probably want to pass in 'dev' and a printf list
> and use %pV to put the fixed text on the front of the line.
>
> All rather begging for a new %p? feature that is passed the value, strings
> and separator.
Hi David,

This seems like a bad approach. It can make both the print in the kernel
logs and the code look pretty awful. I would prefer to give each error that
occurred its own print line in the logs rather than introduce this code for
the sole purpose of keeping the list on a single print line. I don't see any
real downside to having a few additional print lines in error scenarios.

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-17 15:42 [PATCH] PCI/AER: update AER status string print to match other AER logs Tyler Baicar
  2017-10-17 16:00   ` David Laight
@ 2017-10-20 23:55 ` Bjorn Helgaas
  2017-11-07 23:18   ` Tyler Baicar
  2017-11-15 14:47 ` Tyler Baicar
  2 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2017-10-20 23:55 UTC (permalink / raw)
  To: Tyler Baicar; +Cc: bhelgaas, linux-pci, linux-kernel

On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
> 
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
> 
> Previous log example:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> Receiver Error, Bad TLP
> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
> Replay Timer Timeout
> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
> 
> New log:
> 
> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
> e1000e 0003:01:00.1: Receiver Error
> e1000e 0003:01:00.1: Bad TLP
> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
> pcieport 0003:00:00.0: Replay Timer Timeout
> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID

I definitely think it's MUCH better to use dev_err() as you do.

I don't like the cper_print_bits() strategy of inserting line breaks
to fit in 80 columns.  That leads to atomicity issues, e.g., other
printk output getting inserted in the middle of a single AER log, and
suggests an ordering ("Receiver Error" occurred before "Bad TLP") that
isn't real.  It'd be ideal if everything fit on one line per event,
but that might not be practical.

I'm not necessarily attached to the actual strings.  These messages
are for sophisticated users and maybe could be abbreviated as in lspci
output.  It might actually be kind of neat if the output here matched
up with the output of "lspci -vv" (lspci prints all the bits; here you
probably want only the set bits).  Or maybe not.

But even what you have here is a huge improvement.  I *hate*
unattached things in dmesg like we currently get.  There's no reliable
way to connect that "Receiver Error, Bad TLP" with the device.

> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> ---
>  drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> index 54c4b69..b718daa 100644
> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>  }
>  
>  #ifdef CONFIG_ACPI_APEI_PCIEAER
> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
> +		    const char * const strs[], unsigned int strs_size)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < strs_size; i++) {
> +		if (!(bits & (1U << i)))
> +			continue;
> +		if (strs[i])
> +			dev_err(&dev->dev, "%s\n", strs[i]);
> +	}
> +}
> +
>  int cper_severity_to_aer(int cper_severity)
>  {
>  	switch (cper_severity) {
> @@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>  	agent = AER_GET_AGENT(aer_severity, status);
>  
>  	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
> -	cper_print_bits("", status, status_strs, status_strs_size);
> +	dev_print_bits(dev, status, status_strs, status_strs_size);
>  	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>  		aer_error_layer[layer], aer_agent_string[agent]);
>  
> -- 
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.
> 

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-20 23:55 ` Bjorn Helgaas
@ 2017-11-07 23:18   ` Tyler Baicar
  0 siblings, 0 replies; 13+ messages in thread
From: Tyler Baicar @ 2017-11-07 23:18 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci, linux-kernel

On 10/20/2017 7:55 PM, Bjorn Helgaas wrote:
> On Tue, Oct 17, 2017 at 09:42:02AM -0600, Tyler Baicar wrote:
>> Currently the AER driver uses cper_print_bits() to print the AER status
>> string. This causes the status string to not include the proper PCI device
>> name prefix that the other AER prints include. Also, it has a different
>> print level than all the other AER prints.
>>
>> Update the AER driver to print the AER status string with the proper string
>> prefix and proper print level.
>>
>> Previous log example:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> Receiver Error, Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
>>
>> New log:
>>
>> e1000e 0003:01:00.1: aer_status: 0x00000041, aer_mask: 0x00000000
>> e1000e 0003:01:00.1: Receiver Error
>> e1000e 0003:01:00.1: Bad TLP
>> e1000e 0003:01:00.1: aer_layer=Physical Layer, aer_agent=Receiver ID
>> pcieport 0003:00:00.0: aer_status: 0x00001000, aer_mask: 0x0000e000
>> pcieport 0003:00:00.0: Replay Timer Timeout
>> pcieport 0003:00:00.0: aer_layer=Data Link Layer, aer_agent=Transmitter ID
> I definitely think it's MUCH better to use dev_err() as you do.
>
> I don't like the cper_print_bits() strategy of inserting line breaks
> to fit in 80 columns.  That leads to atomicity issues, e.g., other
> printk output getting inserted in the middle of a single AER log, and
> suggests an ordering ("Receiver Error" occurred before "Bad TLP") that
> isn't real.  It'd be ideal if everything fit on one line per event,
> but that might not be practical.
>
> I'm not necessarily attached to the actual strings.  These messages
> are for sophisticated users and maybe could be abbreviated as in lspci
> output.  It might actually be kind of neat if the output here matched
> up with the output of "lspci -vv" (lspci prints all the bits; here you
> probably want only the set bits).  Or maybe not.
>
> But even what you have here is a huge improvement.  I *hate*
> unattached things in dmesg like we currently get.  There's no reliable
> way to connect that "Receiver Error, Bad TLP" with the device.
Hello Bjorn,

Thanks for the feedback. Do you think this can get into 4.15?

Thanks,
Tyler
>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>> ---
>>   drivers/pci/pcie/aer/aerdrv_errprint.c | 15 ++++++++++++++-
>>   1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> index 54c4b69..b718daa 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
>> @@ -206,6 +206,19 @@ void aer_print_port_info(struct pci_dev *dev, struct aer_err_info *info)
>>   }
>>   
>>   #ifdef CONFIG_ACPI_APEI_PCIEAER
>> +void dev_print_bits(struct pci_dev *dev, unsigned int bits,
>> +		    const char * const strs[], unsigned int strs_size)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < strs_size; i++) {
>> +		if (!(bits & (1U << i)))
>> +			continue;
>> +		if (strs[i])
>> +			dev_err(&dev->dev, "%s\n", strs[i]);
>> +	}
>> +}
>> +
>>   int cper_severity_to_aer(int cper_severity)
>>   {
>>   	switch (cper_severity) {
>> @@ -243,7 +256,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
>>   	agent = AER_GET_AGENT(aer_severity, status);
>>   
>>   	dev_err(&dev->dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
>> -	cper_print_bits("", status, status_strs, status_strs_size);
>> +	dev_print_bits(dev, status, status_strs, status_strs_size);
>>   	dev_err(&dev->dev, "aer_layer=%s, aer_agent=%s\n",
>>   		aer_error_layer[layer], aer_agent_string[agent]);
>>   
>> -- 
>> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
>> Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
>> a Linux Foundation Collaborative Project.
>>

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-10-17 15:42 [PATCH] PCI/AER: update AER status string print to match other AER logs Tyler Baicar
  2017-10-17 16:00   ` David Laight
  2017-10-20 23:55 ` Bjorn Helgaas
@ 2017-11-15 14:47 ` Tyler Baicar
  2017-11-15 17:56   ` Bjorn Helgaas
  2 siblings, 1 reply; 13+ messages in thread
From: Tyler Baicar @ 2017-11-15 14:47 UTC (permalink / raw)
  To: bhelgaas, helgaas, linux-pci, linux-kernel

On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> Currently the AER driver uses cper_print_bits() to print the AER status
> string. This causes the status string to not include the proper PCI device
> name prefix that the other AER prints include. Also, it has a different
> print level than all the other AER prints.
>
> Update the AER driver to print the AER status string with the proper string
> prefix and proper print level.
Hello,

Will this patch be pulled into 4.15?

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-11-15 14:47 ` Tyler Baicar
@ 2017-11-15 17:56   ` Bjorn Helgaas
  2017-12-13 16:50     ` Tyler Baicar
  0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2017-11-15 17:56 UTC (permalink / raw)
  To: Tyler Baicar; +Cc: bhelgaas, linux-pci, linux-kernel

Hi Tyler,

On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
> On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> >Currently the AER driver uses cper_print_bits() to print the AER status
> >string. This causes the status string to not include the proper PCI device
> >name prefix that the other AER prints include. Also, it has a different
> >print level than all the other AER prints.
> >
> >Update the AER driver to print the AER status string with the proper string
> >prefix and proper print level.
> Hello,
> 
> Will this patch be pulled into 4.15?

Sorry, I am preparing the 4.15 pull request right now, and it doesn't
include this change.

I do like the dev_err() change, but would prefer fewer lines of
output.  I could have applied just the dev_err() change, but to
minimize pain for people who parse the logs, I'd rather make one
change in the output instead of making one change now and another
later.

Bjorn

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-11-15 17:56   ` Bjorn Helgaas
@ 2017-12-13 16:50     ` Tyler Baicar
  2017-12-13 19:24       ` Bjorn Helgaas
  0 siblings, 1 reply; 13+ messages in thread
From: Tyler Baicar @ 2017-12-13 16:50 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: bhelgaas, linux-pci, linux-kernel

On 11/15/2017 12:56 PM, Bjorn Helgaas wrote:
> Hi Tyler,
>
> On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
>> On 10/17/2017 11:42 AM, Tyler Baicar wrote:
>>> Currently the AER driver uses cper_print_bits() to print the AER status
>>> string. This causes the status string to not include the proper PCI device
>>> name prefix that the other AER prints include. Also, it has a different
>>> print level than all the other AER prints.
>>>
>>> Update the AER driver to print the AER status string with the proper string
>>> prefix and proper print level.
>> Hello,
>>
>> Will this patch be pulled into 4.15?
> Sorry, I am preparing the 4.15 pull request right now, and it doesn't
> include this change.
>
> I do like the dev_err() change, but would prefer fewer lines of
> output.  I could have applied just the dev_err() change, but to
> minimize pain for people who parse the logs, I'd rather make one
> change in the output instead of making one change now and another
> later.
Hello Bjorn,

Are there existing abbreviations for these AER status strings that I cannot
find? Or do you want me to abbreviate them similar to the style used in the
lspci -vv output?

Once they are abbreviated, you'd prefer to have all errors that have
occurred printed on the same line, correct?

Thanks,
Tyler

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

* Re: [PATCH] PCI/AER: update AER status string print to match other AER logs
  2017-12-13 16:50     ` Tyler Baicar
@ 2017-12-13 19:24       ` Bjorn Helgaas
  0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2017-12-13 19:24 UTC (permalink / raw)
  To: Tyler Baicar; +Cc: bhelgaas, linux-pci, linux-kernel

On Wed, Dec 13, 2017 at 11:50:56AM -0500, Tyler Baicar wrote:
> On 11/15/2017 12:56 PM, Bjorn Helgaas wrote:
> >Hi Tyler,
> >
> >On Wed, Nov 15, 2017 at 09:47:41AM -0500, Tyler Baicar wrote:
> >>On 10/17/2017 11:42 AM, Tyler Baicar wrote:
> >>>Currently the AER driver uses cper_print_bits() to print the AER status
> >>>string. This causes the status string to not include the proper PCI device
> >>>name prefix that the other AER prints include. Also, it has a different
> >>>print level than all the other AER prints.
> >>>
> >>>Update the AER driver to print the AER status string with the proper string
> >>>prefix and proper print level.
> >>Hello,
> >>
> >>Will this patch be pulled into 4.15?
> >Sorry, I am preparing the 4.15 pull request right now, and it doesn't
> >include this change.
> >
> >I do like the dev_err() change, but would prefer fewer lines of
> >output.  I could have applied just the dev_err() change, but to
> >minimize pain for people who parse the logs, I'd rather make one
> >change in the output instead of making one change now and another
> >later.
> Hello Bjorn,
> 
> Are there existing abbreviations for these AER status strings that I
> cannot find? Or do you want
> me to abbreviate them similar to the style used with prints in lspci -vv?

I think the terms used by lspci -vv would be a good start.

> Once they are abbreviated, you'd prefer to have all errors that have
> occurred to be printed on
> the same line, correct?

Yes.  Multiple lines suggest an ordering that really isn't there, so
if we can print them all at once, it both improves atomicity and
removes the erroneous suggestion that "this error occurred before this
other one".

Bjorn
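
To make the direction agreed above concrete, here is a hedged user-space
sketch that formats one complete line per event, with only the set bits
listed under short names. The abbreviations are modeled from memory on what
lspci -vv shows for the correctable error status register and should be
checked against real lspci output; the table demo_cor_abbrev, the function
demo_format_aer_line, and the exact line layout are all hypothetical and are
not taken from this thread or from any merged patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed lspci-style short names for correctable error status bits. */
static const char * const demo_cor_abbrev[] = {
	[0]  = "RxErr",
	[6]  = "BadTLP",
	[7]  = "BadDLLP",
	[8]  = "Rollover",
	[12] = "Timeout",
};
#define DEMO_COR_ABBREV_SIZE \
	(sizeof(demo_cor_abbrev) / sizeof(demo_cor_abbrev[0]))

/*
 * Build "<dev>: aer_status: 0x..., aer_mask: 0x... <names>" in buf so the
 * caller can emit it with a single print call. Returns the line length,
 * or -1 if buf is too small.
 */
static int demo_format_aer_line(char *buf, size_t size, const char *dev_name,
				unsigned int status, unsigned int mask)
{
	size_t used;
	unsigned int i;
	int n;

	assert(buf && size > 0);
	n = snprintf(buf, size, "%s: aer_status: 0x%08x, aer_mask: 0x%08x",
		     dev_name, status, mask);
	if (n < 0 || (size_t)n >= size)
		return -1;
	used = (size_t)n;

	for (i = 0; i < DEMO_COR_ABBREV_SIZE; i++) {
		size_t len;

		if (!(status & (1U << i)) || !demo_cor_abbrev[i])
			continue;
		len = strlen(demo_cor_abbrev[i]);
		if (used + 1 + len >= size)
			return -1;
		buf[used++] = ' ';
		memcpy(buf + used, demo_cor_abbrev[i], len);
		used += len;
		buf[used] = '\0';
	}
	return (int)used;
}
```

Because the whole event is assembled before anything is printed, no other
output can be interleaved between the status value and the error names, and
the names no longer imply an ordering.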
