From: Ben Greear <greearb@candelatech.com>
To: "Grumbach, Emmanuel" <emmanuel.grumbach@intel.com>,
	"kvalo@qca.qualcomm.com" <kvalo@qca.qualcomm.com>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>,
	"ath10k@lists.infradead.org" <ath10k@lists.infradead.org>
Subject: Re: [PATCH v2 09/21] ath10k: print fw debug messages in hex.
Date: Thu, 15 Sep 2016 10:59:28 -0700
Message-ID: <50a08de4-1959-3dc9-cfc7-89d5b2914cc5@candelatech.com>
In-Reply-To: <1473960869.31073.15.camel@intel.com>

On 09/15/2016 10:34 AM, Grumbach, Emmanuel wrote:
> On Thu, 2016-09-15 at 08:14 -0700, Ben Greear wrote:
>> On 09/15/2016 07:06 AM, Valo, Kalle wrote:
>>> Ben Greear <greearb@candelatech.com> writes:
>>>
>>>> On 09/14/2016 07:18 AM, Valo, Kalle wrote:
>>>>> greearb@candelatech.com writes:
>>>>>
>>>>>> From: Ben Greear <greearb@candelatech.com>
>>>>>>
>>>>>> This allows user-space tools to decode debug-log
>>>>>> messages by parsing dmesg or /var/log/messages.
>>>>>>
>>>>>> Signed-off-by: Ben Greear <greearb@candelatech.com>
>>>>>
>>>>> Don't tracing points already provide the same information?
>>>>
>>>> Tracing tools are difficult to set up and may not be available on
>>>> random embedded devices.  And if we are dealing with bug reports from
>>>> the field, most users will not be able to set it up regardless.
>>>>
>>>> There are similar ways to print out hex, but the logic below creates
>>>> specific and parseable logs in the 'dmesg' output and similar.
>>>>
>>>> I have written a tool that can decode these messages into useful
>>>> human-readable text so that I can debug firmware issues both locally
>>>> and from field reports.
>>>>
>>>> Stock firmware generates similar logs and QCA could write their own
>>>> decode logic for their firmware versions.
>>>
>>> Reinventing the wheel by using printk as the delivery mechanism doesn't
>>> sound like a good idea. IIRC Emmanuel talked about some kind of firmware
>>> debugging framework, he might have some ideas.
>>
>> Waiting for magical frameworks to fix problems is even worse.
>>
>> It has been years since ath10k has been in the kernel.  There is basically
>> still no way to debug what the firmware is doing.
>>
>
> I know the feeling :) I was in the same situation before I added stuff
> for iwlwifi.
>
>> My patch gives you something that can work right now, with the standard
>> 'dmesg' framework found in virtually all kernels new and old, and it has
>> been proven to be useful in the field.  The messages are also nicely
>> interleaved with the rest of the mac80211 stack messages and any other
>> driver messages, so you have context.
>>
>> If someone wants to add support for a framework later, then by all means,
>> post the patches when it is ready.
>
> From my experience, a strong and easy-to-use firmware debug
> infrastructure is important because typically, the firmware is written
> by other people who have different priorities (and are not always Linux
> wizards) etc... Being able to give them good data is the only way to
> have them fix their bugs :) For us, it was really a game changer. When
> you work for a big corporate, having 2 groups work better together
> always has a big impact. That's for the philosophical part :)
>
> FWIW: what I did has nothing to do with FW 'live tracing', but with
> firmware dumps. One part of our firmware dumps include tracing. We also
> have "firmware prints", but we don't print them in the kernel log and
> they are not part of the firmware dump thing. We rather record them in
> tracepoints just like really *anything* that comes from the firmware.
> Basically, we have 2 layers, the transport layer (PCIe) and the
> operation_mode layer. The first just brings the data from the firmware
> and in that layer we *blindly* record everything in tracepoints. In the
> operation_mode layer, we look at the data itself. In case of debug
> prints from the firmware, we simply discard them, because we don't
> really care of the meaning. All we want is to have them go through the
> PCIe layer so that they are recorded in the tracepoints.
> When we finish recording the sequence we wanted with tracing
> (trace-cmd), we parse the output and then, we parse the firmware prints.
> IMHO, this is more reliable than kernel logs and you don't lose the
> alignment with the driver traces as long as you have driver data in
> tracepoints as well.

I have other patches that keep the last 100 or so firmware log messages in
the kernel and provide them in a binary dump image when the firmware crashes.
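
A minimal sketch of that "keep the last N messages" idea, written in
user-space Python rather than as the actual driver patch. The class name
FwLogHistory, its methods, and the length-prefixed dump layout are all
assumptions made for illustration; the real patches may store and dump the
records quite differently.

```python
from collections import deque


class FwLogHistory:
    """Bounded history of firmware log records (hypothetical helper)."""

    def __init__(self, capacity=100):
        # A deque with maxlen silently drops the oldest record once full,
        # which gives ring-buffer behaviour without manual index handling.
        self._records = deque(maxlen=capacity)

    def record(self, payload: bytes) -> None:
        # Copy the payload so later changes by the caller cannot alter history.
        self._records.append(bytes(payload))

    def dump(self) -> bytes:
        # Assumed dump layout: each record is a 32-bit little-endian length
        # followed by the raw payload, oldest record first.
        out = bytearray()
        for rec in self._records:
            out += len(rec).to_bytes(4, "little")
            out += rec
        return bytes(out)
```

With capacity=100 and one record() call per firmware log event, dump() at
crash time would return only the most recent 100 records.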

This is indeed very useful.

But, when debugging non-crash occasions, it is still useful to see what
the firmware is doing.

For instance, maybe it is reporting lots of tx-hangs and/or low-level
resets.  This gives you a clue as to why a user might report 'my wifi sucks'.
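
As a rough illustration of how a user-space tool could pull such hex
messages back out of dmesg or /var/log/messages, here is a small Python
sketch. The "fwlog:" tag and the layout of space-separated 8-digit hex
words are assumptions invented for this example; they are not the format
the patch actually prints, and turning the recovered words into readable
text would still need firmware-specific decode logic.

```python
import re

# Assumed (hypothetical) line format: any prefix, then "fwlog:", then one
# or more 32-bit words printed as 8 hex digits separated by whitespace.
FWLOG_RE = re.compile(r"fwlog:\s*((?:[0-9a-fA-F]{8}\s*)+)$")


def extract_fw_words(lines):
    """Return the 32-bit words found on matching log lines, in order."""
    words = []
    for line in lines:
        match = FWLOG_RE.search(line.rstrip())
        if match:
            # Lines without the tag (other driver or mac80211 messages)
            # never reach this point and are simply skipped.
            words.extend(int(word, 16) for word in match.group(1).split())
    return words
```

Feeding it the lines of a saved dmesg capture would yield the raw firmware
words in the order they were logged, while leaving the surrounding kernel
messages available for context.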

I am both the FW and the driver team for my firmware variant, and my approach
has been working for me, so I feel it is certainly better than the current
state.  And just maybe the official upstream FW team could start using
something similar as well.  Currently, I don't see how they can ever make
much progress on firmware crashes reported in stock kernels.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
