linux-kernel-mentees.lists.linuxfoundation.org archive mirror
* [Linux-kernel-mentees] [RFC 0/3] staging: qlge: Re-writing the debugging features
@ 2020-08-14 16:05 Coiby Xu
From: Coiby Xu @ 2020-08-14 16:05 UTC (permalink / raw)
  To: netdev
  Cc: Michal Kubecek, GR-Linux-NIC-Dev, Manish Chopra,
	Benjamin Poirier, linux-kernel-mentees

This patch set aims to avoid dumping registers, data structures and
coredump to dmesg and also to reduce the code size of the qlge driver.

As pointed out by Benjamin [1],

> At 2000 lines, qlge_dbg.c alone is larger than some entire ethernet
> drivers. Most of what it does is dump kernel data structures or pci
> memory mapped registers to dmesg. There are better facilities for that.
> My thinking is not simply to delete qlge_dbg.c but to replace it, making
> sure that most of the same information is still available. For data
> structures, crash or drgn can be used; possibly with a script for the
> latter which formats the data. For pci registers, they should be
> included in the ethtool register dump and a patch added to ethtool to
> pretty print them. That's what other drivers like e1000e do. For the
> "coredump", devlink health can be used.

So the debugging features are re-written following Benjamin's advice,
   - use ethtool to dump registers
   - dump kernel data structures in drgn
   - use devlink health to do coredump

The get_regs ethtool_ops has already been implemented. What is lacking is a
patch for the userland ethtool to do the pretty-printing. I haven't yet provided
a patch to the userland ethtool because I'm aware ethtool is moving towards
the netlink interface [2]. I'm curious if a generalized mechanism of
pretty-printing will be implemented thus making pretty-printing for a
specific driver unnecessary. As of this writing, `-d|--register-dump`
hasn't been implemented for the netlink interface.
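
Until such a patch exists, a raw dump can still be decoded offline. Below is
a minimal sketch, assuming the dump was saved with
`ethtool -d DEVNAME raw on > regs.bin` and that the registers are 32-bit
little-endian words. The register names are placeholders, not the real qlge
layout, and decode_regs/format_regs are hypothetical helpers:

```python
import struct

# Hypothetical register names, for illustration only; the real layout
# would have to come from the driver's get_regs implementation.
REG_NAMES = ["REG_A", "REG_B", "REG_C"]

def decode_regs(raw, names=REG_NAMES):
    """Split a raw register dump into (name, value) pairs of 32-bit words."""
    count = len(raw) // 4
    # Assumption: little-endian 32-bit registers; trailing bytes are ignored.
    values = struct.unpack("<{}I".format(count), raw[:count * 4])
    pairs = []
    for i, value in enumerate(values):
        # Fall back to a positional name when no name is known.
        name = names[i] if i < len(names) else "reg[{}]".format(i)
        pairs.append((name, value))
    return pairs

def format_regs(pairs):
    """Render the pairs one per line, name left-aligned, value in hex."""
    return "\n".join("{:<12} 0x{:08x}".format(n, v) for n, v in pairs)
```

With a saved dump this could be used as
`print(format_regs(decode_regs(open("regs.bin", "rb").read())))`.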


To dump kernel data structures, the following Python script can be used
in drgn,


    ```python
    # In the drgn CLI, prog and these names are already available; the
    # imports make the script usable as a standalone file as well.
    from drgn import Object, sizeof
    from drgn.helpers.linux.list import list_for_each_entry

    def align(x, a):
        """Round x up to a multiple of a; a must be a power of 2."""
        mask = a - 1
        return (x + mask) & ~mask

    def struct_size(struct_type):
        struct_str = "struct {}".format(struct_type)
        return sizeof(Object(prog, struct_str, address=0x0))

    def netdev_priv(netdevice):
        # Private data follows struct net_device, aligned to NETDEV_ALIGN.
        NETDEV_ALIGN = 32
        return netdevice.value_() + align(struct_size("net_device"), NETDEV_ALIGN)

    name = 'xxx'  # interface name of the qlge device
    qlge_device = None
    netdevices = prog['init_net'].dev_base_head.address_of_()
    for netdevice in list_for_each_entry("struct net_device", netdevices, "dev_list"):
        if netdevice.name.string_().decode('ascii') == name:
            qlge_device = netdevice
            break

    ql_adapter = Object(prog, "struct ql_adapter", address=netdev_priv(qlge_device))
    ```
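
The offset computed by netdev_priv() above follows the kernel helper of the
same name, which places the private area right after struct net_device,
rounded up to NETDEV_ALIGN. The rounding can be checked outside drgn. The
sizes below are made-up values chosen to show the effect, not the real
sizeof(struct net_device):

```python
def align(x, a):
    """Round x up to the next multiple of a (a must be a power of 2)."""
    mask = a - 1
    return (x + mask) & ~mask

NETDEV_ALIGN = 32

# Hypothetical structure sizes, chosen only to show the rounding.
for size in (2304, 2305, 2335):
    print(size, "->", align(size, NETDEV_ALIGN))
```

A size that is already a multiple of 32 is left unchanged; anything else is
bumped to the next multiple.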

The struct ql_adapter will be printed in drgn as follows,
    >>> ql_adapter
    (struct ql_adapter){
            .ricb = (struct ricb){
                    .base_cq = (u8)0,
                    .flags = (u8)120,
                    .mask = (__le16)26637,
                    .hash_cq_id = (u8 [1024]){ 172, 142, 255, 255 },
                    .ipv6_hash_key = (__le32 [10]){},
                    .ipv4_hash_key = (__le32 [4]){},
            },
            .flags = (unsigned long)0,
            .wol = (u32)0,
            .nic_stats = (struct nic_stats){
                    .tx_pkts = (u64)0,
                    .tx_bytes = (u64)0,
                    .tx_mcast_pkts = (u64)0,
                    .tx_bcast_pkts = (u64)0,
                    .tx_ucast_pkts = (u64)0,
                    .tx_ctl_pkts = (u64)0,
                    .tx_pause_pkts = (u64)0,
                    ...
            },
            .active_vlans = (unsigned long [64]){
                    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 52780853100545, 18446744073709551615,
                    18446619461681283072, 0, 42949673024, 2147483647,
            },
            .rx_ring = (struct rx_ring [17]){
                    {
                            .cqicb = (struct cqicb){
                                    .msix_vect = (u8)0,
                                    .reserved1 = (u8)0,
                                    .reserved2 = (u8)0,
                                    .flags = (u8)0,
                                    .len = (__le16)0,
                                    .rid = (__le16)0,
                                    ...
                            },
                            .cq_base = (void *)0x0,
                            .cq_base_dma = (dma_addr_t)0,
                    }
                    ...
            }
    }


The coredump obtained via devlink in JSON format looks like this,

    $ devlink health dump show DEVICE reporter coredump -p -j
    {
        "Core Registers": {
            "segment": 1,
            "values": [ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]
        },
        "Test Logic Regs": {
            "segment": 2,
            "values": [ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]
        },
        "RMII Registers": {
            "segment": 3,
            "values": [ 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 ]
        },
        ...
        "Sem Registers": {
            "segment": 50,
            "values": [ 0,0,0,0 ]
        }
    }
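
Because the dump is plain JSON, it is easy to post-process. Below is a
minimal sketch that lists each section with its segment number, word count
and number of non-zero words. It assumes the top-level layout shown above
(section name mapping to "segment" and "values"); summarize_dump is a
hypothetical helper and the sample input is shortened test data, not real
device output:

```python
import json

def summarize_dump(text):
    """Return (name, segment, word count, non-zero count) per section."""
    dump = json.loads(text)
    rows = []
    for name, section in dump.items():
        values = section["values"]
        nonzero = sum(1 for v in values if v != 0)
        rows.append((name, section["segment"], len(values), nonzero))
    # Order the sections by segment number.
    return sorted(rows, key=lambda row: row[1])

# Shortened, made-up sample in the same shape as the devlink output.
sample = '''
{
    "Sem Registers": {"segment": 50, "values": [0, 0, 0, 0]},
    "Test Logic Regs": {"segment": 2, "values": [0, 5, 0]}
}
'''
for row in summarize_dump(sample):
    print(row)
```

The real output could be fed in by capturing the devlink command's stdout,
for example with subprocess.run().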

Since I have neither a QLGE device nor a software simulator for one, I put
some functions into e1000 to get the above result.

I notice that with the qlge_force_coredump module parameter set, ethtool
can also get the coredump. I'm not sure which tool is more suitable for
the coredump feature.

[1] https://lkml.org/lkml/2020/6/30/19
[2] https://www.kernel.org/doc/html/latest/networking/ethtool-netlink.html

Coiby Xu (3):
  Initialize devlink health dump framework for the dlge driver
  coredump via devlink health reporter
  clean up code that dump info to dmesg

 drivers/staging/qlge/Makefile       |   2 +-
 drivers/staging/qlge/qlge.h         |  91 +---
 drivers/staging/qlge/qlge_dbg.c     | 672 ----------------------------
 drivers/staging/qlge/qlge_ethtool.c |   1 -
 drivers/staging/qlge/qlge_health.c  | 156 +++++++
 drivers/staging/qlge/qlge_health.h  |   2 +
 drivers/staging/qlge/qlge_main.c    |  27 +-
 7 files changed, 189 insertions(+), 762 deletions(-)
 create mode 100644 drivers/staging/qlge/qlge_health.c
 create mode 100644 drivers/staging/qlge/qlge_health.h

--
2.27.0
