From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>, rcu <rcu@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	netdev@vger.kernel.org
Subject: Re: BUG: Kernel NULL pointer dereference on write at 0x00000000 (rtmsg_ifinfo_build_skb)
Date: Sat, 29 Jan 2022 17:52:12 +0100
Message-ID: <3534d781-7d01-b42a-8974-0b1c367946f0@molgen.mpg.de>
In-Reply-To: <CAABZP2xampOLo8k93OLgaOfv9LreJ+f0g0_1mXwqtrv_LKewQg@mail.gmail.com>

Dear Zhouyi,


Thank you for taking the time.


On 29.01.22 at 03:23, Zhouyi Zhou wrote:

> I don't have an IBM machine, so I tried to analyze the problem using
> my x86_64 KVM virtual machine, but I can't reproduce the bug there.

I have no idea whether it’s architecture-specific.

> I saw that the panic is caused by the registration of the sit device (a
> sit device is a type of virtual network device that takes our IPv6
> traffic, encapsulates/decapsulates it in IPv4 packets, and sends/receives
> it over the IPv4 Internet to another host).
> 
> The sit device is registered in the function sit_init_net:
> 1895    static int __net_init sit_init_net(struct net *net)
> 1896    {
> 1897        struct sit_net *sitn = net_generic(net, sit_net_id);
> 1898        struct ip_tunnel *t;
> 1899        int err;
> 1900
> 1901        sitn->tunnels[0] = sitn->tunnels_wc;
> 1902        sitn->tunnels[1] = sitn->tunnels_l;
> 1903        sitn->tunnels[2] = sitn->tunnels_r;
> 1904        sitn->tunnels[3] = sitn->tunnels_r_l;
> 1905
> 1906        if (!net_has_fallback_tunnels(net))
> 1907            return 0;
> 1908
> 1909        sitn->fb_tunnel_dev = alloc_netdev(sizeof(struct ip_tunnel), "sit0",
> 1910                           NET_NAME_UNKNOWN,
> 1911                           ipip6_tunnel_setup);
> 1912        if (!sitn->fb_tunnel_dev) {
> 1913            err = -ENOMEM;
> 1914            goto err_alloc_dev;
> 1915        }
> 1916        dev_net_set(sitn->fb_tunnel_dev, net);
> 1917        sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;
> 1918        /* FB netdevice is special: we have one, and only one per netns.
> 1919         * Allowing to move it to another netns is clearly unsafe.
> 1920         */
> 1921        sitn->fb_tunnel_dev->features |= NETIF_F_NETNS_LOCAL;
> 1922
> 1923        err = register_netdev(sitn->fb_tunnel_dev);
> 
> register_netdev on line 1923 will call if_nlmsg_size indirectly.
> 
> On the other hand, the function that calls the strlen that panicked is
> if_nlmsg_size:
> (gdb) disassemble if_nlmsg_size
> Dump of assembler code for function if_nlmsg_size:
>     0xffffffff81a0dc20 <+0>:    nopl   0x0(%rax,%rax,1)
>     0xffffffff81a0dc25 <+5>:    push   %rbp
>     0xffffffff81a0dc26 <+6>:    push   %r15
>     0xffffffff81a0dd04 <+228>:    je     0xffffffff81a0de20 <if_nlmsg_size+512>
>     0xffffffff81a0dd0a <+234>:    mov    0x10(%rbp),%rdi
>     ...
>   => 0xffffffff81a0dd0e <+238>:    callq  0xffffffff817532d0 <strlen>
>     0xffffffff81a0dd13 <+243>:    add    $0x10,%eax
>     0xffffffff81a0dd16 <+246>:    movslq %eax,%r12

Excuse my ignorance: would that look the same on ppc64le?
Unfortunately, I didn’t save the problematic `vmlinuz` file, but in a
current build (without rcutorture) strlen shows up in the line below.

     (gdb) disassemble if_nlmsg_size
     […]
     0xc000000000f7f82c <+332>:	bl      0xc000000000a10e30 <strlen>
     […]

> and the C code for 0xffffffff81a0dd0e is the following (line 524):
> 515    static size_t rtnl_link_get_size(const struct net_device *dev)
> 516    {
> 517        const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
> 518        size_t size;
> 519
> 520        if (!ops)
> 521            return 0;
> 522
> 523        size = nla_total_size(sizeof(struct nlattr)) + /* IFLA_LINKINFO */
> 524               nla_total_size(strlen(ops->kind) + 1);  /* IFLA_INFO_KIND */
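
As a point of reference, here is a minimal user-space sketch of the two size
terms quoted above. The structs are stand-ins that model only the fields used
here, the types are simplified to size_t, and the check for a NULL `kind` is a
hypothetical guard that is not in the quoted kernel code. It only illustrates
that the quoted function tests `ops` but passes `ops->kind` to strlen
unchecked, so a NULL or clobbered `kind` would fault inside strlen. Whether
that is what happened here is not established.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-ins; only the fields used below are modeled. */
struct rtnl_link_ops { const char *kind; };
struct net_device { const struct rtnl_link_ops *rtnl_link_ops; };

#define NLA_ALIGNTO 4
#define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(size_t)(NLA_ALIGNTO - 1))
#define NLA_HDRLEN 4 /* size of struct nlattr: two 16-bit fields */

/* Attribute header plus payload, rounded up to 4 bytes. */
size_t nla_total_size(size_t payload)
{
    return NLA_ALIGN(NLA_HDRLEN + payload);
}

/* Mirrors only the two terms shown in the quoted lines 523-524. */
size_t link_get_size(const struct net_device *dev)
{
    const struct rtnl_link_ops *ops;

    assert(dev != NULL);
    ops = dev->rtnl_link_ops;
    if (!ops)
        return 0;

    /* Hypothetical guard, NOT in the quoted code: without it,
     * strlen(NULL) below would dereference a NULL pointer. */
    if (!ops->kind)
        return 0;

    return nla_total_size(NLA_HDRLEN) +          /* IFLA_LINKINFO */
           nla_total_size(strlen(ops->kind) + 1); /* IFLA_INFO_KIND */
}
```

With `kind` set to "sit", link_get_size() returns 16: 8 bytes for the nested
attribute header and 8 bytes for the 4-byte string plus its header.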

How do I connect the disassembly output with the corresponding line?
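
One possible approach, as a sketch only: the `<+238>` that gdb prints is the
instruction address minus the start address of the function, so the call can
be named as symbol plus offset. With a `vmlinux` that carries debug info,
gdb's `info line *ADDRESS` or binutils' `addr2line -e vmlinux ADDRESS` should
then be able to map the address to a file and line. The helper names below
are hypothetical, and the addresses in the usage note are the x86_64 ones
quoted above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Offset of an instruction from the start of its function,
 * i.e. the N that gdb prints as <+N> in a disassembly. */
uint64_t insn_offset(uint64_t insn_addr, uint64_t func_start)
{
    assert(insn_addr >= func_start);
    return insn_addr - func_start;
}

/* Write "symbol+0xoff" into buf; returns what snprintf returns. */
int format_sym_offset(char *buf, size_t len, const char *sym,
                      uint64_t insn_addr, uint64_t func_start)
{
    return snprintf(buf, len, "%s+0x%" PRIx64, sym,
                    insn_offset(insn_addr, func_start));
}
```

For the quoted call site, insn_offset(0xffffffff81a0dd0e, 0xffffffff81a0dc20)
gives 238, and format_sym_offset() produces "if_nlmsg_size+0xee".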

> But ops is assigned the value of sit_link_ops in function sit_init_net
> line 1917, so I guess something must have happened between the calls.
> 
> Do we have KASAN on the IBM machine? Would KASAN help us find out what
> happened in between?

Unfortunately, as far as I can see, KASAN is not supported on the Power
system I have. From `arch/powerpc/Kconfig`:

         select HAVE_ARCH_KASAN                  if PPC32 && PPC_PAGE_SHIFT <= 14
         select HAVE_ARCH_KASAN_VMALLOC          if PPC32 && PPC_PAGE_SHIFT <= 14

> I hope I can be of more help.

Some distributions support multi-arch, which makes it easy to
cross-compile for different architectures.


Kind regards,

Paul
