Hi Alexander Viro and dear Linux filesystem maintainers,

We recently encountered a NULL pointer dereference Oops in our
production environment.

We have spent the past few weeks analyzing the core dump and comparing
it with the source code, but we still cannot understand why
`dentry->d_inode` became NULL while the other fields look normal.

Here is the call stack trace of this Oops.

```
[19521409.363839] BUG: unable to handle kernel NULL pointer
dereference at 000000000000000c
[19521409.372016] IP: __atime_needs_update+0x5/0x190
[19521409.376757] PGD 80000020326ad067 P4D 80000020326ad067 PUD 200fd06067 PMD 0
[19521409.384025] Oops: 0000 [#1] SMP PTI
[19521409.387796] Modules linked in: veth ipt_MASQUERADE
nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user
xfrm_algo xt_addrtype iptable_nat nf_nat_ipv4 nf_nat br_netfilter
bridge stp llc aufs overlay cpuid iptable_filter ip_tables cls_cgroup
sch_htb xt_multiport ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp
xt_conntrack x_tables bonding nls_utf8 isofs ib_iser rdma_cm iw_cm
ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
toa(OE) nf_conntrack lp parport intel_rapl skx_edac
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass
intel_cstate intel_rapl_perf ipmi_ssif ipmi_si dcdbas mei_me mei
ipmi_devintf lpc_ich shpchp ipmi_msghandler acpi_power_meter mac_hid
autofs4 btrfs zstd_compress
[19521409.458627]  raid10 raid456 async_raid6_recov async_memcpy
async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0
multipath linear crct10dif_pclmul crc32_pclmul mgag200
ghash_clmulni_intel ttm pcbc drm_kms_helper aesni_intel syscopyarea
aes_x86_64 sysfillrect ixgbe igb sysimgblt crypto_simd fb_sys_fops dca
i2c_algo_bit glue_helper ptp megaraid_sas ahci drm cryptd mdio
pps_core libahci [last unloaded: ip_tables]
[19521409.496053] CPU: 46 PID: 10855 Comm: node-exporter Tainted: G
       OE    4.15.0-42-generic #46~16.04.1+4
[19521409.506851] Hardware name: Dell Inc. PowerEdge R740xd/08D89F,
BIOS 1.4.9 06/29/2018
[19521409.514784] RIP: 0010:__atime_needs_update+0x5/0x190
[19521409.520026] RSP: 0018:ffff9dee09c2fc48 EFLAGS: 00010202
[19521409.525528] RAX: ffff8a4281d01ec0 RBX: fefefefefefefeff RCX:
0000000000000040
[19521409.532942] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
ffff9dee09c2fde8
[19521409.540354] RBP: ffff9dee09c2fca8 R08: ffff9dee09c2fbf4 R09:
ffff9dee09c2fd90
[19521409.547761] R10: ffff8a34397b4022 R11: 6b636f732f74656e R12:
2f2f2f2f2f2f2f2f
[19521409.555176] R13: 0000000000000000 R14: ffff8a34397b4026 R15:
ffff9dee09c2fde8
[19521409.562592] FS:  000000c000218090(0000)
GS:ffff8a3b401c0000(0000) knlGS:0000000000000000
[19521409.570976] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19521409.577001] CR2: 000000000000000c CR3: 000000203ad22005 CR4:
00000000007606e0
[19521409.584415] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[19521409.592937] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[19521409.601419] PKRU: 55555554
[19521409.605464] Call Trace:
[19521409.609235]  ? link_path_walk+0x3e4/0x5a0
[19521409.614546]  ? path_init+0x177/0x2f0
[19521409.619423]  path_openat+0xe4/0x1770
[19521409.624282]  ? ttwu_do_wakeup+0x1e/0x140
[19521409.629465]  ? ttwu_do_activate+0x77/0x80
[19521409.634713]  ? try_to_wake_up+0x59/0x480
[19521409.639864]  do_filp_open+0x9b/0x110
[19521409.644638]  ? __check_object_size+0xaf/0x1b0
[19521409.650176]  ? path_get+0x27/0x30
[19521409.654652]  do_sys_open+0x1bb/0x2c0
[19521409.659372]  ? do_sys_open+0x1bb/0x2c0
[19521409.664254]  SyS_openat+0x14/0x20
[19521409.668677]  do_syscall_64+0x73/0x130
[19521409.673479]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[19521409.679602] RIP: 0033:0x4a5c9a
[19521409.683706] RSP: 002b:000000c000304ab0 EFLAGS: 00000202
ORIG_RAX: 0000000000000101
[19521409.692316] RAX: ffffffffffffffda RBX: 000000c00002f400 RCX:
00000000004a5c9a
[19521409.700483] RDX: 0000000000080000 RSI: 000000c00175a020 RDI:
ffffffffffffff9c
[19521409.708640] RBP: 000000c000304b28 R08: 0000000000000000 R09:
0000000000000000
[19521409.716806] R10: 0000000000000000 R11: 0000000000000202 R12:
ffffffffffffffff
[19521409.724962] R13: 0000000000000002 R14: 0000000000000001 R15:
0000000000000100
[19521409.733145] Code: 83 ec 08 0f 0d 8f 80 05 00 00 e8 87 ff ff ff
48 85 c0 74 10 48 89 c7 48 89 45 f8 e8 56 d4 ff ff 48 8b 45 f8 c9 c3
0f 1f 44 00 00 <f6> 46 0c 02 0f 85 9b 00 00 00 83 7e 04 ff 0f 84 91 00
00 00 83
[19521409.753799] RIP: __atime_needs_update+0x5/0x190 RSP: ffff9dee09c2fc48
[19521409.761228] CR2: 000000000000000c
```
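As a rough cross-check of the trace above: the faulting bytes
`<f6> 46 0c 02` decode to `testb $0x2,0xc(%rsi)`, RSI is 0 and CR2 is
0xc. The sketch below assumes the first fields of `struct inode` in
4.15 on x86_64 are laid out as in include/linux/fs.h (we have not
checked this against the exact build), and models them with Python
ctypes only to compute the offset. `InodePrefix` is a hypothetical
name used for illustration.

```python
import ctypes

# Assumed layout of the first fields of struct inode (4.15, x86_64).
# This mirrors include/linux/fs.h; it is not taken from our vmlinux.
class InodePrefix(ctypes.Structure):
    _fields_ = [
        ("i_mode", ctypes.c_ushort),     # umode_t
        ("i_opflags", ctypes.c_ushort),
        ("i_uid", ctypes.c_uint32),      # kuid_t
        ("i_gid", ctypes.c_uint32),      # kgid_t
        ("i_flags", ctypes.c_uint32),
    ]

S_NOATIME = 0x2   # matches the immediate operand of the faulting testb
rsi = 0x0         # RSI from the Oops (second argument, the inode)
cr2 = 0xc         # faulting address from the Oops

# Address that `inode->i_flags` would touch if inode == NULL.
fault_addr = rsi + InodePrefix.i_flags.offset
print(hex(InodePrefix.i_flags.offset), hex(fault_addr), fault_addr == cr2)
```

Under these assumptions the fault is the very first check in
`__atime_needs_update()`, `inode->i_flags & S_NOATIME`, executed with
a NULL inode.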

From the core dump, we tried to figure out how this NULL pointer
dereference happened. It looks like the program `node-exporter` was
opening `/proc/net/sockstat`, and when `walk_component()` handled the
`/proc/net` component, it got a dentry whose `d_inode` is NULL while
the other fields contain data.

```
struct dentry {
  ...
  d_name = {
    {
      {
        hash = 2805607892,
        len = 3
      },
      hash_len = 15690509780
    },
    name = 0xffff8a4281d01ef8 "net"
  },
  struct inode *d_inode = 0x0         <======= d_inode is NULL and causes the Oops!
  -> NULL
  d_iname = "net\000:01:00.0\000\000sage_in_bytes\000B\212\377",
...
```
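To back up the claim that the other fields look normal, here is a
small sketch that re-checks two values from the dump above. It assumes
the little-endian packing `hash_len = (len << 32) | hash` used by the
kernel's `hashlen_create()` macro, and it assumes `d_iname` sits at
offset 0x38 in `struct dentry` for this build (not verified against
our vmlinux). The helper names are ours.

```python
# Values copied from the dentry dump and the Oops registers.
HASH = 2805607892
LEN = 3
HASH_LEN = 15690509780
NAME_PTR = 0xffff8a4281d01ef8   # d_name.name
RAX = 0xffff8a4281d01ec0        # RAX at the time of the Oops

D_INAME_OFFSET = 0x38           # assumed offset of d_iname in struct dentry


def hashlen_create(hash_value, length):
    """Pack hash and length the way the little-endian qstr union does."""
    return (length << 32) | (hash_value & 0xFFFFFFFF)


def hashlen_hash(hash_len):
    return hash_len & 0xFFFFFFFF


def hashlen_len(hash_len):
    return hash_len >> 32


# The qstr is self-consistent: hash_len matches hash and len.
print(hashlen_create(HASH, LEN) == HASH_LEN)
print(hashlen_hash(HASH_LEN), hashlen_len(HASH_LEN))

# If the name is stored inline, d_name.name points into the dentry itself.
print(hex(NAME_PTR - D_INAME_OFFSET), NAME_PTR - D_INAME_OFFSET == RAX)
```

If the offset assumption holds, `d_name.name` points at the inline
`d_iname` of the dentry whose address is in RAX, so the name, hash and
length all agree and only `d_inode` is unexpected.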

We extracted the nameidata from the crash dump as well. `link_inode`
is NULL, so it looks like either `lookup_slow` or `lookup_fast`
returned a dentry whose `d_inode` is NULL while the other fields look
normal.

```
struct nameidata {
  last = {
    {
      {
        hash = 2805607892,
        len = 3
      },
      hash_len = 15690509780
    },
    name = 0xffff8a34397b4022 "net/sockstat"
  },
  struct filename *name = 0xffff8a34397b4000
  -> {
       name = 0xffff8a34397b401c "/proc/net/sockstat",
       uptr = 0xc00175a020 <Address 0xc00175a020 out of bounds>,
       aname = 0xffff8a3b2f3c9860,
       refcnt = 2,
       iname = 0xffff8a34397b401c "/proc/net/sockstat"
     }
  struct nameidata *saved = 0x0
  -> NULL
  struct inode *link_inode = 0x0      <======= link_inode is NULL as well!
  -> NULL
}
```

We tried to reproduce this issue, but it appears difficult to trigger.
We kept running
`while true; do cat /proc/net/sockstat; done`, but have not been able
to reproduce it so far. In the past year, we have found only two
similar crashes across thousands of servers in production.

Normally `link_inode` should always hold a value, according to the
output of our small bpftrace program.

```
// /tmp/trace_walk_component.bt
kprobe:walk_component {
  // arg0 is the struct nameidata * passed to walk_component()
  $p = (struct nameidata *)arg0;
  printf("nameidata->last.name: %s, nameidata->link_inode: %p\n",
         str($p->last.name), $p->link_inode);
}
```

```
# Output
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffffffffab299966
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffff9a4efe813ab8
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffffffffab299966
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffff9a4efe813ab8
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffffffffab299966
nameidata->last.name: net/sockstat, nameidata->link_inode: 0xffffffffab299966
```

We searched past kernel mailing list threads but could not find a
similar crash. We did find a similar case on another user's blog:
https://utcc.utoronto.ca/~cks/space/blog/linux/Ubuntu1804OddKernelPanic
However, the author of that post did not determine the cause either,
although their crash stack is exactly the same as ours.

Is this a known bug that corrupts the dentry? Because we have not been
able to reproduce the issue, it is difficult to verify whether it is
fixed in mainline. We are writing this email to ask whether other
Linux developers have any insights; any replies would be appreciated.

Thank you in advance.

The attached files are the dentry and nameidata extracted from the
core dump, in case they are helpful for examining this Oops.

-- 
Best Regards,
Haosdent Huang