linux-nfs.vger.kernel.org archive mirror
* nfsd bug after f875a792abe9 (nfsd: allow nfsv3 readdir request to be larger.)
@ 2019-04-21  3:36 Doug Nazar
  2019-04-23  0:21 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Doug Nazar @ 2019-04-21  3:36 UTC (permalink / raw)
  To: NeilBrown, J. Bruce Fields, linux-nfs

After recently upgrading the kernel on one of my servers I was getting 
the below crashes. Tracked it down to an NFSv3 root client, bisection 
points to f875a792abe9. A revert of that on top of 
5.1.0-rc5-00289-gee2725f711fa works fine.

Relevant exports:
/storage2/cubox/root 192.168.21.0/24(sec=sys,rw,no_root_squash,insecure,nohide,no_subtree_check,crossmnt,async)
/ *(sec=krb5,rw,fsid=0,no_root_squash,insecure,no_subtree_check,crossmnt,async)
/storage2 *(sec=krb5,rw,no_root_squash,insecure,nohide,no_subtree_check,crossmnt,async)


Client has the following mounts to this server (running a modified 4.9.124):
192.168.21.90:/storage2/cubox/root on / type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.21.90,mountvers=3,mountproto=tcp,local_lock=all,addr=192.168.21.90)
192.168.21.90:/storage2/cubox/root/var/tmp on /var/tmp type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=krb5,clientaddr=192.168.21.242,local_lock=none,addr=192.168.21.90)


Thanks,
Doug


Initial BUG with HEAD when I started testing:
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294654] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294772] #PF error: [normal kernel read fault]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294776] PGD 0 P4D 0
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294780] Oops: 0000 [#1] SMP NOPTI
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294784] CPU: 1 PID: 5459 Comm: nfsd Not tainted 5.1.0-rc5-00289-gee2725f711fa #67
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294787] Hardware name: Gigabyte Technology Co., Ltd. GA-MA78LMT-S2/GA-MA78LMT-S2, BIOS F14 08/16/2011
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294794] RIP: 0010:do_tcp_sendpages+0x308/0x5d0
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294799] Code: ff 48 63 e8 48 8b 5c 24 40 65 48 33 1c 25 28 00 00 00 48 89 e8 0f 85 73 02 00 00 48 83 c4 48 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <49> 8b 57 08 48 8d 42 ff 83 e2 01 49 0f 44 c7 f0 ff 40 34 8b 91 b8
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294804] RSP: 0018:ffffc90000437c90 EFLAGS: 00010202
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294808] RAX: 0000000000000001 RBX: 0000000000028000 RCX: ffff888210505600
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294811] RDX: ffffffff8154b1a0 RSI: 0000000000001000 RDI: ffff888210316e80
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294814] RBP: 0000000000000000 R08: 0000000000001000 R09: 0000000000000003
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294817] R10: 0000000000000002 R11: 0000000000000005 R12: 0000000000001000
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294820] R13: 00000000000005a8 R14: ffff888210316e80 R15: 0000000000000000
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294823] FS: 0000000000000000(0000) GS:ffff888217a40000(0000) knlGS:0000000000000000
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294826] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294829] CR2: 0000000000000008 CR3: 00000002150aa000 CR4: 00000000000006e0
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294832] Call Trace:
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294837] tcp_sendpage_locked+0x42/0x60
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294841] tcp_sendpage+0x38/0x50
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294844] inet_sendpage+0x3a/0xe0
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294856]  ? encode_post_op_attr.isra.4+0xf4/0x100 [nfsd]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294865] kernel_sendpage+0x15/0x30
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294878] svc_send_common+0xed/0x150 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294887] svc_sendto+0xf6/0x1a0 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294892]  ? refcount_dec_and_lock+0x18/0x70
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294899]  ? auth_domain_put+0x10/0x60 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294907]  ? svc_authorise+0x34/0x40 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294915]  ? svc_process_common.isra.8+0x10b/0x6a0 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294923] svc_tcp_sendto+0x29/0x40 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294930] svc_send+0x94/0x190 [sunrpc]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.294990]  ? nfsd_destroy+0x50/0x50 [nfsd]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.295049]  nfsd+0xdd/0x140 [nfsd]
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.295054] kthread+0x10e/0x130
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.295058]  ? kthread_park+0x80/0x80
Apr 20 05:48:10 wraith.dragoninc.ca  [  968.295062] ret_from_fork+0x22/0x40
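
[Editorial sketch, not part of the original report.] For readers decoding the trace above: the value after "Oops:" is the x86 page-fault error code, and the marked opcode bytes "<49> 8b 57 08" decode to a load of the form mov 0x8(%r15),%rdx. With R15 = 0 in the register dump, that load targets address 0x8, which matches CR2. The helper names and flag labels below are illustrative assumptions, not kernel APIs.

```python
# Illustrative helpers for reading the oops above; these are not kernel APIs.

# x86 page-fault error code bits (labels are this sketch's own).
PF_BITS = [
    (0, "PROT"),   # set: protection violation; clear: page not present
    (1, "WRITE"),  # set: write access; clear: read access
    (2, "USER"),   # set: fault raised in user mode
    (3, "RSVD"),   # set: reserved bit found in a page-table entry
    (4, "INSTR"),  # set: instruction fetch
]

# 64-bit general-purpose registers in hardware encoding order.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_pf_error(code):
    """Return the names of the flags set in a page-fault error code."""
    return [name for bit, name in PF_BITS if code & (1 << bit)]


def decode_rex_modrm(rex, modrm):
    """Decode a REX prefix plus ModRM byte into (mod, reg name, rm name).

    Only covers the simple case with no SIB byte (low three rm bits != 4).
    """
    mod = modrm >> 6
    reg = ((modrm >> 3) & 7) | (((rex >> 2) & 1) << 3)  # REX.R extends reg
    rm = (modrm & 7) | ((rex & 1) << 3)                  # REX.B extends rm
    return mod, REGS[reg], REGS[rm]


# "Oops: 0000": no flags set, i.e. a kernel-mode read of a not-present page.
first_oops = decode_pf_error(0x0000)

# Marked bytes "<49> 8b 57 08": REX=0x49, opcode 0x8b (load),
# ModRM=0x57, and mod=1 means an 8-bit displacement (0x08) follows.
mod, dst, base = decode_rex_modrm(0x49, 0x57)

# R15 is 0 in the register dump, so the load address is 0 + 0x08.
fault_address = 0x0 + 0x08

print(first_oops, mod, dst, base, hex(fault_address))
```

The same decode_pf_error helper applied to 0x0003 yields the PROT and WRITE flags, which is what the "#PF error" line of the second trace reports.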


Sometimes during bisection I saw this error instead:
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972318] BUG: unable to handle kernel paging request at ffff888214c15000
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972336] #PF error: [PROT] [WRITE]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972339] PGD 2201067 P4D 2201067 PUD 2204067 PMD 2155bd063 PTE 8000000214c15161
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972344] Oops: 0003 [#1] SMP NOPTI
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972349] CPU: 1 PID: 5456 Comm: nfsd Not tainted 5.0.0-11058-ga840b56ba385 #79
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972352] Hardware name: Gigabyte Technology Co., Ltd. GA-MA78LMT-S2/GA-MA78LMT-S2, BIOS F14 08/16/2011
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972360] RIP: 0010:__memmove+0x57/0x1a0
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972363] Code: 00 72 05 40 38 fe 74 3b 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 48 8d 76 20 4c 89 1f 4c 89 57 08 <4c> 89 4f 10 4c 89 47 18 48 8d 7f 20 73 d4 48 83 c2 20 e9 a2 00 00
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972370] RSP: 0018:ffffc90000437d48 EFLAGS: 00010286
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972373] RAX: ffff88821498400c RBX: ffffffffdfc7fff4 RCX: ffff888214984024
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972377] RDX: ffffffffdf9eefd4 RSI: ffff8882121b8000 RDI: ffff888214c14fec
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972379] RBP: 000000002038002c R08: 0000000000000000 R09: 0000000000000000
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972382] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972385] R13: ffff888214e692d4 R14: ffff888211f27000 R15: 0000000000000007
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972388] FS: 0000000000000000(0000) GS:ffff888217a40000(0000) knlGS:0000000000000000
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972391] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972394] CR2: ffff888214c15000 CR3: 0000000210996000 CR4: 00000000000006e0
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972397] Call Trace:
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972413] encode_entry.isra.8+0x1bc/0x310 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972422] nfsd_readdir+0x184/0x230 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972430]  ? nfs3svc_encode_readdirres+0xc0/0xc0 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972437]  ? nfsd_finish_read+0x190/0x190 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972445] nfsd3_proc_readdir+0x84/0x120 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972452] nfsd_dispatch+0xc1/0x1b0 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972465] svc_process_common.isra.8+0x355/0x6a0 [sunrpc]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972474]  ? svc_recv+0x2ac/0x820 [sunrpc]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972480]  ? nfsd_destroy+0x50/0x50 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972488] svc_process+0xd2/0x110 [sunrpc]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972494]  nfsd+0xdd/0x140 [nfsd]
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972499] kthread+0x10e/0x130
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972502]  ? kthread_park+0x80/0x80
Apr 20 21:45:56 wraith.dragoninc.ca  [  365.972506] ret_from_fork+0x22/0x40
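
[Editorial sketch, not part of the original report.] The second trace can be cross-checked with a little arithmetic. The marked bytes "<4c> 89 4f 10" decode to an 8-byte store of the form mov %r9,0x10(%rdi), so the store starts at RDI + 0x10 and runs across a page boundary into the page at CR2. The PTE printed for that page has its writable bit clear, which is consistent with the "[PROT] [WRITE]" line. The sketch assumes a 4 KiB mapping (the dump goes down to PTE level), covers only a subset of the PTE flag bits, and uses illustrative names throughout.

```python
# Illustrative arithmetic for the second oops; names here are not kernel APIs.

PAGE_SIZE = 4096  # assumption: 4 KiB mapping, as the PTE-level dump suggests

# A subset of x86-64 page-table entry flag bits.
PTE_BITS = [
    (0, "PRESENT"),
    (1, "RW"),        # clear means the page is mapped read-only
    (2, "USER"),
    (5, "ACCESSED"),
    (6, "DIRTY"),
    (8, "GLOBAL"),
    (63, "NX"),
]


def pte_flags(pte):
    """Return the names of the known flag bits set in a page-table entry."""
    return [name for bit, name in PTE_BITS if (pte >> bit) & 1]


def pages_touched(addr, size):
    """Return the base addresses of the pages covered by [addr, addr + size)."""
    first = addr - addr % PAGE_SIZE
    last_byte = addr + size - 1
    last = last_byte - last_byte % PAGE_SIZE
    return list(range(first, last + 1, PAGE_SIZE))


# Values copied from the register and page-table dump above.
RDI = 0xFFFF888214C14FEC
CR2 = 0xFFFF888214C15000
PTE = 0x8000000214C15161

store_addr = RDI + 0x10               # destination of the faulting store
pages = pages_touched(store_addr, 8)  # an 8-byte store
flags = pte_flags(PTE)

print(hex(store_addr), [hex(p) for p in pages], flags)
```

Under these assumptions the store begins 4 bytes before the end of one page and spills into the next, and that next page is the one CR2 names.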



* Re: nfsd bug after f875a792abe9 (nfsd: allow nfsv3 readdir request to be larger.)
  2019-04-21  3:36 nfsd bug after f875a792abe9 (nfsd: allow nfsv3 readdir request to be larger.) Doug Nazar
@ 2019-04-23  0:21 ` NeilBrown
  2019-04-23  2:22   ` Doug Nazar
  2019-04-23 14:38   ` J. Bruce Fields
  0 siblings, 2 replies; 4+ messages in thread
From: NeilBrown @ 2019-04-23  0:21 UTC (permalink / raw)
  To: Doug Nazar, J. Bruce Fields, linux-nfs


On Sat, Apr 20 2019, Doug Nazar wrote:

> After recently upgrading the kernel on one of my servers I was getting 
> the below crashes. Tracked it down to an NFSv3 root client, bisection 
> points to f875a792abe9. A revert of that on top of 
> 5.1.0-rc5-00289-gee2725f711fa works fine.

hi Doug,
 thanks for the report and for bisecting!

I think this bug has been fixed, but the fix hasn't landed upstream yet.
If you
   git pull git://git.linux-nfs.org/~bfields/linux.git nfs-next

you should get

Commit 3c86794ac0e6 ("nfsd/nfsd3_proc_readdir: fix buffer count and page pointers")

which should fix this.


Bruce:  I don't see that patch in your "for-5.1" branch, but you said
   "Queuing up for 5.1."
Did something go awry?

Thanks,
NeilBrown
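
[Editorial sketch, not part of the original message.] After pulling a tree, one way to confirm that it actually contains the fix commit named above is `git merge-base --is-ancestor`, which exits with status 0 when the first commit is an ancestor of the second and 1 when it is not. The wrapper below is a minimal sketch; the function name, its parameters, and the injectable `run` argument are assumptions made for illustration.

```python
import subprocess


def tree_has_commit(commit, ref="HEAD", repo=".", run=subprocess.run):
    """Return True if `commit` is an ancestor of `ref` in the repository at `repo`.

    Relies on the exit status of `git merge-base --is-ancestor`:
    0 means ancestor, 1 means not an ancestor, anything else is an error
    (for example, the commit is unknown to this repository).
    The `run` parameter exists only so the call can be stubbed out.
    """
    result = run(["git", "-C", repo, "merge-base", "--is-ancestor", commit, ref])
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(
        "git merge-base failed with status %d" % result.returncode
    )
```

In a checkout where the nfsd tree has been pulled, `tree_has_commit("3c86794ac0e6")` would be expected to return True if the fix is present.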



* Re: nfsd bug after f875a792abe9 (nfsd: allow nfsv3 readdir request to be larger.)
  2019-04-23  0:21 ` NeilBrown
@ 2019-04-23  2:22   ` Doug Nazar
  2019-04-23 14:38   ` J. Bruce Fields
  1 sibling, 0 replies; 4+ messages in thread
From: Doug Nazar @ 2019-04-23  2:22 UTC (permalink / raw)
  To: NeilBrown, J. Bruce Fields, linux-nfs

On 2019-04-22 20:21, NeilBrown wrote:
> I think this bug has been fixed, but the fix hasn't landed upstream yet.
> If you
>     git pull git://git.linux-nfs.org/~bfields/linux.git nfs-next

s/nfs-next/nfsd-next/    ;-)

Seems to work fine. I didn't test that particular patch in isolation, 
however, only the whole tree.

I did poke around a bit in the git web interface before, but didn't see 
anything that jumped out at me.

Thanks,
Doug



* Re: nfsd bug after f875a792abe9 (nfsd: allow nfsv3 readdir request to be larger.)
  2019-04-23  0:21 ` NeilBrown
  2019-04-23  2:22   ` Doug Nazar
@ 2019-04-23 14:38   ` J. Bruce Fields
  1 sibling, 0 replies; 4+ messages in thread
From: J. Bruce Fields @ 2019-04-23 14:38 UTC (permalink / raw)
  To: NeilBrown; +Cc: Doug Nazar, linux-nfs

On Tue, Apr 23, 2019 at 10:21:53AM +1000, NeilBrown wrote:
> On Sat, Apr 20 2019, Doug Nazar wrote:
> 
> > After recently upgrading the kernel on one of my servers I was getting 
> > the below crashes. Tracked it down to an NFSv3 root client, bisection 
> > points to f875a792abe9. A revert of that on top of 
> > 5.1.0-rc5-00289-gee2725f711fa works fine.
> 
> hi Doug,
>  thanks for the report and for bisecting!
> 
> I think this bug has been fixed, but the fix hasn't landed upstream yet.
> If you
>    git pull git://git.linux-nfs.org/~bfields/linux.git nfs-next
> 
> you should get
> 
> Commit 3c86794ac0e6 ("nfsd/nfsd3_proc_readdir: fix buffer count and page pointers")
> 
> which should fix this.
> 
> 
> Bruce:  I don't see that patch in your "for-5.1" branch, but you said
>    "Queuing up for 5.1."
> Did something go awry?

Sorry, I've got this confusing system where I commit first to a branch
named for-x.y-incoming, then push it to for-x.y after I've tested it.
Except sometimes I forget to push to the for-x.y branch until the last
minute.

The -incoming branches should always be included in nfsd-next.

Anyway, I'll send a pull request now, it's overdue....

--b.

