From: "J. Bruce Fields" <bfields@fieldses.org>
To: Armin Zentai <armin@zentai.name>
Cc: linux-nfs@vger.kernel.org
Subject: Re: Bug? nfs export on fuse, one more READDIR call after a finished directory read
Date: Tue, 18 Jun 2019 13:58:46 -0400	[thread overview]
Message-ID: <20190618175846.GA7304@fieldses.org> (raw)
In-Reply-To: <75222F09-6BA2-40C3-B8AF-CEE707E71A90@zentai.name>

On Tue, Jun 18, 2019 at 02:32:15AM +0200, Armin Zentai wrote:
> After doing some tests, I can confirm you were right.
> 
> I was able to reproduce the issue with a very simple piece of example code:
> - create a few hundred folders within a folder
> - use the example code from http://man7.org/linux/man-pages/man2/getdents.2.html with a smaller BUF_SIZE, like 256
> - also delete one folder in every iteration of the outer for loop
> 
> This clearly reproduces the issue: the fuse lib's EINVAL is a false positive, and it should instead return an empty array with a fuse.OK return code.
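> 
> Roughly, a Go equivalent of those steps (my actual test used the C example from the getdents(2) man page; the mount point path and folder count below are illustrative):
> 
> ```
> package main
> 
> // Reproducer sketch: list a directory on the NFS-mounted FUSE export with a
> // deliberately small getdents buffer while one folder is removed in every
> // iteration of the outer read loop, so the directory shrinks mid-listing.
> 
> import (
>     "fmt"
>     "log"
>     "os"
>     "syscall"
> )
> 
> const bufSize = 256 // small buffer, like a reduced BUF_SIZE in the man-page example
> 
> func main() {
>     dir := "/mnt/nfs/fusetest" // hypothetical path of the NFS-mounted FUSE filesystem
> 
>     // Create a few hundred folders within a folder.
>     for i := 0; i < 300; i++ {
>         if err := os.MkdirAll(fmt.Sprintf("%s/d%03d", dir, i), 0755); err != nil {
>             log.Fatal(err)
>         }
>     }
> 
>     fd, err := syscall.Open(dir, syscall.O_RDONLY|syscall.O_DIRECTORY, 0)
>     if err != nil {
>         log.Fatal(err)
>     }
>     defer syscall.Close(fd)
> 
>     buf := make([]byte, bufSize)
>     for i := 0; ; i++ {
>         // Delete one folder in every iteration of the outer for loop.
>         os.Remove(fmt.Sprintf("%s/d%03d", dir, i))
> 
>         n, err := syscall.Getdents(fd, buf)
>         if err != nil {
>             log.Fatalf("getdents failed: %v", err) // the spurious EINVAL surfaces here
>         }
>         if n == 0 {
>             break // directory fully read
>         }
>         _, count, _ := syscall.ParseDirent(buf[:n], -1, nil)
>         fmt.Printf("read %d entries\n", count)
>     }
> }
> ```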

OK, thanks for following up.

--b.

> 
> 
> Thanks,
> Armin
> 
> > On 2019. Jun 4., at 17:12, J. Bruce Fields <bfields@fieldses.org> wrote:
> > 
> > On Tue, Jun 04, 2019 at 12:51:11PM +0200, Armin Zentai wrote:
> >> Hi everyone,
> >> 
> >> 
> >> I'm developing a piece of FUSE software, and it should be able to run _under_ an nfsd export.
> >> Currently I've stripped everything out of the fuse software and am only using it as a mirrorfs. It's written in Go and uses this lib: https://github.com/hanwen/go-fuse.
> >> 
> >> My issue is a bit complex, but I'll try to explain it in as much detail as I can. I'd like to get some guidance or a hint on how to proceed with the debugging, or a solution of course.
> >> 
> >> 
> >> NFS and kernel settings:
> >> client+server: Debian 9.0 / kernel-5.2.0 (I compiled it with debug symbols)
> >> export options: (rw,sync,no_subtree_check,all_squash,anonuid=505,anongid=20,fsid=4193281846142082570)
> >> client mount options: type nfs4 (rw,noatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=6,sec=sys,clientaddr=10.135.105.205,local_lock=none,addr=10.135.101.174,_netdev)
> >> 
> >> NFSv3 is explicitly disabled on the server:
> >> cat /proc/fs/nfsd/versions
> >> -2 -3 +4 +4.1 +4.2
> >> 
> >> 
> >> 
> >> The error itself is non-deterministic, but reproducible within ~10 minutes.
> >> 
> >> Sometimes the client is calling one more READDIR after an empty READDIR reply. The extra READDIR is called with an increased cookie value.
> > 
> > I don't know why the client's doing that, it does sound a little weird.
> > I guess the higher cookie value is something cached from a previous
> > READDIR of the directory.  (I think you said it was removing files so
> > it's not surprising that a cached cookie might point to a position in
> > the directory that's no longer there.)  But I don't know why it would be
> > issuing another READDIR after the server already set the EOF bit on a
> > previous one.
> > 
> > Off the top of my head that client behavior sounds weird and suboptimal,
> > but also harmless.
> > 
> >> This behaviour is causing an array out-of-bounds read in the fuse lib.
> >> 
> >> This results in an EINVAL error in the go fuse library.
> >> https://github.com/hanwen/go-fuse/blob/master/fuse/nodefs/dir.go#L42
> >> https://github.com/hanwen/go-fuse/blob/master/fuse/nodefs/dir.go#L72
> >> (also there is an open issue by me, https://github.com/hanwen/go-fuse/issues/297)
> > 
> > I think that's incorrect.
> > 
> > Assuming the client is using a cookie value that was previously handed
> > out by the server at some point in the past, the server still needs to
> > honor that cookie.  So it should probably just be returning a successful
> > READDIR with zero entries.
> > 
> > --b.
> > 
> > 
> >> 
> >> If this behaviour is acceptable, then the EINVAL return value can be replaced with fuse.OK in the lib, but I'm a bit curious what is causing it.
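> >> 
> >> For illustration, here is roughly what I mean (generic Go, not go-fuse's actual types or signatures): when the requested offset points past the cached entry slice, answer with an empty, successful result instead of EINVAL.
> >> 
> >> ```
> >> // Sketch only: hypothetical helper, not the real dir.go code.
> >> func entriesFrom(cached []string, offset uint64) ([]string, error) {
> >>     if offset >= uint64(len(cached)) {
> >>         // A stale-but-valid cookie from an earlier pass over a directory
> >>         // that has since shrunk: report "no more entries", not an error.
> >>         return nil, nil
> >>     }
> >>     return cached[offset:], nil
> >> }
> >> ```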
> >> 
> >> 
> >> 
> >> The easiest way to reproduce it is to run npm install (without an existing node_modules folder); at the end it may produce the error.
> >> I haven't debugged the npm install process or reproduced it with simpler code, but it happens when a directory structure with files is removed.
> >> At the end, npm install removes the fsevents folder with its files: the module gets downloaded but is only supported on OSX, not on Linux, so npm removes it again.
> >> So it's probably connected to some npm behaviour that removes multiple files with multiple async file delete calls.
> >> rm -rf by itself can't reproduce the issue.
> >> 
> >> If I run npm install && rm -rf node_modules in a loop, I can reproduce the error in every 2nd - 10th run (approx. 2 - 15 minutes).
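> >> 
> >> The loop itself is trivial; a small driver like this sketch works (the project directory on the mount is an assumption):
> >> 
> >> ```
> >> package main
> >> 
> >> import (
> >>     "log"
> >>     "os/exec"
> >> )
> >> 
> >> func main() {
> >>     for run := 1; ; run++ {
> >>         cmd := exec.Command("sh", "-c", "npm install && rm -rf node_modules")
> >>         cmd.Dir = "/mnt/nfs/fusetest/project" // hypothetical checkout on the exported FUSE mount
> >>         if out, err := cmd.CombinedOutput(); err != nil {
> >>             log.Fatalf("run %d failed: %v\n%s", run, err, out)
> >>         }
> >>         log.Printf("run %d OK", run)
> >>     }
> >> }
> >> ```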
> >> 
> >> 
> >> 
> >> I've found these things while debugging:
> >> 
> >> The client always uses readdirplus to get directory entries.
> >> If I see plus=false, it indicates an error,
> >> although the error can happen without plus=false too (approx. 50/50 chance).
> >> 
> >> 
> >> I've used this bpftrace script to catch it.
> >> 
> >> ```
> >> kprobe:nfs4_xdr_enc_readdir {
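> >>  // arg2 (the third kprobe argument) is the struct nfs4_readdir_arg being encoded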
> >>  $a = ((struct nfs4_readdir_arg *)arg2);
> >>  if ($a->plus != 1) {
> >>    time("%H:%M:%S\n");
> >>    printf("nfs4_xdr_enc_readdir count=%u plus=%u cookie=%llu\n",
> >>      $a->count, $a->plus, $a->cookie);
> >>  }
> >> }
> >> ```
> >> 
> >> Also, here is a kstack from nfs4_xdr_enc_readdir:
> >> 
> >> nfs4_xdr_enc_readdir count=262144 plus=0 cookie=24
> >> nfs4_xdr_enc_readdir kstack
> >> 
> >>        nfs4_xdr_enc_readdir+1
> >>        rpcauth_wrap_req_encode+35
> >>        call_encode+357
> >>        __rpc_execute+137
> >>        rpc_run_task+268
> >>        nfs4_call_sync_sequence+100
> >>        _nfs4_proc_readdir+277
> >>        nfs4_proc_readdir+141
> >>        nfs_readdir_xdr_to_array+375
> >>        nfs_readdir+613
> >>        iterate_dir+320
> >>        __x64_sys_getdents+169
> >>        do_syscall_64+85
> >>        entry_SYSCALL_64_after_hwframe+68
> >> 
> >> 
> >> I also found out that the extra READDIR call always (or mostly?) has an increased (doubled) dircount parameter.
> >> 
> >> 
> >> Here is an example from a wireshark dump:
> >> 
> >> Packets from 1 to 4 are totally normal communication.
> >> 5-6 are the extra ones. 
> >> In packet 3, the dircount is 65536 and in packet 5 it is 131072.
> >> 
> >> 
> >> 1. client -> server
> >> Remote Procedure Call, Type:Call XID:0x04a37c12
> >> Network File System, Ops(3): SEQUENCE, PUTFH, READDIR
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Tag: <EMPTY>
> >>    minorversion: 2
> >>    Operations (count: 3): SEQUENCE, PUTFH, READDIR
> >>        Opcode: SEQUENCE (53)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x0001e92f
> >>            slot id: 25
> >>            high slot id: 25
> >>            cache this?: No
> >>        Opcode: PUTFH (22)
> >>            FileHandle
> >>        Opcode: READDIR (26)
> >>            cookie: 0
> >>            cookie_verf: 0
> >>            dircount: 65514
> >>            maxcount: 262056
> >>            Attr mask[0]: 0x0018091a (Type, Change, Size, FSID, RDAttr_Error, Filehandle, FileId)
> >>            Attr mask[1]: 0x00b0a23a (Mode, NumLinks, Owner, Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata, Time_Modify, Mounted_on_FileId)
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 2. server -> client
> >> Network File System, Ops(3): SEQUENCE PUTFH READDIR
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Status: NFS4_OK (0)
> >>    Tag: <EMPTY>
> >>    Operations (count: 3)
> >>        Opcode: SEQUENCE (53)
> >>            Status: NFS4_OK (0)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x0001e92f
> >>            slot id: 25
> >>            high slot id: 29
> >>            target high slot id: 29
> >>            status flags: 0x00000000
> >>        Opcode: PUTFH (22)
> >>            Status: NFS4_OK (0)
> >>        Opcode: READDIR (26)
> >>            Status: NFS4_OK (0)
> >>            verifier: 0x0000000000000000
> >>            Directory Listing
> >>                Entry: node-pre-gyp
> >>                Entry: needle
> >>                Entry: iconv-lite
> >>                Entry: readable-stream
> >>                Value Follows: No
> >>                EOF: Yes
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 3. client -> server
> >> Remote Procedure Call, Type:Call XID:0x16a37c12
> >> Network File System, Ops(3): SEQUENCE, PUTFH, READDIR
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Tag: <EMPTY>
> >>    minorversion: 2
> >>    Operations (count: 3): SEQUENCE, PUTFH, READDIR
> >>        Opcode: SEQUENCE (53)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x000cd063
> >>            slot id: 15
> >>            high slot id: 24
> >>            cache this?: No
> >>        Opcode: PUTFH (22)
> >>            FileHandle
> >>        Opcode: READDIR (26)
> >>            cookie: 6
> >>            cookie_verf: 0
> >>            dircount: 65536
> >>            maxcount: 262144
> >>            Attr mask[0]: 0x0018091a (Type, Change, Size, FSID, RDAttr_Error, Filehandle, FileId)
> >>            Attr mask[1]: 0x00b0a23a (Mode, NumLinks, Owner, Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata, Time_Modify, Mounted_on_FileId)
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 4. server -> client
> >> Remote Procedure Call, Type:Reply XID:0x16a37c12
> >> Network File System, Ops(3): SEQUENCE PUTFH READDIR
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Status: NFS4_OK (0)
> >>    Tag: <EMPTY>
> >>    Operations (count: 3)
> >>        Opcode: SEQUENCE (53)
> >>            Status: NFS4_OK (0)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x000cd063
> >>            slot id: 15
> >>            high slot id: 29
> >>            target high slot id: 29
> >>            status flags: 0x00000000
> >>        Opcode: PUTFH (22)
> >>            Status: NFS4_OK (0)
> >>        Opcode: READDIR (26)
> >>            Status: NFS4_OK (0)
> >>            verifier: 0x0000000000000000
> >>            Directory Listing
> >>                Value Follows: No
> >>                EOF: Yes
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 5. client -> server
> >> Remote Procedure Call, Type:Call XID:0x32a37c12
> >> Network File System, Ops(3): SEQUENCE, PUTFH, READDIR
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Tag: <EMPTY>
> >>    minorversion: 2
> >>    Operations (count: 3): SEQUENCE, PUTFH, READDIR
> >>        Opcode: SEQUENCE (53)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x0001585c
> >>            slot id: 26
> >>            high slot id: 26
> >>            cache this?: No
> >>        Opcode: PUTFH (22)
> >>            FileHandle
> >>        Opcode: READDIR (26)
> >>            cookie: 7
> >>            cookie_verf: 0
> >>            dircount: 131072
> >>            maxcount: 262144
> >>            Attr mask[0]: 0x00000800 (RDAttr_Error)
> >>            Attr mask[1]: 0x00800000 (Mounted_on_FileId)
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 6. server -> client
> >> Remote Procedure Call, Type:Reply XID:0x32a37c12
> >> Network File System, Ops(3): SEQUENCE PUTFH READDIR(NFS4ERR_INVAL)
> >>    [Program Version: 4]
> >>    [V4 Procedure: COMPOUND (1)]
> >>    Status: NFS4ERR_INVAL (22)
> >>    Tag: <EMPTY>
> >>    Operations (count: 3)
> >>        Opcode: SEQUENCE (53)
> >>            Status: NFS4_OK (0)
> >>            sessionid: a2fdea5cc954b1470300000000000000
> >>            seqid: 0x0001585c
> >>            slot id: 26
> >>            high slot id: 29
> >>            target high slot id: 29
> >>            status flags: 0x00000000
> >>        Opcode: PUTFH (22)
> >>            Status: NFS4_OK (0)
> >>        Opcode: READDIR (26)
> >>            Status: NFS4ERR_INVAL (22)
> >>    [Main Opcode: READDIR (26)]
> >> 
> >> 
> >> Thanks,
> >> Armin

