All of lore.kernel.org
 help / color / mirror / Atom feed
* NFSv4.1 - 2x opens causes unknown 5s delay.
@ 2021-07-30  3:15 Matthew Robertson
  2021-07-31 22:16 ` Trond Myklebust
  0 siblings, 1 reply; 2+ messages in thread
From: Matthew Robertson @ 2021-07-30  3:15 UTC (permalink / raw)
  To: linux-nfs

Doing a simple open of a file from 2 readers and there is a strange 5s
delay on the second open.   We did NOT see the issue on kernel:
CentOS 4.18.0-193.6.3.el8_2.x86_64
but the issue is present in:
CentOS 4.18.0-305.7.1.el8_4.x86_64


#include <stdio.h>

int main() {

        FILE * file_pointer;

        char buffer[50], c;

file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r");

fgets(buffer, 50, file_pointer);

file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r");

fgets(buffer, 50, file_pointer);

        fclose(file_pointer);

return 0;

}



Call Graph from ftrace

_nfs4_opendata_to_nfs4_state()

 nfs_refresh_inode() {

  nfs_refresh_inode.part.28() {

   nfs_refresh_inode_locked() {

 update_open_stateid()

  ** 5s wait **

Then the open / read completes


We do not see this issue on nfsv3 only nfsv4.


Here is the wireshark of the request and response:

The second request and response happens immediately in the wireshark.
The delay is in the kernel.



Frame 25: 378 bytes on wire (3024 bits), 378 bytes captured (3024 bits)
Ethernet II, Src: fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24), Dst:
PureStor_5b:45:10 (24:a9:37:5b:45:10)
Internet Protocol Version 4, Src: 10.15.132.250, Dst: 10.15.128.15
Transmission Control Protocol, Src Port: 710, Dst Port: 2049, Seq:
2125, Ack: 1193, Len: 312
Remote Procedure Call, Type:Call XID:0xed175b8d
Network File System, Ops(5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR
    [Program Version: 4]
    [V4 Procedure: COMPOUND (1)]
    Tag: <EMPTY>
    minorversion: 1
    Operations (count: 5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR
        Opcode: SEQUENCE (53)
            sessionid: 00c800000000000c0000000000000000
            seqid: 0x00000073
            slot id: 0
            high slot id: 0
            cache this?: Yes
        Opcode: PUTFH (22)
            FileHandle
        Opcode: OPEN (18)
            seqid: 0x00000000
            share_access: OPEN4_SHARE_ACCESS_READ (1)
            share_deny: OPEN4_SHARE_DENY_NONE (0)
            clientid: 0x00c8000000000017
            owner: <DATA>
            Open Type: OPEN4_NOCREATE (0)
            Claim Type: CLAIM_FH (4)
        Opcode: ACCESS (3), [Check: RD MD XT XE]
            Check access: 0x2d
                .... ...1 = 0x001 READ: allowed?
                .... .1.. = 0x004 MODIFY: allowed?
                .... 1... = 0x008 EXTEND: allowed?
                ..1. .... = 0x020 EXECUTE: allowed?
        Opcode: GETATTR (9)
            Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId)
                reqd_attr: Type (1)
                reqd_attr: Change (3)
                reqd_attr: Size (4)
                reqd_attr: FSID (8)
                reco_attr: FileId (20)
            Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner,
Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata,
Time_Modify)
                reco_attr: Mode (33)
                reco_attr: NumLinks (35)
                reco_attr: Owner (36)
                reco_attr: Owner_Group (37)
                reco_attr: RawDev (41)
                reco_attr: Space_Used (45)
                reco_attr: Time_Access (47)
                reco_attr: Time_Metadata (52)
                reco_attr: Time_Modify (53)
    [Main Opcode: OPEN (18)]





Frame 26: 374 bytes on wire (2992 bits), 374 bytes captured (2992 bits)
Ethernet II, Src: PureStor_5b:45:10 (24:a9:37:5b:45:10), Dst:
fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24)
Internet Protocol Version 4, Src: 10.15.128.15, Dst: 10.15.132.250
Transmission Control Protocol, Src Port: 2049, Dst Port: 710, Seq:
1193, Ack: 2437, Len: 308
Remote Procedure Call, Type:Reply XID:0xed175b8d
Network File System, Ops(5): SEQUENCE PUTFH OPEN ACCESS GETATTR
    [Program Version: 4]
    [V4 Procedure: COMPOUND (1)]
    Status: NFS4_OK (0)
    Tag: <EMPTY>
    Operations (count: 5)
        Opcode: SEQUENCE (53)
            Status: NFS4_OK (0)
            sessionid: 00c800000000000c0000000000000000
            seqid: 0x00000073
            slot id: 0
            high slot id: 127
            target high slot id: 127
            status flags: 0x00000000
        Opcode: PUTFH (22)
            Status: NFS4_OK (0)
        Opcode: OPEN (18)
            Status: NFS4_OK (0)
            StateID
                [StateID Hash: 0xaa04]
                StateID seqid: 1
                StateID Other: 000000000000000000000000
                [StateID Other hash: 0x60de]
            change_info
                Atomic: Yes
                changeid (before): 1626267251187000000
                changeid (after): 1626267251187000000
            result flags: 0x00000004, locktype posix
            Delegation Type: OPEN_DELEGATE_NONE (0)
        Opcode: ACCESS (3), [NOT Supported: LU DL], [Access Denied: LU
MD XT DL XE], [Allowed: RD]
            Status: NFS4_OK (0)
            Supported types (of requested): 0x3f
            Access rights (of requested): 0x01
                .... ...1 = 0x001 READ: allowed
                .... ..0. = 0x002 LOOKUP: *Access Denied*
                .... .0.. = 0x004 MODIFY: *Access Denied*
                .... 0... = 0x008 EXTEND: *Access Denied*
                ...0 .... = 0x010 DELETE: *Access Denied*
                ..0. .... = 0x020 EXECUTE: *Access Denied*
        Opcode: GETATTR (9)
            Status: NFS4_OK (0)
            Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId)
                reqd_attr: Type (1)
                    ftype4: NF4REG (1)
                reqd_attr: Change (3)
                    changeid: 1626267251187000000
                reqd_attr: Size (4)
                    size: 7
                reqd_attr: FSID (8)
                    fattr4_fsid
                reco_attr: FileId (20)
                    fileid: 20829148276664444
            Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner,
Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata,
Time_Modify)
                reco_attr: Mode (33)
                    mode: 0644, Name: Unknown, Read permission for
owner, Write permission for owner, Read permission for group, Read
permission for others
                reco_attr: NumLinks (35)
                    numlinks: 1
                reco_attr: Owner (36)
                    fattr4_owner: 0
                reco_attr: Owner_Group (37)
                    fattr4_owner_group: 0
                reco_attr: RawDev (41)
                    specdata1: 0
                    specdata2: 0
                reco_attr: Space_Used (45)
                    space_used: 7
                reco_attr: Time_Access (47)
                    seconds: 1626267228
                    nseconds: 0
                reco_attr: Time_Metadata (52)
                    seconds: 1626267251
                    nseconds: 187000000
                reco_attr: Time_Modify (53)
                    seconds: 1626267251
                    nseconds: 186841000
    [Main Opcode: OPEN (18)]


-- 
Matt Robertson // Data Architect | Pure Storage, Inc.
913-915-1162  |  mrobertson@purestorage.com

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: NFSv4.1 - 2x opens causes unknown 5s delay.
  2021-07-30  3:15 NFSv4.1 - 2x opens causes unknown 5s delay Matthew Robertson
@ 2021-07-31 22:16 ` Trond Myklebust
  0 siblings, 0 replies; 2+ messages in thread
From: Trond Myklebust @ 2021-07-31 22:16 UTC (permalink / raw)
  To: Matthew Robertson; +Cc: linux-nfs



> On Jul 29, 2021, at 23:15, Matthew Robertson <mrobertson@purestorage.com> wrote:
> 
> Doing a simple open of a file from 2 readers and there is a strange 5s
> delay on the second open.   We did NOT see the issue on kernel:
> CentOS 4.18.0-193.6.3.el8_2.x86_64
> but the issue is present in:
> CentOS 4.18.0-305.7.1.el8_4.x86_64

Your OPEN implementation is fundamentally broken. You’re handing out an illegal stateid. The client is not obligated to take into account an implementation that does not bump the seqid as required by the NFSv4 spec.

> 
> 
> #include <stdio.h>
> 
> int main() {
> 
>        FILE * file_pointer;
> 
>        char buffer[50], c;
> 
> file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r");
> 
> fgets(buffer, 50, file_pointer);
> 
> file_pointer = fopen("/mnt/nfsv4/test/t.txt", "r");
> 
> fgets(buffer, 50, file_pointer);
> 
>        fclose(file_pointer);
> 
> return 0;
> 
> }
> 
> 
> 
> Call Graph from ftrace
> 
> _nfs4_opendata_to_nfs4_state()
> 
> nfs_refresh_inode() {
> 
>  nfs_refresh_inode.part.28() {
> 
>   nfs_refresh_inode_locked() {
> 
> update_open_stateid()
> 
>  ** 5s wait **
> 
> Then the open / read completes
> 
> 
> We do not see this issue on nfsv3 only nfsv4.
> 
> 
> Here is the wireshark of the request and response:
> 
> The second request and response happens immediately in the wireshark.
> The delay is in the kernel.
> 
> 
> 
> Frame 25: 378 bytes on wire (3024 bits), 378 bytes captured (3024 bits)
> Ethernet II, Src: fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24), Dst:
> PureStor_5b:45:10 (24:a9:37:5b:45:10)
> Internet Protocol Version 4, Src: 10.15.132.250, Dst: 10.15.128.15
> Transmission Control Protocol, Src Port: 710, Dst Port: 2049, Seq:
> 2125, Ack: 1193, Len: 312
> Remote Procedure Call, Type:Call XID:0xed175b8d
> Network File System, Ops(5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR
>    [Program Version: 4]
>    [V4 Procedure: COMPOUND (1)]
>    Tag: <EMPTY>
>    minorversion: 1
>    Operations (count: 5): SEQUENCE, PUTFH, OPEN, ACCESS, GETATTR
>        Opcode: SEQUENCE (53)
>            sessionid: 00c800000000000c0000000000000000
>            seqid: 0x00000073
>            slot id: 0
>            high slot id: 0
>            cache this?: Yes
>        Opcode: PUTFH (22)
>            FileHandle
>        Opcode: OPEN (18)
>            seqid: 0x00000000
>            share_access: OPEN4_SHARE_ACCESS_READ (1)
>            share_deny: OPEN4_SHARE_DENY_NONE (0)
>            clientid: 0x00c8000000000017
>            owner: <DATA>
>            Open Type: OPEN4_NOCREATE (0)
>            Claim Type: CLAIM_FH (4)
>        Opcode: ACCESS (3), [Check: RD MD XT XE]
>            Check access: 0x2d
>                .... ...1 = 0x001 READ: allowed?
>                .... .1.. = 0x004 MODIFY: allowed?
>                .... 1... = 0x008 EXTEND: allowed?
>                ..1. .... = 0x020 EXECUTE: allowed?
>        Opcode: GETATTR (9)
>            Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId)
>                reqd_attr: Type (1)
>                reqd_attr: Change (3)
>                reqd_attr: Size (4)
>                reqd_attr: FSID (8)
>                reco_attr: FileId (20)
>            Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner,
> Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata,
> Time_Modify)
>                reco_attr: Mode (33)
>                reco_attr: NumLinks (35)
>                reco_attr: Owner (36)
>                reco_attr: Owner_Group (37)
>                reco_attr: RawDev (41)
>                reco_attr: Space_Used (45)
>                reco_attr: Time_Access (47)
>                reco_attr: Time_Metadata (52)
>                reco_attr: Time_Modify (53)
>    [Main Opcode: OPEN (18)]
> 
> 
> 
> 
> 
> Frame 26: 374 bytes on wire (2992 bits), 374 bytes captured (2992 bits)
> Ethernet II, Src: PureStor_5b:45:10 (24:a9:37:5b:45:10), Dst:
> fa:16:3e:88:e2:24 (fa:16:3e:88:e2:24)
> Internet Protocol Version 4, Src: 10.15.128.15, Dst: 10.15.132.250
> Transmission Control Protocol, Src Port: 2049, Dst Port: 710, Seq:
> 1193, Ack: 2437, Len: 308
> Remote Procedure Call, Type:Reply XID:0xed175b8d
> Network File System, Ops(5): SEQUENCE PUTFH OPEN ACCESS GETATTR
>    [Program Version: 4]
>    [V4 Procedure: COMPOUND (1)]
>    Status: NFS4_OK (0)
>    Tag: <EMPTY>
>    Operations (count: 5)
>        Opcode: SEQUENCE (53)
>            Status: NFS4_OK (0)
>            sessionid: 00c800000000000c0000000000000000
>            seqid: 0x00000073
>            slot id: 0
>            high slot id: 127
>            target high slot id: 127
>            status flags: 0x00000000
>        Opcode: PUTFH (22)
>            Status: NFS4_OK (0)
>        Opcode: OPEN (18)
>            Status: NFS4_OK (0)
>            StateID
>                [StateID Hash: 0xaa04]
>                StateID seqid: 1
>                StateID Other: 000000000000000000000000
>                [StateID Other hash: 0x60de]
>            change_info
>                Atomic: Yes
>                changeid (before): 1626267251187000000
>                changeid (after): 1626267251187000000
>            result flags: 0x00000004, locktype posix
>            Delegation Type: OPEN_DELEGATE_NONE (0)
>        Opcode: ACCESS (3), [NOT Supported: LU DL], [Access Denied: LU
> MD XT DL XE], [Allowed: RD]
>            Status: NFS4_OK (0)
>            Supported types (of requested): 0x3f
>            Access rights (of requested): 0x01
>                .... ...1 = 0x001 READ: allowed
>                .... ..0. = 0x002 LOOKUP: *Access Denied*
>                .... .0.. = 0x004 MODIFY: *Access Denied*
>                .... 0... = 0x008 EXTEND: *Access Denied*
>                ...0 .... = 0x010 DELETE: *Access Denied*
>                ..0. .... = 0x020 EXECUTE: *Access Denied*
>        Opcode: GETATTR (9)
>            Status: NFS4_OK (0)
>            Attr mask[0]: 0x0010011a (Type, Change, Size, FSID, FileId)
>                reqd_attr: Type (1)
>                    ftype4: NF4REG (1)
>                reqd_attr: Change (3)
>                    changeid: 1626267251187000000
>                reqd_attr: Size (4)
>                    size: 7
>                reqd_attr: FSID (8)
>                    fattr4_fsid
>                reco_attr: FileId (20)
>                    fileid: 20829148276664444
>            Attr mask[1]: 0x0030a23a (Mode, NumLinks, Owner,
> Owner_Group, RawDev, Space_Used, Time_Access, Time_Metadata,
> Time_Modify)
>                reco_attr: Mode (33)
>                    mode: 0644, Name: Unknown, Read permission for
> owner, Write permission for owner, Read permission for group, Read
> permission for others
>                reco_attr: NumLinks (35)
>                    numlinks: 1
>                reco_attr: Owner (36)
>                    fattr4_owner: 0
>                reco_attr: Owner_Group (37)
>                    fattr4_owner_group: 0
>                reco_attr: RawDev (41)
>                    specdata1: 0
>                    specdata2: 0
>                reco_attr: Space_Used (45)
>                    space_used: 7
>                reco_attr: Time_Access (47)
>                    seconds: 1626267228
>                    nseconds: 0
>                reco_attr: Time_Metadata (52)
>                    seconds: 1626267251
>                    nseconds: 187000000
>                reco_attr: Time_Modify (53)
>                    seconds: 1626267251
>                    nseconds: 186841000
>    [Main Opcode: OPEN (18)]
> 
> 
> -- 
> Matt Robertson // Data Architect | Pure Storage, Inc.
> 913-915-1162  |  mrobertson@purestorage.com

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-07-31 22:16 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-30  3:15 NFSv4.1 - 2x opens causes unknown 5s delay Matthew Robertson
2021-07-31 22:16 ` Trond Myklebust

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.