Data corruption after truncate on an NFSv3 client mount point when the NFS server restarts

From: Kinglong Mee @ 2021-02-03 11:44 UTC
  To: Linux NFS Mailing List; +Cc: Trond Myklebust, Anna Schumaker, Chuck Lever

Hello,

I hit a data corruption problem on an NFSv3 client mount point (mounted
without the sync option) on CentOS 7.6 (kernel 3.10.0-957.el7.x86_64).

When running the LTP ftest01 test case while restarting/rebooting the
NFS server, ftest01 sometimes reports data corruption or a size mismatch.

Debugging shows:
1. The WRITE reply contains a write verifier, and the COMMIT reply
    contains a verifier too. knfsd fills in both verifiers from its
    per-net state (looked up via nfsd_net_id); other NFS servers (e.g.
    nfs-ganesha) encode the daemon start time. After the NFS server
    restarts/reboots, the verifier changes.
2. When mounted without sync, the NFS client uses buffered I/O and
    sends WRITE requests with the unstable argument.
    After an unstable WRITE succeeds, the request is added to the
    commit list instead of being removed right away.
3. A following sync (fsync) or setattr (truncate) triggers a COMMIT.
    If the COMMIT succeeds and the returned verifier is the same as
    that of the unstable WRITE on the commit list, the unstable WRITE
    is removed; otherwise the unstable WRITE is resent to the NFS
    server. This happens in nfs_commit_release_pages().
4. After an NFS server restart, the COMMIT may be processed by the new
    server instance, which returns success with a different verifier.
    In this case the unstable WRITEs are resent, but the COMMIT
    finishes without any error and does not wait for the resent WRITEs.
5. For sync (fsync) this is okay, but for setattr (truncate) data
    corruption appears, because the resent WRITEs may be sent to the
    server concurrently with the SETATTR for the truncate.
6. nfs_setattr() only calls nfs_sync_inode() and does not check the
    result:

	/* Write all dirty data */
	if (S_ISREG(inode->i_mode))
		nfs_sync_inode(inode);

Also, nfs_initiate_commit() does not return an error for FLUSH_SYNC
when the COMMIT hits an error.
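
To make the verifier check from steps 3 and 4 concrete, here is a small
user-space model of it. This is only a sketch and not the kernel code:
struct toy_req and toy_commit_release() are made-up names, and the real
logic in nfs_commit_release_pages() works on nfs_page requests and
re-dirties the pages instead of setting a flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VERF_SIZE 8	/* NFSv3 writeverf3 is 8 opaque bytes */

/* Hypothetical stand-in for one unstable WRITE kept on the commit list. */
struct toy_req {
	uint8_t write_verf[VERF_SIZE];	/* verifier from the WRITE reply */
	bool needs_resend;		/* set when the COMMIT verifier differs */
};

/*
 * Toy version of the check done when a COMMIT reply arrives: a request
 * only counts as stable if the COMMIT verifier equals its WRITE verifier.
 * Returns the number of requests that have to be written again.
 */
static int toy_commit_release(struct toy_req *reqs, int n,
			      const uint8_t commit_verf[VERF_SIZE])
{
	int resend = 0;

	for (int i = 0; i < n; i++) {
		reqs[i].needs_resend =
			memcmp(reqs[i].write_verf, commit_verf, VERF_SIZE) != 0;
		if (reqs[i].needs_resend)
			resend++;
	}
	return resend;
}
```

With two requests written under one verifier, a COMMIT reply carrying
the same verifier gives 0, and a COMMIT reply carrying a different
verifier (the restart case in step 4) gives 2. The point of the model is
that the second case is not an error from the COMMIT's point of view, so
nothing tells the caller that WRITEs are in flight again.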

Maybe we should return an error to the caller doing the FLUSH_SYNC
commit (nfs_setattr() here) when the COMMIT returns a verifier that
differs from that of the unstable WRITEs. With the error, the caller
can either return it or call nfs_sync_inode() again.
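
A rough user-space sketch of that idea follows. It is not a patch: all
toy_* names are hypothetical, the choice of -EAGAIN as the error and the
bounded retry count are my assumptions, and the real change would have
to go through nfs_initiate_commit()/nfs_sync_inode().

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical toy server: its verifier changes on every restart. */
struct toy_server {
	uint64_t verf;
};

/* Hypothetical toy inode: remembers the verifier of its unstable WRITEs. */
struct toy_inode {
	uint64_t write_verf;
	int unstable;		/* number of uncommitted WRITEs */
};

/* Send (or resend) all unstable WRITEs; replies carry the current verifier. */
static void toy_write_unstable(struct toy_inode *ino,
			       const struct toy_server *srv)
{
	ino->write_verf = srv->verf;
}

/*
 * Toy synchronous COMMIT. A verifier mismatch is reported to the caller
 * as -EAGAIN (assumed error code) instead of being hidden.
 */
static int toy_commit_sync(struct toy_inode *ino, const struct toy_server *srv)
{
	if (ino->unstable == 0)
		return 0;
	if (ino->write_verf != srv->verf)
		return -EAGAIN;	/* server restarted, data must be rewritten */
	ino->unstable = 0;
	return 0;
}

/*
 * Toy caller in the role of nfs_setattr(): keep syncing until the COMMIT
 * verifier matches, and only then let the truncate go ahead.
 * max_tries bounds the loop; the value is arbitrary here.
 */
static int toy_sync_before_truncate(struct toy_inode *ino,
				    const struct toy_server *srv, int max_tries)
{
	int err = -EAGAIN;

	while (max_tries-- > 0) {
		err = toy_commit_sync(ino, srv);
		if (err != -EAGAIN)
			break;
		toy_write_unstable(ino, srv);	/* resend, then COMMIT again */
	}
	return err;
}
```

In this model the SETATTR would only be sent once
toy_sync_before_truncate() returns 0, so the resent WRITEs can no longer
race with the truncate. Whether the caller should retry or just return
the error is the open question above.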

Although I hit this problem on an older kernel, the logic in the
latest NFS client source has not changed.

Any suggestions are welcome.

thanks,
Kinglong Mee
