Linux-NFS Archive on lore.kernel.org
From: Trond Myklebust <trondmy@gmail.com>
To: "J. Bruce Fields" <bfields@redhat.com>
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH 0/9] Fix error reporting for NFS writes
Date: Mon,  6 Jan 2020 13:40:28 -0500
Message-ID: <20200106184037.563557-1-trond.myklebust@hammerspace.com>

In cases where we have transient errors, such as ENOSPC, it is important
to ensure that errors are reported on all writes that may be affected.

The problem we have is that not all errors are guaranteed to be reported
at write time. Some are reported only when we call fsync(). In
particular, this can be a problem for stable NFS writes. Since most
filesystems protect the write to the page cache with the inode lock,
but do not protect the subsequent call to generic_write_sync(), this
means that if we have parallel writes to the same file, we can end up
assigning the error to the wrong stable write call. If the application
expects to be able to fix the transient errors, it may end up replaying
the wrong write. One area where we have seen this happen is in flexfiles
writes, where the server is capable of freeing up space on the DS in
case of ENOSPC.
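
To make the misattribution concrete, here is a minimal user-space sketch.
It is not the nfsd code: struct sim_file and the sim_*() helpers are
hypothetical names, and the single pending_err field is a deliberate
simplification of how the kernel tracks writeback errors. It only
replays one fixed interleaving of two stable writes, A and B.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a file whose write errors are deferred
 * and handed to whichever caller syncs next. */
struct sim_file {
	size_t space_left;	/* bytes the backing store can still take */
	int pending_err;	/* deferred error, reported at sync time */
};

/* Step 1: the "page cache" write. It never fails here; running out
 * of space is only recorded as a deferred error. */
static void sim_write_cache(struct sim_file *f, size_t len)
{
	if (len > f->space_left)
		f->pending_err = -ENOSPC;
	else
		f->space_left -= len;
}

/* Step 2: the sync. Returns and clears whatever error is pending. */
static int sim_sync(struct sim_file *f)
{
	int err = f->pending_err;

	f->pending_err = 0;
	return err;
}

/* Unserialised: A writes, B writes, B syncs, A syncs. */
static void sim_racy(struct sim_file *f, size_t len_a, size_t len_b,
		     int *res_a, int *res_b)
{
	sim_write_cache(f, len_a);
	sim_write_cache(f, len_b);
	*res_b = sim_sync(f);
	*res_a = sim_sync(f);
}

/* Serialised: each write and its sync run as one unit. */
static void sim_serialised(struct sim_file *f, size_t len_a, size_t len_b,
			   int *res_a, int *res_b)
{
	sim_write_cache(f, len_a);
	*res_a = sim_sync(f);
	sim_write_cache(f, len_b);
	*res_b = sim_sync(f);
}
```

With 100 bytes of space, a 200 byte write A and a 10 byte write B,
sim_racy() reports success for A and -ENOSPC for B, so a client that
replays on error would replay the wrong write. sim_serialised() reports
-ENOSPC for A and success for B.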

The other area where we have seen a similar problem is when we have
unstable writes, and the client sends a backgrounded commit in order
to free up memory. If there are outstanding writes while the commit
gets a transient error and bumps the write verifier, then we want to
ensure that those writes get the appropriate write verifier depending
on whether they were affected by the fsync() or not. Right now,
because the NFSv3 verifier is set in the XDR encoder well after the
write is done, there is a fairly large window for a race with a
background commit.
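
The verifier race can be sketched the same way. Again this is a toy
model under stated assumptions, not the nfsd implementation: the
verifier is modelled as a plain counter (the real one is an opaque
8 byte value), and struct sim_server and the sim_*() helpers are
hypothetical. The failed background commit is forced to land between
the write and the point where the reply is encoded.

```c
#include <stdbool.h>

/* Hypothetical server state: only the current write verifier. */
struct sim_server {
	unsigned long verifier;
};

/* A commit that hits a transient error bumps the verifier so that
 * clients know earlier unstable writes may need to be resent. */
static void sim_commit_failed(struct sim_server *s)
{
	s->verifier++;
}

/* Returns the verifier that ends up in the WRITE reply. */
static unsigned long sim_write_reply(struct sim_server *s,
				     bool sample_with_write)
{
	/* The write itself completes while this verifier is current. */
	unsigned long at_write = s->verifier;

	/* A background commit fails before the reply is encoded. */
	sim_commit_failed(s);

	/* Late sampling picks up the bumped value instead. */
	return sample_with_write ? at_write : s->verifier;
}

/* Client-side check: a verifier mismatch means the write is resent. */
static bool sim_client_must_resend(unsigned long write_verf,
				   unsigned long commit_verf)
{
	return write_verf != commit_verf;
}
```

With late sampling the reply carries the already bumped verifier, so a
later COMMIT returns a matching value and the client has no reason to
resend data that the failed commit may have affected. Sampling together
with the write keeps the old value, and the mismatch is detected.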

This patch series deals with both issues by adding per-file-descriptor
locking that ensures that writes, fsync error handling, and write verifier
updates are appropriately serialised.

Trond Myklebust (9):
  nfsd: Allow nfsd_vfs_write() to take the nfsd_file as an argument
  nfsd: Fix stable writes
  nfsd: Update the boot verifier on stable writes too.
  nfsd: Pass the nfsd_file as arguments to nfsd4_clone_file_range()
  nfsd: Ensure exclusion between CLONE and WRITE errors
  sunrpc: Fix potential leaks in sunrpc_cache_unhash()
  sunrpc: clean up cache entry add/remove from hashtable
  nfsd: Ensure sampling of the commit verifier is atomic with the commit
  nfsd: Ensure sampling of the write verifier is atomic with the write

 fs/nfsd/filecache.c |  1 +
 fs/nfsd/filecache.h |  1 +
 fs/nfsd/nfs3proc.c  |  5 +--
 fs/nfsd/nfs3xdr.c   | 16 +++------
 fs/nfsd/nfs4proc.c  | 14 ++++----
 fs/nfsd/nfsproc.c   |  2 +-
 fs/nfsd/vfs.c       | 79 ++++++++++++++++++++++++++++++++++-----------
 fs/nfsd/vfs.h       | 16 +++++----
 fs/nfsd/xdr3.h      |  2 ++
 net/sunrpc/cache.c  | 48 ++++++++++++++-------------
 10 files changed, 115 insertions(+), 69 deletions(-)

-- 
2.24.1



Thread overview: 12+ messages
2020-01-06 18:40 Trond Myklebust [this message]
2020-01-06 18:40 ` [PATCH 1/9] nfsd: Allow nfsd_vfs_write() to take the nfsd_file as an argument Trond Myklebust
2020-01-06 18:40   ` [PATCH 2/9] nfsd: Fix stable writes Trond Myklebust
2020-01-06 18:40     ` [PATCH 3/9] nfsd: Update the boot verifier on stable writes too Trond Myklebust
2020-01-06 18:40       ` [PATCH 4/9] nfsd: Pass the nfsd_file as arguments to nfsd4_clone_file_range() Trond Myklebust
2020-01-06 18:40         ` [PATCH 5/9] nfsd: Ensure exclusion between CLONE and WRITE errors Trond Myklebust
2020-01-06 18:40           ` [PATCH 6/9] sunrpc: Fix potential leaks in sunrpc_cache_unhash() Trond Myklebust
2020-01-06 18:40             ` [PATCH 7/9] sunrpc: clean up cache entry add/remove from hashtable Trond Myklebust
2020-01-06 18:40               ` [PATCH 8/9] nfsd: Ensure sampling of the commit verifier is atomic with the commit Trond Myklebust
2020-01-06 18:40                 ` [PATCH 9/9] nfsd: Ensure sampling of the write verifier is atomic with the write Trond Myklebust
2020-01-22 15:27 ` [PATCH 0/9] Fix error reporting for NFS writes J. Bruce Fields
2020-01-22 17:04   ` J. Bruce Fields
