linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kirill Smelkov <kirr@nexedi.com>
To: Miklos Szeredi <miklos@szeredi.hu>, Miklos Szeredi <mszeredi@redhat.com>
Cc: Han-Wen Nienhuys <hanwen@google.com>,
	Jakob Unterwurzacher <jakobunt@gmail.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<fuse-devel@lists.sourceforge.net>,
	Kirill Smelkov <kirr@nexedi.com>
Subject: [RESEND4, PATCH 2/2] fuse: require /dev/fuse reads to have enough buffer capacity as negotiated
Date: Wed, 27 Mar 2019 10:15:15 +0000	[thread overview]
Message-ID: <d74b17b9d33c3dcc7a1f2fa2914fb3c4e7cda127.1553680185.git.kirr@nexedi.com> (raw)
In-Reply-To: <cover.1553680185.git.kirr@nexedi.com>

A FUSE filesystem server queues /dev/fuse sys_read calls to get
filesystem requests to handle. It does not know in advance what would be
that request as it can be anything that client issues - LOOKUP, READ,
WRITE, ... Many requests are short and retrieve data from the
filesystem. However WRITE and NOTIFY_REPLY write data into filesystem.

Before getting into operation phase, FUSE filesystem server and kernel
client negotiate what should be the maximum write size the client will
ever issue. After negotiation the contract in between server/client is
that the filesystem server then should queue /dev/fuse sys_read calls with
enough buffer capacity to receive any client request - WRITE in
particular, while FUSE client should not, in particular, send WRITE
requests with > negotiated max_write payload. FUSE client in kernel and
libfuse historically reserve 4K for request header. This way the
contract is that filesystem server should queue sys_reads with
4K+max_write buffer.

If the filesystem server does not follow this contract, what can happen
is that fuse_dev_do_read will see that request size is > buffer size,
and then it will return EIO to client who issued the request but won't
indicate in any way that there is a problem to filesystem server.
This can be hard to diagnose because for some requests, e.g. for
NOTIFY_REPLY which mimics WRITE, there is no client thread that is
waiting for request completion and that EIO goes nowhere, while on
filesystem server side things look like the kernel is not replying back
after successful NOTIFY_RETRIEVE request made by the server.

-> We can make the problem easy to diagnose if we indicate via error
return to filesystem server when it is violating the contract.
This should not practically cause problems because if a filesystem
server is using shorter buffer, writes to it were already very likely to
cause EIO, and if the filesystem is read-only it should be too following
8K minimum buffer size (= either FUSE_MIN_READ_BUFFER, see 1d3d752b47,
or = 4K + min(max_write)=4k cared to be so by process_init_reply).

Please see [1] for context where the problem of stuck filesystem was hit
for real (because kernel client was incorrectly sending more than
max_write data with NOTIFY_REPLY; see also previous patch), how the
situation was traced and for more involving patch that did not make it
into the tree.

[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
---
 fs/fuse/dev.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 38e94bc43053..8fdfbafed037 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1317,6 +1317,16 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
 	unsigned reqsize;
 	unsigned int hash;
 
+	/*
+	 * Require sane minimum read buffer - that has capacity for fixed part
+	 * of any request header + negotated max_write room for data. If the
+	 * requirement is not satisfied return EINVAL to the filesystem server
+	 * to indicate that it is not following FUSE server/client contract.
+	 * Don't dequeue / abort any request.
+	 */
+	if (nbytes < (fc->conn_init ? 4096 + fc->max_write : FUSE_MIN_READ_BUFFER))
+		return -EINVAL;
+
  restart:
 	spin_lock(&fiq->waitq.lock);
 	err = -EAGAIN;
-- 
2.21.0.392.gf8f6787159

  reply	other threads:[~2019-03-27 10:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-27 10:15 [RESEND4, PATCH 0/2] fuse: don't stuck clients on retrieve_notify with size > max_write Kirill Smelkov
2019-03-27 10:15 ` Kirill Smelkov [this message]
2019-04-24 10:48   ` [RESEND4, PATCH 2/2] fuse: require /dev/fuse reads to have enough buffer capacity as negotiated Miklos Szeredi
2019-04-24 11:58     ` Kirill Smelkov
2019-03-27 10:15 ` [RESEND4, PATCH 1/2] fuse: retrieve: cap requested size to negotiated max_write Kirill Smelkov
2019-04-24 10:44   ` Miklos Szeredi
2019-04-24 11:56     ` Kirill Smelkov
2019-04-24 12:17       ` Miklos Szeredi
2019-04-24 12:31         ` Kirill Smelkov
2019-04-24 13:19           ` Miklos Szeredi
2019-04-24 14:22             ` Kirill Smelkov
2019-04-24 15:02               ` Miklos Szeredi
2019-04-24 18:10                 ` Kirill Smelkov
2019-04-24 18:59                   ` Kirill Smelkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d74b17b9d33c3dcc7a1f2fa2914fb3c4e7cda127.1553680185.git.kirr@nexedi.com \
    --to=kirr@nexedi.com \
    --cc=akpm@linux-foundation.org \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=hanwen@google.com \
    --cc=jakobunt@gmail.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=mszeredi@redhat.com \
    --subject='Re: [RESEND4, PATCH 2/2] fuse: require /dev/fuse reads to have enough buffer capacity as negotiated' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).