linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kirill Smelkov <kirr@nexedi.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Miklos Szeredi <mszeredi@redhat.com>,
	Han-Wen Nienhuys <hanwen@google.com>,
	Jakob Unterwurzacher <jakobunt@gmail.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	fuse-devel <fuse-devel@lists.sourceforge.net>,
	stable <stable@vger.kernel.org>
Subject: Re: [RESEND4, PATCH 1/2] fuse: retrieve: cap requested size to negotiated max_write
Date: Wed, 24 Apr 2019 11:56:29 +0000	[thread overview]
Message-ID: <20190424115620.GA2723@deco.navytux.spb.ru> (raw)
In-Reply-To: <CAJfpegv0NvcsD9gh3aN+ngzsLqbNWUc73ZPN1qHox_14eN9Pqg@mail.gmail.com>

On Wed, Apr 24, 2019 at 12:44:50PM +0200, Miklos Szeredi wrote:
> On Wed, Mar 27, 2019 at 11:15 AM Kirill Smelkov <kirr@nexedi.com> wrote:
> >
> > FUSE filesystem server and kernel client negotiate during initialization
> > phase, what should be the maximum write size the client will ever issue.
> > Correspondingly the filesystem server then queues sys_read calls to read
> > requests with buffer capacity large enough to carry request header
> > + that max_write bytes. A filesystem server is free to set its max_write
> > in anywhere in the range between [1·page, fc->max_pages·page]. In
> > particular go-fuse[2] sets max_write by default as 64K, wheres default
> > fc->max_pages corresponds to 128K. Libfuse also allows users to
> > configure max_write, but by default presets it to possible maximum.
> >
> > If max_write is < fc->max_pages·page, and in NOTIFY_RETRIEVE handler we
> > allow to retrieve more than max_write bytes, corresponding prepared
> > NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
> > filesystem server, in full correspondence with server/client contract,
> > will be only queuing sys_read with ~max_write buffer capacity, and
> > fuse_dev_do_read throws away requests that cannot fit into server
> > request buffer. In turn the filesystem server could get stuck waiting
> > indefinitely for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK
> > which is understood by clients as that NOTIFY_REPLY was queued and will
> > be sent back.
> >
> > -> Cap requested size to negotiate max_write to avoid the problem.
> > This aligns with the way NOTIFY_RETRIEVE handler works, which already
> > unconditionally caps requested retrieve size to fuse_conn->max_pages.
> > This way it should not hurt NOTIFY_RETRIEVE semantic if we return less
> > data than was originally requested.
> >
> > Please see [1] for context where the problem of stuck filesystem was hit
> > for real, how the situation was traced and for more involving patch that
> > did not make it into the tree.
> >
> > [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
> > [2] https://github.com/hanwen/go-fuse
> >
> > Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
> > Cc: Han-Wen Nienhuys <hanwen@google.com>
> > Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
> > Cc: <stable@vger.kernel.org> # v2.6.36+
> > ---
> >  fs/fuse/dev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > index 8a63e52785e9..38e94bc43053 100644
> > --- a/fs/fuse/dev.c
> > +++ b/fs/fuse/dev.c
> > @@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
> >         offset = outarg->offset & ~PAGE_MASK;
> >         file_size = i_size_read(inode);
> >
> > -       num = outarg->size;
> > +       num = min(outarg->size, fc->max_write);
>
> This is wrong: the max_size limited num is overwritten if constrained
> by file size.

I assume you are meaning this:

	--- a/fs/fuse/dev.c
	+++ b/fs/fuse/dev.c
	@@ -1745,15 +1745,15 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
	        unsigned int offset;
	        size_t total_len = 0;
	        unsigned int num_pages;
	
	        offset = outarg->offset & ~PAGE_MASK;
	        file_size = i_size_read(inode);
	
	-       num = outarg->size;
	+       num = min(outarg->size, fc->max_write);
	        if (outarg->offset > file_size)
	                num = 0;
	        else if (outarg->offset + num > file_size)
	                num = file_size - outarg->offset;		<-- THIS
	
	        num_pages = (num + offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
	        num_pages = min(num_pages, fc->max_pages);

and then in this case (offset + num > file_size) num overwrite

	num = file_size - offset

can make num only smaller, right? And then the patch is not wrong because there
is no other num overwriting in this function except when num is being further
decremented in loop that prepares pages to retrieve.

Or am I missing something?  Would it be more clear to cap num to
max_write after all calculations?  But then - if we are not sure that
file_size check can only lower num - we have a problem: we are no longer
sure that num is <= outarg->size.


> Also the patch is whitespace damaged.

I've tried to do the following in my mutt on "RESEND4, PATCH 1/2"
message:

	|(cd ~/src/linux/linux && git am -)

and the patch applied successfully. So could you please clarify what
"whitespace damaged" means?

Attaching the patch once again just in case.

Kirill

---- 8< ----
From 000a5a6c91992f2a361a846cd050444964920f85 Mon Sep 17 00:00:00 2001
From: Kirill Smelkov <kirr@nexedi.com>
Date: Wed, 27 Mar 2019 13:00:56 +0300
Subject: [PATCH] fuse: retrieve: cap requested size to negotiated max_write
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

FUSE filesystem server and kernel client negotiate during initialization
phase, what should be the maximum write size the client will ever issue.
Correspondingly the filesystem server then queues sys_read calls to read
requests with buffer capacity large enough to carry request header
+ that max_write bytes. A filesystem server is free to set its max_write
in anywhere in the range between [1·page, fc->max_pages·page]. In
particular go-fuse[2] sets max_write by default as 64K, wheres default
fc->max_pages corresponds to 128K. Libfuse also allows users to
configure max_write, but by default presets it to possible maximum.

If max_write is < fc->max_pages·page, and in NOTIFY_RETRIEVE handler we
allow to retrieve more than max_write bytes, corresponding prepared
NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
filesystem server, in full correspondence with server/client contract,
will be only queuing sys_read with ~max_write buffer capacity, and
fuse_dev_do_read throws away requests that cannot fit into server
request buffer. In turn the filesystem server could get stuck waiting
indefinitely for NOTIFY_REPLY since NOTIFY_RETRIEVE handler returned OK
which is understood by clients as that NOTIFY_REPLY was queued and will
be sent back.

-> Cap requested size to negotiate max_write to avoid the problem.
This aligns with the way NOTIFY_RETRIEVE handler works, which already
unconditionally caps requested retrieve size to fuse_conn->max_pages.
This way it should not hurt NOTIFY_RETRIEVE semantic if we return less
data than was originally requested.

Please see [1] for context where the problem of stuck filesystem was hit
for real, how the situation was traced and for more involving patch that
did not make it into the tree.

[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
[2] https://github.com/hanwen/go-fuse

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
Cc: <stable@vger.kernel.org> # v2.6.36+
---
 fs/fuse/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 9971a35cf1ef..1e2efd873201 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 	offset = outarg->offset & ~PAGE_MASK;
 	file_size = i_size_read(inode);

-	num = outarg->size;
+	num = min(outarg->size, fc->max_write);
 	if (outarg->offset > file_size)
 		num = 0;
 	else if (outarg->offset + num > file_size)
--
2.21.0.765.geec228f530


  reply	other threads:[~2019-04-24 12:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-27 10:15 [RESEND4, PATCH 0/2] fuse: don't stuck clients on retrieve_notify with size > max_write Kirill Smelkov
2019-03-27 10:15 ` [RESEND4, PATCH 2/2] fuse: require /dev/fuse reads to have enough buffer capacity as negotiated Kirill Smelkov
2019-04-24 10:48   ` Miklos Szeredi
2019-04-24 11:58     ` Kirill Smelkov
2019-03-27 10:15 ` [RESEND4, PATCH 1/2] fuse: retrieve: cap requested size to negotiated max_write Kirill Smelkov
2019-04-24 10:44   ` Miklos Szeredi
2019-04-24 11:56     ` Kirill Smelkov [this message]
2019-04-24 12:17       ` Miklos Szeredi
2019-04-24 12:31         ` Kirill Smelkov
2019-04-24 13:19           ` Miklos Szeredi
2019-04-24 14:22             ` Kirill Smelkov
2019-04-24 15:02               ` Miklos Szeredi
2019-04-24 18:10                 ` Kirill Smelkov
2019-04-24 18:59                   ` Kirill Smelkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190424115620.GA2723@deco.navytux.spb.ru \
    --to=kirr@nexedi.com \
    --cc=akpm@linux-foundation.org \
    --cc=fuse-devel@lists.sourceforge.net \
    --cc=hanwen@google.com \
    --cc=jakobunt@gmail.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=mszeredi@redhat.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).