Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Kirill Smelkov <kirr@nexedi.com>
To: Miklos Szeredi <miklos@szeredi.hu>, Miklos Szeredi <mszeredi@redhat.com>
Cc: <linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Kirill Smelkov <kirr@nexedi.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Yongzhi Pan <panyongzhi@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	David Vrabel <david.vrabel@citrix.com>,
	Juergen Gross <jgross@suse.com>, Tejun Heo <tj@kernel.org>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Arnd Bergmann <arnd@arndb.de>, Christoph Hellwig <hch@lst.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Julia Lawall <Julia.Lawall@lip6.fr>,
	Nikolaus Rath <Nikolaus@rath.org>,
	Han-Wen Nienhuys <hanwen@google.com>, <stable@vger.kernel.org>
Subject: [RESEND, PATCH 3/3] fuse: Add FOPEN_STREAM and use stream_open() if filesystem returned that from open handler
Date: Wed, 24 Apr 2019 07:13:57 +0000
Message-ID: <20190424071316.11967-1-kirr@nexedi.com> (raw)
In-Reply-To: <dc47c061f20c464ccf46b43822b062dca6486e90.1553637462.git.kirr@nexedi.com>

Starting from 9c225f2655 (vfs: atomic f_pos accesses as per POSIX) files
opened even via nonseekable_open gate read and write via lock and do not
allow them to be run simultaneously. This can create read vs write
deadlock if a filesystem is trying to implement a socket-like file which
is intended to be simultaneously used for both read and write from
filesystem client. See 10dce8af3422 ("fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock") for details and e.g. 581d21a2d0 (xenbus: fix deadlock on
writes to /proc/xen/xenbus) for a similar deadlock example on /proc/xen/xenbus.

To avoid such deadlock it was tempting to adjust fuse_finish_open to use
stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags,
but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
and in particular GVFS which actually uses offset in its read and write
handlers

	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

so if we would do such a change it will break a real user.

-> Add another flag (FOPEN_STREAM) for filesystem servers to indicate
that the opened handler is having stream-like semantics; does not use
file position and thus the kernel is free to issue simultaneous read and
write request on opened file handle.

This patch together with stream_open (10dce8af3422) should be added to
stable kernels starting from v3.14+ (the kernel where 9c225f2655 first
appeared). This will allow to patch OSSPD and other FUSE filesystems that
provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in
open handler and this way avoid the deadlock on all kernel versions. This
should work because fuse_finish_open ignores unknown open flags returned
from a filesystem and so passing FOPEN_STREAM to a kernel that is not
aware of this flag cannot hurt. In turn the kernel that is not aware of
FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to
implement streams without read vs write deadlock.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yongzhi Pan <panyongzhi@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Nikolaus Rath <Nikolaus@rath.org>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
---

( resending the same patch with updated description to reference stream_open
  that was landed to master as 10dce8af3422; also added cc stable )

 fs/fuse/file.c            | 4 +++-
 include/uapi/linux/fuse.h | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 06096b60f1df..44de96cb7871 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -178,7 +178,9 @@ void fuse_finish_open(struct inode *inode, struct file *file)
 
 	if (!(ff->open_flags & FOPEN_KEEP_CACHE))
 		invalidate_inode_pages2(inode->i_mapping);
-	if (ff->open_flags & FOPEN_NONSEEKABLE)
+	if (ff->open_flags & FOPEN_STREAM)
+		stream_open(inode, file);
+	else if (ff->open_flags & FOPEN_NONSEEKABLE)
 		nonseekable_open(inode, file);
 	if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) {
 		struct fuse_inode *fi = get_fuse_inode(inode);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 2ac598614a8f..26abf0a571c7 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -229,11 +229,13 @@ struct fuse_file_lock {
  * FOPEN_KEEP_CACHE: don't invalidate the data cache on open
  * FOPEN_NONSEEKABLE: the file is not seekable
  * FOPEN_CACHE_DIR: allow caching this directory
+ * FOPEN_STREAM: the file is stream-like
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
 #define FOPEN_NONSEEKABLE	(1 << 2)
 #define FOPEN_CACHE_DIR		(1 << 3)
+#define FOPEN_STREAM		(1 << 4)
 
 /**
  * INIT request/reply flags
-- 
2.21.0.765.geec228f530

  reply index

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-26 22:20 [PATCH 1/3] fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock Kirill Smelkov
2019-03-26 23:22 ` [PATCH 3/3] fuse: Add FOPEN_STREAM and use stream_open() if filesystem returned that from open handler Kirill Smelkov
2019-04-24  7:13   ` Kirill Smelkov [this message]
     [not found]     ` <20190424160611.2A71321900@mail.kernel.org>
2019-04-24 19:16       ` [RESEND, PATCH " Kirill Smelkov
     [not found] ` <8794193f3040b798010970228d978c05ad56ec52.1553637462.git.kirr@nexedi.com>
2019-03-27  6:54   ` [PATCH 2/3] *: convert stream-like files from nonseekable_open -> stream_open Lubomir Rintel
2019-03-27 16:58 ` [PATCH 1/3] fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock Juergen Gross
2019-04-06 17:07 ` Linus Torvalds
2019-04-07 20:04   ` Kirill Smelkov
2019-04-08  0:09     ` Linus Torvalds
2019-04-14  7:11       ` Kirill Smelkov
     [not found] ` <4c4651e2-167e-bfcc-7b3e-cda118f98a69@rasmusvillemoes.dk>
     [not found]   ` <20190409203807.GA13855@deco.navytux.spb.ru>
     [not found]     ` <d8c23d05-8810-13a2-cc50-7a47ff35e90b@rasmusvillemoes.dk>
2019-04-11 12:38       ` Kirill Smelkov
2019-04-11 16:22         ` Linus Torvalds
2019-04-12 12:42           ` Kirill Smelkov
2019-04-13 16:54             ` Kirill Smelkov
2019-04-13 16:54               ` [PATCH 1/2] vfs: pass ppos=NULL to .read()/.write() of FMODE_STREAM files Kirill Smelkov
2019-04-13 17:27                 ` Linus Torvalds
2019-04-13 17:38                   ` Al Viro
2019-04-13 18:44                   ` Kirill Smelkov
2019-04-13 16:55               ` [PATCH 2/2] vfs: use &file->f_pos directly on files that have position Kirill Smelkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190424071316.11967-1-kirr@nexedi.com \
    --to=kirr@nexedi.com \
    --cc=Julia.Lawall@lip6.fr \
    --cc=Nikolaus@rath.org \
    --cc=arnd@arndb.de \
    --cc=corbet@lwn.net \
    --cc=david.vrabel@citrix.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hanwen@google.com \
    --cc=hch@lst.de \
    --cc=jgross@suse.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=mszeredi@redhat.com \
    --cc=mtk.manpages@gmail.com \
    --cc=panyongzhi@gmail.com \
    --cc=stable@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git