linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] fuse: close file synchronously
@ 2012-12-20 12:30 Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:30 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

Hi,

There is a long-standing demand for syncronous behaviour of fuse_release:

http://sourceforge.net/mailarchive/message.php?msg_id=19343889
http://sourceforge.net/mailarchive/message.php?msg_id=29814693

A few months ago Avati and me explained why such a feature would be useful:

http://sourceforge.net/mailarchive/message.php?msg_id=29889055
http://sourceforge.net/mailarchive/message.php?msg_id=29867423

In short, the problem is that fuse_release (that's called on last user
close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
ACK from userspace. Consequently, there is a gap when user regards the
file released while userspace fuse is still working on it. An attempt to
access the file from another node leads to complicated synchronization
problems because the first node still "holds" the file.

The patch-set resolves the problem by making fuse_release synchronous:
wait for ACK from userspace for FUSE_RELEASE if the feature is ON.

To keep single-threaded userspace implementations happy the patch-set
ensures that by the time fuse_release_common calls fuse_file_put, no
more in-flight I/O exists. Asynchronous fuse callbacks (like
fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
never block in contexts other than close().

Thanks,
Maxim

---

Maxim Patlasov (5):
      fuse: add close_wait flag to fuse_conn
      fuse: cosmetic rework of fuse_send_readpages
      fuse: wait for end of IO on release
      fuse: enable close_wait feature
      fuse: fix synchronous case of fuse_file_put()


 fs/fuse/file.c            |   82 ++++++++++++++++++++++++++++++++++++++-------
 fs/fuse/fuse_i.h          |    3 ++
 fs/fuse/inode.c           |    5 ++-
 include/uapi/linux/fuse.h |    7 +++-
 4 files changed, 82 insertions(+), 15 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] fuse: add close_wait flag to fuse_conn
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
@ 2012-12-20 12:31 ` Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:31 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

The feature will be governed by fc->close_wait. Userspace can enable it in
the same way as auto_inval_data or any other kernel fuse capability.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/fuse_i.h          |    3 +++
 fs/fuse/inode.c           |    5 ++++-
 include/uapi/linux/fuse.h |    7 ++++++-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e24dd74..156ad75 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -487,6 +487,9 @@ struct fuse_conn {
 	/** Use enhanced/automatic page cache invalidation. */
 	unsigned auto_inval_data:1;
 
+	/** Wait for response from daemon on close */
+	unsigned close_wait:1;
+
 	/** The number of requests waiting for completion */
 	atomic_t num_waiting;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index f0eda12..ac685f7 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -859,6 +859,8 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 				fc->dont_mask = 1;
 			if (arg->flags & FUSE_AUTO_INVAL_DATA)
 				fc->auto_inval_data = 1;
+			if (arg->flags & FUSE_CLOSE_WAIT)
+				fc->close_wait = 1;
 		} else {
 			ra_pages = fc->max_read / PAGE_CACHE_SIZE;
 			fc->no_lock = 1;
@@ -885,7 +887,8 @@ static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)
 	arg->flags |= FUSE_ASYNC_READ | FUSE_POSIX_LOCKS | FUSE_ATOMIC_O_TRUNC |
 		FUSE_EXPORT_SUPPORT | FUSE_BIG_WRITES | FUSE_DONT_MASK |
 		FUSE_SPLICE_WRITE | FUSE_SPLICE_MOVE | FUSE_SPLICE_READ |
-		FUSE_FLOCK_LOCKS | FUSE_IOCTL_DIR | FUSE_AUTO_INVAL_DATA;
+		FUSE_FLOCK_LOCKS | FUSE_IOCTL_DIR | FUSE_AUTO_INVAL_DATA |
+		FUSE_CLOSE_WAIT;
 	req->in.h.opcode = FUSE_INIT;
 	req->in.numargs = 1;
 	req->in.args[0].size = sizeof(*arg);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index d8c713e..b99d720 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -60,6 +60,9 @@
  *
  * 7.20
  *  - add FUSE_AUTO_INVAL_DATA
+ *
+ * 7.21
+ *  - add FUSE_CLOSE_WAIT
  */
 
 #ifndef _LINUX_FUSE_H
@@ -91,7 +94,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 20
+#define FUSE_KERNEL_MINOR_VERSION 21
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -179,6 +182,7 @@ struct fuse_file_lock {
  * FUSE_FLOCK_LOCKS: remote locking for BSD style file locks
  * FUSE_HAS_IOCTL_DIR: kernel supports ioctl on directories
  * FUSE_AUTO_INVAL_DATA: automatically invalidate cached pages
+ * FUSE_CLOSE_WAIT: wait for response from daemon on close
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -193,6 +197,7 @@ struct fuse_file_lock {
 #define FUSE_FLOCK_LOCKS	(1 << 10)
 #define FUSE_HAS_IOCTL_DIR	(1 << 11)
 #define FUSE_AUTO_INVAL_DATA	(1 << 12)
+#define FUSE_CLOSE_WAIT		(1 << 14)
 
 /**
  * CUSE INIT request/reply flags


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
@ 2012-12-20 12:31 ` Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:31 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

The patch change arguments of fuse_send_readpages to give it access to inode
(will be used in the next patch of patch-set). The change is cosmetic,
no logic changed.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   20 +++++++++++---------
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 78d2837..4f23134 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -614,8 +614,16 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
 		fuse_file_put(req->ff, false);
 }
 
-static void fuse_send_readpages(struct fuse_req *req, struct file *file)
+struct fuse_fill_data {
+	struct fuse_req *req;
+	struct file *file;
+	struct inode *inode;
+};
+
+static void fuse_send_readpages(struct fuse_fill_data *data)
 {
+	struct fuse_req *req = data->req;
+	struct file *file = data->file;
 	struct fuse_file *ff = file->private_data;
 	struct fuse_conn *fc = ff->fc;
 	loff_t pos = page_offset(req->pages[0]);
@@ -637,12 +645,6 @@ static void fuse_send_readpages(struct fuse_req *req, struct file *file)
 	}
 }
 
-struct fuse_fill_data {
-	struct fuse_req *req;
-	struct file *file;
-	struct inode *inode;
-};
-
 static int fuse_readpages_fill(void *_data, struct page *page)
 {
 	struct fuse_fill_data *data = _data;
@@ -656,7 +658,7 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 	    (req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
 	     (req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_read ||
 	     req->pages[req->num_pages - 1]->index + 1 != page->index)) {
-		fuse_send_readpages(req, data->file);
+		fuse_send_readpages(data);
 		data->req = req = fuse_get_req(fc);
 		if (IS_ERR(req)) {
 			unlock_page(page);
@@ -691,7 +693,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
 	err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
 	if (!err) {
 		if (data.req->num_pages)
-			fuse_send_readpages(data.req, file);
+			fuse_send_readpages(&data);
 		else
 			fuse_put_request(fc, data.req);
 	}


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] fuse: wait for end of IO on release
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
  2012-12-20 12:31 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
@ 2012-12-20 12:31 ` Maxim Patlasov
  2013-01-02 20:35   ` Brian Foster
  2013-01-15 15:02   ` [PATCH] fuse: wait for end of IO on release (v2) Maxim Patlasov
  2012-12-20 12:32 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:31 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

There are two types of I/O activity that can be "in progress" at the time
of fuse_release() execution: asynchronous read-ahead and write-back. The
patch ensures that they are completed before fuse_release_common sends
FUSE_RELEASE to userspace.

So far as fuse_release() waits for end of async I/O, its callbacks
(fuse_readpages_end and fuse_writepage_finish) calling fuse_file_put cannot
be the last holders of fuse file anymore. To emphasize the fact, the patch
replaces fuse_file_put with __fuse_file_put there.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 4f23134..aed9be2 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -137,6 +137,12 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
 	}
 }
 
+static void __fuse_file_put(struct fuse_file *ff)
+{
+	if (atomic_dec_and_test(&ff->count))
+		BUG();
+}
+
 int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 		 bool isdir)
 {
@@ -260,7 +266,12 @@ void fuse_release_common(struct file *file, int opcode)
 	 * Make the release synchronous if this is a fuseblk mount,
 	 * synchronous RELEASE is allowed (and desirable) in this case
 	 * because the server can be trusted not to screw up.
+	 *
+	 * We might wait for them (asynchronous READ or WRITE requests), so:
 	 */
+	if (ff->fc->close_wait)
+		BUG_ON(atomic_read(&ff->count) != 1);
+
 	fuse_file_put(ff, ff->fc->destroy_req != NULL);
 }
 
@@ -271,6 +282,31 @@ static int fuse_open(struct inode *inode, struct file *file)
 
 static int fuse_release(struct inode *inode, struct file *file)
 {
+	struct fuse_file *ff = file->private_data;
+
+	if (ff->fc->close_wait) {
+		struct fuse_inode *fi = get_fuse_inode(inode);
+
+		/*
+		 * Must remove file from write list. Otherwise it is possible
+		 * this file will get more writeback from another files
+		 * rerouted via write_files.
+		 */
+		spin_lock(&ff->fc->lock);
+		list_del_init(&ff->write_entry);
+		spin_unlock(&ff->fc->lock);
+
+		wait_event(fi->page_waitq, atomic_read(&ff->count) == 1);
+
+		/*
+		 * Wait for threads just released ff to leave their critical
+		 * sections. Taking spinlock is the first thing
+		 * fuse_release_common does, so that this is unnecessary, but
+		 * it is still good to emphasize right here, that we need this.
+		 */
+		spin_unlock_wait(&ff->fc->lock);
+	}
+
 	fuse_release_common(file, FUSE_RELEASE);
 
 	/* return value is ignored by VFS */
@@ -610,8 +646,17 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
 		unlock_page(page);
 		page_cache_release(page);
 	}
-	if (req->ff)
-		fuse_file_put(req->ff, false);
+	if (req->ff) {
+		if (fc->close_wait) {
+			struct fuse_inode *fi = get_fuse_inode(req->inode);
+
+			spin_lock(&fc->lock);
+			__fuse_file_put(req->ff);
+			wake_up(&fi->page_waitq);
+			spin_unlock(&fc->lock);
+		} else
+			fuse_file_put(req->ff, false);
+	}
 }
 
 struct fuse_fill_data {
@@ -637,6 +682,7 @@ static void fuse_send_readpages(struct fuse_fill_data *data)
 	if (fc->async_read) {
 		req->ff = fuse_file_get(ff);
 		req->end = fuse_readpages_end;
+		req->inode = data->inode;
 		fuse_request_send_background(fc, req);
 	} else {
 		fuse_request_send(fc, req);
@@ -1178,7 +1224,8 @@ static ssize_t fuse_direct_write(struct file *file, const char __user *buf,
 static void fuse_writepage_free(struct fuse_conn *fc, struct fuse_req *req)
 {
 	__free_page(req->pages[0]);
-	fuse_file_put(req->ff, false);
+	if (!fc->close_wait)
+		fuse_file_put(req->ff, false);
 }
 
 static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
@@ -1191,6 +1238,8 @@ static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
 	dec_bdi_stat(bdi, BDI_WRITEBACK);
 	dec_zone_page_state(req->pages[0], NR_WRITEBACK_TEMP);
 	bdi_writeout_inc(bdi);
+	if (fc->close_wait)
+		__fuse_file_put(req->ff);
 	wake_up(&fi->page_waitq);
 }
 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] fuse: enable close_wait feature
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
                   ` (2 preceding siblings ...)
  2012-12-20 12:31 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
@ 2012-12-20 12:32 ` Maxim Patlasov
  2013-01-15 15:07   ` [PATCH] fuse: enable close_wait feature (v2) Maxim Patlasov
  2012-12-20 12:32 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:32 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

The patch enables feature by passing 'true' to fuse_file_put in
fuse_release_common.

Previously, this was safe only in special cases when we sure that
multi-threaded userspace won't deadlock if we'll synchronously send
FUSE_RELEASE in the context of read-ahead or write-back callback. Now, it's
always safe because callbacks don't send requests to userspace anymore.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index aed9be2..dac3a7c 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -272,7 +272,8 @@ void fuse_release_common(struct file *file, int opcode)
 	if (ff->fc->close_wait)
 		BUG_ON(atomic_read(&ff->count) != 1);
 
-	fuse_file_put(ff, ff->fc->destroy_req != NULL);
+	fuse_file_put(ff, ff->fc->destroy_req != NULL ||
+			  ff->fc->close_wait);
 }
 
 static int fuse_open(struct inode *inode, struct file *file)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] fuse: fix synchronous case of fuse_file_put()
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
                   ` (3 preceding siblings ...)
  2012-12-20 12:32 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
@ 2012-12-20 12:32 ` Maxim Patlasov
  2013-04-11 11:21 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously Maxim V. Patlasov
  2013-04-15 15:08 ` Miklos Szeredi
  6 siblings, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2012-12-20 12:32 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

If fuse_file_put() is called with sync==true, the user may be blocked for
a while, until userspace ACKs our FUSE_RELEASE request. This blocking must be
uninterruptible. Otherwise request could be interrupted, but file association
in user space remains.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index dac3a7c..e6770b8 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -126,6 +126,10 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
 		struct fuse_req *req = ff->reserved_req;
 
 		if (sync) {
+			/* Must force. Otherwise request could be interrupted,
+			 * but file association in user space remains.
+			 */
+			req->force = 1;
 			fuse_request_send(ff->fc, req);
 			path_put(&req->misc.release.path);
 			fuse_put_request(ff->fc, req);


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] fuse: wait for end of IO on release
  2012-12-20 12:31 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
@ 2013-01-02 20:35   ` Brian Foster
  2013-01-15 14:04     ` Maxim V. Patlasov
  2013-01-15 15:02   ` [PATCH] fuse: wait for end of IO on release (v2) Maxim Patlasov
  1 sibling, 1 reply; 18+ messages in thread
From: Brian Foster @ 2013-01-02 20:35 UTC (permalink / raw)
  To: Maxim Patlasov
  Cc: miklos, dev, xemul, fuse-devel, linux-kernel, devel, anand.avati

On 12/20/2012 07:31 AM, Maxim Patlasov wrote:
> There are two types of I/O activity that can be "in progress" at the time
> of fuse_release() execution: asynchronous read-ahead and write-back. The
> patch ensures that they are completed before fuse_release_common sends
> FUSE_RELEASE to userspace.
> 
> So far as fuse_release() waits for end of async I/O, its callbacks
> (fuse_readpages_end and fuse_writepage_finish) calling fuse_file_put cannot
> be the last holders of fuse file anymore. To emphasize the fact, the patch
> replaces fuse_file_put with __fuse_file_put there.
> 
> Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
> ---
>  fs/fuse/file.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 52 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 4f23134..aed9be2 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -137,6 +137,12 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
>  	}
>  }
>  
> +static void __fuse_file_put(struct fuse_file *ff)
> +{
> +	if (atomic_dec_and_test(&ff->count))
> +		BUG();
> +}
> +

I think a comment in or before this function to explain the reasoning
for the BUG would be helpful.

>  int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
>  		 bool isdir)
>  {
> @@ -260,7 +266,12 @@ void fuse_release_common(struct file *file, int opcode)
>  	 * Make the release synchronous if this is a fuseblk mount,
>  	 * synchronous RELEASE is allowed (and desirable) in this case
>  	 * because the server can be trusted not to screw up.
> +	 *
> +	 * We might wait for them (asynchronous READ or WRITE requests), so:
>  	 */
> +	if (ff->fc->close_wait)
> +		BUG_ON(atomic_read(&ff->count) != 1);
> +

It might be cleaner to pull the new part of the comment and the BUG_ON()
check to before the existing comment and fuse_file_put (e.g., create a
new comment).

>  	fuse_file_put(ff, ff->fc->destroy_req != NULL);
>  }
>  
> @@ -271,6 +282,31 @@ static int fuse_open(struct inode *inode, struct file *file)
>  
>  static int fuse_release(struct inode *inode, struct file *file)
>  {
> +	struct fuse_file *ff = file->private_data;
> +
> +	if (ff->fc->close_wait) {
> +		struct fuse_inode *fi = get_fuse_inode(inode);
> +
> +		/*
> +		 * Must remove file from write list. Otherwise it is possible
> +		 * this file will get more writeback from another files
> +		 * rerouted via write_files.
> +		 */
> +		spin_lock(&ff->fc->lock);
> +		list_del_init(&ff->write_entry);
> +		spin_unlock(&ff->fc->lock);
> +
> +		wait_event(fi->page_waitq, atomic_read(&ff->count) == 1);
> +
> +		/*
> +		 * Wait for threads just released ff to leave their critical
> +		 * sections. Taking spinlock is the first thing
> +		 * fuse_release_common does, so that this is unnecessary, but
> +		 * it is still good to emphasize right here, that we need this.
> +		 */
> +		spin_unlock_wait(&ff->fc->lock);

I'm all for clarity, but if the wait is unnecessary, perhaps just leave
the comment..? Just my .02.

Aside from the few nits here, the set looks pretty good to me.

Brian

> +	}
> +
>  	fuse_release_common(file, FUSE_RELEASE);
>  
>  	/* return value is ignored by VFS */
> @@ -610,8 +646,17 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
>  		unlock_page(page);
>  		page_cache_release(page);
>  	}
> -	if (req->ff)
> -		fuse_file_put(req->ff, false);
> +	if (req->ff) {
> +		if (fc->close_wait) {
> +			struct fuse_inode *fi = get_fuse_inode(req->inode);
> +
> +			spin_lock(&fc->lock);
> +			__fuse_file_put(req->ff);
> +			wake_up(&fi->page_waitq);
> +			spin_unlock(&fc->lock);
> +		} else
> +			fuse_file_put(req->ff, false);
> +	}
>  }
>  
>  struct fuse_fill_data {
> @@ -637,6 +682,7 @@ static void fuse_send_readpages(struct fuse_fill_data *data)
>  	if (fc->async_read) {
>  		req->ff = fuse_file_get(ff);
>  		req->end = fuse_readpages_end;
> +		req->inode = data->inode;
>  		fuse_request_send_background(fc, req);
>  	} else {
>  		fuse_request_send(fc, req);
> @@ -1178,7 +1224,8 @@ static ssize_t fuse_direct_write(struct file *file, const char __user *buf,
>  static void fuse_writepage_free(struct fuse_conn *fc, struct fuse_req *req)
>  {
>  	__free_page(req->pages[0]);
> -	fuse_file_put(req->ff, false);
> +	if (!fc->close_wait)
> +		fuse_file_put(req->ff, false);
>  }
>  
>  static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
> @@ -1191,6 +1238,8 @@ static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
>  	dec_bdi_stat(bdi, BDI_WRITEBACK);
>  	dec_zone_page_state(req->pages[0], NR_WRITEBACK_TEMP);
>  	bdi_writeout_inc(bdi);
> +	if (fc->close_wait)
> +		__fuse_file_put(req->ff);
>  	wake_up(&fi->page_waitq);
>  }
>  
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/5] fuse: wait for end of IO on release
  2013-01-02 20:35   ` Brian Foster
@ 2013-01-15 14:04     ` Maxim V. Patlasov
  0 siblings, 0 replies; 18+ messages in thread
From: Maxim V. Patlasov @ 2013-01-15 14:04 UTC (permalink / raw)
  To: Brian Foster
  Cc: miklos, dev, xemul, fuse-devel, linux-kernel, devel, anand.avati

Hi Brian,

01/03/2013 12:35 AM, Brian Foster пишет:
> On 12/20/2012 07:31 AM, Maxim Patlasov wrote:
>> There are two types of I/O activity that can be "in progress" at the time
>> of fuse_release() execution: asynchronous read-ahead and write-back. The
>> patch ensures that they are completed before fuse_release_common sends
>> FUSE_RELEASE to userspace.
>>
>> So far as fuse_release() waits for end of async I/O, its callbacks
>> (fuse_readpages_end and fuse_writepage_finish) calling fuse_file_put cannot
>> be the last holders of fuse file anymore. To emphasize the fact, the patch
>> replaces fuse_file_put with __fuse_file_put there.
>>
>> Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
>> ---
>>   fs/fuse/file.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
>>   1 files changed, 52 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>> index 4f23134..aed9be2 100644
>> --- a/fs/fuse/file.c
>> +++ b/fs/fuse/file.c
>> @@ -137,6 +137,12 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
>>   	}
>>   }
>>   
>> +static void __fuse_file_put(struct fuse_file *ff)
>> +{
>> +	if (atomic_dec_and_test(&ff->count))
>> +		BUG();
>> +}
>> +
> I think a comment in or before this function to explain the reasoning
> for the BUG would be helpful.
>
>>   int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
>>   		 bool isdir)
>>   {
>> @@ -260,7 +266,12 @@ void fuse_release_common(struct file *file, int opcode)
>>   	 * Make the release synchronous if this is a fuseblk mount,
>>   	 * synchronous RELEASE is allowed (and desirable) in this case
>>   	 * because the server can be trusted not to screw up.
>> +	 *
>> +	 * We might wait for them (asynchronous READ or WRITE requests), so:
>>   	 */
>> +	if (ff->fc->close_wait)
>> +		BUG_ON(atomic_read(&ff->count) != 1);
>> +
> It might be cleaner to pull the new part of the comment and the BUG_ON()
> check to before the existing comment and fuse_file_put (e.g., create a
> new comment).
>
>>   	fuse_file_put(ff, ff->fc->destroy_req != NULL);
>>   }
>>   
>> @@ -271,6 +282,31 @@ static int fuse_open(struct inode *inode, struct file *file)
>>   
>>   static int fuse_release(struct inode *inode, struct file *file)
>>   {
>> +	struct fuse_file *ff = file->private_data;
>> +
>> +	if (ff->fc->close_wait) {
>> +		struct fuse_inode *fi = get_fuse_inode(inode);
>> +
>> +		/*
>> +		 * Must remove file from write list. Otherwise it is possible
>> +		 * this file will get more writeback from another files
>> +		 * rerouted via write_files.
>> +		 */
>> +		spin_lock(&ff->fc->lock);
>> +		list_del_init(&ff->write_entry);
>> +		spin_unlock(&ff->fc->lock);
>> +
>> +		wait_event(fi->page_waitq, atomic_read(&ff->count) == 1);
>> +
>> +		/*
>> +		 * Wait for threads just released ff to leave their critical
>> +		 * sections. Taking spinlock is the first thing
>> +		 * fuse_release_common does, so that this is unnecessary, but
>> +		 * it is still good to emphasize right here, that we need this.
>> +		 */
>> +		spin_unlock_wait(&ff->fc->lock);
> I'm all for clarity, but if the wait is unnecessary, perhaps just leave
> the comment..? Just my .02.
>
> Aside from the few nits here, the set looks pretty good to me.

Thanks for review, the suggestions look reasonable. I'll resend 
corrected patch soon.

Maxim

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] fuse: wait for end of IO on release (v2)
  2012-12-20 12:31 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
  2013-01-02 20:35   ` Brian Foster
@ 2013-01-15 15:02   ` Maxim Patlasov
  1 sibling, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2013-01-15 15:02 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

There are two types of I/O activity that can be "in progress" at the time
of fuse_release() execution: asynchronous read-ahead and write-back. The
patch ensures that they are completed before fuse_release_common sends
FUSE_RELEASE to userspace.

So far as fuse_release() waits for end of async I/O, its callbacks
(fuse_readpages_end and fuse_writepage_finish) calling fuse_file_put cannot
be the last holders of fuse file anymore. To emphasize the fact, the patch
replaces fuse_file_put with __fuse_file_put there.

Changed in v2: Improve comments, comment spin_unlock_wait out. Thanks to Brian
for suggestions.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 4f23134..150033a 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -137,6 +137,17 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
 	}
 }
 
+/*
+ * Asynchronous callbacks may use it instead of fuse_file_put() because
+ * we guarantee that they are never last holders of ff. Hitting BUG() below
+ * will make clear any violation of the guarantee.
+ */
+static void __fuse_file_put(struct fuse_file *ff)
+{
+	if (atomic_dec_and_test(&ff->count))
+		BUG();
+}
+
 int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 		 bool isdir)
 {
@@ -253,6 +264,13 @@ void fuse_release_common(struct file *file, int opcode)
 	req->misc.release.path = file->f_path;
 
 	/*
+	 * No more in-flight asynchronous READ or WRITE requests if
+	 * fuse file release is synchronous
+	 */
+	if (ff->fc->close_wait)
+		BUG_ON(atomic_read(&ff->count) != 1);
+
+	/*
 	 * Normally this will send the RELEASE request, however if
 	 * some asynchronous READ or WRITE requests are outstanding,
 	 * the sending will be delayed.
@@ -271,6 +289,30 @@ static int fuse_open(struct inode *inode, struct file *file)
 
 static int fuse_release(struct inode *inode, struct file *file)
 {
+	struct fuse_file *ff = file->private_data;
+
+	if (ff->fc->close_wait) {
+		struct fuse_inode *fi = get_fuse_inode(inode);
+
+		/*
+		 * Must remove file from write list. Otherwise it is possible
+		 * this file will get more writeback from another files
+		 * rerouted via write_files.
+		 */
+		spin_lock(&ff->fc->lock);
+		list_del_init(&ff->write_entry);
+		spin_unlock(&ff->fc->lock);
+
+		wait_event(fi->page_waitq, atomic_read(&ff->count) == 1);
+
+		/*
+		 * spin_unlock_wait(&ff->fc->lock) would be natural here to
+		 * wait for threads just released ff to leave their critical
+		 * sections. But taking spinlock is the first thing
+		 * fuse_release_common does, so that this is unnecessary.
+		 */
+	}
+
 	fuse_release_common(file, FUSE_RELEASE);
 
 	/* return value is ignored by VFS */
@@ -610,8 +652,17 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
 		unlock_page(page);
 		page_cache_release(page);
 	}
-	if (req->ff)
-		fuse_file_put(req->ff, false);
+	if (req->ff) {
+		if (fc->close_wait) {
+			struct fuse_inode *fi = get_fuse_inode(req->inode);
+
+			spin_lock(&fc->lock);
+			__fuse_file_put(req->ff);
+			wake_up(&fi->page_waitq);
+			spin_unlock(&fc->lock);
+		} else
+			fuse_file_put(req->ff, false);
+	}
 }
 
 struct fuse_fill_data {
@@ -637,6 +688,7 @@ static void fuse_send_readpages(struct fuse_fill_data *data)
 	if (fc->async_read) {
 		req->ff = fuse_file_get(ff);
 		req->end = fuse_readpages_end;
+		req->inode = data->inode;
 		fuse_request_send_background(fc, req);
 	} else {
 		fuse_request_send(fc, req);
@@ -1178,7 +1230,8 @@ static ssize_t fuse_direct_write(struct file *file, const char __user *buf,
 static void fuse_writepage_free(struct fuse_conn *fc, struct fuse_req *req)
 {
 	__free_page(req->pages[0]);
-	fuse_file_put(req->ff, false);
+	if (!fc->close_wait)
+		fuse_file_put(req->ff, false);
 }
 
 static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
@@ -1191,6 +1244,8 @@ static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
 	dec_bdi_stat(bdi, BDI_WRITEBACK);
 	dec_zone_page_state(req->pages[0], NR_WRITEBACK_TEMP);
 	bdi_writeout_inc(bdi);
+	if (fc->close_wait)
+		__fuse_file_put(req->ff);
 	wake_up(&fi->page_waitq);
 }
 


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH] fuse: enable close_wait feature (v2)
  2012-12-20 12:32 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
@ 2013-01-15 15:07   ` Maxim Patlasov
  0 siblings, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2013-01-15 15:07 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

The patch enables feature by passing 'true' to fuse_file_put in
fuse_release_common.

Previously, this was safe only in special cases when we sure that
multi-threaded userspace won't deadlock if we'll synchronously send
FUSE_RELEASE in the context of read-ahead or write-back callback. Now, it's
always safe because callbacks don't send requests to userspace anymore.

Changed in v2: update the patch to be applied cleanly on the top of
previous modified (v2) patch of the patch-set.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 150033a..83c8904 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -279,7 +279,8 @@ void fuse_release_common(struct file *file, int opcode)
 	 * synchronous RELEASE is allowed (and desirable) in this case
 	 * because the server can be trusted not to screw up.
 	 */
-	fuse_file_put(ff, ff->fc->destroy_req != NULL);
+	fuse_file_put(ff, ff->fc->destroy_req != NULL ||
+			  ff->fc->close_wait);
 }
 
 static int fuse_open(struct inode *inode, struct file *file)


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
                   ` (4 preceding siblings ...)
  2012-12-20 12:32 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
@ 2013-04-11 11:21 ` Maxim V. Patlasov
  2013-04-15 15:08 ` Miklos Szeredi
  6 siblings, 0 replies; 18+ messages in thread
From: Maxim V. Patlasov @ 2013-04-11 11:21 UTC (permalink / raw)
  To: miklos; +Cc: dev, xemul, fuse-devel, linux-kernel, devel

Hi Miklos,

Any feedback would be highly appreciated.

Thanks,
Maxim

12/20/2012 04:30 PM, Maxim Patlasov пишет:
> Hi,
>
> There is a long-standing demand for syncronous behaviour of fuse_release:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>
> A few months ago Avati and me explained why such a feature would be useful:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>
> In short, the problem is that fuse_release (that's called on last user
> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
> ACK from userspace. Consequently, there is a gap when user regards the
> file released while userspace fuse is still working on it. An attempt to
> access the file from another node leads to complicated synchronization
> problems because the first node still "holds" the file.
>
> The patch-set resolves the problem by making fuse_release synchronous:
> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>
> To keep single-threaded userspace implementations happy the patch-set
> ensures that by the time fuse_release_common calls fuse_file_put, no
> more in-flight I/O exists. Asynchronous fuse callbacks (like
> fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
> never block in contexts other than close().
>
> Thanks,
> Maxim
>
> ---
>
> Maxim Patlasov (5):
>        fuse: add close_wait flag to fuse_conn
>        fuse: cosmetic rework of fuse_send_readpages
>        fuse: wait for end of IO on release
>        fuse: enable close_wait feature
>        fuse: fix synchronous case of fuse_file_put()
>
>
>   fs/fuse/file.c            |   82 ++++++++++++++++++++++++++++++++++++++-------
>   fs/fuse/fuse_i.h          |    3 ++
>   fs/fuse/inode.c           |    5 ++-
>   include/uapi/linux/fuse.h |    7 +++-
>   4 files changed, 82 insertions(+), 15 deletions(-)
>


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
                   ` (5 preceding siblings ...)
  2013-04-11 11:21 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously Maxim V. Patlasov
@ 2013-04-15 15:08 ` Miklos Szeredi
  2013-04-15 15:30   ` Miklos Szeredi
  2013-04-16 18:13   ` Maxim Patlasov
  6 siblings, 2 replies; 18+ messages in thread
From: Miklos Szeredi @ 2013-04-15 15:08 UTC (permalink / raw)
  To: Maxim Patlasov
  Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

On Thu, Dec 20, 2012 at 1:30 PM, Maxim Patlasov <mpatlasov@parallels.com> wrote:
> Hi,
>
> There is a long-standing demand for syncronous behaviour of fuse_release:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>
> A few months ago Avati and me explained why such a feature would be useful:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>
> In short, the problem is that fuse_release (that's called on last user
> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
> ACK from userspace. Consequently, there is a gap when user regards the
> file released while userspace fuse is still working on it. An attempt to
> access the file from another node leads to complicated synchronization
> problems because the first node still "holds" the file.
>
> The patch-set resolves the problem by making fuse_release synchronous:
> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>
> To keep single-threaded userspace implementations happy the patch-set
> ensures that by the time fuse_release_common calls fuse_file_put, no
> more in-flight I/O exists. Asynchronous fuse callbacks (like
> fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
> never block in contexts other than close().

There are a few fput() calls outside sys_close(), all of these can
trigger FUSE_RELEASE.  Most of those are OK, but for some I'm
reluctant to enable synchronous release.

For example doing a readlink() on a magic symlink under /proc
shouldn't result in a synchronous call to a fuse filesystem.  Making
fput() synchronous may actually end up doing that (even if it's not
very likely).

At least for the unprivileged fuse daemon case it shouldn't be done.
If the fuse daemon can be "trusted" then enabling synchronous release
should be okay, that's why it's enabled for fuseblk.

But maybe I'm just too paranoid...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-15 15:08 ` Miklos Szeredi
@ 2013-04-15 15:30   ` Miklos Szeredi
  2013-04-15 18:17     ` Al Viro
  2013-04-17 20:53     ` Miklos Szeredi
  2013-04-16 18:13   ` Maxim Patlasov
  1 sibling, 2 replies; 18+ messages in thread
From: Miklos Szeredi @ 2013-04-15 15:30 UTC (permalink / raw)
  To: Maxim Patlasov
  Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> For example doing a readlink() on a magic symlink under /proc
> shouldn't result in a synchronous call to a fuse filesystem.  Making
> fput() synchronous may actually end up doing that (even if it's not
> very likely).

Thinking about this a bit more.  As it is it sounds wrong to rely on a
synchronous release, when in fact release is just not synchronous, as
indicated by the above example.  Maybe it's the proc code that's buggy
and shouldn't do get_file/fput because everyone is assuming release
being synchronous with close().  Don't know.

Let's approach it from the other direction:  what if you give back the
write lease on the first flush?  It will probably work fine for 99% of
cases, since no other writes are going to happen after the first
flush.  For the remaining cases you'll just have to reacquire the
lease when a write happens after the flush.  I guess performance-wise
that will not be an issue, but I may be wrong.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-15 15:30   ` Miklos Szeredi
@ 2013-04-15 18:17     ` Al Viro
  2013-04-16  9:09       ` Miklos Szeredi
  2013-04-17 20:53     ` Miklos Szeredi
  1 sibling, 1 reply; 18+ messages in thread
From: Al Viro @ 2013-04-15 18:17 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Maxim Patlasov, dev, xemul, fuse-devel, bfoster, linux-kernel,
	devel, anand.avati

On Mon, Apr 15, 2013 at 05:30:41PM +0200, Miklos Szeredi wrote:
> On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > For example doing a readlink() on a magic symlink under /proc
> > shouldn't result in a synchronous call to a fuse filesystem.  Making
> > fput() synchronous may actually end up doing that (even if it's not
> > very likely).
> 
> Thinking about this a bit more.  As it is it sounds wrong to rely on a
> synchronous release, when in fact release is just not synchronous, as
> indicated by the above example.  Maybe it's the proc code that's buggy
> and shouldn't do get_file/fput because everyone is assuming release
> being synchronous with close().  Don't know.

What the hell?  ->release() is not and has never been synchronous with close().
There is any number of places where the final fput() might be called and no,
this readlink example is irrelevant - things like munmap()/dup2()/close
of a socket discarding a datagram with the last reference to struct file in
it, et sodding cetera.

Hell, another thread might be in the middle of read(2) at the moment when you
call close().  Result: the final fput() will be done when we are about to
return from that read(2).

People, ->release() is *NOT* guaranteed to be anywhere near close(2).  Never
had been.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-15 18:17     ` Al Viro
@ 2013-04-16  9:09       ` Miklos Szeredi
  0 siblings, 0 replies; 18+ messages in thread
From: Miklos Szeredi @ 2013-04-16  9:09 UTC (permalink / raw)
  To: Al Viro
  Cc: Maxim Patlasov, dev, xemul, fuse-devel, bfoster, linux-kernel,
	devel, anand.avati

On Mon, Apr 15, 2013 at 8:17 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Apr 15, 2013 at 05:30:41PM +0200, Miklos Szeredi wrote:
>> On Mon, Apr 15, 2013 at 5:08 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
>> > For example doing a readlink() on a magic symlink under /proc
>> > shouldn't result in a synchronous call to a fuse filesystem.  Making
>> > fput() synchronous may actually end up doing that (even if it's not
>> > very likely).
>>
>> Thinking about this a bit more.  As it is it sounds wrong to rely on a
>> synchronous release, when in fact release is just not synchronous, as
>> indicated by the above example.  Maybe it's the proc code that's buggy
>> and shouldn't do get_file/fput because everyone is assuming release
>> being synchronous with close().  Don't know.
>
> What the hell?  ->release() is not and has never been synchronous with close().
> There is any number of places where the final fput() might be called and no,
> this readlink example is irrelevant - things like munmap()/dup2()/close
> of a socket discarding a datagram with the last reference to struct file in
> it, et sodding cetera.
>
> Hell, another thread might be in the middle of read(2) at the moment when you
> call close().  Result: the final fput() will be done when we are about to
> return from that read(2).

Apparently we do make some pains to make ->release() return before the
syscall that triggered it returns.  Why is that then?

I think the difference between proc symlink and all the rest is that
everything the app does to the file descriptor is its own business.
If it just does single threaded open, read/write, close (which is what
the vast majority of apps do) then close *is* going to be synchronous
with release.  At least most of the time.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-15 15:08 ` Miklos Szeredi
  2013-04-15 15:30   ` Miklos Szeredi
@ 2013-04-16 18:13   ` Maxim Patlasov
  1 sibling, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2013-04-16 18:13 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: dev, xemul, fuse-devel, bfoster, linux-kernel, devel, anand.avati

Hi Miklos,

On 4/15/13 7:08 PM, Miklos Szeredi wrote:
> On Thu, Dec 20, 2012 at 1:30 PM, Maxim Patlasov<mpatlasov@parallels.com>  wrote:
>> Hi,
>>
>> There is a long-standing demand for syncronous behaviour of fuse_release:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
>> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>>
>> A few months ago Avati and me explained why such a feature would be useful:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
>> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>>
>> In short, the problem is that fuse_release (that's called on last user
>> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
>> ACK from userspace. Consequently, there is a gap when user regards the
>> file released while userspace fuse is still working on it. An attempt to
>> access the file from another node leads to complicated synchronization
>> problems because the first node still "holds" the file.
>>
>> The patch-set resolves the problem by making fuse_release synchronous:
>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>>
>> To keep single-threaded userspace implementations happy the patch-set
>> ensures that by the time fuse_release_common calls fuse_file_put, no
>> more in-flight I/O exists. Asynchronous fuse callbacks (like
>> fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
>> never block in contexts other than close().
> There are a few fput() calls outside sys_close(), all of these can
> trigger FUSE_RELEASE.  Most of those are OK, but for some I'm
> reluctant to enable synchronous release.
>
> For example doing a readlink() on a magic symlink under /proc
> shouldn't result in a synchronous call to a fuse filesystem.  Making
> fput() synchronous may actually end up doing that (even if it's not
> very likely).
>
> At least for the unprivileged fuse daemon case it shouldn't be done.
> If the fuse daemon can be "trusted" then enabling synchronous release
> should be okay, that's why it's enabled for fuseblk.
>
> But maybe I'm just too paranoid...

No, I don't think it's too paranoid. I suggest to put the feature under 
fusermount control by adding "close_wait" mount option. This is very 
simple and straightforward and let sysad to decide whether to allow the 
feature for unprivileged users or not.

Btw, having read last messages on this thread, I realized that the name 
of patchset is a bit misleading - it would be better to name it "process 
last fput() synchronously". But the core idea still looks sensible to 
me: userspace may hold a reference to a file in one way or another (e.g. 
by mmap-ed region), but when all references are released the file should 
be ready for reuse again (e.g. to be accessed from another node).

The patch-set was reviewed by Brian Foster and now you looked at it as 
well. Is it time for me to rebase the patchset to be applied on top of 
writeback-cache patches?

Thanks,
Maxim


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-15 15:30   ` Miklos Szeredi
  2013-04-15 18:17     ` Al Viro
@ 2013-04-17 20:53     ` Miklos Szeredi
  2013-04-18  3:25       ` Maxim Patlasov
  1 sibling, 1 reply; 18+ messages in thread
From: Miklos Szeredi @ 2013-04-17 20:53 UTC (permalink / raw)
  To: Maxim Patlasov
  Cc: Kirill Korotaev, Pavel Emelianov, fuse-devel, Brian Foster,
	Kernel Mailing List, devel, Anand Avati

On Mon, Apr 15, 2013 at 5:30 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> Let's approach it from the other direction:  what if you give back the
> write lease on the first flush?  It will probably work fine for 99% of
> cases, since no other writes are going to happen after the first
> flush.  For the remaining cases you'll just have to reacquire the
> lease when a write happens after the flush.  I guess performance-wise
> that will not be an issue, but I may be wrong.

What about this?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously
  2013-04-17 20:53     ` Miklos Szeredi
@ 2013-04-18  3:25       ` Maxim Patlasov
  0 siblings, 0 replies; 18+ messages in thread
From: Maxim Patlasov @ 2013-04-18  3:25 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Kirill Korotaev, Pavel Emelianov, fuse-devel, Brian Foster,
	Kernel Mailing List, devel, Anand Avati

On 4/17/13 1:53 PM, Miklos Szeredi wrote:
> On Mon, Apr 15, 2013 at 5:30 PM, Miklos Szeredi<miklos@szeredi.hu>  wrote:
>> Let's approach it from the other direction:  what if you give back the
>> write lease on the first flush?  It will probably work fine for 99% of
>> cases, since no other writes are going to happen after the first
>> flush.  For the remaining cases you'll just have to reacquire the
>> lease when a write happens after the flush.  I guess performance-wise
>> that will not be an issue, but I may be wrong.
> What about this?

We'd like to do it, but we can't. Firstly because we rely on the fact 
that the file cannot be modified by someone else while we hold exclusive 
write lease. By the time we decide to reacquire the lease, the file can 
be re-used by someone else and become completely different comparatively 
with its state at the moment of first flush. Secondly, we can't sensibly 
handle a case when the lease is already acquired by someone else by the 
time of attempt to reacquire it.

Thanks,
Maxim

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2013-04-18  3:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-12-20 12:30 [PATCH 0/5] fuse: close file synchronously Maxim Patlasov
2012-12-20 12:31 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
2012-12-20 12:31 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
2012-12-20 12:31 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
2013-01-02 20:35   ` Brian Foster
2013-01-15 14:04     ` Maxim V. Patlasov
2013-01-15 15:02   ` [PATCH] fuse: wait for end of IO on release (v2) Maxim Patlasov
2012-12-20 12:32 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
2013-01-15 15:07   ` [PATCH] fuse: enable close_wait feature (v2) Maxim Patlasov
2012-12-20 12:32 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
2013-04-11 11:21 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously Maxim V. Patlasov
2013-04-15 15:08 ` Miklos Szeredi
2013-04-15 15:30   ` Miklos Szeredi
2013-04-15 18:17     ` Al Viro
2013-04-16  9:09       ` Miklos Szeredi
2013-04-17 20:53     ` Miklos Szeredi
2013-04-18  3:25       ` Maxim Patlasov
2013-04-16 18:13   ` Maxim Patlasov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).