All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] fuse: close file synchronously (v2)
@ 2014-06-06 13:27 Maxim Patlasov
  2014-06-06 13:27 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:27 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

Hi,

There is a long-standing demand for synchronous behaviour of fuse_release:

http://sourceforge.net/mailarchive/message.php?msg_id=19343889
http://sourceforge.net/mailarchive/message.php?msg_id=29814693

A year ago Avati and me explained why such a feature would be useful:

http://sourceforge.net/mailarchive/message.php?msg_id=29889055
http://sourceforge.net/mailarchive/message.php?msg_id=29867423

In short, the problem is that fuse_release (that's called on last user
close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
ACK from userspace. Consequently, there is a gap when user regards the
file released while userspace fuse is still working on it. An attempt to
access the file from another node leads to complicated synchronization
problems because the first node still "holds" the file.

The patch-set resolves the problem by making fuse_release synchronous:
wait for ACK from userspace for FUSE_RELEASE if the feature is ON.

To keep single-threaded userspace implementations happy the patch-set
ensures that by the time fuse_release_common calls fuse_file_put, no
more in-flight I/O exists. Asynchronous fuse callbacks (like
fuse_readpages_end) cannot trigger FUSE_RELEASE anymore. Hence, we'll
never block in contexts other than close().

Changed in v2:
 - improved comments, commented spin_unlock_wait out according to Brian'
suggestions.
 - rebased on v3.15-rc8 tag of Linus' tree.

Thanks,
Maxim

---

Maxim Patlasov (5):
      fuse: add close_wait flag to fuse_conn
      fuse: cosmetic rework of fuse_send_readpages
      fuse: wait for end of IO on release
      fuse: enable close_wait feature
      fuse: fix synchronous case of fuse_file_put()


 fs/fuse/file.c            |  100 ++++++++++++++++++++++++++++++++++++++-------
 fs/fuse/fuse_i.h          |    3 +
 fs/fuse/inode.c           |    4 +-
 include/uapi/linux/fuse.h |    3 +
 4 files changed, 93 insertions(+), 17 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] fuse: add close_wait flag to fuse_conn
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
@ 2014-06-06 13:27 ` Maxim Patlasov
  2014-06-06 13:28 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:27 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

The feature will be governed by fc->close_wait. Userspace can enable it in
the same way as auto_inval_data or any other kernel fuse capability.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/fuse_i.h          |    3 +++
 fs/fuse/inode.c           |    4 +++-
 include/uapi/linux/fuse.h |    3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 7aa5c75..434ff08 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -557,6 +557,9 @@ struct fuse_conn {
 	/** Does the filesystem support asynchronous direct-IO submission? */
 	unsigned async_dio:1;
 
+	/** Wait for response from daemon on close */
+	unsigned close_wait:1;
+
 	/** The number of requests waiting for completion */
 	atomic_t num_waiting;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 754dcf2..580d1b3 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -898,6 +898,8 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 			else
 				fc->sb->s_time_gran = 1000000000;
 
+			if (arg->flags & FUSE_CLOSE_WAIT)
+				fc->close_wait = 1;
 		} else {
 			ra_pages = fc->max_read / PAGE_CACHE_SIZE;
 			fc->no_lock = 1;
@@ -926,7 +928,7 @@ static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)
 		FUSE_SPLICE_WRITE | FUSE_SPLICE_MOVE | FUSE_SPLICE_READ |
 		FUSE_FLOCK_LOCKS | FUSE_IOCTL_DIR | FUSE_AUTO_INVAL_DATA |
 		FUSE_DO_READDIRPLUS | FUSE_READDIRPLUS_AUTO | FUSE_ASYNC_DIO |
-		FUSE_WRITEBACK_CACHE;
+		FUSE_WRITEBACK_CACHE | FUSE_CLOSE_WAIT;
 	req->in.h.opcode = FUSE_INIT;
 	req->in.numargs = 1;
 	req->in.args[0].size = sizeof(*arg);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 40b5ca8..1e1b6fa 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -101,6 +101,7 @@
  *  - add FATTR_CTIME
  *  - add ctime and ctimensec to fuse_setattr_in
  *  - add FUSE_RENAME2 request
+ *  - add FUSE_CLOSE_WAIT
  */
 
 #ifndef _LINUX_FUSE_H
@@ -229,6 +230,7 @@ struct fuse_file_lock {
  * FUSE_READDIRPLUS_AUTO: adaptive readdirplus
  * FUSE_ASYNC_DIO: asynchronous direct I/O submission
  * FUSE_WRITEBACK_CACHE: use writeback cache for buffered writes
+ * FUSE_CLOSE_WAIT: wait for response from daemon on close
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -247,6 +249,7 @@ struct fuse_file_lock {
 #define FUSE_READDIRPLUS_AUTO	(1 << 14)
 #define FUSE_ASYNC_DIO		(1 << 15)
 #define FUSE_WRITEBACK_CACHE	(1 << 16)
+#define FUSE_CLOSE_WAIT		(1 << 17)
 
 /**
  * CUSE INIT request/reply flags


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
  2014-06-06 13:27 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
@ 2014-06-06 13:28 ` Maxim Patlasov
  2014-06-06 13:29 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:28 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

The patch change arguments of fuse_send_readpages to give it access to inode
(will be used in the next patch of patch-set). The change is cosmetic,
no logic changed.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 96d513e..b81a945 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -827,8 +827,17 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
 		fuse_file_put(req->ff, false);
 }
 
-static void fuse_send_readpages(struct fuse_req *req, struct file *file)
+struct fuse_fill_data {
+	struct fuse_req *req;
+	struct file *file;
+	struct inode *inode;
+	unsigned nr_pages;
+};
+
+static void fuse_send_readpages(struct fuse_fill_data *data)
 {
+	struct fuse_req *req = data->req;
+	struct file *file = data->file;
 	struct fuse_file *ff = file->private_data;
 	struct fuse_conn *fc = ff->fc;
 	loff_t pos = page_offset(req->pages[0]);
@@ -850,13 +859,6 @@ static void fuse_send_readpages(struct fuse_req *req, struct file *file)
 	}
 }
 
-struct fuse_fill_data {
-	struct fuse_req *req;
-	struct file *file;
-	struct inode *inode;
-	unsigned nr_pages;
-};
-
 static int fuse_readpages_fill(void *_data, struct page *page)
 {
 	struct fuse_fill_data *data = _data;
@@ -872,7 +874,7 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 	     req->pages[req->num_pages - 1]->index + 1 != page->index)) {
 		int nr_alloc = min_t(unsigned, data->nr_pages,
 				     FUSE_MAX_PAGES_PER_REQ);
-		fuse_send_readpages(req, data->file);
+		fuse_send_readpages(data);
 		if (fc->async_read)
 			req = fuse_get_req_for_background(fc, nr_alloc);
 		else
@@ -925,7 +927,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
 	err = read_cache_pages(mapping, pages, fuse_readpages_fill, &data);
 	if (!err) {
 		if (data.req->num_pages)
-			fuse_send_readpages(data.req, file);
+			fuse_send_readpages(&data);
 		else
 			fuse_put_request(fc, data.req);
 	}


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] fuse: wait for end of IO on release
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
  2014-06-06 13:27 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
  2014-06-06 13:28 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
@ 2014-06-06 13:29 ` Maxim Patlasov
  2014-06-06 13:30 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:29 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

There are two types of I/O activity that can be "in progress" at the time
of fuse_release() execution: asynchronous read-ahead and write-back. The
patch ensures that they are completed before fuse_release_common sends
FUSE_RELEASE to userspace.

So far as fuse_release() waits for end of async I/O, its callbacks
(fuse_readpages_end and fuse_writepage_finish) calling fuse_file_put cannot
be the last holders of fuse file anymore. To emphasize the fact, the patch
replaces fuse_file_put with __fuse_file_put there.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |   71 ++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b81a945..d50af99 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -149,6 +149,17 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
 	}
 }
 
+/*
+ * Asynchronous callbacks may use it instead of fuse_file_put() because
+ * we guarantee that they are never last holders of ff. Hitting BUG() below
+ * will make clear any violation of the guarantee.
+ */
+static void __fuse_file_put(struct fuse_file *ff)
+{
+	if (atomic_dec_and_test(&ff->count))
+		BUG();
+}
+
 int fuse_do_open(struct fuse_conn *fc, u64 nodeid, struct file *file,
 		 bool isdir)
 {
@@ -302,6 +313,13 @@ void fuse_release_common(struct file *file, int opcode)
 	req->misc.release.path = file->f_path;
 
 	/*
+	 * No more in-flight asynchronous READ or WRITE requests if
+	 * fuse file release is synchronous
+	 */
+	if (ff->fc->close_wait)
+		BUG_ON(atomic_read(&ff->count) != 1);
+
+	/*
 	 * Normally this will send the RELEASE request, however if
 	 * some asynchronous READ or WRITE requests are outstanding,
 	 * the sending will be delayed.
@@ -321,11 +339,34 @@ static int fuse_open(struct inode *inode, struct file *file)
 static int fuse_release(struct inode *inode, struct file *file)
 {
 	struct fuse_conn *fc = get_fuse_conn(inode);
+	struct fuse_file *ff = file->private_data;
 
 	/* see fuse_vma_close() for !writeback_cache case */
 	if (fc->writeback_cache)
 		write_inode_now(inode, 1);
 
+	if (ff->fc->close_wait) {
+		struct fuse_inode *fi = get_fuse_inode(inode);
+
+		/*
+		 * Must remove file from write list. Otherwise it is possible
+		 * this file will get more writeback from another files
+		 * rerouted via write_files.
+		 */
+		spin_lock(&ff->fc->lock);
+		list_del_init(&ff->write_entry);
+		spin_unlock(&ff->fc->lock);
+
+		wait_event(fi->page_waitq, atomic_read(&ff->count) == 1);
+
+		/*
+		 * spin_unlock_wait(&ff->fc->lock) would be natural here to
+		 * wait for threads just released ff to leave their critical
+		 * sections. But taking spinlock is the first thing
+		 * fuse_release_common does, so that this is unnecessary.
+		 */
+	}
+
 	fuse_release_common(file, FUSE_RELEASE);
 
 	/* return value is ignored by VFS */
@@ -823,8 +864,17 @@ static void fuse_readpages_end(struct fuse_conn *fc, struct fuse_req *req)
 		unlock_page(page);
 		page_cache_release(page);
 	}
-	if (req->ff)
-		fuse_file_put(req->ff, false);
+	if (req->ff) {
+		if (fc->close_wait) {
+			struct fuse_inode *fi = get_fuse_inode(req->inode);
+
+			spin_lock(&fc->lock);
+			__fuse_file_put(req->ff);
+			wake_up(&fi->page_waitq);
+			spin_unlock(&fc->lock);
+		} else
+			fuse_file_put(req->ff, false);
+	}
 }
 
 struct fuse_fill_data {
@@ -851,6 +901,7 @@ static void fuse_send_readpages(struct fuse_fill_data *data)
 	if (fc->async_read) {
 		req->ff = fuse_file_get(ff);
 		req->end = fuse_readpages_end;
+		req->inode = data->inode;
 		fuse_request_send_background(fc, req);
 	} else {
 		fuse_request_send(fc, req);
@@ -1537,7 +1588,7 @@ static void fuse_writepage_free(struct fuse_conn *fc, struct fuse_req *req)
 	for (i = 0; i < req->num_pages; i++)
 		__free_page(req->pages[i]);
 
-	if (req->ff)
+	if (req->ff && !fc->close_wait)
 		fuse_file_put(req->ff, false);
 }
 
@@ -1554,6 +1605,8 @@ static void fuse_writepage_finish(struct fuse_conn *fc, struct fuse_req *req)
 		dec_zone_page_state(req->pages[i], NR_WRITEBACK_TEMP);
 		bdi_writeout_inc(bdi);
 	}
+	if (fc->close_wait)
+		__fuse_file_put(req->ff);
 	wake_up(&fi->page_waitq);
 }
 
@@ -1694,8 +1747,16 @@ int fuse_write_inode(struct inode *inode, struct writeback_control *wbc)
 
 	ff = __fuse_write_file_get(fc, fi);
 	err = fuse_flush_times(inode, ff);
-	if (ff)
-		fuse_file_put(ff, 0);
+	if (ff) {
+		if (fc->close_wait) {
+			spin_lock(&fc->lock);
+			__fuse_file_put(ff);
+			wake_up(&fi->page_waitq);
+			spin_unlock(&fc->lock);
+
+		} else
+			fuse_file_put(ff, false);
+	}
 
 	return err;
 }


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] fuse: enable close_wait feature
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
                   ` (2 preceding siblings ...)
  2014-06-06 13:29 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
@ 2014-06-06 13:30 ` Maxim Patlasov
  2014-06-06 13:31 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:30 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

The patch enables feature by passing 'true' to fuse_file_put in
fuse_release_common.

Previously, this was safe only in special cases when we sure that
multi-threaded userspace won't deadlock if we'll synchronously send
FUSE_RELEASE in the context of read-ahead or write-back callback. Now, it's
always safe because callbacks don't send requests to userspace anymore.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index d50af99..783cb52 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -328,7 +328,8 @@ void fuse_release_common(struct file *file, int opcode)
 	 * synchronous RELEASE is allowed (and desirable) in this case
 	 * because the server can be trusted not to screw up.
 	 */
-	fuse_file_put(ff, ff->fc->destroy_req != NULL);
+	fuse_file_put(ff, ff->fc->destroy_req != NULL ||
+			  ff->fc->close_wait);
 }
 
 static int fuse_open(struct inode *inode, struct file *file)


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] fuse: fix synchronous case of fuse_file_put()
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
                   ` (3 preceding siblings ...)
  2014-06-06 13:30 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
@ 2014-06-06 13:31 ` Maxim Patlasov
  2014-06-06 13:51 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2) John Muir
  2014-08-13 12:44 ` Miklos Szeredi
  6 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-06 13:31 UTC (permalink / raw)
  To: miklos; +Cc: fuse-devel, linux-kernel

If fuse_file_put() is called with sync==true, the user may be blocked for
a while, until userspace ACKs our FUSE_RELEASE request. This blocking must be
uninterruptible. Otherwise request could be interrupted, but file association
in user space remains.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
---
 fs/fuse/file.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 783cb52..9f38568 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -136,6 +136,10 @@ static void fuse_file_put(struct fuse_file *ff, bool sync)
 			path_put(&req->misc.release.path);
 			fuse_put_request(ff->fc, req);
 		} else if (sync) {
+			/* Must force. Otherwise request could be interrupted,
+			 * but file association in user space remains.
+			 */
+			req->force = 1;
 			req->background = 0;
 			fuse_request_send(ff->fc, req);
 			path_put(&req->misc.release.path);


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
                   ` (4 preceding siblings ...)
  2014-06-06 13:31 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
@ 2014-06-06 13:51 ` John Muir
  2014-06-09  7:50   ` Maxim Patlasov
  2014-08-13 12:44 ` Miklos Szeredi
  6 siblings, 1 reply; 14+ messages in thread
From: John Muir @ 2014-06-06 13:51 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: miklos, fuse-devel, Linux List

On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:

> The patch-set resolves the problem by making fuse_release synchronous:
> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.

Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?

John.

--
John Muir - john@jmuir.com
+32 491 64 22 76


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-06 13:51 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2) John Muir
@ 2014-06-09  7:50   ` Maxim Patlasov
  2014-06-09  9:26     ` John Muir
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-09  7:50 UTC (permalink / raw)
  To: John Muir; +Cc: miklos, fuse-devel, Linux List

On 06/06/2014 05:51 PM, John Muir wrote:
> On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>
>> The patch-set resolves the problem by making fuse_release synchronous:
>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
> Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?

I don't expect a great demand for such a granularity. File-system global 
"close_wait" conveys a general user expectation about filesystem 
behaviour in distributed environment: if you stopped using a file on 
given node, whether it means that the file is immediately accessible 
from another node.

Maxim

>
> John.
>
> --
> John Muir - john@jmuir.com
> +32 491 64 22 76
>
>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-09  7:50   ` Maxim Patlasov
@ 2014-06-09  9:26     ` John Muir
  2014-06-09 10:46       ` Maxim Patlasov
  0 siblings, 1 reply; 14+ messages in thread
From: John Muir @ 2014-06-09  9:26 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: Miklos Szeredi, fuse-devel, Linux List

On 2014.06.09, at 9:50 , Maxim Patlasov <mpatlasov@parallels.com> wrote:

> On 06/06/2014 05:51 PM, John Muir wrote:
>> On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>> 
>>> The patch-set resolves the problem by making fuse_release synchronous:
>>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>> Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?
> 
> I don't expect a great demand for such a granularity. File-system global "close_wait" conveys a general user expectation about filesystem behaviour in distributed environment: if you stopped using a file on given node, whether it means that the file is immediately accessible from another node.
> 

By user do you mean the end-user, or the implementor of the file-system? It seems to me that the end-user doesn't care, and just wants the file-system to work as expected. I don't think we're really talking about the end-user.

The implementor of a file-system, on the other hand, might want the semantics for close_wait on some files, but not on others. Won't there be a performance impact? Some distributed file-systems might want this on specific files only. Implementing it as a flag on the struct fuse_file_info gives the flexibility to the file-system implementor.

Regards,

John.

--
John Muir - john@jmuir.com
+32 491 64 22 76

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-09  9:26     ` John Muir
@ 2014-06-09 10:46       ` Maxim Patlasov
  2014-06-09 11:11         ` John Muir
  0 siblings, 1 reply; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-09 10:46 UTC (permalink / raw)
  To: John Muir; +Cc: Miklos Szeredi, fuse-devel, Linux List

On 06/09/2014 01:26 PM, John Muir wrote:
> On 2014.06.09, at 9:50 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>
>> On 06/06/2014 05:51 PM, John Muir wrote:
>>> On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>>>
>>>> The patch-set resolves the problem by making fuse_release synchronous:
>>>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>>> Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?
>> I don't expect a great demand for such a granularity. File-system global "close_wait" conveys a general user expectation about filesystem behaviour in distributed environment: if you stopped using a file on given node, whether it means that the file is immediately accessible from another node.
>>
> By user do you mean the end-user, or the implementor of the file-system? It seems to me that the end-user doesn't care, and just wants the file-system to work as expected. I don't think we're really talking about the end-user.

No, this is exactly about end-user expectations. Imagine a complicated 
heavy-loaded shared storage where handling FUSE_RELEASE in userspace may 
take a few minutes. In close_wait=0 case, an end-user who has just 
called close(2) has no idea when it's safe to access the file from 
another node or even when it's OK to umount filesystem.

>
> The implementor of a file-system, on the other hand, might want the semantics for close_wait on some files, but not on others. Won't there be a performance impact? Some distributed file-systems might want this on specific files only. Implementing it as a flag on the struct fuse_file_info gives the flexibility to the file-system implementor.

fuse_file_info is an userspace structure, in-kernel fuse knows nothing 
about it. In close_wait=1 case, nothing prevents a file-system 
implementation from ACK-ing FUSE_RELEASE request immediately (for 
specific files) and schedule actual handling for future processing.

Thanks,
Maxim

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-09 10:46       ` Maxim Patlasov
@ 2014-06-09 11:11         ` John Muir
  2014-06-09 12:00           ` Maxim Patlasov
  0 siblings, 1 reply; 14+ messages in thread
From: John Muir @ 2014-06-09 11:11 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: Miklos Szeredi, fuse-devel, Linux List

On 2014.06.09, at 12:46 , Maxim Patlasov <mpatlasov@parallels.com> wrote:

> On 06/09/2014 01:26 PM, John Muir wrote:
>> On 2014.06.09, at 9:50 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>> 
>>> On 06/06/2014 05:51 PM, John Muir wrote:
>>>> On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>>>> 
>>>>> The patch-set resolves the problem by making fuse_release synchronous:
>>>>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>>>> Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?
>>> I don't expect a great demand for such a granularity. File-system global "close_wait" conveys a general user expectation about filesystem behaviour in distributed environment: if you stopped using a file on given node, whether it means that the file is immediately accessible from another node.
>>> 
>> By user do you mean the end-user, or the implementor of the file-system? It seems to me that the end-user doesn't care, and just wants the file-system to work as expected. I don't think we're really talking about the end-user.
> 
> No, this is exactly about end-user expectations. Imagine a complicated heavy-loaded shared storage where handling FUSE_RELEASE in userspace may take a few minutes. In close_wait=0 case, an end-user who has just called close(2) has no idea when it's safe to access the file from another node or even when it's OK to umount filesystem.

I think we're saying the same thing here from different perspectives. The end-user wants the file-system to operate with the semantics you describe, but I don't think it makes sense to give the end-user control over those semantics. The file-system itself should be implemented that way, or not, or per-file

If it's a read-only file, then does this not add the overhead of having the kernel wait for the user-space file-system process to respond before closing it. In my experience, there is actually significant cost to the kernel to user-space messaging in FUSE when manipulating thousands of files.

> 
>> 
>> The implementor of a file-system, on the other hand, might want the semantics for close_wait on some files, but not on others. Won't there be a performance impact? Some distributed file-systems might want this on specific files only. Implementing it as a flag on the struct fuse_file_info gives the flexibility to the file-system implementor.
> 
> fuse_file_info is an userspace structure, in-kernel fuse knows nothing about it. In close_wait=1 case, nothing prevents a file-system implementation from ACK-ing FUSE_RELEASE request immediately (for specific files) and schedule actual handling for future processing.

Of course you know I meant that you'd add another flag to both fuse_file_info, and in the kernel equivalent for those flags which is struct fuse_open_out -> open_flags. This is where other such per file options are specified such as whether or not to keep the in-kernal cache for a file, whether or not to allow direct-io, and whether or not to allow seek.

Anyway, I guess you're the one doing all the work on this and if you have a particular implementation that doesn't require such fine-grained control, and no one else does then it's up to you. I'm just trying to show an alternative implementation that gives the file-system implementor more control while keeping the ability to meet user expectations.

Regards,
 
John.

--
John Muir - john@jmuir.com
+32 491 64 22 76


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-09 11:11         ` John Muir
@ 2014-06-09 12:00           ` Maxim Patlasov
  0 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-06-09 12:00 UTC (permalink / raw)
  To: John Muir; +Cc: Miklos Szeredi, fuse-devel, Linux List

On 06/09/2014 03:11 PM, John Muir wrote:
> On 2014.06.09, at 12:46 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>
>> On 06/09/2014 01:26 PM, John Muir wrote:
>>> On 2014.06.09, at 9:50 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>>>
>>>> On 06/06/2014 05:51 PM, John Muir wrote:
>>>>> On 2014.06.06, at 15:27 , Maxim Patlasov <mpatlasov@parallels.com> wrote:
>>>>>
>>>>>> The patch-set resolves the problem by making fuse_release synchronous:
>>>>>> wait for ACK from userspace for FUSE_RELEASE if the feature is ON.
>>>>> Why not make this feature per-file with a new flag bit in struct fuse_file_info rather than as a file-system global?
>>>> I don't expect a great demand for such a granularity. File-system global "close_wait" conveys a general user expectation about filesystem behaviour in distributed environment: if you stopped using a file on given node, whether it means that the file is immediately accessible from another node.
>>>>
>>> By user do you mean the end-user, or the implementor of the file-system? It seems to me that the end-user doesn't care, and just wants the file-system to work as expected. I don't think we're really talking about the end-user.
>> No, this is exactly about end-user expectations. Imagine a complicated heavy-loaded shared storage where handling FUSE_RELEASE in userspace may take a few minutes. In close_wait=0 case, an end-user who has just called close(2) has no idea when it's safe to access the file from another node or even when it's OK to umount filesystem.
> I think we're saying the same thing here from different perspectives. The end-user wants the file-system to operate with the semantics you describe, but I don't think it makes sense to give the end-user control over those semantics. The file-system itself should be implemented that way, or not, or per-file
>
> If it's a read-only file, then does this not add the overhead of having the kernel wait for the user-space file-system process to respond before closing it. In my experience, there is actually significant cost to the kernel to user-space messaging in FUSE when manipulating thousands of files.
>
>>> The implementor of a file-system, on the other hand, might want the semantics for close_wait on some files, but not on others. Won't there be a performance impact? Some distributed file-systems might want this on specific files only. Implementing it as a flag on the struct fuse_file_info gives the flexibility to the file-system implementor.
>> fuse_file_info is an userspace structure, in-kernel fuse knows nothing about it. In close_wait=1 case, nothing prevents a file-system implementation from ACK-ing FUSE_RELEASE request immediately (for specific files) and schedule actual handling for future processing.
> Of course you know I meant that you'd add another flag to both fuse_file_info, and in the kernel equivalent for those flags which is struct fuse_open_out -> open_flags. This is where other such per file options are specified such as whether or not to keep the in-kernal cache for a file, whether or not to allow direct-io, and whether or not to allow seek.
>
> Anyway, I guess you're the one doing all the work on this and if you have a particular implementation that doesn't require such fine-grained control, and no one else does then it's up to you. I'm just trying to show an alternative implementation that gives the file-system implementor more control while keeping the ability to meet user expectations.

Thank you, John. That's really depends on whether someone else wants 
fine-grained control or not. I'm generally OK to re-work the patch-set 
if more requesters emerge.

Thanks,
Maxim

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously (v2)
  2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
                   ` (5 preceding siblings ...)
  2014-06-06 13:51 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2) John Muir
@ 2014-08-13 12:44 ` Miklos Szeredi
  2014-08-14 12:14   ` Maxim Patlasov
  6 siblings, 1 reply; 14+ messages in thread
From: Miklos Szeredi @ 2014-08-13 12:44 UTC (permalink / raw)
  To: Maxim Patlasov; +Cc: fuse-devel, Kernel Mailing List

On Fri, Jun 6, 2014 at 3:27 PM, Maxim Patlasov <MPatlasov@parallels.com> wrote:
> Hi,
>
> There is a long-standing demand for synchronous behaviour of fuse_release:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>
> A year ago Avati and me explained why such a feature would be useful:
>
> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>
> In short, the problem is that fuse_release (that's called on last user
> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
> ACK from userspace. Consequently, there is a gap when user regards the
> file released while userspace fuse is still working on it. An attempt to
> access the file from another node leads to complicated synchronization
> problems because the first node still "holds" the file.

Tying RELEASE to close(2) is not going to work.  Look at all the
places that call fput() (or fdput() in recent kernels), those are all
potential triggers for RELEASE, some realistic, some not quite, but
all are certainly places that a synchronous release could block
*instead* of close.

Which just means, that close will still be asynchronous with release
some of the time.  So it's not clear to me what is to be gained from
this patchset.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/5] fuse: close file synchronously (v2)
  2014-08-13 12:44 ` Miklos Szeredi
@ 2014-08-14 12:14   ` Maxim Patlasov
  0 siblings, 0 replies; 14+ messages in thread
From: Maxim Patlasov @ 2014-08-14 12:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, Kernel Mailing List

On 08/13/2014 04:44 PM, Miklos Szeredi wrote:
> On Fri, Jun 6, 2014 at 3:27 PM, Maxim Patlasov <MPatlasov@parallels.com> wrote:
>> Hi,
>>
>> There is a long-standing demand for synchronous behaviour of fuse_release:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=19343889
>> http://sourceforge.net/mailarchive/message.php?msg_id=29814693
>>
>> A year ago Avati and me explained why such a feature would be useful:
>>
>> http://sourceforge.net/mailarchive/message.php?msg_id=29889055
>> http://sourceforge.net/mailarchive/message.php?msg_id=29867423
>>
>> In short, the problem is that fuse_release (that's called on last user
>> close(2)) sends FUSE_RELEASE to userspace and returns without waiting for
>> ACK from userspace. Consequently, there is a gap when user regards the
>> file released while userspace fuse is still working on it. An attempt to
>> access the file from another node leads to complicated synchronization
>> problems because the first node still "holds" the file.
> Tying RELEASE to close(2) is not going to work.  Look at all the
> places that call fput() (or fdput() in recent kernels), those are all
> potential triggers for RELEASE, some realistic, some not quite, but
> all are certainly places that a synchronous release could block
> *instead* of close.
>
> Which just means, that close will still be asynchronous with release
> some of the time.  So it's not clear to me what is to be gained from
> this patchset.

The patch-set doesn't tie RELEASE to close(2), it ensures that we report 
to user space exactly the last fput(). That's correct because this is 
exactly the moment when any file system with sharing mode connected to 
open/close must drop sharing mode. This is the case even for some local 
filesystems, for example, ntfs-3g.

Could you please look closely at your commit 
5a18ec176c934ca1bc9dc61580a5e0e90a9b5733. It actually implemented two 
different things: 1) synchronous release and 2) delayed path_put. The 
latter was well explained by the comment:

 >        /*
 >         * If this is a fuseblk mount, then it's possible that
 >         * releasing the path will result in releasing the
 >         * super block and sending the DESTROY request.  If
 >         * the server is single threaded, this would hang.
 >         * For this reason do the path_put() in a separate
 >         * thread.
 >         */

So it's clear why the delay needed and why it's bound to fuseblk 
condition. But synchronous close was made under the same condition, 
which is obviously wrong. I understand why you made that decision in 
2011: otherwise, we could block in a wrong context (last decrement of 
ff->count might happen in scope of read-ahead or mmap-ed writeback). But 
now, with the approach implemented in this patch-set, this is impossible 
-- we wait for completion of all async operations before triggering 
synchronous release. Thus the patch-set untie a functionality which 
already existed before (synchronous release) from wrong condition 
(fuseblk mount) putting it under well-defined control (FUSE_CLOSE_WAIT).

Thanks,
Maxim

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-08-14 12:14 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-06 13:27 [PATCH 0/5] fuse: close file synchronously (v2) Maxim Patlasov
2014-06-06 13:27 ` [PATCH 1/5] fuse: add close_wait flag to fuse_conn Maxim Patlasov
2014-06-06 13:28 ` [PATCH 2/5] fuse: cosmetic rework of fuse_send_readpages Maxim Patlasov
2014-06-06 13:29 ` [PATCH 3/5] fuse: wait for end of IO on release Maxim Patlasov
2014-06-06 13:30 ` [PATCH 4/5] fuse: enable close_wait feature Maxim Patlasov
2014-06-06 13:31 ` [PATCH 5/5] fuse: fix synchronous case of fuse_file_put() Maxim Patlasov
2014-06-06 13:51 ` [fuse-devel] [PATCH 0/5] fuse: close file synchronously (v2) John Muir
2014-06-09  7:50   ` Maxim Patlasov
2014-06-09  9:26     ` John Muir
2014-06-09 10:46       ` Maxim Patlasov
2014-06-09 11:11         ` John Muir
2014-06-09 12:00           ` Maxim Patlasov
2014-08-13 12:44 ` Miklos Szeredi
2014-08-14 12:14   ` Maxim Patlasov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.