* [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
@ 2012-07-05 10:50 Mitsuo Hayasaka
  2012-07-05 10:50 ` [RFC PATCH 1/5] " Mitsuo Hayasaka
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt

Hi,

This patch series makes the maximum read/write request size tunable in FUSE.
Currently, it is limited to FUSE_MAX_PAGES_PER_REQ, which is equal
to 32 pages. This limit needs to be adjustable in order to improve
throughput, since the optimal value depends on various factors such
as the type and version of the local filesystems used and the HW specs.

In addition, FUSE has recently become widely used as a gateway to
cloud storage services and distributed filesystems. Larger data
might be stored in them over the network via FUSE, and the overhead
might affect the throughput.

There seem to have been many requests to increase FUSE_MAX_PAGES_PER_REQ
to improve the throughput, such as the following.

http://sourceforge.net/mailarchive/forum.php?thread_name=4FC2F7A1.4010609%40gmail.com&forum_name=fuse-devel

http://old.nabble.com/-Fuse-2.8--big_write-option---%3E-128kb-write-syscall-...-howto-set-higher-value-td22292589.html

http://old.nabble.com/Block-size-%3E128k-td18675772.html

These discussions mention how to change values in both the FUSE kernel
and libfuse sources, such as FUSE_MAX_PAGES_PER_REQ and MIN_BUFSIZE, but
the increased values have not become the default yet. We guess this
is because they would then also apply to FUSE filesystems that do not
need the increased value.

One way to solve this is to make them tunable.
This series introduces a new sysfs parameter, max_pages_per_req.
It limits the maximum read/write size of a FUSE request and can be
set from 32 to 256 pages in the current implementation. When the
max_read/max_write mount option is specified, the FUSE request size is
set per mount. (The size is rounded up to a multiple of the page size
and capped at max_pages_per_req.)
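
The rounding and capping rule above can be sketched in userspace C as
follows. This is only an illustration under stated assumptions: 4KB pages,
and a sysfs limit that is already within 32..256. request_pages() and the
constant names are made up for this sketch and do not appear in the patches.

```c
#include <assert.h>

/* Assumed values for illustration: 4KB pages and the limits of this series. */
#define PAGE_SIZE_BYTES   4096u
#define DEFAULT_MAX_PAGES 32u   /* mirrors FUSE_DEFAULT_MAX_PAGES_PER_REQ */
#define HARD_MAX_PAGES    256u  /* mirrors FUSE_MAX_PAGES_PER_REQ */

/*
 * Hypothetical helper: pages per request for one mount, given the
 * max_read/max_write size in bytes and the sysfs limit in pages.
 */
static unsigned request_pages(unsigned max_bytes, unsigned sysfs_limit)
{
	/* Round up to whole pages without overflowing for huge sizes. */
	unsigned pages = max_bytes / PAGE_SIZE_BYTES +
			 (max_bytes % PAGE_SIZE_BYTES != 0);

	assert(sysfs_limit >= DEFAULT_MAX_PAGES &&
	       sysfs_limit <= HARD_MAX_PAGES);

	/* The per-mount value never drops below the 32-page default. */
	if (pages <= DEFAULT_MAX_PAGES)
		return DEFAULT_MAX_PAGES;

	/* Above the default, the sysfs parameter is the cap. */
	return pages < sysfs_limit ? pages : sysfs_limit;
}
```

For example, with the sysfs limit at 256, a 256KB max_write gives 64 pages,
while a 4MB max_write is capped at 256 pages.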

We think the sysfs parameter is required for the following reason.

- libfuse should change its current MIN_BUFSIZE limitation according
  to this value. If not, libfuse must always set it to the maximum
  request limit (= [256 pages * 4KB + 0x1000] in the current
  implementation), which wastes memory.
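
To make the memory cost concrete, here is a small sketch of the buffer size
arithmetic quoted above, assuming 4KB pages. min_bufsize() and the constant
names are illustrative only; this is not libfuse code.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u    /* assumed page size, as in the text */
#define HEADER_ROOM     0x1000u  /* the extra 0x1000 bytes quoted in the text */

/* Hypothetical helper: buffer size needed for a given page limit. */
static size_t min_bufsize(unsigned max_pages)
{
	assert(max_pages >= 1);
	return (size_t)max_pages * PAGE_SIZE_BYTES + HEADER_ROOM;
}
```

With the default of 32 pages this gives 135168 bytes, and with the 256-page
hard limit it gives 1052672 bytes, so always sizing for the hard limit would
use roughly eight times more buffer memory than the default needs.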

The default is 32 pages. Here, we use 256 pages as the hard limit
in order to keep the maximum size of struct fuse_req within one page (4KB).
However, we do not think it is necessarily the best value. Any comments
are appreciated.

Also, a patch set for libfuse that changes the current MIN_BUFSIZE
limitation according to the sysfs parameter will be sent soon.


* Performance example

We evaluated the performance improvement due to this patch series.
FUSE filesystems were mounted on tmpfs and we measured the read/write
throughput using 512MB of random data.

The average read/write throughput is shown below.
 - We measured the throughput of read and write operations 10 times
   each and calculated the average.

** write

Without the direct_io option:
# of pages   |original(32)|tuning(32)|(64)  |(128) |(256)
---------------------------------------------------------
thruput(MB/s)|402.43      | 398.23   |441.86|485.41|525.78


With the direct_io option:
# of pages   |original(32)|tuning(32)|(64)  |(128) |(256)
---------------------------------------------------------
thruput(MB/s)|556.48      | 562.90   |611.74|722.18|743.43


** read

Without the direct_io option, there is no difference between the
original 32 pages and the tuning patches, since the read request size
does not change even if the sysfs parameter is changed.


With the direct_io option:
# of pages   |original(32)|tuning(32)|(64)  |(128) |(256)
---------------------------------------------------------
thruput(MB/s)|509.58      | 509.46   |592.43|640.04|677.26


From these evaluations, this patch series can improve
performance as the sysfs parameter is increased. In
particular, the read/write throughput with direct_io improves
considerably. These results are just an example and
may differ on other systems. Therefore, we think
a tunable read/write request size is useful.
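
For reference, the relative gains can be worked out from the tables above.
The helper below is only an illustrative sketch (gain_percent() is not part
of the series). Comparing 256 pages with the original kernel, it gives
roughly 31% for write without direct_io, 34% for write with direct_io, and
33% for read with direct_io.

```c
#include <assert.h>

/* Relative throughput gain of `tuned` over `baseline`, in percent. */
static double gain_percent(double baseline, double tuned)
{
	assert(baseline > 0.0);
	return (tuned / baseline - 1.0) * 100.0;
}
```

For instance, gain_percent(402.43, 525.78) evaluates the buffered write case
at 256 pages.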

Thanks,

---

Mitsuo Hayasaka (5):
      fuse: add documentation of sysfs parameter to limit maximum fuse request size
      fuse: add a sysfs parameter to control maximum request size
      fuse: make default global limit minimum value
      fuse: do not create cache for fuse request allocation
      fuse: make maximum read/write request size tunable


 Documentation/filesystems/fuse.txt |   17 ++++++-
 fs/fuse/dev.c                      |   48 ++++++-------------
 fs/fuse/file.c                     |   32 +++++++------
 fs/fuse/fuse_i.h                   |   34 +++++++++----
 fs/fuse/inode.c                    |   91 +++++++++++++++++++++++++++++++++---
 5 files changed, 155 insertions(+), 67 deletions(-)

-- 
Mitsuo Hayasaka (mitsuo.hayasaka.hu@hitachi.com)


* [RFC PATCH 1/5] fuse: make maximum read/write request size tunable
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
@ 2012-07-05 10:50 ` Mitsuo Hayasaka
  2012-07-05 10:50 ` [RFC PATCH 2/5] fuse: do not create cache for fuse request allocation Mitsuo Hayasaka
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:50 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt,
	Mitsuo Hayasaka, Miklos Szeredi

Currently, the maximum read/write request size is limited to
FUSE_MAX_PAGES_PER_REQ, which is equal to 32 pages. This limit needs to
be adjustable in order to maximize throughput, since the optimal value
depends on various factors such as the type and version of the local
filesystems used and the hardware specs.

In addition, FUSE has recently become widely used as a gateway to
cloud storage services and distributed filesystems. Larger data might be
stored in them over the network via FUSE, and the overhead might affect
the read/write throughput.

This patch makes it tunable from 32 to 256 pages per mount.
The max_read and max_write mount options affect it. Without these
options, 32 pages are used by default.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---

 fs/fuse/dev.c    |   27 ++++++++++++++-------------
 fs/fuse/file.c   |   32 +++++++++++++++++---------------
 fs/fuse/fuse_i.h |   29 +++++++++++++++++++----------
 fs/fuse/inode.c  |   40 +++++++++++++++++++++++++++++++++-------
 4 files changed, 83 insertions(+), 45 deletions(-)
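
Note for readers: the core of this patch is moving the page vector to the
end of struct fuse_req so that its length can be chosen per connection at
allocation time. A minimal userspace sketch of that pattern follows. It
assumes a heavily simplified stand-in struct; struct req, req_size() and
req_alloc() are illustrative names, and malloc() stands in for kmalloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct fuse_req; the real one has many more fields. */
struct req {
	unsigned num_pages;  /* number of page slots in use */
	void *pages[];       /* page vector (the patch spells this pages[0]) */
};

/* Total allocation size for a request that can hold max_pages pointers. */
static size_t req_size(unsigned max_pages)
{
	return sizeof(struct req) + (size_t)max_pages * sizeof(void *);
}

/* Allocate and zero a request, as fuse_request_alloc() does in the patch. */
static struct req *req_alloc(unsigned max_pages)
{
	size_t size = req_size(max_pages);
	struct req *r;

	assert(max_pages > 0);
	r = malloc(size);
	if (r)
		memset(r, 0, size);  /* must cover the trailing vector too */
	return r;
}
```

This is why fuse_request_init() in the diff clears fc->fuse_req_size bytes
instead of sizeof(*req): sizeof no longer includes the page vector.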

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 7df2b5e..511560b 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -34,35 +34,36 @@ static struct fuse_conn *fuse_get_conn(struct file *file)
 	return file->private_data;
 }
 
-static void fuse_request_init(struct fuse_req *req)
+static void fuse_request_init(struct fuse_conn *fc, struct fuse_req *req)
 {
-	memset(req, 0, sizeof(*req));
+	memset(req, 0, fc->fuse_req_size);
 	INIT_LIST_HEAD(&req->list);
 	INIT_LIST_HEAD(&req->intr_entry);
 	init_waitqueue_head(&req->waitq);
 	atomic_set(&req->count, 1);
 }
 
-struct fuse_req *fuse_request_alloc(void)
+struct fuse_req *fuse_request_alloc(struct fuse_conn *fc)
 {
-	struct fuse_req *req = kmem_cache_alloc(fuse_req_cachep, GFP_KERNEL);
+	struct fuse_req *req = kmalloc(fc->fuse_req_size, GFP_KERNEL);
+
 	if (req)
-		fuse_request_init(req);
+		fuse_request_init(fc, req);
 	return req;
 }
 EXPORT_SYMBOL_GPL(fuse_request_alloc);
 
-struct fuse_req *fuse_request_alloc_nofs(void)
+struct fuse_req *fuse_request_alloc_nofs(struct fuse_conn *fc)
 {
-	struct fuse_req *req = kmem_cache_alloc(fuse_req_cachep, GFP_NOFS);
+	struct fuse_req *req = kmalloc(fc->fuse_req_size, GFP_NOFS);
 	if (req)
-		fuse_request_init(req);
+		fuse_request_init(fc, req);
 	return req;
 }
 
 void fuse_request_free(struct fuse_req *req)
 {
-	kmem_cache_free(fuse_req_cachep, req);
+	kfree(req);
 }
 
 static void block_sigs(sigset_t *oldset)
@@ -116,7 +117,7 @@ struct fuse_req *fuse_get_req(struct fuse_conn *fc)
 	if (!fc->connected)
 		goto out;
 
-	req = fuse_request_alloc();
+	req = fuse_request_alloc(fc);
 	err = -ENOMEM;
 	if (!req)
 		goto out;
@@ -166,7 +167,7 @@ static void put_reserved_req(struct fuse_conn *fc, struct fuse_req *req)
 	struct fuse_file *ff = file->private_data;
 
 	spin_lock(&fc->lock);
-	fuse_request_init(req);
+	fuse_request_init(fc, req);
 	BUG_ON(ff->reserved_req);
 	ff->reserved_req = req;
 	wake_up_all(&fc->reserved_req_waitq);
@@ -193,7 +194,7 @@ struct fuse_req *fuse_get_req_nofail(struct fuse_conn *fc, struct file *file)
 
 	atomic_inc(&fc->num_waiting);
 	wait_event(fc->blocked_waitq, !fc->blocked);
-	req = fuse_request_alloc();
+	req = fuse_request_alloc(fc);
 	if (!req)
 		req = get_reserved_req(fc, file);
 
@@ -1564,7 +1565,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 	else if (outarg->offset + num > file_size)
 		num = file_size - outarg->offset;
 
-	while (num && req->num_pages < FUSE_MAX_PAGES_PER_REQ) {
+	while (num && req->num_pages < fc->max_pages) {
 		struct page *page;
 		unsigned int this_num;
 
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b321a68..7b96b00 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -57,7 +57,7 @@ struct fuse_file *fuse_file_alloc(struct fuse_conn *fc)
 		return NULL;
 
 	ff->fc = fc;
-	ff->reserved_req = fuse_request_alloc();
+	ff->reserved_req = fuse_request_alloc(fc);
 	if (unlikely(!ff->reserved_req)) {
 		kfree(ff);
 		return NULL;
@@ -653,7 +653,7 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 	fuse_wait_on_page_writeback(inode, page->index);
 
 	if (req->num_pages &&
-	    (req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
+	    (req->num_pages == fc->max_pages ||
 	     (req->num_pages + 1) * PAGE_CACHE_SIZE > fc->max_read ||
 	     req->pages[req->num_pages - 1]->index + 1 != page->index)) {
 		fuse_send_readpages(req, data->file);
@@ -866,7 +866,7 @@ static ssize_t fuse_fill_write_pages(struct fuse_req *req,
 		if (!fc->big_writes)
 			break;
 	} while (iov_iter_count(ii) && count < fc->max_write &&
-		 req->num_pages < FUSE_MAX_PAGES_PER_REQ && offset == 0);
+		 req->num_pages < fc->max_pages && offset == 0);
 
 	return count > 0 ? count : err;
 }
@@ -1020,8 +1020,9 @@ static void fuse_release_user_pages(struct fuse_req *req, int write)
 	}
 }
 
-static int fuse_get_user_pages(struct fuse_req *req, const char __user *buf,
-			       size_t *nbytesp, int write)
+static int fuse_get_user_pages(struct fuse_conn *fc, struct fuse_req *req,
+			       const char __user *buf, size_t *nbytesp,
+			       int write)
 {
 	size_t nbytes = *nbytesp;
 	unsigned long user_addr = (unsigned long) buf;
@@ -1038,9 +1039,9 @@ static int fuse_get_user_pages(struct fuse_req *req, const char __user *buf,
 		return 0;
 	}
 
-	nbytes = min_t(size_t, nbytes, FUSE_MAX_PAGES_PER_REQ << PAGE_SHIFT);
+	nbytes = min_t(size_t, nbytes, fc->max_pages << PAGE_SHIFT);
 	npages = (nbytes + offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	npages = clamp(npages, 1, FUSE_MAX_PAGES_PER_REQ);
+	npages = clamp(npages, 1, (int)fc->max_pages);
 	npages = get_user_pages_fast(user_addr, npages, !write, req->pages);
 	if (npages < 0)
 		return npages;
@@ -1077,7 +1078,7 @@ ssize_t fuse_direct_io(struct file *file, const char __user *buf,
 		size_t nres;
 		fl_owner_t owner = current->files;
 		size_t nbytes = min(count, nmax);
-		int err = fuse_get_user_pages(req, buf, &nbytes, write);
+		int err = fuse_get_user_pages(fc, req, buf, &nbytes, write);
 		if (err) {
 			res = err;
 			break;
@@ -1269,7 +1270,7 @@ static int fuse_writepage_locked(struct page *page)
 
 	set_page_writeback(page);
 
-	req = fuse_request_alloc_nofs();
+	req = fuse_request_alloc_nofs(fc);
 	if (!req)
 		goto err;
 
@@ -1695,10 +1696,11 @@ static int fuse_copy_ioctl_iovec_old(struct iovec *dst, void *src,
 }
 
 /* Make sure iov_length() won't overflow */
-static int fuse_verify_ioctl_iov(struct iovec *iov, size_t count)
+static int fuse_verify_ioctl_iov(struct fuse_conn *fc, struct iovec *iov,
+				 size_t count)
 {
 	size_t n;
-	u32 max = FUSE_MAX_PAGES_PER_REQ << PAGE_SHIFT;
+	u32 max = fc->max_pages << PAGE_SHIFT;
 
 	for (n = 0; n < count; n++) {
 		if (iov->iov_len > (size_t) max)
@@ -1821,7 +1823,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 	BUILD_BUG_ON(sizeof(struct fuse_ioctl_iovec) * FUSE_IOCTL_MAX_IOV > PAGE_SIZE);
 
 	err = -ENOMEM;
-	pages = kcalloc(FUSE_MAX_PAGES_PER_REQ, sizeof(pages[0]), GFP_KERNEL);
+	pages = kcalloc(fc->max_pages, sizeof(pages[0]), GFP_KERNEL);
 	iov_page = (struct iovec *) __get_free_page(GFP_KERNEL);
 	if (!pages || !iov_page)
 		goto out;
@@ -1860,7 +1862,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 
 	/* make sure there are enough buffer pages and init request with them */
 	err = -ENOMEM;
-	if (max_pages > FUSE_MAX_PAGES_PER_REQ)
+	if (max_pages > fc->max_pages)
 		goto out;
 	while (num_pages < max_pages) {
 		pages[num_pages] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
@@ -1943,11 +1945,11 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 		in_iov = iov_page;
 		out_iov = in_iov + in_iovs;
 
-		err = fuse_verify_ioctl_iov(in_iov, in_iovs);
+		err = fuse_verify_ioctl_iov(fc, in_iov, in_iovs);
 		if (err)
 			goto out;
 
-		err = fuse_verify_ioctl_iov(out_iov, out_iovs);
+		err = fuse_verify_ioctl_iov(fc, out_iov, out_iovs);
 		if (err)
 			goto out;
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 771fb63..c96dc5f 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -23,8 +23,11 @@
 #include <linux/poll.h>
 #include <linux/workqueue.h>
 
-/** Max number of pages that can be used in a single read request */
-#define FUSE_MAX_PAGES_PER_REQ 32
+/** Maximum number of pages that can be used in a single read/write request */
+#define FUSE_MAX_PAGES_PER_REQ 256
+
+/** Default number of pages that can be used in a single read/write request */
+#define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
 
 /** Bias for fi->writectr, meaning new writepages must not be sent */
 #define FUSE_NOWRITE INT_MIN
@@ -290,12 +293,6 @@ struct fuse_req {
 		struct fuse_lk_in lk_in;
 	} misc;
 
-	/** page vector */
-	struct page *pages[FUSE_MAX_PAGES_PER_REQ];
-
-	/** number of pages in vector */
-	unsigned num_pages;
-
 	/** offset of data on first page */
 	unsigned page_offset;
 
@@ -313,6 +310,12 @@ struct fuse_req {
 
 	/** Request is stolen from fuse_file->reserved_req */
 	struct file *stolen_file;
+
+	/** number of pages in vector */
+	unsigned num_pages;
+
+	/** page vector */
+	struct page *pages[0];
 };
 
 /**
@@ -347,6 +350,12 @@ struct fuse_conn {
 	/** Maximum write size */
 	unsigned max_write;
 
+	/** Maximum number of pages per req */
+	unsigned max_pages;
+
+	/** fuse_req size per connection */
+	unsigned fuse_req_size;
+
 	/** Readers of the connection are waiting on this */
 	wait_queue_head_t waitq;
 
@@ -655,9 +664,9 @@ void fuse_ctl_cleanup(void);
 /**
  * Allocate a request
  */
-struct fuse_req *fuse_request_alloc(void);
+struct fuse_req *fuse_request_alloc(struct fuse_conn *fc);
 
-struct fuse_req *fuse_request_alloc_nofs(void);
+struct fuse_req *fuse_request_alloc_nofs(struct fuse_conn *fc);
 
 /**
  * Free a request
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 1cd6165..aadf157 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -550,6 +550,9 @@ void fuse_conn_init(struct fuse_conn *fc)
 	atomic_set(&fc->num_waiting, 0);
 	fc->max_background = FUSE_DEFAULT_MAX_BACKGROUND;
 	fc->congestion_threshold = FUSE_DEFAULT_CONGESTION_THRESHOLD;
+	fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
+	fc->fuse_req_size = sizeof(struct fuse_req) +
+			    fc->max_pages * sizeof(struct page *);
 	fc->khctr = 0;
 	fc->polled_files = RB_ROOT;
 	fc->reqctr = 0;
@@ -774,6 +777,16 @@ static int set_global_limit(const char *val, struct kernel_param *kp)
 	return 0;
 }
 
+static void set_conn_max_pages(struct fuse_conn *fc, unsigned max_pages)
+{
+	if (max_pages > fc->max_pages) {
+		fc->max_pages = min_t(unsigned, FUSE_MAX_PAGES_PER_REQ,
+				      max_pages);
+		fc->fuse_req_size = sizeof(struct fuse_req) +
+				    fc->max_pages * sizeof(struct page *);
+	}
+}
+
 static void process_init_limits(struct fuse_conn *fc, struct fuse_init_out *arg)
 {
 	int cap_sys_admin = capable(CAP_SYS_ADMIN);
@@ -807,6 +820,7 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 		fc->conn_error = 1;
 	else {
 		unsigned long ra_pages;
+		unsigned max_pages;
 
 		process_init_limits(fc, arg);
 
@@ -844,6 +858,8 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 		fc->minor = arg->minor;
 		fc->max_write = arg->minor < 5 ? 4096 : arg->max_write;
 		fc->max_write = max_t(unsigned, 4096, fc->max_write);
+		max_pages = DIV_ROUND_UP(fc->max_write, PAGE_SIZE);
+		set_conn_max_pages(fc, max_pages);
 		fc->conn_init = 1;
 	}
 	fc->blocked = 0;
@@ -880,6 +896,20 @@ static void fuse_free_conn(struct fuse_conn *fc)
 	kfree(fc);
 }
 
+static void fuse_conn_setup(struct fuse_conn *fc,
+			    struct fuse_mount_data *d)
+{
+	unsigned max_pages;
+
+	fc->release = fuse_free_conn;
+	fc->flags = d->flags;
+	fc->user_id = d->user_id;
+	fc->group_id = d->group_id;
+	fc->max_read = max_t(unsigned, 4096, d->max_read);
+	max_pages = DIV_ROUND_UP(fc->max_read, PAGE_SIZE);
+	set_conn_max_pages(fc, max_pages);
+}
+
 static int fuse_bdi_init(struct fuse_conn *fc, struct super_block *sb)
 {
 	int err;
@@ -986,11 +1016,7 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
 		fc->dont_mask = 1;
 	sb->s_flags |= MS_POSIXACL;
 
-	fc->release = fuse_free_conn;
-	fc->flags = d.flags;
-	fc->user_id = d.user_id;
-	fc->group_id = d.group_id;
-	fc->max_read = max_t(unsigned, 4096, d.max_read);
+	fuse_conn_setup(fc, &d);
 
 	/* Used by get_root_inode() */
 	sb->s_fs_info = fc;
@@ -1003,12 +1029,12 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
 	/* only now - we want root dentry with NULL ->d_op */
 	sb->s_d_op = &fuse_dentry_operations;
 
-	init_req = fuse_request_alloc();
+	init_req = fuse_request_alloc(fc);
 	if (!init_req)
 		goto err_put_root;
 
 	if (is_bdev) {
-		fc->destroy_req = fuse_request_alloc();
+		fc->destroy_req = fuse_request_alloc(fc);
 		if (!fc->destroy_req)
 			goto err_free_init_req;
 	}



* [RFC PATCH 2/5] fuse: do not create cache for fuse request allocation
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
  2012-07-05 10:50 ` [RFC PATCH 1/5] " Mitsuo Hayasaka
@ 2012-07-05 10:50 ` Mitsuo Hayasaka
  2012-07-05 10:51 ` [RFC PATCH 3/5] fuse: make default global limit minimum value Mitsuo Hayasaka
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:50 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt,
	Mitsuo Hayasaka, Miklos Szeredi

fuse_req_cachep was used for request allocation in fuse.
This patch stops creating it, since it is no longer used now that
the read/write request size is tunable and requests are allocated
with kmalloc().

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---

 fs/fuse/dev.c |   21 +--------------------
 1 files changed, 1 insertions(+), 20 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 511560b..4087ff4 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -23,8 +23,6 @@
 MODULE_ALIAS_MISCDEV(FUSE_MINOR);
 MODULE_ALIAS("devname:fuse");
 
-static struct kmem_cache *fuse_req_cachep;
-
 static struct fuse_conn *fuse_get_conn(struct file *file)
 {
 	/*
@@ -2075,27 +2073,10 @@ static struct miscdevice fuse_miscdevice = {
 
 int __init fuse_dev_init(void)
 {
-	int err = -ENOMEM;
-	fuse_req_cachep = kmem_cache_create("fuse_request",
-					    sizeof(struct fuse_req),
-					    0, 0, NULL);
-	if (!fuse_req_cachep)
-		goto out;
-
-	err = misc_register(&fuse_miscdevice);
-	if (err)
-		goto out_cache_clean;
-
-	return 0;
-
- out_cache_clean:
-	kmem_cache_destroy(fuse_req_cachep);
- out:
-	return err;
+	return misc_register(&fuse_miscdevice);
 }
 
 void fuse_dev_cleanup(void)
 {
 	misc_deregister(&fuse_miscdevice);
-	kmem_cache_destroy(fuse_req_cachep);
 }



* [RFC PATCH 3/5] fuse: make default global limit minimum value
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
  2012-07-05 10:50 ` [RFC PATCH 1/5] " Mitsuo Hayasaka
  2012-07-05 10:50 ` [RFC PATCH 2/5] fuse: do not create cache for fuse request allocation Mitsuo Hayasaka
@ 2012-07-05 10:51 ` Mitsuo Hayasaka
  2012-07-05 10:51 ` [RFC PATCH 4/5] fuse: add a sysfs parameter to control maximum request size Mitsuo Hayasaka
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:51 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt,
	Mitsuo Hayasaka, Miklos Szeredi

The default global limits for the congestion threshold and backgrounded
requests are calculated from the size of the fuse_req structure, which is
now variable due to the tunable read/write request size. This patch
calculates them from the maximum fuse_req size instead, which sets them
to their minimum values by default and avoids limits that vary per mount.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---

 fs/fuse/fuse_i.h |    5 +++++
 fs/fuse/inode.c  |    2 +-
 2 files changed, 6 insertions(+), 1 deletions(-)
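
Note for readers: the effect of the changed divisor can be sketched in
userspace as below. The formula follows sanitize_global_limit() in the diff
(RAM size shifted right by 13, divided by the request size, capped below
1 << 16). default_global_limit() is an illustrative name, and the request
sizes used in the example are assumptions, not measured values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: default global limit for a machine with
 * ram_bytes of memory and a request structure of req_size bytes.
 */
static unsigned default_global_limit(uint64_t ram_bytes, uint64_t req_size)
{
	uint64_t limit;

	assert(req_size != 0);
	limit = (ram_bytes >> 13) / req_size;

	/* Same cap as the kernel code: the limit must stay below 1 << 16. */
	if (limit >= (1u << 16))
		limit = (1u << 16) - 1;
	return (unsigned)limit;
}
```

For example, with 1GB of RAM an assumed 512-byte request gives a limit of
256, while an assumed 2560-byte request (512 bytes plus 256 pointers of
8 bytes) gives 51. A larger divisor can only lower the default limit.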

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index c96dc5f..72210a8 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -29,6 +29,11 @@
 /** Default number of pages that can be used in a single read/write request */
 #define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
 
+/** Maximum size of struct fuse_req */
+#define FUSE_MAX_FUSE_REQ_SIZE	(sizeof(struct fuse_req) + \
+				 FUSE_MAX_PAGES_PER_REQ * \
+				 sizeof(struct page *))
+
 /** Bias for fi->writectr, meaning new writepages must not be sent */
 #define FUSE_NOWRITE INT_MIN
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index aadf157..d8d302a 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -758,7 +758,7 @@ static void sanitize_global_limit(unsigned *limit)
 {
 	if (*limit == 0)
 		*limit = ((num_physpages << PAGE_SHIFT) >> 13) /
-			 sizeof(struct fuse_req);
+			 FUSE_MAX_FUSE_REQ_SIZE;
 
 	if (*limit >= 1 << 16)
 		*limit = (1 << 16) - 1;



* [RFC PATCH 4/5] fuse: add a sysfs parameter to control maximum request size
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
                   ` (2 preceding siblings ...)
  2012-07-05 10:51 ` [RFC PATCH 3/5] fuse: make default global limit minimum value Mitsuo Hayasaka
@ 2012-07-05 10:51 ` Mitsuo Hayasaka
  2012-07-05 10:51 ` [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse " Mitsuo Hayasaka
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:51 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt,
	Mitsuo Hayasaka, Miklos Szeredi

The tunable maximum read/write request size changes the size of
a fuse request when the max_read/max_write mount option is specified.

libfuse should change its current MIN_BUFSIZE limitation
according to the current maximum request size. If not,
libfuse must always set MIN_BUFSIZE to the maximum request limit
(= [256 pages * 4KB + 0x1000] in the current implementation), which
wastes memory. So, userspace needs a way to get the limit.

This patch adds a sysfs parameter to achieve this. It can be
set from 32 to 256 pages, and the default is 32 pages.

To increase the maximum request size, this parameter must be
changed before mounting FUSE filesystems.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---

 fs/fuse/inode.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 51 insertions(+), 2 deletions(-)
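
Note for readers: the store handler in the diff parses the written value and
silently clamps it into the 32..256 range rather than rejecting it. A
userspace approximation is sketched below. parse_max_pages() is an
illustrative name, strtoul() stands in for kstrtoul(), and error codes are
simplified to -EINVAL.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_MAX_PAGES 32ul   /* mirrors FUSE_DEFAULT_MAX_PAGES_PER_REQ */
#define HARD_MAX_PAGES    256ul  /* mirrors FUSE_MAX_PAGES_PER_REQ */

/* Returns 0 and stores the clamped value in *out, or -EINVAL on bad input. */
static int parse_max_pages(const char *buf, unsigned long *out)
{
	char *end;
	unsigned long t;

	assert(buf && out);

	/* Skip leading spaces, as skip_spaces() does in the patch. */
	while (isspace((unsigned char)*buf))
		buf++;
	/* strtoul() would accept a minus sign and wrap around; reject it. */
	if (*buf == '-')
		return -EINVAL;

	errno = 0;
	t = strtoul(buf, &end, 0);  /* base 0: decimal, octal or hex */
	if (errno || end == buf)
		return -EINVAL;
	if (*end == '\n')           /* allow the newline added by echo */
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* Out-of-range values are clamped, not rejected. */
	if (t < DEFAULT_MAX_PAGES)
		t = DEFAULT_MAX_PAGES;
	if (t > HARD_MAX_PAGES)
		t = HARD_MAX_PAGES;

	*out = t;
	return 0;
}
```

Under this behaviour, writing 8 results in 32 and writing 1000 results in
256, so userspace should read the value back to learn the effective limit.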

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d8d302a..50b78c6 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -47,6 +47,13 @@ MODULE_PARM_DESC(max_user_congthresh,
  "Global limit for the maximum congestion threshold an "
  "unprivileged user can set");
 
+/**
+ * Maximum number of pages allocated for struct fuse_req.
+ * It can be changed via sysfs from FUSE_DEFAULT_MAX_PAGES_PER_REQ
+ * to FUSE_MAX_PAGES_PER_REQ.
+ */
+static unsigned sysfs_max_req_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
+
 #define FUSE_SUPER_MAGIC 0x65735546
 
 #define FUSE_DEFAULT_BLKSIZE 512
@@ -780,8 +787,7 @@ static int set_global_limit(const char *val, struct kernel_param *kp)
 static void set_conn_max_pages(struct fuse_conn *fc, unsigned max_pages)
 {
 	if (max_pages > fc->max_pages) {
-		fc->max_pages = min_t(unsigned, FUSE_MAX_PAGES_PER_REQ,
-				      max_pages);
+		fc->max_pages = min_t(unsigned, sysfs_max_req_pages, max_pages);
 		fc->fuse_req_size = sizeof(struct fuse_req) +
 				    fc->max_pages * sizeof(struct page *);
 	}
@@ -1203,6 +1209,43 @@ static void fuse_fs_cleanup(void)
 static struct kobject *fuse_kobj;
 static struct kobject *connections_kobj;
 
+static ssize_t max_req_pages_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%u\n", sysfs_max_req_pages);
+}
+
+static ssize_t max_req_pages_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	int err;
+	unsigned long t;
+
+	err = kstrtoul(skip_spaces(buf), 0, &t);
+	if (err)
+		return err;
+
+	t = max_t(unsigned long, t, FUSE_DEFAULT_MAX_PAGES_PER_REQ);
+	t = min_t(unsigned long, t, FUSE_MAX_PAGES_PER_REQ);
+
+	sysfs_max_req_pages = t;
+	return count;
+}
+
+static struct kobj_attribute max_req_pages_attr =
+	__ATTR(max_pages_per_req, 0644, max_req_pages_show,
+	       max_req_pages_store);
+
+static struct attribute *fuse_attrs[] = {
+	&max_req_pages_attr.attr,
+	NULL,
+};
+
+static struct attribute_group fuse_attr_grp = {
+	.attrs = fuse_attrs,
+};
+
 static int fuse_sysfs_init(void)
 {
 	int err;
@@ -1219,8 +1262,14 @@ static int fuse_sysfs_init(void)
 		goto out_fuse_unregister;
 	}
 
+	err = sysfs_create_group(fuse_kobj, &fuse_attr_grp);
+	if (err)
+		goto out_conn_unregister;
+
 	return 0;
 
+ out_conn_unregister:
+	kobject_put(connections_kobj);
  out_fuse_unregister:
 	kobject_put(fuse_kobj);
  out_err:



* [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse request size
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
                   ` (3 preceding siblings ...)
  2012-07-05 10:51 ` [RFC PATCH 4/5] fuse: add a sysfs parameter to control maximum request size Mitsuo Hayasaka
@ 2012-07-05 10:51 ` Mitsuo Hayasaka
  2012-07-06 12:54   ` Rob Landley
  2012-07-05 13:04 ` [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Nikolaus Rath
  2012-07-06  5:53 ` Liu Yuan
  6 siblings, 1 reply; 16+ messages in thread
From: Mitsuo Hayasaka @ 2012-07-05 10:51 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt,
	Mitsuo Hayasaka, Rob Landley, Miklos Szeredi

Add an explanation of the sysfs parameter that limits the
maximum read/write request size.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Rob Landley <rob@landley.net>
Cc: Miklos Szeredi <miklos@szeredi.hu>
---

 Documentation/filesystems/fuse.txt |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/fuse.txt b/Documentation/filesystems/fuse.txt
index 13af4a4..e6ffba3 100644
--- a/Documentation/filesystems/fuse.txt
+++ b/Documentation/filesystems/fuse.txt
@@ -108,13 +108,28 @@ Mount options
 
   With this option the maximum size of read operations can be set.
   The default is infinite.  Note that the size of read requests is
-  limited anyway to 32 pages (which is 128kbyte on i386).
+  limited to 32 pages (which is 128kbyte on i386) if direct_io
+  option is not specified. When direct_io option is specified,
+  the request size is limited to max_pages_per_req sysfs parameter.
 
 'blksize=N'
 
   Set the block size for the filesystem.  The default is 512.  This
   option is only valid for 'fuseblk' type mounts.
 
+Sysfs parameter
+~~~~~~~~~~~~~~~
+
+The sysfs parameter max_pages_per_req limits the maximum page size per
+FUSE request.
+
+	/sys/fs/fuse/max_pages_per_req
+
+The default is 32 pages. It can be changed from 32 to 256 pages, which
+may improve the read/write throughput optimizing it. This change is
+effective per mount. Therefore, the re-mounting of FUSE filesystem
+is required after changing it.
+
 Control filesystem
 ~~~~~~~~~~~~~~~~~~
 



* Re: [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
                   ` (4 preceding siblings ...)
  2012-07-05 10:51 ` [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse " Mitsuo Hayasaka
@ 2012-07-05 13:04 ` Nikolaus Rath
  2012-07-06 10:09   ` HAYASAKA Mitsuo
  2012-07-06  5:53 ` Liu Yuan
  6 siblings, 1 reply; 16+ messages in thread
From: Nikolaus Rath @ 2012-07-05 13:04 UTC (permalink / raw)
  To: Mitsuo Hayasaka
  Cc: Miklos Szeredi, fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt

Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> writes:
> Hi,
>
> This patch series make maximum read/write request size tunable in FUSE.
> Currently, it is limited to FUSE_MAX_PAGES_PER_REQ which is equal
> to 32 pages. It is required to change it in order to improve the
> throughput since optimized value depends on various factors such
> as type and version of local filesystems used and HW specs, etc.

This truly is a joyful week for FUSE :-).

Are these patches compatible with the fuse write-back patch series
posted by Pavel a few days ago?


Thanks,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C


* Re: [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-05 10:50 [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Mitsuo Hayasaka
                   ` (5 preceding siblings ...)
  2012-07-05 13:04 ` [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Nikolaus Rath
@ 2012-07-06  5:53 ` Liu Yuan
  2012-07-06 13:58   ` [fuse-devel] " Han-Wen Nienhuys
  6 siblings, 1 reply; 16+ messages in thread
From: Liu Yuan @ 2012-07-06  5:53 UTC (permalink / raw)
  To: Mitsuo Hayasaka
  Cc: Miklos Szeredi, fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt

On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
> One of the ways to solve this is to make them tunable.
> In this series, the new sysfs parameter max_pages_per_req is introduced.
> It limits the maximum read/write size in fuse request and it can be
> changed from 32 to 256 pages in current implementations. When the
> max_read/max_write mount option is specified, FUSE request size is set
> per mount. (The size is rounded-up to page size and limited up to
> max_pages_per_req.)

Why a maximum of 256 pages? If we are changing this anyway, we can go
further: most object storage systems use object sizes ranging from a few
to dozens of megabytes, so I think 1M is probably too small. Our
distributed storage system uses 4M per object, so I think the maximum
should be allowed to reach at least 4M.

Thanks,
Yuan



* Re: [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-05 13:04 ` [RFC PATCH 0/5] fuse: make maximum read/write request size tunable Nikolaus Rath
@ 2012-07-06 10:09   ` HAYASAKA Mitsuo
  0 siblings, 0 replies; 16+ messages in thread
From: HAYASAKA Mitsuo @ 2012-07-06 10:09 UTC (permalink / raw)
  To: Nikolaus Rath; +Cc: Miklos Szeredi, fuse-devel, linux-kernel, yrl.pp-manager.tt

Hi Nikolaus,

Thank you for your comments.

(2012/07/05 22:04), Nikolaus Rath wrote:
> Mitsuo Hayasaka<mitsuo.hayasaka.hu@hitachi.com>  writes:
>> Hi,
>>
>> This patch series make maximum read/write request size tunable in FUSE.
>> Currently, it is limited to FUSE_MAX_PAGES_PER_REQ which is equal
>> to 32 pages. It is required to change it in order to improve the
>> throughput since optimized value depends on various factors such
>> as type and version of local filesystems used and HW specs, etc.
>
> This truly is a joyful week for FUSE :-).
>
> Are these patches compatible with the fuse write-back patch series
> posted by Pavel a few days ago?


I applied this patch series to the latest upstream kernel and measured
the read/write throughput with it, so I have not tried Pavel's patches yet.

However, I think it is compatible with his write-back patch series,
since it only makes the maximum limit of the fuse request size tunable.
My patches are also effective for direct I/O.

Thanks,

>
>
> Thanks,
>
>     -Nikolaus
>



* Re: [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse request size
  2012-07-05 10:51 ` [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse " Mitsuo Hayasaka
@ 2012-07-06 12:54   ` Rob Landley
  2012-07-12 13:13     ` HAYASAKA Mitsuo
  0 siblings, 1 reply; 16+ messages in thread
From: Rob Landley @ 2012-07-06 12:54 UTC (permalink / raw)
  To: Mitsuo Hayasaka
  Cc: Miklos Szeredi, fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt

On 07/05/2012 05:51 AM, Mitsuo Hayasaka wrote:
> Add an explanation of the sysfs parameter that limits the
> maximum read/write request size.
> 
> Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
> Cc: Rob Landley <rob@landley.net>
> Cc: Miklos Szeredi <miklos@szeredi.hu>
> ---
> 
>  Documentation/filesystems/fuse.txt |   17 ++++++++++++++++-
>  1 files changed, 16 insertions(+), 1 deletions(-)
> 
> diff --git a/Documentation/filesystems/fuse.txt b/Documentation/filesystems/fuse.txt
> index 13af4a4..e6ffba3 100644
> --- a/Documentation/filesystems/fuse.txt
> +++ b/Documentation/filesystems/fuse.txt
> @@ -108,13 +108,28 @@ Mount options
>  
>    With this option the maximum size of read operations can be set.
>    The default is infinite.  Note that the size of read requests is
> -  limited anyway to 32 pages (which is 128kbyte on i386).
> +  limited to 32 pages (which is 128kbyte on i386) if direct_io
> +  option is not specified. When direct_io option is specified,
> +  the request size is limited to max_pages_per_req sysfs parameter.

"Note that the maximum size of read requests defaults to 32 pages (128k
on i386), use max_pages_per_req to change this default."

And then describe max_pages_per_req sufficiently thoroughly below, all in
one place.

(By the way, has anybody actually tested it with a single page as the
limit? Does that work?)

>  'blksize=N'
>  
>    Set the block size for the filesystem.  The default is 512.  This
>    option is only valid for 'fuseblk' type mounts.
>  
> +Sysfs parameter
> +~~~~~~~~~~~~~~~
> +
> +The sysfs parameter max_pages_per_req limits the maximum page size per
> +FUSE request.

No, it limits the maximum size of a data request and the units are
decimal number of pages. It doesn't change the size of memory pages in
the system.
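
To make the units concrete, here is a minimal sketch of the arithmetic,
assuming 4096-byte pages as on i386. The helper name req_limit_bytes is
hypothetical and is not part of the patch.

```c
#include <stddef.h>

/* Convert a request limit expressed in pages into bytes.
 * page_size is supplied by the caller; 4096 is assumed for i386. */
static size_t req_limit_bytes(unsigned int pages, size_t page_size)
{
	return (size_t)pages * page_size;
}
```

With 4096-byte pages, 32 pages come to 131072 bytes (128k) and 256 pages
come to 1048576 bytes (1M). The page size itself never changes.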

Also, your first hunk implies this setting only takes effect if they
mounted with "-o direct_io", is that true?

> +	/sys/fs/fuse/max_pages_per_req
> +
> +The default is 32 pages. It can be changed from 32 to 256 pages, which
> +may improve the read/write throughput optimizing it. This change is
> +effective per mount. Therefore, the re-mounting of FUSE filesystem
> +is required after changing it.

I'd say "Changing it to 256 pages may improve read/write throughput on
systems with enough memory. Existing FUSE mounts must be remounted for
this change to take effect."

I.E. don't imply 32 and 256 are the only options unless they are. (Is
there some requirement that it be a power of 2, or just a good idea?)

And per-mount sounds like you're setting it for a specific mount point,
so if I have three mounts there would be three entries under
/sys/fs/fuse, which does not seem to be the case. (Which is odd, because
you'd think there would be an "-o max_pages_per_req=128" that _would_
set this per-mount if the value actually used is cached in the
superblock, but I'm not seeing one...)

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation.  Pick one.


* Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-06  5:53 ` Liu Yuan
@ 2012-07-06 13:58   ` Han-Wen Nienhuys
  2012-07-12  5:58     ` HAYASAKA Mitsuo
  0 siblings, 1 reply; 16+ messages in thread
From: Han-Wen Nienhuys @ 2012-07-06 13:58 UTC (permalink / raw)
  To: Liu Yuan
  Cc: Mitsuo Hayasaka, fuse-devel, yrl.pp-manager.tt, linux-doc,
	linux-kernel, Miklos Szeredi

On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan <namei.unix@gmail.com> wrote:
> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>> One of the ways to solve this is to make them tunable.
>> In this series, the new sysfs parameter max_pages_per_req is introduced.
>> It limits the maximum read/write size in fuse request and it can be
>> changed from 32 to 256 pages in current implementations. When the
>> max_read/max_write mount option is specified, FUSE request size is set
>> per mount. (The size is rounded-up to page size and limited up to
>> max_pages_per_req.)
>
> Why maxim 256 pages? If we are here, we can go further: most of object
> storage system has object size of multiple to dozens of megabytes. So I
> think probably 1M is too small. Our distribution storage system has 4M
> per object, so I think at least maxim size could be bigger than 4M.

The maximum pipe size on my system is 1M, so if you go beyond that,
splicing from the FD won't work.

Also, the userspace client must reserve a buffer this size so it can
receive a write, which is a waste since most requests are much
smaller.

-- 
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen


* Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-06 13:58   ` [fuse-devel] " Han-Wen Nienhuys
@ 2012-07-12  5:58     ` HAYASAKA Mitsuo
  2012-07-12  6:13       ` Liu Yuan
  2012-07-12 10:13       ` Miklos Szeredi
  0 siblings, 2 replies; 16+ messages in thread
From: HAYASAKA Mitsuo @ 2012-07-12  5:58 UTC (permalink / raw)
  To: hanwen, Han-Wen Nienhuys, Liu Yuan
  Cc: fuse-devel, linux-doc, linux-kernel, Miklos Szeredi, yrl.pp-manager.tt

Hi Yuan and Han-Wen,

Thank you for your comments.

(2012/07/06 22:58), Han-Wen Nienhuys wrote:
> On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan<namei.unix@gmail.com>  wrote:
>> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>>> One of the ways to solve this is to make them tunable.
>>> In this series, the new sysfs parameter max_pages_per_req is introduced.
>>> It limits the maximum read/write size in fuse request and it can be
>>> changed from 32 to 256 pages in current implementations. When the
>>> max_read/max_write mount option is specified, FUSE request size is set
>>> per mount. (The size is rounded-up to page size and limited up to
>>> max_pages_per_req.)
>>
>> Why maxim 256 pages? If we are here, we can go further: most of object
>> storage system has object size of multiple to dozens of megabytes. So I
>> think probably 1M is too small. Our distribution storage system has 4M
>> per object, so I think at least maxim size could be bigger than 4M.
>
> The maximum pipe size on my system is 1M, so if you go beyond that,
> splicing from the FD won't work.
>
> Also, the userspace client must reserve a buffer this size so it can
> receive a write, which is a waste since most requests are much
> smaller.
>

I checked, and the maximum pipe size can be changed using fcntl(2) or
/proc/sys/fs/pipe-max-size, so it is not a fixed value.
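
For reference, a minimal userspace sketch of both mechanisms, assuming a
Linux kernel new enough to support F_SETPIPE_SZ. The helper names
pipe_max_size and pipe_resize are hypothetical, and the numeric fallbacks
are only there in case the headers were pulled in without _GNU_SOURCE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* F_GETPIPE_SZ / F_SETPIPE_SZ are Linux-specific */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Fallbacks in case <fcntl.h> was already included without _GNU_SOURCE. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/* Return the system-wide cap from /proc/sys/fs/pipe-max-size, or -1
 * if the file cannot be read. */
static long pipe_max_size(void)
{
	long val = -1;
	FILE *f = fopen("/proc/sys/fs/pipe-max-size", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/* Ask the kernel to resize the pipe behind fd. Returns the capacity in
 * effect afterwards (the kernel may round the request up), or -1. */
static long pipe_resize(int fd, long bytes)
{
	if (fcntl(fd, F_SETPIPE_SZ, (int)bytes) < 0)
		return -1;
	return fcntl(fd, F_GETPIPE_SZ);
}
```

An unprivileged process can typically grow a pipe only up to the value in
/proc/sys/fs/pipe-max-size; raising that value itself requires root.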

Also, it seems that there is a request to raise the maximum size of a
fuse request to 4M (1024 pages). One of the reasons for introducing the
sysfs max_pages_per_req parameter is to let the administrator set the
threshold for the maximum number of pages dynamically; only root can
change it.

So, if the request size must not exceed the pipe-max-size, the
administrator should take that into account when setting
max_pages_per_req. I don't think the upper limit of the parameter itself
has to be bounded by the pipe-max-size.

I'm planning to limit max_pages_per_req to at most 1024 pages and add the
following text to Documentation/filesystems/fuse.txt:

"The sysfs max_pages_per_req parameter can be set to any value from 32 to
1024. The default is 32 pages. The pipe-max-size is generally 1M (256
pages), and it is better not to set max_pages_per_req higher than that."

This is just a plan and any comments are appreciated.

Thanks,


* Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-12  5:58     ` HAYASAKA Mitsuo
@ 2012-07-12  6:13       ` Liu Yuan
  2012-07-12 10:13       ` Miklos Szeredi
  1 sibling, 0 replies; 16+ messages in thread
From: Liu Yuan @ 2012-07-12  6:13 UTC (permalink / raw)
  To: HAYASAKA Mitsuo
  Cc: hanwen, Han-Wen Nienhuys, fuse-devel, linux-doc, linux-kernel,
	Miklos Szeredi, yrl.pp-manager.tt

On 07/12/2012 01:58 PM, HAYASAKA Mitsuo wrote:
> Hi Yuan and Han-Wen,
> 
> Thank you for your comments.
> 
> (2012/07/06 22:58), Han-Wen Nienhuys wrote:
>> On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan<namei.unix@gmail.com>  wrote:
>>> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>>>> One of the ways to solve this is to make them tunable.
>>>> In this series, the new sysfs parameter max_pages_per_req is
>>>> introduced.
>>>> It limits the maximum read/write size in fuse request and it can be
>>>> changed from 32 to 256 pages in current implementations. When the
>>>> max_read/max_write mount option is specified, FUSE request size is set
>>>> per mount. (The size is rounded-up to page size and limited up to
>>>> max_pages_per_req.)
>>>
>>> Why maxim 256 pages? If we are here, we can go further: most of object
>>> storage system has object size of multiple to dozens of megabytes. So I
>>> think probably 1M is too small. Our distribution storage system has 4M
>>> per object, so I think at least maxim size could be bigger than 4M.
>>
>> The maximum pipe size on my system is 1M, so if you go beyond that,
>> splicing from the FD won't work.
>>
>> Also, the userspace client must reserve a buffer this size so it can
>> receive a write, which is a waste since most requests are much
>> smaller.
>>
> 
> I checked the maximum pipe size can be changed using fcntl(2) or
> /proc/sys/fs/pipe-max-size. It is clear that it is not a fixed value.
> 
> Also, it seems that there is a request for setting the maximum number
> of pages per fuse request to 4M (1024 pages). One of the reasons to
> introduce the sysfs max_pages_per_req parameter is to set a threshold
> of the maximum number of pages dynamically according to the
> administrator's demand, and root can only change it.
> 
> So, when the maximum value is required to be set to not more than the
> pipe-max-size, the max_pages_per_req should be changed considering it.
> It seems that the upper limit of this parameter does not have to be
> not more than it.
> 
> I'm planning to limit max_pages_per_req up to 1024 pages and add the
> document to /Documentation/filesystems/fuse.txt, as follows.
> 
> "the sysfs max_pages_per_req parameter can be changed from 32 to 1024.
> The default is 32 pages. Generally, the pipe-max-size is 1M (256 pages)
> and it is better to set it to not more than the pipe-max-size."
> 
> This is just a plan and any comments are appreciated.

This looks reasonable to me. We should raise the upper limit as far as
we can, to handle the various kinds of demands.

Thanks for your work, Mitsuo, as a user of FUSE, I'd vote +1 for your
patch set.

Thanks,
Yuan


* Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-12  5:58     ` HAYASAKA Mitsuo
  2012-07-12  6:13       ` Liu Yuan
@ 2012-07-12 10:13       ` Miklos Szeredi
  2012-07-13  7:30         ` HAYASAKA Mitsuo
  1 sibling, 1 reply; 16+ messages in thread
From: Miklos Szeredi @ 2012-07-12 10:13 UTC (permalink / raw)
  To: HAYASAKA Mitsuo
  Cc: hanwen, Han-Wen Nienhuys, Liu Yuan, fuse-devel, linux-doc,
	linux-kernel, yrl.pp-manager.tt

HAYASAKA Mitsuo <mitsuo.hayasaka.hu@hitachi.com> writes:

> Hi Yuan and Han-Wen,
>
> Thank you for your comments.
>
> (2012/07/06 22:58), Han-Wen Nienhuys wrote:
>> On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan<namei.unix@gmail.com>  wrote:
>>> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>>>> One of the ways to solve this is to make them tunable.
>>>> In this series, the new sysfs parameter max_pages_per_req is introduced.
>>>> It limits the maximum read/write size in fuse request and it can be
>>>> changed from 32 to 256 pages in current implementations. When the
>>>> max_read/max_write mount option is specified, FUSE request size is set
>>>> per mount. (The size is rounded-up to page size and limited up to
>>>> max_pages_per_req.)
>>>
>>> Why maxim 256 pages? If we are here, we can go further: most of object
>>> storage system has object size of multiple to dozens of megabytes. So I
>>> think probably 1M is too small. Our distribution storage system has 4M
>>> per object, so I think at least maxim size could be bigger than 4M.
>>
>> The maximum pipe size on my system is 1M, so if you go beyond that,
>> splicing from the FD won't work.
>>
>> Also, the userspace client must reserve a buffer this size so it can
>> receive a write, which is a waste since most requests are much
>> smaller.
>>
>
> I checked the maximum pipe size can be changed using fcntl(2) or
> /proc/sys/fs/pipe-max-size. It is clear that it is not a fixed value.
>
> Also, it seems that there is a request for setting the maximum number
> of pages per fuse request to 4M (1024 pages). One of the reasons to
> introduce the sysfs max_pages_per_req parameter is to set a threshold
> of the maximum number of pages dynamically according to the
> administrator's demand, and root can only change it.
>
> So, when the maximum value is required to be set to not more than the
> pipe-max-size, the max_pages_per_req should be changed considering it.
> It seems that the upper limit of this parameter does not have to be
> not more than it.
>
> I'm planning to limit max_pages_per_req up to 1024 pages and add the
> document to /Documentation/filesystems/fuse.txt, as follows.
>
> "the sysfs max_pages_per_req parameter can be changed from 32 to 1024.
> The default is 32 pages. Generally, the pipe-max-size is 1M (256 pages)
> and it is better to set it to not more than the pipe-max-size."

Can't we just use pipe-max-size for the limit?

Then we'll use the minimum of pipe-max-size and max_read/max_write for
sizing the requests.
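
A rough sketch of that sizing rule, as plain C rather than kernel code.
The function name req_pages and its parameters are hypothetical; rounding
up to whole pages follows the cover letter's description.

```c
#include <stddef.h>

/* Number of pages a request may carry: the smaller of the pipe limit
 * and the max_read/max_write mount option, rounded up to whole pages. */
static unsigned int req_pages(size_t pipe_max_bytes, size_t max_rw_bytes,
			      size_t page_size)
{
	size_t limit = pipe_max_bytes < max_rw_bytes ? pipe_max_bytes
						      : max_rw_bytes;

	return (unsigned int)((limit + page_size - 1) / page_size);
}
```

With a 1M pipe-max-size and 4096-byte pages, a mount asking for 4M would
be sized at 256 pages, while a mount asking for 128k would stay at 32.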

Another comment: do we really need to allocate each and every request
with space for the pages?  I don't think that makes sense.  Let's leave
some small number of pages inline in the request and allocate a separate
array if the number of pages is too large.  There may even be some utilities
in the kernel to handle dynamically sized page arrays (I haven't looked,
but I suspect there are).
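
To illustrate the idea only, here is a userspace sketch with a small
inline array and a heap fallback. struct req, REQ_INLINE_PAGES and the
two helpers are hypothetical names, the inline capacity of 8 is an
arbitrary assumption, and calloc()/free() stand in for whatever the
kernel code would actually use.

```c
#include <stdlib.h>

#define REQ_INLINE_PAGES 8	/* assumed inline capacity, chosen arbitrarily */

struct req {
	void **pages;		/* points at inline_pages or a heap array */
	unsigned int max_pages;
	void *inline_pages[REQ_INLINE_PAGES];
};

/* Returns 0 on success, -1 if the separate array cannot be allocated. */
static int req_init(struct req *r, unsigned int npages)
{
	r->max_pages = npages;
	if (npages <= REQ_INLINE_PAGES) {
		r->pages = r->inline_pages;	/* small request: no extra allocation */
		return 0;
	}
	r->pages = calloc(npages, sizeof(*r->pages));
	return r->pages ? 0 : -1;
}

/* Free the separate array, if one was allocated. */
static void req_release(struct req *r)
{
	if (r->pages != r->inline_pages)
		free(r->pages);
	r->pages = NULL;
}
```

Small requests then cost nothing beyond the request itself, and only the
large ones pay for a second allocation.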

Thanks,
Miklos


* Re: [RFC PATCH 5/5] fuse: add documentation of sysfs parameter to limit maximum fuse request size
  2012-07-06 12:54   ` Rob Landley
@ 2012-07-12 13:13     ` HAYASAKA Mitsuo
  0 siblings, 0 replies; 16+ messages in thread
From: HAYASAKA Mitsuo @ 2012-07-12 13:13 UTC (permalink / raw)
  To: Rob Landley
  Cc: Miklos Szeredi, fuse-devel, linux-kernel, linux-doc, yrl.pp-manager.tt

Hi Rob,

Thank you for your comments.


(2012/07/06 21:54), Rob Landley wrote:
> On 07/05/2012 05:51 AM, Mitsuo Hayasaka wrote:
>> Add an explanation of the sysfs parameter that limits the
>> maximum read/write request size.
>>
>> Signed-off-by: Mitsuo Hayasaka<mitsuo.hayasaka.hu@hitachi.com>
>> Cc: Rob Landley<rob@landley.net>
>> Cc: Miklos Szeredi<miklos@szeredi.hu>
>> ---
>>
>>   Documentation/filesystems/fuse.txt |   17 ++++++++++++++++-
>>   1 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/Documentation/filesystems/fuse.txt b/Documentation/filesystems/fuse.txt
>> index 13af4a4..e6ffba3 100644
>> --- a/Documentation/filesystems/fuse.txt
>> +++ b/Documentation/filesystems/fuse.txt
>> @@ -108,13 +108,28 @@ Mount options
>>
>>     With this option the maximum size of read operations can be set.
>>     The default is infinite.  Note that the size of read requests is
>> -  limited anyway to 32 pages (which is 128kbyte on i386).
>> +  limited to 32 pages (which is 128kbyte on i386) if direct_io
>> +  option is not specified. When direct_io option is specified,
>> +  the request size is limited to max_pages_per_req sysfs parameter.
>
> "Note that the maximum size of read requests defaults to 32 pages (128k
> on i386), use max_pages_per_req to change this default."
>
> And then describe max_pages_per_req sufficiently thoroughly below, all in
> one place.

OK, I will revise it.

>
> (By the way, has anybody actually tested it with a single page as the
> limit? Does that work?)


This patch series allows the maximum request size to be set to an arbitrary
number of pages from 32 to 256; it cannot be set to less than 32 pages.

>
>>   'blksize=N'
>>
>>     Set the block size for the filesystem.  The default is 512.  This
>>     option is only valid for 'fuseblk' type mounts.
>>
>> +Sysfs parameter
>> +~~~~~~~~~~~~~~~
>> +
>> +The sysfs parameter max_pages_per_req limits the maximum page size per
>> +FUSE request.
>
> No, it limits the maximum size of a data request and the units are
> decimal number of pages. It doesn't change the size of memory pages in
> the system.

You are right. I will revise it.


>
> Also, your first hunk implies this setting only takes effect if they
> mounted with "-o direct_io", is that true?


The request size is raised to max_pages_per_req for read operations with
direct_io, and for write operations both with and without direct_io. It is
not changed for read operations without direct_io. So it is true as far as
read operations are concerned.


>
>> +	/sys/fs/fuse/max_pages_per_req
>> +
>> +The default is 32 pages. It can be changed from 32 to 256 pages, which
>> +may improve the read/write throughput optimizing it. This change is
>> +effective per mount. Therefore, the re-mounting of FUSE filesystem
>> +is required after changing it.
>
> I'd say "Changing it to 256 pages may improve read/write throughput on
> systems with enough memory. Existing FUSE mounts must be remounted for
> this change to take effect."
>
> I.E. don't imply 32 and 256 are the only options unless they are. (Is
> there some requirement that it be a power of 2, or just a good idea?)


Here, I wanted to say that max_pages_per_req can be set to an
arbitrary number from 32 to 256. I will revise it, since the current
explanation is misleading.

Also, there is no requirement that it be a power of 2, although that is a
good idea as far as kmalloc() is concerned. One of the reasons for
introducing the max_pages_per_req sysfs parameter is to let libfuse read
the current maximum request size and adjust its MIN_BUFSIZE limit
accordingly, to avoid wasting memory in userspace.
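
As a rough sketch of what the userspace side could look like (this is
not libfuse code): the helper names are hypothetical, the sysfs file only
exists with this series applied, and the 0x1000 bytes of header room is
an assumption, picked so that 32 pages of 4096 bytes come to 0x21000.

```c
#include <stddef.h>
#include <stdio.h>

/* Read the limit in pages from a sysfs-style file; return `fallback`
 * if the file is missing or cannot be parsed. */
static unsigned int read_max_pages(const char *path, unsigned int fallback)
{
	unsigned int val;
	FILE *f = fopen(path, "r");

	if (!f)
		return fallback;
	if (fscanf(f, "%u", &val) != 1)
		val = fallback;
	fclose(f);
	return val;
}

/* Buffer size for one request: the payload pages plus an assumed
 * 0x1000 bytes of room for the request header. */
static size_t req_bufsize(unsigned int pages, size_t page_size)
{
	return (size_t)pages * page_size + 0x1000;
}
```

A caller would pass "/sys/fs/fuse/max_pages_per_req" with a fallback of
32, so kernels without the parameter keep today's buffer size.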


>
> And per-mount sounds like you're setting it for a specific mount point,
> so if I have three mounts there would be three entries under
> /sys/fs/fuse, which does not seem to be the case. (Which is odd, because
> you'd think there would be an "-o max_pages_per_req=128" that _would_
> set this per-mount if the value actually used is cached in the
> superblock, but I'm not seeing one...)


The max_pages_per_req parameter is a system-wide limit controlled by the
administrator. The actual number of pages allocated per request can be set
per mount, below this system limit, using the max_read/max_write mount
options.

I will revise and resubmit this patch series soon.

Thanks,

>
> Rob


* Re: [fuse-devel] [RFC PATCH 0/5] fuse: make maximum read/write request size tunable
  2012-07-12 10:13       ` Miklos Szeredi
@ 2012-07-13  7:30         ` HAYASAKA Mitsuo
  0 siblings, 0 replies; 16+ messages in thread
From: HAYASAKA Mitsuo @ 2012-07-13  7:30 UTC (permalink / raw)
  To: Miklos Szeredi, hanwen, Han-Wen Nienhuys, Liu Yuan
  Cc: fuse-devel, linux-doc, linux-kernel, yrl.pp-manager.tt

Hi Miklos,

Thank you for your comments.

(2012/07/12 19:13), Miklos Szeredi wrote:
> HAYASAKA Mitsuo<mitsuo.hayasaka.hu@hitachi.com>  writes:
>
>> Hi Yuan and Han-Wen,
>>
>> Thank you for your comments.
>>
>> (2012/07/06 22:58), Han-Wen Nienhuys wrote:
>>> On Fri, Jul 6, 2012 at 2:53 AM, Liu Yuan<namei.unix@gmail.com>   wrote:
>>>> On 07/05/2012 06:50 PM, Mitsuo Hayasaka wrote:
>>>>> One of the ways to solve this is to make them tunable.
>>>>> In this series, the new sysfs parameter max_pages_per_req is introduced.
>>>>> It limits the maximum read/write size in fuse request and it can be
>>>>> changed from 32 to 256 pages in current implementations. When the
>>>>> max_read/max_write mount option is specified, FUSE request size is set
>>>>> per mount. (The size is rounded-up to page size and limited up to
>>>>> max_pages_per_req.)
>>>>
>>>> Why maxim 256 pages? If we are here, we can go further: most of object
>>>> storage system has object size of multiple to dozens of megabytes. So I
>>>> think probably 1M is too small. Our distribution storage system has 4M
>>>> per object, so I think at least maxim size could be bigger than 4M.
>>>
>>> The maximum pipe size on my system is 1M, so if you go beyond that,
>>> splicing from the FD won't work.
>>>
>>> Also, the userspace client must reserve a buffer this size so it can
>>> receive a write, which is a waste since most requests are much
>>> smaller.
>>>
>>
>> I checked the maximum pipe size can be changed using fcntl(2) or
>> /proc/sys/fs/pipe-max-size. It is clear that it is not a fixed value.
>>
>> Also, it seems that there is a request for setting the maximum number
>> of pages per fuse request to 4M (1024 pages). One of the reasons to
>> introduce the sysfs max_pages_per_req parameter is to set a threshold
>> of the maximum number of pages dynamically according to the
>> administrator's demand, and root can only change it.
>>
>> So, when the maximum value is required to be set to not more than the
>> pipe-max-size, the max_pages_per_req should be changed considering it.
>> It seems that the upper limit of this parameter does not have to be
>> not more than it.
>>
>> I'm planning to limit max_pages_per_req up to 1024 pages and add the
>> document to /Documentation/filesystems/fuse.txt, as follows.
>>
>> "the sysfs max_pages_per_req parameter can be changed from 32 to 1024.
>> The default is 32 pages. Generally, the pipe-max-size is 1M (256 pages)
>> and it is better to set it to not more than the pipe-max-size."
>
> Can't we just use pipe-max-size for the limit?

This is great!
I'd like to change this series to use the pipe-max-size as the upper
limit of the max_pages_per_req sysfs parameter, and resubmit it.

>
> Then we'll use the minimum of pipe-max-size and max_read/max_write for
> sizing the requests.
>
> Another comment: do we really need to allocate each and every request
> with space for the pages?  I don't think that makes sense.  Let's leave
> some small number of pages inline in the request and allocate a separate
> array if the number of pages is too large.  There may even be some utilities
> in the kernel to handle dynamically sized page arrays (I haven't looked
> but I suspect there is).

This is interesting, and it could dramatically reduce the number of page
allocations and frees. However, I need to investigate whether it is
feasible.

Thanks,

>
> Thanks,
> Miklos
>

