linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v4] fuse: add max_pages option
@ 2018-09-06 12:37 Constantine Shulyupin
  2018-09-26  8:54 ` Constantine Shulyupin
  0 siblings, 1 reply; 8+ messages in thread
From: Constantine Shulyupin @ 2018-09-06 12:37 UTC (permalink / raw)
  To: miklos, linux-fsdevel, viro, corbet, liushuoran, mitsuo.hayasaka.hu
  Cc: amir73il, Constantine Shulyupin

Replace FUSE_MAX_PAGES_PER_REQ with the configurable
mount parameter max_pages to improve performance.

An old RFC by Mitsuo Hayasaka (mitsuo.hayasaka.hu@hitachi.com), with a
detailed description of the problem and many fixes:
 - https://lkml.org/lkml/2012/7/5/136

The parameter is implemented as a mount option, following the existing
max_read and blksize options. A more complex implementation could
negotiate the value through the INIT message instead.

	Performance degradation and restoration.

FUSE introduces significant performance degradation under the following
conditions:
- block storage is very fast.
- CPU is slow or busy.
- user-space FUSE adds lag to each request.

The max_pages parameter helps to restore performance in this case.

We encountered this degradation, and fixed it, in a large and
complex virtual environment.

Environment to reproduce the degradation and the improvement:

1. Add lag to the user-mode FUSE daemon:
add nanosleep(&(struct timespec){ 0, 1000 }, NULL); to xmp_write_buf
in passthrough_fh.c

2. Patch user-mode FUSE (libfuse) with a configurable max_pages parameter.
That patch will be provided later.

3. Run the test script below to perform the test on tmpfs:
fuse_test()
{

       cd /tmp
       mkdir -p fusemnt
       passthrough_fh -o max_pages=$1 /tmp/fusemnt
       grep fuse /proc/self/mounts
       dd conv=fdatasync oflag=dsync if=/dev/zero of=fusemnt/tmp/tmp \
		count=1K bs=1M 2>&1 | grep -v records
       rm fusemnt/tmp/tmp
       killall passthrough_fh
}

Test results:

passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
	rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
1073741824 bytes (1.1 GB) copied, 1.73867 s, 618 MB/s

passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
	rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_pages=256 0 0
1073741824 bytes (1.1 GB) copied, 1.15643 s, 928 MB/s

With a bigger lag, the difference between 'before' and 'after'
will be more significant.

Mitsuo Hayasaka, in 2012 (https://lkml.org/lkml/2012/7/5/136),
observed improvement from 400-550 to 520-740.

Signed-off-by: Constantine Shulyupin <const@MakeLinux.com>

---

Hi Miklos,

Above is the information you requested.

Thanks,
Costa.

Changes in v4:
- join three patches together
- add notes about mount option and performance notes

Changes in v3:
- used clamp_val
- split documentation change
- split EXPORT_SYMBOL(pipe_max_size)

Changes in v2:
- add limitation by pipe_max_size, which was requested in
  https://lkml.org/lkml/2012/7/12/32

Changes in v1: https://lkml.org/lkml/2017/8/6/194
 - replace FUSE_MAX_PAGES_PER_REQ with
   FUSE_DEFAULT_MAX_PAGES_PER_REQ and
   fc->max_pages
 - add mount option max_pages
---
 Documentation/filesystems/fuse.txt |  5 ++-
 fs/fuse/dev.c                      |  4 +--
 fs/fuse/file.c                     | 54 +++++++++++++++---------------
 fs/fuse/fuse_i.h                   |  5 ++-
 fs/fuse/inode.c                    | 14 ++++++++
 fs/pipe.c                          |  1 +
 6 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/Documentation/filesystems/fuse.txt b/Documentation/filesystems/fuse.txt
index 13af4a49e7db..d4e832fe9ce6 100644
--- a/Documentation/filesystems/fuse.txt
+++ b/Documentation/filesystems/fuse.txt
@@ -108,7 +108,10 @@ Mount options
 
   With this option the maximum size of read operations can be set.
   The default is infinite.  Note that the size of read requests is
-  limited anyway to 32 pages (which is 128kbyte on i386).
+  limited anyway to max_pages (which by default is 32 or 128KB on x86).
+
+'max_pages=N'
+   Maximal number of pages per request. The default is 32 or 128KB on x86.
 
 'blksize=N'
 
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 11ea2c4a38ab..b324ffc2d82a 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1663,7 +1663,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 	unsigned int num;
 	unsigned int offset;
 	size_t total_len = 0;
-	int num_pages;
+	unsigned num_pages;
 
 	offset = outarg->offset & ~PAGE_MASK;
 	file_size = i_size_read(inode);
@@ -1675,7 +1675,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
 		num = file_size - outarg->offset;
 
 	num_pages = (num + offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	num_pages = min(num_pages, FUSE_MAX_PAGES_PER_REQ);
+	num_pages = min(num_pages, fc->max_pages);
 
 	req = fuse_get_req(fc, num_pages);
 	if (IS_ERR(req))
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index fe8d84eb714a..e0924e46ef24 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -847,11 +847,11 @@ static int fuse_readpages_fill(void *_data, struct page *page)
 	fuse_wait_on_page_writeback(inode, page->index);
 
 	if (req->num_pages &&
-	    (req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
+	    (req->num_pages == fc->max_pages ||
 	     (req->num_pages + 1) * PAGE_SIZE > fc->max_read ||
 	     req->pages[req->num_pages - 1]->index + 1 != page->index)) {
 		int nr_alloc = min_t(unsigned, data->nr_pages,
-				     FUSE_MAX_PAGES_PER_REQ);
+				     fc->max_pages);
 		fuse_send_readpages(req, data->file);
 		if (fc->async_read)
 			req = fuse_get_req_for_background(fc, nr_alloc);
@@ -886,7 +886,7 @@ static int fuse_readpages(struct file *file, struct address_space *mapping,
 	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_fill_data data;
 	int err;
-	int nr_alloc = min_t(unsigned, nr_pages, FUSE_MAX_PAGES_PER_REQ);
+	int nr_alloc = min_t(unsigned, nr_pages, fc->max_pages);
 
 	err = -EIO;
 	if (is_bad_inode(inode))
@@ -1101,12 +1101,12 @@ static ssize_t fuse_fill_write_pages(struct fuse_req *req,
 	return count > 0 ? count : err;
 }
 
-static inline unsigned fuse_wr_pages(loff_t pos, size_t len)
+static inline unsigned fuse_wr_pages(loff_t pos, size_t len, unsigned max_pages)
 {
 	return min_t(unsigned,
 		     ((pos + len - 1) >> PAGE_SHIFT) -
 		     (pos >> PAGE_SHIFT) + 1,
-		     FUSE_MAX_PAGES_PER_REQ);
+		     max_pages);
 }
 
 static ssize_t fuse_perform_write(struct kiocb *iocb,
@@ -1128,7 +1128,8 @@ static ssize_t fuse_perform_write(struct kiocb *iocb,
 	do {
 		struct fuse_req *req;
 		ssize_t count;
-		unsigned nr_pages = fuse_wr_pages(pos, iov_iter_count(ii));
+		unsigned nr_pages = fuse_wr_pages(pos, iov_iter_count(ii),
+						  fc->max_pages);
 
 		req = fuse_get_req(fc, nr_pages);
 		if (IS_ERR(req)) {
@@ -1318,11 +1319,6 @@ static int fuse_get_user_pages(struct fuse_req *req, struct iov_iter *ii,
 	return ret < 0 ? ret : 0;
 }
 
-static inline int fuse_iter_npages(const struct iov_iter *ii_p)
-{
-	return iov_iter_npages(ii_p, FUSE_MAX_PAGES_PER_REQ);
-}
-
 ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 		       loff_t *ppos, int flags)
 {
@@ -1342,9 +1338,10 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 	int err = 0;
 
 	if (io->async)
-		req = fuse_get_req_for_background(fc, fuse_iter_npages(iter));
+		req = fuse_get_req_for_background(fc, iov_iter_npages(iter,
+								fc->max_pages));
 	else
-		req = fuse_get_req(fc, fuse_iter_npages(iter));
+		req = fuse_get_req(fc, iov_iter_npages(iter, fc->max_pages));
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
@@ -1389,9 +1386,10 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 			fuse_put_request(fc, req);
 			if (io->async)
 				req = fuse_get_req_for_background(fc,
-					fuse_iter_npages(iter));
+					iov_iter_npages(iter, fc->max_pages));
 			else
-				req = fuse_get_req(fc, fuse_iter_npages(iter));
+				req = fuse_get_req(fc, iov_iter_npages(iter,
+								fc->max_pages));
 			if (IS_ERR(req))
 				break;
 		}
@@ -1818,7 +1816,7 @@ static int fuse_writepages_fill(struct page *page,
 	is_writeback = fuse_page_is_writeback(inode, page->index);
 
 	if (req && req->num_pages &&
-	    (is_writeback || req->num_pages == FUSE_MAX_PAGES_PER_REQ ||
+	    (is_writeback || req->num_pages == fc->max_pages ||
 	     (req->num_pages + 1) * PAGE_SIZE > fc->max_write ||
 	     data->orig_pages[req->num_pages - 1]->index + 1 != page->index)) {
 		fuse_writepages_send(data);
@@ -1846,7 +1844,7 @@ static int fuse_writepages_fill(struct page *page,
 		struct fuse_inode *fi = get_fuse_inode(inode);
 
 		err = -ENOMEM;
-		req = fuse_request_alloc_nofs(FUSE_MAX_PAGES_PER_REQ);
+		req = fuse_request_alloc_nofs(fc->max_pages);
 		if (!req) {
 			__free_page(tmp_page);
 			goto out_unlock;
@@ -1903,6 +1901,7 @@ static int fuse_writepages(struct address_space *mapping,
 			   struct writeback_control *wbc)
 {
 	struct inode *inode = mapping->host;
+	struct fuse_conn *fc = get_fuse_conn(inode);
 	struct fuse_fill_wb_data data;
 	int err;
 
@@ -1915,7 +1914,7 @@ static int fuse_writepages(struct address_space *mapping,
 	data.ff = NULL;
 
 	err = -ENOMEM;
-	data.orig_pages = kcalloc(FUSE_MAX_PAGES_PER_REQ,
+	data.orig_pages = kcalloc(fc->max_pages,
 				  sizeof(struct page *),
 				  GFP_NOFS);
 	if (!data.orig_pages)
@@ -2386,10 +2385,11 @@ static int fuse_copy_ioctl_iovec_old(struct iovec *dst, void *src,
 }
 
 /* Make sure iov_length() won't overflow */
-static int fuse_verify_ioctl_iov(struct iovec *iov, size_t count)
+static int fuse_verify_ioctl_iov(struct fuse_conn *fc, struct iovec *iov,
+				 size_t count)
 {
 	size_t n;
-	u32 max = FUSE_MAX_PAGES_PER_REQ << PAGE_SHIFT;
+	u32 max = fc->max_pages << PAGE_SHIFT;
 
 	for (n = 0; n < count; n++, iov++) {
 		if (iov->iov_len > (size_t) max)
@@ -2513,7 +2513,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 	BUILD_BUG_ON(sizeof(struct fuse_ioctl_iovec) * FUSE_IOCTL_MAX_IOV > PAGE_SIZE);
 
 	err = -ENOMEM;
-	pages = kcalloc(FUSE_MAX_PAGES_PER_REQ, sizeof(pages[0]), GFP_KERNEL);
+	pages = kcalloc(fc->max_pages, sizeof(pages[0]), GFP_KERNEL);
 	iov_page = (struct iovec *) __get_free_page(GFP_KERNEL);
 	if (!pages || !iov_page)
 		goto out;
@@ -2552,7 +2552,7 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 
 	/* make sure there are enough buffer pages and init request with them */
 	err = -ENOMEM;
-	if (max_pages > FUSE_MAX_PAGES_PER_REQ)
+	if (max_pages > fc->max_pages)
 		goto out;
 	while (num_pages < max_pages) {
 		pages[num_pages] = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
@@ -2639,11 +2639,11 @@ long fuse_do_ioctl(struct file *file, unsigned int cmd, unsigned long arg,
 		in_iov = iov_page;
 		out_iov = in_iov + in_iovs;
 
-		err = fuse_verify_ioctl_iov(in_iov, in_iovs);
+		err = fuse_verify_ioctl_iov(fc, in_iov, in_iovs);
 		if (err)
 			goto out;
 
-		err = fuse_verify_ioctl_iov(out_iov, out_iovs);
+		err = fuse_verify_ioctl_iov(fc, out_iov, out_iovs);
 		if (err)
 			goto out;
 
@@ -2834,9 +2834,9 @@ static void fuse_do_truncate(struct file *file)
 	fuse_do_setattr(file_dentry(file), &attr, file);
 }
 
-static inline loff_t fuse_round_up(loff_t off)
+static inline loff_t fuse_round_up(struct fuse_conn *fc, loff_t off)
 {
-	return round_up(off, FUSE_MAX_PAGES_PER_REQ << PAGE_SHIFT);
+	return round_up(off, fc->max_pages << PAGE_SHIFT);
 }
 
 static ssize_t
@@ -2865,7 +2865,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	if (async_dio && iov_iter_rw(iter) != WRITE && offset + count > i_size) {
 		if (offset >= i_size)
 			return 0;
-		iov_iter_truncate(iter, fuse_round_up(i_size - offset));
+		iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
 		count = iov_iter_count(iter);
 	}
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f78e9614bb5f..7556301cb493 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -29,7 +29,7 @@
 #include <linux/user_namespace.h>
 
 /** Max number of pages that can be used in a single read request */
-#define FUSE_MAX_PAGES_PER_REQ 32
+#define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
 
 /** Bias for fi->writectr, meaning new writepages must not be sent */
 #define FUSE_NOWRITE INT_MIN
@@ -476,6 +476,9 @@ struct fuse_conn {
 	/** Maximum write size */
 	unsigned max_write;
 
+	/** Maximum number of pages that can be used in a single request */
+	unsigned max_pages;
+
 	/** Input queue */
 	struct fuse_iqueue iq;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index db9e60b7eb69..aed77c77cccc 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -22,6 +22,7 @@
 #include <linux/exportfs.h>
 #include <linux/posix_acl.h>
 #include <linux/pid_namespace.h>
+#include <linux/pipe_fs_i.h>
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
 MODULE_DESCRIPTION("Filesystem in Userspace");
@@ -71,6 +72,7 @@ struct fuse_mount_data {
 	unsigned default_permissions:1;
 	unsigned allow_other:1;
 	unsigned max_read;
+	unsigned max_pages;
 	unsigned blksize;
 };
 
@@ -453,6 +455,7 @@ enum {
 	OPT_DEFAULT_PERMISSIONS,
 	OPT_ALLOW_OTHER,
 	OPT_MAX_READ,
+	OPT_MAX_PAGES,
 	OPT_BLKSIZE,
 	OPT_ERR
 };
@@ -465,6 +468,7 @@ static const match_table_t tokens = {
 	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
 	{OPT_ALLOW_OTHER,		"allow_other"},
 	{OPT_MAX_READ,			"max_read=%u"},
+	{OPT_MAX_PAGES,                 "max_pages=%u"},
 	{OPT_BLKSIZE,			"blksize=%u"},
 	{OPT_ERR,			NULL}
 };
@@ -546,6 +550,12 @@ static int parse_fuse_opt(char *opt, struct fuse_mount_data *d, int is_bdev,
 			d->max_read = value;
 			break;
 
+		case OPT_MAX_PAGES:
+			if (match_int(&args[0], &value))
+				return 0;
+			d->max_pages = value;
+			break;
+
 		case OPT_BLKSIZE:
 			if (!is_bdev || match_int(&args[0], &value))
 				return 0;
@@ -577,6 +587,8 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
 		seq_puts(m, ",allow_other");
 	if (fc->max_read != ~0)
 		seq_printf(m, ",max_read=%u", fc->max_read);
+	if (fc->max_pages != FUSE_DEFAULT_MAX_PAGES_PER_REQ)
+		seq_printf(m, ",max_pages=%u", fc->max_pages);
 	if (sb->s_bdev && sb->s_blocksize != FUSE_DEFAULT_BLKSIZE)
 		seq_printf(m, ",blksize=%lu", sb->s_blocksize);
 	return 0;
@@ -1141,6 +1153,8 @@ static int fuse_fill_super(struct super_block *sb, void *data, int silent)
 	fc->user_id = d.user_id;
 	fc->group_id = d.group_id;
 	fc->max_read = max_t(unsigned, 4096, d.max_read);
+	fc->max_pages = clamp_val(d.max_pages, FUSE_DEFAULT_MAX_PAGES_PER_REQ,
+				  pipe_max_size >> PAGE_SHIFT);
 
 	/* Used by get_root_inode() */
 	sb->s_fs_info = fc;
diff --git a/fs/pipe.c b/fs/pipe.c
index bb0840e234f3..4990d92b0849 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -34,6 +34,7 @@
  * be set by root in /proc/sys/fs/pipe-max-size
  */
 unsigned int pipe_max_size = 1048576;
+EXPORT_SYMBOL(pipe_max_size);
 
 /* Maximum allocatable pages per user. Hard limit is unset by default, soft
  * matches default values.
-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v4] fuse: add max_pages option
  2018-09-06 12:37 [PATCH v4] fuse: add max_pages option Constantine Shulyupin
@ 2018-09-26  8:54 ` Constantine Shulyupin
  2018-09-26  9:19   ` Miklos Szeredi
  2018-10-01  9:16   ` Miklos Szeredi
  0 siblings, 2 replies; 8+ messages in thread
From: Constantine Shulyupin @ 2018-09-26  8:54 UTC (permalink / raw)
  To: Miklos Szeredi, open list:FUSE: FILESYSTEM IN USERSPACE,
	Jonathan Corbet, 刘硕然,
	mitsuo.hayasaka.hu
  Cc: Amir Goldstein

Hi Miklos,

It has been 20 days since the previous message, so would you like to
review the patch?
I've added reproduction and measurement details per your request.
Also, this patch can help resolve the BDI_CAP_STRICTLIMIT performance issue
reported recently by liushuoran@jd.com.

Thanks

On Thu, Sep 6, 2018 at 3:37 PM Constantine Shulyupin
<const@makelinux.com> wrote:
>
> Replace FUSE_MAX_PAGES_PER_REQ with the configurable
> mount parameter max_pages to improve performance.

Link to message: https://marc.info/?l=linux-fsdevel&m=153623932516084&w=2


* Re: [PATCH v4] fuse: add max_pages option
  2018-09-26  8:54 ` Constantine Shulyupin
@ 2018-09-26  9:19   ` Miklos Szeredi
  2018-09-26  9:39     ` Reply: " 刘硕然
  2018-10-01  9:16   ` Miklos Szeredi
  1 sibling, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2018-09-26  9:19 UTC (permalink / raw)
  To: Constantine Shulyupin
  Cc: open list:FUSE: FILESYSTEM IN USERSPACE, Jonathan Corbet,
	刘硕然,
	mitsuo.hayasaka.hu, Amir Goldstein

On Wed, Sep 26, 2018 at 10:54 AM, Constantine Shulyupin
<const@makelinux.com> wrote:
> Hi Miklos,
>
> Passed 20 days since the previous messages, so would you like to
> review the patch?

Yes.

> I've added reproduction and measurement details up your request.
> Also this patch can help resolve BDI_CAP_STRICTLIMIT performance issue
> reported recently by liushuoran@jd.com.

Do you have feedback from <liushuoran@jd.com> that this patch helps?

Thanks,
Miklos


* Reply: [PATCH v4] fuse: add max_pages option
  2018-09-26  9:19   ` Miklos Szeredi
@ 2018-09-26  9:39     ` 刘硕然
  2018-09-26 10:06       ` Constantine Shulyupin
  0 siblings, 1 reply; 8+ messages in thread
From: 刘硕然 @ 2018-09-26  9:39 UTC (permalink / raw)
  To: Miklos Szeredi, Constantine Shulyupin
  Cc: open list:FUSE: FILESYSTEM IN USERSPACE, Jonathan Corbet,
	mitsuo.hayasaka.hu, Amir Goldstein

Hi Constantine,

I haven't tested the patch yet. But after reviewing it, I don't see anything related to BDI_CAP_STRICTLIMIT. Would you please explain a little more? Thanks.

PS: I did try increasing FUSE_MAX_PAGES_PER_REQ, but it did not seem to help in my scenario (writeback cache enabled, 4K writes, total write size not very large).

Shuoran Liu


* Re: [PATCH v4] fuse: add max_pages option
  2018-09-26  9:39     ` Reply: " 刘硕然
@ 2018-09-26 10:06       ` Constantine Shulyupin
  2018-09-26 10:39         ` Reply: " 刘硕然
  0 siblings, 1 reply; 8+ messages in thread
From: Constantine Shulyupin @ 2018-09-26 10:06 UTC (permalink / raw)
  To: 刘硕然
  Cc: Miklos Szeredi, open list:FUSE: FILESYSTEM IN USERSPACE,
	Jonathan Corbet, mitsuo.hayasaka.hu, Amir Goldstein

Hi Shuoran,

On Wed, Sep 26, 2018 at 12:40 PM 刘硕然 <liushuoran@jd.com> wrote:
> I haven't tested the patch yet. But after reviewing the patch, I don't see anything related to BDI_CAP_STRICTLIMIT. So would you please explain a little bit more? Thanks.

A bigger request size reduces the total number of requests and thus
the overhead of per-request operations.

> PS: I did try increasing FUSE_MAX_PAGES_PER_REQ, but it seemed not helping in my scenario(writeback cache enabled, 4K writes, total write size is not very large).

To make use of a larger FUSE_MAX_PAGES_PER_REQ in the kernel,
KERNEL_BUF_PAGES must be increased in libfuse too. I have a libfuse patch
with a configurable max_pages.

But 4K writes will not benefit from a larger request size, which
is now 128K and would be 1M with the proposed change.

Thanks


* Reply: [PATCH v4] fuse: add max_pages option
  2018-09-26 10:06       ` Constantine Shulyupin
@ 2018-09-26 10:39         ` 刘硕然
  2018-10-01  9:18           ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: 刘硕然 @ 2018-09-26 10:39 UTC (permalink / raw)
  To: Constantine Shulyupin
  Cc: Miklos Szeredi, open list:FUSE: FILESYSTEM IN USERSPACE,
	Jonathan Corbet, mitsuo.hayasaka.hu, Amir Goldstein



> -----Original Message-----
> From: Constantine Shulyupin [mailto:const@makelinux.com]
> Sent: September 26, 2018 18:07
> To: 刘硕然 <liushuoran@jd.com>
> Cc: Miklos Szeredi <miklos@szeredi.hu>; open list:FUSE: FILESYSTEM IN
> USERSPACE <linux-fsdevel@vger.kernel.org>; Jonathan Corbet
> <corbet@lwn.net>; mitsuo.hayasaka.hu@hitachi.com; Amir Goldstein
> <amir73il@gmail.com>
> Subject: Re: [PATCH v4] fuse: add max_pages option
> 
> Hi Shuoran,
> 
> On Wed, Sep 26, 2018 at 12:40 PM 刘硕然 <liushuoran@jd.com> wrote:
> > I haven't tested the patch yet. But after reviewing the patch, I don't see
> anything related to BDI_CAP_STRICTLIMIT. So would you please explain a
> little bit more? Thanks.
> 
> Bigger size of request reduces total number of requests and reduces
> overhead of per request operations.
> 
> > PS: I did try increasing FUSE_MAX_PAGES_PER_REQ, but it seemed not
> helping in my scenario(writeback cache enabled, 4K writes, total write size is
> not very large).
> 
> To utilize FUSE_MAX_PAGES_PER_REQ in kernel it is must to increase
> KERNEL_BUF_PAGES it libfuse too. I have libfuse patch with configurable
> max_pages.
> 
> But 4K writes will not benefit from increase of request size, which now is
> 128K and proposed size is 1M.
>

That is true. My original purpose in using the writeback cache was to improve the performance of small writes, because for big writes the FUSE kernel module sends requests to libfuse anyway, with or without the writeback cache.

The BDI_CAP_STRICTLIMIT issue happens when the writeback cache is enabled: balance_dirty_pages() is triggered on every small write, even if no request is sent to libfuse, which slows things down.

So IMHO, increasing FUSE_MAX_PAGES_PER_REQ would certainly help for big writes, but might not be a solution to the BDI_CAP_STRICTLIMIT issue. Please feel free to correct me if I have misunderstood anything. Thanks in advance.

> Thanks


* Re: [PATCH v4] fuse: add max_pages option
  2018-09-26  8:54 ` Constantine Shulyupin
  2018-09-26  9:19   ` Miklos Szeredi
@ 2018-10-01  9:16   ` Miklos Szeredi
  1 sibling, 0 replies; 8+ messages in thread
From: Miklos Szeredi @ 2018-10-01  9:16 UTC (permalink / raw)
  To: Constantine Shulyupin
  Cc: open list:FUSE: FILESYSTEM IN USERSPACE, Jonathan Corbet,
	刘硕然,
	mitsuo.hayasaka.hu, Amir Goldstein

On Wed, Sep 26, 2018 at 10:54 AM, Constantine Shulyupin
<const@makelinux.com> wrote:
> Hi Miklos,
>
> Passed 20 days since the previous messages, so would you like to
> review the patch?

Updated patch pushed to

   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git#for-next

The mount option was converted to negotiation via INIT request.

Please test.

Thanks,
Miklos


* Re: Reply: [PATCH v4] fuse: add max_pages option
  2018-09-26 10:39         ` Reply: " 刘硕然
@ 2018-10-01  9:18           ` Miklos Szeredi
  0 siblings, 0 replies; 8+ messages in thread
From: Miklos Szeredi @ 2018-10-01  9:18 UTC (permalink / raw)
  To: 刘硕然
  Cc: Constantine Shulyupin, open list:FUSE: FILESYSTEM IN USERSPACE,
	Jonathan Corbet, mitsuo.hayasaka.hu, Amir Goldstein

On Wed, Sep 26, 2018 at 12:39 PM, 刘硕然 <liushuoran@jd.com> wrote:
>> -----Original Message-----
>> From: Constantine Shulyupin [mailto:const@makelinux.com]
>> Sent: September 26, 2018 18:07
>> To: 刘硕然 <liushuoran@jd.com>
>> Cc: Miklos Szeredi <miklos@szeredi.hu>; open list:FUSE: FILESYSTEM IN
>> USERSPACE <linux-fsdevel@vger.kernel.org>; Jonathan Corbet
>> <corbet@lwn.net>; mitsuo.hayasaka.hu@hitachi.com; Amir Goldstein
>> <amir73il@gmail.com>
>> Subject: Re: [PATCH v4] fuse: add max_pages option
>>
>> Hi Shuoran,
>>
>> On Wed, Sep 26, 2018 at 12:40 PM 刘硕然 <liushuoran@jd.com> wrote:
>> > I haven't tested the patch yet. But after reviewing the patch, I don't see
>> anything related to BDI_CAP_STRICTLIMIT. So would you please explain a
>> little bit more? Thanks.
>>
>> Bigger size of request reduces total number of requests and reduces
>> overhead of per request operations.
>>
>> > PS: I did try increasing FUSE_MAX_PAGES_PER_REQ, but it seemed not
>> helping in my scenario(writeback cache enabled, 4K writes, total write size is
>> not very large).
>>
>> To utilize FUSE_MAX_PAGES_PER_REQ in kernel it is must to increase
>> KERNEL_BUF_PAGES it libfuse too. I have libfuse patch with configurable
>> max_pages.
>>
>> But 4K writes will not benefit from increase of request size, which now is
>> 128K and proposed size is 1M.
>>
>
> This is true. So my original purpose of using writeback cache was trying to improve the performance of small writes. Because for big writes, fuse kernel would send requests to libfuse anyway, with or without writeback cache.
>
> The BDI_CAP_STRICTLIMIT issue happens when writeback cache is enabled, since balance_dirty_pages() is triggered in every small writes even if no request is sent to libfuse, which slows things down.
>
> So IMHO, increasing FUSE_MAX_PAGES_PER_REQ would certainly help for big writes, but might not be a solution to BDI_CAP_STRICTLIMIT issue. Please feel free to correct me if I misunderstand anything. Thanks in advance.


Makes sense.  Can you do a kernel profile to see where most of the
time is actually spent in your workload?

Thanks,
Miklos

Thread overview: 8+ messages
2018-09-06 12:37 [PATCH v4] fuse: add max_pages option Constantine Shulyupin
2018-09-26  8:54 ` Constantine Shulyupin
2018-09-26  9:19   ` Miklos Szeredi
2018-09-26  9:39     ` Reply: " 刘硕然
2018-09-26 10:06       ` Constantine Shulyupin
2018-09-26 10:39         ` Reply: " 刘硕然
2018-10-01  9:18           ` Miklos Szeredi
2018-10-01  9:16   ` Miklos Szeredi
