* [PATCH RESEND 0/3] Improvements to page writeback commit policy
@ 2017-06-20 23:35 Trond Myklebust
  2017-06-20 23:35 ` [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures Trond Myklebust
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Trond Myklebust @ 2017-06-20 23:35 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

The following patches are intended to smooth out the page writeback
performance by ensuring that we commit the data earlier on the server.

We assume that if something is starting writeback on the pages, then
that process wants to commit the data as soon as possible, whether it
is an application or just the background flush process.
We also assume that for streaming type processes, we don't want to pause
the I/O in order to commit, so we don't want to rely on a counter of
in-flight I/O to the entire inode going to zero.

We therefore set up a monitor that counts the number of in-flight
writes for each call to nfs_writepages(). Once all the writes for that
call to nfs_writepages() have completed, we send the commit. Note that this
mirrors the behaviour for O_DIRECT writes, where we similarly track the
in-flight writes on a per-call basis.
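
For readers who want the shape of the mechanism before reading the diffs,
here is a small userspace model of the idea. It is an illustration only:
the struct and function names are invented for the sketch, and a plain
integer stands in for the kref used by the real nfs_io_completion object
introduced in patch 2/3.

#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the per-nfs_writepages() completion object. */
struct io_completion {
	void (*complete)(void *data);	/* callback run on the last put */
	void *data;
	int refcount;			/* kref in the real code */
};

static struct io_completion *ioc_alloc(void (*complete)(void *), void *data)
{
	struct io_completion *ioc = malloc(sizeof(*ioc));

	if (ioc == NULL)
		abort();
	ioc->complete = complete;
	ioc->data = data;
	ioc->refcount = 1;		/* reference held by the flush itself */
	return ioc;
}

static void ioc_get(struct io_completion *ioc)
{
	ioc->refcount++;
}

static void ioc_put(struct io_completion *ioc)
{
	if (--ioc->refcount == 0) {
		ioc->complete(ioc->data);	/* last in-flight WRITE is done */
		free(ioc);
	}
}

static void send_commit(void *data)
{
	printf("COMMIT sent for %s\n", (const char *)data);
}

int main(void)
{
	struct io_completion *ioc = ioc_alloc(send_commit, "inode 42");
	int i;

	for (i = 0; i < 3; i++)
		ioc_get(ioc);	/* one reference per WRITE issued by this flush */
	ioc_put(ioc);		/* the flush drops its own reference... */
	for (i = 0; i < 3; i++)
		ioc_put(ioc);	/* ...and the last completing WRITE triggers the COMMIT */
	return 0;
}

In the patches below, the counter is a struct kref inside struct
nfs_io_completion, references are taken in the ->init_hdr() callback as
each WRITE is set up, dropped in nfs_write_completion(), and the final
put runs nfs_commit_inode().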

Trond Myklebust (3):
  NFS: Remove unused fields in the page I/O structures
  NFS: Ensure we commit after writeback is complete
  NFS: Fix commit policy for non-blocking calls to nfs_write_inode()

 fs/nfs/pagelist.c        |  5 ++--
 fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/nfs_page.h |  2 +-
 include/linux/nfs_xdr.h  |  3 ++-
 4 files changed, 64 insertions(+), 5 deletions(-)

-- 
2.9.4


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures
  2017-06-20 23:35 [PATCH RESEND 0/3] Improvements to page writeback commit policy Trond Myklebust
@ 2017-06-20 23:35 ` Trond Myklebust
  2017-06-20 23:35   ` [PATCH RESEND 2/3] NFS: Ensure we commit after writeback is complete Trond Myklebust
  2017-06-21 14:31 ` [PATCH RESEND 0/3] Improvements to page writeback commit policy Chuck Lever
  2017-06-23 14:56 ` J. Bruce Fields
  2 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2017-06-20 23:35 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

Remove the 'layout_private' fields that were only used by the pNFS OSD
layout driver.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfs/pagelist.c        | 2 --
 include/linux/nfs_page.h | 1 -
 include/linux/nfs_xdr.h  | 1 -
 3 files changed, 4 deletions(-)

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index ad92b401326c..de107d866297 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -51,7 +51,6 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
 	hdr->io_start = req_offset(hdr->req);
 	hdr->good_bytes = mirror->pg_count;
 	hdr->dreq = desc->pg_dreq;
-	hdr->layout_private = desc->pg_layout_private;
 	hdr->release = release;
 	hdr->completion_ops = desc->pg_completion_ops;
 	if (hdr->completion_ops->init_hdr)
@@ -711,7 +710,6 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
 	desc->pg_error = 0;
 	desc->pg_lseg = NULL;
 	desc->pg_dreq = NULL;
-	desc->pg_layout_private = NULL;
 	desc->pg_bsize = bsize;
 
 	desc->pg_mirror_count = 1;
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 247cc3d3498f..6138cf91346b 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -94,7 +94,6 @@ struct nfs_pageio_descriptor {
 	const struct nfs_pgio_completion_ops *pg_completion_ops;
 	struct pnfs_layout_segment *pg_lseg;
 	struct nfs_direct_req	*pg_dreq;
-	void			*pg_layout_private;
 	unsigned int		pg_bsize;	/* default bsize for mirrors */
 
 	u32			pg_mirror_count;
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index b28c83475ee8..ee998ed08cf6 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1436,7 +1436,6 @@ struct nfs_pgio_header {
 	const struct nfs_pgio_completion_ops *completion_ops;
 	const struct nfs_rw_ops	*rw_ops;
 	struct nfs_direct_req	*dreq;
-	void			*layout_private;
 	spinlock_t		lock;
 	/* fields protected by lock */
 	int			pnfs_error;
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH RESEND 2/3] NFS: Ensure we commit after writeback is complete
  2017-06-20 23:35 ` [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures Trond Myklebust
@ 2017-06-20 23:35   ` Trond Myklebust
  2017-06-20 23:35     ` [PATCH RESEND 3/3] NFS: Fix commit policy for non-blocking calls to nfs_write_inode() Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2017-06-20 23:35 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

If the page cache is being flushed, then we want to ensure that we
do start a commit once the pages are done being flushed.
If we just wait until all I/O is done to that file, we can end up
livelocking until the balance_dirty_pages() mechanism puts its
foot down and forces I/O to stop.
So instead we do more or less the same thing that O_DIRECT does,
and set up a counter to tell us when the flush is done.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfs/pagelist.c        |  3 +++
 fs/nfs/write.c           | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/nfs_page.h |  1 +
 include/linux/nfs_xdr.h  |  2 ++
 4 files changed, 63 insertions(+)

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index de107d866297..83d2918f1b13 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -50,6 +50,7 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
 	hdr->cred = hdr->req->wb_context->cred;
 	hdr->io_start = req_offset(hdr->req);
 	hdr->good_bytes = mirror->pg_count;
+	hdr->io_completion = desc->pg_io_completion;
 	hdr->dreq = desc->pg_dreq;
 	hdr->release = release;
 	hdr->completion_ops = desc->pg_completion_ops;
@@ -709,6 +710,7 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
 	desc->pg_ioflags = io_flags;
 	desc->pg_error = 0;
 	desc->pg_lseg = NULL;
+	desc->pg_io_completion = NULL;
 	desc->pg_dreq = NULL;
 	desc->pg_bsize = bsize;
 
@@ -1231,6 +1233,7 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
 {
 	LIST_HEAD(failed);
 
+	desc->pg_io_completion = hdr->io_completion;
 	desc->pg_dreq = hdr->dreq;
 	while (!list_empty(&hdr->pages)) {
 		struct nfs_page *req = nfs_list_entry(hdr->pages.next);
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index db7ba542559e..051197cb9195 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -40,6 +40,12 @@
 #define MIN_POOL_WRITE		(32)
 #define MIN_POOL_COMMIT		(4)
 
+struct nfs_io_completion {
+	void (*complete)(void *data);
+	void *data;
+	struct kref refcount;
+};
+
 /*
  * Local function declarations
  */
@@ -108,6 +114,39 @@ static void nfs_writehdr_free(struct nfs_pgio_header *hdr)
 	mempool_free(hdr, nfs_wdata_mempool);
 }
 
+static struct nfs_io_completion *nfs_io_completion_alloc(gfp_t gfp_flags)
+{
+	return kmalloc(sizeof(struct nfs_io_completion), gfp_flags);
+}
+
+static void nfs_io_completion_init(struct nfs_io_completion *ioc,
+		void (*complete)(void *), void *data)
+{
+	ioc->complete = complete;
+	ioc->data = data;
+	kref_init(&ioc->refcount);
+}
+
+static void nfs_io_completion_release(struct kref *kref)
+{
+	struct nfs_io_completion *ioc = container_of(kref,
+			struct nfs_io_completion, refcount);
+	ioc->complete(ioc->data);
+	kfree(ioc);
+}
+
+static void nfs_io_completion_get(struct nfs_io_completion *ioc)
+{
+	if (ioc != NULL)
+		kref_get(&ioc->refcount);
+}
+
+static void nfs_io_completion_put(struct nfs_io_completion *ioc)
+{
+	if (ioc != NULL)
+		kref_put(&ioc->refcount, nfs_io_completion_release);
+}
+
 static void nfs_context_set_write_error(struct nfs_open_context *ctx, int error)
 {
 	ctx->error = error;
@@ -681,18 +720,29 @@ static int nfs_writepages_callback(struct page *page, struct writeback_control *
 	return ret;
 }
 
+static void nfs_io_completion_commit(void *inode)
+{
+	nfs_commit_inode(inode, 0);
+}
+
 int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
 {
 	struct inode *inode = mapping->host;
 	struct nfs_pageio_descriptor pgio;
+	struct nfs_io_completion *ioc = nfs_io_completion_alloc(GFP_NOFS);
 	int err;
 
 	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGES);
 
+	if (ioc)
+		nfs_io_completion_init(ioc, nfs_io_completion_commit, inode);
+
 	nfs_pageio_init_write(&pgio, inode, wb_priority(wbc), false,
 				&nfs_async_write_completion_ops);
+	pgio.pg_io_completion = ioc;
 	err = write_cache_pages(mapping, wbc, nfs_writepages_callback, &pgio);
 	nfs_pageio_complete(&pgio);
+	nfs_io_completion_put(ioc);
 
 	if (err < 0)
 		goto out_err;
@@ -940,6 +990,11 @@ int nfs_write_need_commit(struct nfs_pgio_header *hdr)
 	return hdr->verf.committed != NFS_FILE_SYNC;
 }
 
+static void nfs_async_write_init(struct nfs_pgio_header *hdr)
+{
+	nfs_io_completion_get(hdr->io_completion);
+}
+
 static void nfs_write_completion(struct nfs_pgio_header *hdr)
 {
 	struct nfs_commit_info cinfo;
@@ -973,6 +1028,7 @@ static void nfs_write_completion(struct nfs_pgio_header *hdr)
 		nfs_release_request(req);
 	}
 out:
+	nfs_io_completion_put(hdr->io_completion);
 	hdr->release(hdr);
 }
 
@@ -1378,6 +1434,7 @@ static void nfs_async_write_reschedule_io(struct nfs_pgio_header *hdr)
 }
 
 static const struct nfs_pgio_completion_ops nfs_async_write_completion_ops = {
+	.init_hdr = nfs_async_write_init,
 	.error_cleanup = nfs_async_write_error,
 	.completion = nfs_write_completion,
 	.reschedule_io = nfs_async_write_reschedule_io,
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 6138cf91346b..abbee2d15dce 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -93,6 +93,7 @@ struct nfs_pageio_descriptor {
 	const struct rpc_call_ops *pg_rpc_callops;
 	const struct nfs_pgio_completion_ops *pg_completion_ops;
 	struct pnfs_layout_segment *pg_lseg;
+	struct nfs_io_completion *pg_io_completion;
 	struct nfs_direct_req	*pg_dreq;
 	unsigned int		pg_bsize;	/* default bsize for mirrors */
 
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ee998ed08cf6..ab6a8381459f 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1422,6 +1422,7 @@ enum {
 	NFS_IOHDR_STAT,
 };
 
+struct nfs_io_completion;
 struct nfs_pgio_header {
 	struct inode		*inode;
 	struct rpc_cred		*cred;
@@ -1435,6 +1436,7 @@ struct nfs_pgio_header {
 	void (*release) (struct nfs_pgio_header *hdr);
 	const struct nfs_pgio_completion_ops *completion_ops;
 	const struct nfs_rw_ops	*rw_ops;
+	struct nfs_io_completion *io_completion;
 	struct nfs_direct_req	*dreq;
 	spinlock_t		lock;
 	/* fields protected by lock */
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH RESEND 3/3] NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
  2017-06-20 23:35   ` [PATCH RESEND 2/3] NFS: Ensure we commit after writeback is complete Trond Myklebust
@ 2017-06-20 23:35     ` Trond Myklebust
  2017-06-20 23:35       ` [PATCH] SUNRPC: Make slot allocation more reliable Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2017-06-20 23:35 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

Now that the writes will schedule a commit on their own, we don't
need nfs_write_inode() to schedule one when there are outstanding
writes and we're being called in non-blocking mode.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 fs/nfs/write.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 051197cb9195..b1af5dee5e0a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -1941,7 +1941,7 @@ int nfs_write_inode(struct inode *inode, struct writeback_control *wbc)
 		/* Don't commit yet if this is a non-blocking flush and there
 		 * are a lot of outstanding writes for this mapping.
 		 */
-		if (nfsi->commit_info.ncommit <= (nfsi->nrequests >> 1))
+		if (mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK))
 			goto out_mark_dirty;
 
 		/* don't wait for the COMMIT response */
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH] SUNRPC: Make slot allocation more reliable
  2017-06-20 23:35     ` [PATCH RESEND 3/3] NFS: Fix commit policy for non-blocking calls to nfs_write_inode() Trond Myklebust
@ 2017-06-20 23:35       ` Trond Myklebust
  0 siblings, 0 replies; 12+ messages in thread
From: Trond Myklebust @ 2017-06-20 23:35 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-nfs

In xprt_alloc_slot(), the spin lock is only needed to provide atomicity
between the atomic_add_unless() failure and the call to xprt_add_backlog().
We do not actually need to hold it across the memory allocation itself.

By dropping the lock, we can use a more resilient GFP_NOFS allocation,
just as we now do in the rest of the RPC client code.
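
To make the locking rule concrete, here is a standalone userspace model
of the pattern being relied on. It is an illustration only, not the
SUNRPC code: a pthread mutex stands in for xprt->reserve_lock and
calloc() for the GFP_NOFS kzalloc().

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_mutex_t reserve_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called with reserve_lock held; returns with it held again.
 * The allocation itself may sleep, so the lock is dropped around it;
 * the caller only needs the lock when it decides what to do with the
 * result (e.g. queue the task on the backlog on failure). */
static void *dynamic_alloc_slot(size_t size)
{
	void *req;

	pthread_mutex_unlock(&reserve_lock);
	req = calloc(1, size);		/* may block; no lock needed here */
	pthread_mutex_lock(&reserve_lock);
	return req;
}

int main(void)
{
	void *slot;

	pthread_mutex_lock(&reserve_lock);
	slot = dynamic_alloc_slot(128);
	if (slot == NULL)
		puts("would add the task to the backlog here, still under the lock");
	pthread_mutex_unlock(&reserve_lock);
	free(slot);
	return 0;
}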

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 net/sunrpc/xprt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 3e63c5e97ebe..4654a9934269 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1047,13 +1047,15 @@ static bool xprt_throttle_congested(struct rpc_xprt *xprt, struct rpc_task *task
 	return ret;
 }
 
-static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt, gfp_t gfp_flags)
+static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt)
 {
 	struct rpc_rqst *req = ERR_PTR(-EAGAIN);
 
 	if (!atomic_add_unless(&xprt->num_reqs, 1, xprt->max_reqs))
 		goto out;
-	req = kzalloc(sizeof(struct rpc_rqst), gfp_flags);
+	spin_unlock(&xprt->reserve_lock);
+	req = kzalloc(sizeof(struct rpc_rqst), GFP_NOFS);
+	spin_lock(&xprt->reserve_lock);
 	if (req != NULL)
 		goto out;
 	atomic_dec(&xprt->num_reqs);
@@ -1081,7 +1083,7 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
 		list_del(&req->rq_list);
 		goto out_init_req;
 	}
-	req = xprt_dynamic_alloc_slot(xprt, GFP_NOWAIT|__GFP_NOWARN);
+	req = xprt_dynamic_alloc_slot(xprt);
 	if (!IS_ERR(req))
 		goto out_init_req;
 	switch (PTR_ERR(req)) {
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-20 23:35 [PATCH RESEND 0/3] Improvements to page writeback commit policy Trond Myklebust
  2017-06-20 23:35 ` [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures Trond Myklebust
@ 2017-06-21 14:31 ` Chuck Lever
  2017-06-23 20:48   ` Chuck Lever
  2017-06-23 14:56 ` J. Bruce Fields
  2 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever @ 2017-06-21 14:31 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List


> On Jun 20, 2017, at 7:35 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
> 
> The following patches are intended to smooth out the page writeback
> performance by ensuring that we commit the data earlier on the server.
> 
> We assume that if something is starting writeback on the pages, then
> that process wants to commit the data as soon as possible, whether it
> is an application or just the background flush process.
> We also assume that for streaming type processes, we don't want to pause
> the I/O in order to commit, so we don't want to rely on a counter of
> in-flight I/O to the entire inode going to zero.
> 
> We therefore set up a monitor that counts the number of in-flight
> writes for each call to nfs_writepages(). Once all the writes to that
> call to nfs_writepages has completed, we send the commit. Note that this
> mirrors the behaviour for O_DIRECT writes, where we similarly track the
> in-flight writes on a per-call basis.

These are the same as the patches you sent May 16th?
I am trying to get a little time to try them out.


> Trond Myklebust (3):
>  NFS: Remove unused fields in the page I/O structures
>  NFS: Ensure we commit after writeback is complete
>  NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
> 
> fs/nfs/pagelist.c        |  5 ++--
> fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
> include/linux/nfs_page.h |  2 +-
> include/linux/nfs_xdr.h  |  3 ++-
> 4 files changed, 64 insertions(+), 5 deletions(-)
> 
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-20 23:35 [PATCH RESEND 0/3] Improvements to page writeback commit policy Trond Myklebust
  2017-06-20 23:35 ` [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures Trond Myklebust
  2017-06-21 14:31 ` [PATCH RESEND 0/3] Improvements to page writeback commit policy Chuck Lever
@ 2017-06-23 14:56 ` J. Bruce Fields
  2017-06-23 15:25   ` Trond Myklebust
  2 siblings, 1 reply; 12+ messages in thread
From: J. Bruce Fields @ 2017-06-23 14:56 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna Schumaker, linux-nfs

On Tue, Jun 20, 2017 at 07:35:35PM -0400, Trond Myklebust wrote:
> The following patches are intended to smooth out the page writeback
> performance by ensuring that we commit the data earlier on the server.

Was there some particular benchmark or hardware setup that motivated
doing this now?

--b.

> 
> We assume that if something is starting writeback on the pages, then
> that process wants to commit the data as soon as possible, whether it
> is an application or just the background flush process.
> We also assume that for streaming type processes, we don't want to pause
> the I/O in order to commit, so we don't want to rely on a counter of
> in-flight I/O to the entire inode going to zero.
> 
> We therefore set up a monitor that counts the number of in-flight
> writes for each call to nfs_writepages(). Once all the writes to that
> call to nfs_writepages has completed, we send the commit. Note that this
> mirrors the behaviour for O_DIRECT writes, where we similarly track the
> in-flight writes on a per-call basis.
> 
> Trond Myklebust (3):
>   NFS: Remove unused fields in the page I/O structures
>   NFS: Ensure we commit after writeback is complete
>   NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
> 
>  fs/nfs/pagelist.c        |  5 ++--
>  fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/nfs_page.h |  2 +-
>  include/linux/nfs_xdr.h  |  3 ++-
>  4 files changed, 64 insertions(+), 5 deletions(-)
> 
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-23 14:56 ` J. Bruce Fields
@ 2017-06-23 15:25   ` Trond Myklebust
  2017-06-23 15:29     ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2017-06-23 15:25 UTC (permalink / raw)
  To: bfields; +Cc: anna.schumaker, linux-nfs

On Fri, 2017-06-23 at 10:56 -0400, J. Bruce Fields wrote:
> On Tue, Jun 20, 2017 at 07:35:35PM -0400, Trond Myklebust wrote:
> > The following patches are intended to smooth out the page writeback
> > performance by ensuring that we commit the data earlier on the
> > server.
> 
> Was there some particular benchmark or hardware setup that motivated
> doing this now?

I finally had enough free time to study the problem of why my backups
from my laptop to my NAS backup device took so long to complete, and
why the 'rsync' bandwidth display is so inconsistent.

> 
> --b.
> 
> > 
> > We assume that if something is starting writeback on the pages, then
> > that process wants to commit the data as soon as possible, whether it
> > is an application or just the background flush process.
> > We also assume that for streaming type processes, we don't want to
> > pause the I/O in order to commit, so we don't want to rely on a
> > counter of in-flight I/O to the entire inode going to zero.
> > 
> > We therefore set up a monitor that counts the number of in-flight
> > writes for each call to nfs_writepages(). Once all the writes to that
> > call to nfs_writepages has completed, we send the commit. Note that
> > this mirrors the behaviour for O_DIRECT writes, where we similarly
> > track the in-flight writes on a per-call basis.
> > 
> > Trond Myklebust (3):
> >   NFS: Remove unused fields in the page I/O structures
> >   NFS: Ensure we commit after writeback is complete
> >   NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
> > 
> >  fs/nfs/pagelist.c        |  5 ++--
> >  fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
> >  include/linux/nfs_page.h |  2 +-
> >  include/linux/nfs_xdr.h  |  3 ++-
> >  4 files changed, 64 insertions(+), 5 deletions(-)
> > 
> > -- 
> > 2.9.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-23 15:25   ` Trond Myklebust
@ 2017-06-23 15:29     ` Trond Myklebust
  0 siblings, 0 replies; 12+ messages in thread
From: Trond Myklebust @ 2017-06-23 15:29 UTC (permalink / raw)
  To: bfields; +Cc: anna.schumaker, linux-nfs

On Fri, 2017-06-23 at 11:25 -0400, Trond Myklebust wrote:
> On Fri, 2017-06-23 at 10:56 -0400, J. Bruce Fields wrote:
> > On Tue, Jun 20, 2017 at 07:35:35PM -0400, Trond Myklebust wrote:
> > > The following patches are intended to smooth out the page writeback
> > > performance by ensuring that we commit the data earlier on the
> > > server.
> > 
> > Was there some particular benchmark or hardware setup that motivated
> > doing this now?
> 
> I finally had enough free time to study the problem of why my backups
> from my laptop to my NAS backup device took so long to complete, and
> why the 'rsync' bandwidth display is so inconsistent.

The NAS device in question is a no-name brand NFSv3 box, BTW. No fancy
pNFS or even SSDs.

> 
> > 
> > --b.
> > 
> > > 
> > > We assume that if something is starting writeback on the pages, then
> > > that process wants to commit the data as soon as possible, whether it
> > > is an application or just the background flush process.
> > > We also assume that for streaming type processes, we don't want to
> > > pause the I/O in order to commit, so we don't want to rely on a
> > > counter of in-flight I/O to the entire inode going to zero.
> > > 
> > > We therefore set up a monitor that counts the number of in-flight
> > > writes for each call to nfs_writepages(). Once all the writes to that
> > > call to nfs_writepages has completed, we send the commit. Note that
> > > this mirrors the behaviour for O_DIRECT writes, where we similarly
> > > track the in-flight writes on a per-call basis.
> > > 
> > > Trond Myklebust (3):
> > >   NFS: Remove unused fields in the page I/O structures
> > >   NFS: Ensure we commit after writeback is complete
> > >   NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
> > > 
> > >  fs/nfs/pagelist.c        |  5 ++--
> > >  fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
> > >  include/linux/nfs_page.h |  2 +-
> > >  include/linux/nfs_xdr.h  |  3 ++-
> > >  4 files changed, 64 insertions(+), 5 deletions(-)
> > > 
> > > -- 
> > > 2.9.4
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-21 14:31 ` [PATCH RESEND 0/3] Improvements to page writeback commit policy Chuck Lever
@ 2017-06-23 20:48   ` Chuck Lever
  2017-06-23 21:17     ` Trond Myklebust
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever @ 2017-06-23 20:48 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List


> On Jun 21, 2017, at 10:31 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> 
>> On Jun 20, 2017, at 7:35 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>> 
>> The following patches are intended to smooth out the page writeback
>> performance by ensuring that we commit the data earlier on the server.
>> 
>> We assume that if something is starting writeback on the pages, then
>> that process wants to commit the data as soon as possible, whether it
>> is an application or just the background flush process.
>> We also assume that for streaming type processes, we don't want to pause
>> the I/O in order to commit, so we don't want to rely on a counter of
>> in-flight I/O to the entire inode going to zero.
>> 
>> We therefore set up a monitor that counts the number of in-flight
>> writes for each call to nfs_writepages(). Once all the writes to that
>> call to nfs_writepages has completed, we send the commit. Note that this
>> mirrors the behaviour for O_DIRECT writes, where we similarly track the
>> in-flight writes on a per-call basis.
> 
> These are the same as the patches you sent May 16th?
> I am trying to get a little time to try them out.

After applying these four patches, I ran a series of iozone
benchmarks with buffered and direct I/O. NFSv3 and NFSv4.0
on RDMA. Exports were tmpfs and xfs on NVMe.

I see about a 10% improvement in buffered write throughput,
no degradation elsewhere, and no crashes or other misbehav-
ior.

xfstests passes with the usual few failures.

Buffered write throughput is still limited to 1GBps when
targeting a tmpfs export on a 5.6GBps network. The server
isn't breaking a sweat, but the client appears to be hit-
ting some spin locks pretty hard. This is similar behavior
to before the patches were applied.


>> Trond Myklebust (3):
>> NFS: Remove unused fields in the page I/O structures
>> NFS: Ensure we commit after writeback is complete
>> NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
>> 
>> fs/nfs/pagelist.c        |  5 ++--
>> fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
>> include/linux/nfs_page.h |  2 +-
>> include/linux/nfs_xdr.h  |  3 ++-
>> 4 files changed, 64 insertions(+), 5 deletions(-)
>> 
>> -- 
>> 2.9.4
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-23 20:48   ` Chuck Lever
@ 2017-06-23 21:17     ` Trond Myklebust
  2017-06-23 21:35       ` Chuck Lever
  0 siblings, 1 reply; 12+ messages in thread
From: Trond Myklebust @ 2017-06-23 21:17 UTC (permalink / raw)
  To: Trond Myklebust, chuck.lever; +Cc: anna.schumaker, linux-nfs

On Fri, 2017-06-23 at 16:48 -0400, Chuck Lever wrote:
> > On Jun 21, 2017, at 10:31 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > 
> > > On Jun 20, 2017, at 7:35 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
> > > 
> > > The following patches are intended to smooth out the page writeback
> > > performance by ensuring that we commit the data earlier on the
> > > server.
> > > 
> > > We assume that if something is starting writeback on the pages, then
> > > that process wants to commit the data as soon as possible, whether it
> > > is an application or just the background flush process.
> > > We also assume that for streaming type processes, we don't want to
> > > pause the I/O in order to commit, so we don't want to rely on a
> > > counter of in-flight I/O to the entire inode going to zero.
> > > 
> > > We therefore set up a monitor that counts the number of in-flight
> > > writes for each call to nfs_writepages(). Once all the writes to that
> > > call to nfs_writepages has completed, we send the commit. Note that
> > > this mirrors the behaviour for O_DIRECT writes, where we similarly
> > > track the in-flight writes on a per-call basis.
> > 
> > These are the same as the patches you sent May 16th?
> > I am trying to get a little time to try them out.
> 
> After applying these four patches, I ran a series of iozone
> benchmarks with buffered and direct I/O. NFSv3 and NFSv4.0
> on RDMA. Exports were tmpfs and xfs on NVMe.
> 
> I see about a 10% improvement in buffered write throughput,
> no degradation elsewhere, and no crashes or other misbehav-
> ior.

Cool! Thanks for testing.

> 
> xfstests passes with the usual few failures.
> 
> Buffered write throughput is still limited to 1GBps when
> targeting a tmpfs export on a 5.6GBps network. The server
> isn't breaking a sweat, but the client appears to be hit-
> ting some spin locks pretty hard. This is similar behavior
> to before the patches were applied.

Just out of curiosity, do you see the same behaviour with O_DIRECT
against the tmpfs? There are 2 differences there:
1) no inode_lock(inode) contention.
2) slighly less inode->i_lock spinlock contention.

> 
> > > Trond Myklebust (3):
> > > NFS: Remove unused fields in the page I/O structures
> > > NFS: Ensure we commit after writeback is complete
> > > NFS: Fix commit policy for non-blocking calls to nfs_write_inode()
> > > 
> > > fs/nfs/pagelist.c        |  5 ++--
> > > fs/nfs/write.c           | 59 +++++++++++++++++++++++++++++++++++++++++++++++-
> > > include/linux/nfs_page.h |  2 +-
> > > include/linux/nfs_xdr.h  |  3 ++-
> > > 4 files changed, 64 insertions(+), 5 deletions(-)
> > > 
> > > -- 
> > > 2.9.4
> > > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > --
> > Chuck Lever
> > 
> > 
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RESEND 0/3] Improvements to page writeback commit policy
  2017-06-23 21:17     ` Trond Myklebust
@ 2017-06-23 21:35       ` Chuck Lever
  0 siblings, 0 replies; 12+ messages in thread
From: Chuck Lever @ 2017-06-23 21:35 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Anna Schumaker, Linux NFS Mailing List


> On Jun 23, 2017, at 5:17 PM, Trond Myklebust <trondmy@primarydata.com> wrote:
> 
> On Fri, 2017-06-23 at 16:48 -0400, Chuck Lever wrote:
>>> On Jun 21, 2017, at 10:31 AM, Chuck Lever <chuck.lever@oracle.com>
>>> wrote:
>>> 
>>>> 
>>>> On Jun 20, 2017, at 7:35 PM, Trond Myklebust <trond.myklebust@pri
>>>> marydata.com> wrote:
>>>> 
>>>> The following patches are intended to smooth out the page
>>>> writeback
>>>> performance by ensuring that we commit the data earlier on the
>>>> server.
>>>> 
>>>> We assume that if something is starting writeback on the pages,
>>>> then
>>>> that process wants to commit the data as soon as possible,
>>>> whether it
>>>> is an application or just the background flush process.
>>>> We also assume that for streaming type processes, we don't want
>>>> to pause
>>>> the I/O in order to commit, so we don't want to rely on a counter
>>>> of
>>>> in-flight I/O to the entire inode going to zero.
>>>> 
>>>> We therefore set up a monitor that counts the number of in-flight
>>>> writes for each call to nfs_writepages(). Once all the writes to
>>>> that
>>>> call to nfs_writepages has completed, we send the commit. Note
>>>> that this
>>>> mirrors the behaviour for O_DIRECT writes, where we similarly
>>>> track the
>>>> in-flight writes on a per-call basis.
>>> 
>>> These are the same as the patches you sent May 16th?
>>> I am trying to get a little time to try them out.
>> 
>> After applying these four patches, I ran a series of iozone
>> benchmarks with buffered and direct I/O. NFSv3 and NFSv4.0
>> on RDMA. Exports were tmpfs and xfs on NVMe.
>> 
>> I see about a 10% improvement in buffered write throughput,
>> no degradation elsewhere, and no crashes or other misbehav-
>> ior.
> 
> Cool! Thanks for testing.
> 
>> 
>> xfstests passes with the usual few failures.
>> 
>> Buffered write throughput is still limited to 1GBps when
>> targeting a tmpfs export on a 5.6GBps network. The server
>> isn't breaking a sweat, but the client appears to be hit-
>> ting some spin locks pretty hard. This is similar behavior
>> to before the patches were applied.
> 
> Just out of curiosity, do you see the same behaviour with O_DIRECT
> against the tmpfs?

No.


> There are 2 differences there:
> 1) no inode_lock(inode) contention.
> 2) slighly less inode->i_lock spinlock contention.

Here's buffered I/O, 1MB rsize/wsize:

	Include close in write timing
	Command line used: /home/cel/bin/iozone -i0 -i1 -s4g -y1k -az -c
	Output is in kBytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 kBytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.

              kB  reclen    write  rewrite    read    reread
         4194304       1   534253   570782  1445754  1354491
         4194304       2   734277   853665  2204343  2023764
         4194304       4   960679  1097920  3364254  2935551
         4194304       8   966103  1167984  4105734  3508967
         4194304      16  1035137  1218580  4251939  3626800
         4194304      32  1071914  1263524  4529706  3813485
         4194304      64  1078425  1221345  4631985  3865276
         4194304     128  1088618  1292963  4516240  3755776
         4194304     256  1076105  1236686  4148944  3535090
         4194304     512  1055872  1285594  4236854  3588770
         4194304    1024  1074738  1257684  4248442  3598040
         4194304    2048  1080189  1232026  4283919  3622818
         4194304    4096  1060772  1282839  4268281  3605311
         4194304    8192  1035067  1216913  3409080  2977354
         4194304   16384  1027003  1206250  2671951  2396517  

Here's direct I/O, 1MB rsize/wsize:

	O_DIRECT feature enabled
	Command line used: /home/cel/bin/iozone -i0 -i1 -s128m -y1k -az -I
	Output is in kBytes/sec
	Time Resolution = 0.000001 seconds.
	Processor cache size set to 1024 kBytes.
	Processor cache line size set to 32 bytes.
	File stride size set to 17 * record size.
                                    
              kB  reclen    write  rewrite    read    reread
          131072       1    23010    23523    25882    25831
          131072       2    45174    46255    51357    51406
          131072       4    69723    71039    93943    93880
          131072       8   131892   135438   179759   182036
          131072      16   245077   252067   335448   335486
          131072      32   415335   445705   600465   606896
          131072      64   647643   702595   923036   960093
          131072     128   910638   914057  1291528  1356444
          131072     256  1164078  1164266  1534979  1585828
          131072     512  1088692  1312085  1871873  1856387
          131072    1024  1243072  1363032  1858835  1925179
          131072    2048  1664066  1831074  2538926  2598939
          131072    4096  2205889  2392262  3608012  3686869
          131072    8192  2544002  2310414  4546863  4493238
          131072   16384  2597748  2164045  3629498  5016898


>>>> Trond Myklebust (3):
>>>> NFS: Remove unused fields in the page I/O structures
>>>> NFS: Ensure we commit after writeback is complete
>>>> NFS: Fix commit policy for non-blocking calls to
>>>> nfs_write_inode()
>>>> 
>>>> fs/nfs/pagelist.c        |  5 ++--
>>>> fs/nfs/write.c           | 59
>>>> +++++++++++++++++++++++++++++++++++++++++++++++-
>>>> include/linux/nfs_page.h |  2 +-
>>>> include/linux/nfs_xdr.h  |  3 ++-
>>>> 4 files changed, 64 insertions(+), 5 deletions(-)
>>>> 
>>>> -- 
>>>> 2.9.4
>>>> 
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-
>>>> nfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.htm
>>>> l
>>> 
>>> --
>>> Chuck Lever
>>> 
>>> 
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-
>>> nfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, PrimaryData
> trond.myklebust@primarydata.com

--
Chuck Lever




^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-06-23 21:35 UTC | newest]

Thread overview: 12+ messages
2017-06-20 23:35 [PATCH RESEND 0/3] Improvements to page writeback commit policy Trond Myklebust
2017-06-20 23:35 ` [PATCH RESEND 1/3] NFS: Remove unused fields in the page I/O structures Trond Myklebust
2017-06-20 23:35   ` [PATCH RESEND 2/3] NFS: Ensure we commit after writeback is complete Trond Myklebust
2017-06-20 23:35     ` [PATCH RESEND 3/3] NFS: Fix commit policy for non-blocking calls to nfs_write_inode() Trond Myklebust
2017-06-20 23:35       ` [PATCH] SUNRPC: Make slot allocation more reliable Trond Myklebust
2017-06-21 14:31 ` [PATCH RESEND 0/3] Improvements to page writeback commit policy Chuck Lever
2017-06-23 20:48   ` Chuck Lever
2017-06-23 21:17     ` Trond Myklebust
2017-06-23 21:35       ` Chuck Lever
2017-06-23 14:56 ` J. Bruce Fields
2017-06-23 15:25   ` Trond Myklebust
2017-06-23 15:29     ` Trond Myklebust
