linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4 V2] Avoid quadratic search when freeing delegations
@ 2018-04-30  4:31 NeilBrown
  2018-04-30  4:31 ` [PATCH 3/4] rculist: add list_for_each_entry_from_rcu() NeilBrown
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: NeilBrown @ 2018-04-30  4:31 UTC (permalink / raw)
  To: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker
  Cc: linux-nfs, Lai Jiangshan, Josh Triplett, Steven Rostedt, linux-kernel

Following review from Mathieu (thanks) I've made some revisions
and split this into four patches.  The RCU change is now in patch 3
by itself.
I've also revised the description in the main (final) patch quite
a bit.

Thanks,
NeilBrown


---

NeilBrown (4):
      NFS: slight optimization for walking list for delegations
      NFS: use cond_resched() when restarting walk of delegation list.
      rculist: add list_for_each_entry_from_rcu()
      NFS: Avoid quadratic search when freeing delegations.


 fs/nfs/delegation.c     |   66 ++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/rculist.h |   13 +++++++++
 2 files changed, 72 insertions(+), 7 deletions(-)

--
Signature

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4] NFS: slight optimization for walking list for delegations
  2018-04-30  4:31 [PATCH 0/4 V2] Avoid quadratic search when freeing delegations NeilBrown
  2018-04-30  4:31 ` [PATCH 3/4] rculist: add list_for_each_entry_from_rcu() NeilBrown
@ 2018-04-30  4:31 ` NeilBrown
  2018-04-30 15:17   ` Steven Rostedt
  2018-04-30  4:31 ` [PATCH 2/4] NFS: use cond_resched() when restarting walk of delegation list NeilBrown
  2018-04-30  4:31 ` [PATCH 4/4] NFS: Avoid quadratic search when freeing delegations NeilBrown
  3 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2018-04-30  4:31 UTC (permalink / raw)
  To: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker
  Cc: linux-nfs, Lai Jiangshan, Josh Triplett, Steven Rostedt, linux-kernel

There are 3 places where we walk the list of delegations
for an nfs_client.
In each case there are two nested loops, one for nfs_servers
and one for nfs_delegations.

When we find an interesting delegation we try to get an active
reference to the server.  If that fails, it is pointless to
continue to look at the other delegation for the server as
we will never be able to get an active reference.
So instead of continuing in the inner loop, break out
and continue in the outer loop.
---
 fs/nfs/delegation.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 1819d0d0ba4b..af32365894c8 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -495,7 +495,7 @@ int nfs_client_return_marked_delegations(struct nfs_client *clp)
 			if (!nfs_delegation_need_return(delegation))
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break;
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
@@ -887,7 +887,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp)
 						&delegation->flags) == 0)
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break;
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
@@ -995,7 +995,7 @@ void nfs_reap_expired_delegations(struct nfs_client *clp)
 						&delegation->flags) == 0)
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break;
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/4] NFS: use cond_resched() when restarting walk of delegation list.
  2018-04-30  4:31 [PATCH 0/4 V2] Avoid quadratic search when freeing delegations NeilBrown
  2018-04-30  4:31 ` [PATCH 3/4] rculist: add list_for_each_entry_from_rcu() NeilBrown
  2018-04-30  4:31 ` [PATCH 1/4] NFS: slight optimization for walking list for delegations NeilBrown
@ 2018-04-30  4:31 ` NeilBrown
  2018-04-30  4:31 ` [PATCH 4/4] NFS: Avoid quadratic search when freeing delegations NeilBrown
  3 siblings, 0 replies; 15+ messages in thread
From: NeilBrown @ 2018-04-30  4:31 UTC (permalink / raw)
  To: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker
  Cc: linux-nfs, Lai Jiangshan, Josh Triplett, Steven Rostedt, linux-kernel

In three places we walk the list of delegations for an nfs_client
until an interesting one is found, then we act of that delegation
and restart the walk.

New delegations are added to the end of a list and the interesting
delegations are usually old, so in many case we won't repeat
a long walk over and over again, but it is possible - particularly if
the first server in the list has a large number of uninteresting
delegations.

In each cache the work done on interesting delegations will often
complete without sleeping, so this could loop many times without
giving up the CPU.

So add a cond_resched() at an appropriate point to avoid hogging the
CPU for too long.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 fs/nfs/delegation.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index af32365894c8..76ab1374f589 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -508,6 +508,7 @@ int nfs_client_return_marked_delegations(struct nfs_client *clp)
 			err = nfs_end_delegation_return(inode, delegation, 0);
 			iput(inode);
 			nfs_sb_deactive(server->super);
+			cond_resched();
 			if (!err)
 				goto restart;
 			set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
@@ -904,6 +905,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp)
 			}
 			iput(inode);
 			nfs_sb_deactive(server->super);
+			cond_resched();
 			goto restart;
 		}
 	}
@@ -1020,6 +1022,7 @@ void nfs_reap_expired_delegations(struct nfs_client *clp)
 			}
 			iput(inode);
 			nfs_sb_deactive(server->super);
+			cond_resched();
 			goto restart;
 		}
 	}

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/4] rculist: add list_for_each_entry_from_rcu()
  2018-04-30  4:31 [PATCH 0/4 V2] Avoid quadratic search when freeing delegations NeilBrown
@ 2018-04-30  4:31 ` NeilBrown
  2018-04-30  5:20   ` Josh Triplett
  2018-04-30  4:31 ` [PATCH 1/4] NFS: slight optimization for walking list for delegations NeilBrown
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2018-04-30  4:31 UTC (permalink / raw)
  To: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker
  Cc: linux-nfs, Lai Jiangshan, Josh Triplett, Steven Rostedt, linux-kernel

list_for_each_entry_from_rcu() is an RCU version of
list_for_each_entry_from().  It walks a linked list under rcu
protection, from a given start point.

It is similar to list_for_each_entry_continue_rcu() but starts *at*
the given position rather than *after* it.

Naturally, the start point must be known to be in the list.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 include/linux/rculist.h |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 127f534fec94..36df6ccbc874 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -403,6 +403,19 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
 	     &pos->member != (head);	\
 	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
 
+/**
+ * list_for_each_entry_from_rcu - iterate over a list from current point
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_node within the struct.
+ *
+ * Iterate over the tail of a list starting from a given position,
+ * which must have been in the list when the RCU read lock was taken.
+ */
+#define list_for_each_entry_from_rcu(pos, head, member)			\
+	for (; &(pos)->member != (head);					\
+		pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
+
 /**
  * hlist_del_rcu - deletes entry from hash list without re-initialization
  * @n: the element to delete from the hash list.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/4] NFS: Avoid quadratic search when freeing delegations.
  2018-04-30  4:31 [PATCH 0/4 V2] Avoid quadratic search when freeing delegations NeilBrown
                   ` (2 preceding siblings ...)
  2018-04-30  4:31 ` [PATCH 2/4] NFS: use cond_resched() when restarting walk of delegation list NeilBrown
@ 2018-04-30  4:31 ` NeilBrown
  3 siblings, 0 replies; 15+ messages in thread
From: NeilBrown @ 2018-04-30  4:31 UTC (permalink / raw)
  To: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker
  Cc: linux-nfs, Lai Jiangshan, Josh Triplett, Steven Rostedt, linux-kernel

There are three places that walk all delegation for an nfs_client and
restart whenever they find something interesting - potentially
resulting in a quadratic search:  If there are 10,000 uninteresting
delegations followed by 10,000 interesting one, then the code
skips over 100,000,000 delegations, which can take a noticeable amount
of time.

Of these nfs_delegation_reap_unclaimed() and
nfs_reap_expired_delegations() are only called during unusual events:
a server reboots or reports expired delegations, probably due to a
network partition.  Optimizing these is not particularly important.

The third, nfs_client_return_marked_delegations(), is called
periodically via nfs_expire_unreferenced_delegations().  It could
cause periodic problems on a busy server.

New delegations are added to the end of the list, so if there are
10,000 open files with delegations, and 10,000 more recently opened files
that received delegations but are now closed, then
nfs_client_return_marked_delegations() can take seconds to skip over
the 10,000 open files 10,000 times.  That is a waste of time.

The avoid this waste a place-holder (an inode) is kept when locks are
dropped, so that the place can usually be found again after taking
rcu_readlock().  This place holder ensure that we find the right
starting point in the list of nfs_servers, and makes is probable that
we find the right starting point in the list of delegations.
We might need to occasionally restart at the head of that list.

It might be possible that the place_holder inode could lose its
delegation separately, and then get a new one using the same (freed
and then reallocated) 'struct nfs_delegation'.  Were this to happen,
the new delegation would be at the end of the list and we would miss
returning some other delegations.  This would have the effect of
unnecessarily delaying the return of some unused delegations until the
next time this function is called - typically 90 seconds later.  As
this is not a correctness issue and is vanishingly unlikely to happen,
it does not seem worth addressing.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 fs/nfs/delegation.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 53 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 76ab1374f589..cfec852c8bad 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -483,28 +483,73 @@ static bool nfs_delegation_need_return(struct nfs_delegation *delegation)
 int nfs_client_return_marked_delegations(struct nfs_client *clp)
 {
 	struct nfs_delegation *delegation;
+	struct nfs_delegation *prev;
 	struct nfs_server *server;
 	struct inode *inode;
+	struct inode *place_holder = NULL;
+	struct nfs_delegation *place_holder_deleg = NULL;
 	int err = 0;
 
 restart:
+	/*
+	 * To avoid quadratic looping we hold a reference
+	 * to an inode place_holder.  Each time we restart, we
+	 * list nfs_servers from the server of that inode, and
+	 * delegation in the server from the delegations of that
+	 * inode.
+	 * prev is an RCU-protected pointer to a delegation which
+	 * wasn't marked for return and might be a good choice for
+	 * the next place_holder.
+	 */
 	rcu_read_lock();
-	list_for_each_entry_rcu(server, &clp->cl_superblocks, client_link) {
-		list_for_each_entry_rcu(delegation, &server->delegations,
-								super_list) {
-			if (!nfs_delegation_need_return(delegation))
+	prev = NULL;
+	if (place_holder)
+		server = NFS_SERVER(place_holder);
+	else
+		server = list_entry_rcu(clp->cl_superblocks.next,
+					struct nfs_server, client_link);
+	list_for_each_entry_from_rcu(server, &clp->cl_superblocks, client_link) {
+		delegation = NULL;
+		if (place_holder && server == NFS_SERVER(place_holder))
+			delegation = rcu_dereference(NFS_I(place_holder)->delegation);
+		if (!delegation || delegation != place_holder_deleg)
+			delegation = list_entry_rcu(server->delegations.next,
+						    struct nfs_delegation, super_list);
+		list_for_each_entry_from_rcu(delegation, &server->delegations, super_list) {
+			struct inode *to_put = NULL;
+
+			if (!nfs_delegation_need_return(delegation)) {
+				prev = delegation;
 				continue;
+			}
 			if (!nfs_sb_active(server->super))
 				break;
+
+			if (prev) {
+				struct inode *tmp;
+
+				tmp = nfs_delegation_grab_inode(prev);
+				if (tmp) {
+					to_put = place_holder;
+					place_holder = tmp;
+					place_holder_deleg = prev;
+				}
+			}
+
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
+				if (to_put)
+					iput(to_put);
 				nfs_sb_deactive(server->super);
 				goto restart;
 			}
 			delegation = nfs_start_delegation_return_locked(NFS_I(inode));
 			rcu_read_unlock();
 
+			if (to_put)
+				iput(to_put);
+
 			err = nfs_end_delegation_return(inode, delegation, 0);
 			iput(inode);
 			nfs_sb_deactive(server->super);
@@ -512,10 +557,14 @@ int nfs_client_return_marked_delegations(struct nfs_client *clp)
 			if (!err)
 				goto restart;
 			set_bit(NFS4CLNT_DELEGRETURN, &clp->cl_state);
+			if (place_holder)
+				iput(place_holder);
 			return err;
 		}
 	}
 	rcu_read_unlock();
+	if (place_holder)
+		iput(place_holder);
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] rculist: add list_for_each_entry_from_rcu()
  2018-04-30  4:31 ` [PATCH 3/4] rculist: add list_for_each_entry_from_rcu() NeilBrown
@ 2018-04-30  5:20   ` Josh Triplett
  2018-04-30 13:43     ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2018-04-30  5:20 UTC (permalink / raw)
  To: NeilBrown
  Cc: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers,
	Anna Schumaker, linux-nfs, Lai Jiangshan, Steven Rostedt,
	linux-kernel

On Mon, Apr 30, 2018 at 02:31:30PM +1000, NeilBrown wrote:
> list_for_each_entry_from_rcu() is an RCU version of
> list_for_each_entry_from().  It walks a linked list under rcu
> protection, from a given start point.
> 
> It is similar to list_for_each_entry_continue_rcu() but starts *at*
> the given position rather than *after* it.
> 
> Naturally, the start point must be known to be in the list.

I'd suggest giving an explicit advisory comment to clarify and suggest
correct usage:

"This would typically require either that you obtained the node from a
previous walk of the list in the same RCU read-side critical section, or
that you held some sort of non-RCU reference (such as a reference count)
to keep the node alive *and* in the list."

(Feel free to wordsmith the exact wording, but something like that seems
like it would help people understand how to use this correctly, and make
it less likely that they'd use it incorrectly.)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] rculist: add list_for_each_entry_from_rcu()
  2018-04-30  5:20   ` Josh Triplett
@ 2018-04-30 13:43     ` Paul E. McKenney
  2018-04-30 15:14       ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2018-04-30 13:43 UTC (permalink / raw)
  To: Josh Triplett
  Cc: NeilBrown, Trond Myklebust, Mathieu Desnoyers, Anna Schumaker,
	linux-nfs, Lai Jiangshan, Steven Rostedt, linux-kernel

On Sun, Apr 29, 2018 at 10:20:33PM -0700, Josh Triplett wrote:
> On Mon, Apr 30, 2018 at 02:31:30PM +1000, NeilBrown wrote:
> > list_for_each_entry_from_rcu() is an RCU version of
> > list_for_each_entry_from().  It walks a linked list under rcu
> > protection, from a given start point.
> > 
> > It is similar to list_for_each_entry_continue_rcu() but starts *at*
> > the given position rather than *after* it.
> > 
> > Naturally, the start point must be known to be in the list.
> 
> I'd suggest giving an explicit advisory comment to clarify and suggest
> correct usage:
> 
> "This would typically require either that you obtained the node from a
> previous walk of the list in the same RCU read-side critical section, or
> that you held some sort of non-RCU reference (such as a reference count)
> to keep the node alive *and* in the list."
> 
> (Feel free to wordsmith the exact wording, but something like that seems
> like it would help people understand how to use this correctly, and make
> it less likely that they'd use it incorrectly.)

What Josh said!  Could you also contrast this with the existing
list_for_each_entry_continue_rcu() macro in the header comment as well
as in the commit log?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] rculist: add list_for_each_entry_from_rcu()
  2018-04-30 13:43     ` Paul E. McKenney
@ 2018-04-30 15:14       ` Steven Rostedt
  2018-04-30 15:35         ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-04-30 15:14 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Josh Triplett, NeilBrown, Trond Myklebust, Mathieu Desnoyers,
	Anna Schumaker, linux-nfs, Lai Jiangshan, linux-kernel

On Mon, 30 Apr 2018 06:43:08 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Sun, Apr 29, 2018 at 10:20:33PM -0700, Josh Triplett wrote:
> > On Mon, Apr 30, 2018 at 02:31:30PM +1000, NeilBrown wrote:  
> > > list_for_each_entry_from_rcu() is an RCU version of
> > > list_for_each_entry_from().  It walks a linked list under rcu
> > > protection, from a given start point.
> > > 
> > > It is similar to list_for_each_entry_continue_rcu() but starts *at*
> > > the given position rather than *after* it.
> > > 
> > > Naturally, the start point must be known to be in the list.  
> > 
> > I'd suggest giving an explicit advisory comment to clarify and suggest
> > correct usage:
> > 
> > "This would typically require either that you obtained the node from a
> > previous walk of the list in the same RCU read-side critical section, or
> > that you held some sort of non-RCU reference (such as a reference count)
> > to keep the node alive *and* in the list."
> > 
> > (Feel free to wordsmith the exact wording, but something like that seems
> > like it would help people understand how to use this correctly, and make
> > it less likely that they'd use it incorrectly.)  
> 
> What Josh said!  Could you also contrast this with the existing
> list_for_each_entry_continue_rcu() macro in the header comment as well
> as in the commit log?
> 

Should Documentation/RCU/whatisRCU.txt be updated?

Specifically section: 7.  FULL LIST OF RCU APIs

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4] NFS: slight optimization for walking list for delegations
  2018-04-30  4:31 ` [PATCH 1/4] NFS: slight optimization for walking list for delegations NeilBrown
@ 2018-04-30 15:17   ` Steven Rostedt
  2018-05-31  5:23     ` [PATCH 1/4 v2] " NeilBrown
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2018-04-30 15:17 UTC (permalink / raw)
  To: NeilBrown
  Cc: Paul E. McKenney, Trond Myklebust, Mathieu Desnoyers,
	Anna Schumaker, linux-nfs, Lai Jiangshan, Josh Triplett,
	linux-kernel

On Mon, 30 Apr 2018 14:31:30 +1000
NeilBrown <neilb@suse.com> wrote:

> There are 3 places where we walk the list of delegations
> for an nfs_client.
> In each case there are two nested loops, one for nfs_servers
> and one for nfs_delegations.
> 
> When we find an interesting delegation we try to get an active
> reference to the server.  If that fails, it is pointless to
> continue to look at the other delegation for the server as
> we will never be able to get an active reference.
> So instead of continuing in the inner loop, break out
> and continue in the outer loop.
> ---
>  fs/nfs/delegation.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 1819d0d0ba4b..af32365894c8 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -495,7 +495,7 @@ int nfs_client_return_marked_delegations(struct nfs_client *clp)
>  			if (!nfs_delegation_need_return(delegation))
>  				continue;
>  			if (!nfs_sb_active(server->super))
> -				continue;
> +				break;

For documentation purposes, what I usually do when using a break inside
a inner loop to continue the outer loop, I add a comment:

				break; /* continue outer loop */

Such that in the future, people will know exactly what you mean and
don't think it's a bug (thinking it breaks out of all loops).

-- Steve


>  			inode = nfs_delegation_grab_inode(delegation);
>  			if (inode == NULL) {
>  				rcu_read_unlock();
> @@ -887,7 +887,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp)
>  						&delegation->flags) == 0)
>  				continue;
>  			if (!nfs_sb_active(server->super))
> -				continue;
> +				break;
>  			inode = nfs_delegation_grab_inode(delegation);
>  			if (inode == NULL) {
>  				rcu_read_unlock();
> @@ -995,7 +995,7 @@ void nfs_reap_expired_delegations(struct nfs_client *clp)
>  						&delegation->flags) == 0)
>  				continue;
>  			if (!nfs_sb_active(server->super))
> -				continue;
> +				break;
>  			inode = nfs_delegation_grab_inode(delegation);
>  			if (inode == NULL) {
>  				rcu_read_unlock();
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4] rculist: add list_for_each_entry_from_rcu()
  2018-04-30 15:14       ` Steven Rostedt
@ 2018-04-30 15:35         ` Paul E. McKenney
  2018-05-01  3:11           ` [PATCH 3/4 v2] " NeilBrown
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2018-04-30 15:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Josh Triplett, NeilBrown, Trond Myklebust, Mathieu Desnoyers,
	Anna Schumaker, linux-nfs, Lai Jiangshan, linux-kernel

On Mon, Apr 30, 2018 at 11:14:54AM -0400, Steven Rostedt wrote:
> On Mon, 30 Apr 2018 06:43:08 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Sun, Apr 29, 2018 at 10:20:33PM -0700, Josh Triplett wrote:
> > > On Mon, Apr 30, 2018 at 02:31:30PM +1000, NeilBrown wrote:  
> > > > list_for_each_entry_from_rcu() is an RCU version of
> > > > list_for_each_entry_from().  It walks a linked list under rcu
> > > > protection, from a given start point.
> > > > 
> > > > It is similar to list_for_each_entry_continue_rcu() but starts *at*
> > > > the given position rather than *after* it.
> > > > 
> > > > Naturally, the start point must be known to be in the list.  
> > > 
> > > I'd suggest giving an explicit advisory comment to clarify and suggest
> > > correct usage:
> > > 
> > > "This would typically require either that you obtained the node from a
> > > previous walk of the list in the same RCU read-side critical section, or
> > > that you held some sort of non-RCU reference (such as a reference count)
> > > to keep the node alive *and* in the list."
> > > 
> > > (Feel free to wordsmith the exact wording, but something like that seems
> > > like it would help people understand how to use this correctly, and make
> > > it less likely that they'd use it incorrectly.)  
> > 
> > What Josh said!  Could you also contrast this with the existing
> > list_for_each_entry_continue_rcu() macro in the header comment as well
> > as in the commit log?
> 
> Should Documentation/RCU/whatisRCU.txt be updated?
> 
> Specifically section: 7.  FULL LIST OF RCU APIs

Very much so, thank you for catching this!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/4 v2] rculist: add list_for_each_entry_from_rcu()
  2018-04-30 15:35         ` Paul E. McKenney
@ 2018-05-01  3:11           ` NeilBrown
  2018-05-01 14:14             ` Paul E. McKenney
  0 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2018-05-01  3:11 UTC (permalink / raw)
  To: paulmck, Steven Rostedt
  Cc: Josh Triplett, Trond Myklebust, Mathieu Desnoyers,
	Anna Schumaker, linux-nfs, Lai Jiangshan, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3815 bytes --]


list_for_each_entry_from_rcu() is an RCU version of
list_for_each_entry_from().  It walks a linked list under rcu
protection, from a given start point.

It is similar to list_for_each_entry_continue_rcu() but starts *at*
the given position rather than *after* it.

Naturally, the start point must be known to be in the list.

Also update the documentation for list_for_each_entry_continue_rcu()
to match the documentation for the new list_for_each_entry_from_rcu(),
and add list_for_each_entry_from_rcu() and the already existing
hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.

Signed-off-by: NeilBrown <neilb@suse.com>
---
 Documentation/RCU/whatisRCU.txt |  2 ++
 include/linux/rculist.h         | 32 +++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index a27fbfb0efb8..b7d38bd212d2 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -814,11 +814,13 @@ RCU list traversal:
 	list_next_rcu
 	list_for_each_entry_rcu
 	list_for_each_entry_continue_rcu
+	list_for_each_entry_from_rcu
 	hlist_first_rcu
 	hlist_next_rcu
 	hlist_pprev_rcu
 	hlist_for_each_entry_rcu
 	hlist_for_each_entry_rcu_bh
+	hlist_for_each_entry_from_rcu
 	hlist_for_each_entry_continue_rcu
 	hlist_for_each_entry_continue_rcu_bh
 	hlist_nulls_first_rcu
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 127f534fec94..4786c2235b98 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -396,13 +396,43 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
  * @member:	the name of the list_head within the struct.
  *
  * Continue to iterate over list of given type, continuing after
- * the current position.
+ * the current position which must have been in the list when the RCU read
+ * lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_from_rcu() except
+ * this starts after the given position and that one starts at the given
+ * position.
  */
 #define list_for_each_entry_continue_rcu(pos, head, member) 		\
 	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
 	     &pos->member != (head);	\
 	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
 
+/**
+ * list_for_each_entry_from_rcu - iterate over a list from current point
+ * @pos:	the type * to use as a loop cursor.
+ * @head:	the head for your list.
+ * @member:	the name of the list_node within the struct.
+ *
+ * Iterate over the tail of a list starting from a given position,
+ * which must have been in the list when the RCU read lock was taken.
+ * This would typically require either that you obtained the node from a
+ * previous walk of the list in the same RCU read-side critical section, or
+ * that you held some sort of non-RCU reference (such as a reference count)
+ * to keep the node alive *and* in the list.
+ *
+ * This iterator is similar to list_for_each_entry_continue_rcu() except
+ * this starts from the given position and that one starts from the position
+ * after the given position.
+ */
+#define list_for_each_entry_from_rcu(pos, head, member)			\
+	for (; &(pos)->member != (head);					\
+		pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
+
 /**
  * hlist_del_rcu - deletes entry from hash list without re-initialization
  * @n: the element to delete from the hash list.
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4 v2] rculist: add list_for_each_entry_from_rcu()
  2018-05-01  3:11           ` [PATCH 3/4 v2] " NeilBrown
@ 2018-05-01 14:14             ` Paul E. McKenney
  2018-05-01 21:34               ` NeilBrown
  0 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2018-05-01 14:14 UTC (permalink / raw)
  To: NeilBrown
  Cc: Steven Rostedt, Josh Triplett, Trond Myklebust,
	Mathieu Desnoyers, Anna Schumaker, linux-nfs, Lai Jiangshan,
	linux-kernel

On Tue, May 01, 2018 at 01:11:29PM +1000, NeilBrown wrote:
> 
> list_for_each_entry_from_rcu() is an RCU version of
> list_for_each_entry_from().  It walks a linked list under rcu
> protection, from a given start point.
> 
> It is similar to list_for_each_entry_continue_rcu() but starts *at*
> the given position rather than *after* it.
> 
> Naturally, the start point must be known to be in the list.
> 
> Also update the documentation for list_for_each_entry_continue_rcu()
> to match the documentation for the new list_for_each_entry_from_rcu(),
> and add list_for_each_entry_from_rcu() and the already existing
> hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

I am guessing that it would be more convenient for you to carry this
with the patches using it, so:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

If not, please let me know.

							Thanx, Paul

> ---
>  Documentation/RCU/whatisRCU.txt |  2 ++
>  include/linux/rculist.h         | 32 +++++++++++++++++++++++++++++++-
>  2 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
> index a27fbfb0efb8..b7d38bd212d2 100644
> --- a/Documentation/RCU/whatisRCU.txt
> +++ b/Documentation/RCU/whatisRCU.txt
> @@ -814,11 +814,13 @@ RCU list traversal:
>  	list_next_rcu
>  	list_for_each_entry_rcu
>  	list_for_each_entry_continue_rcu
> +	list_for_each_entry_from_rcu
>  	hlist_first_rcu
>  	hlist_next_rcu
>  	hlist_pprev_rcu
>  	hlist_for_each_entry_rcu
>  	hlist_for_each_entry_rcu_bh
> +	hlist_for_each_entry_from_rcu
>  	hlist_for_each_entry_continue_rcu
>  	hlist_for_each_entry_continue_rcu_bh
>  	hlist_nulls_first_rcu
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 127f534fec94..4786c2235b98 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -396,13 +396,43 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
>   * @member:	the name of the list_head within the struct.
>   *
>   * Continue to iterate over list of given type, continuing after
> - * the current position.
> + * the current position which must have been in the list when the RCU read
> + * lock was taken.
> + * This would typically require either that you obtained the node from a
> + * previous walk of the list in the same RCU read-side critical section, or
> + * that you held some sort of non-RCU reference (such as a reference count)
> + * to keep the node alive *and* in the list.
> + *
> + * This iterator is similar to list_for_each_entry_from_rcu() except
> + * this starts after the given position and that one starts at the given
> + * position.
>   */
>  #define list_for_each_entry_continue_rcu(pos, head, member) 		\
>  	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
>  	     &pos->member != (head);	\
>  	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>  
> +/**
> + * list_for_each_entry_from_rcu - iterate over a list from current point
> + * @pos:	the type * to use as a loop cursor.
> + * @head:	the head for your list.
> + * @member:	the name of the list_node within the struct.
> + *
> + * Iterate over the tail of a list starting from a given position,
> + * which must have been in the list when the RCU read lock was taken.
> + * This would typically require either that you obtained the node from a
> + * previous walk of the list in the same RCU read-side critical section, or
> + * that you held some sort of non-RCU reference (such as a reference count)
> + * to keep the node alive *and* in the list.
> + *
> + * This iterator is similar to list_for_each_entry_continue_rcu() except
> + * this starts from the given position and that one starts from the position
> + * after the given position.
> + */
> +#define list_for_each_entry_from_rcu(pos, head, member)			\
> +	for (; &(pos)->member != (head);					\
> +		pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
> +
>  /**
>   * hlist_del_rcu - deletes entry from hash list without re-initialization
>   * @n: the element to delete from the hash list.
> -- 
> 2.14.0.rc0.dirty
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/4 v2] rculist: add list_for_each_entry_from_rcu()
  2018-05-01 14:14             ` Paul E. McKenney
@ 2018-05-01 21:34               ` NeilBrown
  0 siblings, 0 replies; 15+ messages in thread
From: NeilBrown @ 2018-05-01 21:34 UTC (permalink / raw)
  To: paulmck
  Cc: Steven Rostedt, Josh Triplett, Trond Myklebust,
	Mathieu Desnoyers, Anna Schumaker, linux-nfs, Lai Jiangshan,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4635 bytes --]

On Tue, May 01 2018, Paul E. McKenney wrote:

> On Tue, May 01, 2018 at 01:11:29PM +1000, NeilBrown wrote:
>> 
>> list_for_each_entry_from_rcu() is an RCU version of
>> list_for_each_entry_from().  It walks a linked list under rcu
>> protection, from a given start point.
>> 
>> It is similar to list_for_each_entry_continue_rcu() but starts *at*
>> the given position rather than *after* it.
>> 
>> Naturally, the start point must be known to be in the list.
>> 
>> Also update the documentation for list_for_each_entry_continue_rcu()
>> to match the documentation for the new list_for_each_entry_from_rcu(),
>> and add list_for_each_entry_from_rcu() and the already existing
>> hlist_for_each_entry_from_rcu() to section 7 of whatisRCU.txt.
>> 
>> Signed-off-by: NeilBrown <neilb@suse.com>
>
> I am guessing that it would be more convenient for you to carry this
> with the patches using it, so:
>
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Thanks Paul,
 yes I do hope this patch will go in with the other NFS patches.
 I'll wait to see what Trond and Anna think before proceeding with
 anything myself.

Thanks a lot,
NeilBrown


>
> If not, please let me know.
>
> 							Thanx, Paul
>
>> ---
>>  Documentation/RCU/whatisRCU.txt |  2 ++
>>  include/linux/rculist.h         | 32 +++++++++++++++++++++++++++++++-
>>  2 files changed, 33 insertions(+), 1 deletion(-)
>> 
>> diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
>> index a27fbfb0efb8..b7d38bd212d2 100644
>> --- a/Documentation/RCU/whatisRCU.txt
>> +++ b/Documentation/RCU/whatisRCU.txt
>> @@ -814,11 +814,13 @@ RCU list traversal:
>>  	list_next_rcu
>>  	list_for_each_entry_rcu
>>  	list_for_each_entry_continue_rcu
>> +	list_for_each_entry_from_rcu
>>  	hlist_first_rcu
>>  	hlist_next_rcu
>>  	hlist_pprev_rcu
>>  	hlist_for_each_entry_rcu
>>  	hlist_for_each_entry_rcu_bh
>> +	hlist_for_each_entry_from_rcu
>>  	hlist_for_each_entry_continue_rcu
>>  	hlist_for_each_entry_continue_rcu_bh
>>  	hlist_nulls_first_rcu
>> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
>> index 127f534fec94..4786c2235b98 100644
>> --- a/include/linux/rculist.h
>> +++ b/include/linux/rculist.h
>> @@ -396,13 +396,43 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
>>   * @member:	the name of the list_head within the struct.
>>   *
>>   * Continue to iterate over list of given type, continuing after
>> - * the current position.
>> + * the current position which must have been in the list when the RCU read
>> + * lock was taken.
>> + * This would typically require either that you obtained the node from a
>> + * previous walk of the list in the same RCU read-side critical section, or
>> + * that you held some sort of non-RCU reference (such as a reference count)
>> + * to keep the node alive *and* in the list.
>> + *
>> + * This iterator is similar to list_for_each_entry_from_rcu() except
>> + * this starts after the given position and that one starts at the given
>> + * position.
>>   */
>>  #define list_for_each_entry_continue_rcu(pos, head, member) 		\
>>  	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
>>  	     &pos->member != (head);	\
>>  	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>>  
>> +/**
>> + * list_for_each_entry_from_rcu - iterate over a list from current point
>> + * @pos:	the type * to use as a loop cursor.
>> + * @head:	the head for your list.
>> + * @member:	the name of the list_node within the struct.
>> + *
>> + * Iterate over the tail of a list starting from a given position,
>> + * which must have been in the list when the RCU read lock was taken.
>> + * This would typically require either that you obtained the node from a
>> + * previous walk of the list in the same RCU read-side critical section, or
>> + * that you held some sort of non-RCU reference (such as a reference count)
>> + * to keep the node alive *and* in the list.
>> + *
>> + * This iterator is similar to list_for_each_entry_continue_rcu() except
>> + * this starts from the given position and that one starts from the position
>> + * after the given position.
>> + */
>> +#define list_for_each_entry_from_rcu(pos, head, member)			\
>> +	for (; &(pos)->member != (head);					\
>> +		pos = list_entry_rcu(pos->member.next, typeof(*(pos)), member))
>> +
>>  /**
>>   * hlist_del_rcu - deletes entry from hash list without re-initialization
>>   * @n: the element to delete from the hash list.
>> -- 
>> 2.14.0.rc0.dirty
>> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/4 v2] NFS: slight optimization for walking list for delegations
  2018-04-30 15:17   ` Steven Rostedt
@ 2018-05-31  5:23     ` NeilBrown
  2018-06-04 21:31       ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: NeilBrown @ 2018-05-31  5:23 UTC (permalink / raw)
  To: Steven Rostedt, Trond Myklebust, Anna Schumaker, linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1996 bytes --]


There are 3 places where we walk the list of delegations
for an nfs_client.
In each case there are two nested loops, one for nfs_servers
and one for nfs_delegations.

When we find an interesting delegation we try to get an active
reference to the server.  If that fails, it is pointless to
continue to look at the other delegation for the server as
we will never be able to get an active reference.
So instead of continuing in the inner loop, break out
and continue in the outer loop.

Signed-off-by: NeilBrown <neilb@suse.com>
---

This time with a Signed-off-by - sorry.
I took the opportunity to follow Steven's suggestion
of adding a comment to the new 'break' statements.

Thanks,
NeilBrown

 fs/nfs/delegation.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 1819d0d0ba4b..b9cd3864335b 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -495,7 +495,7 @@ int nfs_client_return_marked_delegations(struct nfs_client *clp)
 			if (!nfs_delegation_need_return(delegation))
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break; /* continue in outer loop */
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
@@ -887,7 +887,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp)
 						&delegation->flags) == 0)
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break; /* continue in outer loop */
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
@@ -995,7 +995,7 @@ void nfs_reap_expired_delegations(struct nfs_client *clp)
 						&delegation->flags) == 0)
 				continue;
 			if (!nfs_sb_active(server->super))
-				continue;
+				break; /* continue in outer loop */
 			inode = nfs_delegation_grab_inode(delegation);
 			if (inode == NULL) {
 				rcu_read_unlock();
-- 
2.14.0.rc0.dirty


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/4 v2] NFS: slight optimization for walking list for delegations
  2018-05-31  5:23     ` [PATCH 1/4 v2] " NeilBrown
@ 2018-06-04 21:31       ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2018-06-04 21:31 UTC (permalink / raw)
  To: NeilBrown; +Cc: Trond Myklebust, Anna Schumaker, linux-nfs, linux-kernel

On Thu, 31 May 2018 15:23:22 +1000
NeilBrown <neilb@suse.com> wrote:

> There are 3 places where we walk the list of delegations
> for an nfs_client.
> In each case there are two nested loops, one for nfs_servers
> and one for nfs_delegations.
> 
> When we find an interesting delegation we try to get an active
> reference to the server.  If that fails, it is pointless to
> continue to look at the other delegation for the server as
> we will never be able to get an active reference.
> So instead of continuing in the inner loop, break out
> and continue in the outer loop.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>
> ---
> 
> This time with a Signed-off-by - sorry.
> I took the opportunity to follow Steven's suggestion
> of adding a comment to the new 'break' statements.
> 

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-06-04 21:31 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-30  4:31 [PATCH 0/4 V2] Avoid quadratic search when freeing delegations NeilBrown
2018-04-30  4:31 ` [PATCH 3/4] rculist: add list_for_each_entry_from_rcu() NeilBrown
2018-04-30  5:20   ` Josh Triplett
2018-04-30 13:43     ` Paul E. McKenney
2018-04-30 15:14       ` Steven Rostedt
2018-04-30 15:35         ` Paul E. McKenney
2018-05-01  3:11           ` [PATCH 3/4 v2] " NeilBrown
2018-05-01 14:14             ` Paul E. McKenney
2018-05-01 21:34               ` NeilBrown
2018-04-30  4:31 ` [PATCH 1/4] NFS: slight optimization for walking list for delegations NeilBrown
2018-04-30 15:17   ` Steven Rostedt
2018-05-31  5:23     ` [PATCH 1/4 v2] " NeilBrown
2018-06-04 21:31       ` Steven Rostedt
2018-04-30  4:31 ` [PATCH 2/4] NFS: use cond_resched() when restarting walk of delegation list NeilBrown
2018-04-30  4:31 ` [PATCH 4/4] NFS: Avoid quadratic search when freeing delegations NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).