All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-18 13:33 David Howells
  2010-03-19  2:25   ` Paul E. McKenney
  2010-03-29 19:02 ` David Howells
  0 siblings, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-18 13:33 UTC (permalink / raw)
  To: Trond.Myklebust, paulmck; +Cc: linux-nfs, linux-kernel, David Howells

Fix a number of RCU warnings in nfs_inode_return_delegation_noreclaim().
nfs_inode_return_delegation_noreclaim() and nfs_inode_return_delegation() don't
need to use rcu_dereference() outside the spinlocked region as they merely
examin the pointer and don't follow it, thus rendering unnecessary the need to
impose a partial ordering over the one item of interest.

nfs_detach_delegation_locked() doesn't need rcu_derefence() because it can only
be called if nfs_client::cl_lock is held, and that guards against anyone
changing nfsi->delegation under it.  Furthermore, the barrier in
rcu_derefence() is superfluous, given that the spin_lock() is also a barrier.

nfs_free_delegation() should be using rcu_dereference_check() to validate the
state that the data is in (the delegation inode must have been cleared).  By
this point, the delegation is being released, so no one else should be
attempting to use the saved credentials, and they can be cleared.  However,
rcu_assign_pointer() should be used to clear them, and the delegation itself
must still use call_rcu() as the list of delegations could be being traversed.


[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
fs/nfs/delegation.c:332 invoked rcu_dereference_check() without protection!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
2 locks held by mount.nfs4/2281:
 #0:  (&type->s_umount_key#34){+.+...}, at: [<ffffffff810b25b4>] deactivate_super+0x60/0x80
 #1:  (iprune_sem){+.+...}, at: [<ffffffff810c332a>] invalidate_inodes+0x39/0x13a

stack backtrace:
Pid: 2281, comm: mount.nfs4 Not tainted 2.6.34-rc1-cachefs #110
Call Trace:
 [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffffa00b4591>] nfs_inode_return_delegation_noreclaim+0x5b/0xa0 [nfs]
 [<ffffffffa0095d63>] nfs4_clear_inode+0x11/0x1e [nfs]
 [<ffffffff810c2d92>] clear_inode+0x9e/0xf8
 [<ffffffff810c3028>] dispose_list+0x67/0x10e
 [<ffffffff810c340d>] invalidate_inodes+0x11c/0x13a
 [<ffffffff810b1dc1>] generic_shutdown_super+0x42/0xf4
 [<ffffffff810b1ebe>] kill_anon_super+0x11/0x4f
 [<ffffffffa009893c>] nfs4_kill_super+0x3f/0x72 [nfs]
 [<ffffffff810b25bc>] deactivate_super+0x68/0x80
 [<ffffffff810c6744>] mntput_no_expire+0xbb/0xf8
 [<ffffffff810c681b>] release_mounts+0x9a/0xb0
 [<ffffffff810c689b>] put_mnt_ns+0x6a/0x79
 [<ffffffffa00983a1>] nfs_follow_remote_path+0x5a/0x146 [nfs]
 [<ffffffffa0098334>] ? nfs_do_root_mount+0x82/0x95 [nfs]
 [<ffffffffa00985a9>] nfs4_try_mount+0x75/0xaf [nfs]
 [<ffffffffa0098874>] nfs4_get_sb+0x291/0x31a [nfs]
 [<ffffffff810b2059>] vfs_kern_mount+0xb8/0x177
 [<ffffffff810b2176>] do_kern_mount+0x48/0xe8
 [<ffffffff810c810b>] do_mount+0x782/0x7f9
 [<ffffffff810c8205>] sys_mount+0x83/0xbe
 [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b

Also on:

fs/nfs/delegation.c:215 invoked rcu_dereference_check() without protection!
 [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffffa00b4223>] nfs_inode_set_delegation+0xfe/0x219 [nfs]
 [<ffffffffa00a9c6f>] nfs4_opendata_to_nfs4_state+0x2c2/0x30d [nfs]
 [<ffffffffa00aa15d>] nfs4_do_open+0x2a6/0x3a6 [nfs]
 ...

And:

fs/nfs/delegation.c:40 invoked rcu_dereference_check() without protection!
 [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
 [<ffffffffa00b3bef>] nfs_free_delegation+0x3d/0x6e [nfs]
 [<ffffffffa00b3e71>] nfs_do_return_delegation+0x26/0x30 [nfs]
 [<ffffffffa00b406a>] __nfs_inode_return_delegation+0x1ef/0x1fe [nfs]
 [<ffffffffa00b448a>] nfs_client_return_marked_delegations+0xc9/0x124 [nfs]
 ...


Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/nfs/delegation.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 2563beb..fa9b7c5 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -37,7 +37,8 @@ static void nfs_free_delegation(struct nfs_delegation *delegation)
 {
 	struct rpc_cred *cred;
 
-	cred = rcu_dereference(delegation->cred);
+	cred = rcu_dereference_check(delegation->cred,
+				     delegation->inode == NULL);
 	rcu_assign_pointer(delegation->cred, NULL);
 	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
 	if (cred)
@@ -167,7 +168,7 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
 
 static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
 {
-	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
+	struct nfs_delegation *delegation = nfsi->delegation;
 
 	if (delegation == NULL)
 		goto nomatch;
@@ -212,7 +213,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
 	spin_lock_init(&delegation->lock);
 
 	spin_lock(&clp->cl_lock);
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
 					sizeof(delegation->stateid)) == 0 &&
 				delegation->type == nfsi->delegation->type) {
@@ -329,7 +330,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
 	struct nfs_inode *nfsi = NFS_I(inode);
 	struct nfs_delegation *delegation;
 
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		spin_lock(&clp->cl_lock);
 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
 		spin_unlock(&clp->cl_lock);
@@ -345,7 +346,7 @@ int nfs_inode_return_delegation(struct inode *inode)
 	struct nfs_delegation *delegation;
 	int err = 0;
 
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		spin_lock(&clp->cl_lock);
 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
 		spin_unlock(&clp->cl_lock);


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-19  2:25   ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-19  2:25 UTC (permalink / raw)
  To: David Howells; +Cc: Trond.Myklebust, linux-nfs, linux-kernel

On Thu, Mar 18, 2010 at 01:33:02PM +0000, David Howells wrote:
> Fix a number of RCU warnings in nfs_inode_return_delegation_noreclaim().
> nfs_inode_return_delegation_noreclaim() and nfs_inode_return_delegation() don't
> need to use rcu_dereference() outside the spinlocked region as they merely
> examin the pointer and don't follow it, thus rendering unnecessary the need to
> impose a partial ordering over the one item of interest.
> 
> nfs_detach_delegation_locked() doesn't need rcu_derefence() because it can only
> be called if nfs_client::cl_lock is held, and that guards against anyone
> changing nfsi->delegation under it.  Furthermore, the barrier in
> rcu_derefence() is superfluous, given that the spin_lock() is also a barrier.
> 
> nfs_free_delegation() should be using rcu_dereference_check() to validate the
> state that the data is in (the delegation inode must have been cleared).  By
> this point, the delegation is being released, so no one else should be
> attempting to use the saved credentials, and they can be cleared.  However,
> rcu_assign_pointer() should be used to clear them, and the delegation itself
> must still use call_rcu() as the list of delegations could be being traversed.

Thank you for fixing these up!

This looks good at the moment, however, the sparse changes that Arnd
Bergmann is working on will invalidate a couple of the changes below.
Of course, better a future problem than a here-and-now problem, but
is there an easy way to fix both?

							Thanx, Paul

> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> fs/nfs/delegation.c:332 invoked rcu_dereference_check() without protection!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 2 locks held by mount.nfs4/2281:
>  #0:  (&type->s_umount_key#34){+.+...}, at: [<ffffffff810b25b4>] deactivate_super+0x60/0x80
>  #1:  (iprune_sem){+.+...}, at: [<ffffffff810c332a>] invalidate_inodes+0x39/0x13a
> 
> stack backtrace:
> Pid: 2281, comm: mount.nfs4 Not tainted 2.6.34-rc1-cachefs #110
> Call Trace:
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b4591>] nfs_inode_return_delegation_noreclaim+0x5b/0xa0 [nfs]
>  [<ffffffffa0095d63>] nfs4_clear_inode+0x11/0x1e [nfs]
>  [<ffffffff810c2d92>] clear_inode+0x9e/0xf8
>  [<ffffffff810c3028>] dispose_list+0x67/0x10e
>  [<ffffffff810c340d>] invalidate_inodes+0x11c/0x13a
>  [<ffffffff810b1dc1>] generic_shutdown_super+0x42/0xf4
>  [<ffffffff810b1ebe>] kill_anon_super+0x11/0x4f
>  [<ffffffffa009893c>] nfs4_kill_super+0x3f/0x72 [nfs]
>  [<ffffffff810b25bc>] deactivate_super+0x68/0x80
>  [<ffffffff810c6744>] mntput_no_expire+0xbb/0xf8
>  [<ffffffff810c681b>] release_mounts+0x9a/0xb0
>  [<ffffffff810c689b>] put_mnt_ns+0x6a/0x79
>  [<ffffffffa00983a1>] nfs_follow_remote_path+0x5a/0x146 [nfs]
>  [<ffffffffa0098334>] ? nfs_do_root_mount+0x82/0x95 [nfs]
>  [<ffffffffa00985a9>] nfs4_try_mount+0x75/0xaf [nfs]
>  [<ffffffffa0098874>] nfs4_get_sb+0x291/0x31a [nfs]
>  [<ffffffff810b2059>] vfs_kern_mount+0xb8/0x177
>  [<ffffffff810b2176>] do_kern_mount+0x48/0xe8
>  [<ffffffff810c810b>] do_mount+0x782/0x7f9
>  [<ffffffff810c8205>] sys_mount+0x83/0xbe
>  [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b
> 
> Also on:
> 
> fs/nfs/delegation.c:215 invoked rcu_dereference_check() without protection!
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b4223>] nfs_inode_set_delegation+0xfe/0x219 [nfs]
>  [<ffffffffa00a9c6f>] nfs4_opendata_to_nfs4_state+0x2c2/0x30d [nfs]
>  [<ffffffffa00aa15d>] nfs4_do_open+0x2a6/0x3a6 [nfs]
>  ...
> 
> And:
> 
> fs/nfs/delegation.c:40 invoked rcu_dereference_check() without protection!
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b3bef>] nfs_free_delegation+0x3d/0x6e [nfs]
>  [<ffffffffa00b3e71>] nfs_do_return_delegation+0x26/0x30 [nfs]
>  [<ffffffffa00b406a>] __nfs_inode_return_delegation+0x1ef/0x1fe [nfs]
>  [<ffffffffa00b448a>] nfs_client_return_marked_delegations+0xc9/0x124 [nfs]
>  ...
> 
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/nfs/delegation.c |   11 ++++++-----
>  1 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 2563beb..fa9b7c5 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -37,7 +37,8 @@ static void nfs_free_delegation(struct nfs_delegation *delegation)
>  {
>  	struct rpc_cred *cred;
> 
> -	cred = rcu_dereference(delegation->cred);
> +	cred = rcu_dereference_check(delegation->cred,
> +				     delegation->inode == NULL);
>  	rcu_assign_pointer(delegation->cred, NULL);
>  	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
>  	if (cred)
> @@ -167,7 +168,7 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
> 
>  static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
>  {
> -	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> +	struct nfs_delegation *delegation = nfsi->delegation;

Arnd's work will flag this one.

> 
>  	if (delegation == NULL)
>  		goto nomatch;
> @@ -212,7 +213,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
>  	spin_lock_init(&delegation->lock);
> 
>  	spin_lock(&clp->cl_lock);
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

And this one.  I thought that Trond said that clp->cl_lock protects
this one, in which case this should work:

	if (rcu_dereference_check(nfsi->delegation,
				  lockdep_is_held(&clp->cl_lock)) != NULL) {

>  		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
>  					sizeof(delegation->stateid)) == 0 &&
>  				delegation->type == nfsi->delegation->type) {
> @@ -329,7 +330,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
>  	struct nfs_inode *nfsi = NFS_I(inode);
>  	struct nfs_delegation *delegation;
> 
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

And this one, although the check for cp->cl_lock obviously won't work here.

>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> @@ -345,7 +346,7 @@ int nfs_inode_return_delegation(struct inode *inode)
>  	struct nfs_delegation *delegation;
>  	int err = 0;
> 
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

Ditto...

>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-19  2:25   ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-19  2:25 UTC (permalink / raw)
  To: David Howells; +Cc: Trond.Myklebust, linux-nfs, linux-kernel

On Thu, Mar 18, 2010 at 01:33:02PM +0000, David Howells wrote:
> Fix a number of RCU warnings in nfs_inode_return_delegation_noreclaim().
> nfs_inode_return_delegation_noreclaim() and nfs_inode_return_delegation() don't
> need to use rcu_dereference() outside the spinlocked region as they merely
> examin the pointer and don't follow it, thus rendering unnecessary the need to
> impose a partial ordering over the one item of interest.
> 
> nfs_detach_delegation_locked() doesn't need rcu_derefence() because it can only
> be called if nfs_client::cl_lock is held, and that guards against anyone
> changing nfsi->delegation under it.  Furthermore, the barrier in
> rcu_derefence() is superfluous, given that the spin_lock() is also a barrier.
> 
> nfs_free_delegation() should be using rcu_dereference_check() to validate the
> state that the data is in (the delegation inode must have been cleared).  By
> this point, the delegation is being released, so no one else should be
> attempting to use the saved credentials, and they can be cleared.  However,
> rcu_assign_pointer() should be used to clear them, and the delegation itself
> must still use call_rcu() as the list of delegations could be being traversed.

Thank you for fixing these up!

This looks good at the moment, however, the sparse changes that Arnd
Bergmann is working on will invalidate a couple of the changes below.
Of course, better a future problem than a here-and-now problem, but
is there an easy way to fix both?

							Thanx, Paul

> [ INFO: suspicious rcu_dereference_check() usage. ]
> ---------------------------------------------------
> fs/nfs/delegation.c:332 invoked rcu_dereference_check() without protection!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 2 locks held by mount.nfs4/2281:
>  #0:  (&type->s_umount_key#34){+.+...}, at: [<ffffffff810b25b4>] deactivate_super+0x60/0x80
>  #1:  (iprune_sem){+.+...}, at: [<ffffffff810c332a>] invalidate_inodes+0x39/0x13a
> 
> stack backtrace:
> Pid: 2281, comm: mount.nfs4 Not tainted 2.6.34-rc1-cachefs #110
> Call Trace:
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b4591>] nfs_inode_return_delegation_noreclaim+0x5b/0xa0 [nfs]
>  [<ffffffffa0095d63>] nfs4_clear_inode+0x11/0x1e [nfs]
>  [<ffffffff810c2d92>] clear_inode+0x9e/0xf8
>  [<ffffffff810c3028>] dispose_list+0x67/0x10e
>  [<ffffffff810c340d>] invalidate_inodes+0x11c/0x13a
>  [<ffffffff810b1dc1>] generic_shutdown_super+0x42/0xf4
>  [<ffffffff810b1ebe>] kill_anon_super+0x11/0x4f
>  [<ffffffffa009893c>] nfs4_kill_super+0x3f/0x72 [nfs]
>  [<ffffffff810b25bc>] deactivate_super+0x68/0x80
>  [<ffffffff810c6744>] mntput_no_expire+0xbb/0xf8
>  [<ffffffff810c681b>] release_mounts+0x9a/0xb0
>  [<ffffffff810c689b>] put_mnt_ns+0x6a/0x79
>  [<ffffffffa00983a1>] nfs_follow_remote_path+0x5a/0x146 [nfs]
>  [<ffffffffa0098334>] ? nfs_do_root_mount+0x82/0x95 [nfs]
>  [<ffffffffa00985a9>] nfs4_try_mount+0x75/0xaf [nfs]
>  [<ffffffffa0098874>] nfs4_get_sb+0x291/0x31a [nfs]
>  [<ffffffff810b2059>] vfs_kern_mount+0xb8/0x177
>  [<ffffffff810b2176>] do_kern_mount+0x48/0xe8
>  [<ffffffff810c810b>] do_mount+0x782/0x7f9
>  [<ffffffff810c8205>] sys_mount+0x83/0xbe
>  [<ffffffff81001eeb>] system_call_fastpath+0x16/0x1b
> 
> Also on:
> 
> fs/nfs/delegation.c:215 invoked rcu_dereference_check() without protection!
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b4223>] nfs_inode_set_delegation+0xfe/0x219 [nfs]
>  [<ffffffffa00a9c6f>] nfs4_opendata_to_nfs4_state+0x2c2/0x30d [nfs]
>  [<ffffffffa00aa15d>] nfs4_do_open+0x2a6/0x3a6 [nfs]
>  ...
> 
> And:
> 
> fs/nfs/delegation.c:40 invoked rcu_dereference_check() without protection!
>  [<ffffffff8105149f>] lockdep_rcu_dereference+0xaa/0xb2
>  [<ffffffffa00b3bef>] nfs_free_delegation+0x3d/0x6e [nfs]
>  [<ffffffffa00b3e71>] nfs_do_return_delegation+0x26/0x30 [nfs]
>  [<ffffffffa00b406a>] __nfs_inode_return_delegation+0x1ef/0x1fe [nfs]
>  [<ffffffffa00b448a>] nfs_client_return_marked_delegations+0xc9/0x124 [nfs]
>  ...
> 
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/nfs/delegation.c |   11 ++++++-----
>  1 files changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 2563beb..fa9b7c5 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -37,7 +37,8 @@ static void nfs_free_delegation(struct nfs_delegation *delegation)
>  {
>  	struct rpc_cred *cred;
> 
> -	cred = rcu_dereference(delegation->cred);
> +	cred = rcu_dereference_check(delegation->cred,
> +				     delegation->inode == NULL);
>  	rcu_assign_pointer(delegation->cred, NULL);
>  	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
>  	if (cred)
> @@ -167,7 +168,7 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
> 
>  static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
>  {
> -	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> +	struct nfs_delegation *delegation = nfsi->delegation;

Arnd's work will flag this one.

> 
>  	if (delegation == NULL)
>  		goto nomatch;
> @@ -212,7 +213,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
>  	spin_lock_init(&delegation->lock);
> 
>  	spin_lock(&clp->cl_lock);
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

And this one.  I thought that Trond said that clp->cl_lock protects
this one, in which case this should work:

	if (rcu_dereference_check(nfsi->delegation,
				  lockdep_is_held(&clp->cl_lock)) != NULL) {

>  		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
>  					sizeof(delegation->stateid)) == 0 &&
>  				delegation->type == nfsi->delegation->type) {
> @@ -329,7 +330,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
>  	struct nfs_inode *nfsi = NFS_I(inode);
>  	struct nfs_delegation *delegation;
> 
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

And this one, although the check for cp->cl_lock obviously won't work here.

>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> @@ -345,7 +346,7 @@ int nfs_inode_return_delegation(struct inode *inode)
>  	struct nfs_delegation *delegation;
>  	int err = 0;
> 
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {

Ditto...

>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-18 13:33 [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
  2010-03-19  2:25   ` Paul E. McKenney
@ 2010-03-29 19:02 ` David Howells
  2010-03-29 19:21   ` Paul E. McKenney
  2010-03-29 20:15   ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-29 19:02 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> >  	spin_lock(&clp->cl_lock);
> > -	if (rcu_dereference(nfsi->delegation) != NULL) {
> > +	if (nfsi->delegation != NULL) {
> 
> And this one.  I thought that Trond said that clp->cl_lock protects
> this one, in which case this should work:
> 
> 	if (rcu_dereference_check(nfsi->delegation,
> 				  lockdep_is_held(&clp->cl_lock)) != NULL) {

If clp->cl_lock protects this pointer, why the need for rcu_dereference_check()
at all?  The check is redundant since the line above gets the very lock we're
checking for.

> > -	if (rcu_dereference(nfsi->delegation) != NULL) {
> > +	if (nfsi->delegation != NULL) {
> 
> And this one, although the check for cp->cl_lock obviously won't work here.
> 
> >  		spin_lock(&clp->cl_lock);
> >  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> >  		spin_unlock(&clp->cl_lock);

On this one, why does nfsi->delegation need a memory barrier interpolating
afterwards?  It has an implicit one in the form of the spin_lock() immediately
after, if the value of the pointer wasn't NULL.  What two memory accesses is
the memory barrier ordering?

Ditto on the next one.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 19:02 ` David Howells
@ 2010-03-29 19:21   ` Paul E. McKenney
  2010-03-29 20:15   ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-29 19:21 UTC (permalink / raw)
  To: David Howells; +Cc: Trond.Myklebust, linux-nfs, linux-kernel

On Mon, Mar 29, 2010 at 08:02:28PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > >  	spin_lock(&clp->cl_lock);
> > > -	if (rcu_dereference(nfsi->delegation) != NULL) {
> > > +	if (nfsi->delegation != NULL) {
> > 
> > And this one.  I thought that Trond said that clp->cl_lock protects
> > this one, in which case this should work:
> > 
> > 	if (rcu_dereference_check(nfsi->delegation,
> > 				  lockdep_is_held(&clp->cl_lock)) != NULL) {
> 
> If clp->cl_lock protects this pointer, why the need for rcu_dereference_check()
> at all?  The check is redundant since the line above gets the very lock we're
> checking for.

Because Arnd Bergmann is working on a set of patches that makes sparse
complain if you access an RCU-protected pointer directly, without using
some flavor of rcu_dereference().

So your approach would work for the moment, but would need another
change, probably in the 2.6.35 timeframe.

> > > -	if (rcu_dereference(nfsi->delegation) != NULL) {
> > > +	if (nfsi->delegation != NULL) {
> > 
> > And this one, although the check for cp->cl_lock obviously won't work here.
> > 
> > >  		spin_lock(&clp->cl_lock);
> > >  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> > >  		spin_unlock(&clp->cl_lock);
> 
> On this one, why does nfsi->delegation need a memory barrier interpolating
> afterwards?  It has an implicit one in the form of the spin_lock() immediately
> after, if the value of the pointer wasn't NULL.  What two memory accesses is
> the memory barrier ordering?
> 
> Ditto on the next one.

I must defer to Trond on this one.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 19:02 ` David Howells
  2010-03-29 19:21   ` Paul E. McKenney
@ 2010-03-29 20:15   ` David Howells
  2010-03-29 20:26       ` Eric Dumazet
                       ` (2 more replies)
  1 sibling, 3 replies; 48+ messages in thread
From: David Howells @ 2010-03-29 20:15 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > > 	if (rcu_dereference_check(nfsi->delegation,
> > > 				  lockdep_is_held(&clp->cl_lock)) != NULL) {
> > 
> > If clp->cl_lock protects this pointer, why the need for
> > rcu_dereference_check() at all?  The check is redundant since the line
> > above gets the very lock we're checking for.
> 
> Because Arnd Bergmann is working on a set of patches that makes sparse
> complain if you access an RCU-protected pointer directly, without using
> some flavor of rcu_dereference().
> 
> So your approach would work for the moment, but would need another
> change, probably in the 2.6.35 timeframe.

My objection to using rcu_dereference_check() here is that it's a dynamic
check: the compiler emits code to do it, since the lock/unlock status of what
the pointer points to cannot be determined easily at compiler time - and then
the barrier is interpolated anyway unnecessarily.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 20:15   ` David Howells
@ 2010-03-29 20:26       ` Eric Dumazet
  2010-03-29 21:05     ` Paul E. McKenney
  2010-03-29 22:22     ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-29 20:26 UTC (permalink / raw)
  To: David Howells; +Cc: paulmck, Trond.Myklebust, linux-nfs, linux-kernel

Le lundi 29 mars 2010 à 21:15 +0100, David Howells a écrit :
> My objection to using rcu_dereference_check() here is that it's a dynamic
> check: the compiler emits code to do it, since the lock/unlock status of what
> the pointer points to cannot be determined easily at compiler time - and then
> the barrier is interpolated anyway unnecessarily.
> 

Then in this case, use either a condition ORing all possibilities,
or just use rcu_dereference_raw() because its just too complex ?

ptr = rcu_dereference_check(xxx->ptr,
			    rcu_read_lock_held() ||
			    lockdep_is_held(&some_global_lock) ||
			    lockdep_is_held(&xxx->lock));

OR

/* This is too complex to check what protects us at this point */
ptr = rcu_dereference_raw(xxx->ptr);




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-29 20:26       ` Eric Dumazet
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-29 20:26 UTC (permalink / raw)
  To: David Howells; +Cc: paulmck, Trond.Myklebust, linux-nfs, linux-kernel

Le lundi 29 mars 2010 =C3=A0 21:15 +0100, David Howells a =C3=A9crit :
> My objection to using rcu_dereference_check() here is that it's a dyn=
amic
> check: the compiler emits code to do it, since the lock/unlock status=
 of what
> the pointer points to cannot be determined easily at compiler time - =
and then
> the barrier is interpolated anyway unnecessarily.
>=20

Then in this case, use either a condition ORing all possibilities,
or just use rcu_dereference_raw() because its just too complex ?

ptr =3D rcu_dereference_check(xxx->ptr,
			    rcu_read_lock_held() ||
			    lockdep_is_held(&some_global_lock) ||
			    lockdep_is_held(&xxx->lock));

OR

/* This is too complex to check what protects us at this point */
ptr =3D rcu_dereference_raw(xxx->ptr);




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 20:15   ` David Howells
  2010-03-29 20:26       ` Eric Dumazet
@ 2010-03-29 21:05     ` Paul E. McKenney
  2010-03-29 22:22     ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-29 21:05 UTC (permalink / raw)
  To: David Howells; +Cc: Trond.Myklebust, linux-nfs, linux-kernel

On Mon, Mar 29, 2010 at 09:15:06PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > > 	if (rcu_dereference_check(nfsi->delegation,
> > > > 				  lockdep_is_held(&clp->cl_lock)) != NULL) {
> > > 
> > > If clp->cl_lock protects this pointer, why the need for
> > > rcu_dereference_check() at all?  The check is redundant since the line
> > > above gets the very lock we're checking for.
> > 
> > Because Arnd Bergmann is working on a set of patches that makes sparse
> > complain if you access an RCU-protected pointer directly, without using
> > some flavor of rcu_dereference().
> > 
> > So your approach would work for the moment, but would need another
> > change, probably in the 2.6.35 timeframe.
> 
> My objection to using rcu_dereference_check() here is that it's a dynamic
> check: the compiler emits code to do it, since the lock/unlock status of what
> the pointer points to cannot be determined easily at compiler time - and then
> the barrier is interpolated anyway unnecessarily.

But for !CONFIG_PROVE_RCU, rcu_dereference_check() is compiled out:

	#define rcu_dereference_check(p, c)     rcu_dereference_raw(p)

And rcu_dereference_raw() is the same as the old rcu_dereference().

So this should not be a problem, given that CONFIG_PROVE_RCU should not
be used for production systems.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 20:15   ` David Howells
  2010-03-29 20:26       ` Eric Dumazet
  2010-03-29 21:05     ` Paul E. McKenney
@ 2010-03-29 22:22     ` David Howells
  2010-03-29 22:36       ` Paul E. McKenney
  2010-03-29 22:59       ` David Howells
  2 siblings, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-29 22:22 UTC (permalink / raw)
  To: paulmck, Eric Dumazet; +Cc: dhowells, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> And rcu_dereference_raw() is the same as the old rcu_dereference().

Exactly.  That's the other half of the issue - that interpolates a redundant
memory barrier.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 22:22     ` David Howells
@ 2010-03-29 22:36       ` Paul E. McKenney
  2010-03-29 22:59       ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-29 22:36 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Mon, Mar 29, 2010 at 11:22:30PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > And rcu_dereference_raw() is the same as the old rcu_dereference().
> 
> Exactly.  That's the other half of the issue - that interpolates a redundant
> memory barrier.

Only on Alpha.  Otherwise only a volatile access.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 22:22     ` David Howells
  2010-03-29 22:36       ` Paul E. McKenney
@ 2010-03-29 22:59       ` David Howells
  2010-03-29 23:26         ` Paul E. McKenney
  2010-03-30 16:37         ` [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-29 22:59 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> Only on Alpha.  Otherwise only a volatile access.

Whilst that is true, it's the principle of the thing.  The extra barrier
shouldn't be emitted on Alpha.  If Alpha's no longer important, then can we
scrap smp_read_barrier_depends()?

My point is that some of these rcu_dereference*()'s are unnecessary.  If
there're required for correctness tracking purposes, fine; but can we have a
macro that is just a dummy for the purpose of stripping the pointer Sparse
annotation?  One that doesn't invoke rcu_dereference_raw() and interpolate a
barrier, pretend or otherwise, when there's no second reference to order
against.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 22:59       ` David Howells
@ 2010-03-29 23:26         ` Paul E. McKenney
  2010-03-30 15:40           ` Paul E. McKenney
  2010-03-30 16:39           ` David Howells
  2010-03-30 16:37         ` [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-29 23:26 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Mon, Mar 29, 2010 at 11:59:03PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Only on Alpha.  Otherwise only a volatile access.
> 
> Whilst that is true, it's the principle of the thing.  The extra barrier
> shouldn't be emitted on Alpha.  If Alpha's no longer important, then can we
> scrap smp_read_barrier_depends()?
> 
> My point is that some of these rcu_dereference*()'s are unnecessary.  If
> there're required for correctness tracking purposes, fine; but can we have a
> macro that is just a dummy for the purpose of stripping the pointer Sparse
> annotation?  One that doesn't invoke rcu_dereference_raw() and interpolate a
> barrier, pretend or otherwise, when there's no second reference to order
> against.

Interesting point.  Perhaps an rcu_dereference_update(p, c) for the
cases where the data structure cannot change.  Also, such a name makes
it more clear that this is an update-side access, and it further documents
the update-side lock.

Something like the following, then?  Untested, probably does not even
compile.

							Thanx, Paul

------------------------------------------------------------------------

rcu: Add update-side variant of rcu_dereference()

Upcoming consistency-checking features are requiring that even update-side
accesses to RCU-protected pointers use some variant of rcu_dereference().
Even though rcu_dereference() is quite lightweight, it does constrain the
compiler, thus producing code that is worse than required.  This patch
therefore adds rcu_dereference_update(), which allows lockdep-style
checks for holding the correct update-side lock, but which does not
constrain the compiler.

Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

 rcupdate.h |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 872a98e..0a6047f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -209,9 +209,26 @@ static inline int rcu_read_lock_sched_held(void)
 		rcu_dereference_raw(p); \
 	})
 
+/**
+ * rcu_dereference_update - rcu_dereference on structure that cannot change
+ *
+ * Do rcu_dereference() checking, but just pick up the pointer without
+ * the normal RCU read-side precautions.  These precautions are only
+ * needed if the data structure can change.  The caller is responsible
+ * for doing whatever is necessary (such as holding locks) to prevent
+ * such changes from occurring.
+ */
+#define rcu_dereference_update(p, c) \
+	({ \
+		if (debug_lockdep_rcu_enabled() && !(c)) \
+			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		(p); \
+	})
+
 #else /* #ifdef CONFIG_PROVE_RCU */
 
 #define rcu_dereference_check(p, c)	rcu_dereference_raw(p)
+#define rcu_dereference_update(p, c)	(p)
 
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 23:26         ` Paul E. McKenney
@ 2010-03-30 15:40           ` Paul E. McKenney
  2010-03-30 16:39           ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-30 15:40 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Mon, Mar 29, 2010 at 04:26:36PM -0700, Paul E. McKenney wrote:
> On Mon, Mar 29, 2010 at 11:59:03PM +0100, David Howells wrote:
> > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > Only on Alpha.  Otherwise only a volatile access.
> > 
> > Whilst that is true, it's the principle of the thing.  The extra barrier
> > shouldn't be emitted on Alpha.  If Alpha's no longer important, then can we
> > scrap smp_read_barrier_depends()?
> > 
> > My point is that some of these rcu_dereference*()'s are unnecessary.  If
> > there're required for correctness tracking purposes, fine; but can we have a
> > macro that is just a dummy for the purpose of stripping the pointer Sparse
> > annotation?  One that doesn't invoke rcu_dereference_raw() and interpolate a
> > barrier, pretend or otherwise, when there's no second reference to order
> > against.
> 
> Interesting point.  Perhaps an rcu_dereference_update(p, c) for the
> cases where the data structure cannot change.  Also, such a name makes
> it more clear that this is an update-side access, and it further documents
> the update-side lock.
> 
> Something like the following, then?  Untested, probably does not even
> compile.

Scrap this one -- Arnd has it covered, under the much better name
of rcu_dereference_const().

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> rcu: Add update-side variant of rcu_dereference()
> 
> Upcoming consistency-checking features are requiring that even update-side
> accesses to RCU-protected pointers use some variant of rcu_dereference().
> Even though rcu_dereference() is quite lightweight, it does constrain the
> compiler, thus producing code that is worse than required.  This patch
> therefore adds rcu_dereference_update(), which allows lockdep-style
> checks for holding the correct update-side lock, but which does not
> constrain the compiler.
> 
> Suggested-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
> 
>  rcupdate.h |   17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 872a98e..0a6047f 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -209,9 +209,26 @@ static inline int rcu_read_lock_sched_held(void)
>  		rcu_dereference_raw(p); \
>  	})
>  
> +/**
> + * rcu_dereference_update - rcu_dereference on structure that cannot change
> + *
> + * Do rcu_dereference() checking, but just pick up the pointer without
> + * the normal RCU read-side precautions.  These precautions are only
> + * needed if the data structure can change.  The caller is responsible
> + * for doing whatever is necessary (such as holding locks) to prevent
> + * such changes from occurring.
> + */
> +#define rcu_dereference_update(p, c) \
> +	({ \
> +		if (debug_lockdep_rcu_enabled() && !(c)) \
> +			lockdep_rcu_dereference(__FILE__, __LINE__); \
> +		(p); \
> +	})
> +
>  #else /* #ifdef CONFIG_PROVE_RCU */
>  
>  #define rcu_dereference_check(p, c)	rcu_dereference_raw(p)
> +#define rcu_dereference_update(p, c)	(p)
>  
>  #endif /* #else #ifdef CONFIG_PROVE_RCU */
>  

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 22:59       ` David Howells
  2010-03-29 23:26         ` Paul E. McKenney
@ 2010-03-30 16:37         ` David Howells
  2010-03-30 17:01           ` Paul E. McKenney
  1 sibling, 1 reply; 48+ messages in thread
From: David Howells @ 2010-03-30 16:37 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> rcu: Add update-side variant of rcu_dereference()
> 
> Upcoming consistency-checking features are requiring that even update-side
> accesses to RCU-protected pointers use some variant of rcu_dereference().
> Even though rcu_dereference() is quite lightweight, it does constrain the
> compiler, thus producing code that is worse than required.  This patch
> therefore adds rcu_dereference_update(), which allows lockdep-style
> checks for holding the correct update-side lock, but which does not
> constrain the compiler.

Ummm...  I'm not so keen on the name for two reasons.  Firstly, why shouldn't
the read side do:

	struct foo {
		struct bar *b;
	};

	void manage_bar(struct foo *f)
	{
		struct bar *b;

		rcu_read_lock();
		b = rcu_dereference(f->b);
		if (b)
			do_something_to_bar(b);
		rcu_read_unlock();
	}

	void manage_foo(struct foo *f)
	{
		...
		if (f->b)
			manage_bar(f);
		...
	}

Why should this be limited to the update side?


Secondly, the name rcu_dereference_update() seems to imply that this function
itself does an update, perhaps after having done an rcu_dereference().

Perhaps rcu_pointer_valid()?

	if (rcu_pointer_valid(f->b))
		manage_bar(f);

or if you really do want to limit this sort of thing to the update side:

	if (rcu_destination_for_update(f->b)) {
		spin_lock(&f->lock);
		update_bar(f);
		spin_unlock(&f->lock);
	}

Another possibility is have an 'RCU write lock' that just does the lockdep
thing and doesn't interpolate a barrier:

	rcu_write_lock();
	if (rcu_dereference_for_update(f->b)) {
		spin_lock(&f->lock);
		update_bar(f->b);
		spin_unlock(&f->lock);
	}
	rcu_write_unlock();

Or might it make sense to roll together with the lock primitive:

	if (rcu_dereference_and_lock(f->b, &f->lock)) {
		update_bar(f);
		spin_unlock(&f->lock);
	}

(I'm not keen on that one because you might not want to take the lock
 immediately, and you have a wide choice of locks).

Sorry to be picky.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-29 23:26         ` Paul E. McKenney
  2010-03-30 15:40           ` Paul E. McKenney
@ 2010-03-30 16:39           ` David Howells
  2010-03-30 16:49             ` Paul E. McKenney
  2010-03-30 23:51             ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-30 16:39 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> Scrap this one -- Arnd has it covered, under the much better name
> of rcu_dereference_const().

Not convinced of that name either.  That sounds like the RCU dereference of
constant (R/O) data.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 16:39           ` David Howells
@ 2010-03-30 16:49             ` Paul E. McKenney
  2010-03-30 17:04                 ` Eric Dumazet
  2010-03-30 23:51             ` David Howells
  1 sibling, 1 reply; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-30 16:49 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Mar 30, 2010 at 05:39:11PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Scrap this one -- Arnd has it covered, under the much better name
> > of rcu_dereference_const().
> 
> Not convinced of that name either.  That sounds like the RCU dereference of
> constant (R/O) data.

Which it is, as long as the lock is held.

But what name would you suggest?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 16:37         ` [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
@ 2010-03-30 17:01           ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-30 17:01 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Mar 30, 2010 at 05:37:49PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > rcu: Add update-side variant of rcu_dereference()
> > 
> > Upcoming consistency-checking features are requiring that even update-side
> > accesses to RCU-protected pointers use some variant of rcu_dereference().
> > Even though rcu_dereference() is quite lightweight, it does constrain the
> > compiler, thus producing code that is worse than required.  This patch
> > therefore adds rcu_dereference_update(), which allows lockdep-style
> > checks for holding the correct update-side lock, but which does not
> > constrain the compiler.
> 
> Ummm...  I'm not so keen on the name for two reasons.  Firstly, why shouldn't
> the read side do:
> 
> 	struct foo {
> 		struct bar *b;
> 	};
> 
> 	void manage_bar(struct foo *f)
> 	{
> 		struct bar *b;
> 
> 		rcu_read_lock();
> 		b = rcu_dereference(f->b);
> 		if (b)
> 			do_something_to_bar(b);
> 		rcu_read_unlock();
> 	}
> 
> 	void manage_foo(struct foo *f)
> 	{
> 		...
> 		if (f->b)
> 			manage_bar(f);
> 		...
> 	}
> 
> Why should this be limited to the update side?

The main criterion is that you have to have done something to prevent
the data structure from changing, which normally only happens on the
update side.

> Secondly, the name rcu_dereference_update() seems to imply that this function
> itself does an update, perhaps after having done an rcu_dereference().

Indeed, that is rather ugly.

> Perhaps rcu_pointer_valid()?
> 
> 	if (rcu_pointer_valid(f->b))
> 		manage_bar(f);

Hmmm...  That sounds like something that checks the pointer for
correct format or for pointing to valid data.

> or if you really do want to limit this sort of thing to the update side:
> 
> 	if (rcu_destination_for_update(f->b)) {
> 		spin_lock(&f->lock);
> 		update_bar(f);
> 		spin_unlock(&f->lock);
> 	}

What we are doing is dereferencing an RCU-protected pointer that we
are somehow preventing from being concurrently updated, usually by
holding the update-side lock.

So maybe rcu_dereference_no_updates()?  Or rcu_dereference_locked()?

But not rcu_dereference_with_concurrent_updates_somehow_prevented().  ;-)

> Another possibility is have an 'RCU write lock' that just does the lockdep
> thing and doesn't interpolate a barrier:
> 
> 	rcu_write_lock();
> 	if (rcu_dereference_for_update(f->b)) {
> 		spin_lock(&f->lock);
> 		update_bar(f->b);
> 		spin_unlock(&f->lock);
> 	}
> 	rcu_write_unlock();

This would work for the moment, but I do not believe that sparse can
deal with tracking rcu_write_lock() across compilation units.  Besides, I
really really don't want anything that looks like it is somehow protecting
the update.  Too many newcomers to RCU somehow get the impression that
rcu_assign_pointer() provides that protection as it is.  :-/

> Or might it make sense to roll together with the lock primitive:
> 
> 	if (rcu_dereference_and_lock(f->b, &f->lock)) {
> 		update_bar(f);
> 		spin_unlock(&f->lock);
> 	}
> 
> (I'm not keen on that one because you might not want to take the lock
>  immediately, and you have a wide choice of locks).

Indeed, the API explosion might be a bit unpleasant.  ;-)

So, how about rcu_dereference_locked()?

> Sorry to be picky.

Given that I worked quite happily with RCU for quite some time without
any words for any of the underlying concepts, I cannot claim that I
don't need at least some help with name selection.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 16:49             ` Paul E. McKenney
@ 2010-03-30 17:04                 ` Eric Dumazet
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-30 17:04 UTC (permalink / raw)
  To: paulmck; +Cc: David Howells, Trond.Myklebust, linux-nfs, linux-kernel

Le mardi 30 mars 2010 à 09:49 -0700, Paul E. McKenney a écrit :
> On Tue, Mar 30, 2010 at 05:39:11PM +0100, David Howells wrote:
> > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > Scrap this one -- Arnd has it covered, under the much better name
> > > of rcu_dereference_const().
> > 
> > Not convinced of that name either.  That sounds like the RCU dereference of
> > constant (R/O) data.
> 
> Which it is, as long as the lock is held.
> 
> But what name would you suggest?
> 

Maybe use 'protected' word or something like that, or 'owned', ...

rcu_dereference_protected() or rcu_dereference_owned()




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-30 17:04                 ` Eric Dumazet
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-30 17:04 UTC (permalink / raw)
  To: paulmck; +Cc: David Howells, Trond.Myklebust, linux-nfs, linux-kernel

Le mardi 30 mars 2010 =C3=A0 09:49 -0700, Paul E. McKenney a =C3=A9crit=
 :
> On Tue, Mar 30, 2010 at 05:39:11PM +0100, David Howells wrote:
> > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> >=20
> > > Scrap this one -- Arnd has it covered, under the much better name
> > > of rcu_dereference_const().
> >=20
> > Not convinced of that name either.  That sounds like the RCU derefe=
rence of
> > constant (R/O) data.
>=20
> Which it is, as long as the lock is held.
>=20
> But what name would you suggest?
>=20

Maybe use 'protected' word or something like that, or 'owned', ...

rcu_dereference_protected() or rcu_dereference_owned()




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 17:04                 ` Eric Dumazet
@ 2010-03-30 17:25                   ` Paul E. McKenney
  -1 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-30 17:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Howells, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Mar 30, 2010 at 07:04:31PM +0200, Eric Dumazet wrote:
> Le mardi 30 mars 2010 à 09:49 -0700, Paul E. McKenney a écrit :
> > On Tue, Mar 30, 2010 at 05:39:11PM +0100, David Howells wrote:
> > > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > Scrap this one -- Arnd has it covered, under the much better name
> > > > of rcu_dereference_const().
> > > 
> > > Not convinced of that name either.  That sounds like the RCU dereference of
> > > constant (R/O) data.
> > 
> > Which it is, as long as the lock is held.
> > 
> > But what name would you suggest?
> > 
> 
> Maybe use 'protected' word or something like that, or 'owned', ...
> 
> rcu_dereference_protected() or rcu_dereference_owned()

I do like rcu_dereference_protected() -- a bit longer than
rcu_dereference_locked(), but covers the initialization and cleanup
accesses that might be protected by privatization rather than by locking.

Other thoughts?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-30 17:25                   ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-30 17:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Howells, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Mar 30, 2010 at 07:04:31PM +0200, Eric Dumazet wrote:
> Le mardi 30 mars 2010 =E0 09:49 -0700, Paul E. McKenney a =E9crit :
> > On Tue, Mar 30, 2010 at 05:39:11PM +0100, David Howells wrote:
> > > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > >=20
> > > > Scrap this one -- Arnd has it covered, under the much better na=
me
> > > > of rcu_dereference_const().
> > >=20
> > > Not convinced of that name either.  That sounds like the RCU dere=
ference of
> > > constant (R/O) data.
> >=20
> > Which it is, as long as the lock is held.
> >=20
> > But what name would you suggest?
> >=20
>=20
> Maybe use 'protected' word or something like that, or 'owned', ...
>=20
> rcu_dereference_protected() or rcu_dereference_owned()

I do like rcu_dereference_protected() -- a bit longer than
rcu_dereference_locked(), but covers the initialization and cleanup
accesses that might be protected by privatization rather than by lockin=
g.

Other thoughts?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 16:39           ` David Howells
  2010-03-30 16:49             ` Paul E. McKenney
@ 2010-03-30 23:51             ` David Howells
  2010-03-31  0:08               ` Paul E. McKenney
  2010-03-31 14:04               ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-30 23:51 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> Which it is, as long as the lock is held.

However, in one of the situations I'm thinking of, no lock is held.  All that
is being tested is whether the pointer to some RCU-protected data is either
NULL or non-NULL.  For example:

	@@ -345,7 +346,7 @@ int nfs_inode_return_delegation(struct inode *inode)
		struct nfs_delegation *delegation;
		int err = 0;

	-	if (rcu_dereference(nfsi->delegation) != NULL) {
	+	if (nfsi->delegation != NULL) {
			spin_lock(&clp->cl_lock);
			delegation = nfs_detach_delegation_locked(nfsi, NULL);
			spin_unlock(&clp->cl_lock);

No lock - RCU or spinlock - is held over the check of nfsi->delegation - which
causes lockdep to complain about an unguarded rcu_dereference().

However, the use of rcu_dereference() here is unnecessary with respect to the
interpolation (where appropriate) of a memory barrier because there is no
second memory access against which to order.

That said, the memory access is repeated inside nfs_detach_delegation_locked(),
which again was wrapped in an rcu_dereference():

	 static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
	 {
	-	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
	+	struct nfs_delegation *delegation = nfsi->delegation;

		if (delegation == NULL)
			goto nomatch;

which was not necessary for its memory barrier interpolation properties in this
case because of the spin_lock() the caller now holds.


Your suggestion of using rcu_dereference_check() in both these places would
result in two unnecessary memory barriers on something like an Alpha CPU.


How about:

	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
	{
		struct nfs_delegation *delegation =
			rcu_locked_dereference(nfsi->delegation);
		...
	}

where rcu_locked_dereference() only does the lockdep magic and the dereference,
and does not include a memory barrier.  The documentation of such a function
would note this may only be used when the pointer is guarded by an exclusive
lock to prevent it from changing.

And then:

	int nfs_inode_return_delegation(struct inode *inode)
	{
		struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
		struct nfs_inode *nfsi = NFS_I(inode);
		struct nfs_delegation *delegation;
		int err = 0;

		if (rcu_pointer_not_null(nfsi->delegation)) {
			spin_lock(&clp->cl_lock);
			delegation = nfs_detach_delegation_locked(nfsi, NULL);
			spin_unlock(&clp->cl_lock);
			if (delegation != NULL) {
				nfs_msync_inode(inode);
				err = __nfs_inode_return_delegation(inode, delegation, 1);
			}
		}
		return err;
	}

where rcu_pointer_not_null() simply tests the value of the pointer, casting
away the sparse RCU annotation and not doing the lockdep check and not
including a barrier.  It would not return the value of the pointer, thus
preventing you from needing the barrier as a result.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 23:51             ` David Howells
@ 2010-03-31  0:08               ` Paul E. McKenney
  2010-03-31 14:04               ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-31  0:08 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Mar 31, 2010 at 12:51:04AM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Which it is, as long as the lock is held.
> 
> However, in one of the situations I'm thinking of, no lock is held.  All that
> is being tested is whether the pointer to some RCU-protected data is either
> NULL or non-NULL.  For example:
> 
> 	@@ -345,7 +346,7 @@ int nfs_inode_return_delegation(struct inode *inode)
> 		struct nfs_delegation *delegation;
> 		int err = 0;
> 
> 	-	if (rcu_dereference(nfsi->delegation) != NULL) {
> 	+	if (nfsi->delegation != NULL) {
> 			spin_lock(&clp->cl_lock);
> 			delegation = nfs_detach_delegation_locked(nfsi, NULL);
> 			spin_unlock(&clp->cl_lock);
> 
> No lock - RCU or spinlock - is held over the check of nfsi->delegation - which
> causes lockdep to complain about an unguarded rcu_dereference().
> 
> However, the use of rcu_dereference() here is unnecessary with respect to the
> interpolation (where appropriate) of a memory barrier because there is no
> second memory access against which to order.
> 
> That said, the memory access is repeated inside nfs_detach_delegation_locked(),
> which again was wrapped in an rcu_dereference():
> 
> 	 static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
> 	 {
> 	-	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> 	+	struct nfs_delegation *delegation = nfsi->delegation;
> 
> 		if (delegation == NULL)
> 			goto nomatch;
> 
> which was not necessary for its memory barrier interpolation properties in this
> case because of the spin_lock() the caller now holds.
> 
> 
> Your suggestion of using rcu_dereference_check() in both these places would
> result in two unnecessary memory barriers on something like an Alpha CPU.
> 
> 
> How about:
> 
> 	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
> 	{
> 		struct nfs_delegation *delegation =
> 			rcu_locked_dereference(nfsi->delegation);
> 		...
> 	}
> 
> where rcu_locked_dereference() only does the lockdep magic and the dereference,
> and does not include a memory barrier.  The documentation of such a function
> would note this may only be used when the pointer is guarded by an exclusive
> lock to prevent it from changing.
> 
> And then:
> 
> 	int nfs_inode_return_delegation(struct inode *inode)
> 	{
> 		struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
> 		struct nfs_inode *nfsi = NFS_I(inode);
> 		struct nfs_delegation *delegation;
> 		int err = 0;
> 
> 		if (rcu_pointer_not_null(nfsi->delegation)) {
> 			spin_lock(&clp->cl_lock);
> 			delegation = nfs_detach_delegation_locked(nfsi, NULL);
> 			spin_unlock(&clp->cl_lock);
> 			if (delegation != NULL) {
> 				nfs_msync_inode(inode);
> 				err = __nfs_inode_return_delegation(inode, delegation, 1);
> 			}
> 		}
> 		return err;
> 	}
> 
> where rcu_pointer_not_null() simply tests the value of the pointer, casting
> away the sparse RCU annotation and not doing the lockdep check and not
> including a barrier.  It would not return the value of the pointer, thus
> preventing you from needing the barrier as a result.

How about Eric's suggestion of rcu_dereference_protected()?  That name
doesn't imply a lock, which as you say above, isn't always needed to
keep the structure from changing.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-30 23:51             ` David Howells
  2010-03-31  0:08               ` Paul E. McKenney
@ 2010-03-31 14:04               ` David Howells
  2010-03-31 15:16                 ` Paul E. McKenney
  2010-03-31 17:37                 ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-31 14:04 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> How about Eric's suggestion of rcu_dereference_protected()?  That name
> doesn't imply a lock, which as you say above, isn't always needed to
> keep the structure from changing.

But 'protected' from what or by what?

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 14:04               ` David Howells
@ 2010-03-31 15:16                 ` Paul E. McKenney
  2010-03-31 17:37                 ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-31 15:16 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Mar 31, 2010 at 03:04:33PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > How about Eric's suggestion of rcu_dereference_protected()?  That name
> > doesn't imply a lock, which as you say above, isn't always needed to
> > keep the structure from changing.
> 
> But 'protected' from what or by what?

Protected by something that the caller did, be it holding the the correct
lock, operating on it during initialization before other CPUs have access
to it, operating on it during cleanup after other CPUs' access has been
revoked, or whatever.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 14:04               ` David Howells
  2010-03-31 15:16                 ` Paul E. McKenney
@ 2010-03-31 17:37                 ` David Howells
  2010-03-31 18:30                   ` Paul E. McKenney
                                     ` (2 more replies)
  1 sibling, 3 replies; 48+ messages in thread
From: David Howells @ 2010-03-31 17:37 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> Protected by something that the caller did, be it holding the the correct
> lock, operating on it during initialization before other CPUs have access
> to it, operating on it during cleanup after other CPUs' access has been
> revoked, or whatever.

But the point I made very early this morning still stands:  What if someone
simply wants to test the pointer, not actually to dereference it?

NFS was using rcu_dereference() for this in a couple of places - which is
overkill.  I suggested stripping this off and you countered with the
suggestion that it should be using rcu_dereference_check().

Why do I need anything at all?

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 17:37                 ` David Howells
@ 2010-03-31 18:30                   ` Paul E. McKenney
  2010-03-31 18:32                     ` Eric Dumazet
  2010-03-31 22:53                   ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-03-31 18:30 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Mar 31, 2010 at 06:37:34PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Protected by something that the caller did, be it holding the the correct
> > lock, operating on it during initialization before other CPUs have access
> > to it, operating on it during cleanup after other CPUs' access has been
> > revoked, or whatever.
> 
> But the point I made very early this morning still stands:  What if someone
> simply wants to test the pointer, not actually to dereference it?

OK, I was missing your point.  And this is what you were proposing
rcu_pointer_not_null() for.

> NFS was using rcu_dereference() for this in a couple of places - which is
> overkill.  I suggested stripping this off and you countered with the
> suggestion that it should be using rcu_dereference_check().
> 
> Why do I need anything at all?

Right now you don't.  We will need something as part of Arnd's patches to
shut sparse up, and I was hoping not to have to go through the find/fix
cycle twice.  But that turned out to be kind of pointless, since we
have burned far more time discussing than it would have taken to fix
it twice.  ;-)

But the discussion has been helpful, as I was previously just fine with
sprinkling otherwise-unneed rcu_dereference_whatever() calls thoughout
the kernel!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 17:37                 ` David Howells
@ 2010-03-31 18:32                     ` Eric Dumazet
  2010-03-31 18:32                     ` Eric Dumazet
  2010-03-31 22:53                   ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-31 18:32 UTC (permalink / raw)
  To: David Howells; +Cc: paulmck, Trond.Myklebust, linux-nfs, linux-kernel

Le mercredi 31 mars 2010 à 18:37 +0100, David Howells a écrit :
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Protected by something that the caller did, be it holding the the correct
> > lock, operating on it during initialization before other CPUs have access
> > to it, operating on it during cleanup after other CPUs' access has been
> > revoked, or whatever.
> 
> But the point I made very early this morning still stands:  What if someone
> simply wants to test the pointer, not actually to dereference it?
> 
> NFS was using rcu_dereference() for this in a couple of places - which is
> overkill.  I suggested stripping this off and you countered with the
> suggestion that it should be using rcu_dereference_check().
> 

If pointer has the rcu mark, and somehing access this pointer without
proper locking, then automatic checkers (sparse...) will trigger a
warning, this is what Paul said.

Example of such checks, 

# define __percpu   __attribute__((noderef, address_space(3))) 

If someone tries to manipulate a __percpu marked ptr without proper API,
sparse loudly complains.


> Why do I need anything at all?
> 

If you dont own a lock, and test a pointer, what guarantee do you have
this pointer doesnt change right after you tested it ?

If *something* protects the pointer from being changed, then how can be
expressed this fact ?

If nothing protects the pointer, why test it then, as result of test is
unreliable ?

If NFS was using rcu_dereference(), it probably was for a reason, but if
nobody can recall it, it was a wrong reason ?

Sorry, too many questions and no answer I guess...




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
@ 2010-03-31 18:32                     ` Eric Dumazet
  0 siblings, 0 replies; 48+ messages in thread
From: Eric Dumazet @ 2010-03-31 18:32 UTC (permalink / raw)
  To: David Howells; +Cc: paulmck, Trond.Myklebust, linux-nfs, linux-kernel

Le mercredi 31 mars 2010 =C3=A0 18:37 +0100, David Howells a =C3=A9crit=
 :
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>=20
> > Protected by something that the caller did, be it holding the the c=
orrect
> > lock, operating on it during initialization before other CPUs have =
access
> > to it, operating on it during cleanup after other CPUs' access has =
been
> > revoked, or whatever.
>=20
> But the point I made very early this morning still stands:  What if s=
omeone
> simply wants to test the pointer, not actually to dereference it?
>=20
> NFS was using rcu_dereference() for this in a couple of places - whic=
h is
> overkill.  I suggested stripping this off and you countered with the
> suggestion that it should be using rcu_dereference_check().
>=20

If pointer has the rcu mark, and somehing access this pointer without
proper locking, then automatic checkers (sparse...) will trigger a
warning, this is what Paul said.

Example of such checks,=20

# define __percpu   __attribute__((noderef, address_space(3)))=20

If someone tries to manipulate a __percpu marked ptr without proper API=
,
sparse loudly complains.


> Why do I need anything at all?
>=20

If you dont own a lock, and test a pointer, what guarantee do you have
this pointer doesnt change right after you tested it ?

If *something* protects the pointer from being changed, then how can be
expressed this fact ?

If nothing protects the pointer, why test it then, as result of test is
unreliable ?

If NFS was using rcu_dereference(), it probably was for a reason, but i=
f
nobody can recall it, it was a wrong reason ?

Sorry, too many questions and no answer I guess...




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 17:37                 ` David Howells
  2010-03-31 18:30                   ` Paul E. McKenney
  2010-03-31 18:32                     ` Eric Dumazet
@ 2010-03-31 22:53                   ` David Howells
  2010-04-01  1:29                     ` Paul E. McKenney
  2010-04-01 11:45                     ` David Howells
  2 siblings, 2 replies; 48+ messages in thread
From: David Howells @ 2010-03-31 22:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: dhowells, paulmck, Trond.Myklebust, linux-nfs, linux-kernel

Eric Dumazet <eric.dumazet@gmail.com> wrote:

> If you dont own a lock, and test a pointer, what guarantee do you have
> this pointer doesnt change right after you tested it ?

There are five possibilities:

 (1) A pointer points to something when you check, and still points to the
     same thing after you've gained the lock.

 (2) A pointer points to something when you check, and points to something
     else after you've gained the lock.

 (3) A pointer points to something when you check, and is NULL after you've
     gained the lock.

 (4) A pointer points to NULL when you check, and points to something after
     you've gained the lock.

 (5) A pointer points to NULL when you check, and points to NULL after you've
     gained the lock.

However, what if you _know_ that the pointer can only ever be made non-NULL
during initialisation, and may even be left unset?  That means possibility (4)
can never happen, and that possibility (5) can be detected by testing before
taking the lock.  Now, what if (5) is a common occurrence?  It might make
sense to make the test.

And what matter if the pointer _does_ change after you test it.  If it was
NULL before, it can only be NULL now - by the semantics defined for that
particular pointer.

> If *something* protects the pointer from being changed, then how can be
> expressed this fact ?
> 
> If nothing protects the pointer, why test it then, as result of test is
> unreliable ?

I think you may be misunderstanding the purpose of rcu_dereference().  It is
to make sure the reading and dereferencing of the pointer are correctly
ordered with respect to the setting up of the pointed to record and the
changing of the pointer.

There must be two memory accesses for the barrier implied to be of use.  In
nfs_inode_return_delegation() there aren't two memory accesses to order,
therefore the barrier is pointless.

> If NFS was using rcu_dereference(), it probably was for a reason, but if
> nobody can recall it, it was a wrong reason ?

I think it is incorrectly used.  Given that the rcu_dereference() in:

	if (rcu_dereference(nfsi->delegation) != NULL) {
		spin_lock(&clp->cl_lock);
		delegation = nfs_detach_delegation_locked(nfsi, NULL);
		spin_unlock(&clp->cl_lock);
		if (delegation != NULL)
			nfs_do_return_delegation(inode, delegation, 0);
	}

resolves to:

	_________p1 = nfsi->delegation;
	smp_read_barrier_depends();
	if (_________p1) {
		spin_lock(&clp->cl_lock); // implicit LOCK-class barrier
		==>nfs_detach_delegation_locked(nfsi, NULL);
		  [dereference nfsi->delegation]
		...
	}

do you actually need the smp_read_barrier_depends()?  You _have_ a barrier in
the form of the spin_lock().  In fact, the spin_lock() is avowedly sufficient
to protect accesses to and dereferences of nfsi->delegation, which means that:

	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
	{
		struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
		...
	}

has no need of the internal barrier provided by rcu_dereference() either.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 22:53                   ` David Howells
@ 2010-04-01  1:29                     ` Paul E. McKenney
  2010-04-01 11:45                     ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-01  1:29 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Mar 31, 2010 at 11:53:28PM +0100, David Howells wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > If you dont own a lock, and test a pointer, what guarantee do you have
> > this pointer doesnt change right after you tested it ?
> 
> There are five possibilities:
> 
>  (1) A pointer points to something when you check, and still points to the
>      same thing after you've gained the lock.
> 
>  (2) A pointer points to something when you check, and points to something
>      else after you've gained the lock.
> 
>  (3) A pointer points to something when you check, and is NULL after you've
>      gained the lock.
> 
>  (4) A pointer points to NULL when you check, and points to something after
>      you've gained the lock.
> 
>  (5) A pointer points to NULL when you check, and points to NULL after you've
>      gained the lock.
> 
> However, what if you _know_ that the pointer can only ever be made non-NULL
> during initialisation, and may even be left unset?  That means possibility (4)
> can never happen, and that possibility (5) can be detected by testing before
> taking the lock.  Now, what if (5) is a common occurrence?  It might make
> sense to make the test.
> 
> And what matter if the pointer _does_ change after you test it.  If it was
> NULL before, it can only be NULL now - by the semantics defined for that
> particular pointer.
> 
> > If *something* protects the pointer from being changed, then how can be
> > expressed this fact ?
> > 
> > If nothing protects the pointer, why test it then, as result of test is
> > unreliable ?
> 
> I think you may be misunderstanding the purpose of rcu_dereference().  It is
> to make sure the reading and dereferencing of the pointer are correctly
> ordered with respect to the setting up of the pointed to record and the
> changing of the pointer.
> 
> There must be two memory accesses for the barrier implied to be of use.  In
> nfs_inode_return_delegation() there aren't two memory accesses to order,
> therefore the barrier is pointless.
> 
> > If NFS was using rcu_dereference(), it probably was for a reason, but if
> > nobody can recall it, it was a wrong reason ?
> 
> I think it is incorrectly used.  Given that the rcu_dereference() in:
> 
> 	if (rcu_dereference(nfsi->delegation) != NULL) {
> 		spin_lock(&clp->cl_lock);
> 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> 		spin_unlock(&clp->cl_lock);
> 		if (delegation != NULL)
> 			nfs_do_return_delegation(inode, delegation, 0);
> 	}

And nfs_detach_delegation_locked() rechecks nfsi->delegation() under
the lock, so this is a legitimate use.

The pointer is not held constant, but any changes will be accounted
for and handled correctly.  So I would argue that the pointer value is
in fact protected by the recheck-under-lock algorithm used here.

							Thanx, Paul

> resolves to:
> 
> 	_________p1 = nfsi->delegation;
> 	smp_read_barrier_depends();
> 	if (_________p1) {
> 		spin_lock(&clp->cl_lock); // implicit LOCK-class barrier
> 		==>nfs_detach_delegation_locked(nfsi, NULL);
> 		  [dereference nfsi->delegation]
> 		...
> 	}
> 
> do you actually need the smp_read_barrier_depends()?  You _have_ a barrier in
> the form of the spin_lock().  In fact, the spin_lock() is avowedly sufficient
> to protect accesses to and dereferences of nfsi->delegation, which means that:
> 
> 	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
> 	{
> 		struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> 		...
> 	}
> 
> has no need of the internal barrier provided by rcu_dereference() either.
> 
> David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-03-31 22:53                   ` David Howells
  2010-04-01  1:29                     ` Paul E. McKenney
@ 2010-04-01 11:45                     ` David Howells
  2010-04-01 14:39                       ` Paul E. McKenney
  2010-04-01 14:46                       ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-04-01 11:45 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > I think it is incorrectly used.  Given that the rcu_dereference() in:
> > 
> > 	if (rcu_dereference(nfsi->delegation) != NULL) {
> > 		spin_lock(&clp->cl_lock);
> > 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> > 		spin_unlock(&clp->cl_lock);
> > 		if (delegation != NULL)
> > 			nfs_do_return_delegation(inode, delegation, 0);
> > 	}
> 
> And nfs_detach_delegation_locked() rechecks nfsi->delegation() under
> the lock, so this is a legitimate use.
> 
> The pointer is not held constant, but any changes will be accounted
> for and handled correctly.  So I would argue that the pointer value is
> in fact protected by the recheck-under-lock algorithm used here.

A legitimate use of what?

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-01 11:45                     ` David Howells
@ 2010-04-01 14:39                       ` Paul E. McKenney
  2010-04-01 14:46                       ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-01 14:39 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Thu, Apr 01, 2010 at 12:45:14PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > I think it is incorrectly used.  Given that the rcu_dereference() in:
> > > 
> > > 	if (rcu_dereference(nfsi->delegation) != NULL) {
> > > 		spin_lock(&clp->cl_lock);
> > > 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> > > 		spin_unlock(&clp->cl_lock);
> > > 		if (delegation != NULL)
> > > 			nfs_do_return_delegation(inode, delegation, 0);
> > > 	}
> > 
> > And nfs_detach_delegation_locked() rechecks nfsi->delegation() under
> > the lock, so this is a legitimate use.
> > 
> > The pointer is not held constant, but any changes will be accounted
> > for and handled correctly.  So I would argue that the pointer value is
> > in fact protected by the recheck-under-lock algorithm used here.
> 
> A legitimate use of what?

A legitimate use of loading an RCU-protected pointer without
smp_read_barrier_depends().  However, I could imagine some situations
where the ACCESS_ONCE() semantics were required -- though in this
particular situation, I am having a hard time seeing how the compiler
could mess us up.  That said, my time on the C++ standards committee
has given me new respect for the perversity of compiler writers.

So you have objected to needless memory barriers.  How do you feel
about possibly needless ACCESS_ONCE() calls?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-01 11:45                     ` David Howells
  2010-04-01 14:39                       ` Paul E. McKenney
@ 2010-04-01 14:46                       ` David Howells
  2010-04-05 17:57                         ` Paul E. McKenney
                                           ` (2 more replies)
  1 sibling, 3 replies; 48+ messages in thread
From: David Howells @ 2010-04-01 14:46 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> So you have objected to needless memory barriers.  How do you feel
> about possibly needless ACCESS_ONCE() calls?

That would work here since it shouldn't emit any excess instructions.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-01 14:46                       ` David Howells
@ 2010-04-05 17:57                         ` Paul E. McKenney
  2010-04-06  9:30                         ` David Howells
  2010-04-06 16:14                         ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-05 17:57 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Thu, Apr 01, 2010 at 03:46:51PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > So you have objected to needless memory barriers.  How do you feel
> > about possibly needless ACCESS_ONCE() calls?
> 
> That would work here since it shouldn't emit any excess instructions.

And here is the corresponding patch.  Seem reasonable?

							Thanx, Paul

------------------------------------------------------------------------

 rcupdate.h |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

commit 61ad1405cd442fe54f87ff97febf52610817903e
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Apr 5 10:52:53 2010 -0700

    rcu: add rcu_dereference_protect to avoid smp_read_barrier_depends overhead
    
    This patch adds a variant of rcu_dereference() that handles situations
    where the RCU-protected data structure cannot change, perhaps due to
    our holding the update-side lock, or where the RCU-protected pointer is
    only to be tested, not dereferenced.
    
    Suggested-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 872a98e..123b834 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -209,9 +209,28 @@ static inline int rcu_read_lock_sched_held(void)
 		rcu_dereference_raw(p); \
 	})
 
+/**
+ * rcu_dereference_protected - fetch RCU pointer when protected by algorithm
+ *
+ * Return the value of the specified RCU-protected pointer, but omit
+ * the smp_read_barrier_depends() and keep the ACCESS_ONCE().  This
+ * is useful in cases where update-side locks prevent the value of the
+ * pointer from changing, and is also useful in cases where the value
+ * of this pointer is accessed, but the pointer is not dereferenced.
+ * An example of this latter occurs when testing an RCU-protected
+ * pointer against NULL.
+ */
+#define rcu_dereference_protected(p, c) \
+	({ \
+		if (debug_lockdep_rcu_enabled() && !(c)) \
+			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		ACCESS_ONCE(p); \
+	})
+
 #else /* #ifdef CONFIG_PROVE_RCU */
 
 #define rcu_dereference_check(p, c)	rcu_dereference_raw(p)
+#define rcu_dereference_protected(p, c)	ACCESS_ONCE(p)
 
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-01 14:46                       ` David Howells
  2010-04-05 17:57                         ` Paul E. McKenney
@ 2010-04-06  9:30                         ` David Howells
  2010-04-06 16:14                         ` David Howells
  2 siblings, 0 replies; 48+ messages in thread
From: David Howells @ 2010-04-06  9:30 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> And here is the corresponding patch.  Seem reasonable?

Okay.  Still not keen on the name, but it'll do.

I think the problem is that rcu_dereference() is itself a misnomer: It doesn't
actually dereference.

Anyway:

Acked-by: David Howells <dhowells@redhat.com>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-01 14:46                       ` David Howells
  2010-04-05 17:57                         ` Paul E. McKenney
  2010-04-06  9:30                         ` David Howells
@ 2010-04-06 16:14                         ` David Howells
  2010-04-06 17:29                           ` Paul E. McKenney
  2010-04-06 19:34                           ` David Howells
  2 siblings, 2 replies; 48+ messages in thread
From: David Howells @ 2010-04-06 16:14 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > > So you have objected to needless memory barriers.  How do you feel
> > > about possibly needless ACCESS_ONCE() calls?
> > 
> > That would work here since it shouldn't emit any excess instructions.
> 
> And here is the corresponding patch.  Seem reasonable?

Actually, now I've thought about it some more.  No, it's not reasonable.
You've written:

    This patch adds a variant of rcu_dereference() that handles situations
    where the RCU-protected data structure cannot change, perhaps due to
    our holding the update-side lock, or where the RCU-protected pointer is
    only to be tested, not dereferenced.

But if we hold the update-side lock, then why should we be forced to use
ACCESS_ONCE()?

In fact, if we don't hold the lock, but we want to test the pointer twice in
succession, why should we be required to use ACCESS_LOCK()?

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-06 16:14                         ` David Howells
@ 2010-04-06 17:29                           ` Paul E. McKenney
  2010-04-06 19:34                           ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-06 17:29 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Apr 06, 2010 at 05:14:03PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > > So you have objected to needless memory barriers.  How do you feel
> > > > about possibly needless ACCESS_ONCE() calls?
> > > 
> > > That would work here since it shouldn't emit any excess instructions.
> > 
> > And here is the corresponding patch.  Seem reasonable?
> 
> Actually, now I've thought about it some more.  No, it's not reasonable.
> You've written:
> 
>     This patch adds a variant of rcu_dereference() that handles situations
>     where the RCU-protected data structure cannot change, perhaps due to
>     our holding the update-side lock, or where the RCU-protected pointer is
>     only to be tested, not dereferenced.
> 
> But if we hold the update-side lock, then why should we be forced to use
> ACCESS_ONCE()?
> 
> In fact, if we don't hold the lock, but we want to test the pointer twice in
> succession, why should we be required to use ACCESS_LOCK()?

OK, just to make sure I understand you...  You are asking for two additional
RCU API members:

1.	rcu_access_pointer() or some such that includes ACCESS_ONCE(),
	but not smp_read_barrier_depends(), which may be used when
	we are simply examining the value of the RCU-protected pointer
	(as in the NFS case).  It could also be used when the
	appropriate update-side lock is held, but for that we have:

2.	rcu_dereference_protected() or some such that includes neither
	ACCESS_ONCE() nor smp_read_barrier_depends(), and that may
	only be used if updates are prevented, for example, by holding
	the appropriate update-side lock.

Does this fit?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-06 16:14                         ` David Howells
  2010-04-06 17:29                           ` Paul E. McKenney
@ 2010-04-06 19:34                           ` David Howells
  2010-04-07  0:02                             ` Paul E. McKenney
  2010-04-07 13:22                             ` David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-04-06 19:34 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> OK, just to make sure I understand you...  You are asking for two additional
> RCU API members:
> 
> 1.	rcu_access_pointer() or some such that includes ACCESS_ONCE(),
> 	but not smp_read_barrier_depends(), which may be used when
> 	we are simply examining the value of the RCU-protected pointer
> 	(as in the NFS case).  It could also be used when the
> 	appropriate update-side lock is held, but for that we have:
> 
> 2.	rcu_dereference_protected() or some such that includes neither
> 	ACCESS_ONCE() nor smp_read_barrier_depends(), and that may
> 	only be used if updates are prevented, for example, by holding
> 	the appropriate update-side lock.
> 
> Does this fit?

Yep.  I think so.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-06 19:34                           ` David Howells
@ 2010-04-07  0:02                             ` Paul E. McKenney
  2010-04-07 13:22                             ` David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-07  0:02 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Tue, Apr 06, 2010 at 08:34:06PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > OK, just to make sure I understand you...  You are asking for two additional
> > RCU API members:
> > 
> > 1.	rcu_access_pointer() or some such that includes ACCESS_ONCE(),
> > 	but not smp_read_barrier_depends(), which may be used when
> > 	we are simply examining the value of the RCU-protected pointer
> > 	(as in the NFS case).  It could also be used when the
> > 	appropriate update-side lock is held, but for that we have:
> > 
> > 2.	rcu_dereference_protected() or some such that includes neither
> > 	ACCESS_ONCE() nor smp_read_barrier_depends(), and that may
> > 	only be used if updates are prevented, for example, by holding
> > 	the appropriate update-side lock.
> > 
> > Does this fit?
> 
> Yep.  I think so.

OK.  How about the following?

							Thanx, Paul

------------------------------------------------------------------------

commit 104e6c5f16ca5fc23af1b3f3e6197d3f599f934a
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Mon Apr 5 10:52:53 2010 -0700

    rcu: add rcu_access_pointer and rcu_dereference_protect
    
    This patch adds variants of rcu_dereference() that handle situations
    where the RCU-protected data structure cannot change, perhaps due to
    our holding the update-side lock, or where the RCU-protected pointer is
    only to be fetched, not dereferenced.
    
    The new rcu_access_pointer() primitive is for the case where the pointer
    is be fetch and not dereferenced.  This primitive may be used without
    protection, RCU or otherwise, due to the fact that it uses ACCESS_ONCE().
    
    The new rcu_dereference_protect() primitive is for the case where updates
    are prevented, for example, due to holding the update-side lock.  This
    primitive does neither ACCESS_ONCE() nor smp_read_barrier_depends(), so
    can only be used when updates are somehow prevented.
    
    Suggested-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 872a98e..ed7539f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -209,9 +209,47 @@ static inline int rcu_read_lock_sched_held(void)
 		rcu_dereference_raw(p); \
 	})
 
+/**
+ * rcu_access_pointer - fetch RCU pointer with no dereferencing
+ *
+ * Return the value of the specified RCU-protected pointer, but omit the
+ * smp_read_barrier_depends() and keep the ACCESS_ONCE().  This is useful
+ * when the value of this pointer is accessed, but the pointer is not
+ * dereferenced, for example, when testing an RCU-protected pointer against
+ * NULL.  This may also be used in cases where update-side locks prevent
+ * the value of the pointer from changing, but rcu_dereference_protect()
+ * is a lighter-weight primitive for this use case.
+ */
+#define rcu_access_pointer(p, c) \
+	({ \
+		if (debug_lockdep_rcu_enabled() && !(c)) \
+			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		ACCESS_ONCE(p); \
+	})
+
+/**
+ * rcu_dereference_protect - fetch RCU pointer when updates prevented
+ *
+ * Return the value of the specified RCU-protected pointer, but omit
+ * both the smp_read_barrier_depends() and the ACCESS_ONCE().  This
+ * is useful in cases where update-side locks prevent the value of the
+ * pointer from changing.  Please note that this primitive does -not-
+ * prevent the compiler from repeating this reference or combining it
+ * with other references, so it should not be used without protection
+ * of appropriate locks.
+ */
+#define rcu_dereference_protect(p, c) \
+	({ \
+		if (debug_lockdep_rcu_enabled() && !(c)) \
+			lockdep_rcu_dereference(__FILE__, __LINE__); \
+		(p); \
+	})
+
 #else /* #ifdef CONFIG_PROVE_RCU */
 
 #define rcu_dereference_check(p, c)	rcu_dereference_raw(p)
+#define rcu_access_pointer(p, c)	ACCESS_ONCE(p)
+#define rcu_dereference_protect(p, c) (p)
 
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-06 19:34                           ` David Howells
  2010-04-07  0:02                             ` Paul E. McKenney
@ 2010-04-07 13:22                             ` David Howells
  2010-04-07 15:57                               ` Paul E. McKenney
  2010-04-07 16:35                               ` RCU condition checks David Howells
  1 sibling, 2 replies; 48+ messages in thread
From: David Howells @ 2010-04-07 13:22 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> +#define rcu_access_pointer(p, c) \

Why is there a need for 'c'?

> +#define rcu_dereference_protect(p, c) \

I'd prefer rcu_dereference_protected(), I think.  This macro doesn't protect
anything.  Also, again, why the need for 'c'?

For instance, in:

	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
	{
		struct nfs_delegation *delegation =
			rcu_dereference_protected(nfsi->delegation, ????);

what would be the condition?  That the spinlock is held?  That's a condition
for calling the function.

And in:

	void nfs_inode_return_delegation_noreclaim(struct inode *inode)
	{
		struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
		struct nfs_inode *nfsi = NFS_I(inode);
		struct nfs_delegation *delegation;

		if (rcu_access_pointer(nfsi->delegation, ????) != NULL) {

what would be the condition here?  There's no lock to check - that's the whole
point of the macro.  I also can't give it nfsi->delegation to check as the
value may change between the two accesses.

David

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]
  2010-04-07 13:22                             ` David Howells
@ 2010-04-07 15:57                               ` Paul E. McKenney
  2010-04-07 16:35                               ` RCU condition checks David Howells
  1 sibling, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-07 15:57 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Apr 07, 2010 at 02:22:41PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > +#define rcu_access_pointer(p, c) \
> 
> Why is there a need for 'c'?

An example use is where rcu_access_pointer() is legal because we are
either initializing or cleaning up, so that no other CPU has access
to the pointer.  In these cases, you might do something like:

	q = rcu_access_pointer(p->a, p->refcnt == 0);

> > +#define rcu_dereference_protect(p, c) \
> 
> I'd prefer rcu_dereference_protected(), I think.  This macro doesn't protect
> anything.  Also, again, why the need for 'c'?

Agreed on rcu_dereference_protected().  I succumbed to a fit of "make
the identifier shorter", please accept my apologies.

> For instance, in:
> 
> 	static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
> 	{
> 		struct nfs_delegation *delegation =
> 			rcu_dereference_protected(nfsi->delegation, ????);
> 
> what would be the condition?  That the spinlock is held?  That's a condition
> for calling the function.

Yep, that the spinlock is held.  I agree that it is a bit obvious in
this case, but I have come across a number of RCU uses where the lock
in question was acquired many function calls removed from the access,
and where there other locks were held for other purposes.

> And in:
> 
> 	void nfs_inode_return_delegation_noreclaim(struct inode *inode)
> 	{
> 		struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
> 		struct nfs_inode *nfsi = NFS_I(inode);
> 		struct nfs_delegation *delegation;
> 
> 		if (rcu_access_pointer(nfsi->delegation, ????) != NULL) {
> 
> what would be the condition here?  There's no lock to check - that's the whole
> point of the macro.  I also can't give it nfsi->delegation to check as the
> value may change between the two accesses.

I suggest something like the following:

		/* protected by double-check lock pattern. */
		if (rcu_access_pointer(nfsi->delegation, 1) != NULL) {

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RCU condition checks
  2010-04-07 13:22                             ` David Howells
  2010-04-07 15:57                               ` Paul E. McKenney
@ 2010-04-07 16:35                               ` David Howells
  2010-04-07 17:10                                 ` Paul E. McKenney
  1 sibling, 1 reply; 48+ messages in thread
From: David Howells @ 2010-04-07 16:35 UTC (permalink / raw)
  To: paulmck; +Cc: dhowells, Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:

> > Why is there a need for 'c'?
> 
> An example use is where rcu_access_pointer() is legal because we are
> either initializing or cleaning up, so that no other CPU has access
> to the pointer.  In these cases, you might do something like:
> 
> 	q = rcu_access_pointer(p->a, p->refcnt == 0);

I think the main problem I have with this is that the fact that p->refcnt
should be 0 here is unrelated to the fact that we're wanting to look at the
value of p->a.  I'd say that this should be two separate statements, for
example:

	ASSERT(p->refcnt == 0);
	q = rcu_access_pointer(p->a);

I could see using a lockdep-managed ASSERT here would work, though.

The other problem I have with this is that I'm assuming rcu_access_pointer()
is simply for looking at the value of the pointer without dereferencing it -
in which case, is there any need for the lock-describing condition?


I agree, though, that:

	q = rcu_dereference_check(p->a,
				  rcu_read_lock_held() || (
				   lockdep_is_held(p->lock) &&
				   lockdep_is_held(q->lock)));

is a reasonable way of keeping the dereference and the lock checks together,
though that could equally well be written, say:

	LOCKDEP_ASSERT(rcu_read_lock_held() || (
		        lockdep_is_held(p->lock) &&
			lockdep_is_held(q->lock)));
	q = rcu_dereference_protected(p->a);

but combining those makes it easier to ensure people to write lock checking.

Davod

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: RCU condition checks
  2010-04-07 16:35                               ` RCU condition checks David Howells
@ 2010-04-07 17:10                                 ` Paul E. McKenney
  2010-04-11 22:57                                   ` Trond Myklebust
  0 siblings, 1 reply; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-07 17:10 UTC (permalink / raw)
  To: David Howells; +Cc: Eric Dumazet, Trond.Myklebust, linux-nfs, linux-kernel

On Wed, Apr 07, 2010 at 05:35:30PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > Why is there a need for 'c'?
> > 
> > An example use is where rcu_access_pointer() is legal because we are
> > either initializing or cleaning up, so that no other CPU has access
> > to the pointer.  In these cases, you might do something like:
> > 
> > 	q = rcu_access_pointer(p->a, p->refcnt == 0);
> 
> I think the main problem I have with this is that the fact that p->refcnt
> should be 0 here is unrelated to the fact that we're wanting to look at the
> value of p->a.  I'd say that this should be two separate statements, for
> example:
> 
> 	ASSERT(p->refcnt == 0);
> 	q = rcu_access_pointer(p->a);
> 
> I could see using a lockdep-managed ASSERT here would work, though.
> 
> The other problem I have with this is that I'm assuming rcu_access_pointer()
> is simply for looking at the value of the pointer without dereferencing it -
> in which case, is there any need for the lock-describing condition?

I agree that in many cases there won't be a reasonable condition.
In which case, using "1" and an explanatory comment makes sense.
In other cases, the fact that the value is zero can mean that no one
else can possibly have a reference.

All that aside, I fully expect that uses of rcu_access_pointer() will
require more than the usual code-review effort, as these sorts of
unprotected accesses are notoriously error-prone.

> I agree, though, that:
> 
> 	q = rcu_dereference_check(p->a,
> 				  rcu_read_lock_held() || (
> 				   lockdep_is_held(p->lock) &&
> 				   lockdep_is_held(q->lock)));
> 
> is a reasonable way of keeping the dereference and the lock checks together,
> though that could equally well be written, say:
> 
> 	LOCKDEP_ASSERT(rcu_read_lock_held() || (
> 		        lockdep_is_held(p->lock) &&
> 			lockdep_is_held(q->lock)));
> 	q = rcu_dereference_protected(p->a);
> 
> but combining those makes it easier to ensure people to write lock checking.

Glad you like it!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: RCU condition checks
  2010-04-07 17:10                                 ` Paul E. McKenney
@ 2010-04-11 22:57                                   ` Trond Myklebust
  2010-04-12 16:47                                       ` Paul E. McKenney
  0 siblings, 1 reply; 48+ messages in thread
From: Trond Myklebust @ 2010-04-11 22:57 UTC (permalink / raw)
  To: paulmck; +Cc: David Howells, Eric Dumazet, linux-nfs, linux-kernel

On Wed, 2010-04-07 at 10:10 -0700, Paul E. McKenney wrote: 
> On Wed, Apr 07, 2010 at 05:35:30PM +0100, David Howells wrote:
> > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > > Why is there a need for 'c'?
> > > 
> > > An example use is where rcu_access_pointer() is legal because we are
> > > either initializing or cleaning up, so that no other CPU has access
> > > to the pointer.  In these cases, you might do something like:
> > > 
> > > 	q = rcu_access_pointer(p->a, p->refcnt == 0);
> > 
> > I think the main problem I have with this is that the fact that p->refcnt
> > should be 0 here is unrelated to the fact that we're wanting to look at the
> > value of p->a.  I'd say that this should be two separate statements, for
> > example:
> > 
> > 	ASSERT(p->refcnt == 0);
> > 	q = rcu_access_pointer(p->a);
> > 
> > I could see using a lockdep-managed ASSERT here would work, though.
> > 
> > The other problem I have with this is that I'm assuming rcu_access_pointer()
> > is simply for looking at the value of the pointer without dereferencing it -
> > in which case, is there any need for the lock-describing condition?
> 
> I agree that in many cases there won't be a reasonable condition.
> In which case, using "1" and an explanatory comment makes sense.
> In other cases, the fact that the value is zero can mean that no one
> else can possibly have a reference.
> 
> All that aside, I fully expect that uses of rcu_access_pointer() will
> require more than the usual code-review effort, as these sorts of
> unprotected accesses are notoriously error-prone.
> 
> > I agree, though, that:
> > 
> > 	q = rcu_dereference_check(p->a,
> > 				  rcu_read_lock_held() || (
> > 				   lockdep_is_held(p->lock) &&
> > 				   lockdep_is_held(q->lock)));
> > 
> > is a reasonable way of keeping the dereference and the lock checks together,
> > though that could equally well be written, say:
> > 
> > 	LOCKDEP_ASSERT(rcu_read_lock_held() || (
> > 		        lockdep_is_held(p->lock) &&
> > 			lockdep_is_held(q->lock)));
> > 	q = rcu_dereference_protected(p->a);
> > 
> > but combining those makes it easier to ensure people to write lock checking.
> 
> Glad you like it!
> 
> 							Thanx, Paul

What say we just list the conditions in the comments. I'm happy with
something like the following:

Trond
--------------------------------------------------------------------------------------------- 
NFSv4: Kill the bogus RCU dereferencing warnings in fs/nfs/delegation.c

From: Trond Myklebust <Trond.Myklebust@netapp.com>

Kill all the bogus warnings about RCU dereferencing, and document which
locks are protecting the pointer derefs.

Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/delegation.c |   24 ++++++++++++++++++------
 1 files changed, 18 insertions(+), 6 deletions(-)


diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 1567124..5a1a379 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -34,12 +34,17 @@ static void nfs_free_delegation_callback(struct rcu_head *head)
 	nfs_do_free_delegation(delegation);
 }
 
+/*
+ * At this point, we know that the nfsi->rwsem protects us against read
+ * access by the state recovery thread, so it is safe to assume nobody
+ * else is accessing delegation->cred.
+ */
 static void nfs_free_delegation(struct nfs_delegation *delegation)
 {
 	struct rpc_cred *cred;
 
-	cred = rcu_dereference(delegation->cred);
-	rcu_assign_pointer(delegation->cred, NULL);
+	cred = delegation->cred;
+	delegation->cred = NULL;
 	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
 	if (cred)
 		put_rpccred(cred);
@@ -166,12 +171,18 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
 	return inode;
 }
 
+/*
+ * This function must be called with the nfs_client->cl_lock held to
+ * ensure that the value of nfsi->delegation is protected against
+ * modification by other threads.
+ */
 static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
 {
-	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
+	struct nfs_delegation *delegation = nfsi->delegation;
 
 	if (delegation == NULL)
 		goto nomatch;
+	/* Lock out RCU-protected lookups. */
 	spin_lock(&delegation->lock);
 	if (stateid != NULL && memcmp(delegation->stateid.data, stateid->data,
 				sizeof(delegation->stateid.data)) != 0)
@@ -212,8 +223,9 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
 	delegation->flags = 1<<NFS_DELEGATION_REFERENCED;
 	spin_lock_init(&delegation->lock);
 
+	/* Protect nfsi->delegation against modification */
 	spin_lock(&clp->cl_lock);
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
 					sizeof(delegation->stateid)) == 0 &&
 				delegation->type == nfsi->delegation->type) {
@@ -330,7 +342,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
 	struct nfs_inode *nfsi = NFS_I(inode);
 	struct nfs_delegation *delegation;
 
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		spin_lock(&clp->cl_lock);
 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
 		spin_unlock(&clp->cl_lock);
@@ -346,7 +358,7 @@ int nfs_inode_return_delegation(struct inode *inode)
 	struct nfs_delegation *delegation;
 	int err = 0;
 
-	if (rcu_dereference(nfsi->delegation) != NULL) {
+	if (nfsi->delegation != NULL) {
 		spin_lock(&clp->cl_lock);
 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
 		spin_unlock(&clp->cl_lock);


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: RCU condition checks
@ 2010-04-12 16:47                                       ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-12 16:47 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: David Howells, Eric Dumazet, linux-nfs, linux-kernel

On Sun, Apr 11, 2010 at 06:57:23PM -0400, Trond Myklebust wrote:
> On Wed, 2010-04-07 at 10:10 -0700, Paul E. McKenney wrote: 
> > On Wed, Apr 07, 2010 at 05:35:30PM +0100, David Howells wrote:
> > > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > > Why is there a need for 'c'?
> > > > 
> > > > An example use is where rcu_access_pointer() is legal because we are
> > > > either initializing or cleaning up, so that no other CPU has access
> > > > to the pointer.  In these cases, you might do something like:
> > > > 
> > > > 	q = rcu_access_pointer(p->a, p->refcnt == 0);
> > > 
> > > I think the main problem I have with this is that the fact that p->refcnt
> > > should be 0 here is unrelated to the fact that we're wanting to look at the
> > > value of p->a.  I'd say that this should be two separate statements, for
> > > example:
> > > 
> > > 	ASSERT(p->refcnt == 0);
> > > 	q = rcu_access_pointer(p->a);
> > > 
> > > I could see using a lockdep-managed ASSERT here would work, though.
> > > 
> > > The other problem I have with this is that I'm assuming rcu_access_pointer()
> > > is simply for looking at the value of the pointer without dereferencing it -
> > > in which case, is there any need for the lock-describing condition?
> > 
> > I agree that in many cases there won't be a reasonable condition.
> > In which case, using "1" and an explanatory comment makes sense.
> > In other cases, the fact that the value is zero can mean that no one
> > else can possibly have a reference.
> > 
> > All that aside, I fully expect that uses of rcu_access_pointer() will
> > require more than the usual code-review effort, as these sorts of
> > unprotected accesses are notoriously error-prone.
> > 
> > > I agree, though, that:
> > > 
> > > 	q = rcu_dereference_check(p->a,
> > > 				  rcu_read_lock_held() || (
> > > 				   lockdep_is_held(p->lock) &&
> > > 				   lockdep_is_held(q->lock)));
> > > 
> > > is a reasonable way of keeping the dereference and the lock checks together,
> > > though that could equally well be written, say:
> > > 
> > > 	LOCKDEP_ASSERT(rcu_read_lock_held() || (
> > > 		        lockdep_is_held(p->lock) &&
> > > 			lockdep_is_held(q->lock)));
> > > 	q = rcu_dereference_protected(p->a);
> > > 
> > > but combining those makes it easier to ensure people to write lock checking.
> > 
> > Glad you like it!
> > 
> > 							Thanx, Paul
> 
> What say we just list the conditions in the comments. I'm happy with
> something like the following:

This does work at present, but the sparse-based checks that Arnd is
working on will generate warnings anywhere an RCU-protected pointer is
used as a normal pointer.  So I believe that it would be good to get
these taken care of now, rather than having them break again in a very
short time.

							Thanx, Paul

> Trond
> --------------------------------------------------------------------------------------------- 
> NFSv4: Kill the bogus RCU dereferencing warnings in fs/nfs/delegation.c
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Kill all the bogus warnings about RCU dereferencing, and document which
> locks are protecting the pointer derefs.
> 
> Reported-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  fs/nfs/delegation.c |   24 ++++++++++++++++++------
>  1 files changed, 18 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 1567124..5a1a379 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -34,12 +34,17 @@ static void nfs_free_delegation_callback(struct rcu_head *head)
>  	nfs_do_free_delegation(delegation);
>  }
>  
> +/*
> + * At this point, we know that the nfsi->rwsem protects us against read
> + * access by the state recovery thread, so it is safe to assume nobody
> + * else is accessing delegation->cred.
> + */
>  static void nfs_free_delegation(struct nfs_delegation *delegation)
>  {
>  	struct rpc_cred *cred;
>  
> -	cred = rcu_dereference(delegation->cred);
> -	rcu_assign_pointer(delegation->cred, NULL);
> +	cred = delegation->cred;
> +	delegation->cred = NULL;
>  	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
>  	if (cred)
>  		put_rpccred(cred);
> @@ -166,12 +171,18 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
>  	return inode;
>  }
>  
> +/*
> + * This function must be called with the nfs_client->cl_lock held to
> + * ensure that the value of nfsi->delegation is protected against
> + * modification by other threads.
> + */
>  static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
>  {
> -	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> +	struct nfs_delegation *delegation = nfsi->delegation;
>  
>  	if (delegation == NULL)
>  		goto nomatch;
> +	/* Lock out RCU-protected lookups. */
>  	spin_lock(&delegation->lock);
>  	if (stateid != NULL && memcmp(delegation->stateid.data, stateid->data,
>  				sizeof(delegation->stateid.data)) != 0)
> @@ -212,8 +223,9 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
>  	delegation->flags = 1<<NFS_DELEGATION_REFERENCED;
>  	spin_lock_init(&delegation->lock);
>  
> +	/* Protect nfsi->delegation against modification */
>  	spin_lock(&clp->cl_lock);
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
>  					sizeof(delegation->stateid)) == 0 &&
>  				delegation->type == nfsi->delegation->type) {
> @@ -330,7 +342,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
>  	struct nfs_inode *nfsi = NFS_I(inode);
>  	struct nfs_delegation *delegation;
>  
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> @@ -346,7 +358,7 @@ int nfs_inode_return_delegation(struct inode *inode)
>  	struct nfs_delegation *delegation;
>  	int err = 0;
>  
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: RCU condition checks
@ 2010-04-12 16:47                                       ` Paul E. McKenney
  0 siblings, 0 replies; 48+ messages in thread
From: Paul E. McKenney @ 2010-04-12 16:47 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: David Howells, Eric Dumazet, linux-nfs, linux-kernel

On Sun, Apr 11, 2010 at 06:57:23PM -0400, Trond Myklebust wrote:
> On Wed, 2010-04-07 at 10:10 -0700, Paul E. McKenney wrote: 
> > On Wed, Apr 07, 2010 at 05:35:30PM +0100, David Howells wrote:
> > > Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > > Why is there a need for 'c'?
> > > > 
> > > > An example use is where rcu_access_pointer() is legal because we are
> > > > either initializing or cleaning up, so that no other CPU has access
> > > > to the pointer.  In these cases, you might do something like:
> > > > 
> > > > 	q = rcu_access_pointer(p->a, p->refcnt == 0);
> > > 
> > > I think the main problem I have with this is that the fact that p->refcnt
> > > should be 0 here is unrelated to the fact that we're wanting to look at the
> > > value of p->a.  I'd say that this should be two separate statements, for
> > > example:
> > > 
> > > 	ASSERT(p->refcnt == 0);
> > > 	q = rcu_access_pointer(p->a);
> > > 
> > > I could see using a lockdep-managed ASSERT here would work, though.
> > > 
> > > The other problem I have with this is that I'm assuming rcu_access_pointer()
> > > is simply for looking at the value of the pointer without dereferencing it -
> > > in which case, is there any need for the lock-describing condition?
> > 
> > I agree that in many cases there won't be a reasonable condition.
> > In which case, using "1" and an explanatory comment makes sense.
> > In other cases, the fact that the value is zero can mean that no one
> > else can possibly have a reference.
> > 
> > All that aside, I fully expect that uses of rcu_access_pointer() will
> > require more than the usual code-review effort, as these sorts of
> > unprotected accesses are notoriously error-prone.
> > 
> > > I agree, though, that:
> > > 
> > > 	q = rcu_dereference_check(p->a,
> > > 				  rcu_read_lock_held() || (
> > > 				   lockdep_is_held(p->lock) &&
> > > 				   lockdep_is_held(q->lock)));
> > > 
> > > is a reasonable way of keeping the dereference and the lock checks together,
> > > though that could equally well be written, say:
> > > 
> > > 	LOCKDEP_ASSERT(rcu_read_lock_held() || (
> > > 		        lockdep_is_held(p->lock) &&
> > > 			lockdep_is_held(q->lock)));
> > > 	q = rcu_dereference_protected(p->a);
> > > 
> > > but combining those makes it easier to ensure people to write lock checking.
> > 
> > Glad you like it!
> > 
> > 							Thanx, Paul
> 
> What say we just list the conditions in the comments. I'm happy with
> something like the following:

This does work at present, but the sparse-based checks that Arnd is
working on will generate warnings anywhere an RCU-protected pointer is
used as a normal pointer.  So I believe that it would be good to get
these taken care of now, rather than having them break again in a very
short time.

							Thanx, Paul

> Trond
> --------------------------------------------------------------------------------------------- 
> NFSv4: Kill the bogus RCU dereferencing warnings in fs/nfs/delegation.c
> 
> From: Trond Myklebust <Trond.Myklebust@netapp.com>
> 
> Kill all the bogus warnings about RCU dereferencing, and document which
> locks are protecting the pointer derefs.
> 
> Reported-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> ---
> 
>  fs/nfs/delegation.c |   24 ++++++++++++++++++------
>  1 files changed, 18 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
> index 1567124..5a1a379 100644
> --- a/fs/nfs/delegation.c
> +++ b/fs/nfs/delegation.c
> @@ -34,12 +34,17 @@ static void nfs_free_delegation_callback(struct rcu_head *head)
>  	nfs_do_free_delegation(delegation);
>  }
>  
> +/*
> + * At this point, we know that the nfsi->rwsem protects us against read
> + * access by the state recovery thread, so it is safe to assume nobody
> + * else is accessing delegation->cred.
> + */
>  static void nfs_free_delegation(struct nfs_delegation *delegation)
>  {
>  	struct rpc_cred *cred;
>  
> -	cred = rcu_dereference(delegation->cred);
> -	rcu_assign_pointer(delegation->cred, NULL);
> +	cred = delegation->cred;
> +	delegation->cred = NULL;
>  	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
>  	if (cred)
>  		put_rpccred(cred);
> @@ -166,12 +171,18 @@ static struct inode *nfs_delegation_grab_inode(struct nfs_delegation *delegation
>  	return inode;
>  }
>  
> +/*
> + * This function must be called with the nfs_client->cl_lock held to
> + * ensure that the value of nfsi->delegation is protected against
> + * modification by other threads.
> + */
>  static struct nfs_delegation *nfs_detach_delegation_locked(struct nfs_inode *nfsi, const nfs4_stateid *stateid)
>  {
> -	struct nfs_delegation *delegation = rcu_dereference(nfsi->delegation);
> +	struct nfs_delegation *delegation = nfsi->delegation;
>  
>  	if (delegation == NULL)
>  		goto nomatch;
> +	/* Lock out RCU-protected lookups. */
>  	spin_lock(&delegation->lock);
>  	if (stateid != NULL && memcmp(delegation->stateid.data, stateid->data,
>  				sizeof(delegation->stateid.data)) != 0)
> @@ -212,8 +223,9 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
>  	delegation->flags = 1<<NFS_DELEGATION_REFERENCED;
>  	spin_lock_init(&delegation->lock);
>  
> +	/* Protect nfsi->delegation against modification */
>  	spin_lock(&clp->cl_lock);
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		if (memcmp(&delegation->stateid, &nfsi->delegation->stateid,
>  					sizeof(delegation->stateid)) == 0 &&
>  				delegation->type == nfsi->delegation->type) {
> @@ -330,7 +342,7 @@ void nfs_inode_return_delegation_noreclaim(struct inode *inode)
>  	struct nfs_inode *nfsi = NFS_I(inode);
>  	struct nfs_delegation *delegation;
>  
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> @@ -346,7 +358,7 @@ int nfs_inode_return_delegation(struct inode *inode)
>  	struct nfs_delegation *delegation;
>  	int err = 0;
>  
> -	if (rcu_dereference(nfsi->delegation) != NULL) {
> +	if (nfsi->delegation != NULL) {
>  		spin_lock(&clp->cl_lock);
>  		delegation = nfs_detach_delegation_locked(nfsi, NULL);
>  		spin_unlock(&clp->cl_lock);
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2010-04-12 16:48 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-18 13:33 [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
2010-03-19  2:25 ` Paul E. McKenney
2010-03-19  2:25   ` Paul E. McKenney
2010-03-29 19:02 ` David Howells
2010-03-29 19:21   ` Paul E. McKenney
2010-03-29 20:15   ` David Howells
2010-03-29 20:26     ` Eric Dumazet
2010-03-29 20:26       ` Eric Dumazet
2010-03-29 21:05     ` Paul E. McKenney
2010-03-29 22:22     ` David Howells
2010-03-29 22:36       ` Paul E. McKenney
2010-03-29 22:59       ` David Howells
2010-03-29 23:26         ` Paul E. McKenney
2010-03-30 15:40           ` Paul E. McKenney
2010-03-30 16:39           ` David Howells
2010-03-30 16:49             ` Paul E. McKenney
2010-03-30 17:04               ` Eric Dumazet
2010-03-30 17:04                 ` Eric Dumazet
2010-03-30 17:25                 ` Paul E. McKenney
2010-03-30 17:25                   ` Paul E. McKenney
2010-03-30 23:51             ` David Howells
2010-03-31  0:08               ` Paul E. McKenney
2010-03-31 14:04               ` David Howells
2010-03-31 15:16                 ` Paul E. McKenney
2010-03-31 17:37                 ` David Howells
2010-03-31 18:30                   ` Paul E. McKenney
2010-03-31 18:32                   ` Eric Dumazet
2010-03-31 18:32                     ` Eric Dumazet
2010-03-31 22:53                   ` David Howells
2010-04-01  1:29                     ` Paul E. McKenney
2010-04-01 11:45                     ` David Howells
2010-04-01 14:39                       ` Paul E. McKenney
2010-04-01 14:46                       ` David Howells
2010-04-05 17:57                         ` Paul E. McKenney
2010-04-06  9:30                         ` David Howells
2010-04-06 16:14                         ` David Howells
2010-04-06 17:29                           ` Paul E. McKenney
2010-04-06 19:34                           ` David Howells
2010-04-07  0:02                             ` Paul E. McKenney
2010-04-07 13:22                             ` David Howells
2010-04-07 15:57                               ` Paul E. McKenney
2010-04-07 16:35                               ` RCU condition checks David Howells
2010-04-07 17:10                                 ` Paul E. McKenney
2010-04-11 22:57                                   ` Trond Myklebust
2010-04-12 16:47                                     ` Paul E. McKenney
2010-04-12 16:47                                       ` Paul E. McKenney
2010-03-30 16:37         ` [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2] David Howells
2010-03-30 17:01           ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.