All of lore.kernel.org
 help / color / mirror / Atom feed
* Potential locking issue in sunrpc
@ 2011-06-30 21:08 Ben Greear
  2011-07-01 18:37 ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Greear @ 2011-06-30 21:08 UTC (permalink / raw)
  To: linux-nfs

This method in sched.c says we should have a lock for ASYNC

/*
  * Make an RPC task runnable.
  *
  * Note: If the task is ASYNC, this must be called with
  * the spinlock held to protect the wait queue operation.
  */
static void rpc_make_runnable(struct rpc_task *task)

However, I don't think the lock is being taken for
the call path starting with rpcb_call_async in
rpcb_clnt.c:

rpcb_call_async
rpc_run_task
rpc_execute
rpc_make_runnable

Is the comment on the make_runnable method wrong, or are we missing locking?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Potential locking issue in sunrpc
  2011-06-30 21:08 Potential locking issue in sunrpc Ben Greear
@ 2011-07-01 18:37 ` Trond Myklebust
  2011-07-01 18:48   ` Ben Greear
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2011-07-01 18:37 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-nfs

On Thu, 2011-06-30 at 14:08 -0700, Ben Greear wrote: 
> This method in sched.c says we should have a lock for ASYNC
> 
> /*
>   * Make an RPC task runnable.
>   *
>   * Note: If the task is ASYNC, this must be called with
>   * the spinlock held to protect the wait queue operation.
>   */
> static void rpc_make_runnable(struct rpc_task *task)
> 
> However, I don't think the lock is being taken for
> the call path starting with rpcb_call_async in
> rpcb_clnt.c:
> 
> rpcb_call_async
> rpc_run_task
> rpc_execute
> rpc_make_runnable
> 
> Is the comment on the make_runnable method wrong, or are we missing locking?

Neither. In the above case, we are not waking up from a wait queue, so
there are no locks involved.

Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Potential locking issue in sunrpc
  2011-07-01 18:37 ` Trond Myklebust
@ 2011-07-01 18:48   ` Ben Greear
  0 siblings, 0 replies; 4+ messages in thread
From: Ben Greear @ 2011-07-01 18:48 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On 07/01/2011 11:37 AM, Trond Myklebust wrote:
> On Thu, 2011-06-30 at 14:08 -0700, Ben Greear wrote:
>> This method in sched.c says we should have a lock for ASYNC
>>
>> /*
>>    * Make an RPC task runnable.
>>    *
>>    * Note: If the task is ASYNC, this must be called with
>>    * the spinlock held to protect the wait queue operation.
>>    */
>> static void rpc_make_runnable(struct rpc_task *task)
>>
>> However, I don't think the lock is being taken for
>> the call path starting with rpcb_call_async in
>> rpcb_clnt.c:
>>
>> rpcb_call_async
>> rpc_run_task
>> rpc_execute
>> rpc_make_runnable
>>
>> Is the comment on the make_runnable method wrong, or are we missing locking?
>
> Neither. In the above case, we are not waking up from a wait queue, so
> there are no locks involved.

Ok.  That MUST in the comment could have a caveat, but either way,
looks like that locking isn't an issue so I'll have to look elsewhere.

I'm seeing a worker-thread making calls on tasks that have
already been freed.  Takes very long time (hours or days) to reproduce,
as well as my patches to support lots of client mounts by binding to IP.

Our test case is to create 200 client mounts, start low-speed O_DIRECT
traffic on each mount, run for 15-45 seconds, un-mount, repeat.

Do you have any suggestions for how this sort of thing might happen?

Like if something added the task to the work-queue, and then deleted it
before the task was run, or perhaps some sort of corruption/race while
adding it to the work-queue?

Calling memset to zero out the task right before giving it back to the
mempool makes it much harder to reproduce, but eventually we do see
a null-pointer exception with something trying to access that task's
memory.

It seems reliably related to rpcb_clnt logic that creates an async, dynamic
task.

I'll be happy to re-post the traces from slub debugging
if you have interest in seeing them.

Thanks,
Ben


>
> Trond
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Potential locking issue in sunrpc
@ 2011-06-30 23:12 Ben Greear
  0 siblings, 0 replies; 4+ messages in thread
From: Ben Greear @ 2011-06-30 23:12 UTC (permalink / raw)
  To: netdev

This method in sched.c says we should have a lock for ASYNC

/*
  * Make an RPC task runnable.
  *
  * Note: If the task is ASYNC, this must be called with
  * the spinlock held to protect the wait queue operation.
  */
static void rpc_make_runnable(struct rpc_task *task)

However, I don't think the lock is being taken for
the call path starting with rpcb_call_async in
rpcb_clnt.c:

rpcb_call_async
rpc_run_task
rpc_execute
rpc_make_runnable

Is the comment on the make_runnable method wrong, or are we missing locking?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-07-01 18:48 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-30 21:08 Potential locking issue in sunrpc Ben Greear
2011-07-01 18:37 ` Trond Myklebust
2011-07-01 18:48   ` Ben Greear
2011-06-30 23:12 Ben Greear

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.