* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Srinivas Eeda @ 2014-01-11  1:19 UTC (permalink / raw)
  To: ocfs2-devel

From: Srinivas Eeda <seeda@srini.(none)>

A small race between a BAST and an unlock message causes the NULL pointer
dereference.

A node sends an unlock request to the master and receives a response. Before
processing the response, it receives a BAST from the master. Because the two
messages are processed by different threads, they race: while the BAST is
being processed, the lock can be freed by the unlock code.

This patch makes the BAST handler return immediately if the lock is found but
an unlock is pending. The code should handle this race. We also have to fix
the master node so that it skips sending a BAST after receiving an unlock
message.

Below is the crash stack:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa015e023>] o2dlm_blocking_ast_wrapper+0xd/0x16
[<ffffffffa034e3db>] dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
[<ffffffffa034f366>] dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
[<ffffffffa0308abe>] o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
[<ffffffffa030aac8>] o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
[<ffffffff81071802>] worker_thread+0x14d/0x1ed

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
---
 fs/ocfs2/dlm/dlmast.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
index b46278f..dbc6cee 100644
--- a/fs/ocfs2/dlm/dlmast.c
+++ b/fs/ocfs2/dlm/dlmast.c
@@ -385,8 +385,13 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
 		head = &res->granted;
 
 	list_for_each_entry(lock, head, list) {
-		if (lock->ml.cookie == cookie)
-			goto do_ast;
+		/* if lock is found but unlock is pending ignore the bast */
+		if (lock->ml.cookie == cookie) {
+			if (lock->unlock_pending)
+				break;
+			else
+				goto do_ast;
+		}
 	}
 
 	mlog(0, "Got %sast for unknown lock! cookie=%u:%llu, name=%.*s, "
-- 
1.7.9.5


* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Joel Becker @ 2014-01-13 15:37 UTC (permalink / raw)
  To: ocfs2-devel

On Fri, Jan 10, 2014 at 05:19:13PM -0800, Srinivas Eeda wrote:
> From: Srinivas Eeda <seeda@srini.(none)>
> 
> A tiny race between BAST and unlock message causes the NULL dereference.
> 
> A node sends an unlock request to master and receives a response. Before
> processing the response it receives a BAST from the master. Since both requests
> are processed by different threads it creates a race. While the BAST is being
> processed, lock can get freed by unlock code.
> 
> This patch makes bast to return immediately if lock is found but unlock is
> pending. The code should handle this race. We also have to fix master node to
> skip sending BAST after receiving unlock message.

Did the master send the BAST after the unlock, or does that race too?
Does the master know the unlock has succeeded, or does it just think so?

> @@ -385,8 +385,13 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>  		head = &res->granted;
>  
>  	list_for_each_entry(lock, head, list) {
> -		if (lock->ml.cookie == cookie)
> -			goto do_ast;
> +		/* if lock is found but unlock is pending ignore the bast */
> +		if (lock->ml.cookie == cookie) {
> +			if (lock->unlock_pending)
> +				break;
> +			else
> +				goto do_ast;
> +		}

This breaks out for asts as well as basts.  Can't that cause problems
with the unlock ast expected by the caller?
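
One way to picture this concern is a lookup that ignores only a BAST while an unlock is pending and still delivers an AST. The following is a minimal standalone userspace sketch, not kernel code and not the fix that was proposed in this thread. The names `model_lock`, `model_msg_type`, and `model_find_target` are hypothetical stand-ins for the o2dlm structures, and the singly linked list replaces the kernel's `list_head`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real o2dlm types. */
enum model_msg_type { MODEL_AST, MODEL_BAST };

struct model_lock {
	uint64_t cookie;          /* identifies the lock, like ml.cookie */
	int unlock_pending;       /* nonzero while an unlock is in flight */
	struct model_lock *next;  /* singly linked stand-in for list_head */
};

/*
 * Return the lock that should receive the callback, or NULL if the
 * message should be dropped (unknown cookie, or a BAST that races
 * with a pending unlock).
 */
static struct model_lock *model_find_target(struct model_lock *head,
					    uint64_t cookie,
					    enum model_msg_type type)
{
	struct model_lock *lock;

	for (lock = head; lock != NULL; lock = lock->next) {
		if (lock->cookie != cookie)
			continue;
		/* Drop only a BAST; an AST is still delivered. */
		if (type == MODEL_BAST && lock->unlock_pending)
			return NULL;
		return lock;
	}
	return NULL; /* unknown cookie */
}

/* Small self-check of the behaviour described above. */
static void model_find_target_demo(void)
{
	struct model_lock second = { 2, 1, NULL };
	struct model_lock first = { 1, 0, &second };

	assert(model_find_target(&first, 2, MODEL_BAST) == NULL);
	assert(model_find_target(&first, 2, MODEL_AST) == &second);
	assert(model_find_target(&first, 1, MODEL_BAST) == &first);
}
```

Whether an AST can legitimately arrive while unlock_pending is set is exactly the open question here; the sketch only shows how the two message types could be told apart if it can.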

Joel


-- 

"Not being known doesn't stop the truth from being true."
        - Richard Bach

			http://www.jlbec.org/
			jlbec at evilplan.org


* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Joseph Qi @ 2014-01-14  4:06 UTC (permalink / raw)
  To: ocfs2-devel

On 2014/1/11 9:19, Srinivas Eeda wrote:
> From: Srinivas Eeda <seeda@srini.(none)>
> 
> A tiny race between BAST and unlock message causes the NULL dereference.
> 
> A node sends an unlock request to master and receives a response. Before
> processing the response it receives a BAST from the master. Since both requests
> are processed by different threads it creates a race. While the BAST is being
> processed, lock can get freed by unlock code.
> 
> This patch makes bast to return immediately if lock is found but unlock is
> pending. The code should handle this race. We also have to fix master node to
> skip sending BAST after receiving unlock message.
> 
> Below is the crash stack
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
> IP: [<ffffffffa015e023>] o2dlm_blocking_ast_wrapper+0xd/0x16
> [<ffffffffa034e3db>] dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
> [<ffffffffa034f366>] dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
> [<ffffffffa0308abe>] o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
> [<ffffffffa030aac8>] o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
> [<ffffffff81071802>] worker_thread+0x14d/0x1ed
> 
> Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
> ---
>  fs/ocfs2/dlm/dlmast.c |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
> index b46278f..dbc6cee 100644
> --- a/fs/ocfs2/dlm/dlmast.c
> +++ b/fs/ocfs2/dlm/dlmast.c
> @@ -385,8 +385,13 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>  		head = &res->granted;
>  
>  	list_for_each_entry(lock, head, list) {
> -		if (lock->ml.cookie == cookie)
> -			goto do_ast;
> +		/* if lock is found but unlock is pending ignore the bast */
> +		if (lock->ml.cookie == cookie) {
> +			if (lock->unlock_pending)
> +				break;
> +			else
> +				goto do_ast;
> +		}
>  	}
>  
>  	mlog(0, "Got %sast for unknown lock! cookie=%u:%llu, name=%.*s, "
> 
I found that you sent an earlier version on Jan 30, 2012.
https://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008469.html
Compared with the old version, this version only saves a little CPU time,
am I right?


* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Srinivas Eeda @ 2014-01-14  5:32 UTC (permalink / raw)
  To: ocfs2-devel

On 01/13/2014 07:37 AM, Joel Becker wrote:
> On Fri, Jan 10, 2014 at 05:19:13PM -0800, Srinivas Eeda wrote:
>> From: Srinivas Eeda <seeda@srini.(none)>
>>
>> A tiny race between BAST and unlock message causes the NULL dereference.
>>
>> A node sends an unlock request to master and receives a response. Before
>> processing the response it receives a BAST from the master. Since both requests
>> are processed by different threads it creates a race. While the BAST is being
>> processed, lock can get freed by unlock code.
>>
>> This patch makes bast to return immediately if lock is found but unlock is
>> pending. The code should handle this race. We also have to fix master node to
>> skip sending BAST after receiving unlock message.
> Did the master send the BAST after the unlock, or does that race too?
> Does the master know the unlock has succeeded, or does it just think so?
I think it's due to a race, but I haven't debugged the master. My guess
is that the unlock request sneaked in before dlm_flush_asts was called.
However, the non-master node should handle this race as well, so I did
just that part, which fixed the bug we were seeing.


>
>> @@ -385,8 +385,13 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>   		head = &res->granted;
>>   
>>   	list_for_each_entry(lock, head, list) {
>> -		if (lock->ml.cookie == cookie)
>> -			goto do_ast;
>> +		/* if lock is found but unlock is pending ignore the bast */
>> +		if (lock->ml.cookie == cookie) {
>> +			if (lock->unlock_pending)
>> +				break;
>> +			else
>> +				goto do_ast;
>> +		}
> This breaks out for asts as well as basts.  Can't that cause problems
> with the unlock ast expected by the caller?
If unlock_pending is set, then the node is trying to unlock an existing
lock and shouldn't receive any asts?
>
> Joel
>
>


* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Srinivas Eeda @ 2014-01-14  5:33 UTC (permalink / raw)
  To: ocfs2-devel

On 01/13/2014 08:06 PM, Joseph Qi wrote:
> On 2014/1/11 9:19, Srinivas Eeda wrote:
>> From: Srinivas Eeda <seeda@srini.(none)>
>>
>> A tiny race between BAST and unlock message causes the NULL dereference.
>>
>> A node sends an unlock request to master and receives a response. Before
>> processing the response it receives a BAST from the master. Since both requests
>> are processed by different threads it creates a race. While the BAST is being
>> processed, lock can get freed by unlock code.
>>
>> This patch makes bast to return immediately if lock is found but unlock is
>> pending. The code should handle this race. We also have to fix master node to
>> skip sending BAST after receiving unlock message.
>>
>> Below is the crash stack
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
>> IP: [<ffffffffa015e023>] o2dlm_blocking_ast_wrapper+0xd/0x16
>> [<ffffffffa034e3db>] dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
>> [<ffffffffa034f366>] dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
>> [<ffffffffa0308abe>] o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
>> [<ffffffffa030aac8>] o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
>> [<ffffffff81071802>] worker_thread+0x14d/0x1ed
>>
>> Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
>> ---
>>   fs/ocfs2/dlm/dlmast.c |    9 +++++++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>> index b46278f..dbc6cee 100644
>> --- a/fs/ocfs2/dlm/dlmast.c
>> +++ b/fs/ocfs2/dlm/dlmast.c
>> @@ -385,8 +385,13 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>   		head = &res->granted;
>>   
>>   	list_for_each_entry(lock, head, list) {
>> -		if (lock->ml.cookie == cookie)
>> -			goto do_ast;
>> +		/* if lock is found but unlock is pending ignore the bast */
>> +		if (lock->ml.cookie == cookie) {
>> +			if (lock->unlock_pending)
>> +				break;
>> +			else
>> +				goto do_ast;
>> +		}
>>   	}
>>   
>>   	mlog(0, "Got %sast for unknown lock! cookie=%u:%llu, name=%.*s, "
>>
> I found you sent a version on Jan 30, 2012.
> https://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008469.html
> Compared with the old version, this version only saves a little bit CPU,
> am I right?
Yes, you are right. I made the change as Goldwyn suggested, which is a
good thing to have :)
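
The CPU difference between the two versions can be illustrated with a standalone userspace model. This is only a sketch: `walk_lock`, `lookup_combined`, and `lookup_early_break` are hypothetical names, a singly linked list stands in for the kernel's `list_head`, and it assumes that a cookie appears at most once on the list. Both forms return the same lock; they differ only in how many entries are visited when the matching lock has an unlock pending.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in; not the real struct dlm_lock. */
struct walk_lock {
	uint64_t cookie;
	int unlock_pending;
	struct walk_lock *next;
};

/* 2012 form: one combined test, so a pending match does not end the walk. */
static struct walk_lock *lookup_combined(struct walk_lock *head,
					 uint64_t cookie, int *visited)
{
	struct walk_lock *lock;

	*visited = 0;
	for (lock = head; lock != NULL; lock = lock->next) {
		(*visited)++;
		if (lock->cookie == cookie && !lock->unlock_pending)
			return lock;
	}
	return NULL;
}

/* 2014 form: the walk ends at the first cookie match either way. */
static struct walk_lock *lookup_early_break(struct walk_lock *head,
					    uint64_t cookie, int *visited)
{
	struct walk_lock *lock;

	*visited = 0;
	for (lock = head; lock != NULL; lock = lock->next) {
		(*visited)++;
		if (lock->cookie == cookie) {
			if (lock->unlock_pending)
				break;
			return lock;
		}
	}
	return NULL;
}

/* The matching lock is first on the list and has an unlock pending. */
static void lookup_compare_demo(void)
{
	struct walk_lock c = { 3, 0, NULL };
	struct walk_lock b = { 2, 0, &c };
	struct walk_lock a = { 1, 1, &b };
	int n1, n2;

	assert(lookup_combined(&a, 1, &n1) == NULL);
	assert(lookup_early_break(&a, 1, &n2) == NULL);
	assert(n1 == 3 && n2 == 1); /* same answer, shorter walk */
}
```

In the model, the saving is bounded by the length of the list being walked, which is consistent with the "little bit of CPU" description above.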


* [Ocfs2-devel] [PATCH 1/1] o2dlm: fix NULL pointer dereference in o2dlm_blocking_ast_wrapper
From: Srinivas Eeda @ 2012-01-31  7:16 UTC (permalink / raw)
  To: ocfs2-devel

A small race between a BAST and an unlock message causes the NULL pointer
dereference.

A node sends an unlock request to the master and receives a response. Before
processing the response, it receives a BAST from the master. Because the two
messages are processed by different threads, they race: while the BAST is
being processed, the lock can be freed by the unlock code.

This patch makes the BAST handler return immediately if the lock is found but
an unlock is pending. The code should handle this race. We also have to fix
the master node so that it skips sending a BAST after receiving an unlock
message.

Below is the crash stack:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa015e023>] o2dlm_blocking_ast_wrapper+0xd/0x16
[<ffffffffa034e3db>] dlm_do_local_bast+0x8e/0x97 [ocfs2_dlm]
[<ffffffffa034f366>] dlm_proxy_ast_handler+0x838/0x87e [ocfs2_dlm]
[<ffffffffa0308abe>] o2net_process_message+0x395/0x5b8 [ocfs2_nodemanager]
[<ffffffffa030aac8>] o2net_rx_until_empty+0x762/0x90d [ocfs2_nodemanager]
[<ffffffff81071802>] worker_thread+0x14d/0x1ed

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
---
 fs/ocfs2/dlm/dlmast.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
index 3a3ed4b..1281e8a 100644
--- a/fs/ocfs2/dlm/dlmast.c
+++ b/fs/ocfs2/dlm/dlmast.c
@@ -386,8 +386,9 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
 		head = &res->granted;
 
 	list_for_each(iter, head) {
+		/* if lock is found but unlock is pending ignore the bast */
 		lock = list_entry (iter, struct dlm_lock, list);
-		if (lock->ml.cookie == cookie)
+		if ((lock->ml.cookie == cookie) && (!lock->unlock_pending))
 			goto do_ast;
 	}
 
-- 
1.5.4.3

