ocfs2-devel.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH RFC 1/1] ocfs2: fix a deadlock caused by commit 6f1b228529ae49b0
@ 2022-01-20  6:07 Gautham Ananthakrishna via Ocfs2-devel
  2022-01-20  8:06 ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 2+ messages in thread
From: Gautham Ananthakrishna via Ocfs2-devel @ 2022-01-20  6:07 UTC (permalink / raw)
  To: gautham.ananthakrishna, junxiao.bi, rajesh.sivaramasubramaniom,
	joseph.qi, ocfs2-devel

commit 6f1b228529ae49b0f85ab89bcdb6c365df401558 caused a deadlock
which was uncovered by our internal tests. The deadlock is as foollows:

Task 1:
A1) jbd2_journal_commit_transaction()
A2)         spin_lock(&jh->b_state_lock);
A3)         __jbd2_journal_remove_checkpoint()
A4)                         jbd2_journal_put_journal_head()
A5)                                         jbd_lock_bh_journal_head()

Task 2:
B1) ocfs2_test_bg_bit_allocatable()
B2)         jbd_lock_bh_journal_head()
B3)         spin_lock(&jh->b_state_lock);

A1->A2->A3->A4->B1->B2->B3(blocked)->A5(blocked)

Now cause process 2 has the jbd lock, it doesn’t let process 1 to continue after A5.
Process 2 now is waiting for b_state_lock to be released by process 1.

This patch resolves the deadlock.

Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
---
 fs/ocfs2/suballoc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 481017e..fd995870 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1259,7 +1259,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	if (!buffer_jbd(bg_bh))
 		return 1;
 
-	jbd_lock_bh_journal_head(bg_bh);
+	jbd2_journal_grab_journal_head(bg_bh);
 	if (buffer_jbd(bg_bh)) {
 		jh = bh2jh(bg_bh);
 		spin_lock(&jh->b_state_lock);
@@ -1270,7 +1270,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 			ret = 1;
 		spin_unlock(&jh->b_state_lock);
 	}
-	jbd_unlock_bh_journal_head(bg_bh);
+	jbd2_journal_put_journal_head(bg_bh);
 
 	return ret;
 }
-- 
1.8.3.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [Ocfs2-devel] [PATCH RFC 1/1] ocfs2: fix a deadlock caused by commit 6f1b228529ae49b0
  2022-01-20  6:07 [Ocfs2-devel] [PATCH RFC 1/1] ocfs2: fix a deadlock caused by commit 6f1b228529ae49b0 Gautham Ananthakrishna via Ocfs2-devel
@ 2022-01-20  8:06 ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 2+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-01-20  8:06 UTC (permalink / raw)
  To: Gautham Ananthakrishna, junxiao.bi, rajesh.sivaramasubramaniom,
	ocfs2-devel



On 1/20/22 2:07 PM, Gautham Ananthakrishna wrote:
> commit 6f1b228529ae49b0f85ab89bcdb6c365df401558 caused a deadlock
> which was uncovered by our internal tests. The deadlock is as foollows:

typo, s/foollows/follows
> 
> Task 1:
> A1) jbd2_journal_commit_transaction()
> A2)         spin_lock(&jh->b_state_lock);
> A3)         __jbd2_journal_remove_checkpoint()
> A4)                         jbd2_journal_put_journal_head()
> A5)                                         jbd_lock_bh_journal_head()
> 
> Task 2:
> B1) ocfs2_test_bg_bit_allocatable()
> B2)         jbd_lock_bh_journal_head()
> B3)         spin_lock(&jh->b_state_lock);
> 
> A1->A2->A3->A4->B1->B2->B3(blocked)->A5(blocked)
> 
> Now cause process 2 has the jbd lock, it doesn’t let process 1 to continue after A5.
> Process 2 now is waiting for b_state_lock to be released by process 1.
> 
> This patch resolves the deadlock.
> 
> Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
> ---
>  fs/ocfs2/suballoc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> index 481017e..fd995870 100644
> --- a/fs/ocfs2/suballoc.c
> +++ b/fs/ocfs2/suballoc.c
> @@ -1259,7 +1259,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
>  	if (!buffer_jbd(bg_bh))
>  		return 1;
>  
> -	jbd_lock_bh_journal_head(bg_bh);
> +	jbd2_journal_grab_journal_head(bg_bh);

Seems we have to check the returned 'jh' here.
If NULL, we can finish the bit test earlier, else we can use this 'jh' directly.
And the following check without bh lock is unsafe.

Thanks,
Joseph

>  	if (buffer_jbd(bg_bh)) {
>  		jh = bh2jh(bg_bh);
>  		spin_lock(&jh->b_state_lock);
> @@ -1270,7 +1270,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
>  			ret = 1;
>  		spin_unlock(&jh->b_state_lock);
>  	}
> -	jbd_unlock_bh_journal_head(bg_bh);
> +	jbd2_journal_put_journal_head(bg_bh);
>  
>  	return ret;
>  }

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-01-20  8:07 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-20  6:07 [Ocfs2-devel] [PATCH RFC 1/1] ocfs2: fix a deadlock caused by commit 6f1b228529ae49b0 Gautham Ananthakrishna via Ocfs2-devel
2022-01-20  8:06 ` Joseph Qi via Ocfs2-devel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).