* [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head
@ 2021-10-20 5:18 Gautham Ananthakrishna
  2021-10-20 8:26 ` Joseph Qi
  0 siblings, 1 reply; 9+ messages in thread
From: Gautham Ananthakrishna @ 2021-10-20 5:18 UTC
To: ocfs2-devel; +Cc: rajesh.sivaramasubramaniom, gautham.ananthakrishna

Encountered a race between ocfs2_test_bg_bit_allocatable() and
jbd2_journal_put_journal_head() resulting in the below vmcore.

PID: 106879  TASK: ffff880244ba9c00  CPU: 2  COMMAND: "loop3"
 0 [ffff8802435ff1c0] panic at ffffffff816ed175
 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9
 2 [ffff8802435ff270] no_context at ffffffff8106eccf
 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d
 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143
 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b
 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f
 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667
   [exception RIP: ocfs2_block_group_find_clear_bits+316]
   RIP: ffffffffc11ef6fc  RSP: ffff8802435ff498  RFLAGS: 00010206
   RAX: 0000000000003918  RBX: 0000000000000001  RCX: 0000000000000018
   RDX: 0000000000003918  RSI: 0000000000000000  RDI: ffff880060194040
   RBP: ffff8802435ff4f8  R8: ffffffffff000000  R9: ffffffffffffffff
   R10: ffff8802435ff730  R11: ffff8802a94e5800  R12: 0000000000000007
   R13: 0000000000007e00  R14: 0000000000003918  R15: ffff88017c973a28
   ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2]
 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2]
10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2]
11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2]
12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2]
13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf [ocfs2]
14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2]
15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2]
16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2]
17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2]
18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2]
19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2]
20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2]
21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2]
22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2]
23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d
24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802
25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2
26 [ffff8802435ffec0] kthread at ffffffff810a7afb
27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), bg_bh->b_private
was NULL because jbd2_journal_put_journal_head() had raced and released the
journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit
needs to be taken to fix this race.

Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
---
 fs/ocfs2/suballoc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 8521942..86f33f2 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
 
+	/* Fast path */
 	if (!buffer_jbd(bg_bh))
 		return 1;
 
+	/* Slow path */
+	jbd_lock_bh_journal_head(bg_bh);
+	if (!buffer_jbd(bg_bh)){
+		jbd_unlock_bh_journal_head(bg_bh);
+		return 1;
+	}
+
 	jh = bh2jh(bg_bh);
 	spin_lock(&jh->b_state_lock);
 	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
@@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	else
 		ret = 1;
 	spin_unlock(&jh->b_state_lock);
+	jbd_unlock_bh_journal_head(bg_bh);
 
 	return ret;
 }
-- 
1.8.3.1

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
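
Editorial illustration (not part of the posted patch): the check, lock, re-check shape that the patch applies can be sketched in userspace C. Everything below is a hypothetical model, assuming C11 atomics: `model_bh`, `model_jh`, `model_lock()`, `model_detach()` and `model_bit_allocatable()` are invented stand-ins for the buffer head, the journal head, the 'BH_JournalHead' bit lock, the release done by jbd2_journal_put_journal_head(), and ocfs2_test_bg_bit_allocatable(). The model leaves out jh->b_state_lock and reference counting, so it is only a sketch of the ordering, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct journal_head. */
struct model_jh {
	const unsigned char *committed_bitmap; /* models jh->b_committed_data */
};

/* Hypothetical stand-in for struct buffer_head. */
struct model_bh {
	atomic_bool has_jh;          /* models the BH_JBD state bit */
	struct model_jh *private;    /* models bh->b_private */
	atomic_flag jh_lock;         /* models the BH_JournalHead bit lock */
	const unsigned char *bitmap; /* models the live group bitmap */
};

static void model_bh_init(struct model_bh *bh, const unsigned char *bitmap,
			  struct model_jh *jh)
{
	atomic_init(&bh->has_jh, jh != NULL);
	bh->private = jh;
	atomic_flag_clear(&bh->jh_lock);
	bh->bitmap = bitmap;
}

static void model_lock(struct model_bh *bh)
{
	while (atomic_flag_test_and_set_explicit(&bh->jh_lock,
						 memory_order_acquire))
		; /* spin until the lock holder clears the flag */
}

static void model_unlock(struct model_bh *bh)
{
	atomic_flag_clear_explicit(&bh->jh_lock, memory_order_release);
}

static bool model_test_bit(size_t nr, const unsigned char *map)
{
	return (map[nr / 8] >> (nr % 8)) & 1u;
}

/* Models the releasing side: flag and pointer are cleared under the lock. */
static void model_detach(struct model_bh *bh)
{
	model_lock(bh);
	atomic_store(&bh->has_jh, false);
	bh->private = NULL;
	model_unlock(bh);
}

static int model_bit_allocatable(struct model_bh *bh, size_t nr)
{
	int ret = 1;

	if (model_test_bit(nr, bh->bitmap))
		return 0;

	/* Fast path: no journal head attached, nothing more to check. */
	if (!atomic_load(&bh->has_jh))
		return 1;

	/* Slow path: re-check under the lock before touching ->private. */
	model_lock(bh);
	if (atomic_load(&bh->has_jh)) {
		const unsigned char *committed = bh->private->committed_bitmap;

		if (committed)
			ret = !model_test_bit(nr, committed);
	}
	model_unlock(bh);

	return ret;
}
```

The unlocked check only skips the lock in the common case; the decision to dereference `private` is made by the second check, which cannot interleave with `model_detach()` because both run under the same lock.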
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head
  2021-10-20 5:18 [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head Gautham Ananthakrishna
@ 2021-10-20 8:26 ` Joseph Qi
  2021-10-20 13:46 ` Gautham Ananthakrishna
  0 siblings, 1 reply; 9+ messages in thread
From: Joseph Qi @ 2021-10-20 8:26 UTC
To: Gautham Ananthakrishna, ocfs2-devel; +Cc: rajesh.sivaramasubramaniom

Hi,

How about making the change like the following?

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 8521942..481017e 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 {
 	struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	struct journal_head *jh;
-	int ret;
+	int ret = 1;
 
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
@@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	if (!buffer_jbd(bg_bh))
 		return 1;
 
-	jh = bh2jh(bg_bh);
-	spin_lock(&jh->b_state_lock);
-	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
-	if (bg)
-		ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
-	else
-		ret = 1;
-	spin_unlock(&jh->b_state_lock);
+	jbd_lock_bh_journal_head(bg_bh);
+	if (buffer_jbd(bg_bh)) {
+		jh = bh2jh(bg_bh);
+		spin_lock(&jh->b_state_lock);
+		bg = (struct ocfs2_group_desc *) jh->b_committed_data;
+		if (bg)
+			ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
+		else
+			ret = 1;
+		spin_unlock(&jh->b_state_lock);
+	}
+	jbd_unlock_bh_journal_head(bg_bh);
 
 	return ret;
 }

On 10/20/21 1:18 PM, Gautham Ananthakrishna wrote:
> Encountered a race between ocfs2_test_bg_bit_allocatable() and
> jbd2_journal_put_journal_head() resulting in the below vmcore.
> [...]
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head
  2021-10-20 8:26 ` Joseph Qi
@ 2021-10-20 13:46 ` Gautham Ananthakrishna
  2021-10-21 7:26 ` Joseph Qi
  0 siblings, 1 reply; 9+ messages in thread
From: Gautham Ananthakrishna @ 2021-10-20 13:46 UTC
To: Joseph Qi, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom

Hi Joseph,

The following would retain the fast path, as per your earlier comment:

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 8521942..048e532 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1251,22 +1251,25 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 {
 	struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	struct journal_head *jh;
-	int ret;
+	int ret = 1;
 
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
 
-	if (!buffer_jbd(bg_bh))
-		return 1;
-
-	jh = bh2jh(bg_bh);
-	spin_lock(&jh->b_state_lock);
-	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
-	if (bg)
-		ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
-	else
-		ret = 1;
-	spin_unlock(&jh->b_state_lock);
+	if (buffer_jbd(bg_bh)) {
+		jbd_lock_bh_journal_head(bg_bh);
+		if (buffer_jbd(bg_bh)){
+			jh = bh2jh(bg_bh);
+			spin_lock(&jh->b_state_lock);
+			bg = (struct ocfs2_group_desc *) jh->b_committed_data;
+			if (bg)
+				ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
+			else
+				ret = 1;
+			spin_unlock(&jh->b_state_lock);
+		}
+		jbd_unlock_bh_journal_head(bg_bh);
+	}
 
 	return ret;
 }

We could also remove the re-initialization of ret = 1 in the 'else' branch,
since 'ret' is already initialized to 1. (Personally, I would prefer to keep
it.)

Could you please take a look and comment?

Thanks,
Gautham.
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head
  2021-10-20 13:46 ` Gautham Ananthakrishna
@ 2021-10-21 7:26 ` Joseph Qi
  2021-10-21 7:30 ` Gautham Ananthakrishna
  0 siblings, 1 reply; 9+ messages in thread
From: Joseph Qi @ 2021-10-21 7:26 UTC
To: Gautham Ananthakrishna, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom

It seems this version has one more level of code indentation.
Is there any issue with my suggested change?

Thanks,
Joseph

On 10/20/21 9:46 PM, Gautham Ananthakrishna wrote:
> Hi Joseph
>
> The following would retain the fast path, as per your earlier comment:
> [...]
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head 2021-10-21 7:26 ` Joseph Qi @ 2021-10-21 7:30 ` Gautham Ananthakrishna 2021-10-21 7:33 ` Joseph Qi 0 siblings, 1 reply; 9+ messages in thread From: Gautham Ananthakrishna @ 2021-10-21 7:30 UTC (permalink / raw) To: Joseph Qi, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom Hi Joseph, No, I don’t see any issue with your suggestion. I thought it didn’t have the fast path which you suggested earlier. Thanks, Gautham. -----Original Message----- From: Joseph Qi <joseph.qi@linux.alibaba.com> Sent: Thursday, October 21, 2021 12:57 PM To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; ocfs2-devel@oss.oracle.com Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com> Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head Seems it has one more code intent level. Is there any issue on my suggested change? 
Thanks, Joseph On 10/20/21 9:46 PM, Gautham Ananthakrishna wrote: > Hi Joseph > > The following would retain the fast path, as per your earlier comment: > > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index > 8521942..048e532 100644 > --- a/fs/ocfs2/suballoc.c > +++ b/fs/ocfs2/suballoc.c > @@ -1251,22 +1251,25 @@ static int > ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { > struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; > struct journal_head *jh; > - int ret; > + int ret = 1; > > if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) > return 0; > > - if (!buffer_jbd(bg_bh)) > - return 1; > - > - jh = bh2jh(bg_bh); > - spin_lock(&jh->b_state_lock); > - bg = (struct ocfs2_group_desc *) jh->b_committed_data; > - if (bg) > - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > - else > - ret = 1; > - spin_unlock(&jh->b_state_lock); > + if (buffer_jbd(bg_bh)) { > + jbd_lock_bh_journal_head(bg_bh); > + if (buffer_jbd(bg_bh)){ > + jh = bh2jh(bg_bh); > + spin_lock(&jh->b_state_lock); > + bg = (struct ocfs2_group_desc *) jh->b_committed_data; > + if (bg) > + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > + else > + ret = 1; > + spin_unlock(&jh->b_state_lock); > + } > + jbd_unlock_bh_journal_head(bg_bh); > + } > > return ret; > } > > We can also remove the re-initialization of ret = 1 in the 'else' part, as 'ret' is already initialized to 1 (However, personally I would like to keep this). > > Could you please take a look and comment? > > Thanks, > Gautham. 
> > -----Original Message----- > From: Joseph Qi <joseph.qi@linux.alibaba.com> > Sent: Wednesday, October 20, 2021 1:57 PM > To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; > ocfs2-devel@oss.oracle.com > Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom > <rajesh.sivaramasubramaniom@oracle.com> > Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks > and release journal_head from buffer_head > > Hi, > > How about make the change like following? > > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index > 8521942..481017e 100644 > --- a/fs/ocfs2/suballoc.c > +++ b/fs/ocfs2/suballoc.c > @@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { > struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; > struct journal_head *jh; > - int ret; > + int ret = 1; > > if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) > return 0; > @@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, > if (!buffer_jbd(bg_bh)) > return 1; > > - jh = bh2jh(bg_bh); > - spin_lock(&jh->b_state_lock); > - bg = (struct ocfs2_group_desc *) jh->b_committed_data; > - if (bg) > - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > - else > - ret = 1; > - spin_unlock(&jh->b_state_lock); > + jbd_lock_bh_journal_head(bg_bh); > + if (buffer_jbd(bg_bh)) { > + jh = bh2jh(bg_bh); > + spin_lock(&jh->b_state_lock); > + bg = (struct ocfs2_group_desc *) jh->b_committed_data; > + if (bg) > + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); > + else > + ret = 1; > + spin_unlock(&jh->b_state_lock); > + } > + jbd_unlock_bh_journal_head(bg_bh); > > return ret; > } > > > On 10/20/21 1:18 PM, Gautham Ananthakrishna wrote: >> Encountered a race between ocfs2_test_bg_bit_allocatable() and >> jbd2_journal_put_journal_head() resulting in the below vmcore. 
>> >> PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3" >> 0 [ffff8802435ff1c0] panic at ffffffff816ed175 >> 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9 >> 2 [ffff8802435ff270] no_context at ffffffff8106eccf >> 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d >> 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143 >> 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b >> 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f >> 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667 >> [exception RIP: ocfs2_block_group_find_clear_bits+316] >> RIP: ffffffffc11ef6fc RSP: ffff8802435ff498 RFLAGS: 00010206 >> RAX: 0000000000003918 RBX: 0000000000000001 RCX: 0000000000000018 >> RDX: 0000000000003918 RSI: 0000000000000000 RDI: ffff880060194040 >> RBP: ffff8802435ff4f8 R8: ffffffffff000000 R9: ffffffffffffffff >> R10: ffff8802435ff730 R11: ffff8802a94e5800 R12: 0000000000000007 >> R13: 0000000000007e00 R14: 0000000000003918 R15: ffff88017c973a28 >> ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b >> 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at >> ffffffffc11ef680 [ocfs2] >> 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 >> [ocfs2] >> 10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2] >> 11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b >> [ocfs2] >> 12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb >> [ocfs2] >> 13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf >> [ocfs2] >> 14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at >> ffffffffc11cc0db [ocfs2] >> 15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at >> ffffffffc11ce53f [ocfs2] >> 16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at >> ffffffffc11f59b5 [ocfs2] >> 17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 >> [ocfs2] >> 18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at >> ffffffffc11dc169 [ocfs2] >> 19 [ffff8802435ff960] 
ocfs2_make_clusters_writable at >> ffffffffc11e4274 [ocfs2] >> 20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2] >> 21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2] >> 22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 >> [ocfs2] >> 23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d >> 24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802 >> 25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2 >> 26 [ffff8802435ffec0] kthread at ffffffff810a7afb >> 27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1 >> >> When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the >> bg_bh->b_private was NULL as jbd2_journal_put_journal_head() raced and >> released the journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit needs to be taken >> to fix this race. >> >> Signed-off-by: Gautham Ananthakrishna >> <gautham.ananthakrishna@oracle.com> >> --- >> fs/ocfs2/suballoc.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c >> index 8521942..86f33f2 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >> return 0; >> >> + /* Fast path */ >> if (!buffer_jbd(bg_bh)) >> return 1; >> >> + /* Slow path */ >> + jbd_lock_bh_journal_head(bg_bh); >> + if (!buffer_jbd(bg_bh)){ >> + jbd_unlock_bh_journal_head(bg_bh); >> + return 1; >> + } >> + >> jh = bh2jh(bg_bh); >> spin_lock(&jh->b_state_lock); >> bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> @@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >> else >> ret = 1; >> spin_unlock(&jh->b_state_lock); >> + jbd_unlock_bh_journal_head(bg_bh); >> >> return ret; >> } >> _______________________________________________ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head 2021-10-21 7:30 ` Gautham Ananthakrishna @ 2021-10-21 7:33 ` Joseph Qi 2021-10-21 7:40 ` Gautham Ananthakrishna 0 siblings, 1 reply; 9+ messages in thread From: Joseph Qi @ 2021-10-21 7:33 UTC (permalink / raw) To: Gautham Ananthakrishna, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom I've kept the 'if (!buffer_jbd(bg_bh))' check; that's exactly the 'fast path'. Thanks, Joseph On 10/21/21 3:30 PM, Gautham Ananthakrishna wrote: > Hi Joseph, > > No, I don’t see any issue with your suggestion. I thought it didn’t have the fast path which you suggested earlier. > > Thanks, > Gautham. > > -----Original Message----- > From: Joseph Qi <joseph.qi@linux.alibaba.com> > Sent: Thursday, October 21, 2021 12:57 PM > To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; ocfs2-devel@oss.oracle.com > Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com> > Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head > > It seems to have one more level of code indentation. > Is there any issue with my suggested change?
> > Thanks, > Joseph > > On 10/20/21 9:46 PM, Gautham Ananthakrishna wrote: >> Hi Joseph >> >> The following would retain the fast path, as per your earlier comment: >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index >> 8521942..048e532 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -1251,22 +1251,25 @@ static int >> ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { >> struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; >> struct journal_head *jh; >> - int ret; >> + int ret = 1; >> >> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >> return 0; >> >> - if (!buffer_jbd(bg_bh)) >> - return 1; >> - >> - jh = bh2jh(bg_bh); >> - spin_lock(&jh->b_state_lock); >> - bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> - if (bg) >> - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> - else >> - ret = 1; >> - spin_unlock(&jh->b_state_lock); >> + if (buffer_jbd(bg_bh)) { >> + jbd_lock_bh_journal_head(bg_bh); >> + if (buffer_jbd(bg_bh)){ >> + jh = bh2jh(bg_bh); >> + spin_lock(&jh->b_state_lock); >> + bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> + if (bg) >> + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> + else >> + ret = 1; >> + spin_unlock(&jh->b_state_lock); >> + } >> + jbd_unlock_bh_journal_head(bg_bh); >> + } >> >> return ret; >> } >> >> We can also remove the re-initialization of ret = 1 in the 'else' part, as 'ret' is already initialized to 1 (However, personally I would like to keep this). >> >> Could you please take a look and comment? >> >> Thanks, >> Gautham. 
>> >> -----Original Message----- >> From: Joseph Qi <joseph.qi@linux.alibaba.com> >> Sent: Wednesday, October 20, 2021 1:57 PM >> To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; >> ocfs2-devel@oss.oracle.com >> Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom >> <rajesh.sivaramasubramaniom@oracle.com> >> Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks >> and release journal_head from buffer_head >> >> Hi, >> >> How about make the change like following? >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index >> 8521942..481017e 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { >> struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; >> struct journal_head *jh; >> - int ret; >> + int ret = 1; >> >> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >> return 0; >> @@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >> if (!buffer_jbd(bg_bh)) >> return 1; >> >> - jh = bh2jh(bg_bh); >> - spin_lock(&jh->b_state_lock); >> - bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> - if (bg) >> - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> - else >> - ret = 1; >> - spin_unlock(&jh->b_state_lock); >> + jbd_lock_bh_journal_head(bg_bh); >> + if (buffer_jbd(bg_bh)) { >> + jh = bh2jh(bg_bh); >> + spin_lock(&jh->b_state_lock); >> + bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> + if (bg) >> + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> + else >> + ret = 1; >> + spin_unlock(&jh->b_state_lock); >> + } >> + jbd_unlock_bh_journal_head(bg_bh); >> >> return ret; >> } >> >> >> On 10/20/21 1:18 PM, Gautham Ananthakrishna wrote: >>> Encountered a race between ocfs2_test_bg_bit_allocatable() and >>> jbd2_journal_put_journal_head() resulting in the below vmcore. 
>>> >>> PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3" >>> 0 [ffff8802435ff1c0] panic at ffffffff816ed175 >>> 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9 >>> 2 [ffff8802435ff270] no_context at ffffffff8106eccf >>> 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d >>> 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143 >>> 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b >>> 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f >>> 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667 >>> [exception RIP: ocfs2_block_group_find_clear_bits+316] >>> RIP: ffffffffc11ef6fc RSP: ffff8802435ff498 RFLAGS: 00010206 >>> RAX: 0000000000003918 RBX: 0000000000000001 RCX: 0000000000000018 >>> RDX: 0000000000003918 RSI: 0000000000000000 RDI: ffff880060194040 >>> RBP: ffff8802435ff4f8 R8: ffffffffff000000 R9: ffffffffffffffff >>> R10: ffff8802435ff730 R11: ffff8802a94e5800 R12: 0000000000000007 >>> R13: 0000000000007e00 R14: 0000000000003918 R15: ffff88017c973a28 >>> ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b >>> 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at >>> ffffffffc11ef680 [ocfs2] >>> 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 >>> [ocfs2] >>> 10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2] >>> 11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b >>> [ocfs2] >>> 12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb >>> [ocfs2] >>> 13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf >>> [ocfs2] >>> 14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at >>> ffffffffc11cc0db [ocfs2] >>> 15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at >>> ffffffffc11ce53f [ocfs2] >>> 16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at >>> ffffffffc11f59b5 [ocfs2] >>> 17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 >>> [ocfs2] >>> 18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at >>> ffffffffc11dc169 [ocfs2] >>> 19 
[ffff8802435ff960] ocfs2_make_clusters_writable at >>> ffffffffc11e4274 [ocfs2] >>> 20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2] >>> 21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2] >>> 22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 >>> [ocfs2] >>> 23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d >>> 24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802 >>> 25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2 >>> 26 [ffff8802435ffec0] kthread at ffffffff810a7afb >>> 27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1 >>> >>> When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the >>> bg_bh->b_private was NULL as jbd2_journal_put_journal_head() raced and >>> released the journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit needs to be taken >>> to fix this race. >>> >>> Signed-off-by: Gautham Ananthakrishna >>> <gautham.ananthakrishna@oracle.com> >>> --- >>> fs/ocfs2/suballoc.c | 9 +++++++++ >>> 1 file changed, 9 insertions(+) >>> >>> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c >>> index 8521942..86f33f2 100644 >>> --- a/fs/ocfs2/suballoc.c >>> +++ b/fs/ocfs2/suballoc.c >>> @@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >>> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >>> return 0; >>> >>> + /* Fast path */ >>> if (!buffer_jbd(bg_bh)) >>> return 1; >>> >>> + /* Slow path */ >>> + jbd_lock_bh_journal_head(bg_bh); >>> + if (!buffer_jbd(bg_bh)){ >>> + jbd_unlock_bh_journal_head(bg_bh); >>> + return 1; >>> + } >>> + >>> jh = bh2jh(bg_bh); >>> spin_lock(&jh->b_state_lock); >>> bg = (struct ocfs2_group_desc *) jh->b_committed_data; >>> @@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >>> else >>> ret = 1; >>> spin_unlock(&jh->b_state_lock); >>> + jbd_unlock_bh_journal_head(bg_bh); >>> >>> return ret; >>> } >>> _______________________________________________
Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head 2021-10-21 7:33 ` Joseph Qi @ 2021-10-21 7:40 ` Gautham Ananthakrishna 0 siblings, 0 replies; 9+ messages in thread From: Gautham Ananthakrishna @ 2021-10-21 7:40 UTC (permalink / raw) To: Joseph Qi, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom Oops, I had missed it. Thank you. I will make the changes and send V2. Thanks, Gautham. -----Original Message----- From: Joseph Qi <joseph.qi@linux.alibaba.com> Sent: Thursday, October 21, 2021 1:03 PM To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; ocfs2-devel@oss.oracle.com Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com> Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head I've kept the 'if (!buffer_jbd(bg_bh))' check; that's exactly the 'fast path'. Thanks, Joseph On 10/21/21 3:30 PM, Gautham Ananthakrishna wrote: > Hi Joseph, > > No, I don’t see any issue with your suggestion. I thought it didn’t have the fast path which you suggested earlier. > > Thanks, > Gautham. > > -----Original Message----- > From: Joseph Qi <joseph.qi@linux.alibaba.com> > Sent: Thursday, October 21, 2021 12:57 PM > To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; > ocfs2-devel@oss.oracle.com > Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom > <rajesh.sivaramasubramaniom@oracle.com> > Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks > and release journal_head from buffer_head > > It seems to have one more level of code indentation. > Is there any issue with my suggested change?
> > Thanks, > Joseph > > On 10/20/21 9:46 PM, Gautham Ananthakrishna wrote: >> Hi Joseph >> >> The following would retain the fast path, as per your earlier comment: >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index >> 8521942..048e532 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -1251,22 +1251,25 @@ static int >> ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { >> struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; >> struct journal_head *jh; >> - int ret; >> + int ret = 1; >> >> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >> return 0; >> >> - if (!buffer_jbd(bg_bh)) >> - return 1; >> - >> - jh = bh2jh(bg_bh); >> - spin_lock(&jh->b_state_lock); >> - bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> - if (bg) >> - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> - else >> - ret = 1; >> - spin_unlock(&jh->b_state_lock); >> + if (buffer_jbd(bg_bh)) { >> + jbd_lock_bh_journal_head(bg_bh); >> + if (buffer_jbd(bg_bh)){ >> + jh = bh2jh(bg_bh); >> + spin_lock(&jh->b_state_lock); >> + bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> + if (bg) >> + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> + else >> + ret = 1; >> + spin_unlock(&jh->b_state_lock); >> + } >> + jbd_unlock_bh_journal_head(bg_bh); >> + } >> >> return ret; >> } >> >> We can also remove the re-initialization of ret = 1 in the 'else' part, as 'ret' is already initialized to 1 (However, personally I would like to keep this). >> >> Could you please take a look and comment? >> >> Thanks, >> Gautham. 
>> >> -----Original Message----- >> From: Joseph Qi <joseph.qi@linux.alibaba.com> >> Sent: Wednesday, October 20, 2021 1:57 PM >> To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; >> ocfs2-devel@oss.oracle.com >> Cc: Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom >> <rajesh.sivaramasubramaniom@oracle.com> >> Subject: Re: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks >> and release journal_head from buffer_head >> >> Hi, >> >> How about make the change like following? >> >> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index >> 8521942..481017e 100644 >> --- a/fs/ocfs2/suballoc.c >> +++ b/fs/ocfs2/suballoc.c >> @@ -1251,7 +1251,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, { >> struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data; >> struct journal_head *jh; >> - int ret; >> + int ret = 1; >> >> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >> return 0; >> @@ -1259,14 +1259,18 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >> if (!buffer_jbd(bg_bh)) >> return 1; >> >> - jh = bh2jh(bg_bh); >> - spin_lock(&jh->b_state_lock); >> - bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> - if (bg) >> - ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> - else >> - ret = 1; >> - spin_unlock(&jh->b_state_lock); >> + jbd_lock_bh_journal_head(bg_bh); >> + if (buffer_jbd(bg_bh)) { >> + jh = bh2jh(bg_bh); >> + spin_lock(&jh->b_state_lock); >> + bg = (struct ocfs2_group_desc *) jh->b_committed_data; >> + if (bg) >> + ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap); >> + else >> + ret = 1; >> + spin_unlock(&jh->b_state_lock); >> + } >> + jbd_unlock_bh_journal_head(bg_bh); >> >> return ret; >> } >> >> >> On 10/20/21 1:18 PM, Gautham Ananthakrishna wrote: >>> Encountered a race between ocfs2_test_bg_bit_allocatable() and >>> jbd2_journal_put_journal_head() resulting in the below vmcore. 
>>> >>> PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3" >>> 0 [ffff8802435ff1c0] panic at ffffffff816ed175 >>> 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9 >>> 2 [ffff8802435ff270] no_context at ffffffff8106eccf >>> 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d >>> 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143 >>> 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b >>> 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f >>> 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667 >>> [exception RIP: ocfs2_block_group_find_clear_bits+316] >>> RIP: ffffffffc11ef6fc RSP: ffff8802435ff498 RFLAGS: 00010206 >>> RAX: 0000000000003918 RBX: 0000000000000001 RCX: 0000000000000018 >>> RDX: 0000000000003918 RSI: 0000000000000000 RDI: ffff880060194040 >>> RBP: ffff8802435ff4f8 R8: ffffffffff000000 R9: ffffffffffffffff >>> R10: ffff8802435ff730 R11: ffff8802a94e5800 R12: 0000000000000007 >>> R13: 0000000000007e00 R14: 0000000000003918 R15: ffff88017c973a28 >>> ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b >>> 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at >>> ffffffffc11ef680 [ocfs2] >>> 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 >>> [ocfs2] >>> 10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2] >>> 11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b >>> [ocfs2] >>> 12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb >>> [ocfs2] >>> 13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf >>> [ocfs2] >>> 14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at >>> ffffffffc11cc0db [ocfs2] >>> 15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at >>> ffffffffc11ce53f [ocfs2] >>> 16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at >>> ffffffffc11f59b5 [ocfs2] >>> 17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 >>> [ocfs2] >>> 18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at >>> ffffffffc11dc169 [ocfs2] >>> 19 
[ffff8802435ff960] ocfs2_make_clusters_writable at >>> ffffffffc11e4274 [ocfs2] >>> 20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2] >>> 21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2] >>> 22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 >>> [ocfs2] >>> 23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d >>> 24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802 >>> 25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2 >>> 26 [ffff8802435ffec0] kthread at ffffffff810a7afb >>> 27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1 >>> >>> When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the >>> bg_bh->b_private was NULL as jbd2_journal_put_journal_head() raced and >>> released the journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit needs to be taken >>> to fix this race. >>> >>> Signed-off-by: Gautham Ananthakrishna >>> <gautham.ananthakrishna@oracle.com> >>> --- >>> fs/ocfs2/suballoc.c | 9 +++++++++ >>> 1 file changed, 9 insertions(+) >>> >>> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c >>> index 8521942..86f33f2 100644 >>> --- a/fs/ocfs2/suballoc.c >>> +++ b/fs/ocfs2/suballoc.c >>> @@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >>> if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) >>> return 0; >>> >>> + /* Fast path */ >>> if (!buffer_jbd(bg_bh)) >>> return 1; >>> >>> + /* Slow path */ >>> + jbd_lock_bh_journal_head(bg_bh); >>> + if (!buffer_jbd(bg_bh)){ >>> + jbd_unlock_bh_journal_head(bg_bh); >>> + return 1; >>> + } >>> + >>> jh = bh2jh(bg_bh); >>> spin_lock(&jh->b_state_lock); >>> bg = (struct ocfs2_group_desc *) jh->b_committed_data; >>> @@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, >>> else >>> ret = 1; >>> spin_unlock(&jh->b_state_lock); >>> + jbd_unlock_bh_journal_head(bg_bh); >>> >>> return ret; >>> } >>> _______________________________________________
Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel ^ permalink raw reply [flat|nested] 9+ messages in thread
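The check / lock / re-check shape discussed in the messages above can be modelled in plain userspace C. This is only a rough sketch, not kernel code and not the V2 patch: every model_* name is hypothetical, an atomic_bool stands in for the BH_JBD flag tested by buffer_jbd(), an atomic_flag spin lock stands in for the BH_JournalHead bit lock, and the releasing side is assumed to clear the flag and the pointer under that same lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a journal head (not the kernel struct). */
struct model_jh {
    const unsigned long *committed_bitmap; /* NULL if no committed copy */
};

/* Hypothetical stand-in for a buffer head (not the kernel struct). */
struct model_bh {
    atomic_bool has_jh;          /* models the BH_JBD flag */
    atomic_flag jh_lock;         /* models the BH_JournalHead bit lock */
    struct model_jh *jh;         /* models b_private */
    const unsigned long *bitmap; /* models the live group bitmap */
};

void model_bh_init(struct model_bh *bh, const unsigned long *bitmap,
                   struct model_jh *jh)
{
    atomic_init(&bh->has_jh, jh != NULL);
    atomic_flag_clear(&bh->jh_lock);
    bh->jh = jh;
    bh->bitmap = bitmap;
}

void model_lock_jh(struct model_bh *bh)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&bh->jh_lock,
                                             memory_order_acquire))
        ;
}

void model_unlock_jh(struct model_bh *bh)
{
    atomic_flag_clear_explicit(&bh->jh_lock, memory_order_release);
}

int model_test_bit(unsigned int nr, const unsigned long *map)
{
    const unsigned int bits = 8 * sizeof(unsigned long);

    return (int)((map[nr / bits] >> (nr % bits)) & 1UL);
}

/* Returns 1 if bit nr may be allocated, 0 otherwise. */
int model_bit_allocatable(struct model_bh *bh, unsigned int nr)
{
    int ret = 1;

    assert(bh != NULL);
    if (model_test_bit(nr, bh->bitmap))
        return 0;

    /* Fast path: no journal head attached, nothing more to check. */
    if (!atomic_load(&bh->has_jh))
        return 1;

    /* Slow path: re-check under the lock before touching bh->jh. */
    model_lock_jh(bh);
    if (atomic_load(&bh->has_jh)) {
        const struct model_jh *jh = bh->jh;

        if (jh->committed_bitmap)
            ret = !model_test_bit(nr, jh->committed_bitmap);
    }
    model_unlock_jh(bh);

    return ret;
}

/* Releasing side: detach the journal head under the same lock. */
void model_put_jh(struct model_bh *bh)
{
    model_lock_jh(bh);
    atomic_store(&bh->has_jh, false);
    bh->jh = NULL;
    model_unlock_jh(bh);
}
```

The unlocked test keeps the common case cheap, while the second test under the lock is what prevents a dereference of a pointer that the releasing side has already cleared.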
* [Ocfs2-devel] [PATCH V1 RFC 1/1] Subject: [[PATCH V1 RFC] 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head @ 2021-10-20 5:15 Gautham Ananthakrishna 2021-10-20 5:15 ` [Ocfs2-devel] [PATCH V1 RFC " Gautham Ananthakrishna 0 siblings, 1 reply; 9+ messages in thread From: Gautham Ananthakrishna @ 2021-10-20 5:15 UTC (permalink / raw) To: ocfs2-devel; +Cc: rajesh.sivaramasubramaniom, gautham.ananthakrishna Encountered a race between ocfs2_test_bg_bit_allocatable() and jbd2_journal_put_journal_head() resulting in the below vmcore. PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3" 0 [ffff8802435ff1c0] panic at ffffffff816ed175 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9 2 [ffff8802435ff270] no_context at ffffffff8106eccf 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667 [exception RIP: ocfs2_block_group_find_clear_bits+316] RIP: ffffffffc11ef6fc RSP: ffff8802435ff498 RFLAGS: 00010206 RAX: 0000000000003918 RBX: 0000000000000001 RCX: 0000000000000018 RDX: 0000000000003918 RSI: 0000000000000000 RDI: ffff880060194040 RBP: ffff8802435ff4f8 R8: ffffffffff000000 R9: ffffffffffffffff R10: ffff8802435ff730 R11: ffff8802a94e5800 R12: 0000000000000007 R13: 0000000000007e00 R14: 0000000000003918 R15: ffff88017c973a28 ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2] 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2] 10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2] 11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2] 12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2] 13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf 
[ocfs2] 14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2] 15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2] 16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2] 17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2] 18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2] 19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2] 20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2] 21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2] 22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2] 23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d 24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802 25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2 26 [ffff8802435ffec0] kthread at ffffffff810a7afb 27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1 When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the bg_bh->b_private was NULL as jbd2_journal_put_journal_head() raced and released the journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit needs to be taken to fix this race.
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> --- fs/ocfs2/suballoc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 8521942..86f33f2 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) return 0; + /* Fast path */ if (!buffer_jbd(bg_bh)) return 1; + /* Slow path */ + jbd_lock_bh_journal_head(bg_bh); + if (!buffer_jbd(bg_bh)){ + jbd_unlock_bh_journal_head(bg_bh); + return 1; + } + jh = bh2jh(bg_bh); spin_lock(&jh->b_state_lock); bg = (struct ocfs2_group_desc *) jh->b_committed_data; @@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, else ret = 1; spin_unlock(&jh->b_state_lock); + jbd_unlock_bh_journal_head(bg_bh); return ret; } -- 1.8.3.1 _______________________________________________ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head 2021-10-20 5:15 [Ocfs2-devel] [PATCH V1 RFC 1/1] Subject: [[PATCH V1 RFC] " Gautham Ananthakrishna @ 2021-10-20 5:15 ` Gautham Ananthakrishna 2021-10-20 5:17 ` Gautham Ananthakrishna 0 siblings, 1 reply; 9+ messages in thread From: Gautham Ananthakrishna @ 2021-10-20 5:15 UTC (permalink / raw) To: ocfs2-devel; +Cc: rajesh.sivaramasubramaniom, gautham.ananthakrishna Encountered a race between ocfs2_test_bg_bit_allocatable() and jbd2_journal_put_journal_head() resulting in the below vmcore. PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3" 0 [ffff8802435ff1c0] panic at ffffffff816ed175 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9 2 [ffff8802435ff270] no_context at ffffffff8106eccf 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667 [exception RIP: ocfs2_block_group_find_clear_bits+316] RIP: ffffffffc11ef6fc RSP: ffff8802435ff498 RFLAGS: 00010206 RAX: 0000000000003918 RBX: 0000000000000001 RCX: 0000000000000018 RDX: 0000000000003918 RSI: 0000000000000000 RDI: ffff880060194040 RBP: ffff8802435ff4f8 R8: ffffffffff000000 R9: ffffffffffffffff R10: ffff8802435ff730 R11: ffff8802a94e5800 R12: 0000000000000007 R13: 0000000000007e00 R14: 0000000000003918 R15: ffff88017c973a28 ORIG_RAX: ffffffffffffffff CS: e030 SS: e02b 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2] 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2] 10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2] 11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2] 12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2] 13 [ffff8802435ff770] 
ocfs2_claim_clusters at ffffffffc11f5caf [ocfs2] 14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2] 15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2] 16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2] 17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2] 18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2] 19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2] 20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2] 21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2] 22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2] 23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d 24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802 25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2 26 [ffff8802435ffec0] kthread at ffffffff810a7afb 27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1 When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the bg_bh->b_private was NULL as jbd2_journal_put_journal_head() raced and released the journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit needs to be taken to fix this race.
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> --- fs/ocfs2/suballoc.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 8521942..86f33f2 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap)) return 0; + /* Fast path */ if (!buffer_jbd(bg_bh)) return 1; + /* Slow path */ + jbd_lock_bh_journal_head(bg_bh); + if (!buffer_jbd(bg_bh)){ + jbd_unlock_bh_journal_head(bg_bh); + return 1; + } + jh = bh2jh(bg_bh); spin_lock(&jh->b_state_lock); bg = (struct ocfs2_group_desc *) jh->b_committed_data; @@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh, else ret = 1; spin_unlock(&jh->b_state_lock); + jbd_unlock_bh_journal_head(bg_bh); return ret; } -- 1.8.3.1 _______________________________________________ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head 2021-10-20 5:15 ` [Ocfs2-devel] [PATCH V1 RFC " Gautham Ananthakrishna @ 2021-10-20 5:17 ` Gautham Ananthakrishna 0 siblings, 0 replies; 9+ messages in thread From: Gautham Ananthakrishna @ 2021-10-20 5:17 UTC (permalink / raw) To: Gautham Ananthakrishna, ocfs2-devel; +Cc: Rajesh Sivaramasubramaniom Please ignore this patch I will be resending it Thanks, Gautham. -----Original Message----- From: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Sent: Wednesday, October 20, 2021 10:45 AM To: ocfs2-devel@oss.oracle.com Cc: joseph.qi@linux.alibaba.com; Junxiao Bi <junxiao.bi@oracle.com>; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>; Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com> Subject: [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head Encountered a race between ocfs2_test_bg_bit_allocatable() and jbd2_journal_put_journal_head() resulting in the below vmcore. 

PID: 106879  TASK: ffff880244ba9c00  CPU: 2  COMMAND: "loop3"
 0 [ffff8802435ff1c0] panic at ffffffff816ed175
 1 [ffff8802435ff240] oops_end at ffffffff8101a7c9
 2 [ffff8802435ff270] no_context at ffffffff8106eccf
 3 [ffff8802435ff2e0] __bad_area_nosemaphore at ffffffff8106ef9d
 4 [ffff8802435ff330] bad_area_nosemaphore at ffffffff8106f143
 5 [ffff8802435ff340] __do_page_fault at ffffffff8106f80b
 6 [ffff8802435ff3a0] do_page_fault at ffffffff8106fc2f
 7 [ffff8802435ff3e0] page_fault at ffffffff816fd667
    [exception RIP: ocfs2_block_group_find_clear_bits+316]
    RIP: ffffffffc11ef6fc  RSP: ffff8802435ff498  RFLAGS: 00010206
    RAX: 0000000000003918  RBX: 0000000000000001  RCX: 0000000000000018
    RDX: 0000000000003918  RSI: 0000000000000000  RDI: ffff880060194040
    RBP: ffff8802435ff4f8   R8: ffffffffff000000   R9: ffffffffffffffff
    R10: ffff8802435ff730  R11: ffff8802a94e5800  R12: 0000000000000007
    R13: 0000000000007e00  R14: 0000000000003918  R15: ffff88017c973a28
    ORIG_RAX: ffffffffffffffff  CS: e030  SS: e02b
 8 [ffff8802435ff490] ocfs2_block_group_find_clear_bits at ffffffffc11ef680 [ocfs2]
 9 [ffff8802435ff500] ocfs2_cluster_group_search at ffffffffc11ef916 [ocfs2]
10 [ffff8802435ff580] ocfs2_search_chain at ffffffffc11f0fb6 [ocfs2]
11 [ffff8802435ff660] ocfs2_claim_suballoc_bits at ffffffffc11f1b1b [ocfs2]
12 [ffff8802435ff6f0] __ocfs2_claim_clusters at ffffffffc11f32cb [ocfs2]
13 [ffff8802435ff770] ocfs2_claim_clusters at ffffffffc11f5caf [ocfs2]
14 [ffff8802435ff780] ocfs2_local_alloc_slide_window at ffffffffc11cc0db [ocfs2]
15 [ffff8802435ff820] ocfs2_reserve_local_alloc_bits at ffffffffc11ce53f [ocfs2]
16 [ffff8802435ff890] ocfs2_reserve_clusters_with_limit at ffffffffc11f59b5 [ocfs2]
17 [ffff8802435ff8e0] ocfs2_reserve_clusters at ffffffffc11f5c88 [ocfs2]
18 [ffff8802435ff8f0] ocfs2_lock_refcount_allocators at ffffffffc11dc169 [ocfs2]
19 [ffff8802435ff960] ocfs2_make_clusters_writable at ffffffffc11e4274 [ocfs2]
20 [ffff8802435ffa50] ocfs2_replace_cow at ffffffffc11e4df1 [ocfs2]
21 [ffff8802435ffac0] ocfs2_refcount_cow at ffffffffc11e54b1 [ocfs2]
22 [ffff8802435ffb80] ocfs2_file_write_iter at ffffffffc11bf8f4 [ocfs2]
23 [ffff8802435ffcd0] lo_rw_aio at ffffffff814a1b5d
24 [ffff8802435ffd80] loop_queue_work at ffffffff814a2802
25 [ffff8802435ffe60] kthread_worker_fn at ffffffff810a80d2
26 [ffff8802435ffec0] kthread at ffffffff810a7afb
27 [ffff8802435fff50] ret_from_fork at ffffffff816f7da1

When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), bg_bh->b_private
was NULL because jbd2_journal_put_journal_head() had raced and released the
journal head from the buffer head. The bit lock for the 'BH_JournalHead' bit
needs to be taken to fix this race.

Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
---
 fs/ocfs2/suballoc.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 8521942..86f33f2 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1256,9 +1256,17 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
 
+	/* Fast path */
 	if (!buffer_jbd(bg_bh))
 		return 1;
 
+	/* Slow path */
+	jbd_lock_bh_journal_head(bg_bh);
+	if (!buffer_jbd(bg_bh)){
+		jbd_unlock_bh_journal_head(bg_bh);
+		return 1;
+	}
+
 	jh = bh2jh(bg_bh);
 	spin_lock(&jh->b_state_lock);
 	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
@@ -1267,6 +1275,7 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 	else
 		ret = 1;
 	spin_unlock(&jh->b_state_lock);
+	jbd_unlock_bh_journal_head(bg_bh);
 
 	return ret;
 }
-- 
1.8.3.1

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
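
The check-then-recheck pattern that the commit message describes can be illustrated outside the kernel. The following is only a userspace sketch under stated assumptions, not the patch itself and not kernel code: fake_bh, fake_jh, bit_allocatable() and detach_journal_head() are hypothetical names, a pthread mutex stands in for the 'BH_JournalHead' bit lock, and an atomic pointer stands in for bg_bh->b_private. It shows why the journal head must be re-tested after the lock is taken before it is dereferenced.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a journal head; not the kernel struct. */
struct fake_jh {
	const unsigned char *committed_bitmap; /* models jh->b_committed_data */
};

/* Hypothetical stand-in for a buffer head; not the kernel struct. */
struct fake_bh {
	pthread_mutex_t jh_lock;     /* models the BH_JournalHead bit lock */
	_Atomic(struct fake_jh *) jh; /* models b_private; NULL once detached */
};

static bool bit_is_set(const unsigned char *map, unsigned int nr)
{
	return (map[nr / 8] >> (nr % 8)) & 1u;
}

/* Returns 1 if bit nr may be allocated, 0 otherwise. */
static int bit_allocatable(struct fake_bh *bh, const unsigned char *bitmap,
			   unsigned int nr)
{
	struct fake_jh *jh;
	int ret = 1;

	assert(bh != NULL && bitmap != NULL);

	if (bit_is_set(bitmap, nr))
		return 0;

	/* Fast path: no journal head attached, nothing more to check. */
	if (atomic_load(&bh->jh) == NULL)
		return 1;

	/* Slow path: re-check under the lock, since a concurrent
	 * detach_journal_head() may have run after the fast-path test. */
	pthread_mutex_lock(&bh->jh_lock);
	jh = atomic_load(&bh->jh);
	if (jh != NULL && jh->committed_bitmap != NULL)
		ret = !bit_is_set(jh->committed_bitmap, nr);
	pthread_mutex_unlock(&bh->jh_lock);

	return ret;
}

/* Models the releasing side: the pointer is cleared only under the lock. */
static void detach_journal_head(struct fake_bh *bh)
{
	pthread_mutex_lock(&bh->jh_lock);
	atomic_store(&bh->jh, NULL);
	pthread_mutex_unlock(&bh->jh_lock);
}
```

In this sketch the dereference of the journal head happens only while the lock is held and only after the pointer has been re-read, so a concurrent detach_journal_head() either completes before the re-check (and the function returns 1) or waits until the bitmap test is done. Lock ordering against other kernel locks, such as b_state_lock, is outside what this userspace model covers.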
end of thread, other threads:[~2021-10-21 7:40 UTC | newest]

Thread overview: 9+ messages
2021-10-20  5:18 [Ocfs2-devel] [PATCH V1 RFC 1/1] ocfs2: race between searching chunks and release journal_head from buffer_head Gautham Ananthakrishna
2021-10-20  8:26 ` Joseph Qi
2021-10-20 13:46 ` Gautham Ananthakrishna
2021-10-21  7:26 ` Joseph Qi
2021-10-21  7:30 ` Gautham Ananthakrishna
2021-10-21  7:33 ` Joseph Qi
2021-10-21  7:40 ` Gautham Ananthakrishna
-- strict thread matches above, loose matches on Subject: below --
2021-10-20  5:15 [Ocfs2-devel] [PATCH V1 RFC 1/1] Subject: [[PATCH V1 RFC] " Gautham Ananthakrishna
2021-10-20  5:15 ` [Ocfs2-devel] [PATCH V1 RFC " Gautham Ananthakrishna
2021-10-20  5:17 ` Gautham Ananthakrishna