* [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps @ 2019-07-20 6:05 Zhihao Cheng 2019-07-27 11:09 ` 答复: " chengzhihao 2019-07-29 16:51 ` Richard Weinberger 0 siblings, 2 replies; 14+ messages in thread From: Zhihao Cheng @ 2019-07-20 6:05 UTC (permalink / raw) To: richard, s.hauer, dedekind1, yi.zhang Cc: linux-mtd, linux-kernel, chengzhihao1 Running stress-test test_2 in mtd-utils on ubi device, sometimes we can get following oops message: BUG: unable to handle page fault for address: ffffffff00000140 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 280a067 P4D 280a067 PUD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-ubifs_0_0) RIP: 0010:rb_next_postorder+0x2e/0xb0 Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 RSP: 0018:ffffc90000887758 EFLAGS: 00010202 RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001 RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400 RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000 R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000 R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8 FS: 0000000000000000(0000) GS:ffff88813ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: destroy_old_idx+0x5d/0xa0 [ubifs] ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] do_commit+0x3eb/0x830 [ubifs] ubifs_run_commit+0xdc/0x1c0 [ubifs] Above Oops are due to the slab-out-of-bounds happened in do-while of function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In function layout_in_gaps, there is a do-while loop placing index nodes into the gaps created by obsolete index nodes in non-empty index LEBs until rest index nodes can totally be placed into pre-allocated empty LEBs. @c->gap_lebs points to a memory area(integer array) which records LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB is found, corresponding lnum will be incrementally written into the memory area pointed by @c->gap_lebs. The size ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before do-while loop and can not be changed in the loop. But @c->lst.idx_lebs could be increased by function ubifs_change_lp (called by layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the loop. So, sometimes oob happens when number of cycles in do-while loop exceeds the original value of @c->lst.idx_lebs. See detail in https://bugzilla.kernel.org/show_bug.cgi?id=204229. This patch fixes oob in layout_in_gaps. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> --- fs/ubifs/tnc_commit.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c index a384a0f..234be1c 100644 --- a/fs/ubifs/tnc_commit.c +++ b/fs/ubifs/tnc_commit.c @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, /** * layout_leb_in_gaps - layout index nodes using in-the-gaps method. * @c: UBIFS file-system description object - * @p: return LEB number here + * @p: return LEB number in @c->gap_lebs[p] * * This function lays out new index nodes for dirty znodes using in-the-gaps * method of TNC commit. @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, * This function returns the number of index nodes written into the gaps, or a * negative error code on failure. */ -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) +static int layout_leb_in_gaps(struct ubifs_info *c, int p) { struct ubifs_scan_leb *sleb; struct ubifs_scan_node *snod; @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) * filled, however we do not check there at present. */ return lnum; /* Error code */ - *p = lnum; + c->gap_lebs[p] = lnum; dbg_gc("LEB %d", lnum); /* * Scan the index LEB. We use the generic scan for this even though @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) */ static int layout_in_gaps(struct ubifs_info *c, int cnt) { - int err, leb_needed_cnt, written, *p; + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, *gap_lebs; dbg_gc("%d znodes to write", cnt); @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) if (!c->gap_lebs) return -ENOMEM; - p = c->gap_lebs; + old_idx_lebs = c->lst.idx_lebs; do { - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); + ubifs_assert(c, p < c->lst.idx_lebs); written = layout_leb_in_gaps(c, p); if (written < 0) { err = written; @@ -392,9 +392,29 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) leb_needed_cnt = get_leb_cnt(c, cnt); dbg_gc("%d znodes remaining, need %d LEBs, have %d", cnt, leb_needed_cnt, c->ileb_cnt); + /* + * Dynamically change the size of @c->gap_lebs to prevent + * oob, because @c->lst.idx_lebs could be increased by + * function @get_idx_gc_leb (called by layout_leb_in_gaps-> + * ubifs_find_dirty_idx_leb) during loop. Only enlarge + * @c->gap_lebs when needed. + * + */ + if (leb_needed_cnt > c->ileb_cnt && p >= old_idx_lebs && + old_idx_lebs < c->lst.idx_lebs) { + old_idx_lebs = c->lst.idx_lebs; + gap_lebs = krealloc(c->gap_lebs, sizeof(int) * + (old_idx_lebs + 1), GFP_NOFS); + if (!gap_lebs) { + kfree(c->gap_lebs); + c->gap_lebs = NULL; + return -ENOMEM; + } + c->gap_lebs = gap_lebs; + } } while (leb_needed_cnt > c->ileb_cnt); - *p = -1; + c->gap_lebs[p] = -1; return 0; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-20 6:05 [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps Zhihao Cheng @ 2019-07-27 11:09 ` chengzhihao 2019-07-27 11:13 ` Richard Weinberger 2019-07-29 16:51 ` Richard Weinberger 1 sibling, 1 reply; 14+ messages in thread From: chengzhihao @ 2019-07-27 11:09 UTC (permalink / raw) To: richard, s.hauer, dedekind1, zhangyi (F); +Cc: linux-mtd, linux-kernel ping -----邮件原件----- 发件人: chengzhihao 发送时间: 2019年7月20日 14:05 收件人: richard@nod.at; s.hauer@pengutronix.de; dedekind1@gmail.com; zhangyi (F) <yi.zhang@huawei.com> 抄送: linux-mtd@lists.infradead.org; linux-kernel@vger.kernel.org; chengzhihao <chengzhihao1@huawei.com> 主题: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps Running stress-test test_2 in mtd-utils on ubi device, sometimes we can get following oops message: BUG: unable to handle page fault for address: ffffffff00000140 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 280a067 P4D 280a067 PUD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-ubifs_0_0) RIP: 0010:rb_next_postorder+0x2e/0xb0 Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 RSP: 0018:ffffc90000887758 EFLAGS: 00010202 RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001 RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400 RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000 R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000 R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8 FS: 0000000000000000(0000) GS:ffff88813ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: destroy_old_idx+0x5d/0xa0 [ubifs] ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] do_commit+0x3eb/0x830 [ubifs] ubifs_run_commit+0xdc/0x1c0 [ubifs] Above Oops are due to the slab-out-of-bounds happened in do-while of function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In function layout_in_gaps, there is a do-while loop placing index nodes into the gaps created by obsolete index nodes in non-empty index LEBs until rest index nodes can totally be placed into pre-allocated empty LEBs. @c->gap_lebs points to a memory area(integer array) which records LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB is found, corresponding lnum will be incrementally written into the memory area pointed by @c->gap_lebs. The size ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before do-while loop and can not be changed in the loop. But @c->lst.idx_lebs could be increased by function ubifs_change_lp (called by layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the loop. So, sometimes oob happens when number of cycles in do-while loop exceeds the original value of @c->lst.idx_lebs. See detail in https://bugzilla.kernel.org/show_bug.cgi?id=204229. This patch fixes oob in layout_in_gaps. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> --- fs/ubifs/tnc_commit.c | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c index a384a0f..234be1c 100644 --- a/fs/ubifs/tnc_commit.c +++ b/fs/ubifs/tnc_commit.c @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, /** * layout_leb_in_gaps - layout index nodes using in-the-gaps method. * @c: UBIFS file-system description object - * @p: return LEB number here + * @p: return LEB number in @c->gap_lebs[p] * * This function lays out new index nodes for dirty znodes using in-the-gaps * method of TNC commit. @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, * This function returns the number of index nodes written into the gaps, or a * negative error code on failure. */ -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) +static int layout_leb_in_gaps(struct ubifs_info *c, int p) { struct ubifs_scan_leb *sleb; struct ubifs_scan_node *snod; @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) * filled, however we do not check there at present. */ return lnum; /* Error code */ - *p = lnum; + c->gap_lebs[p] = lnum; dbg_gc("LEB %d", lnum); /* * Scan the index LEB. We use the generic scan for this even though @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) */ static int layout_in_gaps(struct ubifs_info *c, int cnt) { - int err, leb_needed_cnt, written, *p; + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, *gap_lebs; dbg_gc("%d znodes to write", cnt); @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) if (!c->gap_lebs) return -ENOMEM; - p = c->gap_lebs; + old_idx_lebs = c->lst.idx_lebs; do { - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); + ubifs_assert(c, p < c->lst.idx_lebs); written = layout_leb_in_gaps(c, p); if (written < 0) { err = written; @@ -392,9 +392,29 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) leb_needed_cnt = get_leb_cnt(c, cnt); dbg_gc("%d znodes remaining, need %d LEBs, have %d", cnt, leb_needed_cnt, c->ileb_cnt); + /* + * Dynamically change the size of @c->gap_lebs to prevent + * oob, because @c->lst.idx_lebs could be increased by + * function @get_idx_gc_leb (called by layout_leb_in_gaps-> + * ubifs_find_dirty_idx_leb) during loop. Only enlarge + * @c->gap_lebs when needed. + * + */ + if (leb_needed_cnt > c->ileb_cnt && p >= old_idx_lebs && + old_idx_lebs < c->lst.idx_lebs) { + old_idx_lebs = c->lst.idx_lebs; + gap_lebs = krealloc(c->gap_lebs, sizeof(int) * + (old_idx_lebs + 1), GFP_NOFS); + if (!gap_lebs) { + kfree(c->gap_lebs); + c->gap_lebs = NULL; + return -ENOMEM; + } + c->gap_lebs = gap_lebs; + } } while (leb_needed_cnt > c->ileb_cnt); - *p = -1; + c->gap_lebs[p] = -1; return 0; } -- 2.7.4 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-27 11:09 ` 答复: " chengzhihao @ 2019-07-27 11:13 ` Richard Weinberger 0 siblings, 0 replies; 14+ messages in thread From: Richard Weinberger @ 2019-07-27 11:13 UTC (permalink / raw) To: chengzhihao1 Cc: Sascha Hauer, Artem Bityutskiy, yi zhang, linux-mtd, linux-kernel ----- Ursprüngliche Mail ----- > Von: "chengzhihao1" <chengzhihao1@huawei.com> > An: "richard" <richard@nod.at>, "Sascha Hauer" <s.hauer@pengutronix.de>, "Artem Bityutskiy" <dedekind1@gmail.com>, "yi > zhang" <yi.zhang@huawei.com> > CC: "linux-mtd" <linux-mtd@lists.infradead.org>, "linux-kernel" <linux-kernel@vger.kernel.org> > Gesendet: Samstag, 27. Juli 2019 13:09:59 > Betreff: 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps > ping I had no time to look at this yet. It is on my list. Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-20 6:05 [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps Zhihao Cheng 2019-07-27 11:09 ` 答复: " chengzhihao @ 2019-07-29 16:51 ` Richard Weinberger 2019-07-29 20:35 ` Richard Weinberger 2019-07-30 1:20 ` 答复: " chengzhihao 1 sibling, 2 replies; 14+ messages in thread From: Richard Weinberger @ 2019-07-29 16:51 UTC (permalink / raw) To: Zhihao Cheng Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, yi.zhang, linux-mtd, LKML On Sat, Jul 20, 2019 at 8:00 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote: > > Running stress-test test_2 in mtd-utils on ubi device, sometimes we can > get following oops message: > > BUG: unable to handle page fault for address: ffffffff00000140 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 280a067 P4D 280a067 PUD 0 > Oops: 0000 [#1] SMP > CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 > -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 > Workqueue: writeback wb_workfn (flush-ubifs_0_0) > RIP: 0010:rb_next_postorder+0x2e/0xb0 > Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db > 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a > 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 > RSP: 0018:ffffc90000887758 EFLAGS: 00010202 > RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001 > RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400 > RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000 > R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000 > R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8 > FS: 0000000000000000(0000) GS:ffff88813ba00000(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > destroy_old_idx+0x5d/0xa0 [ubifs] > ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] > do_commit+0x3eb/0x830 [ubifs] > ubifs_run_commit+0xdc/0x1c0 [ubifs] > > Above Oops are due to the slab-out-of-bounds happened in do-while of > function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In > function layout_in_gaps, there is a do-while loop placing index nodes > into the gaps created by obsolete index nodes in non-empty index LEBs > until rest index nodes can totally be placed into pre-allocated empty > LEBs. @c->gap_lebs points to a memory area(integer array) which records > LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB > is found, corresponding lnum will be incrementally written into the > memory area pointed by @c->gap_lebs. The size > ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before > do-while loop and can not be changed in the loop. But @c->lst.idx_lebs > could be increased by function ubifs_change_lp (called by > layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the > loop. So, sometimes oob happens when number of cycles in do-while loop > exceeds the original value of @c->lst.idx_lebs. See detail in > https://bugzilla.kernel.org/show_bug.cgi?id=204229. > This patch fixes oob in layout_in_gaps. > > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> > --- > fs/ubifs/tnc_commit.c | 34 +++++++++++++++++++++++++++------- > 1 file changed, 27 insertions(+), 7 deletions(-) > > diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c > index a384a0f..234be1c 100644 > --- a/fs/ubifs/tnc_commit.c > +++ b/fs/ubifs/tnc_commit.c > @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, > /** > * layout_leb_in_gaps - layout index nodes using in-the-gaps method. > * @c: UBIFS file-system description object > - * @p: return LEB number here > + * @p: return LEB number in @c->gap_lebs[p] > * > * This function lays out new index nodes for dirty znodes using in-the-gaps > * method of TNC commit. > @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, > * This function returns the number of index nodes written into the gaps, or a > * negative error code on failure. > */ > -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) > +static int layout_leb_in_gaps(struct ubifs_info *c, int p) > { > struct ubifs_scan_leb *sleb; > struct ubifs_scan_node *snod; > @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) > * filled, however we do not check there at present. > */ > return lnum; /* Error code */ > - *p = lnum; > + c->gap_lebs[p] = lnum; > dbg_gc("LEB %d", lnum); > /* > * Scan the index LEB. We use the generic scan for this even though > @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) > */ > static int layout_in_gaps(struct ubifs_info *c, int cnt) > { > - int err, leb_needed_cnt, written, *p; > + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, *gap_lebs; > > dbg_gc("%d znodes to write", cnt); > > @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) > if (!c->gap_lebs) > return -ENOMEM; > > - p = c->gap_lebs; > + old_idx_lebs = c->lst.idx_lebs; > do { > - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); > + ubifs_assert(c, p < c->lst.idx_lebs); > written = layout_leb_in_gaps(c, p); > if (written < 0) { > err = written; > @@ -392,9 +392,29 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) > leb_needed_cnt = get_leb_cnt(c, cnt); > dbg_gc("%d znodes remaining, need %d LEBs, have %d", cnt, > leb_needed_cnt, c->ileb_cnt); > + /* > + * Dynamically change the size of @c->gap_lebs to prevent > + * oob, because @c->lst.idx_lebs could be increased by > + * function @get_idx_gc_leb (called by layout_leb_in_gaps-> > + * ubifs_find_dirty_idx_leb) during loop. Only enlarge > + * @c->gap_lebs when needed. > + * > + */ > + if (leb_needed_cnt > c->ileb_cnt && p >= old_idx_lebs && > + old_idx_lebs < c->lst.idx_lebs) { > + old_idx_lebs = c->lst.idx_lebs; > + gap_lebs = krealloc(c->gap_lebs, sizeof(int) * > + (old_idx_lebs + 1), GFP_NOFS); I see the problem. :-( But I'm not sure yet whether krealloc() is the right solution, we need to be sure that this does not just paper over the root cause. Please give me more time to understand the root cause. -- Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-29 16:51 ` Richard Weinberger @ 2019-07-29 20:35 ` Richard Weinberger 2019-07-30 1:20 ` 答复: " chengzhihao 1 sibling, 0 replies; 14+ messages in thread From: Richard Weinberger @ 2019-07-29 20:35 UTC (permalink / raw) To: Zhihao Cheng Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, yi.zhang, linux-mtd, LKML On Mon, Jul 29, 2019 at 6:51 PM Richard Weinberger <richard.weinberger@gmail.com> wrote: > > - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); > > + ubifs_assert(c, p < c->lst.idx_lebs); I wonder, doesn't this assert trigger too? -- Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-29 16:51 ` Richard Weinberger 2019-07-29 20:35 ` Richard Weinberger @ 2019-07-30 1:20 ` chengzhihao 2019-08-13 21:43 ` Richard Weinberger 1 sibling, 1 reply; 14+ messages in thread From: chengzhihao @ 2019-07-30 1:20 UTC (permalink / raw) To: Richard Weinberger Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, zhangyi (F), linux-mtd, LKML OK, that's fine, and I will continue to understand more implementation code related to this part. - Thanks, Cheng zhihao -----邮件原件----- 发件人: Richard Weinberger [mailto:richard.weinberger@gmail.com] 发送时间: 2019年7月30日 0:52 收件人: chengzhihao <chengzhihao1@huawei.com> 抄送: Richard Weinberger <richard@nod.at>; Sascha Hauer <s.hauer@pengutronix.de>; Artem Bityutskiy <dedekind1@gmail.com>; zhangyi (F) <yi.zhang@huawei.com>; linux-mtd@lists.infradead.org; LKML <linux-kernel@vger.kernel.org> 主题: Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps On Sat, Jul 20, 2019 at 8:00 AM Zhihao Cheng <chengzhihao1@huawei.com> wrote: > > Running stress-test test_2 in mtd-utils on ubi device, sometimes we > can get following oops message: > > BUG: unable to handle page fault for address: ffffffff00000140 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 280a067 P4D 280a067 PUD 0 > Oops: 0000 [#1] SMP > CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 > -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 > Workqueue: writeback wb_workfn (flush-ubifs_0_0) > RIP: 0010:rb_next_postorder+0x2e/0xb0 > Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db > 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a > 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 > RSP: 0018:ffffc90000887758 EFLAGS: 00010202 > RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001 > RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400 > RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000 > R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000 > R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8 > FS: 0000000000000000(0000) GS:ffff88813ba00000(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > destroy_old_idx+0x5d/0xa0 [ubifs] > ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] > do_commit+0x3eb/0x830 [ubifs] > ubifs_run_commit+0xdc/0x1c0 [ubifs] > > Above Oops are due to the slab-out-of-bounds happened in do-while of > function layout_in_gaps indirectly called by ubifs_tnc_start_commit. > In function layout_in_gaps, there is a do-while loop placing index > nodes into the gaps created by obsolete index nodes in non-empty index > LEBs until rest index nodes can totally be placed into pre-allocated > empty LEBs. @c->gap_lebs points to a memory area(integer array) which > records LEB numbers used by 'in-the-gaps' method. Whenever a fitable > index LEB is found, corresponding lnum will be incrementally written > into the memory area pointed by @c->gap_lebs. The size > ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated > before do-while loop and can not be changed in the loop. But > @c->lst.idx_lebs could be increased by function ubifs_change_lp > (called by > layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during > the loop. So, sometimes oob happens when number of cycles in do-while > loop exceeds the original value of @c->lst.idx_lebs. See detail in > https://bugzilla.kernel.org/show_bug.cgi?id=204229. > This patch fixes oob in layout_in_gaps. > > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> > --- > fs/ubifs/tnc_commit.c | 34 +++++++++++++++++++++++++++------- > 1 file changed, 27 insertions(+), 7 deletions(-) > > diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c index > a384a0f..234be1c 100644 > --- a/fs/ubifs/tnc_commit.c > +++ b/fs/ubifs/tnc_commit.c > @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info > *c, union ubifs_key *key, > /** > * layout_leb_in_gaps - layout index nodes using in-the-gaps method. > * @c: UBIFS file-system description object > - * @p: return LEB number here > + * @p: return LEB number in @c->gap_lebs[p] > * > * This function lays out new index nodes for dirty znodes using in-the-gaps > * method of TNC commit. > @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, > * This function returns the number of index nodes written into the gaps, or a > * negative error code on failure. > */ > -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) > +static int layout_leb_in_gaps(struct ubifs_info *c, int p) > { > struct ubifs_scan_leb *sleb; > struct ubifs_scan_node *snod; > @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) > * filled, however we do not check there at present. > */ > return lnum; /* Error code */ > - *p = lnum; > + c->gap_lebs[p] = lnum; > dbg_gc("LEB %d", lnum); > /* > * Scan the index LEB. We use the generic scan for this even > though @@ -355,7 +355,7 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) > */ > static int layout_in_gaps(struct ubifs_info *c, int cnt) { > - int err, leb_needed_cnt, written, *p; > + int err, leb_needed_cnt, written, p = 0, old_idx_lebs, > + *gap_lebs; > > dbg_gc("%d znodes to write", cnt); > > @@ -364,9 +364,9 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) > if (!c->gap_lebs) > return -ENOMEM; > > - p = c->gap_lebs; > + old_idx_lebs = c->lst.idx_lebs; > do { > - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); > + ubifs_assert(c, p < c->lst.idx_lebs); > written = layout_leb_in_gaps(c, p); > if (written < 0) { > err = written; @@ -392,9 +392,29 @@ static int > layout_in_gaps(struct ubifs_info *c, int cnt) > leb_needed_cnt = get_leb_cnt(c, cnt); > dbg_gc("%d znodes remaining, need %d LEBs, have %d", cnt, > leb_needed_cnt, c->ileb_cnt); > + /* > + * Dynamically change the size of @c->gap_lebs to prevent > + * oob, because @c->lst.idx_lebs could be increased by > + * function @get_idx_gc_leb (called by layout_leb_in_gaps-> > + * ubifs_find_dirty_idx_leb) during loop. Only enlarge > + * @c->gap_lebs when needed. > + * > + */ > + if (leb_needed_cnt > c->ileb_cnt && p >= old_idx_lebs && > + old_idx_lebs < c->lst.idx_lebs) { > + old_idx_lebs = c->lst.idx_lebs; > + gap_lebs = krealloc(c->gap_lebs, sizeof(int) * > + (old_idx_lebs + 1), > + GFP_NOFS); I see the problem. :-( But I'm not sure yet whether krealloc() is the right solution, we need to be sure that this does not just paper over the root cause. Please give me more time to understand the root cause. -- Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-07-30 1:20 ` 答复: " chengzhihao @ 2019-08-13 21:43 ` Richard Weinberger 2019-08-14 1:20 ` 答复: " chengzhihao 2019-08-16 8:00 ` chengzhihao 0 siblings, 2 replies; 14+ messages in thread From: Richard Weinberger @ 2019-08-13 21:43 UTC (permalink / raw) To: chengzhihao Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, zhangyi (F), linux-mtd, LKML On Tue, Jul 30, 2019 at 3:21 AM chengzhihao <chengzhihao1@huawei.com> wrote: > > OK, that's fine, and I will continue to understand more implementation code related to this part. I think we can go with the realloc() approach for now. Can you please check whether the assert() triggers? -- Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-08-13 21:43 ` Richard Weinberger @ 2019-08-14 1:20 ` chengzhihao 2019-08-16 8:00 ` chengzhihao 1 sibling, 0 replies; 14+ messages in thread From: chengzhihao @ 2019-08-14 1:20 UTC (permalink / raw) To: Richard Weinberger Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, zhangyi (F), linux-mtd, LKML Sure, I'll do more tests on different machines to check the assertion. I'm trying to understand when this assertion will be triggered. Although I haven't found this assertion be triggered so far in several tests on x86_64(qemu). -----邮件原件----- 发件人: Richard Weinberger [mailto:richard.weinberger@gmail.com] 发送时间: 2019年8月14日 5:44 收件人: chengzhihao <chengzhihao1@huawei.com> 抄送: Richard Weinberger <richard@nod.at>; Sascha Hauer <s.hauer@pengutronix.de>; Artem Bityutskiy <dedekind1@gmail.com>; zhangyi (F) <yi.zhang@huawei.com>; linux-mtd@lists.infradead.org; LKML <linux-kernel@vger.kernel.org> 主题: Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps On Tue, Jul 30, 2019 at 3:21 AM chengzhihao <chengzhihao1@huawei.com> wrote: > > OK, that's fine, and I will continue to understand more implementation code related to this part. I think we can go with the realloc() approach for now. Can you please check whether the assert() triggers? -- Thanks, //richard ^ permalink raw reply [flat|nested] 14+ messages in thread
* 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-08-13 21:43 ` Richard Weinberger 2019-08-14 1:20 ` 答复: " chengzhihao @ 2019-08-16 8:00 ` chengzhihao 2019-09-15 22:00 ` Richard Weinberger 1 sibling, 1 reply; 14+ messages in thread From: chengzhihao @ 2019-08-16 8:00 UTC (permalink / raw) To: Richard Weinberger Cc: Richard Weinberger, Sascha Hauer, Artem Bityutskiy, zhangyi (F), linux-mtd, LKML [-- Attachment #1: Type: text/plain, Size: 1701 bytes --] > ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); I've done 50 problem reproduces on different flash devices and made sure that the assertion was not triggered. See record.txt for details. -----邮件原件----- 发件人: chengzhihao 发送时间: 2019年8月14日 9:20 收件人: 'Richard Weinberger' <richard.weinberger@gmail.com> 抄送: Richard Weinberger <richard@nod.at>; Sascha Hauer <s.hauer@pengutronix.de>; Artem Bityutskiy <dedekind1@gmail.com>; zhangyi (F) <yi.zhang@huawei.com>; linux-mtd@lists.infradead.org; LKML <linux-kernel@vger.kernel.org> 主题: 答复: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps Sure, I'll do more tests on different machines to check the assertion. I'm trying to understand when this assertion will be triggered. Although I haven't found this assertion be triggered so far in several tests on x86_64(qemu). -----邮件原件----- 发件人: Richard Weinberger [mailto:richard.weinberger@gmail.com] 发送时间: 2019年8月14日 5:44 收件人: chengzhihao <chengzhihao1@huawei.com> 抄送: Richard Weinberger <richard@nod.at>; Sascha Hauer <s.hauer@pengutronix.de>; Artem Bityutskiy <dedekind1@gmail.com>; zhangyi (F) <yi.zhang@huawei.com>; linux-mtd@lists.infradead.org; LKML <linux-kernel@vger.kernel.org> 主题: Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps On Tue, Jul 30, 2019 at 3:21 AM chengzhihao <chengzhihao1@huawei.com> wrote: > > OK, that's fine, and I will continue to understand more implementation code related to this part. I think we can go with the realloc() approach for now. Can you please check whether the assert() triggers? -- Thanks, //richard [-- Attachment #2: record.txt --] [-- Type: text/plain, Size: 8397 bytes --] No Log Config 1 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 2 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 9 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 3 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 9 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 4 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 5 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 6 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 7 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 8 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 9 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 10 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 11 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 12 c->lst.idx_lebs[origin] = 3, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 10 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 13 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 14 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 15 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 16 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 17 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 18 c->lst.idx_lebs[origin] = 6, c->lst.idx_lebs[curr] = 13, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 19 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 20 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 9 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 21 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 22 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 23 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 24 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 25 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 26 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 27 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 28 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 29 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 30 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 31 c->lst.idx_lebs[origin] = 6, c->lst.idx_lebs[curr] = 13, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 32 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 33 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 9 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 34 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 35 c->lst.idx_lebs[origin] = 15, c->lst.idx_lebs[curr] = 19, p - c->gap_lebs = 16 ==== mtdram: 32MiB, PEB size 16KiB, fastmap enabled, volume size 22MiB 36 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 37 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 9 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 38 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 39 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 40 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 41 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 42 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 43 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 44 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB 45 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 46 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 47 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 48 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 49 c->lst.idx_lebs[origin] = 5, c->lst.idx_lebs[curr] = 12, p - c->gap_lebs = 8 ==== nandsim: 16MiB, PEB size 16KiB, page size 512KiB, VID offset 0, fastmap enabled, volume size 11MiB 50 c->lst.idx_lebs[origin] = 4, c->lst.idx_lebs[curr] = 11, p - c->gap_lebs = 8 ==== mtdram: 16MiB, PEB size 16KiB, fastmap enabled, volume size 11MiB ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-08-16 8:00 ` chengzhihao @ 2019-09-15 22:00 ` Richard Weinberger 2019-09-16 1:20 ` Zhihao Cheng ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Richard Weinberger @ 2019-09-15 22:00 UTC (permalink / raw) To: chengzhihao Cc: zhangyi (F), Richard Weinberger, Sascha Hauer, Artem Bityutskiy, LKML, linux-mtd On Fri, Aug 16, 2019 at 10:01 AM chengzhihao <chengzhihao1@huawei.com> wrote: > > > ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); > > I've done 50 problem reproduces on different flash devices and made sure that the assertion was not triggered. See record.txt for details. Thanks for letting me know. :) I need to give this another thought, then we can apply it for -rc2. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-09-15 22:00 ` Richard Weinberger @ 2019-09-16 1:20 ` Zhihao Cheng 2019-10-18 2:23 ` Zhihao Cheng 2019-11-12 3:45 ` Zhihao Cheng 2 siblings, 0 replies; 14+ messages in thread From: Zhihao Cheng @ 2019-09-16 1:20 UTC (permalink / raw) To: Richard Weinberger Cc: zhangyi (F), Richard Weinberger, Sascha Hauer, Artem Bityutskiy, LKML, linux-mtd 在 2019/9/16 6:00, Richard Weinberger 写道: > On Fri, Aug 16, 2019 at 10:01 AM chengzhihao <chengzhihao1@huawei.com> wrote: >> >>> ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); >> >> I've done 50 problem reproduces on different flash devices and made sure that the assertion was not triggered. See record.txt for details. > > Thanks for letting me know. :) > I need to give this another thought, then we can apply it for -rc2. > ACK. :) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-09-15 22:00 ` Richard Weinberger 2019-09-16 1:20 ` Zhihao Cheng @ 2019-10-18 2:23 ` Zhihao Cheng 2019-11-12 3:45 ` Zhihao Cheng 2 siblings, 0 replies; 14+ messages in thread From: Zhihao Cheng @ 2019-10-18 2:23 UTC (permalink / raw) To: Richard Weinberger Cc: zhangyi (F), Richard Weinberger, Sascha Hauer, Artem Bityutskiy, LKML, linux-mtd Can the current modification method be confirmed? 在 2019/9/16 6:00, Richard Weinberger 写道: > I need to give this another thought ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps 2019-09-15 22:00 ` Richard Weinberger 2019-09-16 1:20 ` Zhihao Cheng 2019-10-18 2:23 ` Zhihao Cheng @ 2019-11-12 3:45 ` Zhihao Cheng 2 siblings, 0 replies; 14+ messages in thread From: Zhihao Cheng @ 2019-11-12 3:45 UTC (permalink / raw) To: Richard Weinberger Cc: zhangyi (F), Richard Weinberger, Sascha Hauer, Artem Bityutskiy, LKML, linux-mtd ping. 在 2019/9/16 6:00, Richard Weinberger 写道: > On Fri, Aug 16, 2019 at 10:01 AM chengzhihao <chengzhihao1@huawei.com> wrote: >>> ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); >> I've done 50 problem reproduces on different flash devices and made sure that the assertion was not triggered. See record.txt for details. > Thanks for letting me know. :) > I need to give this another thought, then we can apply it for -rc2. > ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps @ 2019-07-19 13:09 Zhihao, Cheng 0 siblings, 0 replies; 14+ messages in thread From: Zhihao, Cheng @ 2019-07-19 13:09 UTC (permalink / raw) To: richard, s.hauer, dedekind1, yi.zhang Cc: linux-mtd, linux-kernel, chengzhihao1 From: Zhihao Cheng <chengzhihao1@huawei.com> Running stress-test test_2 in mtd-utils on ubi device, sometimes we can get follwing oops message: BUG: unable to handle page fault for address: ffffffff00000140 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 280a067 P4D 280a067 PUD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 60 Comm: kworker/u16:1 Kdump: loaded Not tainted 5.2.0 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0 -0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Workqueue: writeback wb_workfn (flush-ubifs_0_0) RIP: 0010:rb_next_postorder+0x2e/0xb0 Code: 80 db 03 01 48 85 ff 0f 84 97 00 00 00 48 8b 17 48 83 05 bc 80 db 03 01 48 83 e2 fc 0f 84 82 00 00 00 48 83 05 b2 80 db 03 01 <48> 3b 7a 10 48 89 d0 74 02 f3 c3 48 8b 52 08 48 83 05 a3 80 db 03 RSP: 0018:ffffc90000887758 EFLAGS: 00010202 RAX: ffff888129ae4700 RBX: ffff888138b08400 RCX: 0000000080800001 RDX: ffffffff00000130 RSI: 0000000080800024 RDI: ffff888138b08400 RBP: ffff888138b08400 R08: ffffea0004a6b920 R09: 0000000000000000 R10: ffffc90000887740 R11: 0000000000000001 R12: ffff888128d48000 R13: 0000000000000800 R14: 000000000000011e R15: 00000000000007c8 FS: 0000000000000000(0000) GS:ffff88813ba00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff00000140 CR3: 000000013789d000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: destroy_old_idx+0x5d/0xa0 [ubifs] ubifs_tnc_start_commit+0x4fe/0x1380 [ubifs] do_commit+0x3eb/0x830 [ubifs] ubifs_run_commit+0xdc/0x1c0 [ubifs] Above Oops are due to the slab-out-of-bounds happened in do-while of function layout_in_gaps indirectly called by ubifs_tnc_start_commit. In function layout_in_gaps, there is a do-while loop placing index nodes into the gaps created by obsolete index nodes in non-empty index LEBs until rest index nodes can totally be placed into pre-allocated empty LEBs. @c->gap_lebs points to a memory area(integer array) which records LEB numbers used by 'in-the-gaps' method. Whenever a fitable index LEB is found, corresponding lnum will be incrementally written into the memory area pointed by @c->gap_lebs. The size ((@c->lst.idx_lebs + 1) * sizeof(int)) of memory area is allocated before do-while loop and can not be changed in the loop. But @c->lst.idx_lebs could be increased by function ubifs_change_lp (called by layout_leb_in_gaps->ubifs_find_dirty_idx_leb->get_idx_gc_leb) during the loop. So, sometimes oob happens when number of cycles in do-while loop exceeds the original value of @c->lst.idx_lebs. See detail in https://bugzilla.kernel.org/show_bug.cgi?id=204229. This patch fixes oob in layout_in_gaps. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> --- fs/ubifs/tnc_commit.c | 30 +++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/fs/ubifs/tnc_commit.c b/fs/ubifs/tnc_commit.c index a384a0f..9fc6b8e 100644 --- a/fs/ubifs/tnc_commit.c +++ b/fs/ubifs/tnc_commit.c @@ -212,7 +212,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, /** * layout_leb_in_gaps - layout index nodes using in-the-gaps method. * @c: UBIFS file-system description object - * @p: return LEB number here + * @p: return LEB number in @c->gap_lebs[p] * * This function lays out new index nodes for dirty znodes using in-the-gaps * method of TNC commit. @@ -221,7 +221,7 @@ static int is_idx_node_in_use(struct ubifs_info *c, union ubifs_key *key, * This function returns the number of index nodes written into the gaps, or a * negative error code on failure. */ -static int layout_leb_in_gaps(struct ubifs_info *c, int *p) +static int layout_leb_in_gaps(struct ubifs_info *c, int p) { struct ubifs_scan_leb *sleb; struct ubifs_scan_node *snod; @@ -236,7 +236,7 @@ static int layout_leb_in_gaps(struct ubifs_info *c, int *p) * filled, however we do not check there at present. */ return lnum; /* Error code */ - *p = lnum; + c->gap_lebs[p] = lnum; dbg_gc("LEB %d", lnum); /* * Scan the index LEB. We use the generic scan for this even though @@ -355,7 +355,8 @@ static int get_leb_cnt(struct ubifs_info *c, int cnt) */ static int layout_in_gaps(struct ubifs_info *c, int cnt) { - int err, leb_needed_cnt, written, *p; + int err, leb_needed_cnt, written, p = 0, old_idx_lebs; + old_idx_lebs = c->lst.idx_lebs; dbg_gc("%d znodes to write", cnt); @@ -364,9 +365,8 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) if (!c->gap_lebs) return -ENOMEM; - p = c->gap_lebs; do { - ubifs_assert(c, p < c->gap_lebs + c->lst.idx_lebs); + ubifs_assert(c, p < c->lst.idx_lebs); written = layout_leb_in_gaps(c, p); if (written < 0) { err = written; @@ -392,9 +392,25 @@ static int layout_in_gaps(struct ubifs_info *c, int cnt) leb_needed_cnt = get_leb_cnt(c, cnt); dbg_gc("%d znodes remaining, need %d LEBs, have %d", cnt, leb_needed_cnt, c->ileb_cnt); + /* + * Dynamically change the size of @c->gap_lebs to prevent + * oob, because @c->lst.idx_lebs could be increased by + * function @get_idx_gc_leb (called by layout_leb_in_gaps-> + * ubifs_find_dirty_idx_leb) during loop. Only enlarge + * @c->gap_lebs when needed. + * + */ + if (leb_needed_cnt > c->ileb_cnt && p >= old_idx_lebs && + old_idx_lebs < c->lst.idx_lebs) { + old_idx_lebs = c->lst.idx_lebs; + c->gap_lebs = krealloc(c->gap_lebs, sizeof(int) * + (old_idx_lebs + 1), GFP_NOFS); + if (!c->gap_lebs) + return -ENOMEM; + } } while (leb_needed_cnt > c->ileb_cnt); - *p = -1; + c->gap_lebs[p] = -1; return 0; } -- 2.7.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-11-12 3:47 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-07-20 6:05 [PATCH] ubifs: ubifs_tnc_start_commit: Fix OOB in layout_in_gaps Zhihao Cheng 2019-07-27 11:09 ` 答复: " chengzhihao 2019-07-27 11:13 ` Richard Weinberger 2019-07-29 16:51 ` Richard Weinberger 2019-07-29 20:35 ` Richard Weinberger 2019-07-30 1:20 ` 答复: " chengzhihao 2019-08-13 21:43 ` Richard Weinberger 2019-08-14 1:20 ` 答复: " chengzhihao 2019-08-16 8:00 ` chengzhihao 2019-09-15 22:00 ` Richard Weinberger 2019-09-16 1:20 ` Zhihao Cheng 2019-10-18 2:23 ` Zhihao Cheng 2019-11-12 3:45 ` Zhihao Cheng -- strict thread matches above, loose matches on Subject: below -- 2019-07-19 13:09 Zhihao, Cheng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).