* [PATCH AUTOSEL 4.19 1/5] debugobjects: Recheck debug_objects_enabled before reporting
@ 2023-07-02 19:42 Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 2/5] nbd: Add the maximum limit of allocated index in nbd_dev_add Sasha Levin
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Sasha Levin @ 2023-07-02 19:42 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Tetsuo Handa, syzbot, Thomas Gleixner, Sasha Levin
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
[ Upstream commit 8b64d420fe2450f82848178506d3e3a0bd195539 ]
syzbot is reporting a false positive ODEBUG message immediately after
ODEBUG was disabled due to OOM.
[ 1062.309646][T22911] ODEBUG: Out of memory. ODEBUG disabled
[ 1062.886755][ T5171] ------------[ cut here ]------------
[ 1062.892770][ T5171] ODEBUG: assert_init not available (active state 0) object: ffffc900056afb20 object type: timer_list hint: process_timeout+0x0/0x40
  CPU 0 [ T5171]                          CPU 1 [T22911]
  --------------                          --------------
  debug_object_assert_init() {
    if (!debug_objects_enabled)
      return;
    db = get_bucket(addr);
                                          lookup_object_or_alloc() {
                                            debug_objects_enabled = 0;
                                            return NULL;
                                          }
                                          debug_objects_oom() {
                                            pr_warn("Out of memory. ODEBUG disabled\n");
                                            // all buckets get emptied here, and
                                          }
    lookup_object_or_alloc(addr, db, descr, false, true) {
      // this bucket is already empty.
      return ERR_PTR(-ENOENT);
    }
    // Emits false positive warning.
    debug_print_object(&o, "assert_init");
  }
Recheck debug_objects_enabled in debug_print_object() to avoid that.
Reported-by: syzbot <syzbot+7937ba6a50bdd00fffdf@syzkaller.appspotmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/492fe2ae-5141-d548-ebd5-62f5fe2e57f7@I-love.SAKURA.ne.jp
Closes: https://syzkaller.appspot.com/bug?extid=7937ba6a50bdd00fffdf
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
lib/debugobjects.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 5f23d896df55a..62d095fd0c52a 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -371,6 +371,15 @@ static void debug_print_object(struct debug_obj *obj, char *msg)
 	struct debug_obj_descr *descr = obj->descr;
 	static int limit;

+	/*
+	 * Don't report if lookup_object_or_alloc() by the current thread
+	 * failed because lookup_object_or_alloc()/debug_objects_oom() by a
+	 * concurrent thread turned off debug_objects_enabled and cleared
+	 * the hash buckets.
+	 */
+	if (!debug_objects_enabled)
+		return;
+
 	if (limit < 5 && descr != descr_test) {
 		void *hint = descr->debug_hint ?
 			descr->debug_hint(obj->object) : NULL;
--
2.39.2
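
The recheck described in the commit message can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `tracking_enabled`, `tracking_oom()`, `report_object()`, `assert_init_model()` and the `reports_emitted` counter are hypothetical stand-ins for `debug_objects_enabled`, `debug_objects_oom()`, `debug_print_object()` and `debug_object_assert_init()`, and the concurrent CPU is simulated by a flag instead of a real second thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for debug_objects_enabled. */
static atomic_bool tracking_enabled = true;

/* Counts warnings that would have been printed (illustration only). */
static int reports_emitted;

/* Stand-in for the OOM path: tracking is switched off for good. */
static void tracking_oom(void)
{
	atomic_store(&tracking_enabled, false);
}

/* Stand-in for debug_print_object(): recheck the flag before reporting. */
static void report_object(const char *msg)
{
	assert(msg != NULL);
	if (!atomic_load(&tracking_enabled))
		return;	/* tracking was disabled meanwhile, stay silent */
	reports_emitted++;
}

/*
 * Stand-in for debug_object_assert_init(). The model assumes the object
 * lookup always fails; oom_in_window simulates another CPU hitting OOM
 * between the initial check and the lookup.
 */
static void assert_init_model(bool oom_in_window)
{
	if (!atomic_load(&tracking_enabled))
		return;
	if (oom_in_window)
		tracking_oom();
	report_object("assert_init");
}
```

In this model a failed lookup with tracking still enabled is reported, while a failed lookup that raced with the simulated OOM is dropped, which is the behavior the patch aims for.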
* [PATCH AUTOSEL 4.19 2/5] nbd: Add the maximum limit of allocated index in nbd_dev_add
2023-07-02 19:42 [PATCH AUTOSEL 4.19 1/5] debugobjects: Recheck debug_objects_enabled before reporting Sasha Levin
@ 2023-07-02 19:42 ` Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 3/5] md: fix data corruption for raid456 when reshape restart while grow up Sasha Levin
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-07-02 19:42 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Zhong Jinghua, Christoph Hellwig, Jens Axboe, Sasha Levin, josef,
linux-block, nbd
From: Zhong Jinghua <zhongjinghua@huawei.com>
[ Upstream commit f12bc113ce904777fd6ca003b473b427782b3dde ]
If the index allocated by idr_alloc is greater than MINORMASK >> part_shift,
the device number will overflow, resulting in failure to create a block
device.
Fix it by limiting the size of the max allocation.
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230605122159.2134384-1-zhongjinghua@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/block/nbd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 28024248a7b53..5a07964a1e676 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1646,7 +1646,8 @@ static int nbd_dev_add(int index)
 		if (err == -ENOSPC)
 			err = -EEXIST;
 	} else {
-		err = idr_alloc(&nbd_index_idr, nbd, 0, 0, GFP_KERNEL);
+		err = idr_alloc(&nbd_index_idr, nbd, 0,
+				(MINORMASK >> part_shift) + 1, GFP_KERNEL);
 		if (err >= 0)
 			index = err;
 	}
--
2.39.2
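
The bound used in the patch can be checked with a little arithmetic. The sketch below is a userspace illustration, assuming the usual kernel definitions MINORBITS = 20 and MINORMASK = (1U << MINORBITS) - 1, and assuming that the first minor of an nbd device is index << part_shift. The helper names `nbd_index_end()` and `first_minor_fits()` are hypothetical. The "+ 1" is needed because the end argument of idr_alloc() is exclusive.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's kdev_t definitions. */
#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

/* Exclusive upper bound handed to idr_alloc() in the patch. */
static unsigned int nbd_index_end(unsigned int part_shift)
{
	assert(part_shift <= MINORBITS);
	return (MINORMASK >> part_shift) + 1;
}

/* True if the first minor derived from index still fits in MINORMASK. */
static bool first_minor_fits(unsigned int index, unsigned int part_shift)
{
	unsigned long long first_minor =
		(unsigned long long)index << part_shift;

	return first_minor <= MINORMASK;
}
```

With an example part_shift of 5, the largest index that still fits is 32767, so the exclusive bound is 32768; index 32768 would already shift past MINORMASK.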
* [PATCH AUTOSEL 4.19 3/5] md: fix data corruption for raid456 when reshape restart while grow up
2023-07-02 19:42 [PATCH AUTOSEL 4.19 1/5] debugobjects: Recheck debug_objects_enabled before reporting Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 2/5] nbd: Add the maximum limit of allocated index in nbd_dev_add Sasha Levin
@ 2023-07-02 19:42 ` Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 4/5] md/raid10: prevent soft lockup while flush writes Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 5/5] posix-timers: Ensure timer ID search-loop limit is valid Sasha Levin
3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-07-02 19:42 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Yu Kuai, Peter Neuwirth, Song Liu, Sasha Levin, linux-raid
From: Yu Kuai <yukuai3@huawei.com>
[ Upstream commit 873f50ece41aad5c4f788a340960c53774b5526e ]
Currently, if reshape is interrupted, echo "reshape" to sync_action will
restart reshape from scratch, for example:
echo frozen > sync_action
echo reshape > sync_action
This will corrupt data before reshape_position if the array is growing.
Fix the problem by continuing the reshape from reshape_position.
Reported-by: Peter Neuwirth <reddunur@online.de>
Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@online.de/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230512015610.821290-3-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/md/md.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index f8c111b369928..17055cd46fcec 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4636,11 +4636,21 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 			return -EINVAL;
 		err = mddev_lock(mddev);
 		if (!err) {
-			if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
+			if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
 				err = -EBUSY;
-			else {
+			} else if (mddev->reshape_position == MaxSector ||
+				   mddev->pers->check_reshape == NULL ||
+				   mddev->pers->check_reshape(mddev)) {
 				clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 				err = mddev->pers->start_reshape(mddev);
+			} else {
+				/*
+				 * If reshape is still in progress, and
+				 * md_check_recovery() can continue to reshape,
+				 * don't restart reshape because data can be
+				 * corrupted for raid456.
+				 */
+				clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 			}
 			mddev_unlock(mddev);
 		}
--
2.39.2
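
The three-way decision that the patch introduces in action_store() can be modeled as a pure function. This is a simplified sketch under stated assumptions: `struct reshape_state`, `enum reshape_action` and `decide_reshape()` are hypothetical names, `reshape_in_progress` stands for `reshape_position != MaxSector`, and `check_reshape_result` stands for the return value of the personality's check_reshape() hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome labels for the branches in action_store(). */
enum reshape_action {
	RESHAPE_BUSY,		/* recovery already running: -EBUSY */
	RESHAPE_START,		/* unfreeze and call start_reshape() */
	RESHAPE_CONTINUE,	/* unfreeze only, md_check_recovery() resumes */
};

/* Hypothetical snapshot of the state the patch looks at. */
struct reshape_state {
	bool recovery_running;		/* MD_RECOVERY_RUNNING is set */
	bool reshape_in_progress;	/* reshape_position != MaxSector */
	bool has_check_reshape;		/* pers->check_reshape != NULL */
	int check_reshape_result;	/* nonzero: cannot simply continue */
};

static enum reshape_action decide_reshape(const struct reshape_state *s)
{
	assert(s != NULL);
	if (s->recovery_running)
		return RESHAPE_BUSY;
	if (!s->reshape_in_progress || !s->has_check_reshape ||
	    s->check_reshape_result != 0)
		return RESHAPE_START;
	/* An interrupted reshape exists and may continue: do not restart. */
	return RESHAPE_CONTINUE;
}
```

Only the last branch is new behavior: an interrupted reshape that can still continue is resumed from reshape_position instead of being restarted from scratch.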
* [PATCH AUTOSEL 4.19 4/5] md/raid10: prevent soft lockup while flush writes
2023-07-02 19:42 [PATCH AUTOSEL 4.19 1/5] debugobjects: Recheck debug_objects_enabled before reporting Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 2/5] nbd: Add the maximum limit of allocated index in nbd_dev_add Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 3/5] md: fix data corruption for raid456 when reshape restart while grow up Sasha Levin
@ 2023-07-02 19:42 ` Sasha Levin
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 5/5] posix-timers: Ensure timer ID search-loop limit is valid Sasha Levin
3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-07-02 19:42 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Yu Kuai, Song Liu, Sasha Levin, linux-raid
From: Yu Kuai <yukuai3@huawei.com>
[ Upstream commit 010444623e7f4da6b4a4dd603a7da7469981e293 ]
Currently, there is no limit for raid1/raid10 plugged bio. While flushing
writes, raid1 has cond_resched() while raid10 doesn't, and too many
writes can cause soft lockup.
The following soft lockup can be triggered easily with a writeback test
for raid10 with ramdisks:
watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293]
Call Trace:
<TASK>
call_rcu+0x16/0x20
put_object+0x41/0x80
__delete_object+0x50/0x90
delete_object_full+0x2b/0x40
kmemleak_free+0x46/0xa0
slab_free_freelist_hook.constprop.0+0xed/0x1a0
kmem_cache_free+0xfd/0x300
mempool_free_slab+0x1f/0x30
mempool_free+0x3a/0x100
bio_free+0x59/0x80
bio_put+0xcf/0x2c0
free_r10bio+0xbf/0xf0
raid_end_bio_io+0x78/0xb0
one_write_done+0x8a/0xa0
raid10_end_write_request+0x1b4/0x430
bio_endio+0x175/0x320
brd_submit_bio+0x3b9/0x9b7 [brd]
__submit_bio+0x69/0xe0
submit_bio_noacct_nocheck+0x1e6/0x5a0
submit_bio_noacct+0x38c/0x7e0
flush_pending_writes+0xf0/0x240
raid10d+0xac/0x1ed0
Fix the problem by adding cond_resched() to raid10, as raid1 already does.
Note that the unlimited plugged bios still need to be optimized: for
example, when lots of dirty pages are written back, this takes lots of
memory and io spends a long time in the plug, hence io latency is bad.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230529131106.2123367-2-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/md/raid10.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index f6d2be1d23864..31ee0f2d75b70 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -934,6 +934,7 @@ static void flush_pending_writes(struct r10conf *conf)
 			else
 				generic_make_request(bio);
 			bio = next;
+			cond_resched();
 		}
 		blk_finish_plug(&plug);
 	} else
@@ -1119,6 +1120,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
 		else
 			generic_make_request(bio);
 		bio = next;
+		cond_resched();
 	}
 	kfree(plug);
 }
--
2.39.2
* [PATCH AUTOSEL 4.19 5/5] posix-timers: Ensure timer ID search-loop limit is valid
2023-07-02 19:42 [PATCH AUTOSEL 4.19 1/5] debugobjects: Recheck debug_objects_enabled before reporting Sasha Levin
` (2 preceding siblings ...)
2023-07-02 19:42 ` [PATCH AUTOSEL 4.19 4/5] md/raid10: prevent soft lockup while flush writes Sasha Levin
@ 2023-07-02 19:42 ` Sasha Levin
3 siblings, 0 replies; 5+ messages in thread
From: Sasha Levin @ 2023-07-02 19:42 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Thomas Gleixner, syzbot+5c54bd3eb218bb595aa9, Dmitry Vyukov,
Frederic Weisbecker, Sasha Levin, ebiederm
From: Thomas Gleixner <tglx@linutronix.de>
[ Upstream commit 8ce8849dd1e78dadcee0ec9acbd259d239b7069f ]
posix_timer_add() tries to allocate a posix timer ID by starting from the
cached ID which was stored by the last successful allocation.
This is done in a loop searching the ID space for a free slot one by
one. The loop has to terminate when the search wrapped around to the
starting point.
But that's racy vs. establishing the starting point. That is read out
lockless, which leads to the following problem:
  CPU0                                CPU1
  posix_timer_add()
    start = sig->posix_timer_id;
    lock(hash_lock);
    ...                               posix_timer_add()
    if (++sig->posix_timer_id < 0)
                                        start = sig->posix_timer_id;
      sig->posix_timer_id = 0;
So CPU1 can observe a negative start value, i.e. -1, and the loop break
never happens because the condition can never be true:
if (sig->posix_timer_id == start)
break;
While this is unlikely to ever turn into an endless loop as the ID space is
huge (INT_MAX), the racy read of the start value caught the attention of
KCSAN and Dmitry unearthed that incorrectness.
Rewrite it so that all id operations are under the hash lock.
Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/87bkhzdn6g.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/sched/signal.h | 2 +-
kernel/time/posix-timers.c | 31 ++++++++++++++++++-------------
2 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 660d78c9af6c8..6a55b30ae742b 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -127,7 +127,7 @@ struct signal_struct {
 #ifdef CONFIG_POSIX_TIMERS

 	/* POSIX.1b Interval Timers */
-	int			posix_timer_id;
+	unsigned int		next_posix_timer_id;
 	struct list_head	posix_timers;

 	/* ITIMER_REAL timer for the process */
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1234868b3b03e..8768ce2c4bf52 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -159,25 +159,30 @@ static struct k_itimer *posix_timer_by_id(timer_t id)
 static int posix_timer_add(struct k_itimer *timer)
 {
 	struct signal_struct *sig = current->signal;
-	int first_free_id = sig->posix_timer_id;
 	struct hlist_head *head;
-	int ret = -ENOENT;
+	unsigned int cnt, id;

-	do {
+	/*
+	 * FIXME: Replace this by a per signal struct xarray once there is
+	 * a plan to handle the resulting CRIU regression gracefully.
+	 */
+	for (cnt = 0; cnt <= INT_MAX; cnt++) {
 		spin_lock(&hash_lock);
-		head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)];
-		if (!__posix_timers_find(head, sig, sig->posix_timer_id)) {
+		id = sig->next_posix_timer_id;
+
+		/* Write the next ID back. Clamp it to the positive space */
+		sig->next_posix_timer_id = (id + 1) & INT_MAX;
+
+		head = &posix_timers_hashtable[hash(sig, id)];
+		if (!__posix_timers_find(head, sig, id)) {
 			hlist_add_head_rcu(&timer->t_hash, head);
-			ret = sig->posix_timer_id;
+			spin_unlock(&hash_lock);
+			return id;
 		}
-		if (++sig->posix_timer_id < 0)
-			sig->posix_timer_id = 0;
-		if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT))
-			/* Loop over all possible ids completed */
-			ret = -EAGAIN;
 		spin_unlock(&hash_lock);
-	} while (ret == -ENOENT);
-	return ret;
+	}
+	/* POSIX return code when no timer ID could be allocated */
+	return -EAGAIN;
 }

 static inline void unlock_timer(struct k_itimer *timr, unsigned long flags)
--
2.39.2
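
The rewritten search loop can be exercised outside the kernel with a single-threaded model. This is a sketch under assumptions, not the kernel implementation: locking and the hash table are left out, `alloc_timer_id()` is a hypothetical name, the `in_use` callback stands in for `__posix_timers_find()`, and `max_tries` replaces the kernel's `cnt <= INT_MAX` bound so that the exhaustion case is cheap to test. The predicates `low_ids_in_use()` and `all_in_use()` exist purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate standing in for __posix_timers_find(). */
typedef bool (*id_in_use_fn)(unsigned int id);

/*
 * Bounded search for a free ID. The next candidate is always written
 * back with (id + 1) & INT_MAX, so it wraps to 0 and never goes negative.
 */
static int alloc_timer_id(unsigned int *next_id, id_in_use_fn in_use,
			  unsigned int max_tries)
{
	assert(next_id != NULL && in_use != NULL);
	for (unsigned int cnt = 0; cnt < max_tries; cnt++) {
		unsigned int id = *next_id;

		/* Write the next ID back, clamped to the positive space. */
		*next_id = (id + 1) & INT_MAX;
		if (!in_use(id))
			return (int)id;
	}
	/* No free ID found within the bound. */
	return -EAGAIN;
}

/* Illustration only: IDs 0..2 are taken. */
static bool low_ids_in_use(unsigned int id)
{
	return id < 3;
}

/* Illustration only: every ID is taken. */
static bool all_in_use(unsigned int id)
{
	(void)id;
	return true;
}
```

Because termination depends on a counter rather than on comparing against a start value read without the lock, the loop in this model always ends, which mirrors the intent stated in the commit message.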