* [BUG] blk-throttle panic on 32bit machine after startup
@ 2021-10-18 7:08 ` Youfu Zhang
0 siblings, 0 replies; 11+ messages in thread
From: Youfu Zhang @ 2021-10-18 7:08 UTC (permalink / raw)
To: tj, axboe; +Cc: cgroups, linux-block
Hi,
I ran into a kernel bug related to blk-throttle on CentOS 7 AltArch for i386.
Userspace programs may panic the kernel if they hit the I/O limit
within 5 minutes after startup.
Root cause:
1. jiffies is initialized to -300*HZ during boot on 32-bit machines.
2. The blkio cgroup hierarchy is enabled:
   __DEVEL__sane_behavior for cgroup v1, or the default hierarchy for cgroup v2.
   The EL7 kernel modified throtl_pd_init to always enable hierarchical throttling.
3. blkio throttling is enabled and triggered within 5 minutes after startup:
   a bio is propagated from the child tg to its parent.
4. throtl_start_new_slice_with_credit is entered:
   if (time_after_eq(start, tg->slice_start[rw]))
   i.e. time_after_eq(0xFFFxxxxx, 0) does not hold, so the parent's
   tg->slice_start[rw], which was zero-initialized, is not updated.
5. throtl_trim_slice is entered:
   BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]))
   i.e. time_before(0xFFFxxxxx, 0) holds and triggers a panic.
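Steps 4 and 5 can be replayed in userspace with 32-bit arithmetic. The following is only a sketch of the comparison logic, not kernel code: the type jiffies32_t, the helpers after_eq32/before32, the function parent_slice_would_bug, and the values HZ_ASSUMED and SLICE_ASSUMED are all hypothetical names and assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: on a 32-bit kernel, jiffies is a 32-bit unsigned long. */
typedef uint32_t jiffies32_t;

#define HZ_ASSUMED 1000                  /* assumption: CONFIG_HZ=1000 */
#define SLICE_ASSUMED (HZ_ASSUMED / 10)  /* assumption: throttle slice length */
#define INITIAL_JIFFIES32 ((jiffies32_t)(-300 * HZ_ASSUMED))

/* Wrap-safe comparisons in the style of time_after_eq()/time_before(). */
static bool after_eq32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) >= 0;
}

static bool before32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) < 0;
}

/*
 * Replays steps 4 and 5 for a parent group whose slice fields are still
 * zero-initialized. Returns true if the BUG_ON() condition would hold.
 */
bool parent_slice_would_bug(jiffies32_t now, jiffies32_t start)
{
	jiffies32_t slice_start = 0;
	jiffies32_t slice_end;

	/* Step 4: slice_start only advances if start compares "after" it. */
	if (after_eq32(start, slice_start))
		slice_start = start;
	slice_end = (jiffies32_t)(now + SLICE_ASSUMED);

	/* Step 5: the condition that throtl_trim_slice() wraps in BUG_ON(). */
	return before32(slice_end, slice_start);
}
```

Under these assumptions, calling parent_slice_would_bug(INITIAL_JIFFIES32 + 1000, INITIAL_JIFFIES32 + 1000), i.e. one second after boot, reports the BUG_ON condition, while the same call with a small positive value such as 5000 does not.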
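Reproducer (tested on Alpine Linux x86, kernel 5.10.X):
#!/bin/sh
# Mount a fresh cgroup v2 hierarchy and enable the io controller for children.
CGROUP_PATH="$(mktemp -d)"
mount -t cgroup2 none "$CGROUP_PATH"
echo +io >"$CGROUP_PATH/cgroup.subtree_control"
# Limit reads on device 7:0 (loop0) to 2 IOPS in a child cgroup.
mkdir "$CGROUP_PATH/child"
echo "7:0 riops=2" >"$CGROUP_PATH/child/io.max"
# Move this shell into the child, drop caches, then exceed the limit.
echo 0 >"$CGROUP_PATH/child/cgroup.procs"
echo 3 >/proc/sys/vm/drop_caches
dd if=/dev/loop0 of=/dev/null count=3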
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] blk-throttle panic on 32bit machine after startup
@ 2021-10-18 15:22 ` Liqueur Librazy
0 siblings, 0 replies; 11+ messages in thread
From: Liqueur Librazy @ 2021-10-18 15:22 UTC (permalink / raw)
To: zhangyoufu; +Cc: axboe, cgroups, linux-block, tj
Hi,
Yet another colleague of the reporter here. I found that a
precondition may not be sound when tg->slice_end[rw] is initialized to
0, because time_before(INITIAL_JIFFIES, 0) holds true on 32-bit Linux.
As in v5.15-rc6/block/blk-throttle.c
1. L833
/* Determine if previously allocated or extended slice is complete or not */
static bool throtl_slice_used(struct throtl_grp *tg, bool rw)
{
	if (time_in_range(jiffies, tg->slice_start[rw], tg->slice_end[rw]))
		return false;

	return true;
}
throtl_slice_used will always return true for a newly initialized slice.
This may be intentional behavior, but it is not mentioned in the comment.
(The exception is jiffies == 0, which is another topic: would
time_in_range_open do better here?)
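To make the point about a freshly zeroed slice concrete, here is a small userspace sketch at 32 bits. It assumes that time_in_range(a, b, c) means "b <= a <= c" and time_in_range_open(a, b, c) means "b <= a < c" in wrap-safe arithmetic; the names jiffies32_t, in_range32, in_range_open32 and slice_used_model are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: 32-bit jiffies values. */
typedef uint32_t jiffies32_t;

static bool after_eq32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) >= 0;
}

static bool before32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) < 0;
}

/* Closed range, modelled on time_in_range(): b <= a <= c. */
bool in_range32(jiffies32_t a, jiffies32_t b, jiffies32_t c)
{
	return after_eq32(a, b) && after_eq32(c, a);
}

/* Half-open range, modelled on time_in_range_open(): b <= a < c. */
bool in_range_open32(jiffies32_t a, jiffies32_t b, jiffies32_t c)
{
	return after_eq32(a, b) && before32(a, c);
}

/* Model of throtl_slice_used() with explicit arguments. */
bool slice_used_model(jiffies32_t now, jiffies32_t start, jiffies32_t end)
{
	return !in_range32(now, start, end);
}
```

With start == end == 0, slice_used_model() is true for every value of now except 0, and in_range_open32(0, 0, 0) is false, so the half-open variant would remove that single exception in this model.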
2. L791, in throtl_start_new_slice_with_credit
	/*
	 * Previous slice has expired. We must have trimmed it after last
	 * bio dispatch. That means since start of last slice, we never used
	 * that bandwidth. Do try to make use of that bandwidth while giving
	 * credit.
	 */
	if (time_after_eq(start, tg->slice_start[rw]))
		tg->slice_start[rw] = start;
As mentioned in my colleague Haoran Luo's reply, time_after_eq(start,
tg->slice_start[rw]) is false while jiffies has not yet wrapped around.
An easy solution is to add a check for tg->slice_start[rw] == 0;
alternatively, we should initialize tg->slice_start[rw] and
tg->slice_end[rw] with INITIAL_JIFFIES.
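The second suggestion can be tried out in a userspace model. This is a sketch under assumptions, not a proposed patch: struct tg_model, tg_model_init, tg_model_restart_would_bug, HZ_ASSUMED and SLICE_ASSUMED are hypothetical names and values, and the model covers only the two comparisons discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: 32-bit jiffies values. */
typedef uint32_t jiffies32_t;

#define HZ_ASSUMED 1000                  /* assumption: CONFIG_HZ=1000 */
#define SLICE_ASSUMED (HZ_ASSUMED / 10)  /* assumption: throttle slice length */
#define INITIAL_JIFFIES32 ((jiffies32_t)(-300 * HZ_ASSUMED))

/* Hypothetical stand-in for the two slice fields of one direction. */
struct tg_model {
	jiffies32_t slice_start;
	jiffies32_t slice_end;
};

static bool after_eq32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) >= 0;
}

static bool before32(jiffies32_t a, jiffies32_t b)
{
	return (int32_t)(jiffies32_t)(a - b) < 0;
}

/* Suggested alternative: seed both fields with INITIAL_JIFFIES, not 0. */
void tg_model_init(struct tg_model *tg)
{
	tg->slice_start = INITIAL_JIFFIES32;
	tg->slice_end = INITIAL_JIFFIES32;
}

/*
 * Models the slice restart followed by the trim check. Returns true if
 * time_before(slice_end, slice_start) would hold afterwards.
 */
bool tg_model_restart_would_bug(struct tg_model *tg, jiffies32_t now,
				jiffies32_t start)
{
	if (after_eq32(start, tg->slice_start))
		tg->slice_start = start;
	tg->slice_end = (jiffies32_t)(now + SLICE_ASSUMED);

	return before32(tg->slice_end, tg->slice_start);
}
```

In this model a zeroed struct tg_model reports the BUG_ON condition one second after boot, while one prepared with tg_model_init() does not. The other suggestion, a check for slice_start == 0, is not modelled here; note that 0 is also a legitimate jiffies value, so such a check would need care.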
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] blk-throttle panic on 32bit machine after startup
@ 2021-10-19 17:45 ` Tejun Heo
0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2021-10-19 17:45 UTC (permalink / raw)
To: Youfu Zhang; +Cc: axboe, cgroups, linux-block
On Mon, Oct 18, 2021 at 03:08:53PM +0800, Youfu Zhang wrote:
> Hi,
>
> I ran into a kernel bug related to blk-throttle on CentOS 7 AltArch for i386.
> Userspace programs may panic the kernel if they hit the I/O limit
> within 5 minutes after startup.
>
> Root cause:
> 1. jiffies was initialized to -300HZ during boot on 32bit machines
> 2. enable blkio cgroup hierarchy
> __DEVEL__sane_behavior for cgroup v1 or default hierarchy for cgroup v2
> EL7 kernel modified throtl_pd_init and always enable hierarchical throttling
> 3. enable & trigger blkio throttling within 5 minutes after startup
> bio propagated from child tg to parent
> 4. enter throtl_start_new_slice_with_credit
> if(time_after_eq(start, tg->slice_start[rw]))
> aka. time_after_eq(0xFFFxxxxx, 0) does not hold
> parent tg->slice_start[rw] was zero-initialized and not updated
> 5. enter throtl_trim_slice
> BUG_ON(time_before(tg->slice_end[rw], tg->slice_start[rw]))
> aka. time_before(0xFFFxxxxx, 0) triggers a panic
This doesn't reproduce on 5.14.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] blk-throttle panic on 32bit machine after startup
2021-10-19 17:45 ` Tejun Heo
(?)
@ 2021-10-21 4:26 ` Youfu Zhang
-1 siblings, 0 replies; 11+ messages in thread
From: Youfu Zhang @ 2021-10-21 4:26 UTC (permalink / raw)
To: Tejun Heo; +Cc: axboe, cgroups, linux-block
> This doesn't reproduce on 5.14.
I can reproduce this bug on 5.14 i386.
I ran the reproducer (slightly modified: sr0 instead of loop0, 11:0
instead of 7:0) on the Debian installer live CD.
https://cdimage.debian.org/cdimage/daily-builds/daily/current/i386/iso-cd/debian-testing-i386-netinst.iso
I posted a screen recording at https://youtu.be/ULdoHizTi0k. Please check.
Thanks.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] blk-throttle panic on 32bit machine after startup
@ 2021-10-18 9:25 ` Haoran Luo
0 siblings, 0 replies; 11+ messages in thread
From: Haoran Luo @ 2021-10-18 9:25 UTC (permalink / raw)
To: Haoran Luo; +Cc: axboe, cgroups, linux-block, tj, zhangyoufu
Pardon me for elaborating some of my opinions.
> I think this piece of code presumes all jiffies values are greater than
> 0, which is the initial value assigned when kzalloc-ing throtl_grp. It
> fails on 32-bit linux for the first 5 minutes after booting, since the
> jiffies value then will be less than 0.
Describing the jiffies value as greater or less than 0 was vague on my part. What I actually mean is the comparison in the macro "time_after_eq", which is written as below in "5.16-rc2" in "include/linux/jiffies.h" at around line 110.
#define time_after_eq(a,b)	\
	(typecheck(unsigned long, a) && \
	 typecheck(unsigned long, b) && \
	 ((long)((a) - (b)) >= 0))
"INITIAL_JIFFIES", which is defined as "((unsigned long)(unsigned int) (-300*HZ))", converts to "((long)-300*HZ)" and is smaller than "((long)0)". Similarly, "time_after_eq(x, 0)" evaluates to false for every x in the range `INITIAL_JIFFIES <= x <= ULONG_MAX` on 32-bit Linux.
The same thing does not happen on 64-bit Linux, since "((unsigned long)(unsigned int) (-300*HZ))" evaluates to a value greater than zero in the macro above.
I actually cannot figure out a fix for this problem on 32-bit if the code presumes the jiffies value in "tg->slice_start[rw]" to be either "0" or a jiffies value "x" satisfying "time_after_eq(x, 0)".
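The difference between the two word sizes can be checked with fixed-width integers in userspace. This is an illustrative sketch only: the function names are hypothetical, hz is an assumed HZ value passed as a parameter, and uint32_t/uint64_t stand in for a 32-bit and a 64-bit "unsigned long" respectively.

```c
#include <stdbool.h>
#include <stdint.h>

/* INITIAL_JIFFIES as a 32-bit unsigned long would hold it. */
uint32_t initial_jiffies_ul32(int hz)
{
	return (uint32_t)(-300 * hz);
}

/* The same expression widened to a 64-bit unsigned long (zero-extended). */
uint64_t initial_jiffies_ul64(int hz)
{
	return (uint64_t)(uint32_t)(-300 * hz);
}

/* time_after_eq(x, 0) reduces to a signed test on x at the word size. */
bool after_eq_zero_ul32(uint32_t x)
{
	return (int32_t)x >= 0;
}

bool after_eq_zero_ul64(uint64_t x)
{
	return (int64_t)x >= 0;
}
```

With hz = 1000 both functions yield the bit pattern 0xFFFB6C20, but only the 32-bit interpretation is negative as a signed value, which is why the comparison against 0 fails there and succeeds on 64-bit.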
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] blk-throttle panic on 32bit machine after startup
@ 2021-10-18 8:00 ` Haoran Luo
0 siblings, 0 replies; 11+ messages in thread
From: Haoran Luo @ 2021-10-18 8:00 UTC (permalink / raw)
To: zhangyoufu; +Cc: axboe, cgroups, linux-block, tj
(Sorry for the garbled message; I had misconfigured mutt.)
I'm a colleague of the reporter and I would like to provide more
information.
The code in 5.15-rc6 in "blk-throttle.c" around line 791 is written as
below:
	/*
	 * Previous slice has expired. We must have trimmed it after last
	 * bio dispatch. That means since start of last slice, we never used
	 * that bandwidth. Do try to make use of that bandwidth while giving
	 * credit.
	 */
	if (time_after_eq(start, tg->slice_start[rw]))
		tg->slice_start[rw] = start;
I think this piece of code presumes all jiffies values are greater than
0, which is the initial value assigned when kzalloc-ing throtl_grp. It
fails on 32-bit linux for the first 5 minutes after booting, since the
jiffies value then will be less than 0.
^ permalink raw reply [flat|nested] 11+ messages in thread