* [PATCH v2] md-cluster: fix use-after-free issue when removing rdev
@ 2021-04-08 7:44 Heming Zhao
2021-04-20 7:40 ` heming.zhao
2021-04-20 23:15 ` Song Liu
0 siblings, 2 replies; 4+ messages in thread
From: Heming Zhao @ 2021-04-08 7:44 UTC (permalink / raw)
To: linux-raid, song; +Cc: Heming Zhao, ghe, lidong.zhong, xni, colyli
md_kick_rdev_from_array() removes the rdev from the list while the loop
is still walking it, so the list must be traversed with
rdev_for_each_safe() instead of rdev_for_each().
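For readers unfamiliar with the `_safe` iterators, here is a minimal userspace sketch of the idea. It is not the kernel code: `struct node`, `kick()`, `for_each_safe()` and `kick_flagged()` are hypothetical stand-ins for `struct md_rdev`, `md_kick_rdev_from_array()`, `rdev_for_each_safe()` and the loop in `md_check_recovery()`. The point it illustrates is that the successor is cached in `tmp` before the loop body runs, so removing the current entry does not leave the iterator reading freed memory. The `RBP: 6b6b6b6b6b6b6b6b` value in the crash stack matches the slab free-poison byte (0x6b), which is consistent with the plain iterator following a pointer inside an already freed entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an rdev linked on a circular list. */
struct node {
	struct node *prev, *next;
	bool remove_me;		/* plays the role of the ClusterRemove flag */
};

/* Initialise an empty circular list: the head points at itself. */
static void list_init(struct node *head)
{
	head->prev = head->next = head;
	head->remove_me = false;
}

/* Allocate a node and link it at the tail of the list. */
static struct node *list_add_tail_new(struct node *head, bool remove_me)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->remove_me = remove_me;
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
	return n;
}

/* Unlink and free a node (stand-in for removing an rdev from the array). */
static void kick(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	free(n);
}

/* Safe iteration: tmp holds the successor before the body may free pos. */
#define for_each_safe(pos, tmp, head)				\
	for ((pos) = (head)->next, (tmp) = (pos)->next;		\
	     (pos) != (head);					\
	     (pos) = (tmp), (tmp) = (pos)->next)

/* Remove every flagged node; returns how many were removed. */
static int kick_flagged(struct node *head)
{
	struct node *pos, *tmp;
	int kicked = 0;

	for_each_safe(pos, tmp, head) {
		if (pos->remove_me) {
			kick(pos);	/* pos is invalid from here on */
			kicked++;
		}
	}
	return kicked;
}

/* Count the nodes still linked on the list. */
static int list_len(struct node *head)
{
	int len = 0;

	for (struct node *pos = head->next; pos != head; pos = pos->next)
		len++;
	return len;
}
```

With a plain iterator the loop would advance through `pos->next` after `kick(pos)` has freed `pos`, which is the pattern this patch removes.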
How to trigger:
env: Two nodes on kvm-qemu x86_64 VMs (2C2G with 2 iscsi luns).
```
node2=192.168.0.3
for i in {1..20}; do
echo ==== $i `date` ====;
mdadm -Ss && ssh ${node2} "mdadm -Ss"
wipefs -a /dev/sda /dev/sdb
mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \
/dev/sdb --assume-clean
ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
mdadm --wait /dev/md0
ssh ${node2} "mdadm --wait /dev/md0"
mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
sleep 1
done
```
Crash stack:
```
stack segment: 0000 [#1] SMP
... ...
RIP: 0010:md_check_recovery+0x1e8/0x570 [md_mod]
... ...
RSP: 0018:ffffb149807a7d68 EFLAGS: 00010207
RAX: 0000000000000000 RBX: ffff9d494c180800 RCX: ffff9d490fc01e50
RDX: fffff047c0ed8308 RSI: 0000000000000246 RDI: 0000000000000246
RBP: 6b6b6b6b6b6b6b6b R08: ffff9d490fc01e40 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: ffff9d494c180818 R14: ffff9d493399ef38 R15: ffff9d4933a1d800
FS: 0000000000000000(0000) GS:ffff9d494f700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe68cab9010 CR3: 000000004c6be001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
raid1d+0x5c/0xd40 [raid1]
? finish_task_switch+0x75/0x2a0
? lock_timer_base+0x67/0x80
? try_to_del_timer_sync+0x4d/0x80
? del_timer_sync+0x41/0x50
? schedule_timeout+0x254/0x2d0
? md_start_sync+0xe0/0xe0 [md_mod]
? md_thread+0x127/0x160 [md_mod]
md_thread+0x127/0x160 [md_mod]
? wait_woken+0x80/0x80
kthread+0x10d/0x130
? kthread_park+0xa0/0xa0
ret_from_fork+0x1f/0x40
```
v2:
- modify commit comments
- add env info for test script
- add 'Fixes' field
v1:
- create patch
---
Fixes: dbb64f8635f5d ("md-cluster: Fix adding of new disk with new reload code")
Fixes: 659b254fa7392 ("md-cluster: remove a disk asynchronously from cluster environment")
Reviewed-by: Gang He <ghe@suse.com>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
drivers/md/md.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 21da0c48f6c2..9892c13cdfc8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9251,11 +9251,11 @@ void md_check_recovery(struct mddev *mddev)
}
if (mddev_is_clustered(mddev)) {
- struct md_rdev *rdev;
+ struct md_rdev *rdev, *tmp;
/* kick the device if another node issued a
* remove disk.
*/
- rdev_for_each(rdev, mddev) {
+ rdev_for_each_safe(rdev, tmp, mddev) {
if (test_and_clear_bit(ClusterRemove, &rdev->flags) &&
rdev->raid_disk < 0)
md_kick_rdev_from_array(rdev);
@@ -9569,7 +9569,7 @@ static int __init md_init(void)
static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
{
struct mdp_superblock_1 *sb = page_address(rdev->sb_page);
- struct md_rdev *rdev2;
+ struct md_rdev *rdev2, *tmp;
int role, ret;
char b[BDEVNAME_SIZE];
@@ -9586,7 +9586,7 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
}
/* Check for change of roles in the active devices */
- rdev_for_each(rdev2, mddev) {
+ rdev_for_each_safe(rdev2, tmp, mddev) {
if (test_bit(Faulty, &rdev2->flags))
continue;
--
2.30.0
* Re: [PATCH v2] md-cluster: fix use-after-free issue when removing rdev
2021-04-08 7:44 [PATCH v2] md-cluster: fix use-after-free issue when removing rdev Heming Zhao
@ 2021-04-20 7:40 ` heming.zhao
2021-04-20 23:15 ` Song Liu
1 sibling, 0 replies; 4+ messages in thread
From: heming.zhao @ 2021-04-20 7:40 UTC (permalink / raw)
To: linux-raid, song; +Cc: ghe, lidong.zhong, xni, colyli
Hello Song,
It looks like you missed this patch, or do some parts still need review?
Thanks,
Heming
On 4/8/21 3:44 PM, Heming Zhao wrote:
> md_kick_rdev_from_array will remove rdev, so we should
> use rdev_for_each_safe to search list.
>
> How to trigger:
>
> env: Two nodes on kvm-qemu x86_64 VMs (2C2G with 2 iscsi luns).
>
> ```
> node2=192.168.0.3
>
> for i in {1..20}; do
> echo ==== $i `date` ====;
>
> mdadm -Ss && ssh ${node2} "mdadm -Ss"
> wipefs -a /dev/sda /dev/sdb
>
> mdadm -CR /dev/md0 -b clustered -e 1.2 -n 2 -l 1 /dev/sda \
> /dev/sdb --assume-clean
> ssh ${node2} "mdadm -A /dev/md0 /dev/sda /dev/sdb"
> mdadm --wait /dev/md0
> ssh ${node2} "mdadm --wait /dev/md0"
>
> mdadm --manage /dev/md0 --fail /dev/sda --remove /dev/sda
> sleep 1
> done
> ```
>
> Crash stack:
>
> ```
> stack segment: 0000 [#1] SMP
> ... ...
> RIP: 0010:md_check_recovery+0x1e8/0x570 [md_mod]
> ... ...
> RSP: 0018:ffffb149807a7d68 EFLAGS: 00010207
> RAX: 0000000000000000 RBX: ffff9d494c180800 RCX: ffff9d490fc01e50
> RDX: fffff047c0ed8308 RSI: 0000000000000246 RDI: 0000000000000246
> RBP: 6b6b6b6b6b6b6b6b R08: ffff9d490fc01e40 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
> R13: ffff9d494c180818 R14: ffff9d493399ef38 R15: ffff9d4933a1d800
> FS: 0000000000000000(0000) GS:ffff9d494f700000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe68cab9010 CR3: 000000004c6be001 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> raid1d+0x5c/0xd40 [raid1]
> ? finish_task_switch+0x75/0x2a0
> ? lock_timer_base+0x67/0x80
> ? try_to_del_timer_sync+0x4d/0x80
> ? del_timer_sync+0x41/0x50
> ? schedule_timeout+0x254/0x2d0
> ? md_start_sync+0xe0/0xe0 [md_mod]
> ? md_thread+0x127/0x160 [md_mod]
> md_thread+0x127/0x160 [md_mod]
> ? wait_woken+0x80/0x80
> kthread+0x10d/0x130
> ? kthread_park+0xa0/0xa0
> ret_from_fork+0x1f/0x40
> ```
>
> v2:
> - modify commit comments
> - add env info for test script
> - add 'Fixes' field
> v1:
> - create patch
> ---
> Fixes: dbb64f8635f5d ("md-cluster: Fix adding of new disk with new
> reload code")
> Fixes: 659b254fa7392 ("md-cluster: remove a disk asynchronously from
> cluster environment")
> Reviewed-by: Gang He <ghe@suse.com>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> drivers/md/md.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 21da0c48f6c2..9892c13cdfc8 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9251,11 +9251,11 @@ void md_check_recovery(struct mddev *mddev)
> }
>
> if (mddev_is_clustered(mddev)) {
> - struct md_rdev *rdev;
> + struct md_rdev *rdev, *tmp;
> /* kick the device if another node issued a
> * remove disk.
> */
> - rdev_for_each(rdev, mddev) {
> + rdev_for_each_safe(rdev, tmp, mddev) {
> if (test_and_clear_bit(ClusterRemove, &rdev->flags) &&
> rdev->raid_disk < 0)
> md_kick_rdev_from_array(rdev);
> @@ -9569,7 +9569,7 @@ static int __init md_init(void)
> static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
> {
> struct mdp_superblock_1 *sb = page_address(rdev->sb_page);
> - struct md_rdev *rdev2;
> + struct md_rdev *rdev2, *tmp;
> int role, ret;
> char b[BDEVNAME_SIZE];
>
> @@ -9586,7 +9586,7 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
> }
>
> /* Check for change of roles in the active devices */
> - rdev_for_each(rdev2, mddev) {
> + rdev_for_each_safe(rdev2, tmp, mddev) {
> if (test_bit(Faulty, &rdev2->flags))
> continue;
>
>
* Re: [PATCH v2] md-cluster: fix use-after-free issue when removing rdev
2021-04-08 7:44 [PATCH v2] md-cluster: fix use-after-free issue when removing rdev Heming Zhao
2021-04-20 7:40 ` heming.zhao
@ 2021-04-20 23:15 ` Song Liu
2021-04-21 8:12 ` heming.zhao
1 sibling, 1 reply; 4+ messages in thread
From: Song Liu @ 2021-04-20 23:15 UTC (permalink / raw)
To: Heming Zhao; +Cc: linux-raid, ghe, lidong.zhong, Xiao Ni, Coly Li
On Thu, Apr 8, 2021 at 12:44 AM Heming Zhao <heming.zhao@suse.com> wrote:
>
[...]
>
> v2:
> - modify commit comments
> - add env info for test script
> - add 'Fixes' field
> v1:
> - create patch
> ---
> Fixes: dbb64f8635f5d ("md-cluster: Fix adding of new disk with new
> reload code")
> Fixes: 659b254fa7392 ("md-cluster: remove a disk asynchronously from
> cluster environment")
> Reviewed-by: Gang He <ghe@suse.com>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Sorry for the delay. Applied to md-next.
Btw: I think you meant to put the change list (v2, v1, ...) after the
Signed-off-by section? With this patch as-is, the Signed-off-by section was
dropped during git-am.
Thanks,
Song
* Re: [PATCH v2] md-cluster: fix use-after-free issue when removing rdev
2021-04-20 23:15 ` Song Liu
@ 2021-04-21 8:12 ` heming.zhao
0 siblings, 0 replies; 4+ messages in thread
From: heming.zhao @ 2021-04-21 8:12 UTC (permalink / raw)
To: Song Liu; +Cc: linux-raid, ghe, lidong.zhong, Xiao Ni, Coly Li
On 4/21/21 7:15 AM, Song Liu wrote:
> On Thu, Apr 8, 2021 at 12:44 AM Heming Zhao <heming.zhao@suse.com> wrote:
>>
> [...]
>
>>
>> v2:
>> - modify commit comments
>> - add env info for test script
>> - add 'Fixes' field
>> v1:
>> - create patch
>> ---
>> Fixes: dbb64f8635f5d ("md-cluster: Fix adding of new disk with new
>> reload code")
>> Fixes: 659b254fa7392 ("md-cluster: remove a disk asynchronously from
>> cluster environment")
>> Reviewed-by: Gang He <ghe@suse.com>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>
> Sorry for the delay. Applied to md-next.
>
> Btw: I think you meant to put the change list (v2, v1, ...) after the
> Signed-off-by section? With this patch as-is, the Signed-off-by section was
> dropped during git-am.
>
> Thanks,
> Song
>
I see what you mean and won't make this mistake again.
I will resend the v2 patch with the change log in the correct position.
Thanks,
Heming