* [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
@ 2022-04-15 4:04 ` Alice Chao
0 siblings, 0 replies; 6+ messages in thread
From: Alice Chao @ 2022-04-15 4:04 UTC (permalink / raw)
To: jejb, martin.petersen, matthias.bgg, linux-scsi, linux-kernel,
linux-arm-kernel, linux-mediatek
Cc: stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao,
cc.chou, chaotian.jing, jiajie.hao, qilin.tan, lin.gui,
yanxu.wei, wsd_upstream
Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().
Device init call: Device reset call:
scsi_probe_and_add_lun() scsi_evt_thread()
scsi_add_lun() scsi_evt_emit()
scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
scsi_target_add() kobject_get_path()
len = get_kobj_path_length () // len=1 because parent hasn't created yet
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path() path = kzalloc()
fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
memcpy(path + length, kobject_name(parent), cur); // slab OOB!
Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:
BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
__kasan_report+0x124/0x1c8
kasan_report+0x54/0x84
kasan_check_range+0x200/0x208
memcpy+0xb8/0xf0
kobject_get_path+0xf8/0x1b8
kobject_uevent_env+0x228/0xa88
scsi_evt_thread+0x2d0/0x5b0
process_one_work+0x570/0xf94
worker_thread+0x7cc/0xf80
kthread+0x2c4/0x388
These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().
To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.
Device init call: Device reset call:
ufshcd_async_scan() scsi_evt_thread()
scsi_scan_host() scsi_evt_emit() <- add wait_event()
do_scsi_scan_host() <- add wake_up()
scsi_scan_host_selected()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
kobject_get_path() // get valid kobj.parent
fill_kobj_path()
After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
---
Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.
---
drivers/scsi/scsi_lib.c | 1 +
drivers/scsi/scsi_scan.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
break;
case SDEV_EVT_POWER_ON_RESET_OCCURRED:
envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+ wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
break;
default:
/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
} else {
scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
SCAN_WILD_CARD, 0);
+ wake_up(&shost->host_wait);
}
}
--
2.18.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
@ 2022-04-15 4:04 ` Alice Chao
0 siblings, 0 replies; 6+ messages in thread
From: Alice Chao @ 2022-04-15 4:04 UTC (permalink / raw)
To: jejb, martin.petersen, matthias.bgg, linux-scsi, linux-kernel,
linux-arm-kernel, linux-mediatek
Cc: stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao,
cc.chou, chaotian.jing, jiajie.hao, qilin.tan, lin.gui,
yanxu.wei, wsd_upstream
Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().
Device init call: Device reset call:
scsi_probe_and_add_lun() scsi_evt_thread()
scsi_add_lun() scsi_evt_emit()
scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
scsi_target_add() kobject_get_path()
len = get_kobj_path_length () // len=1 because parent hasn't created yet
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path() path = kzalloc()
fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
memcpy(path + length, kobject_name(parent), cur); // slab OOB!
Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:
BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
__kasan_report+0x124/0x1c8
kasan_report+0x54/0x84
kasan_check_range+0x200/0x208
memcpy+0xb8/0xf0
kobject_get_path+0xf8/0x1b8
kobject_uevent_env+0x228/0xa88
scsi_evt_thread+0x2d0/0x5b0
process_one_work+0x570/0xf94
worker_thread+0x7cc/0xf80
kthread+0x2c4/0x388
These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().
To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.
Device init call: Device reset call:
ufshcd_async_scan() scsi_evt_thread()
scsi_scan_host() scsi_evt_emit() <- add wait_event()
do_scsi_scan_host() <- add wake_up()
scsi_scan_host_selected()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
kobject_get_path() // get valid kobj.parent
fill_kobj_path()
After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
---
Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.
---
drivers/scsi/scsi_lib.c | 1 +
drivers/scsi/scsi_scan.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
break;
case SDEV_EVT_POWER_ON_RESET_OCCURRED:
envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+ wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
break;
default:
/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
} else {
scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
SCAN_WILD_CARD, 0);
+ wake_up(&shost->host_wait);
}
}
--
2.18.0
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
@ 2022-04-15 4:04 ` Alice Chao
0 siblings, 0 replies; 6+ messages in thread
From: Alice Chao @ 2022-04-15 4:04 UTC (permalink / raw)
To: jejb, martin.petersen, matthias.bgg, linux-scsi, linux-kernel,
linux-arm-kernel, linux-mediatek
Cc: stanley.chu, peter.wang, chun-hung.wu, alice.chao, powen.kao,
cc.chou, chaotian.jing, jiajie.hao, qilin.tan, lin.gui,
yanxu.wei, wsd_upstream
Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().
Device init call: Device reset call:
scsi_probe_and_add_lun() scsi_evt_thread()
scsi_add_lun() scsi_evt_emit()
scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
scsi_target_add() kobject_get_path()
len = get_kobj_path_length () // len=1 because parent hasn't created yet
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path() path = kzalloc()
fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
memcpy(path + length, kobject_name(parent), cur); // slab OOB!
Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:
BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
__kasan_report+0x124/0x1c8
kasan_report+0x54/0x84
kasan_check_range+0x200/0x208
memcpy+0xb8/0xf0
kobject_get_path+0xf8/0x1b8
kobject_uevent_env+0x228/0xa88
scsi_evt_thread+0x2d0/0x5b0
process_one_work+0x570/0xf94
worker_thread+0x7cc/0xf80
kthread+0x2c4/0x388
These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().
To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.
Device init call: Device reset call:
ufshcd_async_scan() scsi_evt_thread()
scsi_scan_host() scsi_evt_emit() <- add wait_event()
do_scsi_scan_host() <- add wake_up()
scsi_scan_host_selected()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
kobject_get_path() // get valid kobj.parent
fill_kobj_path()
After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
---
Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.
---
drivers/scsi/scsi_lib.c | 1 +
drivers/scsi/scsi_scan.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
break;
case SDEV_EVT_POWER_ON_RESET_OCCURRED:
envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+ wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
break;
default:
/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
} else {
scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
SCAN_WILD_CARD, 0);
+ wake_up(&shost->host_wait);
}
}
--
2.18.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
2022-04-15 4:04 ` Alice Chao
(?)
@ 2022-04-15 5:52 ` Miles Chen
-1 siblings, 0 replies; 6+ messages in thread
From: Miles Chen @ 2022-04-15 5:52 UTC (permalink / raw)
To: alice.chao
Cc: cc.chou, chaotian.jing, chun-hung.wu, jejb, jiajie.hao, lin.gui,
linux-arm-kernel, linux-kernel, linux-mediatek, linux-scsi,
martin.petersen, matthias.bgg, peter.wang, powen.kao, qilin.tan,
stanley.chu, wsd_upstream, yanxu.wei
Hi Alice,
> Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
> aces with device init thread which calls device_add() to add kobj.parent
"aces" may be "races"?
> before kobject_uevent_env().
>
> Device init call: Device reset call:
> scsi_probe_and_add_lun() scsi_evt_thread()
> scsi_add_lun() scsi_evt_emit()
> scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
> scsi_target_add() kobject_get_path()
> len = get_kobj_path_length () // len=1 because parent hasn't created yet
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path() path = kzalloc()
> fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
> memcpy(path + length, kobject_name(parent), cur); // slab OOB!
>
> Above backtrace describes the problem, device reset thread will get wrong
> kobj.parent when device init thread didn’t add kobj.parent yet. When this
> racing happened, it triggers the a KASAN dump on the final iteration:
>
> BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
> Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
> <snip>
> Call trace:
> __kasan_report+0x124/0x1c8
> kasan_report+0x54/0x84
> kasan_check_range+0x200/0x208
> memcpy+0xb8/0xf0
> kobject_get_path+0xf8/0x1b8
> kobject_uevent_env+0x228/0xa88
> scsi_evt_thread+0x2d0/0x5b0
> process_one_work+0x570/0xf94
> worker_thread+0x7cc/0xf80
> kthread+0x2c4/0x388
>
> These two jobs are scheduled asynchronously, we can't guaranteed that
> kobj.parent will be created in device init thread before device reset
> thread calls kobject_get_path().
>
> To resolve the racing issue between device init thread and device reset
> thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
> to complete the creation of kobj.parent.
>
> Device init call: Device reset call:
> ufshcd_async_scan() scsi_evt_thread()
> scsi_scan_host() scsi_evt_emit() <- add wait_event()
> do_scsi_scan_host() <- add wake_up()
> scsi_scan_host_selected()
> scsi_scan_channel()
> scsi_probe_and_add_lun()
> scsi_target_add()
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path()
> fill_kobj_path()
> do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
There is no do_scan_async() changes in this patch. It this a typo?
From the patch, the flow looks like:
Device init call Device reset call:
do_scsi_scan_host() scsi_evt_thread()
scsi_scan_host_selected() scsi_evt_emit() <- add wait_event()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
//call wake_up() after scsi_scan_host_selected is done
kobject_uevent_env()
kobject_get_path() // get valid kobj.parent
...
fill_kobj_path()
> kobject_get_path() // get valid kobj.parent
> fill_kobj_path()
>
> After we add wake_up at do_scsi_scan_host() in device init thread, we can
> ensure that device reset thread will get kobject after device init thread
> finishes adding parent.
>
> Signed-off-by: Alice Chao <alice.chao@mediatek.com>
>
> ---
>
> Change since v2
> -Change commit: Describes the preblem first and then the solution.
> -Add commit: Add KASAN error log.
Please keep all change history.
e.g.,
See https://lore.kernel.org/lkml/20220326022728.2969-1-jianjun.wang@mediatek.com/
as an example
Thanks,
Miles
>
> ---
> drivers/scsi/scsi_lib.c | 1 +
> drivers/scsi/scsi_scan.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..abf9a71ed77c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
> break;
> case SDEV_EVT_POWER_ON_RESET_OCCURRED:
> envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
> + wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
> break;
> default:
> /* do nothing */
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f4e6c68ac99e..431f229ac435 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
> } else {
> scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
> SCAN_WILD_CARD, 0);
> + wake_up(&shost->host_wait);
> }
> }
>
> --
> 2.18.0
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
@ 2022-04-15 5:52 ` Miles Chen
0 siblings, 0 replies; 6+ messages in thread
From: Miles Chen @ 2022-04-15 5:52 UTC (permalink / raw)
To: alice.chao
Cc: cc.chou, chaotian.jing, chun-hung.wu, jejb, jiajie.hao, lin.gui,
linux-arm-kernel, linux-kernel, linux-mediatek, linux-scsi,
martin.petersen, matthias.bgg, peter.wang, powen.kao, qilin.tan,
stanley.chu, wsd_upstream, yanxu.wei
Hi Alice,
> Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
> aces with device init thread which calls device_add() to add kobj.parent
"aces" may be "races"?
> before kobject_uevent_env().
>
> Device init call: Device reset call:
> scsi_probe_and_add_lun() scsi_evt_thread()
> scsi_add_lun() scsi_evt_emit()
> scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
> scsi_target_add() kobject_get_path()
> len = get_kobj_path_length () // len=1 because parent hasn't created yet
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path() path = kzalloc()
> fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
> memcpy(path + length, kobject_name(parent), cur); // slab OOB!
>
> Above backtrace describes the problem, device reset thread will get wrong
> kobj.parent when device init thread didn’t add kobj.parent yet. When this
> racing happened, it triggers the a KASAN dump on the final iteration:
>
> BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
> Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
> <snip>
> Call trace:
> __kasan_report+0x124/0x1c8
> kasan_report+0x54/0x84
> kasan_check_range+0x200/0x208
> memcpy+0xb8/0xf0
> kobject_get_path+0xf8/0x1b8
> kobject_uevent_env+0x228/0xa88
> scsi_evt_thread+0x2d0/0x5b0
> process_one_work+0x570/0xf94
> worker_thread+0x7cc/0xf80
> kthread+0x2c4/0x388
>
> These two jobs are scheduled asynchronously, we can't guaranteed that
> kobj.parent will be created in device init thread before device reset
> thread calls kobject_get_path().
>
> To resolve the racing issue between device init thread and device reset
> thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
> to complete the creation of kobj.parent.
>
> Device init call: Device reset call:
> ufshcd_async_scan() scsi_evt_thread()
> scsi_scan_host() scsi_evt_emit() <- add wait_event()
> do_scsi_scan_host() <- add wake_up()
> scsi_scan_host_selected()
> scsi_scan_channel()
> scsi_probe_and_add_lun()
> scsi_target_add()
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path()
> fill_kobj_path()
> do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
There is no do_scan_async() changes in this patch. It this a typo?
From the patch, the flow looks like:
Device init call Device reset call:
do_scsi_scan_host() scsi_evt_thread()
scsi_scan_host_selected() scsi_evt_emit() <- add wait_event()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
//call wake_up() after scsi_scan_host_selected is done
kobject_uevent_env()
kobject_get_path() // get valid kobj.parent
...
fill_kobj_path()
> kobject_get_path() // get valid kobj.parent
> fill_kobj_path()
>
> After we add wake_up at do_scsi_scan_host() in device init thread, we can
> ensure that device reset thread will get kobject after device init thread
> finishes adding parent.
>
> Signed-off-by: Alice Chao <alice.chao@mediatek.com>
>
> ---
>
> Change since v2
> -Change commit: Describes the preblem first and then the solution.
> -Add commit: Add KASAN error log.
Please keep all change history.
e.g.,
See https://lore.kernel.org/lkml/20220326022728.2969-1-jianjun.wang@mediatek.com/
as an example
Thanks,
Miles
>
> ---
> drivers/scsi/scsi_lib.c | 1 +
> drivers/scsi/scsi_scan.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..abf9a71ed77c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
> break;
> case SDEV_EVT_POWER_ON_RESET_OCCURRED:
> envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
> + wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
> break;
> default:
> /* do nothing */
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f4e6c68ac99e..431f229ac435 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
> } else {
> scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
> SCAN_WILD_CARD, 0);
> + wake_up(&shost->host_wait);
> }
> }
>
> --
> 2.18.0
>
>
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
@ 2022-04-15 5:52 ` Miles Chen
0 siblings, 0 replies; 6+ messages in thread
From: Miles Chen @ 2022-04-15 5:52 UTC (permalink / raw)
To: alice.chao
Cc: cc.chou, chaotian.jing, chun-hung.wu, jejb, jiajie.hao, lin.gui,
linux-arm-kernel, linux-kernel, linux-mediatek, linux-scsi,
martin.petersen, matthias.bgg, peter.wang, powen.kao, qilin.tan,
stanley.chu, wsd_upstream, yanxu.wei
Hi Alice,
> Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
> aces with device init thread which calls device_add() to add kobj.parent
"aces" may be "races"?
> before kobject_uevent_env().
>
> Device init call: Device reset call:
> scsi_probe_and_add_lun() scsi_evt_thread()
> scsi_add_lun() scsi_evt_emit()
> scsi_sysfs_add_sdev() kobject_uevent_env() //get kobj.parent
> scsi_target_add() kobject_get_path()
> len = get_kobj_path_length () // len=1 because parent hasn't created yet
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path() path = kzalloc()
> fill_kobj_path() fill_kobj_path() // --length; length -= cur is a negative value
> memcpy(path + length, kobject_name(parent), cur); // slab OOB!
>
> Above backtrace describes the problem, device reset thread will get wrong
> kobj.parent when device init thread didn’t add kobj.parent yet. When this
> racing happened, it triggers the a KASAN dump on the final iteration:
>
> BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
> Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
> <snip>
> Call trace:
> __kasan_report+0x124/0x1c8
> kasan_report+0x54/0x84
> kasan_check_range+0x200/0x208
> memcpy+0xb8/0xf0
> kobject_get_path+0xf8/0x1b8
> kobject_uevent_env+0x228/0xa88
> scsi_evt_thread+0x2d0/0x5b0
> process_one_work+0x570/0xf94
> worker_thread+0x7cc/0xf80
> kthread+0x2c4/0x388
>
> These two jobs are scheduled asynchronously, we can't guaranteed that
> kobj.parent will be created in device init thread before device reset
> thread calls kobject_get_path().
>
> To resolve the racing issue between device init thread and device reset
> thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
> to complete the creation of kobj.parent.
>
> Device init call: Device reset call:
> ufshcd_async_scan() scsi_evt_thread()
> scsi_scan_host() scsi_evt_emit() <- add wait_event()
> do_scsi_scan_host() <- add wake_up()
> scsi_scan_host_selected()
> scsi_scan_channel()
> scsi_probe_and_add_lun()
> scsi_target_add()
> device_add() // add kobj.parent
> kobject_uevent_env()
> kobject_get_path()
> fill_kobj_path()
> do_scan_async() <- wake_up() kobject_uevent_env() // add kobj.parent
There is no do_scan_async() changes in this patch. It this a typo?
From the patch, the flow looks like:
Device init call Device reset call:
do_scsi_scan_host() scsi_evt_thread()
scsi_scan_host_selected() scsi_evt_emit() <- add wait_event()
scsi_scan_channel()
scsi_probe_and_add_lun()
scsi_target_add()
device_add() // add kobj.parent
kobject_uevent_env()
kobject_get_path()
fill_kobj_path()
//call wake_up() after scsi_scan_host_selected is done
kobject_uevent_env()
kobject_get_path() // get valid kobj.parent
...
fill_kobj_path()
> kobject_get_path() // get valid kobj.parent
> fill_kobj_path()
>
> After we add wake_up at do_scsi_scan_host() in device init thread, we can
> ensure that device reset thread will get kobject after device init thread
> finishes adding parent.
>
> Signed-off-by: Alice Chao <alice.chao@mediatek.com>
>
> ---
>
> Change since v2
> -Change commit: Describes the preblem first and then the solution.
> -Add commit: Add KASAN error log.
Please keep all change history.
e.g.,
See https://lore.kernel.org/lkml/20220326022728.2969-1-jianjun.wang@mediatek.com/
as an example
Thanks,
Miles
>
> ---
> drivers/scsi/scsi_lib.c | 1 +
> drivers/scsi/scsi_scan.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 0a70aa763a96..abf9a71ed77c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
> break;
> case SDEV_EVT_POWER_ON_RESET_OCCURRED:
> envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
> + wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
> break;
> default:
> /* do nothing */
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f4e6c68ac99e..431f229ac435 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
> } else {
> scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
> SCAN_WILD_CARD, 0);
> + wake_up(&shost->host_wait);
> }
> }
>
> --
> 2.18.0
>
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-04-15 5:54 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-15 4:04 [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset Alice Chao
2022-04-15 4:04 ` Alice Chao
2022-04-15 4:04 ` Alice Chao
2022-04-15 5:52 ` Miles Chen
2022-04-15 5:52 ` Miles Chen
2022-04-15 5:52 ` Miles Chen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.