All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alice Chao <alice.chao@mediatek.com>
To: <jejb@linux.ibm.com>, <martin.petersen@oracle.com>,
	<matthias.bgg@gmail.com>, <linux-scsi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-mediatek@lists.infradead.org>
Cc: <stanley.chu@mediatek.com>, <peter.wang@mediatek.com>,
	<chun-hung.wu@mediatek.com>, <alice.chao@mediatek.com>,
	<powen.kao@mediatek.com>, <cc.chou@mediatek.com>,
	<chaotian.jing@mediatek.com>, <jiajie.hao@mediatek.com>,
	<qilin.tan@mediatek.com>, <lin.gui@mediatek.com>,
	<yanxu.wei@mediatek.com>, <wsd_upstream@mediatek.com>
Subject: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
Date: Fri, 15 Apr 2022 12:04:47 +0800	[thread overview]
Message-ID: <20220415040446.26451-2-alice.chao@mediatek.com> (raw)

Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().

Device init call:           Device reset call:
 scsi_probe_and_add_lun()    scsi_evt_thread()
  scsi_add_lun()             scsi_evt_emit()
   scsi_sysfs_add_sdev()      kobject_uevent_env() //get kobj.parent
    scsi_target_add()           kobject_get_path()
                                 len = get_kobj_path_length () // len=1 because parent hasn't created yet
    device_add() // add kobj.parent
      kobject_uevent_env()
       kobject_get_path()         path = kzalloc()
        fill_kobj_path()           fill_kobj_path() // --length; length -= cur is a negative value
                                    memcpy(path + length, kobject_name(parent), cur); // slab OOB!

Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:

BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
 __kasan_report+0x124/0x1c8
 kasan_report+0x54/0x84
 kasan_check_range+0x200/0x208
 memcpy+0xb8/0xf0
 kobject_get_path+0xf8/0x1b8
 kobject_uevent_env+0x228/0xa88
 scsi_evt_thread+0x2d0/0x5b0
 process_one_work+0x570/0xf94
 worker_thread+0x7cc/0xf80
 kthread+0x2c4/0x388

These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().

To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.

Device init call:                Device reset call:
ufshcd_async_scan()              scsi_evt_thread()
 scsi_scan_host()                 scsi_evt_emit() <- add wait_event()
  do_scsi_scan_host() <- add wake_up()
   scsi_scan_host_selected()
    scsi_scan_channel()
     scsi_probe_and_add_lun()
      scsi_target_add()
       device_add() // add kobj.parent
        kobject_uevent_env()
         kobject_get_path()
          fill_kobj_path()
  do_scan_async() <- wake_up()     kobject_uevent_env() // add kobj.parent
                                    kobject_get_path() // get valid kobj.parent
                                     fill_kobj_path()

After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.

Signed-off-by: Alice Chao <alice.chao@mediatek.com>

---

Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.

---
 drivers/scsi/scsi_lib.c  | 1 +
 drivers/scsi/scsi_scan.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
 		break;
 	case SDEV_EVT_POWER_ON_RESET_OCCURRED:
 		envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+		wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
 		break;
 	default:
 		/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 	} else {
 		scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
 				SCAN_WILD_CARD, 0);
+		wake_up(&shost->host_wait);
 	}
 }
 
-- 
2.18.0


WARNING: multiple messages have this Message-ID (diff)
From: Alice Chao <alice.chao@mediatek.com>
To: <jejb@linux.ibm.com>, <martin.petersen@oracle.com>,
	<matthias.bgg@gmail.com>, <linux-scsi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-mediatek@lists.infradead.org>
Cc: <stanley.chu@mediatek.com>, <peter.wang@mediatek.com>,
	<chun-hung.wu@mediatek.com>, <alice.chao@mediatek.com>,
	<powen.kao@mediatek.com>, <cc.chou@mediatek.com>,
	<chaotian.jing@mediatek.com>, <jiajie.hao@mediatek.com>,
	<qilin.tan@mediatek.com>, <lin.gui@mediatek.com>,
	<yanxu.wei@mediatek.com>, <wsd_upstream@mediatek.com>
Subject: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
Date: Fri, 15 Apr 2022 12:04:47 +0800	[thread overview]
Message-ID: <20220415040446.26451-2-alice.chao@mediatek.com> (raw)

Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().

Device init call:           Device reset call:
 scsi_probe_and_add_lun()    scsi_evt_thread()
  scsi_add_lun()             scsi_evt_emit()
   scsi_sysfs_add_sdev()      kobject_uevent_env() //get kobj.parent
    scsi_target_add()           kobject_get_path()
                                 len = get_kobj_path_length () // len=1 because parent hasn't created yet
    device_add() // add kobj.parent
      kobject_uevent_env()
       kobject_get_path()         path = kzalloc()
        fill_kobj_path()           fill_kobj_path() // --length; length -= cur is a negative value
                                    memcpy(path + length, kobject_name(parent), cur); // slab OOB!

Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:

BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
 __kasan_report+0x124/0x1c8
 kasan_report+0x54/0x84
 kasan_check_range+0x200/0x208
 memcpy+0xb8/0xf0
 kobject_get_path+0xf8/0x1b8
 kobject_uevent_env+0x228/0xa88
 scsi_evt_thread+0x2d0/0x5b0
 process_one_work+0x570/0xf94
 worker_thread+0x7cc/0xf80
 kthread+0x2c4/0x388

These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().

To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.

Device init call:                Device reset call:
ufshcd_async_scan()              scsi_evt_thread()
 scsi_scan_host()                 scsi_evt_emit() <- add wait_event()
  do_scsi_scan_host() <- add wake_up()
   scsi_scan_host_selected()
    scsi_scan_channel()
     scsi_probe_and_add_lun()
      scsi_target_add()
       device_add() // add kobj.parent
        kobject_uevent_env()
         kobject_get_path()
          fill_kobj_path()
  do_scan_async() <- wake_up()     kobject_uevent_env() // add kobj.parent
                                    kobject_get_path() // get valid kobj.parent
                                     fill_kobj_path()

After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.

Signed-off-by: Alice Chao <alice.chao@mediatek.com>

---

Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.

---
 drivers/scsi/scsi_lib.c  | 1 +
 drivers/scsi/scsi_scan.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
 		break;
 	case SDEV_EVT_POWER_ON_RESET_OCCURRED:
 		envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+		wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
 		break;
 	default:
 		/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 	} else {
 		scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
 				SCAN_WILD_CARD, 0);
+		wake_up(&shost->host_wait);
 	}
 }
 
-- 
2.18.0


_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

WARNING: multiple messages have this Message-ID (diff)
From: Alice Chao <alice.chao@mediatek.com>
To: <jejb@linux.ibm.com>, <martin.petersen@oracle.com>,
	<matthias.bgg@gmail.com>, <linux-scsi@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-mediatek@lists.infradead.org>
Cc: <stanley.chu@mediatek.com>, <peter.wang@mediatek.com>,
	<chun-hung.wu@mediatek.com>, <alice.chao@mediatek.com>,
	<powen.kao@mediatek.com>, <cc.chou@mediatek.com>,
	<chaotian.jing@mediatek.com>, <jiajie.hao@mediatek.com>,
	<qilin.tan@mediatek.com>, <lin.gui@mediatek.com>,
	<yanxu.wei@mediatek.com>, <wsd_upstream@mediatek.com>
Subject: [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset
Date: Fri, 15 Apr 2022 12:04:47 +0800	[thread overview]
Message-ID: <20220415040446.26451-2-alice.chao@mediatek.com> (raw)

Device reset thread uses kobject_uevent_env() to get kobj.parent, and it
aces with device init thread which calls device_add() to add kobj.parent
before kobject_uevent_env().

Device init call:           Device reset call:
 scsi_probe_and_add_lun()    scsi_evt_thread()
  scsi_add_lun()             scsi_evt_emit()
   scsi_sysfs_add_sdev()      kobject_uevent_env() //get kobj.parent
    scsi_target_add()           kobject_get_path()
                                 len = get_kobj_path_length () // len=1 because parent hasn't created yet
    device_add() // add kobj.parent
      kobject_uevent_env()
       kobject_get_path()         path = kzalloc()
        fill_kobj_path()           fill_kobj_path() // --length; length -= cur is a negative value
                                    memcpy(path + length, kobject_name(parent), cur); // slab OOB!

Above backtrace describes the problem, device reset thread will get wrong
kobj.parent when device init thread didn’t add kobj.parent yet. When this
racing happened, it triggers the a KASAN dump on the final iteration:

BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58
<snip>
Call trace:
 __kasan_report+0x124/0x1c8
 kasan_report+0x54/0x84
 kasan_check_range+0x200/0x208
 memcpy+0xb8/0xf0
 kobject_get_path+0xf8/0x1b8
 kobject_uevent_env+0x228/0xa88
 scsi_evt_thread+0x2d0/0x5b0
 process_one_work+0x570/0xf94
 worker_thread+0x7cc/0xf80
 kthread+0x2c4/0x388

These two jobs are scheduled asynchronously, we can't guaranteed that
kobj.parent will be created in device init thread before device reset
thread calls kobject_get_path().

To resolve the racing issue between device init thread and device reset
thread, we use wait_event() in scsi_evt_emit() to wait for device_add()
to complete the creation of kobj.parent.

Device init call:                Device reset call:
ufshcd_async_scan()              scsi_evt_thread()
 scsi_scan_host()                 scsi_evt_emit() <- add wait_event()
  do_scsi_scan_host() <- add wake_up()
   scsi_scan_host_selected()
    scsi_scan_channel()
     scsi_probe_and_add_lun()
      scsi_target_add()
       device_add() // add kobj.parent
        kobject_uevent_env()
         kobject_get_path()
          fill_kobj_path()
  do_scan_async() <- wake_up()     kobject_uevent_env() // add kobj.parent
                                    kobject_get_path() // get valid kobj.parent
                                     fill_kobj_path()

After we add wake_up at do_scsi_scan_host() in device init thread, we can
ensure that device reset thread will get kobject after device init thread
finishes adding parent.

Signed-off-by: Alice Chao <alice.chao@mediatek.com>

---

Change since v2
-Change commit: Describes the preblem first and then the solution.
-Add commit: Add KASAN error log.

---
 drivers/scsi/scsi_lib.c  | 1 +
 drivers/scsi/scsi_scan.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
 		break;
 	case SDEV_EVT_POWER_ON_RESET_OCCURRED:
 		envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+		wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
 		break;
 	default:
 		/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 	} else {
 		scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
 				SCAN_WILD_CARD, 0);
+		wake_up(&shost->host_wait);
 	}
 }
 
-- 
2.18.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

             reply	other threads:[~2022-04-15  4:06 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-15  4:04 Alice Chao [this message]
2022-04-15  4:04 ` [PATCH v3 1/1] scsi: Fix racing between dev init and dev reset Alice Chao
2022-04-15  4:04 ` Alice Chao
2022-04-15  5:52 ` Miles Chen
2022-04-15  5:52   ` Miles Chen
2022-04-15  5:52   ` Miles Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220415040446.26451-2-alice.chao@mediatek.com \
    --to=alice.chao@mediatek.com \
    --cc=cc.chou@mediatek.com \
    --cc=chaotian.jing@mediatek.com \
    --cc=chun-hung.wu@mediatek.com \
    --cc=jejb@linux.ibm.com \
    --cc=jiajie.hao@mediatek.com \
    --cc=lin.gui@mediatek.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=matthias.bgg@gmail.com \
    --cc=peter.wang@mediatek.com \
    --cc=powen.kao@mediatek.com \
    --cc=qilin.tan@mediatek.com \
    --cc=stanley.chu@mediatek.com \
    --cc=wsd_upstream@mediatek.com \
    --cc=yanxu.wei@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.