All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv3 0/4] scsi: use xarray for devices and targets
@ 2020-05-28 16:36 Hannes Reinecke
  2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi, Hannes Reinecke

Hi all,

based on the ideas from Doug Gilbert here's now my take on using
xarrays for devices and targets.
It revolves around two ideas:

 - The scsi target 'channel' and 'id' numbers are never ever used
   to the full 32 bit range; channels are well below 10, and no
   driver is using more than 16 bits for the id. So we can reduce
   the type of 'channel' and 'id' to 16 bits, and use the 32 bit
   value 'channel << 16 | id' as the index into the target xarray.
 - Most SCSI LUNs are below 256 (to ensure compability with older
   systems). So there we can use the LUN number as the index into
   the xarray; for larger LUN numbers we'll allocate a separate
   index.

With these changes we can implement an efficient lookup mechanism,
devolving into direct lookup for most cases.
And iteration should be as efficient as the current, list-based,
approach.

This patchset now survives basic testing, hence I've removed
the 'RFC' tag from the initial patchset.

Changes to v1:
- Fixup __scsi_iterate_devices()
- Include reviews from Johannes
- Minor fixes
- Include comments from Doug

Changes to v2:
- Reshuffle hunks as per suggestion from Johannes

Hannes Reinecke (4):
  scsi: convert target lookup to xarray
  target_core_pscsi: use __scsi_device_lookup()
  scsi: move target device list to xarray
  scsi: remove direct device lookup per host

 drivers/scsi/hosts.c               |   3 +-
 drivers/scsi/scsi.c                | 131 ++++++++++++++++++++++++++++---------
 drivers/scsi/scsi_lib.c            |   9 ++-
 drivers/scsi/scsi_scan.c           |  66 ++++++++-----------
 drivers/scsi/scsi_sysfs.c          |  42 ++++++++----
 drivers/target/target_core_pscsi.c |   8 +--
 include/scsi/scsi_device.h         |  21 +++---
 include/scsi/scsi_host.h           |   5 +-
 8 files changed, 179 insertions(+), 106 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] scsi: convert target lookup to xarray
  2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
@ 2020-05-28 16:36 ` Hannes Reinecke
  2020-05-28 17:18   ` Douglas Gilbert
  2020-05-28 17:48   ` Bart Van Assche
  2020-05-28 16:36 ` [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi, Hannes Reinecke

Use an xarray instead of lists for holding the scsi targets.
I've also shortened the 'channel' and 'id' values to 16 bit
as none of the drivers requires a full 32bit range for either
of them, and by shortening them we can use them as the index
into the xarray for storing the scsi_target pointer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c       |  2 +-
 drivers/scsi/scsi.c        | 32 ++++++++++++++++++++------------
 drivers/scsi/scsi_scan.c   | 43 ++++++++++++++++---------------------------
 drivers/scsi/scsi_sysfs.c  | 15 +++++++++++----
 include/scsi/scsi_device.h |  4 ++--
 include/scsi/scsi_host.h   |  2 +-
 6 files changed, 51 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7ec91c3a66ca..7109afad0183 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -383,7 +383,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	spin_lock_init(shost->host_lock);
 	shost->shost_state = SHOST_CREATED;
 	INIT_LIST_HEAD(&shost->__devices);
-	INIT_LIST_HEAD(&shost->__targets);
+	xa_init(&shost->__targets);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
 	init_waitqueue_head(&shost->host_wait);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 56c24a73e0c7..d601424e32b2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -575,6 +575,19 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL(__scsi_iterate_devices);
 
+/**
+ * __scsi_target_lookup  -  find a target based on channel and target id
+ * @shost:	SCSI host pointer
+ * @channel:	channel number of the target
+ * @id:		ID of the target
+ *
+ */
+static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
+					 u16 channel, u16 id)
+{
+	return xa_load(&shost->__targets, (channel << 16) | id);
+}
+
 /**
  * starget_for_each_device  -  helper to walk all devices of a target
  * @starget:	target whose devices we want to iterate over.
@@ -701,19 +714,14 @@ EXPORT_SYMBOL(scsi_device_lookup_by_target);
  * really want to use scsi_device_lookup instead.
  **/
 struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost,
-		uint channel, uint id, u64 lun)
+		u16 channel, u16 id, u64 lun)
 {
-	struct scsi_device *sdev;
+	struct scsi_target *starget;
 
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
-		if (sdev->sdev_state == SDEV_DEL)
-			continue;
-		if (sdev->channel == channel && sdev->id == id &&
-				sdev->lun ==lun)
-			return sdev;
-	}
-
-	return NULL;
+	starget = __scsi_target_lookup(shost, channel, id);
+	if (!starget)
+		return NULL;
+	return __scsi_device_lookup_by_target(starget, lun);
 }
 EXPORT_SYMBOL(__scsi_device_lookup);
 
@@ -729,7 +737,7 @@ EXPORT_SYMBOL(__scsi_device_lookup);
  * needs to be released with scsi_device_put once you're done with it.
  **/
 struct scsi_device *scsi_device_lookup(struct Scsi_Host *shost,
-		uint channel, uint id, u64 lun)
+		u16 channel, u16 id, u64 lun)
 {
 	struct scsi_device *sdev;
 	unsigned long flags;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2437a7570ce..dc2656df495b 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -304,11 +304,15 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	return NULL;
 }
 
+#define scsi_target_index(s) \
+	((((unsigned long)(s)->channel) << 16) | (s)->id)
+
 static void scsi_target_destroy(struct scsi_target *starget)
 {
 	struct device *dev = &starget->dev;
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
 	unsigned long flags;
+	unsigned long tid = scsi_target_index(starget);
 
 	BUG_ON(starget->state == STARGET_DEL);
 	starget->state = STARGET_DEL;
@@ -316,7 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (shost->hostt->target_destroy)
 		shost->hostt->target_destroy(starget);
-	list_del_init(&starget->siblings);
+	xa_erase(&shost->__targets, tid);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	put_device(dev);
 }
@@ -341,27 +345,6 @@ int scsi_is_target_device(const struct device *dev)
 }
 EXPORT_SYMBOL(scsi_is_target_device);
 
-static struct scsi_target *__scsi_find_target(struct device *parent,
-					      int channel, uint id)
-{
-	struct scsi_target *starget, *found_starget = NULL;
-	struct Scsi_Host *shost = dev_to_shost(parent);
-	/*
-	 * Search for an existing target for this sdev.
-	 */
-	list_for_each_entry(starget, &shost->__targets, siblings) {
-		if (starget->id == id &&
-		    starget->channel == channel) {
-			found_starget = starget;
-			break;
-		}
-	}
-	if (found_starget)
-		get_device(&found_starget->dev);
-
-	return found_starget;
-}
-
 /**
  * scsi_target_reap_ref_release - remove target from visibility
  * @kref: the reap_ref in the target being released
@@ -417,6 +400,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	struct scsi_target *starget;
 	struct scsi_target *found_target;
 	int error, ref_got;
+	unsigned long tid;
 
 	starget = kzalloc(size, GFP_KERNEL);
 	if (!starget) {
@@ -433,19 +417,24 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->id = id;
 	starget->channel = channel;
 	starget->can_queue = 0;
-	INIT_LIST_HEAD(&starget->siblings);
 	INIT_LIST_HEAD(&starget->devices);
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
+	tid = scsi_target_index(starget);
  retry:
 	spin_lock_irqsave(shost->host_lock, flags);
 
-	found_target = __scsi_find_target(parent, channel, id);
-	if (found_target)
+	found_target = xa_load(&shost->__targets, tid);
+	if (found_target) {
+		get_device(&found_target->dev);
 		goto found;
-
-	list_add_tail(&starget->siblings, &shost->__targets);
+	}
+	if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {
+		dev_printk(KERN_ERR, dev, "target index busy\n");
+		kfree(starget);
+		return NULL;
+	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/* allocate and add */
 	transport_setup_device(dev);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 163dbcb741c1..95aaa96ce03b 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1512,15 +1512,19 @@ void scsi_remove_target(struct device *dev)
 {
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
 	struct scsi_target *starget;
+	unsigned long tid = 0;
 	unsigned long flags;
 
-restart:
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_for_each_entry(starget, &shost->__targets, siblings) {
+	starget = xa_find(&shost->__targets, &tid, ULONG_MAX, XA_PRESENT);
+	while (starget) {
 		if (starget->state == STARGET_DEL ||
 		    starget->state == STARGET_REMOVE ||
-		    starget->state == STARGET_CREATED_REMOVE)
+		    starget->state == STARGET_CREATED_REMOVE) {
+			starget = xa_find_after(&shost->__targets, &tid,
+						ULONG_MAX, XA_PRESENT);
 			continue;
+		}
 		if (starget->dev.parent == dev || &starget->dev == dev) {
 			kref_get(&starget->reap_ref);
 			if (starget->state == STARGET_CREATED)
@@ -1530,7 +1534,10 @@ void scsi_remove_target(struct device *dev)
 			spin_unlock_irqrestore(shost->host_lock, flags);
 			__scsi_remove_target(starget);
 			scsi_target_reap(starget);
-			goto restart;
+			spin_lock_irqsave(shost->host_lock, flags);
+			starget = xa_find_after(&shost->__targets, &tid,
+						ULONG_MAX, XA_PRESENT);
+			continue;
 		}
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index c3cba2aaf934..28034cc0fce5 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -345,9 +345,9 @@ extern struct scsi_device *scsi_device_from_queue(struct request_queue *q);
 extern int __must_check scsi_device_get(struct scsi_device *);
 extern void scsi_device_put(struct scsi_device *);
 extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
-					      uint, uint, u64);
+					      u16, u16, u64);
 extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *,
-						uint, uint, u64);
+						u16, u16, u64);
 extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *,
 							u64);
 extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *,
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 822e8cda8d9b..b9395676c75b 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -521,7 +521,7 @@ struct Scsi_Host {
 	 * access this list directly from a driver.
 	 */
 	struct list_head	__devices;
-	struct list_head	__targets;
+	struct xarray		__targets;
 	
 	struct list_head	starved_list;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup()
  2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
  2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
@ 2020-05-28 16:36 ` Hannes Reinecke
  2020-05-28 17:55   ` Bart Van Assche
  2020-05-29  7:02   ` Daniel Wagner
  2020-05-28 16:36 ` [PATCH 3/4] scsi: move target device list to xarray Hannes Reinecke
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi, Hannes Reinecke

Instead of walking the list of devices manually use the helper
function to return the device directly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 drivers/target/target_core_pscsi.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 4e37fa9b409d..38799e47b590 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -496,11 +496,9 @@ static int pscsi_configure_device(struct se_device *dev)
 	}
 
 	spin_lock_irq(sh->host_lock);
-	list_for_each_entry(sd, &sh->__devices, siblings) {
-		if ((pdv->pdv_channel_id != sd->channel) ||
-		    (pdv->pdv_target_id != sd->id) ||
-		    (pdv->pdv_lun_id != sd->lun))
-			continue;
+	sd = __scsi_device_lookup(sh, pdv->pdv_channel_id,
+				  pdv->pdv_target_id, pdv->pdv_lun_id);
+	if (sd) {
 		/*
 		 * Functions will release the held struct scsi_host->host_lock
 		 * before calling calling pscsi_add_device_to_list() to register
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
  2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
  2020-05-28 16:36 ` [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
@ 2020-05-28 16:36 ` Hannes Reinecke
  2020-05-28 17:50   ` Douglas Gilbert
  2020-05-28 16:36 ` [PATCH 4/4] scsi: remove direct device lookup per host Hannes Reinecke
  2020-05-29  4:21 ` [PATCHv3 0/4] scsi: use xarray ... scsi_alloc_target() Douglas Gilbert
  4 siblings, 1 reply; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi, Hannes Reinecke

Use xarray for device lookup by target. LUNs below 256 are linear,
and can be used directly as the index into the xarray.
LUNs above 256 have a distinct LUN format, and are not necessarily
linear. They'll be stored in indices above 256 in the xarray, with
the next free index in the xarray.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 drivers/scsi/scsi.c        | 32 +++++++++++++++++++-------------
 drivers/scsi/scsi_lib.c    |  9 ++++-----
 drivers/scsi/scsi_scan.c   |  3 +--
 drivers/scsi/scsi_sysfs.c  | 17 +++++++++++++++--
 include/scsi/scsi_device.h |  4 ++--
 5 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d601424e32b2..9dbbc51a1eb5 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -601,13 +601,14 @@ static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
 void starget_for_each_device(struct scsi_target *starget, void *data,
 		     void (*fn)(struct scsi_device *, void *))
 {
-	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct scsi_device *sdev;
+	unsigned long lun_id = 0;
 
-	shost_for_each_device(sdev, shost) {
-		if ((sdev->channel == starget->channel) &&
-		    (sdev->id == starget->id))
-			fn(sdev, data);
+	xa_for_each(&starget->devices, lun_id, sdev) {
+		if (scsi_device_get(sdev))
+			continue;
+		fn(sdev, data);
+		scsi_device_put(sdev);
 	}
 }
 EXPORT_SYMBOL(starget_for_each_device);
@@ -629,14 +630,11 @@ EXPORT_SYMBOL(starget_for_each_device);
 void __starget_for_each_device(struct scsi_target *starget, void *data,
 			       void (*fn)(struct scsi_device *, void *))
 {
-	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct scsi_device *sdev;
+	unsigned long lun_id = 0;
 
-	__shost_for_each_device(sdev, shost) {
-		if ((sdev->channel == starget->channel) &&
-		    (sdev->id == starget->id))
-			fn(sdev, data);
-	}
+	xa_for_each(&starget->devices, lun_id, sdev)
+		fn(sdev, data);
 }
 EXPORT_SYMBOL(__starget_for_each_device);
 
@@ -659,11 +657,19 @@ struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *starget,
 						   u64 lun)
 {
 	struct scsi_device *sdev;
+	unsigned long lun_idx = 256;
+
+	if (lun < lun_idx) {
+		sdev = xa_load(&starget->devices, lun);
+		if (sdev && sdev->sdev_state == SDEV_DEL)
+			sdev = NULL;
+		return sdev;
+	}
 
-	list_for_each_entry(sdev, &starget->devices, same_target_siblings) {
+	xa_for_each(&starget->devices, lun_idx, sdev) {
 		if (sdev->sdev_state == SDEV_DEL)
 			continue;
-		if (sdev->lun ==lun)
+		if (sdev->lun == lun)
 			return sdev;
 	}
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c163fa22267c..f9829fc0d84e 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -361,9 +361,9 @@ static void scsi_kick_queue(struct request_queue *q)
 static void scsi_single_lun_run(struct scsi_device *current_sdev)
 {
 	struct Scsi_Host *shost = current_sdev->host;
-	struct scsi_device *sdev, *tmp;
+	struct scsi_device *sdev;
 	struct scsi_target *starget = scsi_target(current_sdev);
-	unsigned long flags;
+	unsigned long flags, lun_id = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	starget->starget_sdev_user = NULL;
@@ -380,8 +380,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (starget->starget_sdev_user)
 		goto out;
-	list_for_each_entry_safe(sdev, tmp, &starget->devices,
-			same_target_siblings) {
+	xa_for_each(&starget->devices, lun_id, sdev) {
 		if (sdev == current_sdev)
 			continue;
 		if (scsi_device_get(sdev))
@@ -390,7 +389,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		scsi_kick_queue(sdev->request_queue);
 		spin_lock_irqsave(shost->host_lock, flags);
-	
+
 		scsi_device_put(sdev);
 	}
  out:
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index dc2656df495b..c7aba9ba5c0c 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -235,7 +235,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	mutex_init(&sdev->state_mutex);
 	sdev->sdev_state = SDEV_CREATED;
 	INIT_LIST_HEAD(&sdev->siblings);
-	INIT_LIST_HEAD(&sdev->same_target_siblings);
 	INIT_LIST_HEAD(&sdev->starved_entry);
 	INIT_LIST_HEAD(&sdev->event_list);
 	spin_lock_init(&sdev->list_lock);
@@ -417,7 +416,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->id = id;
 	starget->channel = channel;
 	starget->can_queue = 0;
-	INIT_LIST_HEAD(&starget->devices);
+	xa_init(&starget->devices);
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 95aaa96ce03b..27c19232f175 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -434,6 +434,7 @@ static void scsi_device_cls_release(struct device *class_dev)
 static void scsi_device_dev_release_usercontext(struct work_struct *work)
 {
 	struct scsi_device *sdev;
+	struct scsi_target *starget;
 	struct device *parent;
 	struct list_head *this, *tmp;
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
@@ -441,6 +442,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	unsigned long flags;
 
 	sdev = container_of(work, struct scsi_device, ew.work);
+	starget = sdev->sdev_target;
 
 	scsi_dh_release_device(sdev);
 
@@ -448,7 +450,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
 	list_del(&sdev->siblings);
-	list_del(&sdev->same_target_siblings);
+	xa_erase(&starget->devices, sdev->lun_idx);
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
@@ -1621,7 +1623,18 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 
 	transport_setup_device(&sdev->sdev_gendev);
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_add_tail(&sdev->same_target_siblings, &starget->devices);
+	if (sdev->lun < 256) {
+		sdev->lun_idx = sdev->lun;
+		WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
+				   sdev, GFP_KERNEL) < 0);
+	} else {
+		struct xa_limit scsi_lun_limit = {
+			.min = 256,
+			.max = UINT_MAX,
+		};
+		WARN_ON(xa_alloc(&starget->devices, &sdev->lun_idx,
+				  sdev, scsi_lun_limit, GFP_KERNEL) < 0);
+	}
 	list_add_tail(&sdev->siblings, &shost->__devices);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/*
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 28034cc0fce5..2c6b9d8bc40e 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -104,7 +104,6 @@ struct scsi_device {
 
 	/* the next two are protected by the host->host_lock */
 	struct list_head    siblings;   /* list of all devices on this host */
-	struct list_head    same_target_siblings; /* just the devices sharing same target id */
 
 	atomic_t device_busy;		/* commands actually active on LLDD */
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
@@ -123,6 +122,7 @@ struct scsi_device {
 
 	unsigned int id, channel;
 	u64 lun;
+	unsigned int lun_idx;		/* Index into target device xarray */
 	unsigned int manufacturer;	/* Manufacturer of device, for using 
 					 * vendor-specific cmd's */
 	unsigned sector_size;	/* size in bytes */
@@ -285,7 +285,7 @@ enum scsi_target_state {
 struct scsi_target {
 	struct scsi_device	*starget_sdev_user;
 	struct list_head	siblings;
-	struct list_head	devices;
+	struct xarray		devices;
 	struct device		dev;
 	struct kref		reap_ref; /* last put renders target invisible */
 	unsigned int		channel;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] scsi: remove direct device lookup per host
  2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (2 preceding siblings ...)
  2020-05-28 16:36 ` [PATCH 3/4] scsi: move target device list to xarray Hannes Reinecke
@ 2020-05-28 16:36 ` Hannes Reinecke
  2020-05-29  4:21 ` [PATCHv3 0/4] scsi: use xarray ... scsi_alloc_target() Douglas Gilbert
  4 siblings, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 16:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi, Hannes Reinecke

Drop the per-host device list for direct lookup and iterate
over the targets and devices xarrays instead.
As both are now using xarrays the lookup is more efficient
as we can use the provided indices based on the HCTL id
to do a direct lookup instead of traversing lists.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c       |  1 -
 drivers/scsi/scsi.c        | 67 +++++++++++++++++++++++++++++++++++++++++-----
 drivers/scsi/scsi_scan.c   | 20 ++++++++------
 drivers/scsi/scsi_sysfs.c  | 10 ++-----
 include/scsi/scsi_device.h | 13 +++++----
 include/scsi/scsi_host.h   |  3 +--
 6 files changed, 84 insertions(+), 30 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7109afad0183..004a50a95560 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -382,7 +382,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->host_lock = &shost->default_lock;
 	spin_lock_init(shost->host_lock);
 	shost->shost_state = SHOST_CREATED;
-	INIT_LIST_HEAD(&shost->__devices);
 	xa_init(&shost->__targets);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 9dbbc51a1eb5..eac505a169a7 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -554,18 +554,40 @@ EXPORT_SYMBOL(scsi_device_put);
 struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 					   struct scsi_device *prev)
 {
-	struct list_head *list = (prev ? &prev->siblings : &shost->__devices);
+	struct scsi_target *starget;
 	struct scsi_device *next = NULL;
 	unsigned long flags;
+	unsigned long tid = 0, lun_idx = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	while (list->next != &shost->__devices) {
-		next = list_entry(list->next, struct scsi_device, siblings);
-		/* skip devices that we can't get a reference to */
+	if (!prev) {
+		starget = xa_find(&shost->__targets, &tid,
+				       ULONG_MAX, XA_PRESENT);
+		if (starget) {
+			next = xa_find(&starget->devices, &lun_idx,
+				       ULONG_MAX, XA_PRESENT);
+			if (!scsi_device_get(next)) {
+				spin_unlock_irqrestore(shost->host_lock, flags);
+				return next;
+			}
+		}
+	} else {
+		starget = prev->sdev_target;
+		tid = (starget->channel << 16) | starget->id;
+		lun_idx = prev->lun_idx;
+	}
+	while (starget) {
+		next = xa_find_after(&starget->devices, &lun_idx,
+				     ULONG_MAX, XA_PRESENT);
+		if (!next) {
+			/* No more LUNs on this target, switch to the next */
+			lun_idx = 0;
+			starget = xa_find_after(&shost->__targets, &tid,
+						ULONG_MAX, XA_PRESENT);
+			continue;
+		}
 		if (!scsi_device_get(next))
 			break;
-		next = NULL;
-		list = list->next;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
@@ -575,6 +597,39 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL(__scsi_iterate_devices);
 
+/* helper for __shost_for_each_device, see that for documentation */
+struct scsi_device *__scsi_iterate_devices_unlocked(struct Scsi_Host *shost,
+						    struct scsi_device *prev)
+{
+	struct scsi_target *starget;
+	struct scsi_device *next = NULL;
+	unsigned long tid = 0, lun_idx = 0;
+
+	if (!prev) {
+		starget = xa_find(&shost->__targets, &tid,
+				       ULONG_MAX, XA_PRESENT);
+		if (starget)
+			return xa_find(&starget->devices, &lun_idx,
+				       ULONG_MAX, XA_PRESENT);
+	} else {
+		starget = prev->sdev_target;
+		tid = (starget->channel << 16) | starget->id;
+		lun_idx = prev->lun_idx;
+	}
+	while (starget) {
+		next = xa_find_after(&starget->devices, &lun_idx,
+				     ULONG_MAX, XA_PRESENT);
+		if (next)
+			return next;
+		/* No more LUNs on this target, switch to the next */
+		lun_idx = 0;
+		starget = xa_find_after(&shost->__targets, &tid,
+					ULONG_MAX, XA_PRESENT);
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(__scsi_iterate_devices_unlocked);
+
 /**
  * __scsi_target_lookup  -  find a target based on channel and target id
  * @shost:	SCSI host pointer
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index c7aba9ba5c0c..e75e7f93f8f6 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -234,7 +234,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	sdev->channel = starget->channel;
 	mutex_init(&sdev->state_mutex);
 	sdev->sdev_state = SDEV_CREATED;
-	INIT_LIST_HEAD(&sdev->siblings);
 	INIT_LIST_HEAD(&sdev->starved_entry);
 	INIT_LIST_HEAD(&sdev->event_list);
 	spin_lock_init(&sdev->list_lock);
@@ -1847,17 +1846,22 @@ EXPORT_SYMBOL(scsi_scan_host);
 
 void scsi_forget_host(struct Scsi_Host *shost)
 {
+	struct scsi_target *starget;
 	struct scsi_device *sdev;
 	unsigned long flags;
+	unsigned long tid = 0;
 
- restart:
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
-		if (sdev->sdev_state == SDEV_DEL)
-			continue;
-		spin_unlock_irqrestore(shost->host_lock, flags);
-		__scsi_remove_device(sdev);
-		goto restart;
+	xa_for_each(&shost->__targets, tid, starget) {
+		unsigned long lun_id = 0;
+
+		xa_for_each(&starget->devices, lun_id, sdev) {
+			if (sdev->sdev_state == SDEV_DEL)
+				continue;
+			spin_unlock_irqrestore(shost->host_lock, flags);
+			__scsi_remove_device(sdev);
+			spin_lock_irqsave(shost->host_lock, flags);
+		}
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 27c19232f175..63fa57684782 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -449,7 +449,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	parent = sdev->sdev_gendev.parent;
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
-	list_del(&sdev->siblings);
 	xa_erase(&starget->devices, sdev->lun_idx);
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
@@ -1476,19 +1475,16 @@ static void __scsi_remove_target(struct scsi_target *starget)
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	unsigned long flags;
 	struct scsi_device *sdev;
+	unsigned long lun_idx = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
- restart:
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
+	xa_for_each(&starget->devices, lun_idx, sdev) {
 		/*
 		 * We cannot call scsi_device_get() here, as
 		 * we might've been called from rmmod() causing
 		 * scsi_device_get() to fail the module_is_live()
 		 * check.
 		 */
-		if (sdev->channel != starget->channel ||
-		    sdev->id != starget->id)
-			continue;
 		if (sdev->sdev_state == SDEV_DEL ||
 		    sdev->sdev_state == SDEV_CANCEL ||
 		    !get_device(&sdev->sdev_gendev))
@@ -1497,7 +1493,6 @@ static void __scsi_remove_target(struct scsi_target *starget)
 		scsi_remove_device(sdev);
 		put_device(&sdev->sdev_gendev);
 		spin_lock_irqsave(shost->host_lock, flags);
-		goto restart;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
@@ -1635,7 +1630,6 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 		WARN_ON(xa_alloc(&starget->devices, &sdev->lun_idx,
 				  sdev, scsi_lun_limit, GFP_KERNEL) < 0);
 	}
-	list_add_tail(&sdev->siblings, &shost->__devices);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/*
 	 * device can now only be removed via __scsi_remove_device() so hold
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 2c6b9d8bc40e..b811b5e3adfe 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -102,9 +102,6 @@ struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;
 
-	/* the next two are protected by the host->host_lock */
-	struct list_head    siblings;   /* list of all devices on this host */
-
 	atomic_t device_busy;		/* commands actually active on LLDD */
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
 
@@ -376,6 +373,10 @@ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
 	     (sdev); \
 	     (sdev) = __scsi_iterate_devices((shost), (sdev)))
 
+/* only exposed to implement shost_for_each_device */
+struct scsi_device *__scsi_iterate_devices_unlocked(struct Scsi_Host *,
+						    struct scsi_device *);
+
 /**
  * __shost_for_each_device - iterate over all devices of a host (UNLOCKED)
  * @sdev: the &struct scsi_device to use as a cursor
@@ -389,8 +390,10 @@ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
  * device list in interrupt context.  Otherwise you really want to use
  * shost_for_each_device instead.
  */
-#define __shost_for_each_device(sdev, shost) \
-	list_for_each_entry((sdev), &((shost)->__devices), siblings)
+#define __shost_for_each_device(sdev, shost)				\
+	for((sdev) = __scsi_iterate_devices_unlocked((shost), NULL);	\
+	    (sdev);							\
+	    (sdev) = __scsi_iterate_devices_unlocked((shost),(sdev)))
 
 extern int scsi_change_queue_depth(struct scsi_device *, int);
 extern int scsi_track_queue_full(struct scsi_device *, int);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index b9395676c75b..ee0b72075e9f 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -520,9 +520,8 @@ struct Scsi_Host {
 	 * their __ prefixed variants with the lock held. NEVER
 	 * access this list directly from a driver.
 	 */
-	struct list_head	__devices;
 	struct xarray		__targets;
-	
+
 	struct list_head	starved_list;
 
 	spinlock_t		default_lock;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] scsi: convert target lookup to xarray
  2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
@ 2020-05-28 17:18   ` Douglas Gilbert
  2020-05-28 19:08     ` Douglas Gilbert
  2020-05-28 17:48   ` Bart Van Assche
  1 sibling, 1 reply; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-28 17:18 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Daniel Wagner, Johannes Thumshirn,
	James Bottomley, linux-scsi

The following suggestions are based on trying to _run_ this code ...
foolish, I know.

On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
> Use an xarray instead of lists for holding the scsi targets.
> I've also shortened the 'channel' and 'id' values to 16 bit
> as none of the drivers requires a full 32bit range for either
> of them, and by shortening them we can use them as the index
> into the xarray for storing the scsi_target pointer.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   drivers/scsi/hosts.c       |  2 +-
>   drivers/scsi/scsi.c        | 32 ++++++++++++++++++++------------
>   drivers/scsi/scsi_scan.c   | 43 ++++++++++++++++---------------------------
>   drivers/scsi/scsi_sysfs.c  | 15 +++++++++++----
>   include/scsi/scsi_device.h |  4 ++--
>   include/scsi/scsi_host.h   |  2 +-
>   6 files changed, 51 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 7ec91c3a66ca..7109afad0183 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -383,7 +383,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
>   	spin_lock_init(shost->host_lock);
>   	shost->shost_state = SHOST_CREATED;
>   	INIT_LIST_HEAD(&shost->__devices);
> -	INIT_LIST_HEAD(&shost->__targets);
> +	xa_init(&shost->__targets);

Please use: xa_init_flags(&shost->__targets, XA_FLAGS_LOCK_IRQ);

If you ever touch (modify?) that array in interrupt context then I
believe you need this. See the first 50 or so lines in lib/xarray.c
to see that it is used for xarray internals (e.g. xas_lock_type() ).

The xarray interface (include/linux/xarray.h) also has *_irq() and
*_bh() variants (e.g. xa_insert(), xa_insert_bh(), xa_insert_irq() )
but setting that flag in the xa_init_flags() call seems to be a
safer approach. IOWs it is an alternative to using the *_irq()
variants on all the _modifying_ xarray calls.

>   	INIT_LIST_HEAD(&shost->eh_cmd_q);
>   	INIT_LIST_HEAD(&shost->starved_list);
>   	init_waitqueue_head(&shost->host_wait);
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index 56c24a73e0c7..d601424e32b2 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -575,6 +575,19 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
>   }
>   EXPORT_SYMBOL(__scsi_iterate_devices);
>   
> +/**
> + * __scsi_target_lookup  -  find a target based on channel and target id
> + * @shost:	SCSI host pointer
> + * @channel:	channel number of the target
> + * @id:		ID of the target
> + *
> + */
> +static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
> +					 u16 channel, u16 id)
> +{
> +	return xa_load(&shost->__targets, (channel << 16) | id);
> +}
> +
>   /**
>    * starget_for_each_device  -  helper to walk all devices of a target
>    * @starget:	target whose devices we want to iterate over.
> @@ -701,19 +714,14 @@ EXPORT_SYMBOL(scsi_device_lookup_by_target);
>    * really want to use scsi_device_lookup instead.
>    **/
>   struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost,
> -		uint channel, uint id, u64 lun)
> +		u16 channel, u16 id, u64 lun)
>   {
> -	struct scsi_device *sdev;
> +	struct scsi_target *starget;
>   
> -	list_for_each_entry(sdev, &shost->__devices, siblings) {
> -		if (sdev->sdev_state == SDEV_DEL)
> -			continue;
> -		if (sdev->channel == channel && sdev->id == id &&
> -				sdev->lun ==lun)
> -			return sdev;
> -	}
> -
> -	return NULL;
> +	starget = __scsi_target_lookup(shost, channel, id);
> +	if (!starget)
> +		return NULL;
> +	return __scsi_device_lookup_by_target(starget, lun);
>   }
>   EXPORT_SYMBOL(__scsi_device_lookup);
>   
> @@ -729,7 +737,7 @@ EXPORT_SYMBOL(__scsi_device_lookup);
>    * needs to be released with scsi_device_put once you're done with it.
>    **/
>   struct scsi_device *scsi_device_lookup(struct Scsi_Host *shost,
> -		uint channel, uint id, u64 lun)
> +		u16 channel, u16 id, u64 lun)
>   {
>   	struct scsi_device *sdev;
>   	unsigned long flags;
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index f2437a7570ce..dc2656df495b 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -304,11 +304,15 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>   	return NULL;
>   }
>   
> +#define scsi_target_index(s) \
> +	((((unsigned long)(s)->channel) << 16) | (s)->id)
> +
>   static void scsi_target_destroy(struct scsi_target *starget)
>   {
>   	struct device *dev = &starget->dev;
>   	struct Scsi_Host *shost = dev_to_shost(dev->parent);
>   	unsigned long flags;
> +	unsigned long tid = scsi_target_index(starget);
>   
>   	BUG_ON(starget->state == STARGET_DEL);
>   	starget->state = STARGET_DEL;
> @@ -316,7 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
>   	spin_lock_irqsave(shost->host_lock, flags);
>   	if (shost->hostt->target_destroy)
>   		shost->hostt->target_destroy(starget);
> -	list_del_init(&starget->siblings);
> +	xa_erase(&shost->__targets, tid);
>   	spin_unlock_irqrestore(shost->host_lock, flags);
>   	put_device(dev);
>   }
> @@ -341,27 +345,6 @@ int scsi_is_target_device(const struct device *dev)
>   }
>   EXPORT_SYMBOL(scsi_is_target_device);
>   
> -static struct scsi_target *__scsi_find_target(struct device *parent,
> -					      int channel, uint id)
> -{
> -	struct scsi_target *starget, *found_starget = NULL;
> -	struct Scsi_Host *shost = dev_to_shost(parent);
> -	/*
> -	 * Search for an existing target for this sdev.
> -	 */
> -	list_for_each_entry(starget, &shost->__targets, siblings) {
> -		if (starget->id == id &&
> -		    starget->channel == channel) {
> -			found_starget = starget;
> -			break;
> -		}
> -	}
> -	if (found_starget)
> -		get_device(&found_starget->dev);
> -
> -	return found_starget;
> -}
> -
>   /**
>    * scsi_target_reap_ref_release - remove target from visibility
>    * @kref: the reap_ref in the target being released
> @@ -417,6 +400,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
>   	struct scsi_target *starget;
>   	struct scsi_target *found_target;
>   	int error, ref_got;
> +	unsigned long tid;
>   
>   	starget = kzalloc(size, GFP_KERNEL);
>   	if (!starget) {
> @@ -433,19 +417,24 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
>   	starget->id = id;
>   	starget->channel = channel;
>   	starget->can_queue = 0;
> -	INIT_LIST_HEAD(&starget->siblings);
>   	INIT_LIST_HEAD(&starget->devices);
>   	starget->state = STARGET_CREATED;
>   	starget->scsi_level = SCSI_2;
>   	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
> +	tid = scsi_target_index(starget);
>    retry:
>   	spin_lock_irqsave(shost->host_lock, flags);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Interrupts now masked on this CPU.

>   
> -	found_target = __scsi_find_target(parent, channel, id);
> -	if (found_target)
> +	found_target = xa_load(&shost->__targets, tid);
> +	if (found_target) {
> +		get_device(&found_target->dev);
>   		goto found;
> -
> -	list_add_tail(&starget->siblings, &shost->__targets);
> +	}
> +	if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {

You have interrupts masked (on this CPU), you can't use GFP_KERNEL.
If you run this code, you will be told :-)
In this case, you must use GFP_ATOMIC.

> +		dev_printk(KERN_ERR, dev, "target index busy\n");
> +		kfree(starget);
> +		return NULL;
> +	}
>   	spin_unlock_irqrestore(shost->host_lock, flags);
^^^^^^^^^^^^^^^^^^^^
Now you can use GFP_KERNEL, if you want.

>   	/* allocate and add */
>   	transport_setup_device(dev);
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 163dbcb741c1..95aaa96ce03b 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1512,15 +1512,19 @@ void scsi_remove_target(struct device *dev)
>   {
>   	struct Scsi_Host *shost = dev_to_shost(dev->parent);
>   	struct scsi_target *starget;
> +	unsigned long tid = 0;
>   	unsigned long flags;
>   
> -restart:
>   	spin_lock_irqsave(shost->host_lock, flags);
> -	list_for_each_entry(starget, &shost->__targets, siblings) {
> +	starget = xa_find(&shost->__targets, &tid, ULONG_MAX, XA_PRESENT);
> +	while (starget) {
>   		if (starget->state == STARGET_DEL ||
>   		    starget->state == STARGET_REMOVE ||
> -		    starget->state == STARGET_CREATED_REMOVE)
> +		    starget->state == STARGET_CREATED_REMOVE) {
> +			starget = xa_find_after(&shost->__targets, &tid,
> +						ULONG_MAX, XA_PRESENT);
>   			continue;
> +		}
>   		if (starget->dev.parent == dev || &starget->dev == dev) {
>   			kref_get(&starget->reap_ref);
>   			if (starget->state == STARGET_CREATED)
> @@ -1530,7 +1534,10 @@ void scsi_remove_target(struct device *dev)
>   			spin_unlock_irqrestore(shost->host_lock, flags);
>   			__scsi_remove_target(starget);
>   			scsi_target_reap(starget);
> -			goto restart;
> +			spin_lock_irqsave(shost->host_lock, flags);
> +			starget = xa_find_after(&shost->__targets, &tid,
> +						ULONG_MAX, XA_PRESENT);
> +			continue;
>   		}
>   	}
>   	spin_unlock_irqrestore(shost->host_lock, flags);
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index c3cba2aaf934..28034cc0fce5 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -345,9 +345,9 @@ extern struct scsi_device *scsi_device_from_queue(struct request_queue *q);
>   extern int __must_check scsi_device_get(struct scsi_device *);
>   extern void scsi_device_put(struct scsi_device *);
>   extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
> -					      uint, uint, u64);
> +					      u16, u16, u64);
>   extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *,
> -						uint, uint, u64);
> +						u16, u16, u64);
>   extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *,
>   							u64);
>   extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *,
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 822e8cda8d9b..b9395676c75b 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -521,7 +521,7 @@ struct Scsi_Host {
>   	 * access this list directly from a driver.
>   	 */
>   	struct list_head	__devices;
> -	struct list_head	__targets;
> +	struct xarray		__targets;
>   	
>   	struct list_head	starved_list;
>   
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] scsi: convert target lookup to xarray
  2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
  2020-05-28 17:18   ` Douglas Gilbert
@ 2020-05-28 17:48   ` Bart Van Assche
  2020-05-29  6:24     ` Hannes Reinecke
  1 sibling, 1 reply; 25+ messages in thread
From: Bart Van Assche @ 2020-05-28 17:48 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 2020-05-28 09:36, Hannes Reinecke wrote:
> +#define scsi_target_index(s) \
> +	((((unsigned long)(s)->channel) << 16) | (s)->id)

Please define scsi_target_index() as an inline function instead of a macro.

> +	if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {
> +		dev_printk(KERN_ERR, dev, "target index busy\n");
> +		kfree(starget);
> +		return NULL;
> +	}

So the above code passes GFP_KERNEL to xa_insert() while holding a
spinlock? That doesn't seem correct to me. Since xa_insert() provides
locking itself, can the spin_lock_irqsave(shost->host_lock, flags) /
spin_unlock_irqrestore(shost->host_lock, flags) pair be removed and can
the xa_load() and xa_insert() calls be combined into a single
xa_insert() call? I think xa_insert() returns -EBUSY if an entry already
exists.

> -restart:
>  	spin_lock_irqsave(shost->host_lock, flags);
> -	list_for_each_entry(starget, &shost->__targets, siblings) {
> +	starget = xa_find(&shost->__targets, &tid, ULONG_MAX, XA_PRESENT);
> +	while (starget) {
>  		if (starget->state == STARGET_DEL ||
>  		    starget->state == STARGET_REMOVE ||
> -		    starget->state == STARGET_CREATED_REMOVE)
> +		    starget->state == STARGET_CREATED_REMOVE) {
> +			starget = xa_find_after(&shost->__targets, &tid,
> +						ULONG_MAX, XA_PRESENT);
>  			continue;
> +		}
>  		if (starget->dev.parent == dev || &starget->dev == dev) {
>  			kref_get(&starget->reap_ref);
>  			if (starget->state == STARGET_CREATED)
> @@ -1530,7 +1534,10 @@ void scsi_remove_target(struct device *dev)
>  			spin_unlock_irqrestore(shost->host_lock, flags);
>  			__scsi_remove_target(starget);
>  			scsi_target_reap(starget);
> -			goto restart;
> +			spin_lock_irqsave(shost->host_lock, flags);
> +			starget = xa_find_after(&shost->__targets, &tid,
> +						ULONG_MAX, XA_PRESENT);
> +			continue;
>  		}
>  	}

How about using a for loop instead of a while loop such that the
xa_find_after() statement does not have to be duplicated?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 16:36 ` [PATCH 3/4] scsi: move target device list to xarray Hannes Reinecke
@ 2020-05-28 17:50   ` Douglas Gilbert
  2020-05-28 18:54     ` Matthew Wilcox
  0 siblings, 1 reply; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-28 17:50 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Daniel Wagner, Johannes Thumshirn,
	James Bottomley, linux-scsi, Matthew Wilcox

On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
> Use xarray for device lookup by target. LUNs below 256 are linear,
> and can be used directly as the index into the xarray.
> LUNs above 256 have a distinct LUN format, and are not necessarily
> linear. They'll be stored in indices above 256 in the xarray, with
> the next free index in the xarray.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>   drivers/scsi/scsi.c        | 32 +++++++++++++++++++-------------
>   drivers/scsi/scsi_lib.c    |  9 ++++-----
>   drivers/scsi/scsi_scan.c   |  3 +--
>   drivers/scsi/scsi_sysfs.c  | 17 +++++++++++++++--
>   include/scsi/scsi_device.h |  4 ++--
>   5 files changed, 41 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index d601424e32b2..9dbbc51a1eb5 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -601,13 +601,14 @@ static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
>   void starget_for_each_device(struct scsi_target *starget, void *data,
>   		     void (*fn)(struct scsi_device *, void *))
>   {
> -	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>   	struct scsi_device *sdev;
> +	unsigned long lun_id = 0;
>   
> -	shost_for_each_device(sdev, shost) {
> -		if ((sdev->channel == starget->channel) &&
> -		    (sdev->id == starget->id))
> -			fn(sdev, data);
> +	xa_for_each(&starget->devices, lun_id, sdev) {
> +		if (scsi_device_get(sdev))
> +			continue;
> +		fn(sdev, data);
> +		scsi_device_put(sdev);
>   	}
>   }
>   EXPORT_SYMBOL(starget_for_each_device);
> @@ -629,14 +630,11 @@ EXPORT_SYMBOL(starget_for_each_device);
>   void __starget_for_each_device(struct scsi_target *starget, void *data,
>   			       void (*fn)(struct scsi_device *, void *))
>   {
> -	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>   	struct scsi_device *sdev;
> +	unsigned long lun_id = 0;
>   
> -	__shost_for_each_device(sdev, shost) {
> -		if ((sdev->channel == starget->channel) &&
> -		    (sdev->id == starget->id))
> -			fn(sdev, data);
> -	}
> +	xa_for_each(&starget->devices, lun_id, sdev)
> +		fn(sdev, data);
>   }
>   EXPORT_SYMBOL(__starget_for_each_device);
>   
> @@ -659,11 +657,19 @@ struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *starget,
>   						   u64 lun)
>   {
>   	struct scsi_device *sdev;
> +	unsigned long lun_idx = 256;
> +
> +	if (lun < lun_idx) {
> +		sdev = xa_load(&starget->devices, lun);
> +		if (sdev && sdev->sdev_state == SDEV_DEL)
> +			sdev = NULL;
> +		return sdev;
> +	}
>   
> -	list_for_each_entry(sdev, &starget->devices, same_target_siblings) {
> +	xa_for_each(&starget->devices, lun_idx, sdev) {
>   		if (sdev->sdev_state == SDEV_DEL)
>   			continue;
> -		if (sdev->lun ==lun)
> +		if (sdev->lun == lun)
>   			return sdev;
>   	}
>   
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index c163fa22267c..f9829fc0d84e 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -361,9 +361,9 @@ static void scsi_kick_queue(struct request_queue *q)
>   static void scsi_single_lun_run(struct scsi_device *current_sdev)
>   {
>   	struct Scsi_Host *shost = current_sdev->host;
> -	struct scsi_device *sdev, *tmp;
> +	struct scsi_device *sdev;
>   	struct scsi_target *starget = scsi_target(current_sdev);
> -	unsigned long flags;
> +	unsigned long flags, lun_id = 0;
>   
>   	spin_lock_irqsave(shost->host_lock, flags);
>   	starget->starget_sdev_user = NULL;
> @@ -380,8 +380,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
>   	spin_lock_irqsave(shost->host_lock, flags);
>   	if (starget->starget_sdev_user)
>   		goto out;
> -	list_for_each_entry_safe(sdev, tmp, &starget->devices,
> -			same_target_siblings) {
> +	xa_for_each(&starget->devices, lun_id, sdev) {
>   		if (sdev == current_sdev)
>   			continue;
>   		if (scsi_device_get(sdev))
> @@ -390,7 +389,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
>   		spin_unlock_irqrestore(shost->host_lock, flags);
>   		scsi_kick_queue(sdev->request_queue);
>   		spin_lock_irqsave(shost->host_lock, flags);
> -	
> +
>   		scsi_device_put(sdev);
>   	}
>    out:
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index dc2656df495b..c7aba9ba5c0c 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -235,7 +235,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>   	mutex_init(&sdev->state_mutex);
>   	sdev->sdev_state = SDEV_CREATED;
>   	INIT_LIST_HEAD(&sdev->siblings);
> -	INIT_LIST_HEAD(&sdev->same_target_siblings);
>   	INIT_LIST_HEAD(&sdev->starved_entry);
>   	INIT_LIST_HEAD(&sdev->event_list);
>   	spin_lock_init(&sdev->list_lock);
> @@ -417,7 +416,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
>   	starget->id = id;
>   	starget->channel = channel;
>   	starget->can_queue = 0;
> -	INIT_LIST_HEAD(&starget->devices);
> +	xa_init(&starget->devices);
>   	starget->state = STARGET_CREATED;
>   	starget->scsi_level = SCSI_2;
>   	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 95aaa96ce03b..27c19232f175 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -434,6 +434,7 @@ static void scsi_device_cls_release(struct device *class_dev)
>   static void scsi_device_dev_release_usercontext(struct work_struct *work)
>   {
>   	struct scsi_device *sdev;
> +	struct scsi_target *starget;
>   	struct device *parent;
>   	struct list_head *this, *tmp;
>   	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
> @@ -441,6 +442,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
>   	unsigned long flags;
>   
>   	sdev = container_of(work, struct scsi_device, ew.work);
> +	starget = sdev->sdev_target;
>   
>   	scsi_dh_release_device(sdev);
>   
> @@ -448,7 +450,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
>   
>   	spin_lock_irqsave(sdev->host->host_lock, flags);
>   	list_del(&sdev->siblings);
> -	list_del(&sdev->same_target_siblings);
> +	xa_erase(&starget->devices, sdev->lun_idx);
>   	list_del(&sdev->starved_entry);
>   	spin_unlock_irqrestore(sdev->host->host_lock, flags);
>   
> @@ -1621,7 +1623,18 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
>   
>   	transport_setup_device(&sdev->sdev_gendev);
>   	spin_lock_irqsave(shost->host_lock, flags);
> -	list_add_tail(&sdev->same_target_siblings, &starget->devices);
> +	if (sdev->lun < 256) {
> +		sdev->lun_idx = sdev->lun;
> +		WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
> +				   sdev, GFP_KERNEL) < 0);

You have interrupts masked again, so GFP_KERNEL is a non-no.

Also, like most modern kernel libraries, xarray returns 0 for good
and a negated errno for an error. If you are going to WARN, I would
like to know what that negated errno is.

> +	} else {
> +		struct xa_limit scsi_lun_limit = {
> +			.min = 256,
> +			.max = UINT_MAX,
> +		};
> +		WARN_ON(xa_alloc(&starget->devices, &sdev->lun_idx,
> +				  sdev, scsi_lun_limit, GFP_KERNEL) < 0);

dito

> +	}
>   	list_add_tail(&sdev->siblings, &shost->__devices);
>   	spin_unlock_irqrestore(shost->host_lock, flags);
>   	/*
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 28034cc0fce5..2c6b9d8bc40e 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -104,7 +104,6 @@ struct scsi_device {
>   
>   	/* the next two are protected by the host->host_lock */
>   	struct list_head    siblings;   /* list of all devices on this host */
> -	struct list_head    same_target_siblings; /* just the devices sharing same target id */
>   
>   	atomic_t device_busy;		/* commands actually active on LLDD */
>   	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
> @@ -123,6 +122,7 @@ struct scsi_device {
>   
>   	unsigned int id, channel;
>   	u64 lun;

Is 'lun' a T10 LUN as in a big-endian array of bytes, converted to a 64 bit
native unsigned integer, or is that the Linux 8 no, 16 no, 32 no, 64 bit
hack? /* A comment might help there. */

> +	unsigned int lun_idx;		/* Index into target device xarray */

Xarray gives you _two_ choices for the type of an index :-)
It is either u32 (as used in xa_alloc() and its variants) _or_
unsigned long (as used everywhere else in xarray where an index
is needed). So it's definitely 32 bits for xa_alloc() and either
32 or 64 bits for the rest of xarray depending on the machine
architecture.
Matthew Wilcox may care to explain why this is ...

By using 'unsigned int lun_idx;' you make a confusing picture even
more so. A comment wouldn't hurt there either since the splat the
compiler spits out it if you use 'unsigned int' for the second argument
to the xa_for_each() macro [which is called 'index'] is not very
instructive. [That macro takes the address of the given index and
feeds it to xa_find_after() which expects 'unsigned long *' .]

>   	unsigned int manufacturer;	/* Manufacturer of device, for using
>   					 * vendor-specific cmd's */
>   	unsigned sector_size;	/* size in bytes */
> @@ -285,7 +285,7 @@ enum scsi_target_state {
>   struct scsi_target {
>   	struct scsi_device	*starget_sdev_user;
>   	struct list_head	siblings;
> -	struct list_head	devices;
> +	struct xarray		devices;
>   	struct device		dev;
>   	struct kref		reap_ref; /* last put renders target invisible */
>   	unsigned int		channel;
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup()
  2020-05-28 16:36 ` [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
@ 2020-05-28 17:55   ` Bart Van Assche
  2020-05-29  7:02   ` Daniel Wagner
  1 sibling, 0 replies; 25+ messages in thread
From: Bart Van Assche @ 2020-05-28 17:55 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 2020-05-28 09:36, Hannes Reinecke wrote:
> Instead of walking the list of devices manually use the helper
> function to return the device directly.

Reviewed-by: Bart van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 17:50   ` Douglas Gilbert
@ 2020-05-28 18:54     ` Matthew Wilcox
  2020-05-28 19:44       ` Douglas Gilbert
  2020-05-28 20:58       ` Hannes Reinecke
  0 siblings, 2 replies; 25+ messages in thread
From: Matthew Wilcox @ 2020-05-28 18:54 UTC (permalink / raw)
  To: Douglas Gilbert
  Cc: Hannes Reinecke, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
> > Use xarray for device lookup by target. LUNs below 256 are linear,
> > and can be used directly as the index into the xarray.
> > LUNs above 256 have a distinct LUN format, and are not necessarily
> > linear. They'll be stored in indices above 256 in the xarray, with
> > the next free index in the xarray.

I don't understand why you think this is an improvement over just
using an allocating XArray for all LUNs.  It seems like more code,
and doesn't actually save you any storage space ... ?

> > +++ b/drivers/scsi/scsi.c
> > @@ -601,13 +601,14 @@ static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
> >   void starget_for_each_device(struct scsi_target *starget, void *data,
> >   		     void (*fn)(struct scsi_device *, void *))
> >   {
> > -	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> >   	struct scsi_device *sdev;
> > +	unsigned long lun_id = 0;
> > -	shost_for_each_device(sdev, shost) {
> > -		if ((sdev->channel == starget->channel) &&
> > -		    (sdev->id == starget->id))
> > -			fn(sdev, data);
> > +	xa_for_each(&starget->devices, lun_id, sdev) {
> > +		if (scsi_device_get(sdev))
> > +			continue;
> > +		fn(sdev, data);
> > +		scsi_device_put(sdev);
> >   	}
> >   }

I appreciate you're implementing the API that's currently in use, but I would recommend open-coding this.  Looking at the functions called, lots of them
are two-three liners, eg:

static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
{
        sdev->was_reset = 1;
        sdev->expecting_cc_ua = 1;
}

which would just be more readable:

        if (rtn == SUCCESS) {
		struct scsi_target *starget = scsi_target(scmd->device);
		struct scsi_device *sdev;
		unsigned long index;

                spin_lock_irqsave(host->host_lock, flags);
		xa_for_each(&starget->devices, index, sdev) {
			sdev->was_reset = 1;
			sdev->expecting_cc_ua = 1;
		}
                spin_unlock_irqrestore(host->host_lock, flags);
        }

And then maybe you could make a convincing argument that the spin_lock
there could be turned into an xa_lock since that will prevent sdevs from
coming & going during the iteration of the device list.

> > @@ -1621,7 +1623,18 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
> >   	transport_setup_device(&sdev->sdev_gendev);
> >   	spin_lock_irqsave(shost->host_lock, flags);
> > -	list_add_tail(&sdev->same_target_siblings, &starget->devices);
> > +	if (sdev->lun < 256) {
> > +		sdev->lun_idx = sdev->lun;
> > +		WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
> > +				   sdev, GFP_KERNEL) < 0);
> 
> You have interrupts masked again, so GFP_KERNEL is a non-no.

Actually, I would say this is a great place to not use the host_lock.

+	int err;
...
+	if (sdev->lun < 256) {
+		sdev->lun_idx = sdev->lun;
+		err = xa_insert_irq(&starget->devices, sdev->lun_idx, sdev,
+				GFP_KERNEL);
+	} else {
+		err = xa_alloc_irq(&starget->devices, &sdev->lun_idx, sdev,
+				scsi_lun_limit, GFP_KERNEL);
+	}
+	... maybe you want to convert scsi_sysfs_device_initialize()
+	to return an errno, although honestly this should never fail ...
        spin_lock_irqsave(shost->host_lock, flags);
-       list_add_tail(&sdev->same_target_siblings, &starget->devices);
        list_add_tail(&sdev->siblings, &shost->__devices);
        spin_unlock_irqrestore(shost->host_lock, flags);


> > +	unsigned int lun_idx;		/* Index into target device xarray */
> 
> Xarray gives you _two_ choices for the type of an index :-)
> It is either u32 (as used in xa_alloc() and its variants) _or_
> unsigned long (as used everywhere else in xarray where an index
> is needed). So it's definitely 32 bits for xa_alloc() and either
> 32 or 64 bits for the rest of xarray depending on the machine
> architecture.
> Matthew Wilcox may care to explain why this is ...

Saving space in data structures, mostly.  It's not really useful to
have 4 billion things in an allocating XArray.  Indeed, it's not really
useful to have 4 billion of almost anything.  So we have XArrays which
are indexed by something sensible like page index in file, and they're
nice and well-behaved (most people don't create files in excess of 16TB,
and those who do don't expect amazing performance when seeking around
in them at random because they've lost all caching effects).  And then
you have people who probably want one scsi device.  Maybe a dozen.
Possibly a thousand.  Definitely not a million.  If you get 4 billion scsi
devices, something's gone very wrong.  You've run out of drive letters,
for a start (/dev/sdmzabcd anybody?)

It doesn't cost us anything to just use unsigned long as the index of an
XArray, due to how it's constructed.  Except in this one place where we
need to point to somewhere to store the index so that we can update the
index within an sdev before anybody iterating the data structure under
RCU can find the uninitialised index.

This patch seems decent enough ... I just think the decision to optimise
the layout for LUNs < 256 is a bit odd.  But hey, your subsystem,
not mine.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] scsi: convert target lookup to xarray
  2020-05-28 17:18   ` Douglas Gilbert
@ 2020-05-28 19:08     ` Douglas Gilbert
  0 siblings, 0 replies; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-28 19:08 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Daniel Wagner, Johannes Thumshirn,
	James Bottomley, linux-scsi

On 2020-05-28 1:18 p.m., Douglas Gilbert wrote:
> The following suggestions are based on trying to _run_ this code ...
> foolish, I know.

Now with the patches I've suggested it runs. So I ran my modprobe,
rmmod test building gradually out to 6000 odd devices. It ran
for a few minutes then got clobbered by the OOM killer.

Perhaps there are some misplaced xa_destroy()s, I thought.
After grep-ing, "un-placed" would be a better description.

So Hannes, for each xa_init*() call made, there needs to be a
xa_destroy() at the appropriate place and time, perferably before
the OOM killer strikes :-)

Doug Gilbert

> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
>> Use an xarray instead of lists for holding the scsi targets.
>> I've also shortened the 'channel' and 'id' values to 16 bit
>> as none of the drivers requires a full 32bit range for either
>> of them, and by shortening them we can use them as the index
>> into the xarray for storing the scsi_target pointer.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   drivers/scsi/hosts.c       |  2 +-
>>   drivers/scsi/scsi.c        | 32 ++++++++++++++++++++------------
>>   drivers/scsi/scsi_scan.c   | 43 ++++++++++++++++---------------------------
>>   drivers/scsi/scsi_sysfs.c  | 15 +++++++++++----
>>   include/scsi/scsi_device.h |  4 ++--
>>   include/scsi/scsi_host.h   |  2 +-
>>   6 files changed, 51 insertions(+), 47 deletions(-)
>>
>> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
>> index 7ec91c3a66ca..7109afad0183 100644
>> --- a/drivers/scsi/hosts.c
>> +++ b/drivers/scsi/hosts.c
>> @@ -383,7 +383,7 @@ struct Scsi_Host *scsi_host_alloc(struct 
>> scsi_host_template *sht, int privsize)
>>       spin_lock_init(shost->host_lock);
>>       shost->shost_state = SHOST_CREATED;
>>       INIT_LIST_HEAD(&shost->__devices);
>> -    INIT_LIST_HEAD(&shost->__targets);
>> +    xa_init(&shost->__targets);
> 
> Please use: xa_init_flags(&shost->__targets, XA_FLAGS_LOCK_IRQ);
> 
> If you ever touch (modify?) that array in interrupt context then I
> believe you need this. See the first 50 or so lines in lib/xarray.c
> to see that it is used for xarray internals (e.g. xas_lock_type() ).
> 
> The xarray interface (include/linux/xarray.h) also has *_irq() and
> *_bh() variants (e.g. xa_insert(), xa_insert_bh(), xa_insert_irq() )
> but setting that flag in the xa_init_flags() call seems to be a
> safer approach. IOWs it is an alternative to using the *_irq()
> variants on all the _modifying_ xarray calls.
> 
>>       INIT_LIST_HEAD(&shost->eh_cmd_q);
>>       INIT_LIST_HEAD(&shost->starved_list);
>>       init_waitqueue_head(&shost->host_wait);
>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>> index 56c24a73e0c7..d601424e32b2 100644
>> --- a/drivers/scsi/scsi.c
>> +++ b/drivers/scsi/scsi.c
>> @@ -575,6 +575,19 @@ struct scsi_device *__scsi_iterate_devices(struct 
>> Scsi_Host *shost,
>>   }
>>   EXPORT_SYMBOL(__scsi_iterate_devices);
>> +/**
>> + * __scsi_target_lookup  -  find a target based on channel and target id
>> + * @shost:    SCSI host pointer
>> + * @channel:    channel number of the target
>> + * @id:        ID of the target
>> + *
>> + */
>> +static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
>> +                     u16 channel, u16 id)
>> +{
>> +    return xa_load(&shost->__targets, (channel << 16) | id);
>> +}
>> +
>>   /**
>>    * starget_for_each_device  -  helper to walk all devices of a target
>>    * @starget:    target whose devices we want to iterate over.
>> @@ -701,19 +714,14 @@ EXPORT_SYMBOL(scsi_device_lookup_by_target);
>>    * really want to use scsi_device_lookup instead.
>>    **/
>>   struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost,
>> -        uint channel, uint id, u64 lun)
>> +        u16 channel, u16 id, u64 lun)
>>   {
>> -    struct scsi_device *sdev;
>> +    struct scsi_target *starget;
>> -    list_for_each_entry(sdev, &shost->__devices, siblings) {
>> -        if (sdev->sdev_state == SDEV_DEL)
>> -            continue;
>> -        if (sdev->channel == channel && sdev->id == id &&
>> -                sdev->lun ==lun)
>> -            return sdev;
>> -    }
>> -
>> -    return NULL;
>> +    starget = __scsi_target_lookup(shost, channel, id);
>> +    if (!starget)
>> +        return NULL;
>> +    return __scsi_device_lookup_by_target(starget, lun);
>>   }
>>   EXPORT_SYMBOL(__scsi_device_lookup);
>> @@ -729,7 +737,7 @@ EXPORT_SYMBOL(__scsi_device_lookup);
>>    * needs to be released with scsi_device_put once you're done with it.
>>    **/
>>   struct scsi_device *scsi_device_lookup(struct Scsi_Host *shost,
>> -        uint channel, uint id, u64 lun)
>> +        u16 channel, u16 id, u64 lun)
>>   {
>>       struct scsi_device *sdev;
>>       unsigned long flags;
>> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
>> index f2437a7570ce..dc2656df495b 100644
>> --- a/drivers/scsi/scsi_scan.c
>> +++ b/drivers/scsi/scsi_scan.c
>> @@ -304,11 +304,15 @@ static struct scsi_device *scsi_alloc_sdev(struct 
>> scsi_target *starget,
>>       return NULL;
>>   }
>> +#define scsi_target_index(s) \
>> +    ((((unsigned long)(s)->channel) << 16) | (s)->id)
>> +
>>   static void scsi_target_destroy(struct scsi_target *starget)
>>   {
>>       struct device *dev = &starget->dev;
>>       struct Scsi_Host *shost = dev_to_shost(dev->parent);
>>       unsigned long flags;
>> +    unsigned long tid = scsi_target_index(starget);
>>       BUG_ON(starget->state == STARGET_DEL);
>>       starget->state = STARGET_DEL;
>> @@ -316,7 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
>>       spin_lock_irqsave(shost->host_lock, flags);
>>       if (shost->hostt->target_destroy)
>>           shost->hostt->target_destroy(starget);
>> -    list_del_init(&starget->siblings);
>> +    xa_erase(&shost->__targets, tid);
>>       spin_unlock_irqrestore(shost->host_lock, flags);
>>       put_device(dev);
>>   }
>> @@ -341,27 +345,6 @@ int scsi_is_target_device(const struct device *dev)
>>   }
>>   EXPORT_SYMBOL(scsi_is_target_device);
>> -static struct scsi_target *__scsi_find_target(struct device *parent,
>> -                          int channel, uint id)
>> -{
>> -    struct scsi_target *starget, *found_starget = NULL;
>> -    struct Scsi_Host *shost = dev_to_shost(parent);
>> -    /*
>> -     * Search for an existing target for this sdev.
>> -     */
>> -    list_for_each_entry(starget, &shost->__targets, siblings) {
>> -        if (starget->id == id &&
>> -            starget->channel == channel) {
>> -            found_starget = starget;
>> -            break;
>> -        }
>> -    }
>> -    if (found_starget)
>> -        get_device(&found_starget->dev);
>> -
>> -    return found_starget;
>> -}
>> -
>>   /**
>>    * scsi_target_reap_ref_release - remove target from visibility
>>    * @kref: the reap_ref in the target being released
>> @@ -417,6 +400,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
>> *parent,
>>       struct scsi_target *starget;
>>       struct scsi_target *found_target;
>>       int error, ref_got;
>> +    unsigned long tid;
>>       starget = kzalloc(size, GFP_KERNEL);
>>       if (!starget) {
>> @@ -433,19 +417,24 @@ static struct scsi_target *scsi_alloc_target(struct 
>> device *parent,
>>       starget->id = id;
>>       starget->channel = channel;
>>       starget->can_queue = 0;
>> -    INIT_LIST_HEAD(&starget->siblings);
>>       INIT_LIST_HEAD(&starget->devices);
>>       starget->state = STARGET_CREATED;
>>       starget->scsi_level = SCSI_2;
>>       starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
>> +    tid = scsi_target_index(starget);
>>    retry:
>>       spin_lock_irqsave(shost->host_lock, flags);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Interrupts now masked on this CPU.
> 
>> -    found_target = __scsi_find_target(parent, channel, id);
>> -    if (found_target)
>> +    found_target = xa_load(&shost->__targets, tid);
>> +    if (found_target) {
>> +        get_device(&found_target->dev);
>>           goto found;
>> -
>> -    list_add_tail(&starget->siblings, &shost->__targets);
>> +    }
>> +    if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {
> 
> You have interrupts masked (on this CPU), you can't use GFP_KERNEL.
> If you run this code, you will be told :-)
> In this case, you must use GFP_ATOMIC.
> 
>> +        dev_printk(KERN_ERR, dev, "target index busy\n");
>> +        kfree(starget);
>> +        return NULL;
>> +    }
>>       spin_unlock_irqrestore(shost->host_lock, flags);
> ^^^^^^^^^^^^^^^^^^^^
> Now you can use GFP_KERNEL, if you want.
> 
>>       /* allocate and add */
>>       transport_setup_device(dev);
>> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
>> index 163dbcb741c1..95aaa96ce03b 100644
>> --- a/drivers/scsi/scsi_sysfs.c
>> +++ b/drivers/scsi/scsi_sysfs.c
>> @@ -1512,15 +1512,19 @@ void scsi_remove_target(struct device *dev)
>>   {
>>       struct Scsi_Host *shost = dev_to_shost(dev->parent);
>>       struct scsi_target *starget;
>> +    unsigned long tid = 0;
>>       unsigned long flags;
>> -restart:
>>       spin_lock_irqsave(shost->host_lock, flags);
>> -    list_for_each_entry(starget, &shost->__targets, siblings) {
>> +    starget = xa_find(&shost->__targets, &tid, ULONG_MAX, XA_PRESENT);
>> +    while (starget) {
>>           if (starget->state == STARGET_DEL ||
>>               starget->state == STARGET_REMOVE ||
>> -            starget->state == STARGET_CREATED_REMOVE)
>> +            starget->state == STARGET_CREATED_REMOVE) {
>> +            starget = xa_find_after(&shost->__targets, &tid,
>> +                        ULONG_MAX, XA_PRESENT);
>>               continue;
>> +        }
>>           if (starget->dev.parent == dev || &starget->dev == dev) {
>>               kref_get(&starget->reap_ref);
>>               if (starget->state == STARGET_CREATED)
>> @@ -1530,7 +1534,10 @@ void scsi_remove_target(struct device *dev)
>>               spin_unlock_irqrestore(shost->host_lock, flags);
>>               __scsi_remove_target(starget);
>>               scsi_target_reap(starget);
>> -            goto restart;
>> +            spin_lock_irqsave(shost->host_lock, flags);
>> +            starget = xa_find_after(&shost->__targets, &tid,
>> +                        ULONG_MAX, XA_PRESENT);
>> +            continue;
>>           }
>>       }
>>       spin_unlock_irqrestore(shost->host_lock, flags);
>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
>> index c3cba2aaf934..28034cc0fce5 100644
>> --- a/include/scsi/scsi_device.h
>> +++ b/include/scsi/scsi_device.h
>> @@ -345,9 +345,9 @@ extern struct scsi_device *scsi_device_from_queue(struct 
>> request_queue *q);
>>   extern int __must_check scsi_device_get(struct scsi_device *);
>>   extern void scsi_device_put(struct scsi_device *);
>>   extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
>> -                          uint, uint, u64);
>> +                          u16, u16, u64);
>>   extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *,
>> -                        uint, uint, u64);
>> +                        u16, u16, u64);
>>   extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *,
>>                               u64);
>>   extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *,
>> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
>> index 822e8cda8d9b..b9395676c75b 100644
>> --- a/include/scsi/scsi_host.h
>> +++ b/include/scsi/scsi_host.h
>> @@ -521,7 +521,7 @@ struct Scsi_Host {
>>        * access this list directly from a driver.
>>        */
>>       struct list_head    __devices;
>> -    struct list_head    __targets;
>> +    struct xarray        __targets;
>>
>>       struct list_head    starved_list;
>>
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 18:54     ` Matthew Wilcox
@ 2020-05-28 19:44       ` Douglas Gilbert
  2020-05-28 19:53         ` Matthew Wilcox
  2020-05-29  6:45         ` Hannes Reinecke
  2020-05-28 20:58       ` Hannes Reinecke
  1 sibling, 2 replies; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-28 19:44 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Hannes Reinecke, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On 2020-05-28 2:54 p.m., Matthew Wilcox wrote:
> On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
>> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
>>> Use xarray for device lookup by target. LUNs below 256 are linear,
>>> and can be used directly as the index into the xarray.
>>> LUNs above 256 have a distinct LUN format, and are not necessarily
>>> linear. They'll be stored in indices above 256 in the xarray, with
>>> the next free index in the xarray.
> 
> I don't understand why you think this is an improvement over just
> using an allocating XArray for all LUNs.  It seems like more code,
> and doesn't actually save you any storage space ... ?
> 
>>> +++ b/drivers/scsi/scsi.c
>>> @@ -601,13 +601,14 @@ static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
>>>    void starget_for_each_device(struct scsi_target *starget, void *data,
>>>    		     void (*fn)(struct scsi_device *, void *))
>>>    {
>>> -	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>>>    	struct scsi_device *sdev;
>>> +	unsigned long lun_id = 0;
>>> -	shost_for_each_device(sdev, shost) {
>>> -		if ((sdev->channel == starget->channel) &&
>>> -		    (sdev->id == starget->id))
>>> -			fn(sdev, data);
>>> +	xa_for_each(&starget->devices, lun_id, sdev) {
>>> +		if (scsi_device_get(sdev))
>>> +			continue;
>>> +		fn(sdev, data);
>>> +		scsi_device_put(sdev);
>>>    	}
>>>    }
> 
> I appreciate you're implementing the API that's currently in use, but I would recommend open-coding this.  Looking at the functions called, lots of them
> are two-three liners, eg:
> 
> static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
> {
>          sdev->was_reset = 1;
>          sdev->expecting_cc_ua = 1;
> }
> 
> which would just be more readable:
> 
>          if (rtn == SUCCESS) {
> 		struct scsi_target *starget = scsi_target(scmd->device);
> 		struct scsi_device *sdev;
> 		unsigned long index;
> 
>                  spin_lock_irqsave(host->host_lock, flags);
> 		xa_for_each(&starget->devices, index, sdev) {
> 			sdev->was_reset = 1;
> 			sdev->expecting_cc_ua = 1;
> 		}
>                  spin_unlock_irqrestore(host->host_lock, flags);
>          }
> 
> And then maybe you could make a convincing argument that the spin_lock
> there could be turned into an xa_lock since that will prevent sdevs from
> coming & going during the iteration of the device list.

I tried that but ran into problems. The xarray model is clear enough,
but there is a (non-atomic) enumerated state in each sdev (struct
scsi_device object (pointer)) that is protected by a mutex.
I was unable to escape those pesky (but very useful) warnings out of
the depths of xarray that the locking was awry. When I've burrowed
into xarray.c I have usually found that it was correct. So now I regard
it as a pesky feature :-)

That said, in my work on the sg driver rewrite, I got rid of all
the explicit spinlocks in favour of xa_locks and that works fine, IMO.
The locking in the SCSI ML is a more complex beast. Maybe it doesn't
need to be ...


Hannes, why not make sdev_state an atomic and get rid of that mutex?

Virtually none of this code is on the command fastpath, but Ming Lei
found that trimming down the size and position of members in sdev
could lead to performance improvements.

These changes may improve the time taken to stabilize a system with large
disk arrays attached at startup, shutdown and when there are disruptions
or errors.

>>> @@ -1621,7 +1623,18 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
>>>    	transport_setup_device(&sdev->sdev_gendev);
>>>    	spin_lock_irqsave(shost->host_lock, flags);
>>> -	list_add_tail(&sdev->same_target_siblings, &starget->devices);
>>> +	if (sdev->lun < 256) {
>>> +		sdev->lun_idx = sdev->lun;
>>> +		WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
>>> +				   sdev, GFP_KERNEL) < 0);
>>
>> You have interrupts masked again, so GFP_KERNEL is a non-no.
> 
> Actually, I would say this is a great place to not use the host_lock.
> 
> +	int err;
> ...
> +	if (sdev->lun < 256) {
> +		sdev->lun_idx = sdev->lun;
> +		err = xa_insert_irq(&starget->devices, sdev->lun_idx, sdev,
> +				GFP_KERNEL);
> +	} else {
> +		err = xa_alloc_irq(&starget->devices, &sdev->lun_idx, sdev,
> +				scsi_lun_limit, GFP_KERNEL);
> +	}
> +	... maybe you want to convert scsi_sysfs_device_initialize()
> +	to return an errno, although honestly this should never fail ...
>          spin_lock_irqsave(shost->host_lock, flags);
> -       list_add_tail(&sdev->same_target_siblings, &starget->devices);
>          list_add_tail(&sdev->siblings, &shost->__devices);
>          spin_unlock_irqrestore(shost->host_lock, flags);
> 
> 
>>> +	unsigned int lun_idx;		/* Index into target device xarray */
>>
>> Xarray gives you _two_ choices for the type of an index :-)
>> It is either u32 (as used in xa_alloc() and its variants) _or_
>> unsigned long (as used everywhere else in xarray where an index
>> is needed). So it's definitely 32 bits for xa_alloc() and either
>> 32 or 64 bits for the rest of xarray depending on the machine
>> architecture.
>> Matthew Wilcox may care to explain why this is ...
> 
> Saving space in data structures, mostly.  It's not really useful to
> have 4 billion things in an allocating XArray.  Indeed, it's not really
> useful to have 4 billion of almost anything.  So we have XArrays which
> are indexed by something sensible like page index in file, and they're
> nice and well-behaved (most people don't create files in excess of 16TB,
> and those who do don't expect amazing performance when seeking around
> in them at random because they've lost all caching effects).  And then
> you have people who probably want one scsi device.  Maybe a dozen.
> Possibly a thousand.  Definitely not a million.  If you get 4 billion scsi
> devices, something's gone very wrong.  You've run out of drive letters,
> for a start (/dev/sdmzabcd anybody?)

    no_uld=1     # RAID cards might do this on its physicals so smartmontools
                 # can check them

Then you only have to worry about /dev/sg<n> and /dev/bsg/<hctl) numbers
or tuples going through the roof :-)

> It doesn't cost us anything to just use unsigned long as the index of an
> XArray, due to how it's constructed.  Except in this one place where we
> need to point to somewhere to store the index so that we can update the
> index within an sdev before anybody iterating the data structure under
> RCU can find the uninitialised index.
> 
> This patch seems decent enough ... I just think the decision to optimise
> the layout for LUNs < 256 is a bit odd.  But hey, your subsystem,
> not mine.

Hannes has the most experience in that area. He has only been using xarray
for a week or so (I think). There is something important missing from his
patchset: any xa_destroy() calls :-)

Doug Gilbert



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 19:44       ` Douglas Gilbert
@ 2020-05-28 19:53         ` Matthew Wilcox
  2020-05-29  6:45         ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Matthew Wilcox @ 2020-05-28 19:53 UTC (permalink / raw)
  To: Douglas Gilbert
  Cc: Hannes Reinecke, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Thu, May 28, 2020 at 03:44:37PM -0400, Douglas Gilbert wrote:
> > And then maybe you could make a convincing argument that the spin_lock
> > there could be turned into an xa_lock since that will prevent sdevs from
> > coming & going during the iteration of the device list.
> 
> I tried that but ran into problems. The xarray model is clear enough,
> but there is a (non-atomic) enumerated state in each sdev (struct
> scsi_device object (pointer)) that is protected by a mutex.
> I was unable to escape those pesky (but very useful) warnings out of
> the depths of xarray that the locking was awry. When I've burrowed
> into xarray.c I have usually found that it was correct. So now I regard
> it as a pesky feature :-)

Fascinating.  So much of SCSI has to be IRQ-safe, and yet you can't look
at the sdev state outside process context ...

> > This patch seems decent enough ... I just think the decision to optimise
> > the layout for LUNs < 256 is a bit odd.  But hey, your subsystem,
> > not mine.
> 
> Hannes has the most experience in that area. He has only been using xarray
> for a week or so (I think). There is something important missing from his
> patchset: any xa_destroy() calls :-)

They're not necessarily needed.  If the XArray is empty, there's nothing
to do.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 18:54     ` Matthew Wilcox
  2020-05-28 19:44       ` Douglas Gilbert
@ 2020-05-28 20:58       ` Hannes Reinecke
  2020-05-29  0:20         ` Matthew Wilcox
  1 sibling, 1 reply; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-28 20:58 UTC (permalink / raw)
  To: Matthew Wilcox, Douglas Gilbert
  Cc: Martin K. Petersen, Christoph Hellwig, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 5/28/20 8:54 PM, Matthew Wilcox wrote:
> On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
>> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
>>> Use xarray for device lookup by target. LUNs below 256 are linear,
>>> and can be used directly as the index into the xarray.
>>> LUNs above 256 have a distinct LUN format, and are not necessarily
>>> linear. They'll be stored in indices above 256 in the xarray, with
>>> the next free index in the xarray.
> 
> I don't understand why you think this is an improvement over just
> using an allocating XArray for all LUNs.  It seems like more code,
> and doesn't actually save you any storage space ... ?
> 
The LUN range is 64 bit.
I was under the impression that xarray can only handle up to unsigned 
long; which probably would work for 64bit systems, but there _are_ users 
running on 32 bit, and they get patently unhappy when we have to tell 
them 'sorry, not for you'.

At least I _know_ that I'll get complaints if I try a stunt like this.

If you can make xarray 64 bit on all systems, sure, I'm game.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 20:58       ` Hannes Reinecke
@ 2020-05-29  0:20         ` Matthew Wilcox
  2020-05-29  6:50           ` Hannes Reinecke
  0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2020-05-29  0:20 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Thu, May 28, 2020 at 10:58:31PM +0200, Hannes Reinecke wrote:
> On 5/28/20 8:54 PM, Matthew Wilcox wrote:
> > On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
> > > On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
> > > > Use xarray for device lookup by target. LUNs below 256 are linear,
> > > > and can be used directly as the index into the xarray.
> > > > LUNs above 256 have a distinct LUN format, and are not necessarily
> > > > linear. They'll be stored in indices above 256 in the xarray, with
> > > > the next free index in the xarray.
> > 
> > I don't understand why you think this is an improvement over just
> > using an allocating XArray for all LUNs.  It seems like more code,
> > and doesn't actually save you any storage space ... ?
> > 
> The LUN range is 64 bit.
> I was under the impression that xarray can only handle up to unsigned long;
> which probably would work for 64bit systems, but there _are_ users running
> on 32 bit, and they get patently unhappy when we have to tell them 'sorry,
> not for you'.

I meant just use xa_alloc() for everything instead of using xa_insert for
0-255.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCHv3 0/4] scsi: use xarray ... scsi_alloc_target()
  2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (3 preceding siblings ...)
  2020-05-28 16:36 ` [PATCH 4/4] scsi: remove direct device lookup per host Hannes Reinecke
@ 2020-05-29  4:21 ` Douglas Gilbert
  4 siblings, 0 replies; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-29  4:21 UTC (permalink / raw)
  To: Hannes Reinecke, Martin K. Petersen
  Cc: Christoph Hellwig, Daniel Wagner, Johannes Thumshirn,
	James Bottomley, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

Hannes,
I believe scsi_alloc_target() is broken in the found_target case.
For starters starget is created, built and _not_ put in the xarray,
hence it is leaked?  Seems to me that the code shouldn't go
as far as it does before it detects the found_target case.

What I'm seeing in testing (script attached) after applying my
patches outlined in earlier posts (second attachment) is that
LUN 0 is missing on all targets within a host apart from
target_id==0 . For example:

# modprobe scsi_debug ptype=0x9 no_uld=1 max_luns=2 num_tgts=2
# lsscsi -gs
[0:0:0:0]    comms   Linux    scsi_debug       0189  -  /dev/sg0        -
[0:0:0:1]    comms   Linux    scsi_debug       0189  -  /dev/sg1        -
[0:0:1:1]    comms   Linux    scsi_debug       0189  -  /dev/sg2        -
[N:0:1:1]    disk    INTEL SSDPEKKF256G7L__1     /dev/nvme0n1  -      256GB

# sg_luns /dev/sg2
Lun list length = 16 which imples 2 lun entries
Report luns [select_report=0x0]:
     0000000000000000
     0001000000000000

So where did [0:0:1:0] go? The response to REPORT LUNS says it should
be there.

I thought about jumping into scsi_alloc_target() but it is horribly
complicated, so I'll let you deal with it :-)

Doug Gilbert

[-- Attachment #2: tst_sdebug_modpr_rmmod.sh --]
[-- Type: application/x-shellscript, Size: 808 bytes --]

[-- Attachment #3: hr2_fixes.patch --]
[-- Type: text/x-patch, Size: 4382 bytes --]

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 004a50a95560..7dff8ac850ab 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -345,6 +345,7 @@ static void scsi_host_dev_release(struct device *dev)
 
 	if (parent)
 		put_device(parent);
+	xa_destroy(&shost->__targets);
 	kfree(shost);
 }
 
@@ -382,7 +383,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->host_lock = &shost->default_lock;
 	spin_lock_init(shost->host_lock);
 	shost->shost_state = SHOST_CREATED;
-	xa_init(&shost->__targets);
+	xa_init_flags(&shost->__targets, XA_FLAGS_LOCK_IRQ);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
 	init_waitqueue_head(&shost->host_wait);
@@ -501,6 +502,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
  fail_index_remove:
 	ida_simple_remove(&host_index_ida, shost->host_no);
  fail_kfree:
+	xa_destroy(&shost->__targets);
 	kfree(shost);
 	return NULL;
 }
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index e75e7f93f8f6..e146c36adcf3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -309,6 +309,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
 {
 	struct device *dev = &starget->dev;
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
+	struct scsi_target *e_starg;
 	unsigned long flags;
 	unsigned long tid = scsi_target_index(starget);
 
@@ -318,8 +319,10 @@ static void scsi_target_destroy(struct scsi_target *starget)
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (shost->hostt->target_destroy)
 		shost->hostt->target_destroy(starget);
-	xa_erase(&shost->__targets, tid);
+	e_starg = xa_erase(&shost->__targets, tid);
 	spin_unlock_irqrestore(shost->host_lock, flags);
+	if (e_starg != starget)
+		pr_err("%s: bad xa_erase()\n", __func__);
 	put_device(dev);
 }
 
@@ -328,6 +331,7 @@ static void scsi_target_dev_release(struct device *dev)
 	struct device *parent = dev->parent;
 	struct scsi_target *starget = to_scsi_target(dev);
 
+	xa_destroy(&starget->devices);
 	kfree(starget);
 	put_device(parent);
 }
@@ -415,7 +419,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->id = id;
 	starget->channel = channel;
 	starget->can_queue = 0;
-	xa_init(&starget->devices);
+	xa_init_flags(&starget->devices, XA_FLAGS_LOCK_IRQ);
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
@@ -428,7 +432,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 		get_device(&found_target->dev);
 		goto found;
 	}
-	if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {
+	if (xa_insert(&shost->__targets, tid, starget, GFP_ATOMIC)) {
 		dev_printk(KERN_ERR, dev, "target index busy\n");
 		kfree(starget);
 		return NULL;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 63fa57684782..8a5e6c0a4bed 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -434,6 +434,7 @@ static void scsi_device_cls_release(struct device *class_dev)
 static void scsi_device_dev_release_usercontext(struct work_struct *work)
 {
 	struct scsi_device *sdev;
+	struct scsi_device *e_sdev;
 	struct scsi_target *starget;
 	struct device *parent;
 	struct list_head *this, *tmp;
@@ -449,9 +450,11 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	parent = sdev->sdev_gendev.parent;
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
-	xa_erase(&starget->devices, sdev->lun_idx);
+	e_sdev = xa_erase(&starget->devices, sdev->lun_idx);
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
+	if (e_sdev != sdev)
+		pr_err("%s: bad sdev erase\n", __func__);
 
 	cancel_work_sync(&sdev->event_work);
 
@@ -1621,14 +1624,14 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 	if (sdev->lun < 256) {
 		sdev->lun_idx = sdev->lun;
 		WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
-				   sdev, GFP_KERNEL) < 0);
+				   sdev, GFP_ATOMIC) < 0);
 	} else {
 		struct xa_limit scsi_lun_limit = {
 			.min = 256,
 			.max = UINT_MAX,
 		};
 		WARN_ON(xa_alloc(&starget->devices, &sdev->lun_idx,
-				  sdev, scsi_lun_limit, GFP_KERNEL) < 0);
+				  sdev, scsi_lun_limit, GFP_ATOMIC) < 0);
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/*

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] scsi: convert target lookup to xarray
  2020-05-28 17:48   ` Bart Van Assche
@ 2020-05-29  6:24     ` Hannes Reinecke
  0 siblings, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-29  6:24 UTC (permalink / raw)
  To: Bart Van Assche, Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 5/28/20 7:48 PM, Bart Van Assche wrote:
> On 2020-05-28 09:36, Hannes Reinecke wrote:
>> +#define scsi_target_index(s) \
>> +	((((unsigned long)(s)->channel) << 16) | (s)->id)
> 
> Please define scsi_target_index() as an inline function instead of a macro.
> 
>> +	if (xa_insert(&shost->__targets, tid, starget, GFP_KERNEL)) {
>> +		dev_printk(KERN_ERR, dev, "target index busy\n");
>> +		kfree(starget);
>> +		return NULL;
>> +	}
> 
> So the above code passes GFP_KERNEL to xa_insert() while holding a
> spinlock? That doesn't seem correct to me. Since xa_insert() provides
> locking itself, can the spin_lock_irqsave(shost->host_lock, flags) /
> spin_unlock_irqrestore(shost->host_lock, flags) pair be removed and can
> the xa_load() and xa_insert() calls be combined into a single
> xa_insert() call? I think xa_insert() returns -EBUSY if an entry already
> exists.
> 
>> -restart:
>>   	spin_lock_irqsave(shost->host_lock, flags);
>> -	list_for_each_entry(starget, &shost->__targets, siblings) {
>> +	starget = xa_find(&shost->__targets, &tid, ULONG_MAX, XA_PRESENT);
>> +	while (starget) {
>>   		if (starget->state == STARGET_DEL ||
>>   		    starget->state == STARGET_REMOVE ||
>> -		    starget->state == STARGET_CREATED_REMOVE)
>> +		    starget->state == STARGET_CREATED_REMOVE) {
>> +			starget = xa_find_after(&shost->__targets, &tid,
>> +						ULONG_MAX, XA_PRESENT);
>>   			continue;
>> +		}
>>   		if (starget->dev.parent == dev || &starget->dev == dev) {
>>   			kref_get(&starget->reap_ref);
>>   			if (starget->state == STARGET_CREATED)
>> @@ -1530,7 +1534,10 @@ void scsi_remove_target(struct device *dev)
>>   			spin_unlock_irqrestore(shost->host_lock, flags);
>>   			__scsi_remove_target(starget);
>>   			scsi_target_reap(starget);
>> -			goto restart;
>> +			spin_lock_irqsave(shost->host_lock, flags);
>> +			starget = xa_find_after(&shost->__targets, &tid,
>> +						ULONG_MAX, XA_PRESENT);
>> +			continue;
>>   		}
>>   	}
> 
> How about using a for loop instead of a while loop such that the
> xa_find_after() statement does not have to be duplicated?
> 
Ok, will do for the next round.

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-28 19:44       ` Douglas Gilbert
  2020-05-28 19:53         ` Matthew Wilcox
@ 2020-05-29  6:45         ` Hannes Reinecke
  1 sibling, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-29  6:45 UTC (permalink / raw)
  To: dgilbert, Matthew Wilcox
  Cc: Martin K. Petersen, Christoph Hellwig, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 5/28/20 9:44 PM, Douglas Gilbert wrote:
> On 2020-05-28 2:54 p.m., Matthew Wilcox wrote:
>> On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
>>> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
>>>> Use xarray for device lookup by target. LUNs below 256 are linear,
>>>> and can be used directly as the index into the xarray.
>>>> LUNs above 256 have a distinct LUN format, and are not necessarily
>>>> linear. They'll be stored in indices above 256 in the xarray, with
>>>> the next free index in the xarray.
>>
>> I don't understand why you think this is an improvement over just
>> using an allocating XArray for all LUNs.  It seems like more code,
>> and doesn't actually save you any storage space ... ?
>>
>>>> +++ b/drivers/scsi/scsi.c
>>>> @@ -601,13 +601,14 @@ static struct scsi_target 
>>>> *__scsi_target_lookup(struct Scsi_Host *shost,
>>>>    void starget_for_each_device(struct scsi_target *starget, void 
>>>> *data,
>>>>                 void (*fn)(struct scsi_device *, void *))
>>>>    {
>>>> -    struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
>>>>        struct scsi_device *sdev;
>>>> +    unsigned long lun_id = 0;
>>>> -    shost_for_each_device(sdev, shost) {
>>>> -        if ((sdev->channel == starget->channel) &&
>>>> -            (sdev->id == starget->id))
>>>> -            fn(sdev, data);
>>>> +    xa_for_each(&starget->devices, lun_id, sdev) {
>>>> +        if (scsi_device_get(sdev))
>>>> +            continue;
>>>> +        fn(sdev, data);
>>>> +        scsi_device_put(sdev);
>>>>        }
>>>>    }
>>
>> I appreciate you're implementing the API that's currently in use, but 
>> I would recommend open-coding this.  Looking at the functions called, 
>> lots of them
>> are two-three liners, eg:
>>
>> static void __scsi_report_device_reset(struct scsi_device *sdev, void 
>> *data)
>> {
>>          sdev->was_reset = 1;
>>          sdev->expecting_cc_ua = 1;
>> }
>>
>> which would just be more readable:
>>
>>          if (rtn == SUCCESS) {
>>         struct scsi_target *starget = scsi_target(scmd->device);
>>         struct scsi_device *sdev;
>>         unsigned long index;
>>
>>                  spin_lock_irqsave(host->host_lock, flags);
>>         xa_for_each(&starget->devices, index, sdev) {
>>             sdev->was_reset = 1;
>>             sdev->expecting_cc_ua = 1;
>>         }
>>                  spin_unlock_irqrestore(host->host_lock, flags);
>>          }
>>
>> And then maybe you could make a convincing argument that the spin_lock
>> there could be turned into an xa_lock since that will prevent sdevs from
>> coming & going during the iteration of the device list.
> 
> I tried that but ran into problems. The xarray model is clear enough,
> but there is a (non-atomic) enumerated state in each sdev (struct
> scsi_device object (pointer)) that is protected by a mutex.
> I was unable to escape those pesky (but very useful) warnings out of
> the depths of xarray that the locking was awry. When I've burrowed
> into xarray.c I have usually found that it was correct. So now I regard
> it as a pesky feature :-)
> 
> That said, in my work on the sg driver rewrite, I got rid of all
> the explicit spinlocks in favour of xa_locks and that works fine, IMO.
> The locking in the SCSI ML is a more complex beast. Maybe it doesn't
> need to be ...
> 
This whole code will need to be updated once we decided to move to xarrays.
At this time I'm just trying to get the minimal patches in; cleanups can 
(and will) follow later.

> 
> Hannes, why not make sdev_state an atomic and get rid of that mutex?
> 
> Virtually none of this code is on the command fastpath, but Ming Lei
> found that trimming down the size and position of members in sdev
> could lead to performance improvements.
> 
> These changes may improve the time taken to stabilize a system with large
> disk arrays attached at startup, shutdown and when there are disruptions
> or errors.
> 
Virtually none is not identical to none.
And locking is an audit unto itself, so I would _not_ attempt to revise 
the locking model at the same time when changing the lookup.

Plan is to first change the lookup to xarray, and _then_ audit the 
locking such than we can move the xa_lock() in appropriate places.

>>>> @@ -1621,7 +1623,18 @@ void scsi_sysfs_device_initialize(struct 
>>>> scsi_device *sdev)
>>>>        transport_setup_device(&sdev->sdev_gendev);
>>>>        spin_lock_irqsave(shost->host_lock, flags);
>>>> -    list_add_tail(&sdev->same_target_siblings, &starget->devices);
>>>> +    if (sdev->lun < 256) {
>>>> +        sdev->lun_idx = sdev->lun;
>>>> +        WARN_ON(xa_insert(&starget->devices, sdev->lun_idx,
>>>> +                   sdev, GFP_KERNEL) < 0);
>>>
>>> You have interrupts masked again, so GFP_KERNEL is a non-no.
>>
>> Actually, I would say this is a great place to not use the host_lock.
>>
>> +    int err;
>> ...
>> +    if (sdev->lun < 256) {
>> +        sdev->lun_idx = sdev->lun;
>> +        err = xa_insert_irq(&starget->devices, sdev->lun_idx, sdev,
>> +                GFP_KERNEL);
>> +    } else {
>> +        err = xa_alloc_irq(&starget->devices, &sdev->lun_idx, sdev,
>> +                scsi_lun_limit, GFP_KERNEL);
>> +    }
>> +    ... maybe you want to convert scsi_sysfs_device_initialize()
>> +    to return an errno, although honestly this should never fail ...
>>          spin_lock_irqsave(shost->host_lock, flags);
>> -       list_add_tail(&sdev->same_target_siblings, &starget->devices);
>>          list_add_tail(&sdev->siblings, &shost->__devices);
>>          spin_unlock_irqrestore(shost->host_lock, flags);
>>
>>

See above.
Yes, it is worth revisiting.
But in a next step, not now.

>>>> +    unsigned int lun_idx;        /* Index into target device xarray */
>>>
>>> Xarray gives you _two_ choices for the type of an index :-)
>>> It is either u32 (as used in xa_alloc() and its variants) _or_
>>> unsigned long (as used everywhere else in xarray where an index
>>> is needed). So it's definitely 32 bits for xa_alloc() and either
>>> 32 or 64 bits for the rest of xarray depending on the machine
>>> architecture.
>>> Matthew Wilcox may care to explain why this is ...
>>
>> Saving space in data structures, mostly.  It's not really useful to
>> have 4 billion things in an allocating XArray.  Indeed, it's not really
>> useful to have 4 billion of almost anything.  So we have XArrays which
>> are indexed by something sensible like page index in file, and they're
>> nice and well-behaved (most people don't create files in excess of 16TB,
>> and those who do don't expect amazing performance when seeking around
>> in them at random because they've lost all caching effects).  And then
>> you have people who probably want one scsi device.  Maybe a dozen.
>> Possibly a thousand.  Definitely not a million.  If you get 4 billion 
>> scsi
>> devices, something's gone very wrong.  You've run out of drive letters,
>> for a start (/dev/sdmzabcd anybody?)
> 
>     no_uld=1     # RAID cards might do this on its physicals so 
> smartmontools
>                  # can check them
> 
> Then you only have to worry about /dev/sg<n> and /dev/bsg/<hctl) numbers
> or tuples going through the roof :-)
> 
>> It doesn't cost us anything to just use unsigned long as the index of an
>> XArray, due to how it's constructed.  Except in this one place where we
>> need to point to somewhere to store the index so that we can update the
>> index within an sdev before anybody iterating the data structure under
>> RCU can find the uninitialised index.
>>
>> This patch seems decent enough ... I just think the decision to optimise
>> the layout for LUNs < 256 is a bit odd.  But hey, your subsystem,
>> not mine.
> 
> Hannes has the most experience in that area. He has only been using xarray
> for a week or so (I think). There is something important missing from his
> patchset: any xa_destroy() calls :-)
> 
Yes! I know!
Working on it.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29  0:20         ` Matthew Wilcox
@ 2020-05-29  6:50           ` Hannes Reinecke
  2020-05-29 11:21             ` Matthew Wilcox
  0 siblings, 1 reply; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-29  6:50 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On 5/29/20 2:20 AM, Matthew Wilcox wrote:
> On Thu, May 28, 2020 at 10:58:31PM +0200, Hannes Reinecke wrote:
>> On 5/28/20 8:54 PM, Matthew Wilcox wrote:
>>> On Thu, May 28, 2020 at 01:50:02PM -0400, Douglas Gilbert wrote:
>>>> On 2020-05-28 12:36 p.m., Hannes Reinecke wrote:
>>>>> Use xarray for device lookup by target. LUNs below 256 are linear,
>>>>> and can be used directly as the index into the xarray.
>>>>> LUNs above 256 have a distinct LUN format, and are not necessarily
>>>>> linear. They'll be stored in indices above 256 in the xarray, with
>>>>> the next free index in the xarray.
>>>
>>> I don't understand why you think this is an improvement over just
>>> using an allocating XArray for all LUNs.  It seems like more code,
>>> and doesn't actually save you any storage space ... ?
>>>
>> The LUN range is 64 bit.
>> I was under the impression that xarray can only handle up to unsigned long;
>> which probably would work for 64bit systems, but there _are_ users running
>> on 32 bit, and they get patently unhappy when we have to tell them 'sorry,
>> not for you'.
> 
> I meant just use xa_alloc() for everything instead of using xa_insert for
> 0-255.
> 
But then I'll have to use xa_find() to get to the actual element as the 
1:1 mapping between SCSI LUN and array index is lost.
And seeing that most storage arrays will expose only up to 256 LUNs I 
thought this was a good improvement on lookup.
Of course, this only makes sense if xa_load() is more efficient than 
xa_find(). If not then of course it's a bit futile.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup()
  2020-05-28 16:36 ` [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
  2020-05-28 17:55   ` Bart Van Assche
@ 2020-05-29  7:02   ` Daniel Wagner
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Wagner @ 2020-05-29  7:02 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Martin K. Petersen, Christoph Hellwig, Doug Gilbert,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Thu, May 28, 2020 at 06:36:23PM +0200, Hannes Reinecke wrote:
> Instead of walking the list of devices manually use the helper
> function to return the device directly.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>

Reviewed-by: Daniel Wagner <dwagner@suse.de>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29  6:50           ` Hannes Reinecke
@ 2020-05-29 11:21             ` Matthew Wilcox
  2020-05-29 12:46               ` Hannes Reinecke
  0 siblings, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2020-05-29 11:21 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Fri, May 29, 2020 at 08:50:21AM +0200, Hannes Reinecke wrote:
> > I meant just use xa_alloc() for everything instead of using xa_insert for
> > 0-255.
> > 
> But then I'll have to use xa_find() to get to the actual element as the 1:1
> mapping between SCSI LUN and array index is lost.
> And seeing that most storage arrays will expose only up to 256 LUNs I
> thought this was a good improvement on lookup.
> Of course, this only makes sense if xa_load() is more efficient than
> xa_find(). If not then of course it's a bit futile.

xa_load() is absolutely more efficient than xa_find().  It's just a
question of whether it matters ;-)  Carry on ...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29 11:21             ` Matthew Wilcox
@ 2020-05-29 12:46               ` Hannes Reinecke
  2020-05-29 12:50                 ` Matthew Wilcox
  2020-05-29 16:24                 ` Douglas Gilbert
  0 siblings, 2 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-29 12:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On 5/29/20 1:21 PM, Matthew Wilcox wrote:
> On Fri, May 29, 2020 at 08:50:21AM +0200, Hannes Reinecke wrote:
>>> I meant just use xa_alloc() for everything instead of using xa_insert for
>>> 0-255.
>>>
>> But then I'll have to use xa_find() to get to the actual element as the 1:1
>> mapping between SCSI LUN and array index is lost.
>> And seeing that most storage arrays will expose only up to 256 LUNs I
>> thought this was a good improvement on lookup.
>> Of course, this only makes sense if xa_load() is more efficient than
>> xa_find(). If not then of course it's a bit futile.
> 
> xa_load() is absolutely more efficient than xa_find().  It's just a
> question of whether it matters ;-)  Carry on ...
> 
Thanks. I will.

BTW, care to update the documentation?

  * Return: 0 on success, -ENOMEM if memory could not be allocated or
  * -EBUSY if there are no free entries in @limit.
  */
int __xa_alloc(struct xarray *xa, u32 *id, void *entry,
		struct xa_limit limit, gfp_t gfp)
{
	XA_STATE(xas, xa, 0);

	if (WARN_ON_ONCE(xa_is_advanced(entry)))
		return -EINVAL;
	if (WARN_ON_ONCE(!xa_track_free(xa)))
		return -EINVAL;

So looks as if the documentation is in need of updating to cover -EINVAL.
And, please, state somewhere that whenever you want to use xa_alloc() 
you _need_ to specify XA_ALLOC_FLAGS in xa_init_flags(), otherwise
you trip across the above.

It's not entirely obvious, and took me the better half of a day to 
figure out.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29 12:46               ` Hannes Reinecke
@ 2020-05-29 12:50                 ` Matthew Wilcox
  2020-05-29 13:17                   ` Hannes Reinecke
  2020-05-29 16:24                 ` Douglas Gilbert
  1 sibling, 1 reply; 25+ messages in thread
From: Matthew Wilcox @ 2020-05-29 12:50 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On Fri, May 29, 2020 at 02:46:48PM +0200, Hannes Reinecke wrote:
> On 5/29/20 1:21 PM, Matthew Wilcox wrote:
> > On Fri, May 29, 2020 at 08:50:21AM +0200, Hannes Reinecke wrote:
> > > > I meant just use xa_alloc() for everything instead of using xa_insert for
> > > > 0-255.
> > > > 
> > > But then I'll have to use xa_find() to get to the actual element as the 1:1
> > > mapping between SCSI LUN and array index is lost.
> > > And seeing that most storage arrays will expose only up to 256 LUNs I
> > > thought this was a good improvement on lookup.
> > > Of course, this only makes sense if xa_load() is more efficient than
> > > xa_find(). If not then of course it's a bit futile.
> > 
> > xa_load() is absolutely more efficient than xa_find().  It's just a
> > question of whether it matters ;-)  Carry on ...
> > 
> Thanks. I will.
> 
> BTW, care to update the documentation?
> 
>  * Return: 0 on success, -ENOMEM if memory could not be allocated or
>  * -EBUSY if there are no free entries in @limit.
>  */
> int __xa_alloc(struct xarray *xa, u32 *id, void *entry,
> 		struct xa_limit limit, gfp_t gfp)
> {
> 	XA_STATE(xas, xa, 0);
> 
> 	if (WARN_ON_ONCE(xa_is_advanced(entry)))
> 		return -EINVAL;
> 	if (WARN_ON_ONCE(!xa_track_free(xa)))
> 		return -EINVAL;
> 
> So looks as if the documentation is in need of updating to cover -EINVAL.
> And, please, state somewhere that whenever you want to use xa_alloc() you
> _need_ to specify XA_ALLOC_FLAGS in xa_init_flags(), otherwise
> you trip across the above.

Like this?

Allocating XArrays
------------------

If you use DEFINE_XARRAY_ALLOC() to define the XArray, or
initialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
the XArray changes to track whether entries are in use or not.

You can call xa_alloc() to store the entry at an unused index
in the XArray.  If you need to modify the array from interrupt context,
you can use xa_alloc_bh() or xa_alloc_irq() to disable
interrupts while allocating the ID.


> It's not entirely obvious, and took me the better half of a day to figure
> out.

Really?  You get a nice WARN_ON and backtrace so you can see where
you went wrong ... What could I change to make this easier?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29 12:50                 ` Matthew Wilcox
@ 2020-05-29 13:17                   ` Hannes Reinecke
  0 siblings, 0 replies; 25+ messages in thread
From: Hannes Reinecke @ 2020-05-29 13:17 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Douglas Gilbert, Martin K. Petersen, Christoph Hellwig,
	Daniel Wagner, Johannes Thumshirn, James Bottomley, linux-scsi

On 5/29/20 2:50 PM, Matthew Wilcox wrote:
> On Fri, May 29, 2020 at 02:46:48PM +0200, Hannes Reinecke wrote:
>> On 5/29/20 1:21 PM, Matthew Wilcox wrote:
>>> On Fri, May 29, 2020 at 08:50:21AM +0200, Hannes Reinecke wrote:
>>>>> I meant just use xa_alloc() for everything instead of using xa_insert for
>>>>> 0-255.
>>>>>
>>>> But then I'll have to use xa_find() to get to the actual element as the 1:1
>>>> mapping between SCSI LUN and array index is lost.
>>>> And seeing that most storage arrays will expose only up to 256 LUNs I
>>>> thought this was a good improvement on lookup.
>>>> Of course, this only makes sense if xa_load() is more efficient than
>>>> xa_find(). If not then of course it's a bit futile.
>>>
>>> xa_load() is absolutely more efficient than xa_find().  It's just a
>>> question of whether it matters ;-)  Carry on ...
>>>
>> Thanks. I will.
>>
>> BTW, care to update the documentation?
>>
>>   * Return: 0 on success, -ENOMEM if memory could not be allocated or
>>   * -EBUSY if there are no free entries in @limit.
>>   */
>> int __xa_alloc(struct xarray *xa, u32 *id, void *entry,
>> 		struct xa_limit limit, gfp_t gfp)
>> {
>> 	XA_STATE(xas, xa, 0);
>>
>> 	if (WARN_ON_ONCE(xa_is_advanced(entry)))
>> 		return -EINVAL;
>> 	if (WARN_ON_ONCE(!xa_track_free(xa)))
>> 		return -EINVAL;
>>
>> So looks as if the documentation is in need of updating to cover -EINVAL.
>> And, please, state somewhere that whenever you want to use xa_alloc() you
>> _need_ to specify XA_ALLOC_FLAGS in xa_init_flags(), otherwise
>> you trip across the above.
> 
> Like this?
> 
> Allocating XArrays
> ------------------
> 
> If you use DEFINE_XARRAY_ALLOC() to define the XArray, or
> initialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
> the XArray changes to track whether entries are in use or not.
> 
> You can call xa_alloc() to store the entry at an unused index
> in the XArray.  If you need to modify the array from interrupt context,
> you can use xa_alloc_bh() or xa_alloc_irq() to disable
> interrupts while allocating the ID.
> 
> 
What I'm missing is the connection between the first paragraph and the 
second.
It starts with 'If you use ...', making no indication what would happen 
if you don't.
And the second paragraph just says '... to store the entry ...'; is 
never said anything about tracking entries.

Why not simply append this sentenct to the second paragraph:

In order to use xa_alloc() you need to pass the XA_FLAGS_ALLOC when
calling xa_init_flags()l

>> It's not entirely obvious, and took me the better half of a day to figure
>> out.
> 
> Really?  You get a nice WARN_ON and backtrace so you can see where
> you went wrong ... What could I change to make this easier?
> 
It _could_ have said 'You forgot to pass XA_ALLOC_FLAGS in xa_init_flags()'.
Then it would've been immediately obvious without having to delve into 
xarray code.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] scsi: move target device list to xarray
  2020-05-29 12:46               ` Hannes Reinecke
  2020-05-29 12:50                 ` Matthew Wilcox
@ 2020-05-29 16:24                 ` Douglas Gilbert
  1 sibling, 0 replies; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-29 16:24 UTC (permalink / raw)
  To: Hannes Reinecke, Matthew Wilcox
  Cc: Martin K. Petersen, Christoph Hellwig, Daniel Wagner,
	Johannes Thumshirn, James Bottomley, linux-scsi

On 2020-05-29 8:46 a.m., Hannes Reinecke wrote:
> On 5/29/20 1:21 PM, Matthew Wilcox wrote:
>> On Fri, May 29, 2020 at 08:50:21AM +0200, Hannes Reinecke wrote:
>>>> I meant just use xa_alloc() for everything instead of using xa_insert for
>>>> 0-255.
>>>>
>>> But then I'll have to use xa_find() to get to the actual element as the 1:1
>>> mapping between SCSI LUN and array index is lost.
>>> And seeing that most storage arrays will expose only up to 256 LUNs I
>>> thought this was a good improvement on lookup.
>>> Of course, this only makes sense if xa_load() is more efficient than
>>> xa_find(). If not then of course it's a bit futile.
>>
>> xa_load() is absolutely more efficient than xa_find().  It's just a
>> question of whether it matters ;-)  Carry on ...
>>
> Thanks. I will.
> 
> BTW, care to update the documentation?
> 
>   * Return: 0 on success, -ENOMEM if memory could not be allocated or
>   * -EBUSY if there are no free entries in @limit.
>   */
> int __xa_alloc(struct xarray *xa, u32 *id, void *entry,
>          struct xa_limit limit, gfp_t gfp)
> {
>      XA_STATE(xas, xa, 0);
> 
>      if (WARN_ON_ONCE(xa_is_advanced(entry)))
>          return -EINVAL;
>      if (WARN_ON_ONCE(!xa_track_free(xa)))
>          return -EINVAL;
> 
> So looks as if the documentation is in need of updating to cover -EINVAL.
> And, please, state somewhere that whenever you want to use xa_alloc() you _need_ 
> to specify XA_ALLOC_FLAGS in xa_init_flags(), otherwise
> you trip across the above.
> 
> It's not entirely obvious, and took me the better half of a day to figure out.

Ditto for me. At the time I did relay my frustration to Matthew who did
address it in the documentation. Another one is that xa_find*() requires the
XA_PRESENT mark, even when you are not using marks (or haven't yet learnt
about them ...). A clearer demarcation of the two xarray modes (i.e. either
the user supplies the index, or xarray does) would be helpful. That mode
impacts which parts of the xarray interface are used, for example in the sg
driver rewrite, xarrays are used for all collections but if memory serves,
there isn't a single xa_load() or xa_store() call.

But writing technical documentation is difficult. Very few third parties step
up to help, leaving the designer/implementer to do it. And it is extremely
difficult for someone who knows the code backwards (and where the skeletons are
buried) to stand on their heads and present the information in a pedagogic
manner.

Also traditional code documentation uses 7 bit ASCII text and "ACSII art" in
a sort of nod to earlier generations of programmers. Surely more modern
techniques, including colour diagrams and even animations, should now be
encouraged. Maybe when compilers start accepting html :-)

Doug Gilbert


P.S. I have tried to practice what I preach:
                http://sg.danny.cz/sg/sg_v40.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-05-29 16:24 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-28 16:36 [PATCHv3 0/4] scsi: use xarray for devices and targets Hannes Reinecke
2020-05-28 16:36 ` [PATCH 1/4] scsi: convert target lookup to xarray Hannes Reinecke
2020-05-28 17:18   ` Douglas Gilbert
2020-05-28 19:08     ` Douglas Gilbert
2020-05-28 17:48   ` Bart Van Assche
2020-05-29  6:24     ` Hannes Reinecke
2020-05-28 16:36 ` [PATCH 2/4] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
2020-05-28 17:55   ` Bart Van Assche
2020-05-29  7:02   ` Daniel Wagner
2020-05-28 16:36 ` [PATCH 3/4] scsi: move target device list to xarray Hannes Reinecke
2020-05-28 17:50   ` Douglas Gilbert
2020-05-28 18:54     ` Matthew Wilcox
2020-05-28 19:44       ` Douglas Gilbert
2020-05-28 19:53         ` Matthew Wilcox
2020-05-29  6:45         ` Hannes Reinecke
2020-05-28 20:58       ` Hannes Reinecke
2020-05-29  0:20         ` Matthew Wilcox
2020-05-29  6:50           ` Hannes Reinecke
2020-05-29 11:21             ` Matthew Wilcox
2020-05-29 12:46               ` Hannes Reinecke
2020-05-29 12:50                 ` Matthew Wilcox
2020-05-29 13:17                   ` Hannes Reinecke
2020-05-29 16:24                 ` Douglas Gilbert
2020-05-28 16:36 ` [PATCH 4/4] scsi: remove direct device lookup per host Hannes Reinecke
2020-05-29  4:21 ` [PATCHv3 0/4] scsi: use xarray ... scsi_alloc_target() Douglas Gilbert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.