All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv4 0/6] scsi: use xarray for devices and targets
@ 2020-06-02 11:33 Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 1/6] scsi: convert target lookup to xarray Hannes Reinecke
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

Hi all,

based on the ideas from Doug Gilbert here's now my take on using
xarrays for devices and targets.
It revolves around two ideas:

- The scsi target 'channel' and 'id' numbers are never ever used
  to the full 32 bit range; channels are well below 10, and no
  driver is using more than 16 bits for the id. So we can reduce
  the type of 'channel' and 'id' to 16 bits, and use the 32 bit
  value 'channel << 16 | id' as the index into the target xarray.
- Nearly every target only ever uses the first two levels of the
  4-level SCSI LUN structure, which means that we can use the
  linearized SCSI LUN id as an index into the xarray.
  If we ever come across targets utilizing more that 2 levels of
  the LUN structure we'll allocate the first unused index and have
  to resort to a less efficient lookup instead of direct indexing.

With these changes we can implement an efficient lookup mechanism,
devolving into direct lookup for most cases. It also allows us to
detect duplicate entries or accidental overwrites of existing elements
by using xa_cmpxchg().
And iteration over targets and devices should be as efficient as the
current, list-based, approach.

As usual, comments and reviews are welcome.

Changes to v2:
- Implement safe device iteration as noted by Doug
- Add an additional patch to avoid a pointless memory allocation
  in scsi_alloc_target()

Hannes Reinecke (6):
  scsi: convert target lookup to xarray
  target_core_pscsi: use __scsi_device_lookup()
  scsi: move target device list to xarray
  scsi: remove direct device lookup per host
  scsi_error: use xarray lookup instead of wrappers
  scsi: avoid pointless memory allocation in scsi_alloc_target()

 drivers/scsi/hosts.c               |   5 +-
 drivers/scsi/scsi.c                | 151 +++++++++++++++++++++++++++++--------
 drivers/scsi/scsi_error.c          |  35 +++++----
 drivers/scsi/scsi_lib.c            |   9 +--
 drivers/scsi/scsi_priv.h           |   2 +
 drivers/scsi/scsi_scan.c           | 101 +++++++++++++++----------
 drivers/scsi/scsi_sysfs.c          |  72 +++++++++++++-----
 drivers/target/target_core_pscsi.c |   8 +-
 include/scsi/scsi_device.h         |  32 +++++---
 include/scsi/scsi_host.h           |   5 +-
 10 files changed, 288 insertions(+), 132 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/6] scsi: convert target lookup to xarray
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 2/6] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

Use an xarray instead of lists for holding the scsi targets.
I've also shortened the 'channel' and 'id' values to 16 bit
as none of the drivers requires a full 32bit range for either
of them, and by shortening them we can use them as the index
into the xarray for storing the scsi_target pointer.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c       |  4 +++-
 drivers/scsi/scsi.c        | 32 ++++++++++++++++++------------
 drivers/scsi/scsi_scan.c   | 49 +++++++++++++++++++---------------------------
 drivers/scsi/scsi_sysfs.c  | 16 ++++++++++-----
 include/scsi/scsi_device.h | 15 +++++++++-----
 include/scsi/scsi_host.h   |  2 +-
 6 files changed, 65 insertions(+), 53 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index 7ec91c3a66ca..be599c855701 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -189,6 +189,8 @@ void scsi_remove_host(struct Scsi_Host *shost)
 	transport_unregister_device(&shost->shost_gendev);
 	device_unregister(&shost->shost_dev);
 	device_del(&shost->shost_gendev);
+	WARN_ON(!xa_empty(&shost->__targets));
+	xa_destroy(&shost->__targets);
 }
 EXPORT_SYMBOL(scsi_remove_host);
 
@@ -383,7 +385,7 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	spin_lock_init(shost->host_lock);
 	shost->shost_state = SHOST_CREATED;
 	INIT_LIST_HEAD(&shost->__devices);
-	INIT_LIST_HEAD(&shost->__targets);
+	xa_init_flags(&shost->__targets, XA_FLAGS_LOCK_IRQ);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
 	init_waitqueue_head(&shost->host_wait);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 56c24a73e0c7..d601424e32b2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -575,6 +575,19 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL(__scsi_iterate_devices);
 
+/**
+ * __scsi_target_lookup  -  find a target based on channel and target id
+ * @shost:	SCSI host pointer
+ * @channel:	channel number of the target
+ * @id:		ID of the target
+ *
+ */
+static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
+					 u16 channel, u16 id)
+{
+	return xa_load(&shost->__targets, (channel << 16) | id);
+}
+
 /**
  * starget_for_each_device  -  helper to walk all devices of a target
  * @starget:	target whose devices we want to iterate over.
@@ -701,19 +714,14 @@ EXPORT_SYMBOL(scsi_device_lookup_by_target);
  * really want to use scsi_device_lookup instead.
  **/
 struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost,
-		uint channel, uint id, u64 lun)
+		u16 channel, u16 id, u64 lun)
 {
-	struct scsi_device *sdev;
+	struct scsi_target *starget;
 
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
-		if (sdev->sdev_state == SDEV_DEL)
-			continue;
-		if (sdev->channel == channel && sdev->id == id &&
-				sdev->lun ==lun)
-			return sdev;
-	}
-
-	return NULL;
+	starget = __scsi_target_lookup(shost, channel, id);
+	if (!starget)
+		return NULL;
+	return __scsi_device_lookup_by_target(starget, lun);
 }
 EXPORT_SYMBOL(__scsi_device_lookup);
 
@@ -729,7 +737,7 @@ EXPORT_SYMBOL(__scsi_device_lookup);
  * needs to be released with scsi_device_put once you're done with it.
  **/
 struct scsi_device *scsi_device_lookup(struct Scsi_Host *shost,
-		uint channel, uint id, u64 lun)
+		u16 channel, u16 id, u64 lun)
 {
 	struct scsi_device *sdev;
 	unsigned long flags;
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2437a7570ce..9fc35b6cbc30 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -309,6 +309,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
 	struct device *dev = &starget->dev;
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
 	unsigned long flags;
+	unsigned long tid = scsi_target_index(starget);
 
 	BUG_ON(starget->state == STARGET_DEL);
 	starget->state = STARGET_DEL;
@@ -316,7 +317,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (shost->hostt->target_destroy)
 		shost->hostt->target_destroy(starget);
-	list_del_init(&starget->siblings);
+	xa_erase(&shost->__targets, tid);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	put_device(dev);
 }
@@ -341,27 +342,6 @@ int scsi_is_target_device(const struct device *dev)
 }
 EXPORT_SYMBOL(scsi_is_target_device);
 
-static struct scsi_target *__scsi_find_target(struct device *parent,
-					      int channel, uint id)
-{
-	struct scsi_target *starget, *found_starget = NULL;
-	struct Scsi_Host *shost = dev_to_shost(parent);
-	/*
-	 * Search for an existing target for this sdev.
-	 */
-	list_for_each_entry(starget, &shost->__targets, siblings) {
-		if (starget->id == id &&
-		    starget->channel == channel) {
-			found_starget = starget;
-			break;
-		}
-	}
-	if (found_starget)
-		get_device(&found_starget->dev);
-
-	return found_starget;
-}
-
 /**
  * scsi_target_reap_ref_release - remove target from visibility
  * @kref: the reap_ref in the target being released
@@ -417,6 +397,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	struct scsi_target *starget;
 	struct scsi_target *found_target;
 	int error, ref_got;
+	unsigned long tid;
 
 	starget = kzalloc(size, GFP_KERNEL);
 	if (!starget) {
@@ -433,27 +414,37 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->id = id;
 	starget->channel = channel;
 	starget->can_queue = 0;
-	INIT_LIST_HEAD(&starget->siblings);
 	INIT_LIST_HEAD(&starget->devices);
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
+	tid = scsi_target_index(starget);
  retry:
 	spin_lock_irqsave(shost->host_lock, flags);
-
-	found_target = __scsi_find_target(parent, channel, id);
-	if (found_target)
+	found_target = xa_load(&shost->__targets, tid);
+	if (found_target) {
+		get_device(&found_target->dev);
 		goto found;
-
-	list_add_tail(&starget->siblings, &shost->__targets);
+	}
+	error = xa_insert(&shost->__targets, tid, starget, GFP_ATOMIC);
 	spin_unlock_irqrestore(shost->host_lock, flags);
+	if (error) {
+		dev_printk(KERN_ERR, dev,
+			     "target %u:%u index allocation failed, error %d\n",
+			     channel, id, error);
+		put_device(dev);
+		kfree(starget);
+		return NULL;
+	}
 	/* allocate and add */
 	transport_setup_device(dev);
 	if (shost->hostt->target_alloc) {
 		error = shost->hostt->target_alloc(starget);
 
 		if(error) {
-			dev_printk(KERN_ERR, dev, "target allocation failed, error %d\n", error);
+			dev_printk(KERN_ERR, dev,
+				   "target %u:%u allocation failed, error %d\n",
+				   channel, id, error);
 			/* don't want scsi_target_reap to do the final
 			 * put because it will be under the host lock */
 			scsi_target_destroy(starget);
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 163dbcb741c1..83bdcb032853 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1511,16 +1511,21 @@ static void __scsi_remove_target(struct scsi_target *starget)
 void scsi_remove_target(struct device *dev)
 {
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
-	struct scsi_target *starget;
+	struct scsi_target *starget, *starget_next;
+	unsigned long tid = 0;
 	unsigned long flags;
 
-restart:
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_for_each_entry(starget, &shost->__targets, siblings) {
+	starget = xa_find(&shost->__targets, &tid, UINT_MAX, XA_PRESENT);
+	while (starget) {
+		starget_next = xa_find_after(&shost->__targets, &tid,
+					     UINT_MAX, XA_PRESENT);
 		if (starget->state == STARGET_DEL ||
 		    starget->state == STARGET_REMOVE ||
-		    starget->state == STARGET_CREATED_REMOVE)
+		    starget->state == STARGET_CREATED_REMOVE) {
+			starget = starget_next;
 			continue;
+		}
 		if (starget->dev.parent == dev || &starget->dev == dev) {
 			kref_get(&starget->reap_ref);
 			if (starget->state == STARGET_CREATED)
@@ -1530,8 +1535,9 @@ void scsi_remove_target(struct device *dev)
 			spin_unlock_irqrestore(shost->host_lock, flags);
 			__scsi_remove_target(starget);
 			scsi_target_reap(starget);
-			goto restart;
+			spin_lock_irqsave(shost->host_lock, flags);
 		}
+		starget = starget_next;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index c3cba2aaf934..8cb31bbd8a82 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -121,7 +121,7 @@ struct scsi_device {
 
 	unsigned long last_queue_ramp_up;	/* last queue ramp up time */
 
-	unsigned int id, channel;
+	u16 id, channel;
 	u64 lun;
 	unsigned int manufacturer;	/* Manufacturer of device, for using 
 					 * vendor-specific cmd's */
@@ -288,8 +288,8 @@ struct scsi_target {
 	struct list_head	devices;
 	struct device		dev;
 	struct kref		reap_ref; /* last put renders target invisible */
-	unsigned int		channel;
-	unsigned int		id; /* target id ... replace
+	u16			channel;
+	u16			id; /* target id ... replace
 				     * scsi_device.id eventually */
 	unsigned int		create:1; /* signal that it needs to be added */
 	unsigned int		single_lun:1;	/* Indicates we should only
@@ -321,6 +321,11 @@ struct scsi_target {
 	/* starget_data must be the last element!!!! */
 } __attribute__((aligned(sizeof(unsigned long))));
 
+static inline u32 scsi_target_index(struct scsi_target *starget)
+{
+	return (starget->channel << 16) | (starget->id);
+}
+
 #define to_scsi_target(d)	container_of(d, struct scsi_target, dev)
 static inline struct scsi_target *scsi_target(struct scsi_device *sdev)
 {
@@ -345,9 +350,9 @@ extern struct scsi_device *scsi_device_from_queue(struct request_queue *q);
 extern int __must_check scsi_device_get(struct scsi_device *);
 extern void scsi_device_put(struct scsi_device *);
 extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *,
-					      uint, uint, u64);
+					      u16, u16, u64);
 extern struct scsi_device *__scsi_device_lookup(struct Scsi_Host *,
-						uint, uint, u64);
+						u16, u16, u64);
 extern struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *,
 							u64);
 extern struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *,
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 822e8cda8d9b..b9395676c75b 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -521,7 +521,7 @@ struct Scsi_Host {
 	 * access this list directly from a driver.
 	 */
 	struct list_head	__devices;
-	struct list_head	__targets;
+	struct xarray		__targets;
 	
 	struct list_head	starved_list;
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/6] target_core_pscsi: use __scsi_device_lookup()
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 1/6] scsi: convert target lookup to xarray Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 3/6] scsi: move target device list to xarray Hannes Reinecke
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

Instead of walking the list of devices manually use the helper
function to return the device directly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart van Assche <bvanassche@acm.org>
---
 drivers/target/target_core_pscsi.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/target/target_core_pscsi.c b/drivers/target/target_core_pscsi.c
index 4e37fa9b409d..38799e47b590 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -496,11 +496,9 @@ static int pscsi_configure_device(struct se_device *dev)
 	}
 
 	spin_lock_irq(sh->host_lock);
-	list_for_each_entry(sd, &sh->__devices, siblings) {
-		if ((pdv->pdv_channel_id != sd->channel) ||
-		    (pdv->pdv_target_id != sd->id) ||
-		    (pdv->pdv_lun_id != sd->lun))
-			continue;
+	sd = __scsi_device_lookup(sh, pdv->pdv_channel_id,
+				  pdv->pdv_target_id, pdv->pdv_lun_id);
+	if (sd) {
 		/*
 		 * Functions will release the held struct scsi_host->host_lock
 		 * before calling calling pscsi_add_device_to_list() to register
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/6] scsi: move target device list to xarray
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 1/6] scsi: convert target lookup to xarray Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 2/6] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 4/6] scsi: remove direct device lookup per host Hannes Reinecke
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

Use xarray for device lookup by target. If the LUN only uses
the first two levels (ie 32 bit) it can be used directly
as the index into the xarray.
LUNs using more than two levels will overflow the xarray index;
they will be stored in the next free index in the xarray.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 drivers/scsi/scsi.c        | 35 +++++++++++++++++++++--------------
 drivers/scsi/scsi_lib.c    |  9 ++++-----
 drivers/scsi/scsi_scan.c   |  5 +++--
 drivers/scsi/scsi_sysfs.c  | 36 +++++++++++++++++++++++++++++++++---
 include/scsi/scsi_device.h |  4 ++--
 5 files changed, 63 insertions(+), 26 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index d601424e32b2..a3e01708744f 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -601,13 +601,14 @@ static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
 void starget_for_each_device(struct scsi_target *starget, void *data,
 		     void (*fn)(struct scsi_device *, void *))
 {
-	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct scsi_device *sdev;
+	unsigned long lun_idx = 0;
 
-	shost_for_each_device(sdev, shost) {
-		if ((sdev->channel == starget->channel) &&
-		    (sdev->id == starget->id))
-			fn(sdev, data);
+	xa_for_each(&starget->__devices, lun_idx, sdev) {
+		if (scsi_device_get(sdev))
+			continue;
+		fn(sdev, data);
+		scsi_device_put(sdev);
 	}
 }
 EXPORT_SYMBOL(starget_for_each_device);
@@ -629,14 +630,11 @@ EXPORT_SYMBOL(starget_for_each_device);
 void __starget_for_each_device(struct scsi_target *starget, void *data,
 			       void (*fn)(struct scsi_device *, void *))
 {
-	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct scsi_device *sdev;
+	unsigned long lun_idx = 0;
 
-	__shost_for_each_device(sdev, shost) {
-		if ((sdev->channel == starget->channel) &&
-		    (sdev->id == starget->id))
-			fn(sdev, data);
-	}
+	xa_for_each(&starget->__devices, lun_idx, sdev)
+		fn(sdev, data);
 }
 EXPORT_SYMBOL(__starget_for_each_device);
 
@@ -658,12 +656,21 @@ EXPORT_SYMBOL(__starget_for_each_device);
 struct scsi_device *__scsi_device_lookup_by_target(struct scsi_target *starget,
 						   u64 lun)
 {
-	struct scsi_device *sdev;
+	struct scsi_device *sdev = NULL;
+	unsigned long lun_idx = 0;
+
+	if (lun < UINT_MAX)
+		sdev = xa_load(&starget->__devices, lun);
+	if (sdev && sdev->lun == lun) {
+		if (sdev->sdev_state == SDEV_DEL)
+			sdev = NULL;
+		return sdev;
+	}
 
-	list_for_each_entry(sdev, &starget->devices, same_target_siblings) {
+	xa_for_each(&starget->__devices, lun_idx, sdev) {
 		if (sdev->sdev_state == SDEV_DEL)
 			continue;
-		if (sdev->lun ==lun)
+		if (sdev->lun == lun)
 			return sdev;
 	}
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c163fa22267c..9337422d864a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -361,9 +361,9 @@ static void scsi_kick_queue(struct request_queue *q)
 static void scsi_single_lun_run(struct scsi_device *current_sdev)
 {
 	struct Scsi_Host *shost = current_sdev->host;
-	struct scsi_device *sdev, *tmp;
+	struct scsi_device *sdev;
 	struct scsi_target *starget = scsi_target(current_sdev);
-	unsigned long flags;
+	unsigned long flags, lun_idx = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	starget->starget_sdev_user = NULL;
@@ -380,8 +380,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (starget->starget_sdev_user)
 		goto out;
-	list_for_each_entry_safe(sdev, tmp, &starget->devices,
-			same_target_siblings) {
+	xa_for_each(&starget->__devices, lun_idx, sdev) {
 		if (sdev == current_sdev)
 			continue;
 		if (scsi_device_get(sdev))
@@ -390,7 +389,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		scsi_kick_queue(sdev->request_queue);
 		spin_lock_irqsave(shost->host_lock, flags);
-	
+
 		scsi_device_put(sdev);
 	}
  out:
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 9fc35b6cbc30..51faeb0dd742 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -235,7 +235,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	mutex_init(&sdev->state_mutex);
 	sdev->sdev_state = SDEV_CREATED;
 	INIT_LIST_HEAD(&sdev->siblings);
-	INIT_LIST_HEAD(&sdev->same_target_siblings);
 	INIT_LIST_HEAD(&sdev->starved_entry);
 	INIT_LIST_HEAD(&sdev->event_list);
 	spin_lock_init(&sdev->list_lock);
@@ -327,6 +326,8 @@ static void scsi_target_dev_release(struct device *dev)
 	struct device *parent = dev->parent;
 	struct scsi_target *starget = to_scsi_target(dev);
 
+	WARN_ON(!xa_empty(&starget->__devices));
+	xa_destroy(&starget->__devices);
 	kfree(starget);
 	put_device(parent);
 }
@@ -414,7 +415,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->id = id;
 	starget->channel = channel;
 	starget->can_queue = 0;
-	INIT_LIST_HEAD(&starget->devices);
+	xa_init_flags(&starget->__devices, XA_FLAGS_ALLOC);
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 83bdcb032853..5b009ec97250 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -433,7 +433,8 @@ static void scsi_device_cls_release(struct device *class_dev)
 
 static void scsi_device_dev_release_usercontext(struct work_struct *work)
 {
-	struct scsi_device *sdev;
+	struct scsi_device *sdev, *tmp_sdev;
+	struct scsi_target *starget;
 	struct device *parent;
 	struct list_head *this, *tmp;
 	struct scsi_vpd *vpd_pg80 = NULL, *vpd_pg83 = NULL;
@@ -441,6 +442,7 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	unsigned long flags;
 
 	sdev = container_of(work, struct scsi_device, ew.work);
+	starget = sdev->sdev_target;
 
 	scsi_dh_release_device(sdev);
 
@@ -448,7 +450,12 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
 	list_del(&sdev->siblings);
-	list_del(&sdev->same_target_siblings);
+	/* Use cmpxchg to avoid accidental deletion */
+	tmp_sdev = xa_cmpxchg(&starget->__devices, sdev->lun_idx, sdev,
+			      NULL, GFP_ATOMIC);
+	if (tmp_sdev != sdev)
+		sdev_printk(KERN_WARNING, sdev,
+			    "index %u reassigned!\n", sdev->lun_idx);
 	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
@@ -1593,6 +1600,7 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 	unsigned long flags;
 	struct Scsi_Host *shost = sdev->host;
 	struct scsi_target  *starget = sdev->sdev_target;
+	int ret = -ERANGE;
 
 	device_initialize(&sdev->sdev_gendev);
 	sdev->sdev_gendev.bus = &scsi_bus_type;
@@ -1620,7 +1628,29 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 
 	transport_setup_device(&sdev->sdev_gendev);
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_add_tail(&sdev->same_target_siblings, &starget->devices);
+	if (sdev->lun < UINT_MAX) {
+		sdev->lun_idx = sdev->lun;
+		ret = xa_insert(&starget->__devices, (unsigned int)sdev->lun,
+				sdev, GFP_ATOMIC);
+	}
+	/*
+	 * Use a free index if the LUN is larger than 32 bit or
+	 * if the index is already taken. Leave the lower 256
+	 * entries free to avoid collisions.
+	 */
+	if (ret) {
+		struct xa_limit scsi_lun_limit = {
+			.min = 256,
+			.max = UINT_MAX,
+		};
+		ret = xa_alloc(&starget->__devices, &sdev->lun_idx,
+			       sdev, scsi_lun_limit, GFP_ATOMIC);
+	}
+	if (ret) {
+		shost_printk(KERN_ERR, shost,
+			     "LUN index %u:%u:%llu allocation error %d\n",
+			     sdev_channel(sdev), sdev_id(sdev), sdev->lun, ret);
+	}
 	list_add_tail(&sdev->siblings, &shost->__devices);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/*
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 8cb31bbd8a82..56801661f481 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -104,7 +104,6 @@ struct scsi_device {
 
 	/* the next two are protected by the host->host_lock */
 	struct list_head    siblings;   /* list of all devices on this host */
-	struct list_head    same_target_siblings; /* just the devices sharing same target id */
 
 	atomic_t device_busy;		/* commands actually active on LLDD */
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
@@ -123,6 +122,7 @@ struct scsi_device {
 
 	u16 id, channel;
 	u64 lun;
+	unsigned int lun_idx;		/* Index into target device xarray */
 	unsigned int manufacturer;	/* Manufacturer of device, for using 
 					 * vendor-specific cmd's */
 	unsigned sector_size;	/* size in bytes */
@@ -285,7 +285,7 @@ enum scsi_target_state {
 struct scsi_target {
 	struct scsi_device	*starget_sdev_user;
 	struct list_head	siblings;
-	struct list_head	devices;
+	struct xarray		__devices;
 	struct device		dev;
 	struct kref		reap_ref; /* last put renders target invisible */
 	u16			channel;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/6] scsi: remove direct device lookup per host
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (2 preceding siblings ...)
  2020-06-02 11:33 ` [PATCH 3/6] scsi: move target device list to xarray Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 5/6] scsi_error: use xarray lookup instead of wrappers Hannes Reinecke
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

Drop the per-host device list for direct lookup and iterate
over the targets and devices xarrays instead.
As both are now using xarrays the lookup is more efficient
as we can use the provided indices based on the HCTL id
to do a direct lookup instead of traversing lists.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/hosts.c       |  1 -
 drivers/scsi/scsi.c        | 84 ++++++++++++++++++++++++++++++++++++++++++----
 drivers/scsi/scsi_scan.c   | 30 ++++++++++++-----
 drivers/scsi/scsi_sysfs.c  | 20 +++++------
 include/scsi/scsi_device.h | 13 ++++---
 include/scsi/scsi_host.h   |  3 +-
 6 files changed, 117 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index be599c855701..aebef37684e8 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -384,7 +384,6 @@ struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
 	shost->host_lock = &shost->default_lock;
 	spin_lock_init(shost->host_lock);
 	shost->shost_state = SHOST_CREATED;
-	INIT_LIST_HEAD(&shost->__devices);
 	xa_init_flags(&shost->__targets, XA_FLAGS_LOCK_IRQ);
 	INIT_LIST_HEAD(&shost->eh_cmd_q);
 	INIT_LIST_HEAD(&shost->starved_list);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index a3e01708744f..6a1d8c6bd8f9 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -554,18 +554,48 @@ EXPORT_SYMBOL(scsi_device_put);
 struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 					   struct scsi_device *prev)
 {
-	struct list_head *list = (prev ? &prev->siblings : &shost->__devices);
+	struct scsi_target *starget;
 	struct scsi_device *next = NULL;
 	unsigned long flags;
+	unsigned long tid = 0, lun_idx = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
-	while (list->next != &shost->__devices) {
-		next = list_entry(list->next, struct scsi_device, siblings);
-		/* skip devices that we can't get a reference to */
-		if (!scsi_device_get(next))
+	if (!prev) {
+		starget = xa_find(&shost->__targets, &tid,
+				       UINT_MAX, XA_PRESENT);
+		if (starget) {
+			next = xa_find(&starget->__devices, &lun_idx,
+				       UINT_MAX, XA_PRESENT);
+			if (!scsi_device_get(next)) {
+				spin_unlock_irqrestore(shost->host_lock, flags);
+				return next;
+			}
+		}
+	} else {
+		starget = prev->sdev_target;
+		tid = (starget->channel << 16) | starget->id;
+		lun_idx = prev->lun_idx;
+	}
+	while (starget) {
+		next = xa_find_after(&starget->__devices, &lun_idx,
+				     UINT_MAX, XA_PRESENT);
+		if (next) {
+			if (!scsi_device_get(next))
+				break;
+			continue;
+		}
+		/* No more LUNs on this target, switch to the next */
+		starget = xa_find_after(&shost->__targets, &tid,
+					UINT_MAX, XA_PRESENT);
+		if (!starget) {
+			next = NULL;
+			break;
+		}
+		lun_idx = 0;
+		next = xa_find(&starget->__devices, &lun_idx,
+			       UINT_MAX, XA_PRESENT);
+		if (next && !scsi_device_get(next))
 			break;
-		next = NULL;
-		list = list->next;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
@@ -575,6 +605,46 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
 }
 EXPORT_SYMBOL(__scsi_iterate_devices);
 
+/* helper for __shost_for_each_device, see that for documentation */
+struct scsi_device *__scsi_iterate_devices_unlocked(struct Scsi_Host *shost,
+						    struct scsi_device *prev)
+{
+	struct scsi_target *starget;
+	struct scsi_device *next = NULL;
+	unsigned long tid = 0, lun_idx = 0;
+
+	if (!prev) {
+		starget = xa_find(&shost->__targets, &tid,
+				       UINT_MAX, XA_PRESENT);
+		if (starget)
+			return xa_find(&starget->__devices, &lun_idx,
+				       UINT_MAX, XA_PRESENT);
+	} else {
+		starget = prev->sdev_target;
+		tid = scsi_target_index(starget);
+		lun_idx = prev->lun_idx;
+	}
+	while (starget) {
+		next = xa_find_after(&starget->__devices, &lun_idx,
+				     UINT_MAX, XA_PRESENT);
+		if (next)
+			return next;
+		/* No more LUNs on this target, switch to the next */
+		starget = xa_find_after(&shost->__targets, &tid,
+					UINT_MAX, XA_PRESENT);
+		if (!starget)
+			return NULL;
+
+		lun_idx = 0;
+		next = xa_find(&starget->__devices, &lun_idx,
+			       UINT_MAX, XA_PRESENT);
+		if (next)
+			return next;
+	}
+	return NULL;
+}
+EXPORT_SYMBOL(__scsi_iterate_devices_unlocked);
+
 /**
  * __scsi_target_lookup  -  find a target based on channel and target id
  * @shost:	SCSI host pointer
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 51faeb0dd742..8afb71c036d5 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -234,7 +234,6 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 	sdev->channel = starget->channel;
 	mutex_init(&sdev->state_mutex);
 	sdev->sdev_state = SDEV_CREATED;
-	INIT_LIST_HEAD(&sdev->siblings);
 	INIT_LIST_HEAD(&sdev->starved_entry);
 	INIT_LIST_HEAD(&sdev->event_list);
 	spin_lock_init(&sdev->list_lock);
@@ -1851,17 +1850,30 @@ EXPORT_SYMBOL(scsi_scan_host);
 
 void scsi_forget_host(struct Scsi_Host *shost)
 {
-	struct scsi_device *sdev;
+	struct scsi_target *starget, *starget_next;
+	struct scsi_device *sdev, *sdev_next;
 	unsigned long flags;
+	unsigned long tid = 0;
 
- restart:
 	spin_lock_irqsave(shost->host_lock, flags);
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
-		if (sdev->sdev_state == SDEV_DEL)
-			continue;
-		spin_unlock_irqrestore(shost->host_lock, flags);
-		__scsi_remove_device(sdev);
-		goto restart;
+	starget = xa_find(&shost->__targets, &tid, UINT_MAX, XA_PRESENT);
+	while (starget) {
+		unsigned long lun_idx = 0;
+
+		starget_next = xa_find_after(&shost->__targets, &tid, UINT_MAX, XA_PRESENT);
+
+		sdev = xa_find(&starget->__devices, &lun_idx, UINT_MAX, XA_PRESENT);
+		while (sdev) {
+			sdev_next = xa_find_after(&starget->__devices, &lun_idx, UINT_MAX, XA_PRESENT);
+
+			if (sdev->sdev_state != SDEV_DEL) {
+				spin_unlock_irqrestore(shost->host_lock, flags);
+				__scsi_remove_device(sdev);
+				spin_lock_irqsave(shost->host_lock, flags);
+			}
+			sdev = sdev_next;
+		}
+		starget = starget_next;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 5b009ec97250..1abedbcf0521 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -449,7 +449,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	parent = sdev->sdev_gendev.parent;
 
 	spin_lock_irqsave(sdev->host->host_lock, flags);
-	list_del(&sdev->siblings);
 	/* Use cmpxchg to avoid accidental deletion */
 	tmp_sdev = xa_cmpxchg(&starget->__devices, sdev->lun_idx, sdev,
 			      NULL, GFP_ATOMIC);
@@ -1480,29 +1479,31 @@ static void __scsi_remove_target(struct scsi_target *starget)
 {
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	unsigned long flags;
-	struct scsi_device *sdev;
+	struct scsi_device *sdev, *sdev_next;
+	unsigned long lun_idx = 0;
 
 	spin_lock_irqsave(shost->host_lock, flags);
- restart:
-	list_for_each_entry(sdev, &shost->__devices, siblings) {
+	sdev = xa_find(&starget->__devices, &lun_idx, UINT_MAX, XA_PRESENT);
+	while (sdev) {
+		sdev_next = xa_find_after(&starget->__devices, &lun_idx, UINT_MAX, XA_PRESENT);
+
 		/*
 		 * We cannot call scsi_device_get() here, as
 		 * we might've been called from rmmod() causing
 		 * scsi_device_get() to fail the module_is_live()
 		 * check.
 		 */
-		if (sdev->channel != starget->channel ||
-		    sdev->id != starget->id)
-			continue;
 		if (sdev->sdev_state == SDEV_DEL ||
 		    sdev->sdev_state == SDEV_CANCEL ||
-		    !get_device(&sdev->sdev_gendev))
+		    !get_device(&sdev->sdev_gendev)) {
+			sdev = sdev_next;
 			continue;
+		}
 		spin_unlock_irqrestore(shost->host_lock, flags);
 		scsi_remove_device(sdev);
 		put_device(&sdev->sdev_gendev);
 		spin_lock_irqsave(shost->host_lock, flags);
-		goto restart;
+		sdev = sdev_next;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
@@ -1651,7 +1652,6 @@ void scsi_sysfs_device_initialize(struct scsi_device *sdev)
 			     "LUN index %u:%u:%llu allocation error %d\n",
 			     sdev_channel(sdev), sdev_id(sdev), sdev->lun, ret);
 	}
-	list_add_tail(&sdev->siblings, &shost->__devices);
 	spin_unlock_irqrestore(shost->host_lock, flags);
 	/*
 	 * device can now only be removed via __scsi_remove_device() so hold
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 56801661f481..3690af6ce6af 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -102,9 +102,6 @@ struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;
 
-	/* the next two are protected by the host->host_lock */
-	struct list_head    siblings;   /* list of all devices on this host */
-
 	atomic_t device_busy;		/* commands actually active on LLDD */
 	atomic_t device_blocked;	/* Device returned QUEUE_FULL. */
 
@@ -381,6 +378,10 @@ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
 	     (sdev); \
 	     (sdev) = __scsi_iterate_devices((shost), (sdev)))
 
+/* only exposed to implement shost_for_each_device */
+struct scsi_device *__scsi_iterate_devices_unlocked(struct Scsi_Host *,
+						    struct scsi_device *);
+
 /**
  * __shost_for_each_device - iterate over all devices of a host (UNLOCKED)
  * @sdev: the &struct scsi_device to use as a cursor
@@ -394,8 +395,10 @@ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
  * device list in interrupt context.  Otherwise you really want to use
  * shost_for_each_device instead.
  */
-#define __shost_for_each_device(sdev, shost) \
-	list_for_each_entry((sdev), &((shost)->__devices), siblings)
+#define __shost_for_each_device(sdev, shost)				\
+	for((sdev) = __scsi_iterate_devices_unlocked((shost), NULL);	\
+	    (sdev);							\
+	    (sdev) = __scsi_iterate_devices_unlocked((shost),(sdev)))
 
 extern int scsi_change_queue_depth(struct scsi_device *, int);
 extern int scsi_track_queue_full(struct scsi_device *, int);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index b9395676c75b..ee0b72075e9f 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -520,9 +520,8 @@ struct Scsi_Host {
 	 * their __ prefixed variants with the lock held. NEVER
 	 * access this list directly from a driver.
 	 */
-	struct list_head	__devices;
 	struct xarray		__targets;
-	
+
 	struct list_head	starved_list;
 
 	spinlock_t		default_lock;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/6] scsi_error: use xarray lookup instead of wrappers
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (3 preceding siblings ...)
  2020-06-02 11:33 ` [PATCH 4/6] scsi: remove direct device lookup per host Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-02 11:33 ` [PATCH 6/6] scsi: avoid pointless memory allocation in scsi_alloc_target() Hannes Reinecke
  2020-06-03 12:53 ` [PATCHv4 0/6] scsi: use xarray for devices and targets Christoph Hellwig
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

For SCSI EH most shost_for_each_sdev() calls are just to filter
out devices for specific targets or channels.
These calls can be made more efficient using direct xarray
lookup and iterators.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi.c       |  2 +-
 drivers/scsi/scsi_error.c | 35 ++++++++++++++++++++++-------------
 drivers/scsi/scsi_priv.h  |  2 ++
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 6a1d8c6bd8f9..296cecd61d3a 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -652,7 +652,7 @@ EXPORT_SYMBOL(__scsi_iterate_devices_unlocked);
  * @id:		ID of the target
  *
  */
-static struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
+struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
 					 u16 channel, u16 id)
 {
 	return xa_load(&shost->__targets, (channel << 16) | id);
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 927b1e641842..02605c848365 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -643,7 +643,9 @@ EXPORT_SYMBOL_GPL(scsi_check_sense);
 static void scsi_handle_queue_ramp_up(struct scsi_device *sdev)
 {
 	struct scsi_host_template *sht = sdev->host->hostt;
+	struct scsi_target *starget;
 	struct scsi_device *tmp_sdev;
+	unsigned long lun_idx = 0;
 
 	if (!sht->track_queue_depth ||
 	    sdev->queue_depth >= sdev->max_queue_depth)
@@ -661,10 +663,9 @@ static void scsi_handle_queue_ramp_up(struct scsi_device *sdev)
 	 * Walk all devices of a target and do
 	 * ramp up on them.
 	 */
-	shost_for_each_device(tmp_sdev, sdev->host) {
-		if (tmp_sdev->channel != sdev->channel ||
-		    tmp_sdev->id != sdev->id ||
-		    tmp_sdev->queue_depth == sdev->max_queue_depth)
+	starget = __scsi_target_lookup(sdev->host, sdev->channel, sdev->id);
+	xa_for_each(&starget->__devices, lun_idx, tmp_sdev) {
+		if (tmp_sdev->queue_depth == sdev->max_queue_depth)
 			continue;
 
 		scsi_change_queue_depth(tmp_sdev, tmp_sdev->queue_depth + 1);
@@ -675,14 +676,15 @@ static void scsi_handle_queue_ramp_up(struct scsi_device *sdev)
 static void scsi_handle_queue_full(struct scsi_device *sdev)
 {
 	struct scsi_host_template *sht = sdev->host->hostt;
+	struct scsi_target *starget = sdev->sdev_target;
 	struct scsi_device *tmp_sdev;
+	unsigned long lun_idx = 0;
 
 	if (!sht->track_queue_depth)
 		return;
 
-	shost_for_each_device(tmp_sdev, sdev->host) {
-		if (tmp_sdev->channel != sdev->channel ||
-		    tmp_sdev->id != sdev->id)
+	xa_for_each(&starget->__devices, lun_idx, tmp_sdev) {
+		if (tmp_sdev->sdev_state == SDEV_DEL)
 			continue;
 		/*
 		 * We do not know the number of commands that were at
@@ -2273,10 +2275,16 @@ int scsi_error_handler(void *data)
  */
 void scsi_report_bus_reset(struct Scsi_Host *shost, int channel)
 {
+	struct scsi_target *starget;
 	struct scsi_device *sdev;
+	unsigned long tid = 0;
 
-	__shost_for_each_device(sdev, shost) {
-		if (channel == sdev_channel(sdev))
+	xa_for_each(&shost->__targets, tid, starget) {
+		unsigned long lun_idx = 0;
+
+		if (starget->channel != channel)
+			continue;
+		xa_for_each(&starget->__devices, lun_idx, sdev)
 			__scsi_report_device_reset(sdev, NULL);
 	}
 }
@@ -2306,13 +2314,14 @@ EXPORT_SYMBOL(scsi_report_bus_reset);
  */
 void scsi_report_device_reset(struct Scsi_Host *shost, int channel, int target)
 {
+	struct scsi_target *starget;
 	struct scsi_device *sdev;
+	unsigned long lun_idx = 0;
 
-	__shost_for_each_device(sdev, shost) {
-		if (channel == sdev_channel(sdev) &&
-		    target == sdev_id(sdev))
+	starget = __scsi_target_lookup(shost, channel, target);
+	if (starget)
+		xa_for_each(&starget->__devices, lun_idx, sdev)
 			__scsi_report_device_reset(sdev, NULL);
-	}
 }
 EXPORT_SYMBOL(scsi_report_device_reset);
 
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 22b6585e28b4..0a87c95359aa 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -49,6 +49,8 @@ enum scsi_devinfo_key {
 	SCSI_DEVINFO_SPI,
 };
 
+extern struct scsi_target *__scsi_target_lookup(struct Scsi_Host *shost,
+						u16 channel, u16 id);
 extern blist_flags_t scsi_get_device_flags(struct scsi_device *sdev,
 					   const unsigned char *vendor,
 					   const unsigned char *model);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/6] scsi: avoid pointless memory allocation in scsi_alloc_target()
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (4 preceding siblings ...)
  2020-06-02 11:33 ` [PATCH 5/6] scsi_error: use xarray lookup instead of wrappers Hannes Reinecke
@ 2020-06-02 11:33 ` Hannes Reinecke
  2020-06-03 12:53 ` [PATCHv4 0/6] scsi: use xarray for devices and targets Christoph Hellwig
  6 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-02 11:33 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi, Hannes Reinecke

As we already know the index of the requested target in
scsi_alloc_target() we can try to fetch it first before trying
to allocate a new one. This will save us a pointless memory
allocation in case the target is already present and uncontended.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_scan.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 8afb71c036d5..41818111808e 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -387,7 +387,7 @@ static void scsi_target_reap_ref_put(struct scsi_target *starget)
  * is responsible for both reaping and doing a last put
  */
 static struct scsi_target *scsi_alloc_target(struct device *parent,
-					     int channel, uint id)
+					     u16 channel, u16 id)
 {
 	struct Scsi_Host *shost = dev_to_shost(parent);
 	struct device *dev = NULL;
@@ -397,7 +397,23 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	struct scsi_target *starget;
 	struct scsi_target *found_target;
 	int error, ref_got;
-	unsigned long tid;
+	unsigned long tid = (channel << 16) | id;
+
+	/*
+	 * Try if the target is already present, and save us
+	 * a pointless memory allocation if so.
+	 */
+	spin_lock_irqsave(shost->host_lock, flags);
+	found_target = xa_load(&shost->__targets, tid);
+	if (found_target) {
+		get_device(&found_target->dev);
+		if (kref_get_unless_zero(&found_target->reap_ref)) {
+			spin_unlock_irqrestore(shost->host_lock, flags);
+			return found_target;
+		}
+		put_device(&found_target->dev);
+	}
+	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	starget = kzalloc(size, GFP_KERNEL);
 	if (!starget) {
@@ -418,7 +434,6 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 	starget->state = STARGET_CREATED;
 	starget->scsi_level = SCSI_2;
 	starget->max_target_blocked = SCSI_DEFAULT_TARGET_BLOCKED;
-	tid = scsi_target_index(starget);
  retry:
 	spin_lock_irqsave(shost->host_lock, flags);
 	found_target = xa_load(&shost->__targets, tid);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHv4 0/6] scsi: use xarray for devices and targets
  2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
                   ` (5 preceding siblings ...)
  2020-06-02 11:33 ` [PATCH 6/6] scsi: avoid pointless memory allocation in scsi_alloc_target() Hannes Reinecke
@ 2020-06-03 12:53 ` Christoph Hellwig
  2020-06-03 18:23   ` Douglas Gilbert
  2020-06-04 16:12   ` Hannes Reinecke
  6 siblings, 2 replies; 10+ messages in thread
From: Christoph Hellwig @ 2020-06-03 12:53 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Martin K. Petersen, Christoph Hellwig, Doug Gilbert,
	Daniel Wagner, James Bottomley, linux-scsi

On Tue, Jun 02, 2020 at 01:33:05PM +0200, Hannes Reinecke wrote:
> Hi all,
> 
> based on the ideas from Doug Gilbert here's now my take on using
> xarrays for devices and targets.
> It revolves around two ideas:
> 
> - The scsi target 'channel' and 'id' numbers are never ever used
>   to the full 32 bit range; channels are well below 10, and no
>   driver is using more than 16 bits for the id. So we can reduce
>   the type of 'channel' and 'id' to 16 bits, and use the 32 bit
>   value 'channel << 16 | id' as the index into the target xarray.
> - Nearly every target only ever uses the first two levels of the
>   4-level SCSI LUN structure, which means that we can use the
>   linearized SCSI LUN id as an index into the xarray.
>   If we ever come across targets utilizing more that 2 levels of
>   the LUN structure we'll allocate the first unused index and have
>   to resort to a less efficient lookup instead of direct indexing.
> 
> With these changes we can implement an efficient lookup mechanism,
> devolving into direct lookup for most cases. It also allows us to
> detect duplicate entries or accidental overwrites of existing elements
> by using xa_cmpxchg().
> And iteration over targets and devices should be as efficient as the
> current, list-based, approach.
> 
> As usual, comments and reviews are welcome.

I see absolutely no argument for what the point of this series.  It adds
more code, and I don't really see any indications for it fixing bugs,
speeding up workloads, or reducing memory usage.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv4 0/6] scsi: use xarray for devices and targets
  2020-06-03 12:53 ` [PATCHv4 0/6] scsi: use xarray for devices and targets Christoph Hellwig
@ 2020-06-03 18:23   ` Douglas Gilbert
  2020-06-04 16:12   ` Hannes Reinecke
  1 sibling, 0 replies; 10+ messages in thread
From: Douglas Gilbert @ 2020-06-03 18:23 UTC (permalink / raw)
  To: Christoph Hellwig, Hannes Reinecke
  Cc: Martin K. Petersen, Daniel Wagner, James Bottomley, linux-scsi

On 2020-06-03 8:53 a.m., Christoph Hellwig wrote:
> On Tue, Jun 02, 2020 at 01:33:05PM +0200, Hannes Reinecke wrote:
>> Hi all,
>>
>> based on the ideas from Doug Gilbert here's now my take on using
>> xarrays for devices and targets.
>> It revolves around two ideas:
>>
>> - The scsi target 'channel' and 'id' numbers are never ever used
>>    to the full 32 bit range; channels are well below 10, and no
>>    driver is using more than 16 bits for the id. So we can reduce
>>    the type of 'channel' and 'id' to 16 bits, and use the 32 bit
>>    value 'channel << 16 | id' as the index into the target xarray.
>> - Nearly every target only ever uses the first two levels of the
>>    4-level SCSI LUN structure, which means that we can use the
>>    linearized SCSI LUN id as an index into the xarray.
>>    If we ever come across targets utilizing more that 2 levels of
>>    the LUN structure we'll allocate the first unused index and have
>>    to resort to a less efficient lookup instead of direct indexing.
>>
>> With these changes we can implement an efficient lookup mechanism,
>> devolving into direct lookup for most cases. It also allows us to
>> detect duplicate entries or accidental overwrites of existing elements
>> by using xa_cmpxchg().
>> And iteration over targets and devices should be as efficient as the
>> current, list-based, approach.
>>
>> As usual, comments and reviews are welcome.
> 
> I see absolutely no argument for what the point of this series.  It adds
> more code, and I don't really see any indications for it fixing bugs,
> speeding up workloads, or reducing memory usage.

Lets take memory usage first. The legacy design (part of which may have
been a later add-on) has three collections where two are needed:
    1) all targets in a host
    2) all sdev_s in a target
    3) all sdev_s in a host

So the third one is redundant and now removed (together with the
complexity of making sure those 3 collections are always in sync, seen
from the users' viewpoint). Each doubly linked collection on 64
bit machines uses 16 bytes (2 eight byte pointers). So that is a
32 byte reduction in each sdev object. The proposed solution adds 0
bytes because it uses the LUN as an index which is already there.
Similar but smaller win in scsi_target objects.

There are also some locks and mutexes in the three level object
tree (host-target-sdev[LU]) that can probably be dispensed with
as xarrays come with their own locks. That has not been done yet
making both my earlier proposal and this one "overlocked". And
locks and mutexes take up space in objects and slow things down.


The speeding up will come in big machine startup and shutdown and
its reaction time to disruptions (e.g. cable disconnected to a disk
array) IMO. xarray and explicit parent pointers give us a faster
way to navigate up and down the object tree. With this patchset we
have an O(ln(n)) lookup in the downward direction where currently we
only have O(n). Very little use is made of the "lookup" functions in
the API because users could see that it was just an iteration
(i.e. O(n)). Hopefully transports will take advantage of faster
lookups and perhaps implement their own xarrays. Even the upward
navigation can be complicated by transports inserting levels between
the host and the target. This is what the SCSI mid-layer object tree
looks like moving upwards from a SAS SSD, connected to an
SAS expander, moving up to its host (a HBA):
     scsi_device, ptr=ffff99d23f513960
     scsi_target, ptr=ffff99d241595c28
     sas_rphy, ptr=ffff99d242519c00
     sas_port, ptr=ffff99d24251ec00
     sas_expander_device, ptr=ffff99d23f4c6438
     sas_port, ptr=ffff99d23f4c7400
     Scsi_Host, ptr=ffff99d2425261f8

There already is a scsi_device::host redundant pointer to bypass the
oft-called and slow-walking dev_to_shost(). I'm proposing another
redundant scsi_target::parent_shost pointer that will bypass seven
dev_to_shost() invocations.

Currently all iterations are done under the host_lock as that is
required for doubly linked list safety. xarray uses rcu read locks
on all non-modifying operations including iterations and if we can
safely rely on them, that will increase the available parallelism
within one host.

Finally the SCSI fast path will usually require the presence the
corresponding sdev object, preferably cached. So making it smaller
will help.

Doug Gilbert


P.S. I sidestepped the "bugs" issue. Surely we will add some but
it is hard to believe when you wade into the complexity of the
currently linked collections and their myriad of locks, that there
aren't subtle bugs in the existing code. I have been working with
xarrays for about 1 year and finding locking issues is easier
with xarrays compared to "roll your own" linked list locking, IMO.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCHv4 0/6] scsi: use xarray for devices and targets
  2020-06-03 12:53 ` [PATCHv4 0/6] scsi: use xarray for devices and targets Christoph Hellwig
  2020-06-03 18:23   ` Douglas Gilbert
@ 2020-06-04 16:12   ` Hannes Reinecke
  1 sibling, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2020-06-04 16:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K. Petersen, Doug Gilbert, Daniel Wagner, James Bottomley,
	linux-scsi

On 6/3/20 2:53 PM, Christoph Hellwig wrote:
> On Tue, Jun 02, 2020 at 01:33:05PM +0200, Hannes Reinecke wrote:
>> Hi all,
>>
>> based on the ideas from Doug Gilbert here's now my take on using
>> xarrays for devices and targets.
>> It revolves around two ideas:
>>
>> - The scsi target 'channel' and 'id' numbers are never ever used
>>    to the full 32 bit range; channels are well below 10, and no
>>    driver is using more than 16 bits for the id. So we can reduce
>>    the type of 'channel' and 'id' to 16 bits, and use the 32 bit
>>    value 'channel << 16 | id' as the index into the target xarray.
>> - Nearly every target only ever uses the first two levels of the
>>    4-level SCSI LUN structure, which means that we can use the
>>    linearized SCSI LUN id as an index into the xarray.
>>    If we ever come across targets utilizing more that 2 levels of
>>    the LUN structure we'll allocate the first unused index and have
>>    to resort to a less efficient lookup instead of direct indexing.
>>
>> With these changes we can implement an efficient lookup mechanism,
>> devolving into direct lookup for most cases. It also allows us to
>> detect duplicate entries or accidental overwrites of existing elements
>> by using xa_cmpxchg().
>> And iteration over targets and devices should be as efficient as the
>> current, list-based, approach.
>>
>> As usual, comments and reviews are welcome.
> 
> I see absolutely no argument for what the point of this series.  It adds
> more code, and I don't really see any indications for it fixing bugs,
> speeding up workloads, or reducing memory usage.
> 
 From my perspective this is a proof-of-concept; using xarrays to store 
targets and LUNs has the benefit that we can directly access the 
elements, and the lookup will be more efficient for larger setups.

But it's not a clear-cut solution, merely replacing one concept with 
some issues with another concept with another set of issues.

Guess the real benefit will come only if we manage to move to explicit 
scsi target removal, and not the implicit model of making the scsi 
target dependent on the underlying scsi devices we have now.
I'll be experimenting with that and will post an update for it.

I _do_ like the xarray for targets, though; they have a fixed location 
where they can go and as such xarray are a far more natural choice.
For LUNs it's less compelling as xarrays can't use 64bits generically as 
index, but still.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-06-04 16:12 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02 11:33 [PATCHv4 0/6] scsi: use xarray for devices and targets Hannes Reinecke
2020-06-02 11:33 ` [PATCH 1/6] scsi: convert target lookup to xarray Hannes Reinecke
2020-06-02 11:33 ` [PATCH 2/6] target_core_pscsi: use __scsi_device_lookup() Hannes Reinecke
2020-06-02 11:33 ` [PATCH 3/6] scsi: move target device list to xarray Hannes Reinecke
2020-06-02 11:33 ` [PATCH 4/6] scsi: remove direct device lookup per host Hannes Reinecke
2020-06-02 11:33 ` [PATCH 5/6] scsi_error: use xarray lookup instead of wrappers Hannes Reinecke
2020-06-02 11:33 ` [PATCH 6/6] scsi: avoid pointless memory allocation in scsi_alloc_target() Hannes Reinecke
2020-06-03 12:53 ` [PATCHv4 0/6] scsi: use xarray for devices and targets Christoph Hellwig
2020-06-03 18:23   ` Douglas Gilbert
2020-06-04 16:12   ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.