[RFC v2 0/6] scsi: rework mid-layer with xarrays

* [RFC v2 0/6] scsi: rework mid-layer with xarrays
@ 2020-05-24 15:58 Douglas Gilbert
  2020-05-24 15:58 ` [RFC v2 1/6] scsi: xarray hctl Douglas Gilbert
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Douglas Gilbert @ 2020-05-24 15:58 UTC (permalink / raw)
  To: linux-scsi; +Cc: martin.petersen, jejb, hare

The SCSI subsystem uses a "hctl" topological tuple that stands for
"host,channel,target,lun". It implies that the SCSI mid-layer (or
mid-level) has a four level (inverted) object tree, or five level if
the root and leaf levels are both counted. Actually it is implemented
as a three level object tree in which the channel level is merged into
the target level.

The leaf level of the object tree (i.e. the "lun" part of the tuple)
derives its name from SCSI Logical Units (LUs) identified by LU Numbers
(LUNs). For historical reasons Linux terms LUs as "devices" and the
corresponding object in the subsystem's API is of type struct
scsi_device. Somewhat confusingly Linux also terms the host as a
"device". LUs are typically block storage while SCSI hosts are usually
associated with Host Bus Adapters (HBAs). To complete the confusion
in SCSI terminology the target is known as a "device". 

This series replaces the doubly linked lists that bind together the SCSI
subsystem's object tree with xarrays. At the top of that object tree is
an idr/ida virtual array which holds a collection of SCSI hosts that are
active in the system. The idr/ida implementation already uses xarrays,
so the collection of SCSI hosts is not altered by this patchset.

Each SCSI host object (shost) holds two collections: one of scsi_target
objects (starg(s)), the other of scsi_device objects (sdev(s)). That
second collection is technically redundant since each starg has a
collection of sdevs. Currently these three collections are implemented
using doubly linked lists (see: include/linux/list.h). This patchset
replaces those linked lists collections with xarrays. That replacement
is done in the first patch. In the remaining 5 patches, some advantage
is taken of xarray features. Also some inefficient iterations are
reworked.

Why xarrays?
------------
  - xarrays have built-in locking: spinlocks for modifying operations
    and rcu (locks) for non-modifying operations. With list_head you
    are on your own.
  - the name xarray suggests O(1) indexing (but it isn't that fast) and
    feels like it should be faster than doubly linked lists. [And
    xarrays are not true arrays, they just have a similar interface.]
  - struct list_head seems too clever. Looking at a group of related
    structures used to build an object tree (e.g. struct scsi_target,
    it is difficult to distinguish the collections from the elements of
    a collection. There are just lots of:
        struct list_head <name_obscures_whether_collector_or_collectee>;
  - struct list_head evenly distributes its storage overhead between the
    collection holder and the elements of a collection: 2 pointers each,
    16 bytes on 64 bit machines. xarray imposes an overhead on the
    collection holder but a smaller overhead on its elements, perhaps we
    can restrict it to 2 bytes: [0..USHRT_MAX]? Failing that, a 4 byte
    overhead per element.
  - linked lists don't scale very well on multi-core machines
  - any decisions that can be made on the basis of xarray marks is a
    cache(line) win, as there is no need to pull in the corresponding
    element (e.g. see: SCSI_XA_NON_SDEV_DEL)
  - xarray is well instrumented and will warn (once) at run time if it
    doesn't like what it finds with regard to locking.

At several points in the code changes are comments starting with
"XARRAY: ". These are typically where list manipulations have been
changed out for xarray operations. There might be subtle changes in
semantics (e.g. if an addition racing with a deletion is permitted).

Locking
-------
xarrays have inbuilt locking. Since the collection is held in the
parent, they are inherently safer than doubly linked lists. Lists can
crash within an iteration if either the current or next element is
deleted. As long as the parent doesn't get deleted, then a xarray
iteration will produce a pointer (or NULL) without crashing. The rcu
lock on the xarray iteration guarantees that. Dereferencing that
pointer is where the fun begins (i.e. it may crash). The caller may
know that (scsi_device) object must exist because it holds a reference
to it.

This patchset does _not_ weaken the existing locking structures in the
SCSI mid-layer. It just adds another locking layer underneath it. So it
is "over-locked" especially on modifying operations.

Performance
===========
It is a real struggle to measure anything meaningful on the creation and
deletion side. The problem is systemd and udev which conspire to show
the tester that they own 99.9% of all available cores during major
disruptions (e.g. adding 6000 SCSI devices). Using mask and stop on
systemd-udevd and systemd-journald helps (but they must be unmasked
before a reboot otherwise problems will follow).

It is hard to beat a linked list when iterating through a collection.
A possible win is on the lookup side, if that is required (and it may
not be). A xarray could be maintained in ascending order so a binary
search could be done (e.g. order shost->__targets with the channel
combined with target_id).

Douglas Gilbert (6):
  scsi: xarray hctl
  scsi: xarray, iterations on scsi_target
  scsi: xarray mark sdev_del state
  scsi: improve scsi_device_lookup
  scsi: add __scsi_target_lookup
  scsi: count number of targets

 drivers/scsi/hosts.c               |   8 +-
 drivers/scsi/scsi.c                | 194 +++++++++++++++++++++++------
 drivers/scsi/scsi_error.c          |  34 ++---
 drivers/scsi/scsi_lib.c            |  38 +++++-
 drivers/scsi/scsi_scan.c           |  70 +++++++----
 drivers/scsi/scsi_sysfs.c          |  56 ++++++---
 drivers/target/target_core_pscsi.c |   2 +-
 include/scsi/scsi_device.h         | 148 ++++++++++++++++++++--
 include/scsi/scsi_host.h           |  15 ++-
 include/scsi/scsi_transport.h      |   3 +-
 10 files changed, 448 insertions(+), 120 deletions(-)

-- 
2.25.1

^ permalink raw reply	[flat|nested] 25+ messages in thread