linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Douglas Gilbert <dgilbert@interlog.com>
To: Bart Van Assche <bvanassche@acm.org>,
	James Bottomley <jejb@linux.vnet.ibm.com>,
	linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
	linux-api@vger.kernel.org
Cc: martin.petersen@oracle.com, hare@suse.de,
	Arnd Bergmann <arnd@arndb.de>,
	Tony Battersby <tonyb@cybernetics.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v3 00/20] sg: add v4 interface
Date: Fri, 16 Aug 2019 11:59:04 -0400	[thread overview]
Message-ID: <d0c60641-0607-a9c4-e79d-b6e850ef8682@interlog.com> (raw)
In-Reply-To: <6606add1-7ae7-5d8d-e660-d267164981d9@acm.org>

On 2019-08-15 1:30 p.m., Bart Van Assche wrote:
> On 8/13/19 9:19 PM, Douglas Gilbert wrote:
>> Bart Van Assche hinted at a better API design but didn't present
>> it. If he did, that would be the first time an alternate API
>> design was presented for async usage in the 20 years that I have
>> been associated with the driver.
> 
> I would like to start from the use cases instead of the implementation of a new 
> SG/IO interface. My employer uses the SG/IO interface for controlling SMR and 

There is no "new" SG/IO interface. Linux has broken the ability of char
drivers to safely use the read() and write() system calls. This
adversely impacts the bsg and sg drivers. In response the following
replacement mappings have been suggested in my first sg patchset:

1) For sg driver currently in production, its async interface:
        write(sg_fd, &sg_v3_obj, sizeof(sg_v3_obj))
          ----->  ioctl(sg_fd, SG_IOSUBMIT_V3, &sg_v3_obj)
    and
        read(sg_fd, &sg_v3_obj, sizeof(sg_v3_obj))
          ----->  ioctl(sg_fd, SG_RECEIVE_V3, &sg_v3_obj)

    And send out a WARN_ONCE when write(sg_fd, &sg_v3_obj,...) is used.

2) For the async portion of the bsg driver that was removed last
    year, the following, slightly more complex mapping is proposed:
        write(bsg_fd, &sg_v4_obj, sizeof(sg_v4_obj))
          ----->  ioctl(sg_fd_equiv_bsg, SG_IOSUBMIT, &sg_v4_obj)
    and
        read(bsg_fd, &sg_v4_obj, sizeof(sg_v4_obj))
          ----->  ioctl(sg_fd_equiv_bsg, SG_RECEIVE, &sg_v4_obj)

    The bsg_fd --> sg_fd_equiv_bsg mapping can be done with the help
    of sysfs.


There is another case with the bsg async interface where the third
argument to write() and read() is a multiple of the size of a sg_v4_obj.
I call that a multiple requests invocation. That is handled in my second
patchset with an extra level of indirection. Yes, that is a change in
the API, but it is more on the syntax side rather than the semantics side.

The ioctls have another advantage over the write()/read() interface.
The reader will notice the both SG_IOSUBMIT and SG_IORECEIVE are defined
with the _IOWR() macro indicating bi-directional dataflow. The "reverse"
direction dataflow for the submit side is when a tag is sent back from
the block layer. For the receive side the reverse flow is when matching
either by pack_id or tag.

Also some longstanding features of the sg async API such as
ioctl(SG_GET_NUM_WAITING) can lead to a reduction in API traffic. Say
we have 20 SCSI commands that don't depend on one another (e.g. READ
GATHERED). They could be submitted asynchronously with a single
multiple requests invocation by ioctl(SG_IOSUBMIT) with the flag
SGV4_FLAG_IMMED set. The user code could then wait for one (any one)
to finish and process it (so that is two API calls so far). Now an
ioctl(SG_GET_NUM_WAITING) could be issued and say it gets 3 then a
multiple requests invocation of ioctl(SG_IORECEIVE) for those 3
could be sent and complete promptly. Now the tally of API calls is
up to 4. If another ioctl(SG_GET_NUM_WAITING) was issued and say
it yielded 16 then a multiple requests invocation of
ioctl(SG_IORECEIVE) for those 16 would complete the originally
submitted 20 SCSI commands. The total tally of API calls is 6 with
only 1 of those waiting. The wait could be made fully async by
using a polling loop or a signal to replace that (and any other)
wait.

If the user space didn't mind blocking then the whole 20 SCSI commands
could be processed efficiently with a single multiple requests
invocation using ioctl(SG_IOSUBMIT) with the SGV4_FLAG_IMMED flag
cleared. It would first issue all 20 command then return after all 20
commands were complete. That is an extension of the removed bsg async
SCSI API, but a pretty good one IMO.

The sg driver's async model remains basically the same as when the
driver first appeared in 1992. Naturally there have been enhancements
along the way, such as that last example.

> HSMR disks. What we need is the ability to discover, read, write and configure 
> such disks, support for the non-standard HSMR flex protocol, the ability to give 
> certain users or groups access to a subset of the LBAs and also the ability to 
> make that information persistent. I think that such functionality could be 
> implemented by extending LVM and by adding support for all ZBC commands we need 
> in the block layer, device mapper layer and also in the asynchronous I/O layer. 
> The block, dm and aio layers already support submitting commands asynchronously 
> but do not yet support all the ZBC commands that we use.

I believe that you will find that the more layers of abstraction that are
placed between the actual device and the OS level API, the more difficult
the discovery process will be. And in some cases you will need to get to
a management layer to let those management functions "pass-through" those
layers. Some RAID card drivers take advantage of the no_uld_attach flag in
scsi_device to expose real devices, but only to the sg/bsg interface for
management purposes (for utilities like smartmontools) and do not produce
sd device nodes.

> Are there any SG/IO use cases that have not yet been mentioned in this e-mail 
> thread? If SMR and HSMR are the primary use cases for SG/IO, should asynchronous 
> command support be added in the SG/IO layer or should rather ZBC support in the 
> block, dm and aio layers be improved?

My guess is quite a few, and the companies involved don't want to talk about
their use publicly. For example when a big computer company starts reporting
errors, I believe my role is to try and fix the errors, not to interrogate
them about how and why they are using the driver. On the other hand, Tony
Battersby has been relatively active on this list and produced patches for
the sg driver over several years. Tony is well positioned to know the
driver's  strengths and weaknesses but has said that he has little time to
review these patchsets. I appreciate any feedback I can get from him.

Doug Gilbert



  reply	other threads:[~2019-08-16 15:59 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-07 11:42 [PATCH v3 00/20] sg: add v4 interface Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 01/20] sg: move functions around Douglas Gilbert
2019-08-12 14:22   ` Christoph Hellwig
2019-08-07 11:42 ` [PATCH v3 02/20] sg: remove typedefs, type+formatting cleanup Douglas Gilbert
2019-08-12 14:22   ` Christoph Hellwig
2019-08-07 11:42 ` [PATCH v3 03/20] sg: sg_log and is_enabled Douglas Gilbert
2019-08-12 14:23   ` Christoph Hellwig
2019-08-13 22:57     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 04/20] sg: rework sg_poll(), minor changes Douglas Gilbert
2019-08-12 14:23   ` Christoph Hellwig
2019-08-13  0:35     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 05/20] sg: bitops in sg_device Douglas Gilbert
2019-08-12 14:23   ` Christoph Hellwig
2019-08-14  1:35     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 06/20] sg: make open count an atomic Douglas Gilbert
2019-08-12 14:23   ` Christoph Hellwig
2019-08-07 11:42 ` [PATCH v3 07/20] sg: move header to uapi section Douglas Gilbert
2019-08-12 14:24   ` Christoph Hellwig
2019-08-12 14:32     ` Greg KH
2019-08-12 14:35     ` James Bottomley
2019-08-13  0:21     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 08/20] sg: speed sg_poll and sg_get_num_waiting Douglas Gilbert
2019-08-12 14:31   ` Christoph Hellwig
2019-08-12 16:31     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 09/20] sg: sg_allow_if_err_recovery and renames Douglas Gilbert
2019-08-12 14:31   ` Christoph Hellwig
2019-08-14  1:26     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 10/20] sg: remove most access_ok functions Douglas Gilbert
2019-08-12 14:32   ` Christoph Hellwig
2019-08-07 11:42 ` [PATCH v3 11/20] sg: replace rq array with lists Douglas Gilbert
2019-08-12 14:35   ` Christoph Hellwig
2019-08-13 23:46     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 12/20] sg: sense buffer rework Douglas Gilbert
2019-08-12 14:37   ` Christoph Hellwig
2019-08-12 16:26     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 13/20] sg: add sg v4 interface support Douglas Gilbert
2019-08-09 23:12   ` James Bottomley
2019-08-11 19:21     ` Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 14/20] sg: rework debug info Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 15/20] sg: add 8 byte SCSI LUN to sg_scsi_id Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 16/20] sg: expand sg_comm_wr_t Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 17/20] sg: add sg_iosubmit_v3 and sg_ioreceive_v3 ioctls Douglas Gilbert
2019-08-09 23:15   ` James Bottomley
2019-08-12 15:37     ` Douglas Gilbert
2019-08-12 16:14       ` Tony Battersby
2019-08-12 18:46         ` James Bottomley
2019-08-12 19:37           ` Tony Battersby
2019-08-07 11:42 ` [PATCH v3 18/20] sg: add some __must_hold macros Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 19/20] sg: first debugfs support Douglas Gilbert
2019-08-07 11:42 ` [PATCH v3 20/20] sg: bump version to 4.0.03 Douglas Gilbert
2019-08-08 19:10 ` [PATCH v3 00/20] sg: add v4 interface James Bottomley
2019-08-08 21:08   ` Douglas Gilbert
2019-08-08 21:37     ` Tony Battersby
2019-08-08 22:25       ` Bart Van Assche
2019-08-09 13:28         ` Tony Battersby
2019-08-08 23:00     ` James Bottomley
2019-08-14  4:19       ` Douglas Gilbert
2019-08-15 17:30         ` Bart Van Assche
2019-08-16 15:59           ` Douglas Gilbert [this message]
2019-08-16 17:19             ` Greg KH
2019-08-16 18:10             ` Bart Van Assche
2019-08-16 18:44               ` Douglas Gilbert
2019-08-12 15:23   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d0c60641-0607-a9c4-e79d-b6e850ef8682@interlog.com \
    --to=dgilbert@interlog.com \
    --cc=arnd@arndb.de \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=jejb@linux.vnet.ibm.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=tonyb@cybernetics.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).