* [PATCH 00/12 V2] target: fix cmd plugging and completion
@ 2021-02-09 12:38 Mike Christie
From: Mike Christie @ 2021-02-09 12:38 UTC
  To: Chaitanya.Kulkarni, loberman, martin.petersen, linux-scsi,
	target-devel, mst, stefanha, virtualization

The following patches made over Martin's 5.12 branches fix two
issues:

1. target_core_iblock plugs and unplugs the queue for every
command. To handle this issue and handle an issue that
vhost-scsi and loop were avoiding by adding their own workqueue,
I added a new submission workqueue to LIO. Drivers can pass cmds
to it, and we can then submit batches of cmds.

2. vhost-scsi and loop on the submission side were doing a work
per cmd, and LIO on the completion side was doing a work per
cmd. The cap on running works is 512 (max_active), so we can
end up using a lot of threads when submissions start blocking
because they hit the block tag limit or the completion side blocks
trying to send the cmd. In this patchset I just use a cmd list
per session to avoid abusing the workqueue layer.
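
As a rough illustration of the two items above, here is a minimal
userspace sketch in Python. It is not the kernel code from the
patches. The names Backend, DeviceQueue, queue_cmd and run_work are
hypothetical and exist only for this model. It compares one
plug/unplug cycle per cmd against a single work item that drains a
cmd list and plugs once per batch.

```python
from collections import deque


class Backend:
    """Toy stand-in for a block backend; counts plug/unplug cycles."""

    def __init__(self):
        self.plug_cycles = 0
        self.submitted = []

    def plug(self):
        self.plug_cycles += 1

    def unplug(self):
        # A real backend would kick off the queued I/O here.
        pass

    def submit(self, cmd):
        self.submitted.append(cmd)


def submit_per_cmd(backend, cmds):
    """Behaviour described in item 1: plug and unplug around every cmd."""
    for cmd in cmds:
        backend.plug()
        backend.submit(cmd)
        backend.unplug()


class DeviceQueue:
    """Hypothetical cmd list drained by one work item (assumption)."""

    def __init__(self, backend):
        self.backend = backend
        self.pending = deque()
        self.work_scheduled = False
        self.works_run = 0

    def queue_cmd(self, cmd):
        """Add a cmd; return True only if the caller must schedule the work."""
        self.pending.append(cmd)
        if not self.work_scheduled:
            self.work_scheduled = True
            return True
        return False

    def run_work(self):
        """Drain everything queued so far under a single plug/unplug."""
        self.work_scheduled = False
        self.works_run += 1
        batch = list(self.pending)
        self.pending.clear()
        if not batch:
            return 0
        self.backend.plug()
        for cmd in batch:
            self.backend.submit(cmd)
        self.backend.unplug()
        return len(batch)
```

In this model, queueing eight cmds before the work runs schedules one
work item and one plug cycle, where the per-cmd path needs eight of
each.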

The combined patchset fixes a major perf issue we've been hitting
where IOPs is stuck at 230K when running:

    fio --filename=/dev/sda --direct=1 --rw=randrw --bs=4k \
        --ioengine=libaio --iodepth=128 --numjobs=8 --time_based \
        --group_reporting --runtime=60

The patches in this set get me to 350K when using devices that
have native IOPs of around 400-500K.

Note that 5.12 has some interrupt changes that my patches
collide with. Martin's 5.12 branches had the changes so I
based my patches on that.

V2:
- Fix up container_of use coding style
- Handle offlist review comment from Laurence where with the
original code and my patches we can hit a bug where the cmd
times out, LIO starts up the TMR code, but it misses the cmd
because it's on the workqueue.
- Made the work per device instead of per session to handle
the previous issue, and so that if one dev hits an issue it
sleeps on, it won't block other devices.
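
The TMR issue in the V2 notes above can be pictured with a small
Python model. This is only a sketch under assumptions: the Device
class, its two lists and the flush_first flag are invented for the
example and do not reflect the real LIO data structures. It shows an
abort missing a cmd that is still waiting on the submission work, and
finding it once that work is flushed first.

```python
class Device:
    """Toy per-device state (hypothetical names, not LIO's structures)."""

    def __init__(self):
        self.queued = []  # cmds still waiting for the submission work
        self.active = []  # cmds the TMR code is able to see

    def queue_cmd(self, tag):
        self.queued.append(tag)

    def flush_submission_work(self):
        """Run the pending submission work so queued cmds become visible."""
        self.active.extend(self.queued)
        self.queued.clear()

    def abort_task(self, tag, flush_first):
        """Return True if the abort found and removed the cmd."""
        if flush_first:
            self.flush_submission_work()
        if tag in self.active:
            self.active.remove(tag)
            return True
        return False
```

With flush_first=False the abort of a still-queued cmd reports it as
not found, which is the miss described above. With flush_first=True
the cmd is found.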




* [PATCH 00/13 V3] target: fix cmd plugging and completion
@ 2021-02-10  4:55 Mike Christie
From: Mike Christie @ 2021-02-10  4:55 UTC
  To: bostroesser, Chaitanya.Kulkarni, loberman, martin.petersen,
	linux-scsi, target-devel, mst, stefanha

The following patches were made over Martin's 5.12 branches

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/log/?h=5.12/scsi-staging
or
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/log/?h=5.12/scsi-queue

to handle conflicts with the in_interrupt changes. They fix two
issues:

1. target_core_iblock plugs and unplugs the queue for every
command. To handle this issue and handle an issue that
vhost-scsi and loop were avoiding by adding their own workqueue,
I added a new submission workqueue to LIO. Drivers can pass cmds
to it, and we can then submit batches of cmds.

2. vhost-scsi and loop on the submission side were doing a work
per cmd, and LIO on the completion side was doing a work per
cmd. The cap on running works is 512 (max_active), so we can
end up using a lot of threads when submissions start blocking
because they hit the block tag limit or the completion side blocks
trying to send the cmd. In this patchset I just use a cmd list
per session to avoid abusing the workqueue layer.

The combined patchset fixes a major perf issue we've been hitting
where IOPs is stuck at 230K when running:

    fio --filename=/dev/sda --direct=1 --rw=randrw --bs=4k \
        --ioengine=libaio --iodepth=128 --numjobs=8 --time_based \
        --group_reporting --runtime=60

The patches in this set get me to 350K when using devices that
have native IOPs of around 400-500K.

V3:
- Fix rc type in target_submit so it's a sense_reason_t
- Add BUG_ON if caller uses target_queue_cmd_submit but hasn't
implemented get_cdb.
- Drop unused variables in loop.
- Fix race in tcmu plug check
- Add comment about how plug check works in iblock
- Do a flush when handling TMRs instead of cancel

V2:
- Fix up container_of use coding style
- Handle offlist review comment from Laurence where with the
original code and my patches we can hit a bug where the cmd
times out, LIO starts up the TMR code, but it misses the cmd
because it's on the workqueue.
- Made the work per device instead of per session to handle
the previous issue, and so that if one dev hits an issue it
sleeps on, it won't block other devices.





end of thread, other threads:[~2021-02-10 18:48 UTC | newest]

Thread overview: 50+ messages
2021-02-09 12:38 [PATCH 00/12 V2] target: fix cmd plugging and completion Mike Christie
2021-02-09 12:38 ` [PATCH 01/13] target: move t_task_cdb initialization Mike Christie
2021-02-10  8:35   ` Christoph Hellwig
2021-02-09 12:38 ` [PATCH 02/13] target: split target_submit_cmd_map_sgls Mike Christie
2021-02-09 16:15   ` kernel test robot
2021-02-09 18:40     ` Mike Christie
2021-02-10  8:36   ` Christoph Hellwig
2021-02-09 12:38 ` [PATCH 03/13] target: add workqueue based cmd submission Mike Christie
2021-02-09 15:48   ` Bodo Stroesser
2021-02-09 18:43     ` Mike Christie
2021-02-09 19:10       ` Mike Christie
2021-02-09 12:38 ` [PATCH 04/13] vhost scsi: use lio wq cmd submission helper Mike Christie
2021-02-09 12:38 ` [PATCH 05/13] tcm loop: use blk cmd allocator for se_cmds Mike Christie
2021-02-10  8:37   ` Christoph Hellwig
2021-02-09 12:38 ` [PATCH 06/13] tcm loop: use lio wq cmd submission helper Mike Christie
2021-02-09 15:59   ` Bodo Stroesser
2021-02-09 18:44     ` Mike Christie
2021-02-09 17:39   ` kernel test robot
2021-02-09 12:38 ` [PATCH 07/13] target: cleanup cmd flag bits Mike Christie
2021-02-10  8:38   ` Christoph Hellwig
2021-02-09 12:38 ` [PATCH 08/13] target: fix backend plugging Mike Christie
2021-02-09 12:38 ` [PATCH 09/13] target iblock: add backend plug/unplug callouts Mike Christie
2021-02-09 12:38 ` [PATCH 10/13] target_core_user: " Mike Christie
2021-02-09 16:32   ` Bodo Stroesser
2021-02-09 18:59     ` Mike Christie
2021-02-09 12:38 ` [PATCH 11/13] target: replace work per cmd in completion path Mike Christie
2021-02-09 17:01   ` Bodo Stroesser
2021-02-09 18:50     ` Mike Christie
2021-02-10  8:42   ` Christoph Hellwig
2021-02-10 18:33     ` Mike Christie
2021-02-09 12:38 ` [PATCH 12/13] target, vhost-scsi: don't switch cpus on completion Mike Christie
2021-02-10  8:44   ` Christoph Hellwig
2021-02-10 18:43     ` Mike Christie
2021-02-10 18:45       ` Mike Christie
2021-02-09 12:38 ` [PATCH 13/13] target: flush submission work during TMR processing Mike Christie
2021-02-09 14:31   ` Laurence Oberman
2021-02-10 14:25     ` Laurence Oberman
2021-02-09 17:05   ` Bodo Stroesser
2021-02-09 18:49     ` Mike Christie
2021-02-10  4:55 [PATCH 00/13 V3] target: fix cmd plugging and completion Mike Christie
2021-02-10  4:55 ` [PATCH 05/13] tcm loop: use blk cmd allocator for se_cmds Mike Christie
