From: John Garry <john.garry@huawei.com>
To: Kashyap Desai <kashyap.desai@broadcom.com>, Jens Axboe <axboe@kernel.dk>
Cc: <linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<ming.lei@redhat.com>, <hare@suse.de>,
	<linux-scsi@vger.kernel.org>
Subject: Re: [PATCH v5 00/14] blk-mq: Reduce static requests memory footprint for shared sbitmap
Date: Fri, 8 Oct 2021 11:17:35 +0100	[thread overview]
Message-ID: <8867352d-2107-1f8a-0f1c-ef73450bf256@huawei.com> (raw)
In-Reply-To: <e4e92abbe9d52bcba6b8cc6c91c442cc@mail.gmail.com>

On 07/10/2021 21:31, Kashyap Desai wrote:
> Perf top data indicates lock contention in the "blk_mq_find_and_get_req" call.
> 
> 1.31%     1.31%  kworker/57:1H-k  [kernel.vmlinux]
>       native_queued_spin_lock_slowpath
>       ret_from_fork
>       kthread
>       worker_thread
>       process_one_work
>       blk_mq_timeout_work
>       blk_mq_queue_tag_busy_iter
>       bt_iter
>       blk_mq_find_and_get_req
>       _raw_spin_lock_irqsave
>       native_queued_spin_lock_slowpath
> 
> 
> Kernel v5.14 Data -
> 
> %Node1 :  8.4 us, 31.2 sy,  0.0 ni, 43.7 id,  0.0 wa,  0.0 hi, 16.8 si,  0.0 st
>       4.46%  [kernel]       [k] complete_cmd_fusion
>       3.69%  [kernel]       [k] megasas_build_and_issue_cmd_fusion
>       2.97%  [kernel]       [k] blk_mq_find_and_get_req
>       2.81%  [kernel]       [k] megasas_build_ldio_fusion
>       2.62%  [kernel]       [k] syscall_return_via_sysret
>       2.17%  [kernel]       [k] __entry_text_start
>       2.01%  [kernel]       [k] io_submit_one
>       1.87%  [kernel]       [k] scsi_queue_rq
>       1.77%  [kernel]       [k] native_queued_spin_lock_slowpath
>       1.76%  [kernel]       [k] scsi_complete
>       1.66%  [kernel]       [k] llist_reverse_order
>       1.63%  [kernel]       [k] _raw_spin_lock_irqsave
>       1.61%  [kernel]       [k] llist_add_batch
>       1.39%  [kernel]       [k] aio_complete_rw
>       1.37%  [kernel]       [k] read_tsc
>       1.07%  [kernel]       [k] blk_complete_reqs
>       1.07%  [kernel]       [k] native_irq_return_iret
>       1.04%  [kernel]       [k] __x86_indirect_thunk_rax
>       1.03%  fio            [.] __fio_gettime
>       1.00%  [kernel]       [k] flush_smp_call_function_queue
> 
> 
> Test #2: Three VDs (each VD consists of 8 SAS SSDs).
> (numactl -N 1 fio 3vd.fio --rw=randread --bs=4k --iodepth=32 --numjobs=8
> --ioscheduler=none/mq-deadline)
> 
> There is a performance regression, but it is not due to this patch set.
> Kernel v5.11 gives 2.1M IOPS with mq-deadline, but 5.15 (without this
> patchset) gives 1.8M IOPS.
> In this test I did not notice the CPU issue mentioned in Test-1.
> 
> In general, I noticed host_busy is incorrect once I apply this patchset.
> It should not exceed can_queue, but the sysfs host_busy value is very
> high while IOs are running. This issue appears only after applying this
> patchset.
> 
> Does this patch set only change the behavior of <shared_host_tag>
> enabled drivers? Will there be any impact on the mpi3mr driver? I can
> test that as well.

I can see where the high host_busy value is coming from in this 
series - blk_mq_tagset_busy_iter() incorrectly re-iterates the same 
tags once per hw queue - d'oh.
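
For a bit more context on why the over-count scales with the hw queue 
count: the sysfs host_busy value comes from scsi_host_busy(), which 
just counts in-flight requests by walking the tagset. Paraphrasing 
from memory here, so treat this as a sketch rather than the exact 
source:

#include <linux/blk-mq.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_host.h>

static bool scsi_host_check_in_flight(struct request *rq, void *priv,
				      bool reserved)
{
	int *count = priv;
	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);

	/* only count commands which have been issued to the LLD */
	if (test_bit(SCMD_STATE_INFLIGHT, &cmd->state))
		(*count)++;
	return true;
}

int scsi_host_busy(struct Scsi_Host *shost)
{
	int cnt = 0;

	/*
	 * This walks tagset->tags[i] for every hw queue. With hostwide
	 * shared tags, every tags[i] now points at the same struct
	 * blk_mq_tags, so each in-flight command is counted
	 * nr_hw_queues times - hence host_busy appearing way above
	 * can_queue.
	 */
	blk_mq_tagset_busy_iter(&shost->tag_set,
				scsi_host_check_in_flight, &cnt);
	return cnt;
}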

Please try the patch below. I have looked at other places where we 
loop over the hw queue count for tagset->tags[] and could have a 
similar problem, and they look ok, but I will double-check. I think 
that blk_mq_queue_tag_busy_iter() should be fine - Ming?

--->8----

From e6ecaa6d624ebb903fa773ca2a2035300b4c55c5 Mon Sep 17 00:00:00 2001
From: John Garry <john.garry@huawei.com>
Date: Fri, 8 Oct 2021 10:55:11 +0100
Subject: [PATCH] blk-mq: Fix blk_mq_tagset_busy_iter() for shared tags

Since it is now possible for a tagset to share a single set of tags
across all hw queues, the iterator should not walk those tags once per
hw queue in that case. Rather, it should walk them just once.

Signed-off-by: John Garry <john.garry@huawei.com>

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 72a2724a4eee..ef888aab81b3 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -378,9 +378,15 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		busy_tag_iter_fn *fn, void *priv)
 {
+	int nr_hw_queues;
 	int i;
 
-	for (i = 0; i < tagset->nr_hw_queues; i++) {
+	if (blk_mq_is_shared_tags(tagset->flags))
+		nr_hw_queues = 1;
+	else
+		nr_hw_queues = tagset->nr_hw_queues;
+
+	for (i = 0; i < nr_hw_queues; i++) {
 		if (tagset->tags && tagset->tags[i])
 			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
 					      BT_TAG_ITER_STARTED);

----8<----
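
With the above applied, the sysfs host_busy value should again stay at 
or below can_queue while I/O is running - it would be great if you 
could confirm that on your setup.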

Thanks,
john


Thread overview: 23+ messages
2021-10-05 10:23 [PATCH v5 00/14] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
2021-10-05 10:23 ` [PATCH v5 01/14] blk-mq: Change rqs check in blk_mq_free_rqs() John Garry
2021-10-05 10:23 ` [PATCH v5 02/14] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ John Garry
2021-10-05 10:23 ` [PATCH v5 03/14] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() John Garry
2021-10-05 10:23 ` [PATCH v5 04/14] blk-mq: Invert check " John Garry
2021-10-05 10:23 ` [PATCH v5 05/14] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_rqs}() John Garry
2021-10-05 10:23 ` [PATCH v5 06/14] blk-mq-sched: Rename blk_mq_sched_free_{requests -> rqs}() John Garry
2021-10-05 10:23 ` [PATCH v5 07/14] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
2021-10-05 10:23 ` [PATCH v5 08/14] blk-mq: Don't clear driver tags own mapping John Garry
2021-10-05 10:23 ` [PATCH v5 09/14] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() John Garry
2021-10-05 10:23 ` [PATCH v5 10/14] blk-mq: Add blk_mq_alloc_map_and_rqs() John Garry
2021-10-05 10:23 ` [PATCH v5 11/14] blk-mq: Refactor and rename blk_mq_free_map_and_{requests->rqs}() John Garry
2021-10-05 10:23 ` [PATCH v5 12/14] blk-mq: Use shared tags for shared sbitmap support John Garry
2021-10-05 10:23 ` [PATCH v5 13/14] blk-mq: Stop using pointers for blk_mq_tags bitmap tags John Garry
2021-10-05 10:23 ` [PATCH v5 14/14] blk-mq: Change shared sbitmap naming to shared tags John Garry
2021-10-05 12:35 ` [PATCH v5 00/14] blk-mq: Reduce static requests memory footprint for shared sbitmap Jens Axboe
2021-10-05 13:34   ` John Garry
2021-10-05 13:53     ` Kashyap Desai
2021-10-05 16:23     ` Jens Axboe
2021-10-07 20:31     ` Kashyap Desai
2021-10-08  3:11       ` Bart Van Assche
2021-10-08  8:07         ` John Garry
2021-10-08 10:17       ` John Garry [this message]
