From: John Garry <john.garry@huawei.com>
To: Kashyap Desai <kashyap.desai@broadcom.com>, <axboe@kernel.dk>,
	<jejb@linux.ibm.com>, <martin.petersen@oracle.com>,
	<ming.lei@redhat.com>, <bvanassche@acm.org>, <hare@suse.de>,
	<don.brace@microsemi.com>,
	Sumit Saxena <sumit.saxena@broadcom.com>, <hch@infradead.org>,
	Shivasharan Srikanteshwara 
	<shivasharan.srikanteshwara@broadcom.com>
Cc: <chenxiang66@hisilicon.com>, <linux-block@vger.kernel.org>,
	<linux-scsi@vger.kernel.org>, <esc.storagedev@microsemi.com>,
	Hannes Reinecke <hare@suse.com>
Subject: Re: [PATCH RFC v6 08/10] megaraid_sas: switch fusion adapters to MQ
Date: Fri, 17 Apr 2020 17:46:25 +0100	[thread overview]
Message-ID: <40faaef8-8bfc-639f-747f-cacd4e61464f@huawei.com> (raw)
In-Reply-To: <7cac3eb9fd79b5b988e25da542305b35@mail.gmail.com>

On 08/04/2020 10:59, Kashyap Desai wrote:

Hi Kashyap,

> 
>>> We have done some level of testing to gauge the performance impact on
>>> SAS SSD and HDD setups. Here are my findings. My test system is a
>>> two-socket Intel Skylake/Lewisburg/Purley. Output of numactl --hardware:
>>>
>>> available: 2 nodes (0-1)
>>> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 36 37 38 39
>>> 40 41 42 43 44 45 46 47 48 49 50 51 52 53
>>> node 0 size: 31820 MB
>>> node 0 free: 21958 MB
>>> node 1 cpus: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 54
>>> 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
>>> node 1 size: 32247 MB
>>> node 1 free: 21068 MB
>>> node distances:
>>> node   0   1
>>>   0:  10  21
>>>   1:  21  10

Do you have other info, like the IRQ-CPU affinity dump and the controller 
PCI vendor+device ID? The /proc/interrupts info after a run would also be 
good, like that supplied by Sumit here:

https://lore.kernel.org/linux-scsi/CAL2rwxotoWakFS4DPe85hZ4VAgd_zw8pL+B5ckHR9NwEf+-L=g@mail.gmail.com/

Are you enabling some special driver perf mode?
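
For reference, something like the below would capture it; "megasas" is my 
guess at the IRQ name in /proc/interrupts, so adjust as needed:

  # PCI vendor+device ID of the controller
  lspci -nn | grep -i raid

  # per-vector CPU affinity of the controller MSI-X interrupts
  for irq in $(grep megasas /proc/interrupts | awk -F: '{print $1}'); do
      echo "irq $irq -> $(cat /proc/irq/$irq/smp_affinity_list)"
  done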

>>>
>>>
>>> 64 HDD setup -
>>>
>>> With higher QD
>> What's QD?
>>
>>> and io scheduler = mq-deadline, shared host tag is not scaling well.
>>> If I use io scheduler = none, I can see consistent 2.0M IOPS.
>>> This issue is seen only with the RFC. Without the RFC, mq-deadline
>>> scales up to 2.0M IOPS.
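
So that I'm comparing like-for-like when I try to reproduce, this is 
roughly the fio job I would guess at - all parameters here are my 
assumptions, so please correct me:

  fio --name=hdd-test --ioengine=libaio --direct=1 --rw=randread \
      --bs=4k --iodepth=128 --numjobs=64 --runtime=60 --time_based \
      --group_reporting --filename=/dev/sdX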

In theory, from this driver's perspective, we should not be making a 
difference. That's after your change to use the sdev->device_busy count 
rather than the hctx nr_active count. As I understand it, that's the only 
difference you made.

But I will try an IO scheduler on hisi_sas with SSDs to see if there is any difference.
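
FWIW, flipping the scheduler per disk is just the below (sdX is a 
placeholder):

  cat /sys/block/sdX/queue/scheduler
  echo none > /sys/block/sdX/queue/scheduler
  echo mq-deadline > /sys/block/sdX/queue/scheduler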

>> I didn't try any scheduler. I can have a look at that.
>>
>>> Perf top result with the RFC driver (1.4M IOPS):
>>>
>>>       78.20%  [kernel]        [k] native_queued_spin_lock_slowpath
>>>        1.46%  [kernel]        [k] sbitmap_any_bit_set
>>>        1.14%  [kernel]        [k] blk_mq_run_hw_queue
>>>        0.90%  [kernel]        [k] _mix_pool_bytes
>>>        0.63%  [kernel]        [k] _raw_spin_lock
>>>        0.57%  [kernel]        [k] blk_mq_run_hw_queues
>>>        0.56%  [megaraid_sas]  [k] complete_cmd_fusion
>>>        0.54%  [megaraid_sas]  [k] megasas_build_and_issue_cmd_fusion
>>>        0.50%  [kernel]        [k] dd_has_work
>>>        0.38%  [kernel]        [k] _raw_spin_lock_irqsave
>>>        0.36%  [kernel]        [k] gup_pgd_range
>>>        0.35%  [megaraid_sas]  [k] megasas_build_ldio_fusion
>>>        0.31%  [kernel]        [k] io_submit_one
>>>        0.29%  [kernel]        [k] hctx_lock
>>>        0.26%  [kernel]        [k] try_to_grab_pending
>>>        0.24%  [kernel]        [k] scsi_queue_rq
>>>        0.22%  fio             [.] __fio_gettime
>>>        0.22%  [kernel]        [k] insert_work
>>>        0.20%  [kernel]        [k] native_irq_return_iret
>>>
>>> Perf top result without the RFC driver (2.0M IOPS):
>>>
>>>        58.40%  [kernel]          [k] native_queued_spin_lock_slowpath
>>>        2.06%  [kernel]          [k] _mix_pool_bytes
>>>        1.38%  [kernel]          [k] _raw_spin_lock_irqsave
>>>        0.97%  [kernel]          [k] _raw_spin_lock
>>>        0.91%  [kernel]          [k] scsi_queue_rq
>>>        0.82%  [kernel]          [k] __sbq_wake_up
>>>        0.77%  [kernel]          [k] _raw_spin_unlock_irqrestore
>>>        0.74%  [kernel]          [k] scsi_mq_get_budget
>>>        0.61%  [kernel]          [k] gup_pgd_range
>>>        0.58%  [kernel]          [k] aio_complete_rw
>>>        0.52%  [kernel]          [k] elv_rb_add
>>>        0.50%  [kernel]          [k] llist_add_batch
>>>        0.50%  [kernel]          [k] native_irq_return_iret
>>>        0.48%  [kernel]          [k] blk_rq_map_sg
>>>        0.48%  fio               [.] __fio_gettime
>>>        0.47%  [kernel]          [k] blk_mq_get_tag
>>>        0.44%  [kernel]          [k] blk_mq_dispatch_rq_list
>>>        0.40%  fio               [.] io_u_queued_complete
>>>        0.39%  fio               [.] get_io_u
>>>
>>>
>>> If you want me to test any top-up patch, please let me know. BTW, we
>>> also want to provide a module parameter for the user to switch back to
>>> the older nr_hw_queues = 1 mode. I will work on that part.
>> OK, but I would just like to reiterate the point that you will not see
>> the full benefit of blk-mq draining hw queues for CPU hotplug, since you
>> hide the hw queues from blk-mq.
> Agree. We have done minimal testing using this RFC. We want to ACK this
> RFC as long as the primary performance goal is achieved.
> 
> We have done full testing on nr_hw_queues = 1 (and that is what customers
> are using), so we at least want to keep that interface available to
> customers for some time (assuming they may find some performance gap
> between the two interfaces which we did not encounter during smoke
> testing). Over a period of time, if nr_hw_queues = N works for all the IO
> profiles (Broadcom will conduct full performance testing once the RFC is
> committed upstream), we will share with customers the benefit of using
> nr_hw_queues = N.

Hopefully you can use nr_hw_queues = N always.
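
As a sanity check on what blk-mq actually sees, debugfs shows one hctx 
directory per hardware queue, and offlining a CPU mid-IO exercises the 
hotplug draining path. This assumes debugfs is mounted; sdX and the CPU 
number are placeholders:

  # count of hardware queues exposed to blk-mq
  ls -d /sys/kernel/debug/block/sdX/hctx* | wc -l

  # offline/online a CPU to exercise queue draining
  echo 0 > /sys/devices/system/cpu/cpu8/online
  echo 1 > /sys/devices/system/cpu/cpu8/online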

> 

Thanks,
John
