nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: "matsuda-daisuke@fujitsu.com" <matsuda-daisuke@fujitsu.com>
To: 'Bob Pearson' <rpearsonhpe@gmail.com>,
	'Bart Van Assche' <bvanassche@acm.org>,
	Yanjun Zhu <yanjun.zhu@linux.dev>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"leonro@nvidia.com" <leonro@nvidia.com>,
	"jgg@nvidia.com" <jgg@nvidia.com>,
	"zyjzyj2000@gmail.com" <zyjzyj2000@gmail.com>
Cc: "nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"yangx.jy@fujitsu.com" <yangx.jy@fujitsu.com>,
	"lizhijian@fujitsu.com" <lizhijian@fujitsu.com>,
	"y-goto@fujitsu.com" <y-goto@fujitsu.com>
Subject: RE: [RFC PATCH 2/7] RDMA/rxe: Convert the triple tasklets to workqueues
Date: Wed, 28 Sep 2022 06:40:15 +0000	[thread overview]
Message-ID: <TYCPR01MB84552F7598F695D0C7CE16EBE5549@TYCPR01MB8455.jpnprd01.prod.outlook.com> (raw)
In-Reply-To: <4a9f2b42-f230-b951-f03b-588b94cd39a6@gmail.com>

Dear Bob, Yanjun, and Bart,

Sorry for taking long time to reply.

> On 9/12/22 02:58, matsuda-daisuke@fujitsu.com wrote:
> > On Mon, Sep 12, 2022 12:09 AM Bart Van Assche wrote:
> >> On 9/11/22 00:10, Yanjun Zhu wrote:
> >>> I also implemented a workqueue for rxe. IMO, can we add a variable to
> >>> decide to use tasklet or workqueue?
> >>>
> >>> If user prefer using tasklet, he can set the variable to use
> >>> tasklet. And the default is tasklet. Set the variable to another
> >>> value to use workqueue.
> >
> > That's an interesting idea, but I am not sure how users specify it.
> > IIRC, tasklets are generated when rdma link is added, typically by
> > executing ' rdma link add' command. I don't think we can add
> > an device specific option to the utility(iproute2/rdma).
> >
> >>
> >> I'm in favor of removing all uses of the tasklet mechanism because of
> >> the disadvantages of that mechanism. See also:
> >> * "Eliminating tasklets" (https://lwn.net/Articles/239633/).
> >> * "Modernizing the tasklet API" (https://lwn.net/Articles/830964/).
> >> * Sebastian Andrzej Siewior's opinion about tasklets
> >> (https://lore.kernel.org/all/YvovfXMJQAUBsvBZ@linutronix.de/).
> >
> > I am also in favor of using workqueues alone not only because of the
> > disadvantages above but also to avoid complexity. I would like to know
> > if there is anybody who will bothered by the change especially in terms
> > of performance.
> >
> > Thanks,
> > Daisuke
> >
> >>
> >> Thanks,
> >>
> >> Bart.
> >
> 
> The HPE patch set for work queues (I should send it in) kept the ability to run tasklets or work queues.
> The reason was that the change was motivated by a benchmarking exercise and we found that the performance
> of tasklets was noticeably better for one IO stream but for 2 or more IO streams work queues were better because we
> could place the work on separate cpus. Tasklets have a tendency to bunch up on the same cpu. I am interested in
> how Matsuda got better/same performance for work queues.

As far as I measured the bandwidth using ib_send_bw command, the performance
was better with workqueues. There seem to be multiple factors that affect
the result. For example, with the current implementation, rxe_responder() can
sometimes be called from softirq(NET_RX_SOFTIRQ) context directly. I changed
rxe_resp_queue_pkt() to always schedule the responder for all incoming 
requests. This may have led to better utilization of multiple processors because
softirq code and responder code are more likely to run concurrently on different
cores in this case. Tasklets are likely to run on the same core as softirq code
because TASKLET_SOFTIRQ is processed later than NET_RX_SOFTIRQ in __do_softirq().

That being said, I think it is also true that the performance of tasklets can be
superior to that of workqueues in some situations. When I measured the bandwidth
of RDMA Read using ib_read_bw command, it was better with tasklets. Additionally,
the latency is generally around 40% higher with workqueues, so it is possible some
kinds of workloads do not benefit from using workqueues.

I therefore think we may preserve tasklets though there are disadvantages suggested
by Bart. While I have no objection to removing tasklets totally, I am in favour of
Yanjun's suggestion of switching between tasklets and workqueues using sysctl
parameter. I would like to hear what you guys think about this.

I would also like to know when Bob is going to post the patchset. Both of us need to use
workqueues, but allowing sleep for ODP and improving performance for multiplex IO
streams are different matters. I suppose it will be easier to make the changes one by one.
If you need some more time to post it, I suggest we proceed with Yanjun's idea for now.
That will preserve current implementation of tasklets, so it would be not hard to add your
change onto it. Could you let me know your thoughts?


Regards,
Daisuke Matsuda


> 
> Bob

  reply	other threads:[~2022-09-28  6:41 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-07  2:42 [RFC PATCH 0/7] RDMA/rxe: On-Demand Paging on SoftRoCE Daisuke Matsuda
2022-09-07  2:42 ` [RFC PATCH 1/7] IB/mlx5: Change ib_umem_odp_map_dma_single_page() to retain umem_mutex Daisuke Matsuda
2022-09-07  2:43 ` [RFC PATCH 2/7] RDMA/rxe: Convert the triple tasklets to workqueues Daisuke Matsuda
2022-09-09 19:39   ` Bob Pearson
2022-09-12  8:27     ` matsuda-daisuke
2022-09-11  7:10   ` Yanjun Zhu
2022-09-11 15:08     ` Bart Van Assche
2022-09-12  7:58       ` matsuda-daisuke
2022-09-12  8:29         ` Yanjun Zhu
2022-09-12 19:52         ` Bob Pearson
2022-09-28  6:40           ` matsuda-daisuke [this message]
2022-09-12  8:25       ` Yanjun Zhu
2022-09-07  2:43 ` [RFC PATCH 3/7] RDMA/rxe: Cleanup code for responder Atomic operations Daisuke Matsuda
2022-09-07  2:43 ` [RFC PATCH 4/7] RDMA/rxe: Add page invalidation support Daisuke Matsuda
2022-09-07  2:43 ` [RFC PATCH 5/7] RDMA/rxe: Allow registering MRs for On-Demand Paging Daisuke Matsuda
2022-09-08 16:57   ` Haris Iqbal
2022-09-09  0:55     ` matsuda-daisuke
2022-09-07  2:43 ` [RFC PATCH 6/7] RDMA/rxe: Add support for Send/Recv/Write/Read operations with ODP Daisuke Matsuda
2022-09-08  8:29   ` Leon Romanovsky
2022-09-09  2:45     ` matsuda-daisuke
2022-09-07  2:43 ` [RFC PATCH 7/7] RDMA/rxe: Add support for the traditional Atomic " Daisuke Matsuda
2022-09-08  8:40 ` [RFC PATCH 0/7] RDMA/rxe: On-Demand Paging on SoftRoCE Zhu Yanjun
2022-09-08 10:25   ` matsuda-daisuke
2022-09-09  3:07 ` Li Zhijian
2022-09-12  9:21   ` matsuda-daisuke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=TYCPR01MB84552F7598F695D0C7CE16EBE5549@TYCPR01MB8455.jpnprd01.prod.outlook.com \
    --to=matsuda-daisuke@fujitsu.com \
    --cc=bvanassche@acm.org \
    --cc=jgg@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=lizhijian@fujitsu.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=rpearsonhpe@gmail.com \
    --cc=y-goto@fujitsu.com \
    --cc=yangx.jy@fujitsu.com \
    --cc=yanjun.zhu@linux.dev \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).