From: Tony Lu <tonylu@linux.alibaba.com>
To: Karsten Graul <kgraul@linux.ibm.com>
Cc: Leon Romanovsky <leon@kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	kuba@kernel.org, davem@davemloft.net, netdev@vger.kernel.org,
	linux-s390@vger.kernel.org,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [RFC PATCH net-next 0/6] net/smc: Spread workload over multiple cores
Date: Fri, 28 Jan 2022 14:55:39 +0800
Message-ID: <YfOTa5uIPUw+gOfM@TonyMac-Alibaba>
In-Reply-To: <3fcfdf75-eb8c-426d-5874-3afdc49de743@linux.ibm.com>

On Thu, Jan 27, 2022 at 03:52:36PM +0100, Karsten Graul wrote:
> On 27/01/2022 10:50, Tony Lu wrote:
> > On Thu, Jan 27, 2022 at 11:25:41AM +0200, Leon Romanovsky wrote:
> >> On Thu, Jan 27, 2022 at 05:14:35PM +0800, Tony Lu wrote:
> >>> On Thu, Jan 27, 2022 at 10:47:09AM +0200, Leon Romanovsky wrote:
> >>>> On Thu, Jan 27, 2022 at 03:59:36PM +0800, Tony Lu wrote:
> >>>
> >>> Sorry if I missed something about properly using the existing
> >>> in-kernel API. Is the proper API ib_cq_pool_get() and
> >>> ib_cq_pool_put()?
> >>>
> >>> If so, these APIs don't suit smc's current usage; I would have to
> >>> refactor the logic (tasklet and wr_id) in smc. I think that is a huge
> >>> amount of work and should only be done after a full discussion.
> >>
> >> This discussion is not going anywhere. Just to summarize, we (Jason and I)
> >> are asking to use existing API, from the beginning.
> > 
> > Yes, I can't agree more with you about using the existing API, and I
> > tried it earlier. The existing APIs are easy to use when writing new
> > logic. I also don't want to duplicate code.
> > 
> > The main obstacle is that the packet and wr processing of smc is
> > tightly bound to the old API and is not easy to move to the existing API.
> > 
> > To solve a real issue, I have to fix it based on the old API. Using the
> > existing API in this patch would mean refactoring the smc logic, which
> > needs more time. Our production tree is synced with smc next, so I chose
> > to fix this issue first and then refactor the logic to fit the existing
> > API once and for all.
> 
> While I understand your approach to fix the issue first, I need to say
> that such interim fixes create a significant amount of effort that has to
> be spent on review and test by others. And there is the increased risk
> of introducing new bugs with this only-for-now fix.

Let's get back to the patch itself. This approach spreads CQs across
different completion vectors. It tries to solve the issue within the
current design without introducing further changes, so that it is easier
to review and test. The issue severely limits the performance of SMC
when it replaces TCP, and this patch tries to reduce the gap between SMC
and TCP.
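
To illustrate the idea of spreading CQs over completion vectors, here is
a minimal userspace sketch. It is not the code from this series: the
names vec_load, vec_load_init(), vec_load_get() and vec_load_put() and
the MAX_VECTORS limit are hypothetical, and the selection policy (pick
the vector with the fewest CQs bound to it) is only one possible choice.
In the kernel, an index chosen this way would be the value a consumer
places in the comp_vector field of struct ib_cq_init_attr.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on completion vectors tracked per device. */
#define MAX_VECTORS 64

/* Hypothetical bookkeeping: number of CQs bound to each vector. */
struct vec_load {
	size_t nvec;                   /* completion vectors in use */
	unsigned int cqs[MAX_VECTORS]; /* CQs currently bound to each vector */
};

/* Start with no CQ bound to any of the nvec vectors. */
static void vec_load_init(struct vec_load *vl, size_t nvec)
{
	assert(nvec > 0 && nvec <= MAX_VECTORS);
	vl->nvec = nvec;
	for (size_t i = 0; i < nvec; i++)
		vl->cqs[i] = 0;
}

/*
 * Pick the least-loaded vector (lowest index wins ties) and account
 * for the new CQ that will be bound to it.
 */
static size_t vec_load_get(struct vec_load *vl)
{
	size_t best = 0;

	for (size_t i = 1; i < vl->nvec; i++)
		if (vl->cqs[i] < vl->cqs[best])
			best = i;
	vl->cqs[best]++;
	return best;
}

/* Release a vector when the CQ bound to it is destroyed. */
static void vec_load_put(struct vec_load *vl, size_t vec)
{
	assert(vec < vl->nvec && vl->cqs[vec] > 0);
	vl->cqs[vec]--;
}
```

With four vectors, successive vec_load_get() calls hand out the vectors
in turn, and a vector freed by vec_load_put() is reused first, so new
CQs do not pile up on a single vector.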

Using the newer API would require a lot of work on the wr processing
logic, for example removing the tasklet handler and refactoring the
wr_id logic. I am not sure whether we should do this. If it's okay with
you, I will do it in the next patch.

> Given the fact that right now you are the only one who is affected by this problem
> I recommend to keep your fix in your environment for now, and come back with the
> final version. In the meantime I can use the saved time to review the bunch 
> of other patches that we received.

I really appreciate the time you have spent reviewing our patches.
Recently our team has submitted a lot of patches and received your
detailed suggestions, covering panics (linkgroup, CDC), performance and
so on. We are using SMC in our public cloud environment. Therefore we
maintain an internal tree and try to contribute these changes upstream.
We will continue to invest in improving stability, performance and
compatibility, and will focus on SMC for a long time.

We are willing to commit time and resources to help review and test
patches on the mailing list and in -next, as reviewers or testers.

We have built up CI/CD and nightly tests for SMC, and we intend to send
test reports for each patch on the mailing list to help with review and
to catch panics and performance regressions.

Would this proposal help save your time for reviewing other patches? I
would be glad to hear your advice.

Thank you,
Tony Lu
