From: Karsten Graul <kgraul@linux.ibm.com>
To: Tony Lu <tonylu@linux.alibaba.com>
Cc: "D. Wythe" <alibuda@linux.alibaba.com>,
	dust.li@linux.alibaba.com, kuba@kernel.org, davem@davemloft.net,
	netdev@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-rdma@vger.kernel.org
Subject: Re: [PATCH net-next v2] net/smc: Reduce overflow of smc clcsock listen queue
Date: Thu, 13 Jan 2022 09:07:51 +0100	[thread overview]
Message-ID: <5a5ba1b6-93d7-5c1e-aab2-23a52727fbd1@linux.ibm.com> (raw)
In-Reply-To: <YdaUuOq+SkhYTWU8@TonyMac-Alibaba>

On 06/01/2022 08:05, Tony Lu wrote:
> On Wed, Jan 05, 2022 at 08:13:23PM +0100, Karsten Graul wrote:
>> On 05/01/2022 16:06, D. Wythe wrote:
>>> LGTM. Falling back makes the restrictions on SMC dangling
>>> connections more meaningful to me than dropping them.
>>>
>>> Overall, I see two scenarios:
>>>
>>> 1. Drop the overflow connections that exceed what the userspace
>>> application can accept.
>>>
>>> 2. Fall back the overflow connections that exceed the capacity of
>>> the heavy SMC handshake process. (We can also control this behavior
>>> through sysctl.)
>>>
>>
>> I vote for (2), which makes the behavior more TCP-like from the user space application's point of view.
> Falling back when SMC reaches its own limit is a good idea. I'm not
> sure the fallback reason is suitable, though; this looks like a
> non-error condition. Currently, SMC falls back on error conditions,
> such as a resource not being available or an internal error. This case
> is not an error.

SMC falls back when the SMC processing cannot be completed, e.g. due to 
resource constraints like memory. For me the time/duration constraint is
also a good reason to fall back to TCP.
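When SMC declines or falls back, it records one of the SMC_CLC_DECL_* reason codes from net/smc/smc_clc.h. As a sketch of the point being debated here, a load-based reason could sit next to the existing error reasons; the first two codes below are modeled on the kernel header, while SMC_CLC_DECL_MAXCONN, its value, and the smc_decl_str() helper are invented for illustration only:

```c
#include <assert.h>
#include <string.h>

/* The first two codes are modeled on net/smc/smc_clc.h; the third is a
 * hypothetical new code for a load-based (non-error) fallback. */
#define SMC_CLC_DECL_MEM        0x01010000 /* insufficient memory resources */
#define SMC_CLC_DECL_TIMEOUT_CL 0x02010000 /* timeout w4 QP confirm link */
#define SMC_CLC_DECL_MAXCONN    0x01110000 /* hypothetical: handshake capacity reached */

/* Map a reason code to a short description for diagnostics. */
static const char *smc_decl_str(unsigned int code)
{
	switch (code) {
	case SMC_CLC_DECL_MEM:        return "insufficient memory";
	case SMC_CLC_DECL_TIMEOUT_CL: return "confirm-link timeout";
	case SMC_CLC_DECL_MAXCONN:    return "handshake capacity reached";
	default:                      return "unknown";
	}
}
```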

> 
> And I'm not sure about mixing normal and fallback connections at the
> same time while no error has occurred and no hard limit has been
> reached. Is that easy for users to reason about? It might be
> confusing. Perhaps a userspace parameter should control this limit and
> the behaviour (drop or fallback).

I am thinking of the following approach: the default maximum number of active
workers in a work queue is defined by WQ_MAX_ACTIVE (512). When this limit is
hit we have slightly fewer than 512 parallel SMC handshakes running at that
moment, and new workers would be enqueued without becoming active.
In that case (maximum active workers reached) I would tend to fall back new
connections to TCP. We would end up with fewer connections using SMC, but for
the user space applications there would be almost no change compared to TCP (no
dropped TCP connection attempts, no need to reconnect).
Imho, most users will never run into this problem, so I think it's fine to behave like this.
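The threshold check described above can be sketched in a few lines of userspace C. WQ_MAX_ACTIVE matches the default from include/linux/workqueue.h; the counter and helper names (smc_hs_active, smc_listen_should_fallback) are hypothetical and not taken from the actual net/smc code:

```c
#include <assert.h>
#include <stdbool.h>

/* Default per-workqueue limit on concurrently active work items;
 * include/linux/workqueue.h defines WQ_MAX_ACTIVE as 512. */
#define WQ_MAX_ACTIVE 512

/* Hypothetical counter of SMC handshake workers currently active. */
static int smc_hs_active;

/*
 * Sketch of the proposed policy: once the workqueue cannot run any more
 * handshake workers in parallel, fall the new connection back to plain
 * TCP instead of enqueueing yet another worker. The connection still
 * succeeds from the application's point of view, just without SMC.
 */
static bool smc_listen_should_fallback(void)
{
	return smc_hs_active >= WQ_MAX_ACTIVE;
}
```

A listener would increment smc_hs_active when scheduling a handshake worker and decrement it when the worker finishes; only connections arriving while the limit is reached take the TCP path.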

As far as I understand you, you still see a good reason to have another behavior
implemented in parallel (controllable by the user) that enqueues all incoming
connections as in your patch proposal? But how would you deal with the
out-of-memory problems that might happen with that?

>  
>> One comment on sysctl: our current approach is to add new switches to the existing 
>> netlink interface, which can be used with the smc-tools package (or your own implementation, of course). 
>> Is this prerequisite problematic in your environment? 
>> We tried to avoid more sysctls, and the netlink interface keeps us more flexible.
> 
> I agree with you that netlink is more flexible. However, there are
> some differences in our environment that make netlink harder to use to
> control the behavior of SMC.
> 
> Compared with netlink, sysctl is:
> - easier to use on clusters. Applications that want to use SMC don't
>   need to deploy additional tools or develop their own netlink logic,
>   especially across thousands of machines or containers. Going forward
>   with SMC, we would have to make sure the package or logic stays
>   compatible with the current kernel, whereas sysctl API compatibility
>   is easy to verify.
> 
> - easier for config templates and default maintenance. We use
>   /etc/sysctl.conf to keep the system configuration up to date, such
>   as pre-tuned SMC config parameters, so we can change the default
>   values on boot and generate lots of machines from one machine
>   template. Userspace netlink tools don't suit this; for ip-related
>   config, for example, we need an additional NetworkManager or netctl
>   to do it.
> 
> - TCP-like. TCP provides lots of sysctls to configure itself;
>   sometimes they are hard to use and understand, but they are accepted
>   by most users and systems. Maybe we could use sysctl for the items
>   that are simple and frequently changed, and netlink for the complex
>   items.
> 
> We are glad to contribute to smc-tools. Using both netlink and sysctl,
> I think, is the more suitable choice.
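As an illustration of the template deployment described above, such a pre-tuned /etc/sysctl.conf fragment might look like the following; the net.smc.* keys shown are hypothetical, since SMC does not define these sysctls today:

```
# /etc/sysctl.conf fragment baked into a machine or container template.
# The net.smc.* keys below are hypothetical, for illustration only.
net.smc.limit_handshake = 1
net.smc.max_handshake_workers = 512
```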

Let's decide that when you have a specific control that you want to implement.
I want to have a very good reason before introducing another interface into the
SMC module, making the code more complex and so on. The decision for the netlink
interface was also made because we had the impression that this is the NEW way
to go, and since we had no interface before, we started with the most modern way to implement it.

TCP et al. have a history with sysctl, so that's why it is still there.
But I might be wrong on that...


Thread overview: 18+ messages
2022-01-04 13:12 [PATCH net-next v2] net/smc: Reduce overflow of smc clcsock listen queue D. Wythe
2022-01-04 13:45 ` Karsten Graul
2022-01-04 16:17   ` D. Wythe
2022-01-05  4:40   ` D. Wythe
2022-01-05  8:28     ` Tony Lu
2022-01-05  8:57     ` dust.li
2022-01-05 13:17       ` Karsten Graul
2022-01-05 15:06         ` D. Wythe
2022-01-05 19:13           ` Karsten Graul
2022-01-06  7:05             ` Tony Lu
2022-01-13  8:07               ` Karsten Graul [this message]
2022-01-13 18:50                 ` Jakub Kicinski
2022-01-20 13:39                 ` Tony Lu
2022-01-20 16:00                   ` Stefan Raspl
2022-01-21  2:47                     ` Tony Lu
2022-02-16 11:46                 ` dust.li
2022-01-06  3:51           ` D. Wythe
2022-01-06  9:54             ` Karsten Graul
