Re: [PATCH] cifs: use the least loaded channel for sending requests

From: Tom Talpey <tom@talpey.com>
To: Shyam Prasad N <nspmangalore@gmail.com>
Cc: "Steve French" <smfrench@gmail.com>,
	CIFS <linux-cifs@vger.kernel.org>,
	"Bharath S M" <bharathsm@microsoft.com>,
	"Aurélien Aptel" <aurelien.aptel@gmail.com>
Subject: Re: [PATCH] cifs: use the least loaded channel for sending requests
Date: Wed, 21 Dec 2022 11:46:59 -0500	[thread overview]
Message-ID: <73b86766-75cf-cc1e-0d21-01eaeea71a49@talpey.com> (raw)
In-Reply-To: <CANT5p=rje_XAHySDoxL50C6=EUkvdawN4neU+0xyvFDLAbYW6A@mail.gmail.com>

Question on this:

 > We will not be mixing traffic here. Channel bandwidth and type are
 > considered while establishing a new channel.
 > This change is only for distributing the requests among the channels
 > for the session.

I disagree. The iface_cmp() function in cifsglob.h returns a positive
value for all interfaces which are as fast or faster than the one
being compared-to. And weirdly, it only looks at rdma_capable when the
speeds are equal, and it ignores rss_capable unless rdma_capable
matches.

It also makes the following questionable assumption:

     * The iface_list is assumed to be sorted by speed.

In parse_server_interfaces(), new eligible entries are added in the
order the server returns them. The protocol doesn't require this! It's
entirely implementation specific.

In any event, I think if the client connects to a server on a 1GbE
interface, and discovers a 10GbE one, it will mix traffic on both.
In the current code, every other operation will switch interfaces.
In your code, it will only slightly prefer the 10GbE, when bulk data
begins to backlog the 1GbE.

So, unless you fix iface_cmp, and rework the selection logic in
parse_server_interfaces, I don't think the change does much.

What about the runtime selection?

Tom.

On 12/21/2022 10:33 AM, Shyam Prasad N wrote:
> On Tue, Dec 20, 2022 at 11:48 PM Tom Talpey <tom@talpey.com> wrote:
>>
>> I'd suggest a runtime configuration, personally. A config option
>> is undesirable, because it's very difficult to deploy. A module
>> parameter is only slightly better. The channel selection is a
>> natural for a runtime per-operation decision. And for the record,
>> I think a round-robin selection is a bad approach. Personally
>> I'd just yank it.
> 
> Hi Tom,
> 
> Thanks for taking a look at this.
> I was considering doing so. But was unsure if we'll still need a way
> to do round robin.
> Steve/Aurelien: Any objections to just remove the round-robin approach?
> 
>>
>> I'm uneasy about ignoring the channel bandwidth and channel type.
>> Low bandwidth channels, or mixing RDMA and non-RDMA, are going to
>> be very problematic for bulk data. In fact, the Windows client
>> never mixes such alternatives, it always selects identical link
>> speeds and transport types. The traffic will always find a way to
>> block on the slowest/worst connection.
>>
>> Do you feel there is some advantage to mixing traffic? If so, can
>> you elaborate on that?
> 
> We will not be mixing traffic here. Channel bandwidth and type are
> considered while establishing a new channel.
> This change is only for distributing the requests among the channels
> for the session.
> 
> That said, those decisions are sub-optimal today, IMO.
> I plan to send out some changes there too.
> 
>>
>> The patch you link to doesn't seem complete. If min_in_flight is
>> initialized to -1, how does the server->in_flight < min_in_flight
>> test ever return true?
> 
> min_in_flight is declared as unsigned and then assigned to -1.
> I'm relying on the compiler to use the max value for the unsigned int
> based on this.
> Perhaps I should have been more explicit by assigning this to
> UINT_MAX. Will do so now.
> 
>>
>> Tom.
>>
>> On 12/20/2022 9:47 AM, Steve French wrote:
>>> maybe a module load parm would be easier to use than kernel config
>>> option (and give more realistic test comparison data for many)
>>>
>>> On Tue, Dec 20, 2022 at 7:29 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>>>>
>>>> Hi Steve,
>>>>
>>>> Below is a patch for a new channel allocation strategy that we've been
>>>> discussing for some time now. It uses the least loaded channel to send
>>>> requests as compared to the simple round robin. This will help
>>>> especially in cases where the server is not consuming requests at the
>>>> same rate across the channels.
>>>>
>>>> I've put the changes behind a config option that has a default value of true.
>>>> This way, we have an option to switch to the current default of round
>>>> robin when needed.
>>>>
>>>> Please review.
>>>>
>>>> https://github.com/sprasad-microsoft/smb3-kernel-client/commit/28b96fd89f7d746fc2b6c68682527214a55463f9.patch
>>>>
>>>> --
>>>> Regards,
>>>> Shyam
>>>
>>>
>>>
> 
> 
>