From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jakub Kicinski
Subject: Re: [PATCH net-next 00/13] nfp: abm: add basic support for
 advanced buffering NIC
Date: Tue, 22 May 2018 12:14:15 -0700
Message-ID: <20180522121415.500330ad@cakuba>
References: <20180522051255.9438-1-jakub.kicinski@netronome.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: David Miller, Linux Netdev List, oss-drivers@netronome.com,
 Andy Gospodarek, linux-internal
To: Or Gerlitz
Return-path:
Received: from mail-qt0-f177.google.com ([209.85.216.177]:33651 "EHLO
 mail-qt0-f177.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751592AbeEVTOU (ORCPT );
 Tue, 22 May 2018 15:14:20 -0400
Received: by mail-qt0-f177.google.com with SMTP id e8-v6so24936287qth.0
 for ; Tue, 22 May 2018 12:14:20 -0700 (PDT)
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Tue, 22 May 2018 17:50:45 +0300, Or Gerlitz wrote:
> On Tue, May 22, 2018 at 10:56 AM, Jakub Kicinski wrote:
> > On Mon, May 21, 2018 at 11:32 PM, Or Gerlitz wrote:
> >> On Tue, May 22, 2018 at 8:12 AM, Jakub Kicinski wrote:
> >>> Hi!
> >>>
> >>> This series lays the groundwork for the advanced buffer management
> >>> NIC feature.  It makes the necessary NFP core changes, spawns
> >>> representors and adds devlink glue.  A following series will add
> >>> the actual buffering configuration (patch series size limit).
> >>>
> >>> The first three patches add support for configuring NFP buffer
> >>> pools via a mailbox.  The existing devlink APIs are used for the
> >>> purpose.
> >>>
> >>> The third patch allows us to perform small reads from the NFP
> >>> memory.
> >>>
> >>> The rest of the patch set adds eswitch mode change support and
> >>> makes the driver spawn the appropriate representors.
> >>
> >> Hi Jakub,
> >>
> >> Could you provide a higher-level description of the ABM use case
> >> and the nature of these representors?
> >> I understand that under ABM you are modeling the NIC as a switch
> >> with vNIC ports.  Do a vNIC port and its vNIC port rep have the
> >> same characteristics as a VF and its VF rep (xmit on one side
> >> <--> receive on the other side)?  Is traffic to be offloaded using
> >> TC, etc.?  What would one do with a vNIC instance, hand it to a
> >> container a la the Intel VMDQ concept?  Can this be seen as a veth
> >> HW offload?  etc.
> >
> > Yes, the reprs can be used like VF reprs but that's not the main
> > use case.  We are targeting the container world with ABM, so no VFs
> > and no SR-IOV.  There is only one vNIC per port and no veth offload
> > etc.
>
> one vNIC for multiple containers? or you have a (v?) port per
> container?

One vNIC with many queues for multiple containers.  If containers have
QoS requirements, they should be pinned to specific CPUs, and eBPF on
the card can be used to RSS only to the queues associated with those
CPUs, while TC qdisc offload can be used to manipulate the scheduling
of those queues (and ECN marking).

> > In the most basic scenario with 1 PF corresponding to 1 port there
> > is no real use for switching.
>
> multiple containers? please clarify it a little better

Yes, if we had a netdev per container that would mean we would need
switching.  But veth/macvlan/ipvlan offload, and therefore multiple
netdevs, is not a requirement right now.  IOW we have the eBPF
programmable RSS and we are trying to extend the QoS to make sure that
critical/important containers' RX/TX doesn't get disturbed by less
important workloads.

> > The main purpose here is that we want to set up the buffering and
> > QoS inside the NIC (both for TX and RX) and then use eBPF to
> > perform filtering, queue assignment and per-application RSS.
> > That's pretty much it at this point.
> >
> > Switching, if any, will be a basic bridge offload.  QoS
> > configuration will all be done using TC qdisc offload, RED etc.
> exactly like mlxsw :)
>
> I guess I'll understand it better once you clarify the multiple
> containers thing, thanks for the details and openness

I hope my clarifications help :)
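
For reference, the kind of configuration discussed above could be
sketched roughly as below.  This is only an illustration: the device
names (pci/0000:04:00.0, eth0), the pool/RED numbers, and the eBPF
object name (app_rss.o) are all made up, and the actual nfp knobs only
land with the later series; the commands themselves are standard
devlink/tc/iproute2 syntax.

```shell
# Hypothetical sketch -- device names and values are illustrative only.

# Switch the NIC to switchdev mode so the port representors appear.
devlink dev eswitch set pci/0000:04:00.0 mode switchdev

# Size a buffer pool through the existing devlink shared-buffer API,
# which the first patches wire up to the NFP mailbox.
devlink sb pool set pci/0000:04:00.0 sb 0 pool 0 \
        size 1048576 thtype static

# Offload RED with ECN marking on the port via the TC qdisc interface.
tc qdisc replace dev eth0 root handle 1: \
        red limit 400000 min 50000 max 150000 avpkt 1000 \
        burst 100 probability 0.05 ecn

# Load an offloaded eBPF (XDP) program on the card that steers flows
# to the queues of the CPUs a given container is pinned to.
ip link set dev eth0 xdpoffload obj app_rss.o
```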