netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Fainelli <f.fainelli@gmail.com>
To: Ido Schimmel <idosch@mellanox.com>,
	Robert Beckett <bob.beckett@collabora.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jiri Pirko <jiri@resnulli.us>
Subject: Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
Date: Thu, 12 Sep 2019 09:25:02 -0700	[thread overview]
Message-ID: <68676250-17df-b0bb-521a-64877f198647@gmail.com> (raw)
In-Reply-To: <20190912090339.GA16311@splinter>

On 9/12/19 2:03 AM, Ido Schimmel wrote:
> On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
>> On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
>>> On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli wrote:
>>>> +Ido, Jiri,
>>>>
>>>> On 9/10/19 8:41 AM, Robert Beckett wrote:
>>>>> This patch-set adds support for some features of the Marvell
>>>>> switch
>>>>> chips that can be used to handle packet storms.
>>>>>
>>>>> The rationale for this was a setup that requires the ability to
>>>>> receive
>>>>> traffic from one port, while a packet storm is occuring on
>>>>> another port
>>>>> (via an external switch with a deliberate loop). This is needed
>>>>> to
>>>>> ensure vital data delivery from a specific port, while mitigating
>>>>> any
>>>>> loops or DoS that a user may introduce on another port (can't
>>>>> guarantee
>>>>> sensible users).
>>>>
>>>> The use case is reasonable, but the implementation is not really.
>>>> You
>>>> are using Device Tree which is meant to describe hardware as a
>>>> policy
>>>> holder for setting up queue priorities and likewise for queue
>>>> scheduling.
>>>>
>>>> The tool that should be used for that purpose is tc and possibly an
>>>> appropriately offloaded queue scheduler in order to map the desired
>>>> scheduling class to what the hardware supports.
>>>>
>>>> Jiri, Ido, how do you guys support this with mlxsw?
>>>
>>> Hi Florian,
>>>
>>> Are you referring to policing traffic towards the CPU using a policer
>>> on
>>> the egress of the CPU port? At least that's what I understand from
>>> the
>>> description of patch 6 below.
>>>
>>> If so, mlxsw sets policers for different traffic types during its
>>> initialization sequence. These policers are not exposed to the user
>>> nor
>>> configurable. While the default settings are good for most users, we
>>> do
>>> want to allow users to change these and expose current settings.
>>>
>>> I agree that tc seems like the right choice, but the question is
>>> where
>>> are we going to install the filters?
>>>
>>
>> Before I go too far down the rabbit hole of tc traffic shaping, maybe
>> it would be good to explain in more detail the problem I am trying to
>> solve.
>>
>> We have a setup as follows:
>>
>> Marvell 88E6240 switch chip, accepting traffic from 4 ports. Port 1
>> (P1) is critical priority, no dropped packets allowed, all others can
>> be best effort.
>>
>> CPU port of swtich chip is connected via phy to phy of intel i210 (igb
>> driver).
>>
>> i210 is connected via pcie switch to imx6.
>>
>> When too many small packets attempt to be delivered to CPU port (e.g.
>> during broadcast flood) we saw dropped packets.
>>
>> The packets were being received by i210 in to rx descriptor buffer
>> fine, but the CPU could not keep up with the load. We saw
>> rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
>>
>>
>> With this in mind, I am wondering whether any amount of tc traffic
>> shaping would help? Would tc shaping require that the packet reception
>> manages to keep up before it can enact its policies? Does the
>> infrastructure have accelerator offload hooks to be able to apply it
>> via HW? I dont see how it would be able to inspect the packets to apply
>> filtering if they were dropped due to rx descriptor exhaustion. (please
>> bear with me with the basic questions, I am not familiar with this part
>> of the stack).
>>
>> Assuming that tc is still the way to go, after a brief look in to the
>> man pages and the documentation at largc.org, it seems like it would
>> need to use the ingress qdisc, with some sort of system to segregate
>> and priortise based on ingress port. Is this possible?
> 
> Hi Robert,
> 
> As I see it, you have two problems here:
> 
> 1. Classification: Based on ingress port in your case
> 
> 2. Scheduling: How to schedule between the different transmission queues
> 
> Where the port from which the packets should egress is the CPU port,
> before they cross the PCI towards the imx6.
> 
> Both of these issues can be solved by tc. The main problem is that today
> we do not have a netdev to represent the CPU port and therefore can't
> use existing infra like tc. I believe we need to create one. Besides
> scheduling, we can also use it to permit/deny certain traffic from
> reaching the CPU and perform policing.

We do not necessarily have to create a CPU netdev, we can overlay netdev
operations onto the DSA master interface (fec in that case), and
whenever you configure the DSA master interface, we also call back into
the switch side for the CPU port. This is not necessarily the cleanest
way to do things, but that is how we support ethtool operations (and
some netdev operations incidentally), and it works
-- 
Florian

  parent reply	other threads:[~2019-09-12 16:25 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-10 15:41 [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Robert Beckett
2019-09-10 15:41 ` [PATCH 1/7] net/dsa: configure autoneg for CPU port Robert Beckett
2019-09-10 16:14   ` Vivien Didelot
2019-09-10 16:56   ` Florian Fainelli
2019-09-10 18:26   ` Andrew Lunn
2019-09-10 18:29     ` Florian Fainelli
2019-09-11  9:16       ` Robert Beckett
2019-09-11  9:54         ` Robert Beckett
2019-09-11 22:52         ` Andrew Lunn
2019-09-12 10:14           ` Robert Beckett
2019-09-12 10:43             ` Andrew Lunn
2019-09-11 11:43   ` kbuild test robot
2019-09-14  7:16   ` kbuild test robot
2019-09-10 15:41 ` [PATCH 2/7] net: dsa: mv88e6xxx: add ability to set default queue priorities per port Robert Beckett
2019-09-10 16:43   ` Vivien Didelot
2019-09-10 15:41 ` [PATCH 3/7] dt-bindings: " Robert Beckett
2019-09-10 16:42   ` Florian Fainelli
2019-09-10 16:49     ` Vivien Didelot
2019-09-10 20:46       ` Vladimir Oltean
2019-09-10 15:41 ` [PATCH 4/7] net: dsa: mv88e6xxx: add ability to set queue scheduling Robert Beckett
2019-09-10 17:18   ` Vivien Didelot
2019-09-10 15:41 ` [PATCH 5/7] dt-bindings: " Robert Beckett
2019-09-10 15:41 ` [PATCH 6/7] net: dsa: mv88e6xxx: add egress rate limiting Robert Beckett
2019-09-10 17:13   ` Vivien Didelot
2019-09-11 12:26   ` kbuild test robot
2019-09-10 15:41 ` [PATCH 7/7] dt-bindings: " Robert Beckett
2019-09-10 16:49 ` [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms Florian Fainelli
2019-09-11  9:43   ` Robert Beckett
2019-09-11 11:21   ` Ido Schimmel
2019-09-11 11:49     ` Robert Beckett
2019-09-11 22:58       ` Andrew Lunn
2019-09-12  9:05         ` Ido Schimmel
2019-09-12  9:03       ` Ido Schimmel
2019-09-12  9:21         ` Andrew Lunn
2019-09-12 16:25         ` Florian Fainelli [this message]
2019-09-12 16:46           ` Robert Beckett
2019-09-12 17:41             ` Florian Fainelli
2019-09-13 12:47               ` Robert Beckett
2019-09-10 17:19 ` Vivien Didelot
2019-09-11  9:46   ` Robert Beckett
2019-09-11 15:31     ` Vivien Didelot
2019-09-11 23:01     ` Andrew Lunn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68676250-17df-b0bb-521a-64877f198647@gmail.com \
    --to=f.fainelli@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=bob.beckett@collabora.com \
    --cc=davem@davemloft.net \
    --cc=idosch@mellanox.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).