From: Ioana Ciornei <ioana.ciornei@nxp.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: RE: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic classes
Date: Fri, 22 May 2020 13:58:41 +0000
Message-ID: <VI1PR0402MB387142F1D317717D7382DB3CE0B40@VI1PR0402MB3871.eurprd04.prod.outlook.com>
In-Reply-To: <20200521120752.07fd83aa@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

> Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx traffic
> classes
> 
> On Wed, 20 May 2020 20:24:43 +0000 Ioana Ciornei wrote:
> > > Subject: Re: [PATCH v2 net-next 0/7] dpaa2-eth: add support for Rx
> > > traffic classes
> > >
> > > On Wed, 20 May 2020 15:10:42 +0000 Ioana Ciornei wrote:
> > > > DPAA2 has frame queues for each Rx traffic class, and the decision
> > > > of which queue to pull frames from is made by the HW based on the
> > > > queue priority within a channel (there is one channel per CPU).
> > >
> > > IOW you're reading the descriptor from a device memory/iomem
> > > address and the HW will return the next descriptor based on the
> > > configured priority?
> >
> > That's the general idea, but the decision is not made on a frame-by-frame
> > basis but rather per dequeue operation, which can return at most 16 frame
> > descriptors at a time.
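
A standalone C model of that batching (illustrative names only, nothing
here is the real QBMan interface): one dequeue operation pulls up to 16
frame descriptors, so prioritization applies per pull rather than per frame.

#include <stdio.h>

#define MAX_FDS_PER_PULL 16

struct frame_desc {
	unsigned int len;
};

/* Stand-in for the hardware: fills 'fds' with up to 'max' descriptors
 * from the queue the scheduler selected, and returns how many it wrote.
 */
static int pull_batch(struct frame_desc *fds, int max)
{
	int n = max < 5 ? max : 5;	/* pretend 5 frames are available */

	for (int i = 0; i < n; i++)
		fds[i].len = 64 + i;
	return n;
}

int main(void)
{
	struct frame_desc fds[MAX_FDS_PER_PULL];
	int n = pull_batch(fds, MAX_FDS_PER_PULL);

	/* The driver walks the whole batch returned by a single dequeue. */
	for (int i = 0; i < n; i++)
		printf("frame %d: %u bytes\n", i, fds[i].len);
	return 0;
}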
> 
> I see!
> 
> > > Presumably strict priority?
> >
> > Only the two highest traffic classes are in strict priority, while the
> > other 6 TCs form two priority tiers: medium (4 TCs) and low (the last
> > two TCs).
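
A tiny C sketch of that split; the index ordering (which TCs count as
"highest") is an assumption made purely for illustration:

/* Two strict-priority TCs, a medium tier of four and a low tier of two. */
enum tc_tier { TIER_STRICT, TIER_MEDIUM, TIER_LOW };

static const enum tc_tier tc_tier_map[8] = {
	TIER_STRICT, TIER_STRICT,
	TIER_MEDIUM, TIER_MEDIUM, TIER_MEDIUM, TIER_MEDIUM,
	TIER_LOW,    TIER_LOW,
};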
> >
> > > > If this should be modeled in software, then I assume there should
> > > > be a NAPI instance for each traffic class and the stack should
> > > > know in which order to call the poll() callbacks so that the
> > > > priority is respected.
> > >
> > > Right, something like that. But IMHO not needed if HW can serve the
> > > right descriptor upon poll.
> >
> > After thinking this through, I don't actually believe that multiple
> > NAPI instances would solve this in any circumstance at all:
> >
> > - If you have hardware prioritization with full scheduling on dequeue,
> > then the job on the driver side is already done.
> > - If you only have hardware assist for prioritization (i.e. hardware
> > gives you multiple rings but doesn't tell you which one to dequeue
> > from), then you can still use a single NAPI instance just fine and
> > basically pick the highest-priority non-empty ring on the fly
> > (sketched below).
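
A standalone C model of that second case (all names are made up, nothing
here is the dpaa2-eth or QBMan API): a single poll loop that always
restarts from the highest-priority non-empty ring and stops at a budget.

/* Model of "hardware assist" prioritization: several rings per channel,
 * and the driver, not the HW, picks which one to service next.
 */
#include <stdbool.h>
#include <stdio.h>

#define NUM_RINGS 8

struct ring {
	int prio;	/* 0 = highest priority */
	int pending;	/* frames waiting on this ring */
};

/* Service one frame from the highest-priority non-empty ring.
 * Returns false once every ring is empty.
 */
static bool service_one(struct ring *rings, int n)
{
	struct ring *best = NULL;

	for (int i = 0; i < n; i++) {
		if (rings[i].pending &&
		    (!best || rings[i].prio < best->prio))
			best = &rings[i];
	}
	if (!best)
		return false;
	best->pending--;
	printf("served a frame from prio %d\n", best->prio);
	return true;
}

int main(void)
{
	struct ring rings[NUM_RINGS];
	int budget = 64;	/* NAPI-style work budget */

	for (int i = 0; i < NUM_RINGS; i++) {
		rings[i].prio = i;
		rings[i].pending = 4;
	}
	/* Poll loop: stop at the budget or when all rings are drained. */
	while (budget-- && service_one(rings, NUM_RINGS))
		;
	return 0;
}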
> >
> > What I am having trouble understanding is how the fully software
> > implementation of this possible new Rx qdisc should work. Somehow
> > skb->priority should be taken into account while the skb is passing
> > through the stack (i.e. a higher-priority skb should be able to
> > overtake a lower-priority skb that was received earlier but whose
> > priority queue is congested).
> 
> I'd think the SW implementation would come down to which ring to service
> first. If there are multiple rings on the host, NAPI can try to read from
> the highest-priority ring first and then move on to the next prio.
> Not sure whether there would be a use case for multiple NAPIs for busy
> polling or not.
> 
> I was hoping we could solve this with the new ring config API (which is
> coming any day now, ehh), in which I hope user space will be able to
> assign rings to NAPI instances; all we would also have needed is control
> over the querying order. But that doesn't really work for you, it seems,
> since the selection is offloaded to HW :S
> 

Yes, I would only need the configuration of traffic classes and their
priorities, not the software prioritization.

I'll keep a close eye on the mailing list to see what the new ring config API
that you're referring to is adding.

> > I don't have a very deep understanding of the stack, but I am thinking
> > that the enqueue_to_backlog()/process_backlog() area could be a
> > candidate place for sorting out bottlenecks. If we did that, I don't
> > see why a qdisc would be necessary at all, rather than having everybody
> > benefit from prioritization based on skb->priority.
> 
> I think once the driver picks the frame up it should run with it to
> completion (+/- GRO). We have natural batching with NAPI processing.
> Every NAPI budget, high-priority rings get a chance to preempt lower ones.
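
A kernel-style sketch of that per-budget behaviour; the NAPI poll contract
(napi_complete_done(), container_of()) is real, but struct prio_channel and
the prio_ring_rx()/prio_channel_irq_enable() helpers are hypothetical:

#include <linux/netdevice.h>

#define NUM_PRIO_RINGS 8

/* Hypothetical per-channel structure; rings[0] is the highest priority. */
struct prio_ring;
struct prio_channel {
	struct napi_struct napi;
	struct prio_ring *rings[NUM_PRIO_RINGS];
};

/* Hypothetical helpers standing in for the driver's real Rx/IRQ code. */
static int prio_ring_rx(struct prio_ring *ring, int budget);
static void prio_channel_irq_enable(struct prio_channel *ch);

static int prio_poll(struct napi_struct *napi, int budget)
{
	struct prio_channel *ch = container_of(napi, struct prio_channel, napi);
	int i, work_done = 0;

	/* Restart from the highest-priority ring on every poll invocation,
	 * so a busy low-priority ring cannot hold off the high-priority
	 * ones for more than one budget's worth of frames.
	 */
	for (i = 0; i < NUM_PRIO_RINGS && work_done < budget; i++)
		work_done += prio_ring_rx(ch->rings[i], budget - work_done);

	/* Standard NAPI completion: staying under budget means everything
	 * was drained, so stop polling and re-arm the channel interrupt.
	 */
	if (work_done < budget && napi_complete_done(napi, work_done))
		prio_channel_irq_enable(ch);

	return work_done;
}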
