* stmmac: Disappointing or normal DMA performance?
@ 2021-09-22  8:48 John Smith
From: John Smith @ 2021-09-22  8:48 UTC
  To: netdev

I have about 300 Mb/s of one-way RGMII traffic arriving at a stmmac
version 3.7, in the form of 30000 1280-byte frames per second, evenly
spread.

In NAPI poll mode, at each DMA interrupt, I get around 10 frames. More
precisely:

In stmmac_rx of stmmac_main.c:

static int stmmac_rx(struct stmmac_priv *priv, int limit) {
...
while (count < limit)

count is around 10 when NAPI limit/weight is 64. It means that I get
3000 DMA IRQs per second for my 30000 packets.

I have tried different settings without any improvement. By "better"
I mean fewer interrupts per second, i.e. more frames handled on each
interrupt, in order to reduce the load on my CPU.

Everything else being the same, if I send 10000 frames per second of
3840 bytes each (3 x 1280), I get around 3 or 4 frames per interrupt.

Is there a way to increase the ratio of packets / IRQs? I want fewer
IRQs with more packets as the current performance overloads my
embedded chip.

John


* Re: stmmac: Disappointing or normal DMA performance?
@ 2021-09-22 14:12 ` Andrew Lunn
From: Andrew Lunn @ 2021-09-22 14:12 UTC
  To: John Smith; +Cc: netdev

> Is there a way to increase the ratio of packets / IRQs? I want fewer
> IRQs with more packets as the current performance overloads my
> embedded chip,

Have you played with ethtool -c/-C?

     Andrew


* Re: stmmac: Disappointing or normal DMA performance?
@ 2021-09-22 18:59   ` John Smith
From: John Smith @ 2021-09-22 18:59 UTC
  To: Andrew Lunn; +Cc: netdev

Yes, I have played with the RX coalescing values. My understanding is
that a hardware watchdog triggers the DMA IRQ, and that its maximum
value is MAX_DMA_RIWT == 0xff.

rx_riwt = stmmac_usec2riwt(ec->rx_coalesce_usecs, priv);

if ((rx_riwt > MAX_DMA_RIWT) || (rx_riwt < MIN_DMA_RIWT))
	return -EINVAL;

This seems to cap the delay at 326us. My 30000 frames per second
arrive one every 33us, and 326 / 33 ~ 10 frames per interrupt...
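
For reference, the 326us figure can be reproduced with a short sketch.
It assumes the watchdog counts in units of 256 clock cycles and that
the clock feeding it runs at about 200 MHz; the clock rate is not
stated in this thread and is only inferred from the 326us result. The
helper names are hypothetical and are not the driver's functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one watchdog tick equals 256 cycles of the reference clock. */
#define RIWT_CYCLES_PER_TICK 256u
/* Maximum watchdog value quoted in the thread (MAX_DMA_RIWT). */
#define MAX_RIWT 0xffu

/* Hypothetical helper: convert microseconds to watchdog ticks. */
static uint32_t usec_to_riwt(uint32_t usec, uint32_t clk_hz)
{
    assert(clk_hz >= 1000000u);
    return (usec * (clk_hz / 1000000u)) / RIWT_CYCLES_PER_TICK;
}

/* Hypothetical helper: convert watchdog ticks back to microseconds. */
static uint32_t riwt_to_usec(uint32_t riwt, uint32_t clk_hz)
{
    assert(clk_hz >= 1000000u);
    return (riwt * RIWT_CYCLES_PER_TICK) / (clk_hz / 1000000u);
}

/* Whole frames arriving during one watchdog period (rounded down). */
static uint32_t frames_per_period(uint32_t frames_per_sec, uint32_t period_usec)
{
    return (uint32_t)(((uint64_t)frames_per_sec * period_usec) / 1000000u);
}
```

Under these assumptions, riwt_to_usec(0xff, 200000000) gives 326, and
frames_per_period(30000, 326) gives 9 (9.78 rounded down), which is
close to the roughly 10 frames seen per interrupt. A request of 328us
would convert to 256 ticks, above MAX_RIWT, and so would be rejected
by the range check shown above.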

In other words, I have the feeling that the answer to my subject
question is both: it's normal and it's disappointing.

Can ST confirm the hardware limit (the internal RX FIFO, I guess),
i.e. that it's not possible to do better than 10 1280-byte frames per
DMA interrupt in my case?

John


* Re: stmmac: Disappointing or normal DMA performance?
@ 2021-09-22 22:33   ` John Smith
From: John Smith @ 2021-09-22 22:33 UTC
  To: Andrew Lunn, netdev

On Wed, Sep 22, 2021 at 12:35 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Wed, Sep 22, 2021 at 01:48:36AM -0700, John Smith wrote:
> > I have a one-way 300Mbs traffic RGMII arriving at a stmmac version
> > 3.7, in the form of 30000 1280-byte frames per second, evenly spread.
> >
> > In NAPI poll mode, at each DMA interrupt, I get around 10 frames. More
> > precisely:
> >
> > In stmmac_rx of stmmac_main.c:
> >
> > static int stmmac_rx(struct stmmac_priv *priv, int limit) {
> > ...
> > while (count < limit)
> >
> > count is around 10 when NAPI limit/weight is 64. It means that I get
> > 3000 DMA IRQs per second for my 30000 packets.
>
> I assume it exits the loop here:
>
>                 /* check if managed by the DMA otherwise go ahead */
>                 if (unlikely(status & dma_own))
>                         break;
>
> Calling stmmac_display_ring() every interrupt is too expensive, but
> maybe do it every 1000. Extend the dump so it includes des0. You can
> then check there really are 10 packets ready to be received, not more?
>
> I suppose another interesting thing to try. Get the driver to do
> nothing every other RX interrupt. Do you get the same number of frames
> per second, but now 20 per stmmac_rx()? That will tell you if it is
> some sort of hardware limit or not. I guess then check that interrupt
> disable/enable is actually being performed, is it swapping between
> interrupt driven and polling?
>
>           Andrew

Yes, stmmac_rx returns at the dma_own test after 10 frames for each
interrupt.

I tried to override that line but obviously, it leads to crashes.

The problem is that the hardware watchdog, with its maximum value of
0xff, expires after 326us and generates the interrupt. I think that
it's triggered when the internal RX FIFO in the hardware block is full.

If I understand correctly, your suggestion is to live with the
interrupt but not do the heavy lifting in each callback, and instead
call stmmac_rx only every 10 or 100 DMA callbacks. I have tried a
simple hack but it doesn't seem to work. Perhaps I misunderstood the
suggestion. I'm not sure it's going to work, because the DMA buffer
needs to be emptied and blanked at each interrupt, otherwise data is
going to be lost or the driver is going to be unhappy.
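
To make the experiment concrete, here is a toy model in plain C, not
driver code. It assumes a ring of 64 descriptors (RING_SIZE is a
made-up value), hardware that keeps filling free descriptors whether
or not the interrupt is serviced, and 10 frames arriving per interrupt
period. All names are hypothetical; only the loop shape (stop at the
limit or at the first DMA-owned descriptor) follows the stmmac_rx()
excerpt quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 64 /* hypothetical RX ring size */

struct rx_ring {
    bool dma_own[RING_SIZE]; /* true: descriptor is empty, owned by DMA */
    size_t hw_idx;           /* next descriptor the simulated hardware fills */
    size_t sw_idx;           /* next descriptor the receive loop reads */
    unsigned dropped;        /* frames lost because the ring was full */
};

static void ring_init(struct rx_ring *r)
{
    for (size_t i = 0; i < RING_SIZE; i++)
        r->dma_own[i] = true;
    r->hw_idx = 0;
    r->sw_idx = 0;
    r->dropped = 0;
}

/* Simulated hardware: store n frames, dropping any that find the ring full. */
static void hw_receive(struct rx_ring *r, unsigned n)
{
    while (n--) {
        if (!r->dma_own[r->hw_idx]) {
            r->dropped++;
            continue;
        }
        r->dma_own[r->hw_idx] = false;
        r->hw_idx = (r->hw_idx + 1) % RING_SIZE;
    }
}

/* Simplified receive loop: stop at limit or at the first DMA-owned descriptor. */
static int rx_poll(struct rx_ring *r, int limit)
{
    int count = 0;

    while (count < limit) {
        if (r->dma_own[r->sw_idx])
            break;
        r->dma_own[r->sw_idx] = true; /* hand the descriptor back to DMA */
        r->sw_idx = (r->sw_idx + 1) % RING_SIZE;
        count++;
    }
    return count;
}

/* Service only every `every`-th interrupt; return the largest poll count. */
static int simulate(struct rx_ring *r, unsigned irqs, unsigned every,
                    unsigned frames_per_irq)
{
    int max = 0;

    for (unsigned i = 1; i <= irqs; i++) {
        hw_receive(r, frames_per_irq);
        if (i % every)
            continue; /* ignore this interrupt */
        int n = rx_poll(r, 64);
        if (n > max)
            max = n;
    }
    return max;
}
```

In this model, servicing every second interrupt yields 20 frames per
poll with nothing dropped, while servicing only every tenth overruns
the 64-entry ring and drops frames. Whether the real hardware behaves
this way is what the experiment would show.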

Also, if I send frames of 1280x3 bytes, I get about one third as many
frames per interrupt, so it really seems to be a FIFO depth limitation.
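
One caveat on this comparison, as a rough sketch: both tests carry the
same bit rate, so a fixed time window and a fixed byte budget predict
the same result. The snippet assumes evenly spaced frames and the
326us watchdog period mentioned earlier; the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Watchdog period discussed in the thread, in microseconds. */
#define WATCHDOG_USEC 326u

/* Bytes received during one watchdog period (rounded down). */
static uint64_t bytes_per_period(uint32_t frames_per_sec, uint32_t frame_bytes)
{
    return ((uint64_t)frames_per_sec * frame_bytes * WATCHDOG_USEC) / 1000000u;
}

/* Frames per watchdog period, in thousandths of a frame (no floating point). */
static uint32_t milliframes_per_period(uint32_t frames_per_sec)
{
    return (uint32_t)(((uint64_t)frames_per_sec * WATCHDOG_USEC) / 1000u);
}
```

Both 30000 x 1280-byte and 10000 x 3840-byte frames per second give
about 12518 bytes per 326us, i.e. 9.78 and 3.26 frames respectively. A
run at a different bit rate (for example 15000 frames of 1280 bytes
per second, 4.89 frames per period if the timer is the limit) might
help tell the two explanations apart.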

Thank you for helping! I think that only the ST team knows the
hardware limitation and they would have some ideas of a workaround for
this specific case but they don't reply...

John

