* Possible bug in mlx5_tx_burst_mpw?
@ 2016-09-14 13:24 Luke Gorrie
  2016-09-14 14:30 ` Adrien Mazarguil
  0 siblings, 1 reply; 5+ messages in thread
From: Luke Gorrie @ 2016-09-14 13:24 UTC (permalink / raw)
  To: dev

Howdy,

Just noticed a line of code that struck me as odd and so I am writing just
in case it is a bug:

http://dpdk.org/browse/dpdk/tree/drivers/net/mlx5/mlx5_rxtx.c#n1014

Specifically, the check "(mpw.length != length)" in mlx5_tx_burst_mpw()
looks like a descriptor-format optimization for the special case where
consecutive packets on the wire are exactly the same size. This strikes
me as peculiar.

Just wanted to check, is that interpretation correct and if so then is this
intentional?
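
To make my reading concrete, here is a tiny stand-alone sketch of what I
think that condition amounts to; the structure, helper names and sample
burst below are my own paraphrase for illustration, not the driver code:

#include <stdint.h>
#include <stdio.h>

/* My reading of the special case, paraphrased for illustration only. */
struct mpw_state {
        int      open;     /* is a multi-packet session currently open? */
        uint32_t length;   /* length shared by all packets in the session */
};

int main(void)
{
        static const uint32_t burst[] = { 64, 64, 64, 1500, 1500, 64 };
        struct mpw_state mpw = { 0, 0 };
        unsigned descriptors = 0;
        size_t i;

        for (i = 0; i < sizeof(burst) / sizeof(burst[0]); i++) {
                /* The check I am asking about: a session is only reused
                 * while consecutive packets have exactly the same length. */
                if (!mpw.open || mpw.length != burst[i]) {
                        mpw.open = 1;
                        mpw.length = burst[i];
                        descriptors++;  /* close the previous session, open a new one */
                }
        }
        printf("%u descriptor(s) for %zu packets\n", descriptors,
               sizeof(burst) / sizeof(burst[0]));
        return 0;
}

In this toy model a burst of equal-sized packets collapses into a single
descriptor, which is what made me wonder whether the comparison is
intentional.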

Cheers,
-Luke


* Re: Possible bug in mlx5_tx_burst_mpw?
  2016-09-14 13:24 Possible bug in mlx5_tx_burst_mpw? Luke Gorrie
@ 2016-09-14 14:30 ` Adrien Mazarguil
  2016-09-14 19:33   ` Luke Gorrie
  0 siblings, 1 reply; 5+ messages in thread
From: Adrien Mazarguil @ 2016-09-14 14:30 UTC (permalink / raw)
  To: Luke Gorrie; +Cc: dev

Hi Luke,

On Wed, Sep 14, 2016 at 03:24:07PM +0200, Luke Gorrie wrote:
> Howdy,
> 
> Just noticed a line of code that struck me as odd and so I am writing just
> in case it is a bug:
> 
> http://dpdk.org/browse/dpdk/tree/drivers/net/mlx5/mlx5_rxtx.c#n1014
> 
> Specifically, the check "(mpw.length != length)" in mlx5_tx_burst_mpw()
> looks like a descriptor-format optimization for the special case where
> consecutive packets on the wire are exactly the same size. This strikes
> me as peculiar.
> 
> Just wanted to check, is that interpretation correct and if so then is this
> intentional?

Your interpretation is correct (this is intentional and not a bug).

In the event successive packets share a few properties (length, number of
segments, offload flags), these can be factored out as an optimization to
lower the amount of traffic on the PCI bus. This feature is currently
supported by the ConnectX-4 Lx family of adapters.
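
To give a rough idea of why this helps, here is a back-of-the-envelope
sketch; the structures and sizes below are invented for illustration and do
not match the real ConnectX-4 descriptor (WQE) layout:

#include <stdint.h>
#include <stdio.h>

/* Invented layouts, for illustration only. */
struct fake_full_desc {          /* one per packet, no factoring */
        uint32_t length;
        uint16_t nb_segs;
        uint16_t offload_flags;
        uint64_t buf_addr;
};

struct fake_mpw_header {         /* shared fields, written once per group */
        uint32_t length;
        uint16_t nb_segs;
        uint16_t offload_flags;
};

struct fake_mpw_entry {          /* one small entry per packet in the group */
        uint64_t buf_addr;
};

int main(void)
{
        const unsigned pkts = 4; /* packets sharing length/segments/flags */
        size_t classic = pkts * sizeof(struct fake_full_desc);
        size_t mpw = sizeof(struct fake_mpw_header) +
                     pkts * sizeof(struct fake_mpw_entry);

        printf("per-packet descriptors: %zu bytes, multi-packet write: %zu bytes\n",
               classic, mpw);
        return 0;
}

The exact savings obviously depend on the real formats, but the principle is
the same: the shared fields cross the bus once instead of once per packet.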

-- 
Adrien Mazarguil
6WIND


* Re: Possible bug in mlx5_tx_burst_mpw?
  2016-09-14 14:30 ` Adrien Mazarguil
@ 2016-09-14 19:33   ` Luke Gorrie
  2016-09-16  7:14     ` Adrien Mazarguil
  0 siblings, 1 reply; 5+ messages in thread
From: Luke Gorrie @ 2016-09-14 19:33 UTC (permalink / raw)
  To: dev

Hi Adrien,

On 14 September 2016 at 16:30, Adrien Mazarguil <adrien.mazarguil@6wind.com>
wrote:

> Your interpretation is correct (this is intentional and not a bug).
>

Thanks very much for clarifying.

This is interesting to me because I am also working on a ConnectX-4 (Lx)
driver based on the newly released driver interface specification [1] and I
am wondering how interested I should be in this MPW feature that is
currently not documented.

> In the event successive packets share a few properties (length, number of
> segments, offload flags), these can be factored out as an optimization to
> lower the amount of traffic on the PCI bus. This feature is currently
> supported by the ConnectX-4 Lx family of adapters.
>

I have a concern here that I hope you will forgive me for voicing.

This optimization seems to run the risk of inflating scores on
constant-packet-size IXIA-style benchmarks like [2] and making them less
useful for predicting real-world performance. That seems like a negative to
me as an application developer. I wonder if I am overlooking some practical
benefits that motivate implementing this in silicon and in the driver and
enabling it by default?

Cheers,
-Luke

[1]
http://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf
[2]
https://www.mellanox.com/blog/2016/06/performance-beyond-numbers-stephen-curry-style-server-io/


* Re: Possible bug in mlx5_tx_burst_mpw?
  2016-09-14 19:33   ` Luke Gorrie
@ 2016-09-16  7:14     ` Adrien Mazarguil
  2016-09-16  7:57       ` Luke Gorrie
  0 siblings, 1 reply; 5+ messages in thread
From: Adrien Mazarguil @ 2016-09-16  7:14 UTC (permalink / raw)
  To: Luke Gorrie; +Cc: dev

On Wed, Sep 14, 2016 at 09:33:18PM +0200, Luke Gorrie wrote:
> Hi Adrien,
> 
> On 14 September 2016 at 16:30, Adrien Mazarguil <adrien.mazarguil@6wind.com>
> wrote:
> 
> > Your interpretation is correct (this is intentional and not a bug).
> >
> 
> Thanks very much for clarifying.
> 
> This is interesting to me because I am also working on a ConnectX-4 (Lx)
> driver based on the newly released driver interface specification [1] and I
> am wondering how interested I should be in this MPW feature that is
> currently not documented.

It seems this document only describes established features whose interface
won't be affected by firmware evolution; I think MPW is not one of them.
AFAIK, MPW currently cannot be used with LSO, which we intend to support
soon.

Our implementation is a stripped-down version of the code found in
libmlx5. I guess you could ask Mellanox directly if you need more
information.

> > In the event successive packets share a few properties (length, number of
> > segments, offload flags), these can be factored out as an optimization to
> > lower the amount of traffic on the PCI bus. This feature is currently
> > supported by the ConnectX-4 Lx family of adapters.
> >
> 
> I have a concern here that I hope you will forgive me for voicing.
> 
> This optimization seems to run the risk of inflating scores on
> constant-packet-size IXIA-style benchmarks like [2] and making them less
> useful for predicting real-world performance. That seems like a negative to
> me as an application developer. I wonder if I am overlooking some practical
> benefits that motivate implementing this in silicon and in the driver and
> enabling it by default?

Your concern is understandable; no offense taken. You are obviously right
about benchmarks with constant-size packets, whose results can be improved
by MPW.

Performance-wise, with the right traffic patterns, MPW allows ConnectX-4 Lx
adapters to outperform their non-Lx counterparts (e.g. comparing 40G EN Lx
PCIe 8x vs. 40G EN PCIe 8x) when measuring traffic rate (Mpps), not
throughput. Disabling MPW yields results comparable to the non-Lx adapters,
which is why it is considered an optimization.

Since processing MPW consumes a few additional CPU cycles, it can be
disabled at runtime with the txq_mpw_en switch (documented in mlx5.rst).
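
For example, it can be given as a device argument when starting an
application such as testpmd; the command below is only a sketch (the PCI
address is a placeholder), please refer to mlx5.rst for the exact syntax:

  # Example only: disable multi-packet send on one port.
  testpmd -c 0xff -n 4 -w 0000:83:00.0,txq_mpw_en=0 -- -i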

Now, about the real-world scenario: we are not talking about needing
millions of identical packets to notice an improvement. MPW is effective for
groups of 2 to at most 5 consecutive packets that share some metadata
(length, number of segments and offload flags), all within the same burst.
Just to be clear, neither their destination nor their payload needs to be
the same; it would have been useless otherwise.

Sending a few packets at once with such similar properties is a common
occurrence in the real world; think about forwarding TCP traffic that has
been shaped to a constant size by LSO or the MTU.
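
As a toy model of this grouping (the constants and structures below are
illustrative and not taken from the driver), consider a burst of MTU-sized
TCP segments followed by one small packet:

#include <stdint.h>
#include <stdio.h>

#define MAX_PKTS_PER_GROUP 5     /* "2 to at most 5 consecutive packets" */

struct meta {                    /* per-packet metadata that must match */
        uint32_t length;
        uint16_t nb_segs;
        uint16_t flags;
};

static int same_meta(const struct meta *a, const struct meta *b)
{
        return a->length == b->length &&
               a->nb_segs == b->nb_segs &&
               a->flags == b->flags;
}

int main(void)
{
        /* Six MTU-sized segments (e.g. TCP shaped by LSO/MTU) plus one ACK. */
        static const struct meta burst[] = {
                { 1500, 1, 0x1 }, { 1500, 1, 0x1 }, { 1500, 1, 0x1 },
                { 1500, 1, 0x1 }, { 1500, 1, 0x1 }, { 1500, 1, 0x1 },
                { 64, 1, 0x1 },
        };
        const size_t n = sizeof(burst) / sizeof(burst[0]);
        unsigned groups = 0, in_group = 0;
        size_t i;

        for (i = 0; i < n; i++) {
                if (in_group == 0 || in_group == MAX_PKTS_PER_GROUP ||
                    !same_meta(&burst[i], &burst[i - 1])) {
                        groups++;        /* close the previous group, open a new one */
                        in_group = 0;
                }
                in_group++;
        }
        printf("%zu packets sent using %u multi-packet writes\n", n, groups);
        return 0;
}

Neither the payload nor the destination matters here; only the few fields
compared in same_meta() do.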

Like many optimizations, this one targets a specific yet common use case.
If you would rather get a constant rate out of any traffic pattern for
predictable latency, DPDK, which is burst-oriented, is probably not what
your application needs if used as-is.

> [1]
> http://www.mellanox.com/related-docs/user_manuals/Ethernet_Adapters_Programming_Manual.pdf
> [2]
> https://www.mellanox.com/blog/2016/06/performance-beyond-numbers-stephen-curry-style-server-io/

-- 
Adrien Mazarguil
6WIND


* Re: Possible bug in mlx5_tx_burst_mpw?
  2016-09-16  7:14     ` Adrien Mazarguil
@ 2016-09-16  7:57       ` Luke Gorrie
  0 siblings, 0 replies; 5+ messages in thread
From: Luke Gorrie @ 2016-09-16  7:57 UTC (permalink / raw)
  To: dev

Hi Adrien,

Thanks for taking the time to write a detailed reply. This indeed sounds
reasonable to me. Users will need to take these special cases into account
when predicting performance on their own anticipated workloads, which is a
bit tricky, but then that is life when dealing with complex new technology.
I am eager to see what new techniques come down the pipeline for
efficiently moving packets and descriptors across PCIe.

Thanks again for the detailed reply.

Cheers!
-Luke

