* UDP ordering when using multiple rx queue
@ 2012-07-11  7:53 Jean-Michel Hautbois
  2012-07-11 11:08 ` Merav Sicron
  2012-07-11 22:50 ` Chris Friesen
  0 siblings, 2 replies; 6+ messages in thread
From: Jean-Michel Hautbois @ 2012-07-11  7:53 UTC (permalink / raw)
  To: netdev

Hi list,

I am doing some experiments on several NICs and I have an issue with
my application. It sends raw data as bursts of UDP packets: every
1/30 s, a burst of 784 packets of 4000 bytes each.
The packets are on a wired link, no switch or anything, so there is
no chance of reordering or drops between the sender and the receiver.

On the receiver side, I need to get the packets in order, or the
application will consider them late (and therefore lost).
(Yes, the application is badly written on that specific point, but it
is not mine :)).

Several tests lead to a simple conclusion: when the NIC has only one
RX queue, everything is OK (with be2net for instance), but when it has
more than one RX queue, I can get "lost packets".
This is the case with bnx2x or mlx4, for instance.

Here are my questions:
- Is it possible to force a driver to use only one RX queue, even if
it can use more, without reloading the driver (reloading only helps
when a module parameter exists for that!)?
- Is it possible to "force" the network stack to deliver the packets
in the correct order (I would say no, as this is not guaranteed by
the protocol)?

My only bet is the first one (forcing one RX queue).
The last and desperate solution would be rewriting the application,
which will not be easy to get accepted.

Thanks !
JM


* Re: UDP ordering when using multiple rx queue
  2012-07-11  7:53 UDP ordering when using multiple rx queue Jean-Michel Hautbois
@ 2012-07-11 11:08 ` Merav Sicron
  2012-07-11 11:13   ` Jean-Michel Hautbois
  2012-07-11 22:50 ` Chris Friesen
  1 sibling, 1 reply; 6+ messages in thread
From: Merav Sicron @ 2012-07-11 11:08 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: netdev

On Wed, 2012-07-11 at 00:53 -0700, Jean-Michel Hautbois wrote:

> Several tests lead to a simple conclusion: when the NIC has only one
> RX queue, everything is OK (with be2net for instance), but when it has
> more than one RX queue, I can get "lost packets".
> This is the case with bnx2x or mlx4, for instance.
From what you describe I assume that you use different source IP /
destination IP in each packet - is this something that you can control?
Because with the same IP addresses the traffic will be steered to the
same queue. 

> Here are my questions:
> - Is it possible to force a driver to use only one RX queue, even if
> it can use more, without reloading the driver (reloading only helps
> when a module parameter exists for that!)?
You can reduce the number of queues using "ethtool -L ethX combined 1".
Note however that it will cause automatic driver unload/load.
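
Something like this, assuming your ethtool version and the driver
support channel configuration (ethX is just a placeholder for the
interface name):

  # show the pre-set maximums and the current channel counts
  ethtool -l ethX
  # switch to a single combined (RX+TX) channel, i.e. one RX queue
  ethtool -L ethX combined 1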

Thanks,
Merav


* Re: UDP ordering when using multiple rx queue
  2012-07-11 11:08 ` Merav Sicron
@ 2012-07-11 11:13   ` Jean-Michel Hautbois
  2012-07-11 13:41     ` Jean-Michel Hautbois
  0 siblings, 1 reply; 6+ messages in thread
From: Jean-Michel Hautbois @ 2012-07-11 11:13 UTC (permalink / raw)
  To: Merav Sicron; +Cc: netdev

2012/7/11 Merav Sicron <meravs@broadcom.com>:
> On Wed, 2012-07-11 at 00:53 -0700, Jean-Michel Hautbois wrote:
>
>> Several tests lead to a simple conclusion: when the NIC has only one
>> RX queue, everything is OK (with be2net for instance), but when it has
>> more than one RX queue, I can get "lost packets".
>> This is the case with bnx2x or mlx4, for instance.
> From what you describe I assume that you use different source IP /
> destination IP in each packet - is this something that you can control?
> Because with the same IP addresses the traffic will be steered to the
> same queue.

OK, sorry for not having explained that: the packets are multicast,
with one port per stream. A single multicast stream on a bnx2x-based
NIC can end up using several RX queues (two, from what I can see),
which then leads to the problem reported.
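
I also wonder whether tweaking the RX flow hash fields (assuming the
driver exposes that through ethtool, which I have not verified for
bnx2x) could keep one stream on one queue by not hashing on the UDP
ports, something like:

  # show which fields are currently hashed for UDP over IPv4
  ethtool -n ethX rx-flow-hash udp4
  # hash on source and destination IP only, ignoring the UDP ports
  ethtool -N ethX rx-flow-hash udp4 sd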

>> Here are my questions:
>> - Is it possible to force a driver to use only one RX queue, even if
>> it can use more, without reloading the driver (reloading only helps
>> when a module parameter exists for that!)?
> You can reduce the number of queues using "ethtool -L ethX combined 1".
> Note however that it will cause automatic driver unload/load.

OK, thanks for this tip :).

JM


* Re: UDP ordering when using multiple rx queue
  2012-07-11 11:13   ` Jean-Michel Hautbois
@ 2012-07-11 13:41     ` Jean-Michel Hautbois
  2012-07-11 17:50       ` Rick Jones
  0 siblings, 1 reply; 6+ messages in thread
From: Jean-Michel Hautbois @ 2012-07-11 13:41 UTC (permalink / raw)
  To: Merav Sicron; +Cc: netdev

2012/7/11 Jean-Michel Hautbois <jhautbois@gmail.com>:
> 2012/7/11 Merav Sicron <meravs@broadcom.com>:
>> On Wed, 2012-07-11 at 00:53 -0700, Jean-Michel Hautbois wrote:
>>
>>> Several tests lead to a simple conclusion: when the NIC has only one
>>> RX queue, everything is OK (with be2net for instance), but when it has
>>> more than one RX queue, I can get "lost packets".
>>> This is the case with bnx2x or mlx4, for instance.
>> From what you describe I assume that you use different source IP /
>> destination IP in each packet - is this something that you can control?
>> Because with the same IP addresses the traffic will be steered to the
>> same queue.
>
> OK, sorry for not having explained that: the packets are multicast,
> with one port per stream. A single multicast stream on a bnx2x-based
> NIC can end up using several RX queues (two, from what I can see),
> which then leads to the problem reported.
>
>>> Here are my questions:
>>> - Is it possible to force a driver to use only one RX queue, even if
>>> it can use more, without reloading the driver (reloading only helps
>>> when a module parameter exists for that!)?
>> You can reduce the number of queues using "ethtool -L ethX combined 1".
>> Note however that it will cause automatic driver unload/load.
>
> OK, thanks for this tip :).
>
> JM

I confirm that using ethtool -L eth1 combined 1 solves my issue.
I can receive 3 Gbps with 5 multicast streams on 5 ports without any
"packet loss" (again, as far as my application is concerned), and it
uses only one RX queue (of course :)).
With the default combined=8, even a single multicast stream (one port)
is split across two RX queues...
Unicast traffic seems ok (I used netperf in order to check this assumption).
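
For reference, the kind of check I did looks roughly like this (the
per-queue statistic names vary from driver to driver):

  # on the receiver, dump the NIC statistics and watch the per-queue
  # rx counters while the test runs
  ethtool -S eth1
  # on the sender, generate a unicast UDP flow towards the receiver
  # (netserver must be running on the receiver)
  netperf -H <receiver-ip> -t UDP_STREAM -- -m 4000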

JM


* Re: UDP ordering when using multiple rx queue
  2012-07-11 13:41     ` Jean-Michel Hautbois
@ 2012-07-11 17:50       ` Rick Jones
  0 siblings, 0 replies; 6+ messages in thread
From: Rick Jones @ 2012-07-11 17:50 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: Merav Sicron, netdev

On 07/11/2012 06:41 AM, Jean-Michel Hautbois wrote:
> I confirm that using ethtool -L eth1 combined 1 solves my issue.

Whether I'm being pedantic or not, you have kludged around your issue,
which is a broken application.

Can you actually ass-u-me that this application is deployed with just a 
single back-to-back link between two systems?  I'm guessing that isn't 
the way it is deployed in production or there would be zero call for 
multicast.   There is *zero* guarantee of ordering with UDP, multicast 
or otherwise - certainly not between sends involving different port 
numbers, nor for that matter even between sends involving the same port 
numbers.  Once you leave the NIC (and perhaps even before) all bets are off.

Have you tested using bonded links?  Or through switches which
themselves are joined by bonded links?  Various bonding modes can even
re-order traffic of a single flow (e.g. mode-rr).  As I understand it,
the moves to "break the bottlenecks" imposed by spanning tree will mean
that meshes of switches, even without bonded links, will send traffic
of different flows along different paths through the switch fabric.  In
those cases they might send traffic to the same multicast address along
the same path each time, but you probably cannot count on that, nor on
them sending traffic to different multicast addresses along the same
path.  Some clever meshed-switch folks may go ahead and look at the
transport-layer port numbers when deciding on their splits - just like
some bonding modes can.

Until you get the application re-written to handle out-of-order traffic, 
it "works" only by chance.

> Unicast traffic seems ok (I used netperf in order to check this assumption).

Netperf does nothing to check the order of datagrams.  It is perfectly
content receiving datagrams in any order.  So while you can use it to
see that a single flow of UDP unicast is not split up by the NIC (by
looking at the per-queue stats), you can assume nothing about the final
ordering of those UDP datagrams from a "successful" netperf UDP_STREAM
test.

rick jones


* Re: UDP ordering when using multiple rx queue
  2012-07-11  7:53 UDP ordering when using multiple rx queue Jean-Michel Hautbois
  2012-07-11 11:08 ` Merav Sicron
@ 2012-07-11 22:50 ` Chris Friesen
  1 sibling, 0 replies; 6+ messages in thread
From: Chris Friesen @ 2012-07-11 22:50 UTC (permalink / raw)
  To: Jean-Michel Hautbois; +Cc: netdev

On 07/11/2012 01:53 AM, Jean-Michel Hautbois wrote:
> On the receiver side, I need to get the packets in order, or the
> application will consider them late (and therefore lost).
> (Yes, the application is badly written on that specific point, but it
> is not mine :)).

Not the first such app I've seen.

> Several tests lead to a simple conclusion: when the NIC has only one
> RX queue, everything is OK (with be2net for instance), but when it has
> more than one RX queue, I can get "lost packets".
> This is the case with bnx2x or mlx4, for instance.

This depends on the hardware.  The Intel NICs for example (and others, 
I'm just most familiar with them) support multiple queues but they do 
hardware hashing of "flows" such that a given flow will be routed to a 
specific queue and thus stay in order.
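
Where the driver supports it, you can get an idea of how hash results
are spread over the queues by dumping the RSS indirection table, e.g.:

  # show the RX flow hash indirection table (needs a reasonably recent
  # ethtool plus driver support)
  ethtool -x ethX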

> Here are my questions:
> - Is it possible to force a driver to use only one RX queue, even if
> it can use more, without reloading the driver (reloading only helps
> when a module parameter exists for that!)?

This depends on the driver, but generally I would expect this to be a 
module parameter.
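
For example, if I remember correctly bnx2x has a num_queues module
parameter, so something like this should work (check modinfo for your
driver first):

  modprobe bnx2x num_queues=1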

> - Is it possible to "force" the network stack to deliver the packets
> in the correct order (I would say no, as this is not guaranteed by
> the protocol)?

No, it's up to the hardware/driver.


> My only bet is the first one (forcing one RX queue).
> The last and desperate solution would be rewriting the application,
> which will not be easy to get accepted.

Depending on the hardware/driver you may be able to enable flow hashing.

Chris

