* [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
@ 2012-02-23 13:34 Yevgeny Petrilin
  2012-02-23 19:45 ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Yevgeny Petrilin @ 2012-02-23 13:34 UTC (permalink / raw)
  To: davem; +Cc: netdev, yevgenyp


Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d60335f..174dc38 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -110,7 +110,7 @@ enum {
 #define MLX4_EN_NUM_TX_RINGS		8
 #define MLX4_EN_NUM_PPP_RINGS		8
 #define MAX_TX_RINGS			(MLX4_EN_NUM_TX_RINGS + MLX4_EN_NUM_PPP_RINGS)
-#define MLX4_EN_DEF_TX_RING_SIZE	512
+#define MLX4_EN_DEF_TX_RING_SIZE	1024
 #define MLX4_EN_DEF_RX_RING_SIZE  	1024
 
 /* Target number of packets to coalesce with interrupt moderation */
-- 
1.5.4.3


* Re: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-23 13:34 [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024 Yevgeny Petrilin
@ 2012-02-23 19:45 ` David Miller
  2012-02-23 19:54   ` Eric Dumazet
  2012-02-24 19:35   ` Yevgeny Petrilin
  0 siblings, 2 replies; 8+ messages in thread
From: David Miller @ 2012-02-23 19:45 UTC (permalink / raw)
  To: yevgenyp; +Cc: netdev

From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Date: Thu, 23 Feb 2012 15:34:05 +0200

> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>

This is ridiculous as a default, yes even for 10Gb.

Do you have any idea how high latency is going to be for packets
trying to get into the transmit queue if there are already a
thousand other frames in there?


* Re: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-23 19:45 ` David Miller
@ 2012-02-23 19:54   ` Eric Dumazet
  2012-02-24 19:35   ` Yevgeny Petrilin
  1 sibling, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2012-02-23 19:54 UTC (permalink / raw)
  To: David Miller; +Cc: yevgenyp, netdev

On Thursday, 23 February 2012 at 14:45 -0500, David Miller wrote:
> From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> Date: Thu, 23 Feb 2012 15:34:05 +0200
> 
> > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> 
> This is ridiculous as a default, yes even for 10Gb.
> 
> Do you have any idea how high latency is going to be for packets
> trying to get into the transmit queue if there are already a
> thousand other frames in there?

Before increasing TX ring sizes, a driver should implement BQL as a
prereq.


* RE: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-23 19:45 ` David Miller
  2012-02-23 19:54   ` Eric Dumazet
@ 2012-02-24 19:35   ` Yevgeny Petrilin
  2012-02-24 20:14     ` David Miller
  2012-02-24 20:17     ` Eric Dumazet
  1 sibling, 2 replies; 8+ messages in thread
From: Yevgeny Petrilin @ 2012-02-24 19:35 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

> > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> 
> This is ridiculous as a default, yes even for 10Gb.
> 
> Do you have any idea how high latency is going to be for packets
> trying to get into the transmit queue if there are already a
> thousand other frames in there?

On the other hand, with a smaller queue, 1000 in-flight packets would mean the queue gets stopped.
How is that better?
Having a bigger TX ring helps us deal with bursts of TX packets without the overhead of stopping and starting the queue.
It also makes sense to have TX and RX queues of the same size, for example when traffic is forwarded between the RX and TX sides.

I did find a number of 10Gb vendors that use 1024 or more as the default TX queue size.

Thanks,
Yevgeny


* Re: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-24 19:35   ` Yevgeny Petrilin
@ 2012-02-24 20:14     ` David Miller
  2012-02-24 20:17     ` Eric Dumazet
  1 sibling, 0 replies; 8+ messages in thread
From: David Miller @ 2012-02-24 20:14 UTC (permalink / raw)
  To: yevgenyp; +Cc: netdev

From: Yevgeny Petrilin <yevgenyp@mellanox.com>
Date: Fri, 24 Feb 2012 19:35:45 +0000

> On the other hand, with a smaller queue, 1000 in-flight packets
> would mean the queue gets stopped. How is that better?

It's a thousand times better.

Because if a high-priority packet gets queued up, it won't have to wait
for 1024 packets to hit the wire before it can go out.

You need to support byte queue limits before you jack things up this
high; otherwise high-priority packets are absolutely pointless and
unusable.


* RE: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-24 19:35   ` Yevgeny Petrilin
  2012-02-24 20:14     ` David Miller
@ 2012-02-24 20:17     ` Eric Dumazet
  2012-02-25  6:51       ` Bill Fink
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-02-24 20:17 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: David Miller, netdev

On Friday, 24 February 2012 at 19:35 +0000, Yevgeny Petrilin wrote:
> > > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> > 
> > This is ridiculous as a default, yes even for 10Gb.
> > 
> > Do you have any idea how high latency is going to be for packets
> > trying to get into the transmit queue if there are already a
> > thousand other frames in there?
> 
> On the other hand, with a smaller queue, 1000 in-flight packets would mean the queue gets stopped.
> How is that better?

It's better because you can have any kind of Qdisc setup to properly
classify packets, with 100,000 total packets in the queues if you wish.

The TX ring is a single FIFO, and that is just horrible, especially with big packets...

> Having a bigger TX ring helps us deal with bursts of TX packets without the overhead of stopping and starting the queue.
> It also makes sense to have TX and RX queues of the same size, for example when traffic is forwarded between the RX and TX sides.
> 

Really, I doubt people using forwarding setups use the default qdiscs.

Instead of bigger TX rings, they need appropriate Qdiscs.
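
For example, something along these lines gives you a deep, classified
software queue in front of a short hardware ring (device name and queue
limits purely illustrative):

	tc qdisc add dev eth2 root handle 1: prio
	tc qdisc add dev eth2 parent 1:1 handle 10: pfifo limit 1000
	tc qdisc add dev eth2 parent 1:3 handle 30: pfifo limit 100000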

> I did find a number of 10Gb vendors that use 1024 or more as the default TX queue size.

That's a shame.


* Re: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-24 20:17     ` Eric Dumazet
@ 2012-02-25  6:51       ` Bill Fink
  2012-02-25  8:22         ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Bill Fink @ 2012-02-25  6:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Yevgeny Petrilin, David Miller, netdev

On Fri, 24 Feb 2012, Eric Dumazet wrote:

> On Friday, 24 February 2012 at 19:35 +0000, Yevgeny Petrilin wrote:
> > > > Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
> > > 
> > > This is ridiculous as a default, yes even for 10Gb.
> > > 
> > > Do you have any idea how high latency is going to be for packets
> > > trying to get into the transmit queue if there are already a
> > > thousand other frames in there?

For a GigE NIC with a typical ring size of 256, the serialization delay
for 256 1500 byte packets is:

	1500*8*256/10^9 = ~3.1 msec

For a 10-GigE NIC with a ring size of 1024, the serialization delay
for 1024 1500 byte packets is:

	1500*8*1024/10^10 = ~1.2 msec

So it's not immediately clear that a ring size of 1024 is unreasonable
for 10-GigE.

It probably boils down to whether the default setting should
be biased more toward low-latency applications or high-throughput
bulk data applications.  The best happy medium is a matter for
appropriate benchmark testing.  Of course, anyone can change the
settings to suit their purpose, so it's really just a question of
what's best for the "usual" case.
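
For instance, assuming the driver wires up ethtool's ring controls
(set_ringparam), the default is easy to override per interface; the
interface name below is just an example:

	ethtool -g eth2            # show current and maximum ring sizes
	ethtool -G eth2 tx 512     # shrink the TX ring for lower latency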

> > On the other hand, with a smaller queue, 1000 in-flight packets would mean the queue gets stopped.
> > How is that better?
> 
> It's better because you can have any kind of Qdisc setup to properly
> classify packets, with 100,000 total packets in the queues if you wish.

Not everyone wants to deal with the convoluted, arcane, and poorly
documented qdisc machinery, especially with its current limitations
at 10-GigE (or faster) line rates.

> The TX ring is a single FIFO, and that is just horrible, especially with big packets...
> 
> > Having a bigger TX ring helps us deal with bursts of TX packets without the overhead of stopping and starting the queue.
> > It also makes sense to have TX and RX queues of the same size, for example when traffic is forwarded between the RX and TX sides.
> 
> Really, I doubt people using forwarding setups use the default qdiscs.

I don't think it's necessarily that uncommon, such as a simple
10-GigE firewall setup.

> Instead of bigger TX rings, they need appropriate Qdiscs.
> 
> > I did find a number of 10Gb vendors that use 1024 or more as the default TX queue size.
> 
> That's a shame.

						-Bill


* Re: [PATCH net-next 1/3] mlx4_en: TX ring size default to 1024
  2012-02-25  6:51       ` Bill Fink
@ 2012-02-25  8:22         ` Eric Dumazet
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2012-02-25  8:22 UTC (permalink / raw)
  To: Bill Fink; +Cc: Yevgeny Petrilin, David Miller, netdev

On Saturday, 25 February 2012 at 01:51 -0500, Bill Fink wrote:

> For a GigE NIC with a typical ring size of 256, the serialization delay
> for 256 1500 byte packets is:
> 
> 	1500*8*256/10^9 = ~3.1 msec
> 
> For a 10-GigE NIC with a ring size of 1024, the serialization delay
> for 1024 1500 byte packets is:
> 
> 	1500*8*1024/10^10 = ~1.2 msec
> 
> So it's not immediately clear that a ring size of 1024 is unreasonable
> for 10-GigE.
> 

It's clear when you take into account packets of 64 Kbytes (TSO).
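
Redoing Bill's arithmetic with full-sized 64 KB TSO frames, the
serialization delay for a 1024-entry ring becomes roughly:

	65536*8*1024/10^10 = ~53.7 msec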

With current hardware and the current state of Linux software, you no
longer need very big NIC queues, since they bring known drawbacks.

That was true in the past, with UP machines and some timer handlers
that could hog the CPU for long periods of time, and when TSO didn't
exist.

Hopefully all these CPU hogs are no longer running in softirq handlers.

If your workload needs more than ~500 slots, then something is wrong
elsewhere and should be fixed. No more workarounds, please.

Now that BQL (Byte Queue Limits) is available, a driver should implement
it first before considering big TX rings. That's a 20-minute change.
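
Roughly, the BQL hooks look like this in a driver's TX path (a sketch
only; the queue and counter names below are made up, not actual mlx4_en
code):

	/* in the xmit routine, after posting the descriptor(s) for skb */
	netdev_tx_sent_queue(txq, skb->len);

	/* in TX completion processing, after reclaiming descriptors */
	netdev_tx_completed_queue(txq, completed_pkts, completed_bytes);

	/* when the ring is stopped or flushed, e.g. on ifdown */
	netdev_tx_reset_queue(txq);

Here txq is the ring's struct netdev_queue, obtained with
netdev_get_tx_queue(dev, ring_index).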

