All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU
@ 2014-09-13 16:00 David L Stevens
  2014-09-13 20:21 ` David Miller
  0 siblings, 1 reply; 5+ messages in thread
From: David L Stevens @ 2014-09-13 16:00 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

This patch allows an admin to set the MTU on a sunvnet device to arbitrary
values between the minimum (68) and maximum (65535) IPv4 packet sizes.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
---
 drivers/net/ethernet/sun/sunvnet.c |    5 ++++-
 drivers/net/ethernet/sun/sunvnet.h |    2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index 04c58b5..3826b36 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -754,6 +754,9 @@ static int vnet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (unlikely(!port))
 		goto out_dropped;
 
+	if (skb->len > port->vio.rmtu)
+		goto out_dropped;
+
 	spin_lock_irqsave(&port->vio.lock, flags);
 
 	dr = &port->vio.drings[VIO_DRIVER_TX_RING];
@@ -975,7 +978,7 @@ static void vnet_set_rx_mode(struct net_device *dev)
 
 static int vnet_change_mtu(struct net_device *dev, int new_mtu)
 {
-	if (new_mtu != ETH_DATA_LEN)
+	if (new_mtu < 68 || new_mtu > 65535)
 		return -EINVAL;
 
 	dev->mtu = new_mtu;
diff --git a/drivers/net/ethernet/sun/sunvnet.h b/drivers/net/ethernet/sun/sunvnet.h
index 243ae69..fcf0129 100644
--- a/drivers/net/ethernet/sun/sunvnet.h
+++ b/drivers/net/ethernet/sun/sunvnet.h
@@ -11,7 +11,7 @@
  */
 #define VNET_TX_TIMEOUT			(5 * HZ)
 
-#define VNET_MAXPACKET			1518ULL /* ETH_FRAMELEN + VLAN_HDR */
+#define VNET_MAXPACKET			65553ULL /* 64K-1  +ETH HDR +VLAN HDR*/
 #define VNET_TX_RING_SIZE		512
 #define VNET_TX_WAKEUP_THRESH(dr)	((dr)->pending / 4)
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU
  2014-09-13 16:00 [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU David L Stevens
@ 2014-09-13 20:21 ` David Miller
  2014-09-14  2:15   ` David L Stevens
  0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2014-09-13 20:21 UTC (permalink / raw)
  To: david.stevens; +Cc: netdev

From: David L Stevens <david.stevens@oracle.com>
Date: Sat, 13 Sep 2014 12:00:55 -0400

> This patch allows an admin to set the MTU on a sunvnet device to arbitrary
> values between the minimum (68) and maximum (65535) IPv4 packet sizes.
> 
> Signed-off-by: David L Stevens <david.stevens@oracle.com>

I personally find this scheme where we pretend that the device can
have an arbitrary MTU, when in fact the effective MTU is a product of
the sub-ports, quite ugly.

In fact, that ugly ICMP stuff in the next patch is absolutely required
to avoid bogus behavior possible after this patch.  You have to
combine #2 and #3 otherwise you are adding an intermediate regression.

Logic wise, at the very least you should limit the MTU setting to the
largest MTU of all of the individual ports.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU
  2014-09-13 20:21 ` David Miller
@ 2014-09-14  2:15   ` David L Stevens
  2014-09-14 12:21     ` Sowmini Varadhan
  0 siblings, 1 reply; 5+ messages in thread
From: David L Stevens @ 2014-09-14  2:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev



On 09/13/2014 04:21 PM, David Miller wrote:

> I personally find this scheme where we pretend that the device can
> have an arbitrary MTU, when in fact the effective MTU is a product of
> the sub-ports, quite ugly.

I wouldn't say I like it, either, but the problem is that without it, we
are tied to the least common denominator. Anything that doesn't support
v1.6 of the VIO protocol is stuck at the low MTU and low throughput, and
since Solaris itself is limited to 16000, Linux, which can do 64K-1, is
also limited to 16000. On my hardware, the original we'd be tied to is
about 1Gbps, the 16000 is about 5.4Gbps, and the full linux-linux is about
8Gbps. So, a big penalty.

I think of it as an Ethernet connected to a virtual switch, and the ICMP
errors are for PMTUD are analogous to IGMP snooping. This is not an Ethernet
device alone-- those don't negotiate per-destination link MTUs. But nothing forces
anyone to mix MTUs; the ICMP errors simply allow it.

> In fact, that ugly ICMP stuff in the next patch is absolutely required
> to avoid bogus behavior possible after this patch.  You have to
> combine #2 and #3 otherwise you are adding an intermediate regression.

I disagree here. It's not any more bogus for the admin to set an MTU value
of what s/he wants when the others have not been. It *always* happens that
way. Ordinary Ethernet comes up at 1500 and one of them must be increased
first. At that time, the others don't match, and it is the admin's responsibility
to make sure they match.

> Logic wise, at the very least you should limit the MTU setting to the
> largest MTU of all of the individual ports.

We can't directly do that, because the MTU for the port is negotiated at
probe time. That'll be 1500 IP data (always) and we have to raise one of them
first, so one of them has to be set at a higher value than the negotiated
MTU at some point, at least until it is reset and re-negotiated. But we
don't know until we try a higher value if all the links can use it, and we
can't prevent another link from joining later that has a lower MTU, but we
can't then lower our on MTU for the whole device.

I think in ordinary Ethernet, there is nothing at all enforcing a particular
MTU-- it is set to what the admin wants, regardless of what other hosts use.
That's the effect we ought to have here, despite the one-to-many p2p links
where we can know in advance what the link MTUs are, and that's what patch #2
does. I don't think we should try too hard to prevent a value an admin wants --
it will just get in the way of the admin, where it doesn't in ordinary Ethernet.

On the other hand, if the link MTU is lower, we shouldn't quietly drop packets, thus the ICMP
errors that allow both.

							+-DLS

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU
  2014-09-14  2:15   ` David L Stevens
@ 2014-09-14 12:21     ` Sowmini Varadhan
  2014-09-14 13:24       ` David L Stevens
  0 siblings, 1 reply; 5+ messages in thread
From: Sowmini Varadhan @ 2014-09-14 12:21 UTC (permalink / raw)
  To: David L Stevens, David Miller; +Cc: netdev

On 09/13/2014 10:15 PM, David L Stevens wrote:

> I wouldn't say I like it, either, but the problem is that without it, we
> are tied to the least common denominator. Anything that doesn't support
> v1.6 of the VIO protocol is stuck at the low MTU and low throughput, and

To put things in perspective, in practice its only legacy linux today 
that will do the v1.0, and administrators are likely to want to upgrade
to the later version, so encumbering the code with legacy version 
support may end up becoming hard-to-maintain code?

> I think of it as an Ethernet connected to a virtual switch, and the ICMP
> errors are for PMTUD are analogous to IGMP snooping. This is not an Ethernet
> device alone-- those don't negotiate per-destination link MTUs. But nothing forces
> anyone to mix MTUs; the ICMP errors simply allow it.

As I understand it, this method of sending ICMP from the driver will not
work for L2 (non-IP) packets, and it will not even work for IP packets 
that are coming to us, from, say, openvswitch, right? So in practice it
actually has limited usability?

--Sowmini

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU
  2014-09-14 12:21     ` Sowmini Varadhan
@ 2014-09-14 13:24       ` David L Stevens
  0 siblings, 0 replies; 5+ messages in thread
From: David L Stevens @ 2014-09-14 13:24 UTC (permalink / raw)
  To: sowmini.varadhan, David Miller; +Cc: netdev



On 09/14/2014 08:21 AM, Sowmini Varadhan wrote:

> To put things in perspective, in practice its only legacy linux today that will do the v1.0, and administrators are likely to want to upgrade
> to the later version, so encumbering the code with legacy version support may end up becoming hard-to-maintain code?

No, v1.8 Solaris would force us to a 1/3 drop in performance between linux LDOMs because of its 16000 byte MTU limit.
I don't think it's particularly hard to maintain -- it's virtually a literal translation of the text in the VIO protocol
document. Everything that's there should stay there; only new revisions of the protocol would cause new changes, presumably
in other areas of the code where those new features are implemented. And I don't think reverse compatibility is optional.

> As I understand it, this method of sending ICMP from the driver will not
> work for L2 (non-IP) packets, and it will not even work for IP packets that are coming to us, from, say, openvswitch, right? So in practice it
> actually has limited usability?

It wouldn't work for a bridged L2 network with no local IP address, because there would be no valid return IP address for the
ICMP error we generate (in IPv4 -- IPv6 will always have a valid link-local address). Everything else, including openvswitch as far
as I can tell, should make use of the standard pmtud routing information that these update.

What I come back to, as before, is the simple notion that nothing forces an administrator to the otherwise unusual circumstance
of setting different MTUs on directly-attached common networks. If you want to bridge L2 traffic, make your MTU 1500 and it'll
work exactly as before. If you, instead, are using IPv4 or IPv6 and ordinary routed traffic, you can have 8X performance improvement
between hosts that can support it, even if other hosts on the same vswitch and outside your control cannot. You can talk to all
hosts on the vswitch, with a performance that matches the capabilities of each peer. I don't see any way that's not better.

									+-DLS

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-09-14 13:24 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-13 16:00 [PATCHv3 net-next 2/3] sunvnet: allow admin to set sunvnet MTU David L Stevens
2014-09-13 20:21 ` David Miller
2014-09-14  2:15   ` David L Stevens
2014-09-14 12:21     ` Sowmini Varadhan
2014-09-14 13:24       ` David L Stevens

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.