* [PATCH v2 net-next 0/4] Automatic adjustment of max frame size
@ 2015-10-26  3:40 ` Toshiaki Makita
  0 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: David S . Miller
  Cc: Toshiaki Makita, Patrick McHardy, Stephen Hemminger,
	Vlad Yasevich, Jeff Kirsher, intel-wired-lan, toshiaki.makita1,
	netdev

This patch set tries to resolve packet drops caused by oversize errors when
receiving double tagged packets and possibly other encapsulated packets.

Problem description:
Currently most NICs reserve 4 bytes of extra receive buffer room for a vlan
header and can receive frames of at most 1522 bytes.
This is, however, not sufficient once double tagged vlan is used.
As MEF [1] says, the maximum frame size of double tagged packets needs to be
at least 1526 bytes to provide transparent ethernet VPN, and along the same
line, HW switches send 1526-byte double tagged packets.
Thus, double tagged packets are dropped by default in most cases due to
oversize errors. NICs need to accept 1526-byte packets in this situation.
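
The sizes quoted above follow from simple header arithmetic. As a minimal
sketch (the `wire_` names are hypothetical and not part of this patch set;
the byte values match the kernel's ETH_HLEN, ETH_FCS_LEN and VLAN_HLEN):

```c
#include <assert.h>

/* Illustrative constants; values match the usual Ethernet definitions. */
enum {
	WIRE_ETH_HLEN    = 14, /* dst MAC + src MAC + EtherType */
	WIRE_ETH_FCS_LEN = 4,  /* frame check sequence */
	WIRE_VLAN_HLEN   = 4,  /* one 802.1Q or 802.1ad tag */
};

/* Frame size on the wire for a given MTU and number of stacked vlan tags. */
static int wire_frame_size(int mtu, int vlan_tags)
{
	assert(mtu >= 0 && vlan_tags >= 0);
	return WIRE_ETH_HLEN + vlan_tags * WIRE_VLAN_HLEN + mtu +
	       WIRE_ETH_FCS_LEN;
}
```

With an MTU of 1500 this gives 1518 bytes untagged, 1522 bytes with one tag,
and 1526 bytes with two tags, which is the size a NIC with only 4 bytes of
extra room rejects.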

Approaches:
To satisfy this requirement, this patch set introduces a way to tell drivers
how much extra buffer space is needed.
This mechanism can be reused by protocols other than vlan, such as mpls,
vxlan, etc.

Other possible solutions:

- To adjust mtu automatically when stacked vlan device is created.
  This is suboptimal because lower device is not necessarily used for only
  vlan. Sometimes tagged and untagged traffic are both used at the same time.
  Also, there are devices that already reserve 8 bytes room, in which case mtu
  adjustment is unnecessary.

- To reserve more room by default.
  This is also suboptimal because there are devices that change behavior
  when max acceptable frame size gets larger. For example, e1000e enters
  jumbo frame mode, which has some additional restrictions compared to normal
  mode. Also, this is a vlan-specific solution and not reusable by other
  encapsulation protocols.

This patch set introduces .ndo_enc_hdr_len() and I chose e1000e as the first
implementation. Patch 3 makes the vlan driver utilize this API and
automatically expand the max frame size of the real device. Patch 4 makes
bridge use the API in a similar way to vlan.
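
To illustrate the intended semantics, here is a simplified user-space model
of the grow-only behaviour that dev_set_enc_hdr_len() has in patch 1. It is
only a sketch: `struct model_dev`, `model_driver_cb()` and the
`max_hw_frame` limit are hypothetical stand-ins, and the device-present and
capability checks of the real function are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct net_device fields involved. */
struct model_dev {
	int mtu;
	int enc_hdr_len;   /* extra header room currently reserved */
	int max_hw_frame;  /* assumed hardware limit, for illustration only */
	/* plays the role of .ndo_enc_hdr_len */
	int (*enc_hdr_len_cb)(struct model_dev *dev, int hdr_len);
};

/* Example driver callback: refuse requests the hardware cannot hold. */
static int model_driver_cb(struct model_dev *dev, int hdr_len)
{
	int max_frame = dev->mtu + hdr_len + 14 /* ETH_HLEN */ + 4 /* FCS */;

	return max_frame > dev->max_hw_frame ? -EINVAL : 0;
}

/* Grow-only update: smaller or equal requests succeed without any change. */
static int model_set_enc_hdr_len(struct model_dev *dev, int new_len)
{
	assert(dev != NULL);

	if (new_len < 0)
		return -EINVAL;
	if (new_len <= dev->enc_hdr_len)
		return 0;
	if (dev->enc_hdr_len_cb) {
		int err = dev->enc_hdr_len_cb(dev, new_len);

		if (err)
			return err;
	}
	dev->enc_hdr_len = new_len;
	return 0;
}
```

In this model, a device starting with 4 bytes of room accepts a request for
8 bytes, ignores a later request for 4 bytes, and keeps its current value
when the callback rejects a larger request. The room never shrinks, which is
why restoring it is listed under Challenges below.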

Challenges:
- Restore/shrink extra header room after vlan devices are deleted.
  This will need some additional memory storage.
- Manual modification of extra buffer size (by iproute2).

Note:
- This problem was once discussed in Netdev 0.1 [2].
  This patch set is based on the conclusion of the discussion.

Changes:
 v2: Fixed checkpatch warnings

[1] https://wiki.mef.net/display/CESG/ENNI+Frame
[2] https://www.netdev01.org/docs/netdev01_bof_8021ad_makita_150212.pdf

Toshiaki Makita (4):
  net: Add ndo_enc_hdr_len to notify extra header room for encapsulated
    frames
  e1000e: Add ndo_enc_hdr_len
  vlan: Notify real device of encap header length
  bridge: Notify port device of encap header length

 drivers/net/ethernet/intel/e1000e/netdev.c | 82 ++++++++++++++++++++++--------
 include/linux/netdevice.h                  |  9 ++++
 net/8021q/vlan.c                           | 16 +++++-
 net/8021q/vlan_dev.c                       | 48 +++++++++++++++--
 net/bridge/br_vlan.c                       | 18 +++++++
 net/core/dev.c                             | 36 +++++++++++++
 6 files changed, 180 insertions(+), 29 deletions(-)

-- 
1.8.1.2

* [PATCH v2 net-next 1/4] net: Add ndo_enc_hdr_len to notify extra header room for encapsulated frames
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-26  3:40   ` Toshiaki Makita
  -1 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: David S . Miller
  Cc: Toshiaki Makita, Patrick McHardy, Stephen Hemminger,
	Vlad Yasevich, Jeff Kirsher, intel-wired-lan, toshiaki.makita1,
	netdev

Currently most NICs reserve 1522 bytes of space for frames, to handle a
4-byte VLAN header in addition to 1518 bytes, the maximum size of an
ethernet frame.
This is, however, not sufficient when stacked vlan or other encapsulation
protocols are used.
To accommodate this, add .ndo_enc_hdr_len() and inform drivers of the needed
encapsulation header size.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 include/linux/netdevice.h |  9 +++++++++
 net/core/dev.c            | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4ac653b..1b2b587 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1059,6 +1059,10 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev,
  *	This function is used to get egress tunnel information for given skb.
  *	This is useful for retrieving outer tunnel header parameters while
  *	sampling packet.
+ * int (*ndo_enc_hdr_len)(struct net_device *dev, int hdr_len);
+ *	Called to notify additional encapsulation header length to reserve.
+ *	Implementers should reserve hdr_len room in addition to MTU to handle
+ *	encapsulated frames.
  *
  */
 struct net_device_ops {
@@ -1236,6 +1240,8 @@ struct net_device_ops {
 							 bool proto_down);
 	int			(*ndo_fill_metadata_dst)(struct net_device *dev,
 						       struct sk_buff *skb);
+	int			(*ndo_enc_hdr_len)(struct net_device *dev,
+						   int hdr_len);
 };
 
 /**
@@ -1396,6 +1402,7 @@ enum netdev_priv_flags {
  *	@if_port:	Selectable AUI, TP, ...
  *	@dma:		DMA channel
  *	@mtu:		Interface MTU value
+ *	@enc_hdr_len:	Additional encapsulation header length to MTU
  *	@type:		Interface hardware type
  *	@hard_header_len: Hardware header length
  *
@@ -1616,6 +1623,7 @@ struct net_device {
 	unsigned char		dma;
 
 	unsigned int		mtu;
+	unsigned int		enc_hdr_len;
 	unsigned short		type;
 	unsigned short		hard_header_len;
 
@@ -3031,6 +3039,7 @@ int dev_change_name(struct net_device *, const char *);
 int dev_set_alias(struct net_device *, const char *, size_t);
 int dev_change_net_namespace(struct net_device *, struct net *, const char *);
 int dev_set_mtu(struct net_device *, int);
+int dev_set_enc_hdr_len(struct net_device *, int);
 void dev_set_group(struct net_device *, int);
 int dev_set_mac_address(struct net_device *, struct sockaddr *);
 int dev_change_carrier(struct net_device *, bool new_carrier);
diff --git a/net/core/dev.c b/net/core/dev.c
index 13f49f8..7c4114b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6051,6 +6051,42 @@ int dev_set_mtu(struct net_device *dev, int new_mtu)
 EXPORT_SYMBOL(dev_set_mtu);
 
 /**
+ *	dev_set_enc_hdr_len - Expand encapsulation header room
+ *	@dev: device
+ *	@new_len: new length
+ */
+int dev_set_enc_hdr_len(struct net_device *dev, int new_len)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if (new_len < 0)
+		return -EINVAL;
+
+	if (!netif_device_present(dev))
+		return -ENODEV;
+
+	if (new_len <= dev->enc_hdr_len)
+		return 0;
+
+	if (ops->ndo_enc_hdr_len) {
+		int err;
+
+		/* This function can be called from child user/net namespace */
+		if (!ns_capable(dev_net(dev)->user_ns, CAP_NET_ADMIN))
+			return -EPERM;
+
+		err = ops->ndo_enc_hdr_len(dev, new_len);
+		if (err)
+			return err;
+	}
+
+	dev->enc_hdr_len = new_len;
+
+	return 0;
+}
+EXPORT_SYMBOL(dev_set_enc_hdr_len);
+
+/**
  *	dev_set_group - Change group this device belongs to
  *	@dev: device
  *	@new_group: group this device should belong to
-- 
1.8.1.2

* [PATCH v2 net-next 2/4] e1000e: Add ndo_enc_hdr_len
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-26  3:40   ` Toshiaki Makita
  -1 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: David S . Miller
  Cc: Toshiaki Makita, Patrick McHardy, Stephen Hemminger,
	Vlad Yasevich, Jeff Kirsher, intel-wired-lan, toshiaki.makita1,
	netdev

e1000e has 4 bytes of additional room for a vlan header, so set the default
enc_hdr_len to 4.
Note that e1000e uses mtu to validate frame size in some places, which
need to be modified to use max_frame_size now that the extra header room
is variable.
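
As a rough illustration of why the mtu-based checks have to change, the
sketch below restates the frame size calculation and jumbo threshold used in
this patch. The `jb_` names are hypothetical; the constant values are the
usual kernel ones (ETH_HLEN 14, ETH_FCS_LEN 4, VLAN_ETH_FRAME_LEN 1518).

```c
#include <assert.h>
#include <stdbool.h>

enum {
	JB_ETH_HLEN           = 14,
	JB_ETH_FCS_LEN        = 4,
	JB_ETH_DATA_LEN       = 1500,
	JB_VLAN_ETH_FRAME_LEN = 1518,
};

/* max_frame_size as computed after this patch. */
static int jb_max_frame_size(int mtu, int enc_hdr_len)
{
	assert(mtu >= 0 && enc_hdr_len >= 0);
	return mtu + enc_hdr_len + JB_ETH_HLEN + JB_ETH_FCS_LEN;
}

/* New test: based on the total frame size the hardware must accept. */
static bool jb_is_jumbo(int max_frame_size)
{
	return max_frame_size > JB_VLAN_ETH_FRAME_LEN + JB_ETH_FCS_LEN;
}

/* Old test: based on the MTU alone, blind to extra header room. */
static bool jb_is_jumbo_by_mtu(int mtu)
{
	return mtu > JB_ETH_DATA_LEN;
}
```

With mtu 1500 and the default enc_hdr_len of 4, the frame size is 1522 and
neither test reports jumbo mode. With enc_hdr_len raised to 8, the frame
size becomes 1526: the size-based test reports jumbo mode while the
mtu-based one would not.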

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 82 ++++++++++++++++++++++--------
 1 file changed, 60 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0a854a4..61ce986 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3046,7 +3046,7 @@ static void e1000_setup_rctl(struct e1000_adapter *adapter)
 	if (hw->mac.type >= e1000_pch2lan) {
 		s32 ret_val;
 
-		if (adapter->netdev->mtu > ETH_DATA_LEN)
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 			ret_val = e1000_lv_jumbo_workaround_ich8lan(hw, true);
 		else
 			ret_val = e1000_lv_jumbo_workaround_ich8lan(hw, false);
@@ -3066,7 +3066,7 @@ static void e1000_setup_rctl(struct e1000_adapter *adapter)
 	rctl &= ~E1000_RCTL_SBP;
 
 	/* Enable Long Packet receive */
-	if (adapter->netdev->mtu <= ETH_DATA_LEN)
+	if (adapter->max_frame_size <= VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 		rctl &= ~E1000_RCTL_LPE;
 	else
 		rctl |= E1000_RCTL_LPE;
@@ -3286,7 +3286,7 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 	/* With jumbo frames, excessive C-state transition latencies result
 	 * in dropped transactions.
 	 */
-	if (adapter->netdev->mtu > ETH_DATA_LEN) {
+	if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN) {
 		u32 lat =
 		    ((er32(PBA) & E1000_PBA_RXA_MASK) * 1024 -
 		     adapter->max_frame_size) * 8 / 1000;
@@ -3978,7 +3978,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 	switch (hw->mac.type) {
 	case e1000_ich9lan:
 	case e1000_ich10lan:
-		if (adapter->netdev->mtu > ETH_DATA_LEN) {
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN +
+					      ETH_FCS_LEN) {
 			pba = 14;
 			ew32(PBA, pba);
 			fc->high_water = 0x2800;
@@ -3997,7 +3998,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 		/* Workaround PCH LOM adapter hangs with certain network
 		 * loads.  If hangs persist, try disabling Tx flow control.
 		 */
-		if (adapter->netdev->mtu > ETH_DATA_LEN) {
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN +
+					      ETH_FCS_LEN) {
 			fc->high_water = 0x3500;
 			fc->low_water = 0x1500;
 		} else {
@@ -4011,7 +4013,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 	case e1000_pch_spt:
 		fc->refresh_time = 0x0400;
 
-		if (adapter->netdev->mtu <= ETH_DATA_LEN) {
+		if (adapter->max_frame_size <= VLAN_ETH_FRAME_LEN +
+					       ETH_FCS_LEN) {
 			fc->high_water = 0x05C20;
 			fc->low_water = 0x05048;
 			fc->pause_time = 0x0650;
@@ -4247,7 +4250,7 @@ void e1000e_down(struct e1000_adapter *adapter, bool reset)
 
 	/* Disable Si errata workaround on PCHx for jumbo frame flow */
 	if ((hw->mac.type >= e1000_pch2lan) &&
-	    (adapter->netdev->mtu > ETH_DATA_LEN) &&
+	    (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN) &&
 	    e1000_lv_jumbo_workaround_ich8lan(hw, false))
 		e_dbg("failed to disable jumbo frame workaround mode\n");
 
@@ -4346,7 +4349,8 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
 
 	adapter->rx_buffer_len = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
 	adapter->rx_ps_bsize0 = 128;
-	adapter->max_frame_size = netdev->mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
+	adapter->max_frame_size = netdev->mtu + netdev->enc_hdr_len +
+				  ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 	adapter->tx_ring_count = E1000_DEFAULT_TXD;
 	adapter->rx_ring_count = E1000_DEFAULT_RXD;
@@ -5920,17 +5924,10 @@ struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
 	return stats;
 }
 
-/**
- * e1000_change_mtu - Change the Maximum Transfer Unit
- * @netdev: network interface device structure
- * @new_mtu: new value for maximum frame size
- *
- * Returns 0 on success, negative on failure
- **/
-static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
+static int e1000_change_max_frame(struct net_device *netdev, int max_frame,
+				  int new_mtu)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
-	int max_frame = new_mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
 
 	/* Jumbo frame support */
 	if ((max_frame > (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)) &&
@@ -5940,7 +5937,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	}
 
 	/* Supported frame sizes */
-	if ((new_mtu < (VLAN_ETH_ZLEN + ETH_FCS_LEN)) ||
+	if ((new_mtu && new_mtu < (VLAN_ETH_ZLEN + ETH_FCS_LEN)) ||
 	    (max_frame > adapter->max_hw_frame_size)) {
 		e_err("Unsupported MTU setting\n");
 		return -EINVAL;
@@ -5949,7 +5946,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	/* Jumbo frame workaround on 82579 and newer requires CRC be stripped */
 	if ((adapter->hw.mac.type >= e1000_pch2lan) &&
 	    !(adapter->flags2 & FLAG2_CRC_STRIPPING) &&
-	    (new_mtu > ETH_DATA_LEN)) {
+	    (max_frame > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)) {
 		e_err("Jumbo Frames not supported on this device when CRC stripping is disabled.\n");
 		return -EINVAL;
 	}
@@ -5957,9 +5954,14 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	while (test_and_set_bit(__E1000_RESETTING, &adapter->state))
 		usleep_range(1000, 2000);
 	/* e1000e_down -> e1000e_reset dependent on max_frame_size & mtu */
+	if (new_mtu) {
+		e_info("changing MTU from %d to %d\n", netdev->mtu, new_mtu);
+		netdev->mtu = new_mtu;
+	} else {
+		e_info("changing max frame size from %d to %d\n",
+		       adapter->max_frame_size, max_frame);
+	}
 	adapter->max_frame_size = max_frame;
-	e_info("changing MTU from %d to %d\n", netdev->mtu, new_mtu);
-	netdev->mtu = new_mtu;
 
 	pm_runtime_get_sync(netdev->dev.parent);
 
@@ -5995,6 +5997,20 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	return 0;
 }
 
+/**
+ * e1000_change_mtu - Change the Maximum Transfer Unit
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum frame size
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	int max_frame = new_mtu + netdev->enc_hdr_len + ETH_HLEN + ETH_FCS_LEN;
+
+	return e1000_change_max_frame(netdev, max_frame, new_mtu);
+}
+
 static int e1000_mii_ioctl(struct net_device *netdev, struct ifreq *ifr,
 			   int cmd)
 {
@@ -6889,7 +6905,8 @@ static netdev_features_t e1000_fix_features(struct net_device *netdev,
 	struct e1000_hw *hw = &adapter->hw;
 
 	/* Jumbo frame workaround on 82579 and newer requires CRC be stripped */
-	if ((hw->mac.type >= e1000_pch2lan) && (netdev->mtu > ETH_DATA_LEN))
+	if (hw->mac.type >= e1000_pch2lan &&
+	    adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 		features &= ~NETIF_F_RXFCS;
 
 	return features;
@@ -6933,6 +6950,24 @@ static int e1000_set_features(struct net_device *netdev,
 	return 0;
 }
 
+/**
+ * e1000_enc_hdr_len - Expand encapsulation header room
+ * @netdev: network interface device structure
+ * @new_len: new value for maximum encapsulation header length
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int e1000_enc_hdr_len(struct net_device *netdev, int new_len)
+{
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	int max_frame = netdev->mtu + new_len + ETH_HLEN + ETH_FCS_LEN;
+
+	if (max_frame <= adapter->max_frame_size)
+		return 0;
+
+	return e1000_change_max_frame(netdev, max_frame, 0);
+}
+
 static const struct net_device_ops e1000e_netdev_ops = {
 	.ndo_open		= e1000_open,
 	.ndo_stop		= e1000_close,
@@ -6953,6 +6988,7 @@ static const struct net_device_ops e1000e_netdev_ops = {
 	.ndo_set_features = e1000_set_features,
 	.ndo_fix_features = e1000_fix_features,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_enc_hdr_len	= e1000_enc_hdr_len,
 };
 
 /**
@@ -7075,6 +7111,8 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	netdev->mem_start = mmio_start;
 	netdev->mem_end = mmio_start + mmio_len;
 
+	netdev->enc_hdr_len = VLAN_HLEN;
+
 	adapter->bd_number = cards_found++;
 
 	e1000e_check_options(adapter);
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Intel-wired-lan] [PATCH v2 net-next 2/4] e1000e: Add ndo_enc_hdr_len
@ 2015-10-26  3:40   ` Toshiaki Makita
  0 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: intel-wired-lan

e1000e has 4 bytes additional room for vlan header, so set default
enc_hdr_len to 4.
Note that e1000e uses mtu to validate frame size in some places, which
are needed to be modified to use max_frame_size as extra header room
became variable.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 82 ++++++++++++++++++++++--------
 1 file changed, 60 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 0a854a4..61ce986 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3046,7 +3046,7 @@ static void e1000_setup_rctl(struct e1000_adapter *adapter)
 	if (hw->mac.type >= e1000_pch2lan) {
 		s32 ret_val;
 
-		if (adapter->netdev->mtu > ETH_DATA_LEN)
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 			ret_val = e1000_lv_jumbo_workaround_ich8lan(hw, true);
 		else
 			ret_val = e1000_lv_jumbo_workaround_ich8lan(hw, false);
@@ -3066,7 +3066,7 @@ static void e1000_setup_rctl(struct e1000_adapter *adapter)
 	rctl &= ~E1000_RCTL_SBP;
 
 	/* Enable Long Packet receive */
-	if (adapter->netdev->mtu <= ETH_DATA_LEN)
+	if (adapter->max_frame_size <= VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 		rctl &= ~E1000_RCTL_LPE;
 	else
 		rctl |= E1000_RCTL_LPE;
@@ -3286,7 +3286,7 @@ static void e1000_configure_rx(struct e1000_adapter *adapter)
 	/* With jumbo frames, excessive C-state transition latencies result
 	 * in dropped transactions.
 	 */
-	if (adapter->netdev->mtu > ETH_DATA_LEN) {
+	if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN) {
 		u32 lat =
 		    ((er32(PBA) & E1000_PBA_RXA_MASK) * 1024 -
 		     adapter->max_frame_size) * 8 / 1000;
@@ -3978,7 +3978,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 	switch (hw->mac.type) {
 	case e1000_ich9lan:
 	case e1000_ich10lan:
-		if (adapter->netdev->mtu > ETH_DATA_LEN) {
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN +
+					      ETH_FCS_LEN) {
 			pba = 14;
 			ew32(PBA, pba);
 			fc->high_water = 0x2800;
@@ -3997,7 +3998,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 		/* Workaround PCH LOM adapter hangs with certain network
 		 * loads.  If hangs persist, try disabling Tx flow control.
 		 */
-		if (adapter->netdev->mtu > ETH_DATA_LEN) {
+		if (adapter->max_frame_size > VLAN_ETH_FRAME_LEN +
+					      ETH_FCS_LEN) {
 			fc->high_water = 0x3500;
 			fc->low_water = 0x1500;
 		} else {
@@ -4011,7 +4013,8 @@ void e1000e_reset(struct e1000_adapter *adapter)
 	case e1000_pch_spt:
 		fc->refresh_time = 0x0400;
 
-		if (adapter->netdev->mtu <= ETH_DATA_LEN) {
+		if (adapter->max_frame_size <= VLAN_ETH_FRAME_LEN +
+					       ETH_FCS_LEN) {
 			fc->high_water = 0x05C20;
 			fc->low_water = 0x05048;
 			fc->pause_time = 0x0650;
@@ -4247,7 +4250,7 @@ void e1000e_down(struct e1000_adapter *adapter, bool reset)
 
 	/* Disable Si errata workaround on PCHx for jumbo frame flow */
 	if ((hw->mac.type >= e1000_pch2lan) &&
-	    (adapter->netdev->mtu > ETH_DATA_LEN) &&
+	    (adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN) &&
 	    e1000_lv_jumbo_workaround_ich8lan(hw, false))
 		e_dbg("failed to disable jumbo frame workaround mode\n");
 
@@ -4346,7 +4349,8 @@ static int e1000_sw_init(struct e1000_adapter *adapter)
 
 	adapter->rx_buffer_len = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
 	adapter->rx_ps_bsize0 = 128;
-	adapter->max_frame_size = netdev->mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
+	adapter->max_frame_size = netdev->mtu + netdev->enc_hdr_len +
+				  ETH_HLEN + ETH_FCS_LEN;
 	adapter->min_frame_size = ETH_ZLEN + ETH_FCS_LEN;
 	adapter->tx_ring_count = E1000_DEFAULT_TXD;
 	adapter->rx_ring_count = E1000_DEFAULT_RXD;
@@ -5920,17 +5924,10 @@ struct rtnl_link_stats64 *e1000e_get_stats64(struct net_device *netdev,
 	return stats;
 }
 
-/**
- * e1000_change_mtu - Change the Maximum Transfer Unit
- * @netdev: network interface device structure
- * @new_mtu: new value for maximum frame size
- *
- * Returns 0 on success, negative on failure
- **/
-static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
+static int e1000_change_max_frame(struct net_device *netdev, int max_frame,
+				  int new_mtu)
 {
 	struct e1000_adapter *adapter = netdev_priv(netdev);
-	int max_frame = new_mtu + VLAN_ETH_HLEN + ETH_FCS_LEN;
 
 	/* Jumbo frame support */
 	if ((max_frame > (VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)) &&
@@ -5940,7 +5937,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	}
 
 	/* Supported frame sizes */
-	if ((new_mtu < (VLAN_ETH_ZLEN + ETH_FCS_LEN)) ||
+	if ((new_mtu && new_mtu < (VLAN_ETH_ZLEN + ETH_FCS_LEN)) ||
 	    (max_frame > adapter->max_hw_frame_size)) {
 		e_err("Unsupported MTU setting\n");
 		return -EINVAL;
@@ -5949,7 +5946,7 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	/* Jumbo frame workaround on 82579 and newer requires CRC be stripped */
 	if ((adapter->hw.mac.type >= e1000_pch2lan) &&
 	    !(adapter->flags2 & FLAG2_CRC_STRIPPING) &&
-	    (new_mtu > ETH_DATA_LEN)) {
+	    (max_frame > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)) {
 		e_err("Jumbo Frames not supported on this device when CRC stripping is disabled.\n");
 		return -EINVAL;
 	}
@@ -5957,9 +5954,14 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	while (test_and_set_bit(__E1000_RESETTING, &adapter->state))
 		usleep_range(1000, 2000);
 	/* e1000e_down -> e1000e_reset dependent on max_frame_size & mtu */
+	if (new_mtu) {
+		e_info("changing MTU from %d to %d\n", netdev->mtu, new_mtu);
+		netdev->mtu = new_mtu;
+	} else {
+		e_info("changing max frame size from %d to %d\n",
+		       adapter->max_frame_size, max_frame);
+	}
 	adapter->max_frame_size = max_frame;
-	e_info("changing MTU from %d to %d\n", netdev->mtu, new_mtu);
-	netdev->mtu = new_mtu;
 
 	pm_runtime_get_sync(netdev->dev.parent);
 
@@ -5995,6 +5997,20 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
 	return 0;
 }
 
+/**
+ * e1000_change_mtu - Change the Maximum Transfer Unit
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum frame size
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int e1000_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	int max_frame = new_mtu + netdev->enc_hdr_len + ETH_HLEN + ETH_FCS_LEN;
+
+	return e1000_change_max_frame(netdev, max_frame, new_mtu);
+}
+
 static int e1000_mii_ioctl(struct net_device *netdev, struct ifreq *ifr,
 			   int cmd)
 {
@@ -6889,7 +6905,8 @@ static netdev_features_t e1000_fix_features(struct net_device *netdev,
 	struct e1000_hw *hw = &adapter->hw;
 
 	/* Jumbo frame workaround on 82579 and newer requires CRC be stripped */
-	if ((hw->mac.type >= e1000_pch2lan) && (netdev->mtu > ETH_DATA_LEN))
+	if (hw->mac.type >= e1000_pch2lan &&
+	    adapter->max_frame_size > VLAN_ETH_FRAME_LEN + ETH_FCS_LEN)
 		features &= ~NETIF_F_RXFCS;
 
 	return features;
@@ -6933,6 +6950,24 @@ static int e1000_set_features(struct net_device *netdev,
 	return 0;
 }
 
+/**
+ * e1000_enc_hdr_len - Expand encapsulation header room
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum encapsulation header length
+ *
+ * Returns 0 on success, negative on failure
+ **/
+static int e1000_enc_hdr_len(struct net_device *netdev, int new_len)
+{
+	struct e1000_adapter *adapter = netdev_priv(netdev);
+	int max_frame = netdev->mtu + new_len + ETH_HLEN + ETH_FCS_LEN;
+
+	if (max_frame <= adapter->max_frame_size)
+		return 0;
+
+	return e1000_change_max_frame(netdev, max_frame, 0);
+}
+
 static const struct net_device_ops e1000e_netdev_ops = {
 	.ndo_open		= e1000_open,
 	.ndo_stop		= e1000_close,
@@ -6953,6 +6988,7 @@ static const struct net_device_ops e1000e_netdev_ops = {
 	.ndo_set_features = e1000_set_features,
 	.ndo_fix_features = e1000_fix_features,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_enc_hdr_len	= e1000_enc_hdr_len,
 };
 
 /**
@@ -7075,6 +7111,8 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	netdev->mem_start = mmio_start;
 	netdev->mem_end = mmio_start + mmio_len;
 
+	netdev->enc_hdr_len = VLAN_HLEN;
+
 	adapter->bd_number = cards_found++;
 
 	e1000e_check_options(adapter);
-- 
1.8.1.2



* [PATCH v2 net-next 3/4] vlan: Notify real device of encap header length
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-26  3:40   ` Toshiaki Makita
  -1 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: David S . Miller
  Cc: Toshiaki Makita, Patrick McHardy, Stephen Hemminger,
	Vlad Yasevich, Jeff Kirsher, intel-wired-lan, toshiaki.makita1,
	netdev

Reserve an extra 4 bytes on real_dev in addition to the length notified
by the upper device.
In 802.1ad mode, set enc_hdr_len to 4 bytes by default, since the device
is likely to carry frames that are already tagged.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
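Illustration only, not part of the patch: a minimal user-space sketch of
the arithmetic in vlan_dev_enc_hdr_len() below. The helper name
room_needed_on_real_dev() is hypothetical. Where the patch calls
dev_set_enc_hdr_len(real_dev, ...), this sketch just returns the length
that would be requested, and it returns 0 where the patch has nothing to
request.

```c
#include <assert.h>

#define VLAN_HLEN 4 /* length of one 802.1Q/802.1ad tag, as in the kernel */

/*
 * Hypothetical helper modelling vlan_dev_enc_hdr_len():
 * returns the header room to request from the real device, or 0 when the
 * MTU difference between real and vlan device already covers it.
 */
static int room_needed_on_real_dev(int real_mtu, int vlan_mtu, int upper_len)
{
	int mtu_room, needed;

	/* vlan_dev_change_mtu() rejects a vlan MTU above the real MTU */
	assert(real_mtu >= vlan_mtu);

	mtu_room = real_mtu - vlan_mtu;  /* slack already available */
	needed = upper_len + VLAN_HLEN;  /* upper headers plus our own tag */

	return needed > mtu_room ? needed - mtu_room : 0;
}
```

With equal MTUs of 1500, a plain vlan device (upper_len 0) asks for 4
bytes and an 802.1ad device (upper_len 4) asks for 8. A vlan device whose
MTU is already 4 bytes below the real device asks for nothing.
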
 net/8021q/vlan.c     | 16 ++++++++++++++--
 net/8021q/vlan_dev.c | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index d2cd9de..bbafc5c 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -355,6 +355,7 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 	struct net_device *vlandev;
 	struct vlan_dev_priv *vlan;
 	bool last = false;
+	int dev_max_frame;
 	LIST_HEAD(list);
 
 	if (is_vlan_dev(dev)) {
@@ -399,11 +400,22 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event,
 		break;
 
 	case NETDEV_CHANGEMTU:
+		dev_max_frame = dev->mtu + dev->enc_hdr_len;
 		vlan_group_for_each_dev(grp, i, vlandev) {
-			if (vlandev->mtu <= dev->mtu)
+			int enc_hdr_len = vlandev->enc_hdr_len + VLAN_HLEN;
+
+			if (vlandev->mtu <= dev->mtu &&
+			    vlandev->mtu + enc_hdr_len <= dev_max_frame)
 				continue;
 
-			dev_set_mtu(vlandev, dev->mtu);
+			if (vlandev->mtu > dev->mtu) {
+				dev_set_mtu(vlandev, dev->mtu);
+			} else {
+				int mtu_room = dev->mtu - vlandev->mtu;
+
+				dev_set_enc_hdr_len(dev,
+						    enc_hdr_len - mtu_room);
+			}
 		}
 		break;
 
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index fded865..ac377cf 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -143,17 +143,35 @@ static netdev_tx_t vlan_dev_hard_start_xmit(struct sk_buff *skb,
 	return ret;
 }
 
+static int vlan_dev_enc_hdr_len(struct net_device *dev, int new_len)
+{
+	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	int mtu_room = real_dev->mtu - dev->mtu;
+
+	new_len += VLAN_HLEN;
+	if (new_len > mtu_room)
+		return dev_set_enc_hdr_len(real_dev, new_len - mtu_room);
+
+	return 0;
+}
+
 static int vlan_dev_change_mtu(struct net_device *dev, int new_mtu)
 {
-	/* TODO: gotta make sure the underlying layer can handle it,
-	 * maybe an IFF_VLAN_CAPABLE flag for devices?
-	 */
-	if (vlan_dev_priv(dev)->real_dev->mtu < new_mtu)
+	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	int orig_mtu;
+	int err;
+
+	if (real_dev->mtu < new_mtu)
 		return -ERANGE;
 
+	orig_mtu = dev->mtu;
 	dev->mtu = new_mtu;
 
-	return 0;
+	err = vlan_dev_enc_hdr_len(dev, dev->enc_hdr_len);
+	if (err)
+		dev->mtu = orig_mtu;
+
+	return err;
 }
 
 void vlan_dev_set_ingress_priority(const struct net_device *dev,
@@ -532,6 +550,8 @@ static const struct net_device_ops vlan_netdev_ops;
 static int vlan_dev_init(struct net_device *dev)
 {
 	struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
+	int enc_hdr_len;
+	int err;
 
 	netif_carrier_off(dev);
 
@@ -583,6 +603,23 @@ static int vlan_dev_init(struct net_device *dev)
 
 	vlan_dev_set_lockdep_class(dev, vlan_dev_get_lock_subclass(dev));
 
+	if (vlan_dev_priv(dev)->vlan_proto == htons(ETH_P_8021AD))
+		dev->enc_hdr_len = VLAN_HLEN;
+
+	enc_hdr_len = dev->enc_hdr_len + VLAN_HLEN;
+	err = dev_set_enc_hdr_len(real_dev, enc_hdr_len);
+	if (err) {
+		int new_mtu = real_dev->mtu + real_dev->enc_hdr_len -
+			      enc_hdr_len;
+
+		if (new_mtu < 0)
+			return -ENOSPC;
+
+		netdev_warn(dev, "Failed to expand encap header room to %d on real device. Decrease MTU to %d on vlan device.\n",
+			    enc_hdr_len, new_mtu);
+		dev->mtu = new_mtu;
+	}
+
 	vlan_dev_priv(dev)->vlan_pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats);
 	if (!vlan_dev_priv(dev)->vlan_pcpu_stats)
 		return -ENOMEM;
@@ -776,6 +813,7 @@ static const struct net_device_ops vlan_netdev_ops = {
 	.ndo_fix_features	= vlan_dev_fix_features,
 	.ndo_get_lock_subclass  = vlan_dev_get_lock_subclass,
 	.ndo_get_iflink		= vlan_dev_get_iflink,
+	.ndo_enc_hdr_len	= vlan_dev_enc_hdr_len,
 };
 
 static void vlan_dev_free(struct net_device *dev)
-- 
1.8.1.2

* [PATCH v2 net-next 4/4] bridge: Notify port device of encap header length
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-26  3:40   ` Toshiaki Makita
  -1 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-26  3:40 UTC (permalink / raw)
  To: David S . Miller
  Cc: Toshiaki Makita, Patrick McHardy, Stephen Hemminger,
	Vlad Yasevich, Jeff Kirsher, intel-wired-lan, toshiaki.makita1,
	netdev

If a vlan is assigned to a port, notify the port device of a 4-byte
header length, or 8 bytes if the bridge uses 802.1ad.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
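Illustration only, not part of the patch: a minimal user-space sketch of
the length that __vlan_add() below passes to dev_set_enc_hdr_len(). The
helper name bridge_port_enc_hdr_len() is hypothetical, and the protocol
value is taken in host byte order here, whereas the patch compares
against htons(ETH_P_8021AD).

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_HLEN    4      /* one vlan tag */
#define ETH_P_8021Q  0x8100 /* 802.1Q ethertype */
#define ETH_P_8021AD 0x88A8 /* 802.1ad ethertype */

/*
 * Hypothetical helper: header room a bridge port needs once at least one
 * vlan is configured on it. One tag for 802.1Q, two tags for 802.1ad.
 */
static int bridge_port_enc_hdr_len(uint16_t vlan_proto)
{
	int len = VLAN_HLEN;

	/* only the two protocols handled by the bridge are modelled */
	assert(vlan_proto == ETH_P_8021Q || vlan_proto == ETH_P_8021AD);

	if (vlan_proto == ETH_P_8021AD)
		len += VLAN_HLEN;

	return len;
}
```

This yields 4 for an 802.1Q bridge and 8 for an 802.1ad bridge, matching
the VLAN_HLEN * 2 used in __br_vlan_set_proto().
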
 net/bridge/br_vlan.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 5f0d0cc..9c678de 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -208,6 +208,8 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
 	}
 
 	if (p) {
+		int enc_hdr_len;
+
 		/* Add VLAN to the device filter if it is supported.
 		 * This ensures tagged traffic enters the bridge when
 		 * promiscuous mode is disabled by br_manage_promisc().
@@ -228,6 +230,13 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
 		if (!masterv)
 			goto out_filt;
 		v->brvlan = masterv;
+
+		if (!vg->num_vlans) {
+			enc_hdr_len = VLAN_HLEN;
+			if (br->vlan_proto == htons(ETH_P_8021AD))
+				enc_hdr_len += VLAN_HLEN;
+			dev_set_enc_hdr_len(dev, enc_hdr_len);
+		}
 	}
 
 	/* Add the dev mac and count the vlan only if it's usable */
@@ -667,6 +676,15 @@ int __br_vlan_set_proto(struct net_bridge *br, __be16 proto)
 		}
 	}
 
+	if (proto == htons(ETH_P_8021AD)) {
+		list_for_each_entry(p, &br->port_list, list) {
+			if (!p->vlgrp->num_vlans)
+				continue;
+
+			dev_set_enc_hdr_len(p->dev, VLAN_HLEN * 2);
+		}
+	}
+
 	oldproto = br->vlan_proto;
 	br->vlan_proto = proto;
 
-- 
1.8.1.2

* Re: [PATCH v2 net-next 0/4] Automatic adjustment of max frame size
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-28  2:30   ` David Miller
  -1 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2015-10-28  2:30 UTC (permalink / raw)
  To: makita.toshiaki
  Cc: kaber, stephen, vyasevich, jeffrey.t.kirsher, intel-wired-lan,
	toshiaki.makita1, netdev

From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Date: Mon, 26 Oct 2015 12:40:55 +0900

> This patch set tries to resolve packet drop by oversize error on
> receiving double tagged packets and possibly other encapsulated
> packets.

Nobody is reviewing this patch series, therefore I am not applying
it.

* Re: [PATCH v2 net-next 0/4] Automatic adjustment of max frame size
  2015-10-26  3:40 ` [Intel-wired-lan] " Toshiaki Makita
@ 2015-10-28  4:58   ` Stephen Hemminger
  -1 siblings, 0 replies; 16+ messages in thread
From: Stephen Hemminger @ 2015-10-28  4:58 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: David S . Miller, Patrick McHardy, Vlad Yasevich, Jeff Kirsher,
	intel-wired-lan, toshiaki.makita1, netdev

On Mon, 26 Oct 2015 12:40:55 +0900
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:

> This patch set tries to resolve packet drop by oversize error on receiving
> double tagged packets and possibly other encapsulated packets.
> 
> Problem description:
> Currently most NICs have 4 bytes room of receive buffer for vlan header and
> can receive 1522 bytes frame at maximum.
> This is, however, not sufficient once double tagged vlan is used.
> As MEF [1] says, maximum frame size of double tagged packets need to be at
> least 1526 to provide transparent ethernet VPN, and along the same line,
> HW switches send 1526 bytes double tagged packets.
> Thus, double tagged packets are dropped by default in most cases by
> oversize error. NICs need to accept 1526 bytes packets in this situation.
> 
> Approaches:
> To satisfy this requirement, this patch set introduces a way to indicate
> needed extra buffer space to drivers.
> This way can be re-used by other protocols than vlan, like mpls, vxlan, etc.
> 
> Other possible solutions:
> 
> - To adjust mtu automatically when stacked vlan device is created.
>   This is suboptimal because lower device is not necessarily used for only
>   vlan. Sometimes tagged and untagged traffic are both used at the same time.
>   Also, there are devices that already reserve 8 bytes room, in which case mtu
>   adjustment is unnecessary.
> 
> - To reserve more room by default.
>   This is also suboptimal because there are devices that change behavior
>   when max acceptable frame size gets larger. For example, e1000e enters
>   jumbo frame mode which has more restrictions than normal mode.
>   Also, this is vlan-specific solution and not reusable by other encapsulation
>   protocols.
> 
> This patch set introduces .ndo_enc_hdr_len() and I chose e1000e as the first
> implementation. Patch 3 makes vlan driver utilize this API and automatically
> expand max frame size of the real device. Patch 4 makes bridge use the API
> in similar way as vlan.
> 
> Challenges:
> - Restore/shrink extra header room after vlan devices are deleted.
>   This will need some additional memory storage.
> - Manual modification of extra buffer size (by iproute2).
> 
> Note:
> - This problem was once discussed in Netdev 0.1 [2].
>   This patch set is based on the conclusion of the discussion.
> 
> Changes:
>  v2: Fixed checkpatch warnings
> 
> [1] https://wiki.mef.net/display/CESG/ENNI+Frame
> [2] https://www.netdev01.org/docs/netdev01_bof_8021ad_makita_150212.pdf
> 
> Toshiaki Makita (4):
>   net: Add ndo_enc_hdr_len to notify extra header room for encapsulated
>     frames
>   e1000e: Add ndo_enc_hdr_len
>   vlan: Notify real device of encap header length
>   bridge: Notify port device of encap header length
> 
>  drivers/net/ethernet/intel/e1000e/netdev.c | 82 ++++++++++++++++++++++--------
>  include/linux/netdevice.h                  |  9 ++++
>  net/8021q/vlan.c                           | 16 +++++-
>  net/8021q/vlan_dev.c                       | 48 +++++++++++++++--
>  net/bridge/br_vlan.c                       | 18 +++++++
>  net/core/dev.c                             | 36 +++++++++++++
>  6 files changed, 180 insertions(+), 29 deletions(-)
> 

The problem is that you require changing network device drivers
and device specific knowledge about what will work or not. Because
of that the modification can't be automated.

Also, this affects even more layered devices like tunnels etc.
The problem is quite large, and this patch only begins to address it.

It seems to me that just having the vlan driver do a sane
auto default is the best solution. It might cause a smaller MTU
than ideal, but at least it will still work. Then the user can
manually set a larger MTU if they know their hardware will work.

* Re: [PATCH v2 net-next 0/4] Automatic adjustment of max frame size
  2015-10-28  4:58   ` [Intel-wired-lan] " Stephen Hemminger
@ 2015-10-28  7:40     ` Toshiaki Makita
  -1 siblings, 0 replies; 16+ messages in thread
From: Toshiaki Makita @ 2015-10-28  7:40 UTC (permalink / raw)
  To: Stephen Hemminger, Toshiaki Makita
  Cc: David S . Miller, Patrick McHardy, Vlad Yasevich, Jeff Kirsher,
	intel-wired-lan, netdev

On 15/10/28 (Wed) 13:58, Stephen Hemminger wrote:
> On Mon, 26 Oct 2015 12:40:55 +0900
> Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> wrote:
...

Thank you for taking a look at the patch set.
I'm not sure if I fully understand you, so please correct me if I 
misread you.

> The problem is that you require changing network device drivers
> and device specific knowledge about what will work or not. Because
> of that the modification can't be automated.

I'm not sure what you mean by "device specific knowledge" and "automated"...
Indeed, this requires a change in each driver.
But the required changes should mostly reuse each driver's existing
ndo_change_mtu implementation and should not be hard. We can implement
ndo_enc_hdr_len progressively, driver by driver.
If max frame size cannot be changed on a certain NIC, vlan driver will 
emit a warning message and make MTU smaller, then userspace can handle 
it (patch 3). If needed, maybe we can expose this feature via ethtool.
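
As an illustration, here is a minimal user-space sketch of that fallback,
modelled on the error path of vlan_dev_init() in patch 3. The helper name
fallback_vlan_mtu() is hypothetical. In the patch this path runs only
after dev_set_enc_hdr_len() on the real device has failed.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: MTU the vlan device falls back to when the real
 * device cannot provide wanted_enc_hdr_len bytes of header room.
 * Returns -ENOSPC when no non-negative MTU fits.
 */
static int fallback_vlan_mtu(int real_mtu, int real_enc_hdr_len,
			     int wanted_enc_hdr_len)
{
	int new_mtu;

	assert(wanted_enc_hdr_len >= 0);

	/* total room on the real device minus what the vlan device needs */
	new_mtu = real_mtu + real_enc_hdr_len - wanted_enc_hdr_len;
	if (new_mtu < 0)
		return -ENOSPC;

	return new_mtu;
}
```

For example, a real device with MTU 1500 and 4 bytes of header room that
cannot grow to 8 bytes would leave the vlan device with an MTU of 1496.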

>
> Also, this affects even more layered devices like tunnels etc.

Yes, if tunnel devices start to utilize this framework. This is one of
the purposes of my patch set.

> The problem is quite large, and this patch only begins to address it.

Yes, this is the first step to address the problem.

>
> It seems to me that just having the vlan driver do a sane
> auto default is the best solution.

For now, this patch implementation is limited to vlan. For other
protocols, auto-expansion may not be suitable, and they may need some
knob to use the framework.

If you mean just making MTU smaller on vlan device instead of adjusting 
max frame size of real device, then it would not work. 802.1ad HW 
switches, at any rate, send 1526 bytes frames so they will be dropped on 
the real device.
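
To make the numbers concrete, here is a minimal user-space sketch of the
max frame calculation used by e1000_change_mtu() in patch 2 (mtu +
enc_hdr_len + ETH_HLEN + ETH_FCS_LEN). The helper names max_frame_size()
and is_oversize() are hypothetical, and the oversize check is a
simplification of what a NIC does on receive.

```c
#include <assert.h>

#define ETH_HLEN    14 /* destination MAC + source MAC + ethertype */
#define ETH_FCS_LEN 4  /* frame check sequence */
#define VLAN_HLEN   4  /* one vlan tag */

/* Largest frame the device accepts for a given MTU and header room. */
static int max_frame_size(int mtu, int enc_hdr_len)
{
	assert(mtu >= 0 && enc_hdr_len >= 0);
	return mtu + enc_hdr_len + ETH_HLEN + ETH_FCS_LEN;
}

/* Nonzero if a received frame of frame_len bytes would be dropped. */
static int is_oversize(int frame_len, int mtu, int enc_hdr_len)
{
	return frame_len > max_frame_size(mtu, enc_hdr_len);
}
```

With MTU 1500 and room for one tag the limit is 1522 bytes, so a 1526-byte
double tagged frame is oversize. With room for two tags the limit becomes
1526 and the same frame is accepted, regardless of the vlan device's MTU.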

> It might cause a smaller MTU
> than ideal, but at least it will still work. Then the user can
> manually set a larger MTU if they know their hardware will work.

Toshiaki Makita
