All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] AF_PACKET transport_offset fix
@ 2019-01-14 13:18 Maxim Mikityanskiy
  2019-01-14 13:18 ` [PATCH 1/7] net: Don't set transport offset to invalid value Maxim Mikityanskiy
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:18 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

This patch series contains the implementation of the RFC that was posted
on this mailing list previously:
https://www.spinics.net/lists/netdev/msg541709.html

It fixes having incorrect skb->transport_header values in cases when
dissect fails. Having correct values set by the kernel fixes mlx5
operation and allows to remove some unnecessary code flows in mlx5.

Maxim Mikityanskiy (7):
  net: Don't set transport offset to invalid value
  net: Introduce parse_protocol header_ops callback
  net/ethernet: Add parse_protocol header_ops support
  net/packet: Ask driver for protocol if not provided by user
  net/packet: Remove redundant skb->protocol set
  net/mlx5e: Remove the wrong assumption about transport offset
  net/mlx5e: Trust kernel regarding transport offset

 .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 15 ++---------
 drivers/net/tap.c                             |  4 +--
 drivers/net/tun.c                             |  4 +--
 drivers/net/xen-netback/netback.c             | 19 ++++++++------
 include/linux/etherdevice.h                   |  1 +
 include/linux/netdevice.h                     | 10 +++++++
 include/linux/skbuff.h                        | 14 +++++-----
 net/ethernet/eth.c                            | 13 ++++++++++
 net/packet/af_packet.c                        | 26 +++++++++----------
 9 files changed, 60 insertions(+), 46 deletions(-)

-- 
2.19.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
@ 2019-01-14 13:18 ` Maxim Mikityanskiy
  2019-01-14 16:49   ` Willem de Bruijn
  2019-01-14 13:18 ` [PATCH 2/7] net: Introduce parse_protocol header_ops callback Maxim Mikityanskiy
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:18 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

If the socket was created with socket(AF_PACKET, SOCK_RAW, 0),
skb->protocol will be unset, __skb_flow_dissect() will fail, and
skb_probe_transport_header() will fall back to the offset_hint, making
the resulting skb_transport_offset incorrect.

If, however, there is no transport header in the packet,
transport_header shouldn't be set to an arbitrary value.

Fix it by leaving the transport offset unset if it couldn't be found, to
be explicit rather than to fill it with some wrong value. It changes the
behavior, but if some code relied on the old behavior, it would be
broken anyway, as the old one is incorrect.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 drivers/net/tap.c                 |  4 ++--
 drivers/net/tun.c                 |  4 ++--
 drivers/net/xen-netback/netback.c | 19 +++++++++++--------
 include/linux/skbuff.h            | 14 +++++++-------
 net/packet/af_packet.c            |  6 +++---
 5 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 443b2694130c..a35b44b13a34 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -712,7 +712,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
 			goto err_kfree;
 	}
 
-	skb_probe_transport_header(skb, ETH_HLEN);
+	skb_try_probe_transport_header(skb);
 
 	/* Move network header to the right position for VLAN tagged packets */
 	if ((skb->protocol == htons(ETH_P_8021Q) ||
@@ -1177,7 +1177,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
 			goto err_kfree;
 	}
 
-	skb_probe_transport_header(skb, ETH_HLEN);
+	skb_try_probe_transport_header(skb);
 
 	/* Move network header to the right position for VLAN tagged packets */
 	if ((skb->protocol == htons(ETH_P_8021Q) ||
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a4fdad475594..f73a156379e6 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1927,7 +1927,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	}
 
 	skb_reset_network_header(skb);
-	skb_probe_transport_header(skb, 0);
+	skb_try_probe_transport_header(skb);
 
 	if (skb_xdp) {
 		struct bpf_prog *xdp_prog;
@@ -2480,7 +2480,7 @@ static int tun_xdp_one(struct tun_struct *tun,
 
 	skb->protocol = eth_type_trans(skb, tun->dev);
 	skb_reset_network_header(skb);
-	skb_probe_transport_header(skb, 0);
+	skb_try_probe_transport_header(skb);
 
 	if (skb_xdp) {
 		err = do_xdp_generic(xdp_prog, skb);
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 80aae3a32c2a..b49b6e56ca47 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
 		struct xen_netif_tx_request *txp;
 		u16 pending_idx;
 		unsigned data_len;
+		bool th_set;
 
 		pending_idx = XENVIF_TX_CB(skb)->pending_idx;
 		txp = &queue->pending_tx_info[pending_idx].req;
@@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
 			continue;
 		}
 
-		skb_probe_transport_header(skb, 0);
+		th_set = skb_try_probe_transport_header(skb);
 
 		/* If the packet is GSO then we will have just set up the
 		 * transport header offset in checksum_setup so it's now
 		 * straightforward to calculate gso_segs.
 		 */
 		if (skb_is_gso(skb)) {
-			int mss = skb_shinfo(skb)->gso_size;
-			int hdrlen = skb_transport_header(skb) -
-				skb_mac_header(skb) +
-				tcp_hdrlen(skb);
-
-			skb_shinfo(skb)->gso_segs =
-				DIV_ROUND_UP(skb->len - hdrlen, mss);
+			if (likely(th_set)) { /* GSO implies having L4 header */
+				int mss = skb_shinfo(skb)->gso_size;
+				int hdrlen = skb_transport_header(skb) -
+					skb_mac_header(skb) +
+					tcp_hdrlen(skb);
+
+				skb_shinfo(skb)->gso_segs =
+					DIV_ROUND_UP(skb->len - hdrlen, mss);
+			}
 		}
 
 		queue->stats.rx_bytes += skb->len;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2a57a365c711..b3aa2be1afb3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2424,18 +2424,18 @@ static inline void skb_pop_mac_header(struct sk_buff *skb)
 	skb->mac_header = skb->network_header;
 }
 
-static inline void skb_probe_transport_header(struct sk_buff *skb,
-					      const int offset_hint)
+static inline bool skb_try_probe_transport_header(struct sk_buff *skb)
 {
 	struct flow_keys_basic keys;
 
 	if (skb_transport_header_was_set(skb))
-		return;
+		return true;
 
-	if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
-		skb_set_transport_header(skb, keys.control.thoff);
-	else
-		skb_set_transport_header(skb, offset_hint);
+	if (!skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
+		return false;
+
+	skb_set_transport_header(skb, keys.control.thoff);
+	return true;
 }
 
 static inline void skb_mac_header_rebuild(struct sk_buff *skb)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index eedacdebcd4c..8fc76e68777a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1970,7 +1970,7 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
 
-	skb_probe_transport_header(skb, 0);
+	skb_try_probe_transport_header(skb);
 
 	dev_queue_xmit(skb);
 	rcu_read_unlock();
@@ -2519,7 +2519,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
-	skb_probe_transport_header(skb, 0);
+	skb_try_probe_transport_header(skb);
 
 	return tp_len;
 }
@@ -2924,7 +2924,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		virtio_net_hdr_set_proto(skb, &vnet_hdr);
 	}
 
-	skb_probe_transport_header(skb, reserve);
+	skb_try_probe_transport_header(skb);
 
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/7] net: Introduce parse_protocol header_ops callback
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
  2019-01-14 13:18 ` [PATCH 1/7] net: Don't set transport offset to invalid value Maxim Mikityanskiy
@ 2019-01-14 13:18 ` Maxim Mikityanskiy
  2019-01-14 13:19 ` [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support Maxim Mikityanskiy
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:18 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

Introduce a new optional header_ops callback called parse_protocol and a
wrapper function dev_parse_header_protocol, similar to dev_parse_header.

The new callback's purpose is to extract the protocol number from the L2
header, the format of which is known to the driver, but not to the upper
layers of the stack.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 include/linux/netdevice.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1377d085ef99..44b12430b0a7 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -274,6 +274,7 @@ struct header_ops {
 				const struct net_device *dev,
 				const unsigned char *haddr);
 	bool	(*validate)(const char *ll_header, unsigned int len);
+	__be16	(*parse_protocol)(const struct sk_buff *skb);
 };
 
 /* These flag bits are private to the generic network queueing
@@ -2928,6 +2929,15 @@ static inline int dev_parse_header(const struct sk_buff *skb,
 	return dev->header_ops->parse(skb, haddr);
 }
 
+static inline __be16 dev_parse_header_protocol(const struct sk_buff *skb)
+{
+	const struct net_device *dev = skb->dev;
+
+	if (!dev->header_ops || !dev->header_ops->parse_protocol)
+		return 0;
+	return dev->header_ops->parse_protocol(skb);
+}
+
 /* ll_header must have at least hard_header_len allocated */
 static inline bool dev_validate_header(const struct net_device *dev,
 				       char *ll_header, int len)
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
  2019-01-14 13:18 ` [PATCH 1/7] net: Don't set transport offset to invalid value Maxim Mikityanskiy
  2019-01-14 13:18 ` [PATCH 2/7] net: Introduce parse_protocol header_ops callback Maxim Mikityanskiy
@ 2019-01-14 13:19 ` Maxim Mikityanskiy
  2019-01-23 14:14   ` Willem de Bruijn
  2019-01-14 13:19 ` [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user Maxim Mikityanskiy
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:19 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

The previous commit introduced parse_protocol callback which should
extract the protocol number from the L2 header. Make all Ethernet
devices support it.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 include/linux/etherdevice.h |  1 +
 net/ethernet/eth.c          | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index 2c0af7b00715..e2f3b21cd72a 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh, struct hh_cache *hh,
 		     __be16 type);
 void eth_header_cache_update(struct hh_cache *hh, const struct net_device *dev,
 			     const unsigned char *haddr);
+__be16 eth_header_parse_protocol(const struct sk_buff *skb);
 int eth_prepare_mac_addr_change(struct net_device *dev, void *p);
 void eth_commit_mac_addr_change(struct net_device *dev, void *p);
 int eth_mac_addr(struct net_device *dev, void *p);
diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 4c520110b04f..f7a3d7a171c7 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -264,6 +264,18 @@ void eth_header_cache_update(struct hh_cache *hh,
 }
 EXPORT_SYMBOL(eth_header_cache_update);
 
+/**
+ * eth_header_parser_protocol - extract protocol from L2 header
+ * @skb: packet to extract protocol from
+ */
+__be16 eth_header_parse_protocol(const struct sk_buff *skb)
+{
+	const struct ethhdr *eth = eth_hdr(skb);
+
+	return eth->h_proto;
+}
+EXPORT_SYMBOL(eth_header_parse_protocol);
+
 /**
  * eth_prepare_mac_addr_change - prepare for mac change
  * @dev: network device
@@ -346,6 +358,7 @@ const struct header_ops eth_header_ops ____cacheline_aligned = {
 	.parse		= eth_header_parse,
 	.cache		= eth_header_cache,
 	.cache_update	= eth_header_cache_update,
+	.parse_protocol	= eth_header_parse_protocol,
 };
 
 /**
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (2 preceding siblings ...)
  2019-01-14 13:19 ` [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support Maxim Mikityanskiy
@ 2019-01-14 13:19 ` Maxim Mikityanskiy
  2019-01-14 16:52   ` Willem de Bruijn
  2019-01-14 13:19 ` [PATCH 5/7] net/packet: Remove redundant skb->protocol set Maxim Mikityanskiy
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:19 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

If a socket was created with socket(AF_PACKET, SOCK_RAW, 0), the
protocol number is unavailable. Try to ask the driver to extract it from
the L2 header in order for skb_try_probe_transport_header to succeed.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 net/packet/af_packet.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8fc76e68777a..d1d89749a17a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1850,6 +1850,15 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 }
 
+static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
+{
+	if (!skb->protocol && sock->type == SOCK_RAW) {
+		skb_reset_mac_header(skb);
+		skb->protocol = dev_parse_header_protocol(skb);
+	}
+
+	skb_try_probe_transport_header(skb);
+}
 
 /*
  *	Output a raw packet to a device layer. This bypasses all the other
@@ -1970,7 +1979,7 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
 
-	skb_try_probe_transport_header(skb);
+	packet_parse_headers(skb, sock);
 
 	dev_queue_xmit(skb);
 	rcu_read_unlock();
@@ -2519,7 +2528,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
-	skb_try_probe_transport_header(skb);
+	packet_parse_headers(skb, sock);
 
 	return tp_len;
 }
@@ -2924,7 +2933,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		virtio_net_hdr_set_proto(skb, &vnet_hdr);
 	}
 
-	skb_try_probe_transport_header(skb);
+	packet_parse_headers(skb, sock);
 
 	if (unlikely(extra_len == 4))
 		skb->no_fcs = 1;
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/7] net/packet: Remove redundant skb->protocol set
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (3 preceding siblings ...)
  2019-01-14 13:19 ` [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user Maxim Mikityanskiy
@ 2019-01-14 13:19 ` Maxim Mikityanskiy
  2019-01-14 13:19 ` [PATCH 6/7] net/mlx5e: Remove the wrong assumption about transport offset Maxim Mikityanskiy
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:19 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

c72219b75f introduced tpacket_set_protocol that parses the Ethernet L2
header and sets skb->protocol if it's unset. It is no longer needed
since the introduction of packet_parse_headers. In case of SOCK_RAW and
unset skb->protocol, packet_parse_headers asks the driver to tell the
protocol number, and it's implemented for all Ethernet devices. As the
old function supported only Ethernet, no functionality is lost.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 net/packet/af_packet.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d1d89749a17a..e41bddf3d8bb 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2413,15 +2413,6 @@ static void tpacket_destruct_skb(struct sk_buff *skb)
 	sock_wfree(skb);
 }
 
-static void tpacket_set_protocol(const struct net_device *dev,
-				 struct sk_buff *skb)
-{
-	if (dev->type == ARPHRD_ETHER) {
-		skb_reset_mac_header(skb);
-		skb->protocol = eth_hdr(skb)->h_proto;
-	}
-}
-
 static int __packet_snd_vnet_parse(struct virtio_net_hdr *vnet_hdr, size_t len)
 {
 	if ((vnet_hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
@@ -2492,8 +2483,6 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 			return err;
 		if (!dev_validate_header(dev, skb->data, hdrlen))
 			return -EINVAL;
-		if (!skb->protocol)
-			tpacket_set_protocol(dev, skb);
 
 		data += hdrlen;
 		to_write -= hdrlen;
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/7] net/mlx5e: Remove the wrong assumption about transport offset
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (4 preceding siblings ...)
  2019-01-14 13:19 ` [PATCH 5/7] net/packet: Remove redundant skb->protocol set Maxim Mikityanskiy
@ 2019-01-14 13:19 ` Maxim Mikityanskiy
  2019-01-14 13:19 ` [PATCH 7/7] net/mlx5e: Trust kernel regarding " Maxim Mikityanskiy
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:19 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

skb_transport_offset() == 0 is not a special value. The only special
value is when skb->transport_header is ~0U, and it's checked by
skb_transport_header_was_set().

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 598ad7e4d5c9..563fdb1dd15e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -172,15 +172,8 @@ static inline u16 mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
 			hlen += VLAN_HLEN;
 		break;
 	case MLX5_INLINE_MODE_IP:
-		/* When transport header is set to zero, it means no transport
-		 * header. When transport header is set to 0xff's, it means
-		 * transport header wasn't set.
-		 */
-		if (skb_transport_offset(skb)) {
-			hlen = mlx5e_skb_l3_header_offset(skb);
-			break;
-		}
-		/* fall through */
+		hlen = mlx5e_skb_l3_header_offset(skb);
+		break;
 	case MLX5_INLINE_MODE_L2:
 	default:
 		hlen = mlx5e_skb_l2_header_offset(skb);
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 7/7] net/mlx5e: Trust kernel regarding transport offset
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (5 preceding siblings ...)
  2019-01-14 13:19 ` [PATCH 6/7] net/mlx5e: Remove the wrong assumption about transport offset Maxim Mikityanskiy
@ 2019-01-14 13:19 ` Maxim Mikityanskiy
  2019-01-14 13:52 ` [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
  2019-01-23 10:16 ` Maxim Mikityanskiy
  8 siblings, 0 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:19 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan, Maxim Mikityanskiy

After AF_PACKET is fixed to calculate the transport header offset
correctly, trust the value set by the kernel. If the offset wasn't set,
it means there is no transport header in the packet.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index 563fdb1dd15e..119bbbbe4ad0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -148,12 +148,8 @@ static inline int mlx5e_skb_l2_header_offset(struct sk_buff *skb)
 
 static inline int mlx5e_skb_l3_header_offset(struct sk_buff *skb)
 {
-	struct flow_keys keys;
-
 	if (skb_transport_header_was_set(skb))
 		return skb_transport_offset(skb);
-	else if (skb_flow_dissect_flow_keys(skb, &keys, 0))
-		return keys.control.thoff;
 	else
 		return mlx5e_skb_l2_header_offset(skb);
 }
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/7] AF_PACKET transport_offset fix
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (6 preceding siblings ...)
  2019-01-14 13:19 ` [PATCH 7/7] net/mlx5e: Trust kernel regarding " Maxim Mikityanskiy
@ 2019-01-14 13:52 ` Maxim Mikityanskiy
  2019-01-23 10:16 ` Maxim Mikityanskiy
  8 siblings, 0 replies; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-14 13:52 UTC (permalink / raw)
  To: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet
  Cc: netdev, Eran Ben Elisha, Tariq Toukan

I just realized that the window is still closed. I'll resubmit it when it opens. Sorry for the extra noise.

> -----Original Message-----
> From: Maxim Mikityanskiy
> Sent: 14 January, 2019 15:19
> To: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>
> Cc: netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq
> Toukan <tariqt@mellanox.com>; Maxim Mikityanskiy <maximmi@mellanox.com>
> Subject: [PATCH 0/7] AF_PACKET transport_offset fix
> 
> This patch series contains the implementation of the RFC that was posted
> on this mailing list previously:
> https://www.spinics.net/lists/netdev/msg541709.html
> 
> It fixes having incorrect skb->transport_header values in cases when
> dissect fails. Having correct values set by the kernel fixes mlx5
> operation and allows to remove some unnecessary code flows in mlx5.
> 
> Maxim Mikityanskiy (7):
>   net: Don't set transport offset to invalid value
>   net: Introduce parse_protocol header_ops callback
>   net/ethernet: Add parse_protocol header_ops support
>   net/packet: Ask driver for protocol if not provided by user
>   net/packet: Remove redundant skb->protocol set
>   net/mlx5e: Remove the wrong assumption about transport offset
>   net/mlx5e: Trust kernel regarding transport offset
> 
>  .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 15 ++---------
>  drivers/net/tap.c                             |  4 +--
>  drivers/net/tun.c                             |  4 +--
>  drivers/net/xen-netback/netback.c             | 19 ++++++++------
>  include/linux/etherdevice.h                   |  1 +
>  include/linux/netdevice.h                     | 10 +++++++
>  include/linux/skbuff.h                        | 14 +++++-----
>  net/ethernet/eth.c                            | 13 ++++++++++
>  net/packet/af_packet.c                        | 26 +++++++++----------
>  9 files changed, 60 insertions(+), 46 deletions(-)
> 
> --
> 2.19.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-14 13:18 ` [PATCH 1/7] net: Don't set transport offset to invalid value Maxim Mikityanskiy
@ 2019-01-14 16:49   ` Willem de Bruijn
  2019-01-17  9:10     ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-14 16:49 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Mon, Jan 14, 2019 at 8:20 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> If the socket was created with socket(AF_PACKET, SOCK_RAW, 0),
> skb->protocol will be unset, __skb_flow_dissect() will fail, and
> skb_probe_transport_header() will fall back to the offset_hint, making
> the resulting skb_transport_offset incorrect.
>
> If, however, there is no transport header in the packet,
> transport_header shouldn't be set to an arbitrary value.
>
> Fix it by leaving the transport offset unset if it couldn't be found, to
> be explicit rather than to fill it with some wrong value. It changes the
> behavior, but if some code relied on the old behavior, it would be
> broken anyway, as the old one is incorrect.
>
>
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> ---
>  drivers/net/tap.c                 |  4 ++--
>  drivers/net/tun.c                 |  4 ++--
>  drivers/net/xen-netback/netback.c | 19 +++++++++++--------
>  include/linux/skbuff.h            | 14 +++++++-------
>  net/packet/af_packet.c            |  6 +++---
>  5 files changed, 25 insertions(+), 22 deletions(-)

This is a lot of code change. This would do.

@@ -2434,8 +2434,6 @@ static inline void
skb_probe_transport_header(struct sk_buff *skb,

        if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
                skb_set_transport_header(skb, keys.control.thoff);
-       else
-               skb_set_transport_header(skb, offset_hint);
 }

Though leaving an unused argument is a bit ugly. For net-next, indeed
better to clean up (please mark your patchset with net or net-next,
btw)

>
> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
> index 443b2694130c..a35b44b13a34 100644
> --- a/drivers/net/tap.c
> +++ b/drivers/net/tap.c
> @@ -712,7 +712,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
>                         goto err_kfree;
>         }
>
> -       skb_probe_transport_header(skb, ETH_HLEN);
> +       skb_try_probe_transport_header(skb);
>
>         /* Move network header to the right position for VLAN tagged packets */
>         if ((skb->protocol == htons(ETH_P_8021Q) ||
> @@ -1177,7 +1177,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
>                         goto err_kfree;
>         }
>
> -       skb_probe_transport_header(skb, ETH_HLEN);
> +       skb_try_probe_transport_header(skb);
>
>         /* Move network header to the right position for VLAN tagged packets */
>         if ((skb->protocol == htons(ETH_P_8021Q) ||
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index a4fdad475594..f73a156379e6 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1927,7 +1927,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>         }
>
>         skb_reset_network_header(skb);
> -       skb_probe_transport_header(skb, 0);
> +       skb_try_probe_transport_header(skb);
>
>         if (skb_xdp) {
>                 struct bpf_prog *xdp_prog;
> @@ -2480,7 +2480,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>
>         skb->protocol = eth_type_trans(skb, tun->dev);
>         skb_reset_network_header(skb);
> -       skb_probe_transport_header(skb, 0);
> +       skb_try_probe_transport_header(skb);
>
>         if (skb_xdp) {
>                 err = do_xdp_generic(xdp_prog, skb);
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 80aae3a32c2a..b49b6e56ca47 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
>                 struct xen_netif_tx_request *txp;
>                 u16 pending_idx;
>                 unsigned data_len;
> +               bool th_set;
>
>                 pending_idx = XENVIF_TX_CB(skb)->pending_idx;
>                 txp = &queue->pending_tx_info[pending_idx].req;
> @@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
>                         continue;
>                 }
>
> -               skb_probe_transport_header(skb, 0);
> +               th_set = skb_try_probe_transport_header(skb);

Can use skb_transport_header_was_set(). Then at least there is no need
to change the function's return value.

Normally, a GSO packet will also have skb_transport_header set even
from packet sockets, due to skb_partial_csum_set called from
virtio_net_hdr_to_skb. But I don't think we can be sure that TSO
packets without checksum offload are dropped before reaching this
code.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user
  2019-01-14 13:19 ` [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user Maxim Mikityanskiy
@ 2019-01-14 16:52   ` Willem de Bruijn
  2019-01-15 16:32     ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-14 16:52 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Mon, Jan 14, 2019 at 8:20 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> If a socket was created with socket(AF_PACKET, SOCK_RAW, 0), the
> protocol number is unavailable. Try to ask the driver to extract it from
> the L2 header in order for skb_try_probe_transport_header to succeed.
>
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> ---
>  net/packet/af_packet.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 8fc76e68777a..d1d89749a17a 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1850,6 +1850,15 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev,
>         return 0;
>  }
>
> +static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
> +{
> +       if (!skb->protocol && sock->type == SOCK_RAW) {
> +               skb_reset_mac_header(skb);
> +               skb->protocol = dev_parse_header_protocol(skb);
> +       }
> +
> +       skb_try_probe_transport_header(skb);
> +}


In relation to the discussion at

  af_packet: fix raw sockets over 6in4 tunnel
  http://patchwork.ozlabs.org/patch/1023623/

if adding a new header_ops callback to parse link layer headers,
please have it return both protocol and link layer header length.

also, this information may be needed earlier in (t)packet_snd than the
transport header, so I would suggest not combining the two in a new
packet_parse_headers wrapper function.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user
  2019-01-14 16:52   ` Willem de Bruijn
@ 2019-01-15 16:32     ` Willem de Bruijn
  2019-01-17  9:10       ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-15 16:32 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Mon, Jan 14, 2019 at 11:52 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Mon, Jan 14, 2019 at 8:20 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
> >
> > If a socket was created with socket(AF_PACKET, SOCK_RAW, 0), the
> > protocol number is unavailable. Try to ask the driver to extract it from
> > the L2 header in order for skb_try_probe_transport_header to succeed.
> >
> > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > ---
> >  net/packet/af_packet.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> > index 8fc76e68777a..d1d89749a17a 100644
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -1850,6 +1850,15 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev,
> >         return 0;
> >  }
> >
> > +static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
> > +{
> > +       if (!skb->protocol && sock->type == SOCK_RAW) {
> > +               skb_reset_mac_header(skb);
> > +               skb->protocol = dev_parse_header_protocol(skb);
> > +       }
> > +
> > +       skb_try_probe_transport_header(skb);
> > +}
>
>
> In relation to the discussion at
>
>   af_packet: fix raw sockets over 6in4 tunnel
>   http://patchwork.ozlabs.org/patch/1023623/
>
> if adding a new header_ops callback to parse link layer headers,
> please have it return both protocol and link layer header length.

This could just be an extension of existing header_ops->validate.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user
  2019-01-15 16:32     ` Willem de Bruijn
@ 2019-01-17  9:10       ` Maxim Mikityanskiy
  2019-01-17 15:41         ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-17  9:10 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> > > +static void packet_parse_headers(struct sk_buff *skb, struct socket
> *sock)
> > > +{
> > > +       if (!skb->protocol && sock->type == SOCK_RAW) {
> > > +               skb_reset_mac_header(skb);
> > > +               skb->protocol = dev_parse_header_protocol(skb);
> > > +       }
> > > +
> > > +       skb_try_probe_transport_header(skb);
> > > +}
> >
> >
> > In relation to the discussion at
> >
> >   af_packet: fix raw sockets over 6in4 tunnel
> >   http://patchwork.ozlabs.org/patch/1023623/
> >
> > if adding a new header_ops callback to parse link layer headers,
> > please have it return both protocol and link layer header length.

Sorry, I miss the point here, can you elaborate more? If all you need is
to have some header_ops callback that returns the L2 header length,
there is one already, it's called parse. Or do you have a specific
reason why you want my callback to also return the header length?

> This could just be an extension of existing header_ops->validate.

If you suggest extending an existing function, parse looks more
suitable, but I decided not to touch the existing ones for two reasons:

1. I don't want to break the existing code that uses the parse function
and will need to be modified to pass an extra parameter.

2. I don't want to spend machine time on copying the destination MAC
when I only need the protocol, and vice versa.

I'm looking forward to hearing your thoughts about it.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-14 16:49   ` Willem de Bruijn
@ 2019-01-17  9:10     ` Maxim Mikityanskiy
  2019-01-17 15:16       ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-17  9:10 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> This is a lot of code change. This would do.
> 
> @@ -2434,8 +2434,6 @@ static inline void
> skb_probe_transport_header(struct sk_buff *skb,
> 
>         if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
>                 skb_set_transport_header(skb, keys.control.thoff);
> -       else
> -               skb_set_transport_header(skb, offset_hint);
>  }
> 
> Though leaving an unused argument is a bit ugly. For net-next, indeed
> better to clean up (please mark your patchset with net or net-next,
> btw)

It's for net-next (I'll resend with the correct mark), so I'll stick
with the current implementation.

> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> > index 80aae3a32c2a..b49b6e56ca47 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue
> *queue)
> >                 struct xen_netif_tx_request *txp;
> >                 u16 pending_idx;
> >                 unsigned data_len;
> > +               bool th_set;
> >
> >                 pending_idx = XENVIF_TX_CB(skb)->pending_idx;
> >                 txp = &queue->pending_tx_info[pending_idx].req;
> > @@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct xenvif_queue
> *queue)
> >                         continue;
> >                 }
> >
> > -               skb_probe_transport_header(skb, 0);
> > +               th_set = skb_try_probe_transport_header(skb);
> 
> Can use skb_transport_header_was_set(). Then at least there is no need
> to change the function's return value.

I suppose this comment relates to the previous one, and if we do it for
net-next, it's fine to make change I made, isn't it?

Thanks for reviewing.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-17  9:10     ` Maxim Mikityanskiy
@ 2019-01-17 15:16       ` Willem de Bruijn
  2019-01-23 10:16         ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-17 15:16 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Thu, Jan 17, 2019 at 4:10 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> > This is a lot of code change. This would do.
> >
> > @@ -2434,8 +2434,6 @@ static inline void
> > skb_probe_transport_header(struct sk_buff *skb,
> >
> >         if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
> >                 skb_set_transport_header(skb, keys.control.thoff);
> > -       else
> > -               skb_set_transport_header(skb, offset_hint);
> >  }
> >
> > Though leaving an unused argument is a bit ugly. For net-next, indeed
> > better to clean up (please mark your patchset with net or net-next,
> > btw)
>
> It's for net-next (I'll resend with the correct mark), so I'll stick
> with the current implementation.

Absolutely, sounds good.

> > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> > netback/netback.c
> > > index 80aae3a32c2a..b49b6e56ca47 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue
> > *queue)
> > >                 struct xen_netif_tx_request *txp;
> > >                 u16 pending_idx;
> > >                 unsigned data_len;
> > > +               bool th_set;
> > >
> > >                 pending_idx = XENVIF_TX_CB(skb)->pending_idx;
> > >                 txp = &queue->pending_tx_info[pending_idx].req;
> > > @@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct xenvif_queue
> > *queue)
> > >                         continue;
> > >                 }
> > >
> > > -               skb_probe_transport_header(skb, 0);
> > > +               th_set = skb_try_probe_transport_header(skb);
> >
> > Can use skb_transport_header_was_set(). Then at least there is no need
> > to change the function's return value.
>
> I suppose this comment relates to the previous one, and if we do it for
> net-next, it's fine to make change I made, isn't it?

If this is the only reason for the boolean return value, using
skb_transport_header_was_set() is more standard (I immediately know
what's happening when I read it), slightly less code change and avoids
introducing a situation where the majority of callers ignore a return
value. I think it's preferable. But these merits are certainly
debatable, so either is fine.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user
  2019-01-17  9:10       ` Maxim Mikityanskiy
@ 2019-01-17 15:41         ` Willem de Bruijn
  0 siblings, 0 replies; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-17 15:41 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Thu, Jan 17, 2019 at 4:10 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> > > > +static void packet_parse_headers(struct sk_buff *skb, struct socket
> > *sock)
> > > > +{
> > > > +       if (!skb->protocol && sock->type == SOCK_RAW) {
> > > > +               skb_reset_mac_header(skb);
> > > > +               skb->protocol = dev_parse_header_protocol(skb);
> > > > +       }
> > > > +
> > > > +       skb_try_probe_transport_header(skb);
> > > > +}
> > >
> > >
> > > In relation to the discussion at
> > >
> > >   af_packet: fix raw sockets over 6in4 tunnel
> > >   http://patchwork.ozlabs.org/patch/1023623/
> > >
> > > if adding a new header_ops callback to parse link layer headers,
> > > please have it return both protocol and link layer header length.
>
> Sorry, I miss the point here, can you elaborate more? If all you need is
> to have some header_ops callback that returns the L2 header length,
> there is one already, it's called parse. Or do you have a specific
> reason why you want my callback to also return the header length?

The main reason is to avoid multiple indirect function calls, both
essentially doing the same: parsing the ll header. But admittedly the
instances where dev->header_ops->validate are called are rare.

> > This could just be an extension of existing header_ops->validate.
>
> If you suggest extending an existing function, parse looks more
> suitable, but I decided not to touch the existing ones for two reasons:
>
> 1. I don't want to break the existing code that uses the parse function
> and will need to be modified to pass an extra parameter.
>
> 2. I don't want to spend machine time on copying the destination MAC
> when I only need the protocol, and vice versa.
>
> I'm looking forward to hearing your thoughts about it.

header_ops.parse is also a good candidate. As a matter of fact, parse
and validate could (eventually) probably be combined.

The only direct caller to header_ops.parse appears to be
dev_parse_header, so modifying its interface should be fairly
straightforward. Allowing a NULL haddr could avoid the address copy
cost if not needed. This does require modifying all implementations.
But from a quick scan, there appear to be only 8. And only 1 for
validate. So changing the implementation is quite acceptable. Another
issue, though, would be what to return as protocol if a header does
not encode that.

Given these non-trivial changes, if you prefer to just add the
dedicated new callback, that's fine. We can see independently whether
deduplication makes sense. With three ll header parse functions, I
think that it will be. But even if so, it is better to do so as a
stand-alone noop patch than combining refactoring and new features,
anyway.

Long story short, sounds good. Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-17 15:16       ` Willem de Bruijn
@ 2019-01-23 10:16         ` Maxim Mikityanskiy
  2019-01-23 14:11           ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-23 10:16 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> -----Original Message-----
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Sent: 17 January, 2019 17:16
> To: Maxim Mikityanskiy <maximmi@mellanox.com>
> Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq Toukan
> <tariqt@mellanox.com>
> Subject: Re: [PATCH 1/7] net: Don't set transport offset to invalid value
> 
> On Thu, Jan 17, 2019 at 4:10 AM Maxim Mikityanskiy <maximmi@mellanox.com>
> wrote:
> >
> > > This is a lot of code change. This would do.
> > >
> > > @@ -2434,8 +2434,6 @@ static inline void
> > > skb_probe_transport_header(struct sk_buff *skb,
> > >
> > >         if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0,
> 0))
> > >                 skb_set_transport_header(skb, keys.control.thoff);
> > > -       else
> > > -               skb_set_transport_header(skb, offset_hint);
> > >  }
> > >
> > > Though leaving an unused argument is a bit ugly. For net-next, indeed
> > > better to clean up (please mark your patchset with net or net-next,
> > > btw)
> >
> > It's for net-next (I'll resend with the correct mark), so I'll stick
> > with the current implementation.
> 
> Absolutely, sounds good.
> 
> > > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> > > netback/netback.c
> > > > index 80aae3a32c2a..b49b6e56ca47 100644
> > > > --- a/drivers/net/xen-netback/netback.c
> > > > +++ b/drivers/net/xen-netback/netback.c
> > > > @@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue
> > > *queue)
> > > >                 struct xen_netif_tx_request *txp;
> > > >                 u16 pending_idx;
> > > >                 unsigned data_len;
> > > > +               bool th_set;
> > > >
> > > >                 pending_idx = XENVIF_TX_CB(skb)->pending_idx;
> > > >                 txp = &queue->pending_tx_info[pending_idx].req;
> > > > @@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct
> xenvif_queue
> > > *queue)
> > > >                         continue;
> > > >                 }
> > > >
> > > > -               skb_probe_transport_header(skb, 0);
> > > > +               th_set = skb_try_probe_transport_header(skb);
> > >
> > > Can use skb_transport_header_was_set(). Then at least there is no need
> > > to change the function's return value.
> >
> > I suppose this comment relates to the previous one, and if we do it for
> > net-next, it's fine to make change I made, isn't it?
> 
> If this is the only reason for the boolean return value, using
> skb_transport_header_was_set() is more standard (I immediately know
> what's happening when I read it), slightly less code change and avoids
> introducing a situation where the majority of callers ignore a return
> value. I think it's preferable. But these merits are certainly
> debatable, so either is fine.

From my side, I wanted to avoid calling skb_transport_header_was_set
twice, so I made skb_try_probe_transport_header return whether it
succeeded or not. I think "try" in the function name indicates this idea
pretty clearly. This result status is pretty useful, it just happened
that it's not needed in many places, but the general idea is that we
report this status, so if you say that my version is also good for you,
I'll leave it as is. It was just a rationale for my decision.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 0/7] AF_PACKET transport_offset fix
  2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
                   ` (7 preceding siblings ...)
  2019-01-14 13:52 ` [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
@ 2019-01-23 10:16 ` Maxim Mikityanskiy
  2019-01-23 14:12   ` Willem de Bruijn
  8 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-23 10:16 UTC (permalink / raw)
  To: Willem de Bruijn, David S. Miller
  Cc: netdev, Saeed Mahameed, Eran Ben Elisha, Tariq Toukan,
	Jason Wang, Eric Dumazet

From my perspective, after the discussion we had with Willem, the
current version of this series can be merged to net-next.

Willem, do you approve it?

Thanks for reviewing!

> -----Original Message-----
> From: Maxim Mikityanskiy
> Sent: 14 January, 2019 15:19
> To: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>
> Cc: netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq
> Toukan <tariqt@mellanox.com>; Maxim Mikityanskiy <maximmi@mellanox.com>
> Subject: [PATCH 0/7] AF_PACKET transport_offset fix
> 
> This patch series contains the implementation of the RFC that was posted
> on this mailing list previously:
> https://www.spinics.net/lists/netdev/msg541709.html
> 
> It fixes having incorrect skb->transport_header values in cases when
> dissect fails. Having correct values set by the kernel fixes mlx5
> operation and allows to remove some unnecessary code flows in mlx5.
> 
> Maxim Mikityanskiy (7):
>   net: Don't set transport offset to invalid value
>   net: Introduce parse_protocol header_ops callback
>   net/ethernet: Add parse_protocol header_ops support
>   net/packet: Ask driver for protocol if not provided by user
>   net/packet: Remove redundant skb->protocol set
>   net/mlx5e: Remove the wrong assumption about transport offset
>   net/mlx5e: Trust kernel regarding transport offset
> 
>  .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 15 ++---------
>  drivers/net/tap.c                             |  4 +--
>  drivers/net/tun.c                             |  4 +--
>  drivers/net/xen-netback/netback.c             | 19 ++++++++------
>  include/linux/etherdevice.h                   |  1 +
>  include/linux/netdevice.h                     | 10 +++++++
>  include/linux/skbuff.h                        | 14 +++++-----
>  net/ethernet/eth.c                            | 13 ++++++++++
>  net/packet/af_packet.c                        | 26 +++++++++----------
>  9 files changed, 60 insertions(+), 46 deletions(-)
> 
> --
> 2.19.1


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-23 10:16         ` Maxim Mikityanskiy
@ 2019-01-23 14:11           ` Willem de Bruijn
  2019-01-24  9:47             ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-23 14:11 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Wed, Jan 23, 2019 at 5:17 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> > -----Original Message-----
> > From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > Sent: 17 January, 2019 17:16
> > To: Maxim Mikityanskiy <maximmi@mellanox.com>
> > Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> > <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> > <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> > netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq Toukan
> > <tariqt@mellanox.com>
> > Subject: Re: [PATCH 1/7] net: Don't set transport offset to invalid value
> >
> > On Thu, Jan 17, 2019 at 4:10 AM Maxim Mikityanskiy <maximmi@mellanox.com>
> > wrote:
> > >
> > > > This is a lot of code change. This would do.
> > > >
> > > > @@ -2434,8 +2434,6 @@ static inline void
> > > > skb_probe_transport_header(struct sk_buff *skb,
> > > >
> > > >         if (skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0,
> > 0))
> > > >                 skb_set_transport_header(skb, keys.control.thoff);
> > > > -       else
> > > > -               skb_set_transport_header(skb, offset_hint);
> > > >  }
> > > >
> > > > Though leaving an unused argument is a bit ugly. For net-next, indeed
> > > > better to clean up (please mark your patchset with net or net-next,
> > > > btw)
> > >
> > > It's for net-next (I'll resend with the correct mark), so I'll stick
> > > with the current implementation.
> >
> > Absolutely, sounds good.
> >
> > > > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> > > > netback/netback.c
> > > > > index 80aae3a32c2a..b49b6e56ca47 100644
> > > > > --- a/drivers/net/xen-netback/netback.c
> > > > > +++ b/drivers/net/xen-netback/netback.c
> > > > > @@ -1105,6 +1105,7 @@ static int xenvif_tx_submit(struct xenvif_queue
> > > > *queue)
> > > > >                 struct xen_netif_tx_request *txp;
> > > > >                 u16 pending_idx;
> > > > >                 unsigned data_len;
> > > > > +               bool th_set;
> > > > >
> > > > >                 pending_idx = XENVIF_TX_CB(skb)->pending_idx;
> > > > >                 txp = &queue->pending_tx_info[pending_idx].req;
> > > > > @@ -1169,20 +1170,22 @@ static int xenvif_tx_submit(struct
> > xenvif_queue
> > > > *queue)
> > > > >                         continue;
> > > > >                 }
> > > > >
> > > > > -               skb_probe_transport_header(skb, 0);
> > > > > +               th_set = skb_try_probe_transport_header(skb);
> > > >
> > > > Can use skb_transport_header_was_set(). Then at least there is no need
> > > > to change the function's return value.
> > >
> > > I suppose this comment relates to the previous one, and if we do it for
> > > net-next, it's fine to make change I made, isn't it?
> >
> > If this is the only reason for the boolean return value, using
> > skb_transport_header_was_set() is more standard (I immediately know
> > what's happening when I read it), slightly less code change and avoids
> > introducing a situation where the majority of callers ignore a return
> > value. I think it's preferable. But these merits are certainly
> > debatable, so either is fine.
>
> From my side, I wanted to avoid calling skb_transport_header_was_set
> twice,  so I made skb_try_probe_transport_header return whether it
> succeeded or not. I think "try" in the function name indicates this idea
> pretty clearly. This result status is pretty useful, it just happened
> that it's not needed in many places,

Which is an indication that it's perhaps not needed.

> but the general idea is that we
> report this status, so if you say that my version is also good for you,
> I'll leave it as is. It was just a rationale for my decision.

It's fine. But please avoid the code churn in xenvif_tx_submit
with to extra indentation. This is equivalent:

-               if (skb_is_gso(skb)) {
+               if (skb_is_gso(skb) && th_set) {

More fundamentally, the code has the assumption that th_set
always holds if skb_is_gso(skb). Why add the check? This is
another example that the return value is not really needed.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/7] AF_PACKET transport_offset fix
  2019-01-23 10:16 ` Maxim Mikityanskiy
@ 2019-01-23 14:12   ` Willem de Bruijn
  0 siblings, 0 replies; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-23 14:12 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Willem de Bruijn, David S. Miller, netdev, Saeed Mahameed,
	Eran Ben Elisha, Tariq Toukan, Jason Wang, Eric Dumazet

On Wed, Jan 23, 2019 at 5:19 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> From my perspective, after the discussion we had with Willem, the
> current version of this series can be merged to net-next.

net-next has moved forward. The patchset as is no longer applies cleanly.


> Willem, do you approve it?
>
> Thanks for reviewing!
>
> > -----Original Message-----
> > From: Maxim Mikityanskiy
> > Sent: 14 January, 2019 15:19
> > To: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> > <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> > <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>
> > Cc: netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq
> > Toukan <tariqt@mellanox.com>; Maxim Mikityanskiy <maximmi@mellanox.com>
> > Subject: [PATCH 0/7] AF_PACKET transport_offset fix
> >
> > This patch series contains the implementation of the RFC that was posted
> > on this mailing list previously:
> > https://www.spinics.net/lists/netdev/msg541709.html
> >
> > It fixes having incorrect skb->transport_header values in cases when
> > dissect fails. Having correct values set by the kernel fixes mlx5
> > operation and allows to remove some unnecessary code flows in mlx5.
> >
> > Maxim Mikityanskiy (7):
> >   net: Don't set transport offset to invalid value
> >   net: Introduce parse_protocol header_ops callback
> >   net/ethernet: Add parse_protocol header_ops support
> >   net/packet: Ask driver for protocol if not provided by user
> >   net/packet: Remove redundant skb->protocol set
> >   net/mlx5e: Remove the wrong assumption about transport offset
> >   net/mlx5e: Trust kernel regarding transport offset
> >
> >  .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 15 ++---------
> >  drivers/net/tap.c                             |  4 +--
> >  drivers/net/tun.c                             |  4 +--
> >  drivers/net/xen-netback/netback.c             | 19 ++++++++------
> >  include/linux/etherdevice.h                   |  1 +
> >  include/linux/netdevice.h                     | 10 +++++++
> >  include/linux/skbuff.h                        | 14 +++++-----
> >  net/ethernet/eth.c                            | 13 ++++++++++
> >  net/packet/af_packet.c                        | 26 +++++++++----------
> >  9 files changed, 60 insertions(+), 46 deletions(-)
> >
> > --
> > 2.19.1
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-14 13:19 ` [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support Maxim Mikityanskiy
@ 2019-01-23 14:14   ` Willem de Bruijn
  2019-01-24  9:47     ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-23 14:14 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Mon, Jan 14, 2019 at 8:21 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> The previous commit introduced parse_protocol callback which should
> extract the protocol number from the L2 header. Make all Ethernet
> devices support it.
>
> Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> ---
>  include/linux/etherdevice.h |  1 +
>  net/ethernet/eth.c          | 13 +++++++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> index 2c0af7b00715..e2f3b21cd72a 100644
> --- a/include/linux/etherdevice.h
> +++ b/include/linux/etherdevice.h
> @@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh, struct hh_cache *hh,
>                      __be16 type);
>  void eth_header_cache_update(struct hh_cache *hh, const struct net_device *dev,
>                              const unsigned char *haddr);
> +__be16 eth_header_parse_protocol(const struct sk_buff *skb);

Does not need to be exposed in the header file or exported.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-23 14:11           ` Willem de Bruijn
@ 2019-01-24  9:47             ` Maxim Mikityanskiy
  2019-01-24 14:19               ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-24  9:47 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> > but the general idea is that we
> > report this status, so if you say that my version is also good for you,
> > I'll leave it as is. It was just a rationale for my decision.
> 
> It's fine. But please avoid the code churn in xenvif_tx_submit
> with to extra indentation. This is equivalent:
> 
> -               if (skb_is_gso(skb)) {
> +               if (skb_is_gso(skb) && th_set) {
> 
> More fundamentally, the code has the assumption that th_set
> always holds if skb_is_gso(skb). Why add the check? This is
> another example that the return value is not really needed.

What about

if (skb_is_gso(skb)) {
	BUG_ON(!skb_transport_header_was_set(skb));

?

I think it's cleaner than skipping the action here if dissect failed and
propagating a potential bug further.

> > > If this is the only reason for the boolean return value, using
> > > skb_transport_header_was_set() is more standard (I immediately know
> > > what's happening when I read it), slightly less code change and avoids
> > > introducing a situation where the majority of callers ignore a return
> > > value. I think it's preferable. But these merits are certainly
> > > debatable, so either is fine.
> >
> > From my side, I wanted to avoid calling skb_transport_header_was_set
> > twice,  so I made skb_try_probe_transport_header return whether it
> > succeeded or not. I think "try" in the function name indicates this idea
> > pretty clearly. This result status is pretty useful, it just happened
> > that it's not needed in many places,
> 
> Which is an indication that it's perhaps not needed.

Well, from the point of view of the function, it looks reasonable to
notify the caller whether the call was successful or not... You know,
many functions return error codes.

However, this case is rather special, because it turned out that we
don't actually need that error status immediately, but it can be
requested much later, and we already have skb_transport_header_was_set
for that.

So, considering these points, the return value can be removed, as all
use cases are "probe now, check later", not "probe now and handle errors
immediately".

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-23 14:14   ` Willem de Bruijn
@ 2019-01-24  9:47     ` Maxim Mikityanskiy
  2019-01-24 14:21       ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-24  9:47 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> -----Original Message-----
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Sent: 23 January, 2019 16:15
> To: Maxim Mikityanskiy <maximmi@mellanox.com>
> Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq Toukan
> <tariqt@mellanox.com>
> Subject: Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
> 
> On Mon, Jan 14, 2019 at 8:21 AM Maxim Mikityanskiy <maximmi@mellanox.com>
> wrote:
> >
> > The previous commit introduced parse_protocol callback which should
> > extract the protocol number from the L2 header. Make all Ethernet
> > devices support it.
> >
> > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > ---
> >  include/linux/etherdevice.h |  1 +
> >  net/ethernet/eth.c          | 13 +++++++++++++
> >  2 files changed, 14 insertions(+)
> >
> > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> > index 2c0af7b00715..e2f3b21cd72a 100644
> > --- a/include/linux/etherdevice.h
> > +++ b/include/linux/etherdevice.h
> > @@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh,
> struct hh_cache *hh,
> >                      __be16 type);
> >  void eth_header_cache_update(struct hh_cache *hh, const struct net_device
> *dev,
> >                              const unsigned char *haddr);
> > +__be16 eth_header_parse_protocol(const struct sk_buff *skb);
> 
> Does not need to be exposed in the header file or exported.

Are you sure? All the other Ethernet header_ops callbacks are exported
and declared in the header. I'm not sure about the reason why it is done
in such a way, but my guess is that it will be useful if some driver
decides to replace one callback in header_ops but to use the default
ones for the rest of callbacks. If the default callbacks were not
accessible externally, it wouldn't be possible. So, I'm following the
existing pattern here, and there are reasons for that. What do you think
about it?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/7] net: Don't set transport offset to invalid value
  2019-01-24  9:47             ` Maxim Mikityanskiy
@ 2019-01-24 14:19               ` Willem de Bruijn
  0 siblings, 0 replies; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-24 14:19 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Thu, Jan 24, 2019 at 4:48 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> > > but the general idea is that we
> > > report this status, so if you say that my version is also good for you,
> > > I'll leave it as is. It was just a rationale for my decision.
> >
> > It's fine. But please avoid the code churn in xenvif_tx_submit
> > with to extra indentation. This is equivalent:
> >
> > -               if (skb_is_gso(skb)) {
> > +               if (skb_is_gso(skb) && th_set) {
> >
> > More fundamentally, the code has the assumption that th_set
> > always holds if skb_is_gso(skb). Why add the check? This is
> > another example that the return value is not really needed.
>
> What about
>
> if (skb_is_gso(skb)) {
>         BUG_ON(!skb_transport_header_was_set(skb));
>
> ?

Good idea to validate. Please on error use WARN_ON_ONCE and free the
packet as with other errors in this function. BUG_ONs are problematic
when a path to them is found.

The other option is to stay as close to the current logic as possible
in this driver and set the mac header at offset 0 if probe fails.
Either through the return value or skb_transport_header_was_set.

> I think it's cleaner than skipping the action here if dissect failed and
> propagating a potential bug further.
>
> > > > If this is the only reason for the boolean return value, using
> > > > skb_transport_header_was_set() is more standard (I immediately know
> > > > what's happening when I read it), slightly less code change and avoids
> > > > introducing a situation where the majority of callers ignore a return
> > > > value. I think it's preferable. But these merits are certainly
> > > > debatable, so either is fine.
> > >
> > > From my side, I wanted to avoid calling skb_transport_header_was_set
> > > twice,  so I made skb_try_probe_transport_header return whether it
> > > succeeded or not. I think "try" in the function name indicates this idea
> > > pretty clearly. This result status is pretty useful, it just happened
> > > that it's not needed in many places,
> >
> > Which is an indication that it's perhaps not needed.
>
> Well, from the point of view of the function, it looks reasonable to
> notify the caller whether the call was successful or not... You know,
> many functions return error codes.
>
> However, this case is rather special, because it turned out that we
> don't actually need that error status immediately, but it can be
> requested much later, and we already have skb_transport_header_was_set
> for that.
>
> So, considering these points, the return value can be removed, as all
> use cases are "probe now, check later", not "probe now and handle errors
> immediately".

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-24  9:47     ` Maxim Mikityanskiy
@ 2019-01-24 14:21       ` Willem de Bruijn
  2019-01-25  8:51         ` Maxim Mikityanskiy
  0 siblings, 1 reply; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-24 14:21 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

On Thu, Jan 24, 2019 at 4:48 AM Maxim Mikityanskiy <maximmi@mellanox.com> wrote:
>
> > -----Original Message-----
> > From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > Sent: 23 January, 2019 16:15
> > To: Maxim Mikityanskiy <maximmi@mellanox.com>
> > Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> > <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> > <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> > netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq Toukan
> > <tariqt@mellanox.com>
> > Subject: Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
> >
> > On Mon, Jan 14, 2019 at 8:21 AM Maxim Mikityanskiy <maximmi@mellanox.com>
> > wrote:
> > >
> > > The previous commit introduced parse_protocol callback which should
> > > extract the protocol number from the L2 header. Make all Ethernet
> > > devices support it.
> > >
> > > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > ---
> > >  include/linux/etherdevice.h |  1 +
> > >  net/ethernet/eth.c          | 13 +++++++++++++
> > >  2 files changed, 14 insertions(+)
> > >
> > > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> > > index 2c0af7b00715..e2f3b21cd72a 100644
> > > --- a/include/linux/etherdevice.h
> > > +++ b/include/linux/etherdevice.h
> > > @@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh,
> > struct hh_cache *hh,
> > >                      __be16 type);
> > >  void eth_header_cache_update(struct hh_cache *hh, const struct net_device
> > *dev,
> > >                              const unsigned char *haddr);
> > > +__be16 eth_header_parse_protocol(const struct sk_buff *skb);
> >
> > Does not need to be exposed in the header file or exported.
>
> Are you sure? All the other Ethernet header_ops callbacks are exported
> and declared in the header. I'm not sure about the reason why it is done
> in such a way, but my guess is that it will be useful if some driver
> decides to replace one callback in header_ops but to use the default
> ones for the rest of callbacks.

I don't exactly follow this. But I think that many are exported
because Ethernet is so common that of these are also called directly
instead of through header_ops. Looking at other header_ops
implementations, or other such callback structs, shows many examples
where the members are static local functions.

> If the default callbacks were not
> accessible externally, it wouldn't be possible. So, I'm following the
> existing pattern here, and there are reasons for that. What do you think
> about it?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* RE: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-24 14:21       ` Willem de Bruijn
@ 2019-01-25  8:51         ` Maxim Mikityanskiy
  2019-01-25 13:52           ` Willem de Bruijn
  0 siblings, 1 reply; 27+ messages in thread
From: Maxim Mikityanskiy @ 2019-01-25  8:51 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> -----Original Message-----
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Sent: 24 January, 2019 16:22
> To: Maxim Mikityanskiy <maximmi@mellanox.com>
> Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq Toukan
> <tariqt@mellanox.com>
> Subject: Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
> 
> On Thu, Jan 24, 2019 at 4:48 AM Maxim Mikityanskiy <maximmi@mellanox.com>
> wrote:
> >
> > > -----Original Message-----
> > > From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > > Sent: 23 January, 2019 16:15
> > > To: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > Cc: David S. Miller <davem@davemloft.net>; Saeed Mahameed
> > > <saeedm@mellanox.com>; Willem de Bruijn <willemb@google.com>; Jason Wang
> > > <jasowang@redhat.com>; Eric Dumazet <edumazet@google.com>;
> > > netdev@vger.kernel.org; Eran Ben Elisha <eranbe@mellanox.com>; Tariq
> Toukan
> > > <tariqt@mellanox.com>
> > > Subject: Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops
> support
> > >
> > > On Mon, Jan 14, 2019 at 8:21 AM Maxim Mikityanskiy
> <maximmi@mellanox.com>
> > > wrote:
> > > >
> > > > The previous commit introduced parse_protocol callback which should
> > > > extract the protocol number from the L2 header. Make all Ethernet
> > > > devices support it.
> > > >
> > > > Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
> > > > ---
> > > >  include/linux/etherdevice.h |  1 +
> > > >  net/ethernet/eth.c          | 13 +++++++++++++
> > > >  2 files changed, 14 insertions(+)
> > > >
> > > > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> > > > index 2c0af7b00715..e2f3b21cd72a 100644
> > > > --- a/include/linux/etherdevice.h
> > > > +++ b/include/linux/etherdevice.h
> > > > @@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh,
> > > struct hh_cache *hh,
> > > >                      __be16 type);
> > > >  void eth_header_cache_update(struct hh_cache *hh, const struct
> net_device
> > > *dev,
> > > >                              const unsigned char *haddr);
> > > > +__be16 eth_header_parse_protocol(const struct sk_buff *skb);
> > >
> > > Does not need to be exposed in the header file or exported.
> >
> > Are you sure? All the other Ethernet header_ops callbacks are exported
> > and declared in the header. I'm not sure about the reason why it is done
> > in such a way, but my guess is that it will be useful if some driver
> > decides to replace one callback in header_ops but to use the default
> > ones for the rest of callbacks.
> 
> I don't exactly follow this. But I think that many are exported
> because Ethernet is so common that of these are also called directly
> instead of through header_ops. Looking at other header_ops
> implementations, or other such callback structs, shows many examples
> where the members are static local functions.

Yes, they are called directly indeed, but not all of them. E.g.,
eth_header_parse is never called directly. On the other hand, look at
drivers/net/macvlan.c:

static const struct header_ops macvlan_hard_header_ops = {
        .create         = macvlan_hard_header,
        .parse          = eth_header_parse,
        .cache          = eth_header_cache,
        .cache_update   = eth_header_cache_update,
};

This is exactly what I am talking about. In order to support it,
eth_header_parse_protocol needs to be exported. BTW, we should consider
adding it to macvlan_hard_header_ops, ipvlan_header_ops and all other
such structures.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support
  2019-01-25  8:51         ` Maxim Mikityanskiy
@ 2019-01-25 13:52           ` Willem de Bruijn
  0 siblings, 0 replies; 27+ messages in thread
From: Willem de Bruijn @ 2019-01-25 13:52 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: David S. Miller, Saeed Mahameed, Willem de Bruijn, Jason Wang,
	Eric Dumazet, netdev, Eran Ben Elisha, Tariq Toukan

> > > > > diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> > > > > index 2c0af7b00715..e2f3b21cd72a 100644
> > > > > --- a/include/linux/etherdevice.h
> > > > > +++ b/include/linux/etherdevice.h
> > > > > @@ -44,6 +44,7 @@ int eth_header_cache(const struct neighbour *neigh,
> > > > struct hh_cache *hh,
> > > > >                      __be16 type);
> > > > >  void eth_header_cache_update(struct hh_cache *hh, const struct
> > net_device
> > > > *dev,
> > > > >                              const unsigned char *haddr);
> > > > > +__be16 eth_header_parse_protocol(const struct sk_buff *skb);
> > > >
> > > > Does not need to be exposed in the header file or exported.
> > >
> > > Are you sure? All the other Ethernet header_ops callbacks are exported
> > > and declared in the header. I'm not sure about the reason why it is done
> > > in such a way, but my guess is that it will be useful if some driver
> > > decides to replace one callback in header_ops but to use the default
> > > ones for the rest of callbacks.
> >
> > I don't exactly follow this. But I think that many are exported
> > because Ethernet is so common that of these are also called directly
> > instead of through header_ops. Looking at other header_ops
> > implementations, or other such callback structs, shows many examples
> > where the members are static local functions.
>
> Yes, they are called directly indeed, but not all of them. E.g.,
> eth_header_parse is never called directly. On the other hand, look at
> drivers/net/macvlan.c:
>
> static const struct header_ops macvlan_hard_header_ops = {
>         .create         = macvlan_hard_header,
>         .parse          = eth_header_parse,
>         .cache          = eth_header_cache,
>         .cache_update   = eth_header_cache_update,
> };
>
> This is exactly what I am talking about. In order to support it,
> eth_header_parse_protocol needs to be exported. BTW, we should consider
> adding it to macvlan_hard_header_ops, ipvlan_header_ops and all other
> such structures.

Very good point. Okay, export it is then.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-01-25 13:53 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-14 13:18 [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
2019-01-14 13:18 ` [PATCH 1/7] net: Don't set transport offset to invalid value Maxim Mikityanskiy
2019-01-14 16:49   ` Willem de Bruijn
2019-01-17  9:10     ` Maxim Mikityanskiy
2019-01-17 15:16       ` Willem de Bruijn
2019-01-23 10:16         ` Maxim Mikityanskiy
2019-01-23 14:11           ` Willem de Bruijn
2019-01-24  9:47             ` Maxim Mikityanskiy
2019-01-24 14:19               ` Willem de Bruijn
2019-01-14 13:18 ` [PATCH 2/7] net: Introduce parse_protocol header_ops callback Maxim Mikityanskiy
2019-01-14 13:19 ` [PATCH 3/7] net/ethernet: Add parse_protocol header_ops support Maxim Mikityanskiy
2019-01-23 14:14   ` Willem de Bruijn
2019-01-24  9:47     ` Maxim Mikityanskiy
2019-01-24 14:21       ` Willem de Bruijn
2019-01-25  8:51         ` Maxim Mikityanskiy
2019-01-25 13:52           ` Willem de Bruijn
2019-01-14 13:19 ` [PATCH 4/7] net/packet: Ask driver for protocol if not provided by user Maxim Mikityanskiy
2019-01-14 16:52   ` Willem de Bruijn
2019-01-15 16:32     ` Willem de Bruijn
2019-01-17  9:10       ` Maxim Mikityanskiy
2019-01-17 15:41         ` Willem de Bruijn
2019-01-14 13:19 ` [PATCH 5/7] net/packet: Remove redundant skb->protocol set Maxim Mikityanskiy
2019-01-14 13:19 ` [PATCH 6/7] net/mlx5e: Remove the wrong assumption about transport offset Maxim Mikityanskiy
2019-01-14 13:19 ` [PATCH 7/7] net/mlx5e: Trust kernel regarding " Maxim Mikityanskiy
2019-01-14 13:52 ` [PATCH 0/7] AF_PACKET transport_offset fix Maxim Mikityanskiy
2019-01-23 10:16 ` Maxim Mikityanskiy
2019-01-23 14:12   ` Willem de Bruijn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.