netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel
@ 2019-02-23 21:05 Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 02/65] xfrm: refine validation of template and selector families Sasha Levin
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Su Yanjun, Steffen Klassert, Sasha Levin, netdev

From: Su Yanjun <suyj.fnst@cn.fujitsu.com>

[ Upstream commit dd9ee3444014e8f28c0eefc9fffc9ac9c5248c12 ]

Recently we run a network test over ipcomp virtual tunnel.We find that
if a ipv4 packet needs fragment, then the peer can't receive
it.

We deep into the code and find that when packet need fragment the smaller
fragment will be encapsulated by ipip not ipcomp. So when the ipip packet
goes into xfrm, it's skb->dev is not properly set. The ipv4 reassembly code
always set skb'dev to the last fragment's dev. After ipv4 defrag processing,
when the kernel rp_filter parameter is set, the skb will be drop by -EXDEV
error.

This patch adds compatible support for the ipip process in ipcomp virtual tunnel.

Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/ip_vti.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 7f56944b020f6..40a7cd56e0087 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -74,6 +74,33 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 	return 0;
 }
 
+static int vti_input_ipip(struct sk_buff *skb, int nexthdr, __be32 spi,
+		     int encap_type)
+{
+	struct ip_tunnel *tunnel;
+	const struct iphdr *iph = ip_hdr(skb);
+	struct net *net = dev_net(skb->dev);
+	struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+
+	tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+				  iph->saddr, iph->daddr, 0);
+	if (tunnel) {
+		if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
+			goto drop;
+
+		XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = tunnel;
+
+		skb->dev = tunnel->dev;
+
+		return xfrm_input(skb, nexthdr, spi, encap_type);
+	}
+
+	return -EINVAL;
+drop:
+	kfree_skb(skb);
+	return 0;
+}
+
 static int vti_rcv(struct sk_buff *skb)
 {
 	XFRM_SPI_SKB_CB(skb)->family = AF_INET;
@@ -82,6 +109,14 @@ static int vti_rcv(struct sk_buff *skb)
 	return vti_input(skb, ip_hdr(skb)->protocol, 0, 0);
 }
 
+static int vti_rcv_ipip(struct sk_buff *skb)
+{
+	XFRM_SPI_SKB_CB(skb)->family = AF_INET;
+	XFRM_SPI_SKB_CB(skb)->daddroff = offsetof(struct iphdr, daddr);
+
+	return vti_input_ipip(skb, ip_hdr(skb)->protocol, ip_hdr(skb)->saddr, 0);
+}
+
 static int vti_rcv_cb(struct sk_buff *skb, int err)
 {
 	unsigned short family;
@@ -435,6 +470,12 @@ static struct xfrm4_protocol vti_ipcomp4_protocol __read_mostly = {
 	.priority	=	100,
 };
 
+static struct xfrm_tunnel ipip_handler __read_mostly = {
+	.handler	=	vti_rcv_ipip,
+	.err_handler	=	vti4_err,
+	.priority	=	0,
+};
+
 static int __net_init vti_init_net(struct net *net)
 {
 	int err;
@@ -603,6 +644,13 @@ static int __init vti_init(void)
 	if (err < 0)
 		goto xfrm_proto_comp_failed;
 
+	msg = "ipip tunnel";
+	err = xfrm4_tunnel_register(&ipip_handler, AF_INET);
+	if (err < 0) {
+		pr_info("%s: cant't register tunnel\n",__func__);
+		goto xfrm_tunnel_failed;
+	}
+
 	msg = "netlink interface";
 	err = rtnl_link_register(&vti_link_ops);
 	if (err < 0)
@@ -612,6 +660,8 @@ static int __init vti_init(void)
 
 rtnl_link_failed:
 	xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP);
+xfrm_tunnel_failed:
+	xfrm4_tunnel_deregister(&ipip_handler, AF_INET);
 xfrm_proto_comp_failed:
 	xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH);
 xfrm_proto_ah_failed:
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 02/65] xfrm: refine validation of template and selector families
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
@ 2019-02-23 21:05 ` Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 03/65] xfrm: Make set-mark default behavior backward compatible Sasha Levin
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Steffen Klassert, Sasha Levin, netdev

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 35e6103861a3a970de6c84688c6e7a1f65b164ca ]

The check assumes that in transport mode, the first templates family
must match the address family of the policy selector.

Syzkaller managed to build a template using MODE_ROUTEOPTIMIZATION,
with ipv4-in-ipv6 chain, leading to following splat:

BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x1db/0x1854
Read of size 4 at addr ffff888063e57aa0 by task a.out/2050
 xfrm_state_find+0x1db/0x1854
 xfrm_tmpl_resolve+0x100/0x1d0
 xfrm_resolve_and_create_bundle+0x108/0x1000 [..]

Problem is that addresses point into flowi4 struct, but xfrm_state_find
treats them as being ipv6 because it uses templ->encap_family is used
(AF_INET6 in case of reproducer) rather than family (AF_INET).

This patch inverts the logic: Enforce 'template family must match
selector' EXCEPT for tunnel and BEET mode.

In BEET and Tunnel mode, xfrm_tmpl_resolve_one will have remote/local
address pointers changed to point at the addresses found in the template,
rather than the flowi ones, so no oob read will occur.

Reported-by: 3ntr0py1337@gmail.com
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_user.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 566919838d5e8..ab557827aac05 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1488,10 +1488,15 @@ static int validate_tmpl(int nr, struct xfrm_user_tmpl *ut, u16 family)
 		if (!ut[i].family)
 			ut[i].family = family;
 
-		if ((ut[i].mode == XFRM_MODE_TRANSPORT) &&
-		    (ut[i].family != prev_family))
-			return -EINVAL;
-
+		switch (ut[i].mode) {
+		case XFRM_MODE_TUNNEL:
+		case XFRM_MODE_BEET:
+			break;
+		default:
+			if (ut[i].family != prev_family)
+				return -EINVAL;
+			break;
+		}
 		if (ut[i].mode >= XFRM_MODE_MAX)
 			return -EINVAL;
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 03/65] xfrm: Make set-mark default behavior backward compatible
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 02/65] xfrm: refine validation of template and selector families Sasha Levin
@ 2019-02-23 21:05 ` Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 04/65] netfilter: nft_compat: use refcnt_t type for nft_xt reference count Sasha Levin
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Benedict Wong, Steffen Klassert, Sasha Levin, netdev

From: Benedict Wong <benedictwong@google.com>

[ Upstream commit e2612cd496e7b465711d219ea6118893d7253f52 ]

Fixes 9b42c1f179a6, which changed the default route lookup behavior for
tunnel mode SAs in the outbound direction to use the skb mark, whereas
previously mark=0 was used if the output mark was unspecified. In
mark-based routing schemes such as Android’s, this change in default
behavior causes routing loops or lookup failures.

This patch restores the default behavior of using a 0 mark while still
incorporating the skb mark if the SET_MARK (and SET_MARK_MASK) is
specified.

Tested with additions to Android's kernel unit test suite:
https://android-review.googlesource.com/c/kernel/tests/+/860150

Fixes: 9b42c1f179a6 ("xfrm: Extend the output_mark to support input direction and masking")
Signed-off-by: Benedict Wong <benedictwong@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/xfrm/xfrm_policy.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 119a427d9b2b2..6ea8036fcdbeb 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1628,7 +1628,10 @@ static struct dst_entry *xfrm_bundle_create(struct xfrm_policy *policy,
 		dst_copy_metrics(dst1, dst);
 
 		if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) {
-			__u32 mark = xfrm_smark_get(fl->flowi_mark, xfrm[i]);
+			__u32 mark = 0;
+
+			if (xfrm[i]->props.smark.v || xfrm[i]->props.smark.m)
+				mark = xfrm_smark_get(fl->flowi_mark, xfrm[i]);
 
 			family = xfrm[i]->props.family;
 			dst = xfrm_dst_lookup(xfrm[i], tos, fl->flowi_oif,
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 04/65] netfilter: nft_compat: use refcnt_t type for nft_xt reference count
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 02/65] xfrm: refine validation of template and selector families Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 03/65] xfrm: Make set-mark default behavior backward compatible Sasha Levin
@ 2019-02-23 21:05 ` Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 05/65] netfilter: nft_compat: make lists per netns Sasha Levin
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Pablo Neira Ayuso, Sasha Levin,
	netfilter-devel, coreteam, netdev

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 12c44aba6618b7f6c437076e5722237190f6cd5f ]

Using standard integer type was fine while all operations on it were
guarded by the nftnl subsys mutex.

This isn't true anymore:
1. transactions are guarded only by a pernet mutex, so concurrent
   rule manipulation in different netns is racy
2. the ->destroy hook runs from a work queue after the transaction
   mutex has been released already.

cpu0                           cpu1 (net 1)        cpu2 (net 2)
 kworker
    nft_compat->destroy        nft_compat->init    nft_compat->init
      if (--nft_xt->ref == 0)   nft_xt->ref++        nft_xt->ref++

Switch to refcount_t.  Doing this however only fixes a minor aspect,
nft_compat also performs linked-list operations in an unsafe way.

This is addressed in the next two patches.

Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Fixes: 0935d5588400 ("netfilter: nf_tables: asynchronous release")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_compat.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 29d6fc73caf99..3e629ee59ddab 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -26,7 +26,7 @@
 struct nft_xt {
 	struct list_head	head;
 	struct nft_expr_ops	ops;
-	unsigned int		refcnt;
+	refcount_t		refcnt;
 
 	/* Unlike other expressions, ops doesn't have static storage duration.
 	 * nft core assumes they do.  We use kfree_rcu so that nft core can
@@ -45,7 +45,7 @@ struct nft_xt_match_priv {
 
 static bool nft_xt_put(struct nft_xt *xt)
 {
-	if (--xt->refcnt == 0) {
+	if (refcount_dec_and_test(&xt->refcnt)) {
 		list_del(&xt->head);
 		kfree_rcu(xt, rcu_head);
 		return true;
@@ -273,7 +273,7 @@ nft_target_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 		return -EINVAL;
 
 	nft_xt = container_of(expr->ops, struct nft_xt, ops);
-	nft_xt->refcnt++;
+	refcount_inc(&nft_xt->refcnt);
 	return 0;
 }
 
@@ -467,7 +467,7 @@ __nft_match_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 		return ret;
 
 	nft_xt = container_of(expr->ops, struct nft_xt, ops);
-	nft_xt->refcnt++;
+	refcount_inc(&nft_xt->refcnt);
 	return 0;
 }
 
@@ -769,7 +769,7 @@ nft_match_select_ops(const struct nft_ctx *ctx,
 		goto err;
 	}
 
-	nft_match->refcnt = 0;
+	refcount_set(&nft_match->refcnt, 0);
 	nft_match->ops.type = &nft_match_type;
 	nft_match->ops.eval = nft_match_eval;
 	nft_match->ops.init = nft_match_init;
@@ -873,7 +873,7 @@ nft_target_select_ops(const struct nft_ctx *ctx,
 		goto err;
 	}
 
-	nft_target->refcnt = 0;
+	refcount_set(&nft_target->refcnt, 0);
 	nft_target->ops.type = &nft_target_type;
 	nft_target->ops.size = NFT_EXPR_SIZE(XT_ALIGN(target->targetsize));
 	nft_target->ops.init = nft_target_init;
@@ -944,7 +944,7 @@ static void __exit nft_compat_module_exit(void)
 	list_for_each_entry_safe(xt, next, &nft_target_list, head) {
 		struct xt_target *target = xt->ops.data;
 
-		if (WARN_ON_ONCE(xt->refcnt))
+		if (WARN_ON_ONCE(refcount_read(&xt->refcnt)))
 			continue;
 		module_put(target->me);
 		kfree(xt);
@@ -953,7 +953,7 @@ static void __exit nft_compat_module_exit(void)
 	list_for_each_entry_safe(xt, next, &nft_match_list, head) {
 		struct xt_match *match = xt->ops.data;
 
-		if (WARN_ON_ONCE(xt->refcnt))
+		if (WARN_ON_ONCE(refcount_read(&xt->refcnt)))
 			continue;
 		module_put(match->me);
 		kfree(xt);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 05/65] netfilter: nft_compat: make lists per netns
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (2 preceding siblings ...)
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 04/65] netfilter: nft_compat: use refcnt_t type for nft_xt reference count Sasha Levin
@ 2019-02-23 21:05 ` Sasha Levin
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 14/65] ipvs: Fix signed integer overflow when setsockopt timeout Sasha Levin
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Pablo Neira Ayuso, Sasha Levin,
	netfilter-devel, coreteam, netdev

From: Florian Westphal <fw@strlen.de>

[ Upstream commit cf52572ebbd7189a1966c2b5fc34b97078cd1dce ]

There are two problems with nft_compat since the netlink config
plane uses a per-netns mutex:

1. Concurrent add/del accesses to the same list
2. accesses to a list element after it has been free'd already.

This patch fixes the first problem.

Freeing occurs from a work queue, after transaction mutexes have been
released, i.e., it still possible for a new transaction (even from
same net ns) to find the to-be-deleted expression in the list.

The ->destroy functions are not allowed to have any such side effects,
i.e. the list_del() in the destroy function is not allowed.

This part of the problem is solved in the next patch.
I tried to make this work by serializing list access via mutex
and by moving list_del() to a deactivate callback, but
Taehee spotted following race on this approach:

  NET #0                          NET #1
   >select_ops()
   ->init()
                                   ->select_ops()
   ->deactivate()
   ->destroy()
      nft_xt_put()
       kfree_rcu(xt, rcu_head);
                                   ->init() <-- use-after-free occurred.

Unfortunately, we can't increment reference count in
select_ops(), because we can't undo the refcount increase in
case a different expression fails in the same batch.

(The destroy hook will only be called in case the expression
 was initialized successfully).

Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_compat.c | 129 +++++++++++++++++++++++++------------
 1 file changed, 89 insertions(+), 40 deletions(-)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 3e629ee59ddab..c56eb909f718c 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -22,6 +22,7 @@
 #include <linux/netfilter_bridge/ebtables.h>
 #include <linux/netfilter_arp/arp_tables.h>
 #include <net/netfilter/nf_tables.h>
+#include <net/netns/generic.h>
 
 struct nft_xt {
 	struct list_head	head;
@@ -43,6 +44,20 @@ struct nft_xt_match_priv {
 	void *info;
 };
 
+struct nft_compat_net {
+	struct list_head nft_target_list;
+	struct list_head nft_match_list;
+};
+
+static unsigned int nft_compat_net_id __read_mostly;
+static struct nft_expr_type nft_match_type;
+static struct nft_expr_type nft_target_type;
+
+static struct nft_compat_net *nft_compat_pernet(struct net *net)
+{
+	return net_generic(net, nft_compat_net_id);
+}
+
 static bool nft_xt_put(struct nft_xt *xt)
 {
 	if (refcount_dec_and_test(&xt->refcnt)) {
@@ -714,10 +729,6 @@ static const struct nfnetlink_subsystem nfnl_compat_subsys = {
 	.cb		= nfnl_nft_compat_cb,
 };
 
-static LIST_HEAD(nft_match_list);
-
-static struct nft_expr_type nft_match_type;
-
 static bool nft_match_cmp(const struct xt_match *match,
 			  const char *name, u32 rev, u32 family)
 {
@@ -729,6 +740,7 @@ static const struct nft_expr_ops *
 nft_match_select_ops(const struct nft_ctx *ctx,
 		     const struct nlattr * const tb[])
 {
+	struct nft_compat_net *cn;
 	struct nft_xt *nft_match;
 	struct xt_match *match;
 	unsigned int matchsize;
@@ -745,8 +757,10 @@ nft_match_select_ops(const struct nft_ctx *ctx,
 	rev = ntohl(nla_get_be32(tb[NFTA_MATCH_REV]));
 	family = ctx->family;
 
+	cn = nft_compat_pernet(ctx->net);
+
 	/* Re-use the existing match if it's already loaded. */
-	list_for_each_entry(nft_match, &nft_match_list, head) {
+	list_for_each_entry(nft_match, &cn->nft_match_list, head) {
 		struct xt_match *match = nft_match->ops.data;
 
 		if (nft_match_cmp(match, mt_name, rev, family))
@@ -790,7 +804,7 @@ nft_match_select_ops(const struct nft_ctx *ctx,
 
 	nft_match->ops.size = matchsize;
 
-	list_add(&nft_match->head, &nft_match_list);
+	list_add(&nft_match->head, &cn->nft_match_list);
 
 	return &nft_match->ops;
 err:
@@ -806,10 +820,6 @@ static struct nft_expr_type nft_match_type __read_mostly = {
 	.owner		= THIS_MODULE,
 };
 
-static LIST_HEAD(nft_target_list);
-
-static struct nft_expr_type nft_target_type;
-
 static bool nft_target_cmp(const struct xt_target *tg,
 			   const char *name, u32 rev, u32 family)
 {
@@ -821,6 +831,7 @@ static const struct nft_expr_ops *
 nft_target_select_ops(const struct nft_ctx *ctx,
 		      const struct nlattr * const tb[])
 {
+	struct nft_compat_net *cn;
 	struct nft_xt *nft_target;
 	struct xt_target *target;
 	char *tg_name;
@@ -841,8 +852,9 @@ nft_target_select_ops(const struct nft_ctx *ctx,
 	    strcmp(tg_name, "standard") == 0)
 		return ERR_PTR(-EINVAL);
 
+	cn = nft_compat_pernet(ctx->net);
 	/* Re-use the existing target if it's already loaded. */
-	list_for_each_entry(nft_target, &nft_target_list, head) {
+	list_for_each_entry(nft_target, &cn->nft_target_list, head) {
 		struct xt_target *target = nft_target->ops.data;
 
 		if (!target->target)
@@ -887,7 +899,7 @@ nft_target_select_ops(const struct nft_ctx *ctx,
 	else
 		nft_target->ops.eval = nft_target_eval_xt;
 
-	list_add(&nft_target->head, &nft_target_list);
+	list_add(&nft_target->head, &cn->nft_target_list);
 
 	return &nft_target->ops;
 err:
@@ -903,13 +915,74 @@ static struct nft_expr_type nft_target_type __read_mostly = {
 	.owner		= THIS_MODULE,
 };
 
+static int __net_init nft_compat_init_net(struct net *net)
+{
+	struct nft_compat_net *cn = nft_compat_pernet(net);
+
+	INIT_LIST_HEAD(&cn->nft_target_list);
+	INIT_LIST_HEAD(&cn->nft_match_list);
+
+	return 0;
+}
+
+static void __net_exit nft_compat_exit_net(struct net *net)
+{
+	struct nft_compat_net *cn = nft_compat_pernet(net);
+	struct nft_xt *xt, *next;
+
+	if (list_empty(&cn->nft_match_list) &&
+	    list_empty(&cn->nft_target_list))
+		return;
+
+	/* If there was an error that caused nft_xt expr to not be initialized
+	 * fully and noone else requested the same expression later, the lists
+	 * contain 0-refcount entries that still hold module reference.
+	 *
+	 * Clean them here.
+	 */
+	mutex_lock(&net->nft.commit_mutex);
+	list_for_each_entry_safe(xt, next, &cn->nft_target_list, head) {
+		struct xt_target *target = xt->ops.data;
+
+		list_del_init(&xt->head);
+
+		if (refcount_read(&xt->refcnt))
+			continue;
+		module_put(target->me);
+		kfree(xt);
+	}
+
+	list_for_each_entry_safe(xt, next, &cn->nft_match_list, head) {
+		struct xt_match *match = xt->ops.data;
+
+		list_del_init(&xt->head);
+
+		if (refcount_read(&xt->refcnt))
+			continue;
+		module_put(match->me);
+		kfree(xt);
+	}
+	mutex_unlock(&net->nft.commit_mutex);
+}
+
+static struct pernet_operations nft_compat_net_ops = {
+	.init	= nft_compat_init_net,
+	.exit	= nft_compat_exit_net,
+	.id	= &nft_compat_net_id,
+	.size	= sizeof(struct nft_compat_net),
+};
+
 static int __init nft_compat_module_init(void)
 {
 	int ret;
 
+	ret = register_pernet_subsys(&nft_compat_net_ops);
+	if (ret < 0)
+		goto err_target;
+
 	ret = nft_register_expr(&nft_match_type);
 	if (ret < 0)
-		return ret;
+		goto err_pernet;
 
 	ret = nft_register_expr(&nft_target_type);
 	if (ret < 0)
@@ -922,45 +995,21 @@ static int __init nft_compat_module_init(void)
 	}
 
 	return ret;
-
 err_target:
 	nft_unregister_expr(&nft_target_type);
 err_match:
 	nft_unregister_expr(&nft_match_type);
+err_pernet:
+	unregister_pernet_subsys(&nft_compat_net_ops);
 	return ret;
 }
 
 static void __exit nft_compat_module_exit(void)
 {
-	struct nft_xt *xt, *next;
-
-	/* list should be empty here, it can be non-empty only in case there
-	 * was an error that caused nft_xt expr to not be initialized fully
-	 * and noone else requested the same expression later.
-	 *
-	 * In this case, the lists contain 0-refcount entries that still
-	 * hold module reference.
-	 */
-	list_for_each_entry_safe(xt, next, &nft_target_list, head) {
-		struct xt_target *target = xt->ops.data;
-
-		if (WARN_ON_ONCE(refcount_read(&xt->refcnt)))
-			continue;
-		module_put(target->me);
-		kfree(xt);
-	}
-
-	list_for_each_entry_safe(xt, next, &nft_match_list, head) {
-		struct xt_match *match = xt->ops.data;
-
-		if (WARN_ON_ONCE(refcount_read(&xt->refcnt)))
-			continue;
-		module_put(match->me);
-		kfree(xt);
-	}
 	nfnetlink_subsys_unregister(&nfnl_compat_subsys);
 	nft_unregister_expr(&nft_target_type);
 	nft_unregister_expr(&nft_match_type);
+	unregister_pernet_subsys(&nft_compat_net_ops);
 }
 
 MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_NFT_COMPAT);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 14/65] ipvs: Fix signed integer overflow when setsockopt timeout
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (3 preceding siblings ...)
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 05/65] netfilter: nft_compat: make lists per netns Sasha Levin
@ 2019-02-23 21:05 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 28/65] net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case Sasha Levin
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:05 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: ZhangXiaoxu, Pablo Neira Ayuso, Sasha Levin, netdev, lvs-devel,
	netfilter-devel, coreteam

From: ZhangXiaoxu <zhangxiaoxu5@huawei.com>

[ Upstream commit 53ab60baa1ac4f20b080a22c13b77b6373922fd7 ]

There is a UBSAN bug report as below:
UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
signed integer overflow:
-2147483647 * 1000 cannot be represented in type 'int'

Reproduce program:
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>

	#define IPPROTO_IP 0
	#define IPPROTO_RAW 255

	#define IP_VS_BASE_CTL		(64+1024+64)
	#define IP_VS_SO_SET_TIMEOUT	(IP_VS_BASE_CTL+10)

	/* The argument to IP_VS_SO_GET_TIMEOUT */
	struct ipvs_timeout_t {
		int tcp_timeout;
		int tcp_fin_timeout;
		int udp_timeout;
	};

	int main() {
		int ret = -1;
		int sockfd = -1;
		struct ipvs_timeout_t to;

		sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
		if (sockfd == -1) {
			printf("socket init error\n");
			return -1;
		}

		to.tcp_timeout = -2147483647;
		to.tcp_fin_timeout = -2147483647;
		to.udp_timeout = -2147483647;

		ret = setsockopt(sockfd,
				 IPPROTO_IP,
				 IP_VS_SO_SET_TIMEOUT,
				 (char *)(&to),
				 sizeof(to));

		printf("setsockopt return %d\n", ret);
		return ret;
	}

Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'.

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/ipvs/ip_vs_ctl.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 518364f4abccd..55a77314340a5 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2220,6 +2220,18 @@ static int ip_vs_set_timeout(struct netns_ipvs *ipvs, struct ip_vs_timeout_user
 		  u->tcp_fin_timeout,
 		  u->udp_timeout);
 
+#ifdef CONFIG_IP_VS_PROTO_TCP
+	if (u->tcp_timeout < 0 || u->tcp_timeout > (INT_MAX / HZ) ||
+	    u->tcp_fin_timeout < 0 || u->tcp_fin_timeout > (INT_MAX / HZ)) {
+		return -EINVAL;
+	}
+#endif
+
+#ifdef CONFIG_IP_VS_PROTO_UDP
+	if (u->udp_timeout < 0 || u->udp_timeout > (INT_MAX / HZ))
+		return -EINVAL;
+#endif
+
 #ifdef CONFIG_IP_VS_PROTO_TCP
 	if (u->tcp_timeout) {
 		pd = ip_vs_proto_data_get(ipvs, IPPROTO_TCP);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 28/65] net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (4 preceding siblings ...)
  2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 14/65] ipvs: Fix signed integer overflow when setsockopt timeout Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 29/65] net: hns: Fix for missing of_node_put() after of_parse_phandle() Sasha Levin
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Tomonori Sakita, Atsushi Nemoto, David S . Miller, Sasha Levin, netdev

From: Tomonori Sakita <tomonori.sakita@sord.co.jp>

[ Upstream commit 6571ebce112a21ec9be68ef2f53b96fcd41fd81b ]

If fill_level was not zero and status was not BUSY,
result of "tx_prod - tx_cons - inuse" might be zero.
Subtracting 1 unconditionally results invalid negative return value
on this case.
Make sure not to return an negative value.

Signed-off-by: Tomonori Sakita <tomonori.sakita@sord.co.jp>
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Reviewed-by: Dalon L Westergreen <dalon.westergreen@linux.intel.com>
Acked-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/altera/altera_msgdma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/altera/altera_msgdma.c b/drivers/net/ethernet/altera/altera_msgdma.c
index 0fb986ba32905..0ae723f753417 100644
--- a/drivers/net/ethernet/altera/altera_msgdma.c
+++ b/drivers/net/ethernet/altera/altera_msgdma.c
@@ -145,7 +145,8 @@ u32 msgdma_tx_completions(struct altera_tse_private *priv)
 			& 0xffff;
 
 	if (inuse) { /* Tx FIFO is not empty */
-		ready = priv->tx_prod - priv->tx_cons - inuse - 1;
+		ready = max_t(int,
+			      priv->tx_prod - priv->tx_cons - inuse - 1, 0);
 	} else {
 		/* Check for buffered last packet */
 		status = csrrd32(priv->tx_dma_csr, msgdma_csroffs(status));
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 29/65] net: hns: Fix for missing of_node_put() after of_parse_phandle()
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (5 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 28/65] net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 30/65] net: hns: Restart autoneg need return failed when autoneg off Sasha Levin
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Yonglong Liu, Peng Li, David S . Miller, Sasha Levin, netdev

From: Yonglong Liu <liuyonglong@huawei.com>

[ Upstream commit 263c6d75f9a544a3c2f8f6a26de4f4808d8f59cf ]

In hns enet driver, we use of_parse_handle() to get hold of the
device node related to "ae-handle" but we have missed to put
the node reference using of_node_put() after we are done using
the node. This patch fixes it.

Note:
This problem is stated in Link: https://lkml.org/lkml/2018/12/22/217

Fixes: 48189d6aaf1e ("net: hns: enet specifies a reference to dsaf")
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns/hns_enet.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_enet.c b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
index 6242249c9f4c5..b043370c26857 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_enet.c
@@ -2419,6 +2419,8 @@ static int hns_nic_dev_probe(struct platform_device *pdev)
 out_notify_fail:
 	(void)cancel_work_sync(&priv->service_task);
 out_read_prop_fail:
+	/* safe for ACPI FW */
+	of_node_put(to_of_node(priv->fwnode));
 	free_netdev(ndev);
 	return ret;
 }
@@ -2448,6 +2450,9 @@ static int hns_nic_dev_remove(struct platform_device *pdev)
 	set_bit(NIC_STATE_REMOVING, &priv->state);
 	(void)cancel_work_sync(&priv->service_task);
 
+	/* safe for ACPI FW */
+	of_node_put(to_of_node(priv->fwnode));
+
 	free_netdev(ndev);
 	return 0;
 }
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 30/65] net: hns: Restart autoneg need return failed when autoneg off
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (6 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 29/65] net: hns: Fix for missing of_node_put() after of_parse_phandle() Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 31/65] net: hns: Fix wrong read accesses via Clause 45 MDIO protocol Sasha Levin
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Yonglong Liu, Peng Li, David S . Miller, Sasha Levin, netdev

From: Yonglong Liu <liuyonglong@huawei.com>

[ Upstream commit ed29ca8b9592562559c64d027fb5eb126e463e2c ]

The hns driver of earlier devices, when autoneg off, restart autoneg
will return -EINVAL, so make the hns driver for the latest devices
do the same.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
index 774beda040a16..e2710ff48fb0f 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ethtool.c
@@ -1157,16 +1157,18 @@ static int hns_get_regs_len(struct net_device *net_dev)
  */
 static int hns_nic_nway_reset(struct net_device *netdev)
 {
-	int ret = 0;
 	struct phy_device *phy = netdev->phydev;
 
-	if (netif_running(netdev)) {
-		/* if autoneg is disabled, don't restart auto-negotiation */
-		if (phy && phy->autoneg == AUTONEG_ENABLE)
-			ret = genphy_restart_aneg(phy);
-	}
+	if (!netif_running(netdev))
+		return 0;
 
-	return ret;
+	if (!phy)
+		return -EOPNOTSUPP;
+
+	if (phy->autoneg != AUTONEG_ENABLE)
+		return -EINVAL;
+
+	return genphy_restart_aneg(phy);
 }
 
 static u32
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 31/65] net: hns: Fix wrong read accesses via Clause 45 MDIO protocol
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (7 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 30/65] net: hns: Restart autoneg need return failed when autoneg off Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 32/65] net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup() Sasha Levin
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Yonglong Liu, Peng Li, David S . Miller, Sasha Levin, netdev

From: Yonglong Liu <liuyonglong@huawei.com>

[ Upstream commit cec8abba13e6a26729dfed41019720068eeeff2b ]

When reading phy registers via Clause 45 MDIO protocol, after write
address operation, the driver use another write address operation, so
can not read the right value of any phy registers. This patch fixes it.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/hisilicon/hns_mdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns_mdio.c b/drivers/net/ethernet/hisilicon/hns_mdio.c
index 017e08452d8c0..baf5cc251f329 100644
--- a/drivers/net/ethernet/hisilicon/hns_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns_mdio.c
@@ -321,7 +321,7 @@ static int hns_mdio_read(struct mii_bus *bus, int phy_id, int regnum)
 		}
 
 		hns_mdio_cmd_write(mdio_dev, is_c45,
-				   MDIO_C45_WRITE_ADDR, phy_id, devad);
+				   MDIO_C45_READ, phy_id, devad);
 	}
 
 	/* Step 5: waitting for MDIO_COMMAND_REG 's mdio_start==0,*/
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 32/65] net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup()
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (8 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 31/65] net: hns: Fix wrong read accesses via Clause 45 MDIO protocol Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 33/65] netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present Sasha Levin
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Alexey Khoroshilov, David S . Miller, Sasha Levin, netdev

From: Alexey Khoroshilov <khoroshilov@ispras.ru>

[ Upstream commit c69c29a1a0a8f68cd87e98ba4a5a79fb8ef2a58c ]

If phy_power_on() fails in rk_gmac_powerup(), clocks are left enabled.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 7b923362ee550..3b174eae77c10 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -1342,8 +1342,10 @@ static int rk_gmac_powerup(struct rk_priv_data *bsp_priv)
 	}
 
 	ret = phy_power_on(bsp_priv, true);
-	if (ret)
+	if (ret) {
+		gmac_clk_enable(bsp_priv, false);
 		return ret;
+	}
 
 	pm_runtime_enable(dev);
 	pm_runtime_get_sync(dev);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 33/65] netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (9 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 32/65] net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup() Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 34/65] netfilter: nfnetlink_osf: add missing fmatch check Sasha Levin
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Florian Westphal, Pablo Neira Ayuso, Sasha Levin,
	netfilter-devel, coreteam, netdev

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 2035f3ff8eaa29cfb5c8e2160b0f6e85eeb21a95 ]

Unlike ip(6)tables ebtables only counts user-defined chains.

The effect is that a 32bit ebtables binary on a 64bit kernel can do
'ebtables -N FOO' only after adding at least one rule, else the request
fails with -EINVAL.

This is a similar fix as done in
3f1e53abff84 ("netfilter: ebtables: don't attempt to allocate 0-sized compat array").

Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
Reported-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/bridge/netfilter/ebtables.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 5e55cef0cec39..6693e209efe80 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2293,9 +2293,12 @@ static int compat_do_replace(struct net *net, void __user *user,
 
 	xt_compat_lock(NFPROTO_BRIDGE);
 
-	ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
-	if (ret < 0)
-		goto out_unlock;
+	if (tmp.nentries) {
+		ret = xt_compat_init_offsets(NFPROTO_BRIDGE, tmp.nentries);
+		if (ret < 0)
+			goto out_unlock;
+	}
+
 	ret = compat_copy_entries(entries_tmp, tmp.entries_size, &state);
 	if (ret < 0)
 		goto out_unlock;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 34/65] netfilter: nfnetlink_osf: add missing fmatch check
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (10 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 33/65] netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 36/65] selftests: net: use LDLIBS instead of LDFLAGS Sasha Levin
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Fernando Fernandez Mancera, Pablo Neira Ayuso, Sasha Levin,
	netfilter-devel, coreteam, netdev

From: Fernando Fernandez Mancera <ffmancera@riseup.net>

[ Upstream commit 1a6a0951fc009f6d9fe8ebea2d2417d80d54097b ]

When we check the tcp options of a packet and it doesn't match the current
fingerprint, the tcp packet option pointer must be restored to its initial
value in order to do the proper tcp options check for the next fingerprint.

Here we can see an example.
Assumming the following fingerprint base with two lines:

S10:64:1:60:M*,S,T,N,W6:      Linux:3.0::Linux 3.0
S20:64:1:60:M*,S,T,N,W7:      Linux:4.19:arch:Linux 4.1

Where TCP options are the last field in the OS signature, all of them overlap
except by the last one, ie. 'W6' versus 'W7'.

In case a packet for Linux 4.19 kicks in, the osf finds no matching because the
TCP options pointer is updated after checking for the TCP options in the first
line.

Therefore, reset pointer back to where it should be.

Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nfnetlink_osf.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/netfilter/nfnetlink_osf.c b/net/netfilter/nfnetlink_osf.c
index 00db27dfd2ff7..b0bc130947c94 100644
--- a/net/netfilter/nfnetlink_osf.c
+++ b/net/netfilter/nfnetlink_osf.c
@@ -71,6 +71,7 @@ static bool nf_osf_match_one(const struct sk_buff *skb,
 			     int ttl_check,
 			     struct nf_osf_hdr_ctx *ctx)
 {
+	const __u8 *optpinit = ctx->optp;
 	unsigned int check_WSS = 0;
 	int fmatch = FMATCH_WRONG;
 	int foptsize, optnum;
@@ -160,6 +161,9 @@ static bool nf_osf_match_one(const struct sk_buff *skb,
 		}
 	}
 
+	if (fmatch != FMATCH_OK)
+		ctx->optp = optpinit;
+
 	return fmatch == FMATCH_OK;
 }
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 36/65] selftests: net: use LDLIBS instead of LDFLAGS
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (11 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 34/65] netfilter: nfnetlink_osf: add missing fmatch check Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 39/65] qed: Fix bug in tx promiscuous mode settings Sasha Levin
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Fathi Boudra, Shuah Khan, Sasha Levin, netdev, linux-kselftest

From: Fathi Boudra <fathi.boudra@linaro.org>

[ Upstream commit 870f193d48c25a97d61a8e6c04e3c29a2c606850 ]

reuseport_bpf_numa fails to build due to undefined reference errors:

 aarch64-linaro-linux-gcc
 --sysroot=/build/tmp-rpb-glibc/sysroots/hikey -Wall
 -Wl,--no-as-needed -O2 -g -I../../../../usr/include/  -Wl,-O1
 -Wl,--hash-style=gnu -Wl,--as-needed -lnuma  reuseport_bpf_numa.c
 -o
 /build/tmp-rpb-glibc/work/hikey-linaro-linux/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/net/reuseport_bpf_numa
 /tmp/ccfUuExT.o: In function `send_from_node':
 /build/tmp-rpb-glibc/work/hikey-linaro-linux/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/net/reuseport_bpf_numa.c:138:
 undefined reference to `numa_run_on_node'
 /tmp/ccfUuExT.o: In function `main':
 /build/tmp-rpb-glibc/work/hikey-linaro-linux/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/net/reuseport_bpf_numa.c:230:
 undefined reference to `numa_available'
 /build/tmp-rpb-glibc/work/hikey-linaro-linux/kselftests/4.12-r0/linux-4.12-rc7/tools/testing/selftests/net/reuseport_bpf_numa.c:233:
 undefined reference to `numa_max_node'

It's GNU Make and linker specific.

The default Makefile rule looks like:

$(CC) $(CFLAGS) $(LDFLAGS) $@ $^ $(LDLIBS)

When linking is done by gcc itself, no issue, but when it needs to be passed
to proper ld, only LDLIBS follows and then ld cannot know what libs to link
with.

More detail:
https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html

LDFLAGS
Extra flags to give to compilers when they are supposed to invoke the linker,
‘ld’, such as -L. Libraries (-lfoo) should be added to the LDLIBS variable
instead.

LDLIBS
Library flags or names given to compilers when they are supposed to invoke the
linker, ‘ld’. LOADLIBES is a deprecated (but still supported) alternative to
LDLIBS. Non-library linker flags, such as -L, should go in the LDFLAGS
variable.

https://lkml.org/lkml/2010/2/10/362

tools/perf: libraries must come after objects

Link order matters, use LDLIBS instead of LDFLAGS to properly link against
libnuma.

Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 tools/testing/selftests/net/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 919aa2ac00af7..9a3764a1084ea 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -18,6 +18,6 @@ TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
 KSFT_KHDR_INSTALL := 1
 include ../lib.mk
 
-$(OUTPUT)/reuseport_bpf_numa: LDFLAGS += -lnuma
+$(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
 $(OUTPUT)/tcp_mmap: LDFLAGS += -lpthread
 $(OUTPUT)/tcp_inq: LDFLAGS += -lpthread
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 39/65] qed: Fix bug in tx promiscuous mode settings
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (12 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 36/65] selftests: net: use LDLIBS instead of LDFLAGS Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 40/65] qed: Fix LACP pdu drops for VFs Sasha Levin
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Manish Chopra, Ariel Elior, David S . Miller, Sasha Levin, netdev

From: Manish Chopra <manishc@marvell.com>

[ Upstream commit 9e71a15d8b5bbce25c637f7f8833cd3f45b65646 ]

When running tx switched traffic between VNICs
created via a bridge(to which VFs are added),
adapter drops the unicast packets in tx flow due to
VNIC's ucast mac being unknown to it. But VF interfaces
being in promiscuous mode should have caused adapter
to accept all the unknown ucast packets. Later, it
was found that driver doesn't really configure tx
promiscuous mode settings to accept all unknown unicast macs.

This patch fixes tx promiscuous mode settings to accept all
unknown/unmatched unicast macs and works out the scenario.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_l2.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c b/drivers/net/ethernet/qlogic/qed/qed_l2.c
index 67c02ea939062..b8baa6fcef8e3 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c
@@ -609,6 +609,10 @@ qed_sp_update_accept_mode(struct qed_hwfn *p_hwfn,
 			  (!!(accept_filter & QED_ACCEPT_MCAST_MATCHED) &&
 			   !!(accept_filter & QED_ACCEPT_MCAST_UNMATCHED)));
 
+		SET_FIELD(state, ETH_VPORT_TX_MODE_UCAST_ACCEPT_ALL,
+			  (!!(accept_filter & QED_ACCEPT_UCAST_MATCHED) &&
+			   !!(accept_filter & QED_ACCEPT_UCAST_UNMATCHED)));
+
 		SET_FIELD(state, ETH_VPORT_TX_MODE_BCAST_ACCEPT_ALL,
 			  !!(accept_filter & QED_ACCEPT_BCAST));
 
@@ -2688,7 +2692,8 @@ static int qed_configure_filter_rx_mode(struct qed_dev *cdev,
 	if (type == QED_FILTER_RX_MODE_TYPE_PROMISC) {
 		accept_flags.rx_accept_filter |= QED_ACCEPT_UCAST_UNMATCHED |
 						 QED_ACCEPT_MCAST_UNMATCHED;
-		accept_flags.tx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED;
+		accept_flags.tx_accept_filter |= QED_ACCEPT_UCAST_UNMATCHED |
+						 QED_ACCEPT_MCAST_UNMATCHED;
 	} else if (type == QED_FILTER_RX_MODE_TYPE_MULTI_PROMISC) {
 		accept_flags.rx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED;
 		accept_flags.tx_accept_filter |= QED_ACCEPT_MCAST_UNMATCHED;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 40/65] qed: Fix LACP pdu drops for VFs
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (13 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 39/65] qed: Fix bug in tx promiscuous mode settings Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 41/65] qed: Fix VF probe failure while FLR Sasha Levin
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Manish Chopra, Ariel Elior, David S . Miller, Sasha Levin, netdev

From: Manish Chopra <manishc@marvell.com>

[ Upstream commit ff9296966e5e00b0d0d00477b2365a178f0f06a3 ]

VF is always configured to drop control frames
(with reserved mac addresses) but to work LACP
on the VFs, it would require LACP control frames
to be forwarded or transmitted successfully.

This patch fixes this in such a way that trusted VFs
(marked through ndo_set_vf_trust) would be allowed to
pass the control frames such as LACP pdus.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_l2.c    |  5 +++++
 drivers/net/ethernet/qlogic/qed/qed_l2.h    |  3 +++
 drivers/net/ethernet/qlogic/qed/qed_sriov.c | 10 ++++++++--
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c b/drivers/net/ethernet/qlogic/qed/qed_l2.c
index b8baa6fcef8e3..e68ca83ae9153 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c
@@ -748,6 +748,11 @@ int qed_sp_vport_update(struct qed_hwfn *p_hwfn,
 		return rc;
 	}
 
+	if (p_params->update_ctl_frame_check) {
+		p_cmn->ctl_frame_mac_check_en = p_params->mac_chk_en;
+		p_cmn->ctl_frame_ethtype_check_en = p_params->ethtype_chk_en;
+	}
+
 	/* Update mcast bins for VFs, PF doesn't use this functionality */
 	qed_sp_update_mcast_bin(p_hwfn, p_ramrod, p_params);
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.h b/drivers/net/ethernet/qlogic/qed/qed_l2.h
index 8d80f1095d171..7127d5aaac422 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_l2.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_l2.h
@@ -219,6 +219,9 @@ struct qed_sp_vport_update_params {
 	struct qed_rss_params		*rss_params;
 	struct qed_filter_accept_flags	accept_flags;
 	struct qed_sge_tpa_params	*sge_tpa_params;
+	u8				update_ctl_frame_check;
+	u8				mac_chk_en;
+	u8				ethtype_chk_en;
 };
 
 int qed_sp_vport_update(struct qed_hwfn *p_hwfn,
diff --git a/drivers/net/ethernet/qlogic/qed/qed_sriov.c b/drivers/net/ethernet/qlogic/qed/qed_sriov.c
index ca6290fa0f309..71a7af134dd8e 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_sriov.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_sriov.c
@@ -1969,7 +1969,9 @@ static void qed_iov_vf_mbx_start_vport(struct qed_hwfn *p_hwfn,
 	params.vport_id = vf->vport_id;
 	params.max_buffers_per_cqe = start->max_buffers_per_cqe;
 	params.mtu = vf->mtu;
-	params.check_mac = true;
+
+	/* Non trusted VFs should enable control frame filtering */
+	params.check_mac = !vf->p_vf_info.is_trusted_configured;
 
 	rc = qed_sp_eth_vport_start(p_hwfn, &params);
 	if (rc) {
@@ -5130,6 +5132,9 @@ static void qed_iov_handle_trust_change(struct qed_hwfn *hwfn)
 		params.opaque_fid = vf->opaque_fid;
 		params.vport_id = vf->vport_id;
 
+		params.update_ctl_frame_check = 1;
+		params.mac_chk_en = !vf_info->is_trusted_configured;
+
 		if (vf_info->rx_accept_mode & mask) {
 			flags->update_rx_mode_config = 1;
 			flags->rx_accept_filter = vf_info->rx_accept_mode;
@@ -5147,7 +5152,8 @@ static void qed_iov_handle_trust_change(struct qed_hwfn *hwfn)
 		}
 
 		if (flags->update_rx_mode_config ||
-		    flags->update_tx_mode_config)
+		    flags->update_tx_mode_config ||
+		    params.update_ctl_frame_check)
 			qed_sp_vport_update(hwfn, &params,
 					    QED_SPQ_MODE_EBLOCK, NULL);
 	}
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 41/65] qed: Fix VF probe failure while FLR
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (14 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 40/65] qed: Fix LACP pdu drops for VFs Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 42/65] qed: Fix system crash in ll2 xmit Sasha Levin
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Manish Chopra, Ariel Elior, David S . Miller, Sasha Levin, netdev

From: Manish Chopra <manishc@marvell.com>

[ Upstream commit 327852ec64205bb651be391a069784872098a3b2 ]

VFs may hit VF-PF channel timeout while probing, as in some
cases it was observed that VF FLR and VF "acquire" message
transaction (i.e first message from VF to PF in VF's probe flow)
could occur simultaneously which could lead VF to fail sending
"acquire" message to PF as VF is marked disabled from HW perspective
due to FLR, which will result into channel timeout and VF probe failure.

In such cases, try retrying VF "acquire" message so that in later
attempts it could be successful to pass message to PF after the VF
FLR is completed and can be probed successfully.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_vf.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_vf.c b/drivers/net/ethernet/qlogic/qed/qed_vf.c
index be118d057b92c..6ab3fb008139d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_vf.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_vf.c
@@ -261,6 +261,7 @@ static int qed_vf_pf_acquire(struct qed_hwfn *p_hwfn)
 	struct pfvf_acquire_resp_tlv *resp = &p_iov->pf2vf_reply->acquire_resp;
 	struct pf_vf_pfdev_info *pfdev_info = &resp->pfdev_info;
 	struct vf_pf_resc_request *p_resc;
+	u8 retry_cnt = VF_ACQUIRE_THRESH;
 	bool resources_acquired = false;
 	struct vfpf_acquire_tlv *req;
 	int rc = 0, attempts = 0;
@@ -314,6 +315,15 @@ static int qed_vf_pf_acquire(struct qed_hwfn *p_hwfn)
 
 		/* send acquire request */
 		rc = qed_send_msg2pf(p_hwfn, &resp->hdr.status, sizeof(*resp));
+
+		/* Re-try acquire in case of vf-pf hw channel timeout */
+		if (retry_cnt && rc == -EBUSY) {
+			DP_VERBOSE(p_hwfn, QED_MSG_IOV,
+				   "VF retrying to acquire due to VPC timeout\n");
+			retry_cnt--;
+			continue;
+		}
+
 		if (rc)
 			goto exit;
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 42/65] qed: Fix system crash in ll2 xmit
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (15 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 41/65] qed: Fix VF probe failure while FLR Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 43/65] qed: Fix stack out of bounds bug Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 50/65] net: macb: Apply RXUBR workaround only to versions with errata Sasha Levin
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Manish Chopra, Ariel Elior, David S . Miller, Sasha Levin, netdev

From: Manish Chopra <manishc@marvell.com>

[ Upstream commit 7c81626a3c37e4ac320b8ad785694ba498f24794 ]

Cache number of fragments in the skb locally as in case
of linear skb (with zero fragments), tx completion
(or freeing of skb) may happen before driver tries
to get number of frgaments from the skb which could
lead to stale access to an already freed skb.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_ll2.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 2fa1c050a14b4..b5beca97172a7 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -2426,19 +2426,24 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb,
 {
 	struct qed_ll2_tx_pkt_info pkt;
 	const skb_frag_t *frag;
+	u8 flags = 0, nr_frags;
 	int rc = -EINVAL, i;
 	dma_addr_t mapping;
 	u16 vlan = 0;
-	u8 flags = 0;
 
 	if (unlikely(skb->ip_summed != CHECKSUM_NONE)) {
 		DP_INFO(cdev, "Cannot transmit a checksummed packet\n");
 		return -EINVAL;
 	}
 
-	if (1 + skb_shinfo(skb)->nr_frags > CORE_LL2_TX_MAX_BDS_PER_PACKET) {
+	/* Cache number of fragments from SKB since SKB may be freed by
+	 * the completion routine after calling qed_ll2_prepare_tx_packet()
+	 */
+	nr_frags = skb_shinfo(skb)->nr_frags;
+
+	if (1 + nr_frags > CORE_LL2_TX_MAX_BDS_PER_PACKET) {
 		DP_ERR(cdev, "Cannot transmit a packet with %d fragments\n",
-		       1 + skb_shinfo(skb)->nr_frags);
+		       1 + nr_frags);
 		return -EINVAL;
 	}
 
@@ -2460,7 +2465,7 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb,
 	}
 
 	memset(&pkt, 0, sizeof(pkt));
-	pkt.num_of_bds = 1 + skb_shinfo(skb)->nr_frags;
+	pkt.num_of_bds = 1 + nr_frags;
 	pkt.vlan = vlan;
 	pkt.bd_flags = flags;
 	pkt.tx_dest = QED_LL2_TX_DEST_NW;
@@ -2471,12 +2476,17 @@ static int qed_ll2_start_xmit(struct qed_dev *cdev, struct sk_buff *skb,
 	    test_bit(QED_LL2_XMIT_FLAGS_FIP_DISCOVERY, &xmit_flags))
 		pkt.remove_stag = true;
 
+	/* qed_ll2_prepare_tx_packet() may actually send the packet if
+	 * there are no fragments in the skb and subsequently the completion
+	 * routine may run and free the SKB, so no dereferencing the SKB
+	 * beyond this point unless skb has any fragments.
+	 */
 	rc = qed_ll2_prepare_tx_packet(&cdev->hwfns[0], cdev->ll2->handle,
 				       &pkt, 1);
 	if (rc)
 		goto err;
 
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+	for (i = 0; i < nr_frags; i++) {
 		frag = &skb_shinfo(skb)->frags[i];
 
 		mapping = skb_frag_dma_map(&cdev->pdev->dev, frag, 0,
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 43/65] qed: Fix stack out of bounds bug
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (16 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 42/65] qed: Fix system crash in ll2 xmit Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 50/65] net: macb: Apply RXUBR workaround only to versions with errata Sasha Levin
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Manish Chopra, Ariel Elior, David S . Miller, Sasha Levin, netdev

From: Manish Chopra <manishc@marvell.com>

[ Upstream commit ffb057f98928aa099b08e419bbe5afc26ec9f448 ]

KASAN reported following bug in qed_init_qm_get_idx_from_flags
due to inappropriate casting of "pq_flags". Fix the type of "pq_flags".

[  196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[  196.624712] Read of size 8 at addr ffff809b00bc7360 by task kworker/0:9/1712
[  196.624714]
[  196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1
[  196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018
[  196.624733] Workqueue: events work_for_cpu_fn
[  196.624738] Call trace:
[  196.624742]  dump_backtrace+0x0/0x2f8
[  196.624745]  show_stack+0x24/0x30
[  196.624749]  dump_stack+0xe0/0x11c
[  196.624755]  print_address_description+0x68/0x260
[  196.624759]  kasan_report+0x178/0x340
[  196.624762]  __asan_report_load_n_noabort+0x38/0x48
[  196.624786]  qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[  196.624808]  qed_init_qm_info+0xec0/0x2200 [qed]
[  196.624830]  qed_resc_alloc+0x284/0x7e8 [qed]
[  196.624853]  qed_slowpath_start+0x6cc/0x1ae8 [qed]
[  196.624864]  __qede_probe.isra.10+0x1cc/0x12c0 [qede]
[  196.624874]  qede_probe+0x78/0xf0 [qede]
[  196.624879]  local_pci_probe+0xc4/0x180
[  196.624882]  work_for_cpu_fn+0x54/0x98
[  196.624885]  process_one_work+0x758/0x1900
[  196.624888]  worker_thread+0x4e0/0xd18
[  196.624892]  kthread+0x2c8/0x350
[  196.624897]  ret_from_fork+0x10/0x18
[  196.624899]
[  196.624902] Allocated by task 2:
[  196.624906]  kasan_kmalloc.part.1+0x40/0x108
[  196.624909]  kasan_kmalloc+0xb4/0xc8
[  196.624913]  kasan_slab_alloc+0x14/0x20
[  196.624916]  kmem_cache_alloc_node+0x1dc/0x480
[  196.624921]  copy_process.isra.1.part.2+0x1d8/0x4a98
[  196.624924]  _do_fork+0x150/0xfa0
[  196.624926]  kernel_thread+0x48/0x58
[  196.624930]  kthreadd+0x3a4/0x5a0
[  196.624932]  ret_from_fork+0x10/0x18
[  196.624934]
[  196.624937] Freed by task 0:
[  196.624938] (stack is not available)
[  196.624940]
[  196.624943] The buggy address belongs to the object at ffff809b00bc0000
[  196.624943]  which belongs to the cache thread_stack of size 32768
[  196.624946] The buggy address is located 29536 bytes inside of
[  196.624946]  32768-byte region [ffff809b00bc0000, ffff809b00bc8000)
[  196.624948] The buggy address belongs to the page:
[  196.624952] page:ffff7fe026c02e00 count:1 mapcount:0 mapping:ffff809b4001c000 index:0x0 compound_mapcount: 0
[  196.624960] flags: 0xfffff8000008100(slab|head)
[  196.624967] raw: 0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000
[  196.624970] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[  196.624973] page dumped because: kasan: bad access detected
[  196.624974]
[  196.624976] Memory state around the buggy address:
[  196.624980]  ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624983]  ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624985] >ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
[  196.624988]                                                        ^
[  196.624990]  ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624993]  ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  196.624995] ==================================================================

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 2f69ee9221c62..4dd82a1612aa8 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -473,19 +473,19 @@ static void qed_init_qm_pq(struct qed_hwfn *p_hwfn,
 
 /* get pq index according to PQ_FLAGS */
 static u16 *qed_init_qm_get_idx_from_flags(struct qed_hwfn *p_hwfn,
-					   u32 pq_flags)
+					   unsigned long pq_flags)
 {
 	struct qed_qm_info *qm_info = &p_hwfn->qm_info;
 
 	/* Can't have multiple flags set here */
-	if (bitmap_weight((unsigned long *)&pq_flags,
+	if (bitmap_weight(&pq_flags,
 			  sizeof(pq_flags) * BITS_PER_BYTE) > 1) {
-		DP_ERR(p_hwfn, "requested multiple pq flags 0x%x\n", pq_flags);
+		DP_ERR(p_hwfn, "requested multiple pq flags 0x%lx\n", pq_flags);
 		goto err;
 	}
 
 	if (!(qed_get_pq_flags(p_hwfn) & pq_flags)) {
-		DP_ERR(p_hwfn, "pq flag 0x%x is not set\n", pq_flags);
+		DP_ERR(p_hwfn, "pq flag 0x%lx is not set\n", pq_flags);
 		goto err;
 	}
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH AUTOSEL 4.19 50/65] net: macb: Apply RXUBR workaround only to versions with errata
  2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
                   ` (17 preceding siblings ...)
  2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 43/65] qed: Fix stack out of bounds bug Sasha Levin
@ 2019-02-23 21:06 ` Sasha Levin
  18 siblings, 0 replies; 20+ messages in thread
From: Sasha Levin @ 2019-02-23 21:06 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Harini Katakam, David S . Miller, Sasha Levin, netdev

From: Harini Katakam <harini.katakam@xilinx.com>

[ Upstream commit e501070e4db0b67a4c17a5557d1e9d098f3db310 ]

The interrupt handler contains a workaround for RX hang applicable
to Zynq and AT91RM9200 only. Subsequent versions do not need this
workaround. This workaround unnecessarily resets RX whenever RX used
bit read is observed, which can be often under heavy traffic. There
is no other action performed on RX UBR interrupt. Hence introduce a
CAPS mask; enable this interrupt and workaround only on affected
versions.

Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/cadence/macb.h      |  3 +++
 drivers/net/ethernet/cadence/macb_main.c | 28 ++++++++++++++----------
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 3d45f4c92cf6e..9bbaad9f3d632 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -643,6 +643,7 @@
 #define MACB_CAPS_JUMBO				0x00000020
 #define MACB_CAPS_GEM_HAS_PTP			0x00000040
 #define MACB_CAPS_BD_RD_PREFETCH		0x00000080
+#define MACB_CAPS_NEEDS_RSTONUBR		0x00000100
 #define MACB_CAPS_FIFO_MODE			0x10000000
 #define MACB_CAPS_GIGABIT_MODE_AVAILABLE	0x20000000
 #define MACB_CAPS_SG_DISABLED			0x40000000
@@ -1214,6 +1215,8 @@ struct macb {
 
 	int	rx_bd_rd_prefetch;
 	int	tx_bd_rd_prefetch;
+
+	u32	rx_intr_mask;
 };
 
 #ifdef CONFIG_MACB_USE_HWSTAMP
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 8f4b2f9a8e079..8abea1c3844fc 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -56,8 +56,7 @@
 /* level of occupied TX descriptors under which we wake up TX process */
 #define MACB_TX_WAKEUP_THRESH(bp)	(3 * (bp)->tx_ring_size / 4)
 
-#define MACB_RX_INT_FLAGS	(MACB_BIT(RCOMP) | MACB_BIT(RXUBR)	\
-				 | MACB_BIT(ISR_ROVR))
+#define MACB_RX_INT_FLAGS	(MACB_BIT(RCOMP) | MACB_BIT(ISR_ROVR))
 #define MACB_TX_ERR_FLAGS	(MACB_BIT(ISR_TUND)			\
 					| MACB_BIT(ISR_RLE)		\
 					| MACB_BIT(TXERR))
@@ -1271,7 +1270,7 @@ static int macb_poll(struct napi_struct *napi, int budget)
 				queue_writel(queue, ISR, MACB_BIT(RCOMP));
 			napi_reschedule(napi);
 		} else {
-			queue_writel(queue, IER, MACB_RX_INT_FLAGS);
+			queue_writel(queue, IER, bp->rx_intr_mask);
 		}
 	}
 
@@ -1289,7 +1288,7 @@ static void macb_hresp_error_task(unsigned long data)
 	u32 ctrl;
 
 	for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
-		queue_writel(queue, IDR, MACB_RX_INT_FLAGS |
+		queue_writel(queue, IDR, bp->rx_intr_mask |
 					 MACB_TX_INT_FLAGS |
 					 MACB_BIT(HRESP));
 	}
@@ -1319,7 +1318,7 @@ static void macb_hresp_error_task(unsigned long data)
 
 		/* Enable interrupts */
 		queue_writel(queue, IER,
-			     MACB_RX_INT_FLAGS |
+			     bp->rx_intr_mask |
 			     MACB_TX_INT_FLAGS |
 			     MACB_BIT(HRESP));
 	}
@@ -1373,14 +1372,14 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 			    (unsigned int)(queue - bp->queues),
 			    (unsigned long)status);
 
-		if (status & MACB_RX_INT_FLAGS) {
+		if (status & bp->rx_intr_mask) {
 			/* There's no point taking any more interrupts
 			 * until we have processed the buffers. The
 			 * scheduling call may fail if the poll routine
 			 * is already scheduled, so disable interrupts
 			 * now.
 			 */
-			queue_writel(queue, IDR, MACB_RX_INT_FLAGS);
+			queue_writel(queue, IDR, bp->rx_intr_mask);
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
 				queue_writel(queue, ISR, MACB_BIT(RCOMP));
 
@@ -1413,8 +1412,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
 		/* There is a hardware issue under heavy load where DMA can
 		 * stop, this causes endless "used buffer descriptor read"
 		 * interrupts but it can be cleared by re-enabling RX. See
-		 * the at91 manual, section 41.3.1 or the Zynq manual
-		 * section 16.7.4 for details.
+		 * the at91rm9200 manual, section 41.3.1 or the Zynq manual
+		 * section 16.7.4 for details. RXUBR is only enabled for
+		 * these two versions.
 		 */
 		if (status & MACB_BIT(RXUBR)) {
 			ctrl = macb_readl(bp, NCR);
@@ -2264,7 +2264,7 @@ static void macb_init_hw(struct macb *bp)
 
 		/* Enable interrupts */
 		queue_writel(queue, IER,
-			     MACB_RX_INT_FLAGS |
+			     bp->rx_intr_mask |
 			     MACB_TX_INT_FLAGS |
 			     MACB_BIT(HRESP));
 	}
@@ -3912,6 +3912,7 @@ static const struct macb_config sama5d4_config = {
 };
 
 static const struct macb_config emac_config = {
+	.caps = MACB_CAPS_NEEDS_RSTONUBR,
 	.clk_init = at91ether_clk_init,
 	.init = at91ether_init,
 };
@@ -3933,7 +3934,8 @@ static const struct macb_config zynqmp_config = {
 };
 
 static const struct macb_config zynq_config = {
-	.caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_NO_GIGABIT_HALF,
+	.caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_NO_GIGABIT_HALF |
+		MACB_CAPS_NEEDS_RSTONUBR,
 	.dma_burst_length = 16,
 	.clk_init = macb_clk_init,
 	.init = macb_init,
@@ -4088,6 +4090,10 @@ static int macb_probe(struct platform_device *pdev)
 						macb_dma_desc_get_size(bp);
 	}
 
+	bp->rx_intr_mask = MACB_RX_INT_FLAGS;
+	if (bp->caps & MACB_CAPS_NEEDS_RSTONUBR)
+		bp->rx_intr_mask |= MACB_BIT(RXUBR);
+
 	mac = of_get_mac_address(np);
 	if (mac) {
 		ether_addr_copy(bp->dev->dev_addr, mac);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-02-23 21:28 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-23 21:05 [PATCH AUTOSEL 4.19 01/65] vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel Sasha Levin
2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 02/65] xfrm: refine validation of template and selector families Sasha Levin
2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 03/65] xfrm: Make set-mark default behavior backward compatible Sasha Levin
2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 04/65] netfilter: nft_compat: use refcnt_t type for nft_xt reference count Sasha Levin
2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 05/65] netfilter: nft_compat: make lists per netns Sasha Levin
2019-02-23 21:05 ` [PATCH AUTOSEL 4.19 14/65] ipvs: Fix signed integer overflow when setsockopt timeout Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 28/65] net: altera_tse: fix msgdma_tx_completion on non-zero fill_level case Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 29/65] net: hns: Fix for missing of_node_put() after of_parse_phandle() Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 30/65] net: hns: Restart autoneg need return failed when autoneg off Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 31/65] net: hns: Fix wrong read accesses via Clause 45 MDIO protocol Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 32/65] net: stmmac: dwmac-rk: fix error handling in rk_gmac_powerup() Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 33/65] netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 34/65] netfilter: nfnetlink_osf: add missing fmatch check Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 36/65] selftests: net: use LDLIBS instead of LDFLAGS Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 39/65] qed: Fix bug in tx promiscuous mode settings Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 40/65] qed: Fix LACP pdu drops for VFs Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 41/65] qed: Fix VF probe failure while FLR Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 42/65] qed: Fix system crash in ll2 xmit Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 43/65] qed: Fix stack out of bounds bug Sasha Levin
2019-02-23 21:06 ` [PATCH AUTOSEL 4.19 50/65] net: macb: Apply RXUBR workaround only to versions with errata Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).