netdev.vger.kernel.org archive mirror
* [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
@ 2020-11-09  2:50 Jeff Dike
  2020-11-09 19:47 ` Jakub Kicinski
  2020-11-10  3:55 ` David Ahern
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Dike @ 2020-11-09  2:50 UTC
  To: netdev; +Cc: David Ahern, Jeff Dike

Commit 58956317c8de ("neighbor: Improve garbage collection")
guarantees neighbour table entries a five-second lifetime.  Processes
which make heavy use of multicast can fill the neighbour table with
multicast addresses in five seconds.  At that point, neighbour entries
can't be GC-ed because they aren't five seconds old yet, the kernel
log starts to fill up with "neighbor table overflow!" messages, and
sends start to fail.

This patch allows multicast addresses to be thrown out before they've
lived out their five seconds.  This makes room for non-multicast
addresses and makes messages to all addresses more reliable in these
circumstances.

Signed-off-by: Jeff Dike <jdike@akamai.com>
---
 include/net/neighbour.h | 1 +
 net/core/neighbour.c    | 2 ++
 net/ipv4/arp.c          | 6 ++++++
 net/ipv6/ndisc.c        | 7 +++++++
 4 files changed, 16 insertions(+)

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 81ee17594c32..22ced1381ede 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -204,6 +204,7 @@ struct neigh_table {
 	int			(*pconstructor)(struct pneigh_entry *);
 	void			(*pdestructor)(struct pneigh_entry *);
 	void			(*proxy_redo)(struct sk_buff *skb);
+	int			(*is_multicast)(const void *pkey);
 	bool			(*allow_add)(const struct net_device *dev,
 					     struct netlink_ext_ack *extack);
 	char			*id;
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 8e39e28b0a8d..9500d28a43b0 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -235,6 +235,8 @@ static int neigh_forced_gc(struct neigh_table *tbl)
 
 			write_lock(&n->lock);
 			if ((n->nud_state == NUD_FAILED) ||
+			    (tbl->is_multicast &&
+			     tbl->is_multicast(n->primary_key)) ||
 			    time_after(tref, n->updated))
 				remove = true;
 			write_unlock(&n->lock);
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index 687971d83b4e..b69bd78cfdf6 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -125,6 +125,7 @@ static int arp_constructor(struct neighbour *neigh);
 static void arp_solicit(struct neighbour *neigh, struct sk_buff *skb);
 static void arp_error_report(struct neighbour *neigh, struct sk_buff *skb);
 static void parp_redo(struct sk_buff *skb);
+static int arp_is_multicast(const void *pkey);
 
 static const struct neigh_ops arp_generic_ops = {
 	.family =		AF_INET,
@@ -156,6 +157,7 @@ struct neigh_table arp_tbl = {
 	.key_eq		= arp_key_eq,
 	.constructor	= arp_constructor,
 	.proxy_redo	= parp_redo,
+	.is_multicast   = arp_is_multicast,
 	.id		= "arp_cache",
 	.parms		= {
 		.tbl			= &arp_tbl,
@@ -928,6 +930,10 @@ static void parp_redo(struct sk_buff *skb)
 	arp_process(dev_net(skb->dev), NULL, skb);
 }
 
+static int arp_is_multicast(const void *pkey)
+{
+	return IN_MULTICAST(htonl(*((u32 *)pkey)));
+}
 
 /*
  *	Receive an arp request from the device layer.
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 27f29b957ee7..6aed5536fc5c 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -81,6 +81,7 @@ static void ndisc_error_report(struct neighbour *neigh, struct sk_buff *skb);
 static int pndisc_constructor(struct pneigh_entry *n);
 static void pndisc_destructor(struct pneigh_entry *n);
 static void pndisc_redo(struct sk_buff *skb);
+static int ndisc_is_multicast(const void *pkey);
 
 static const struct neigh_ops ndisc_generic_ops = {
 	.family =		AF_INET6,
@@ -115,6 +116,7 @@ struct neigh_table nd_tbl = {
 	.pconstructor =	pndisc_constructor,
 	.pdestructor =	pndisc_destructor,
 	.proxy_redo =	pndisc_redo,
+	.is_multicast = ndisc_is_multicast,
 	.allow_add  =   ndisc_allow_add,
 	.id =		"ndisc_cache",
 	.parms = {
@@ -1706,6 +1708,11 @@ static void pndisc_redo(struct sk_buff *skb)
 	kfree_skb(skb);
 }
 
+static int ndisc_is_multicast(const void *pkey)
+{
+	return (((struct in6_addr *)pkey)->in6_u.u6_addr8[0] & 0xf0) == 0xf0;
+}
+
 static bool ndisc_suppress_frag_ndisc(struct sk_buff *skb)
 {
 	struct inet6_dev *idev = __in6_dev_get(skb->dev);
-- 
2.17.1
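
The eviction rule that the net/core/neighbour.c hunk changes can be modelled in userspace. This is a minimal sketch, not the kernel code: `struct model_entry`, `may_evict()` and `count_evictable()` are hypothetical names, and time is counted in whole seconds here rather than in jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: five seconds, matching the lifetime named in the commit message. */
#define MIN_LIFETIME_SECS 5

/* Hypothetical stand-in for the fields the forced-GC test looks at. */
struct model_entry {
	bool failed;     /* models n->nud_state == NUD_FAILED */
	bool multicast;  /* models tbl->is_multicast(n->primary_key) */
	long updated;    /* time of last update, in seconds */
};

/* An entry may be evicted if it has failed, if it is multicast (the new
 * exemption), or if it is strictly older than the minimum lifetime. */
static bool may_evict(const struct model_entry *e, long now)
{
	assert(e != NULL);
	return e->failed || e->multicast ||
	       (now - e->updated) > MIN_LIFETIME_SECS;
}

/* Count how many entries a forced GC pass could reclaim at time `now`. */
static size_t count_evictable(const struct model_entry *tbl, size_t n, long now)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (may_evict(&tbl[i], now))
			count++;
	return count;
}
```

In this model, a table filled with fresh unicast entries yields nothing to reclaim, while fresh multicast entries can always be reclaimed, which is the behaviour the commit message describes.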



* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-09  2:50 [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime Jeff Dike
@ 2020-11-09 19:47 ` Jakub Kicinski
  2020-11-10 14:21   ` Jeff Dike
  2020-11-10  3:55 ` David Ahern
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Kicinski @ 2020-11-09 19:47 UTC
  To: Jeff Dike; +Cc: netdev, David Ahern, Nikolay Aleksandrov

On Sun, 8 Nov 2020 21:50:52 -0500 Jeff Dike wrote:
> Commit 58956317c8de ("neighbor: Improve garbage collection")
> guarantees neighbour table entries a five-second lifetime.  Processes
> which make heavy use of multicast can fill the neighbour table with
> multicast addresses in five seconds.  At that point, neighbour entries
> can't be GC-ed because they aren't five seconds old yet, the kernel
> log starts to fill up with "neighbor table overflow!" messages, and
> sends start to fail.
> 
> This patch allows multicast addresses to be thrown out before they've
> lived out their five seconds.  This makes room for non-multicast
> addresses and makes messages to all addresses more reliable in these
> circumstances.
> 
> Signed-off-by: Jeff Dike <jdike@akamai.com>

This makes sense because mcast L2 addr is calculated, not discovered,
and therefore can be recreated at a very low cost, correct?
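
To illustrate "calculated, not discovered": the Ethernet address for a multicast group is derived from the group address alone, with no ARP or ND exchange. A minimal userspace sketch follows, assuming the standard mappings (01:00:5e plus the low 23 bits of the group for IPv4, 33:33 plus the last four bytes for IPv6). The function names are hypothetical and are not the kernel's helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Map an IPv4 multicast group (network byte order) to its Ethernet
 * address: 01:00:5e followed by the low 23 bits of the group. */
static void ipv4_mc_to_mac(uint32_t group_be, uint8_t mac[6])
{
	uint32_t host = ntohl(group_be);

	assert(IN_MULTICAST(host));	/* caller must pass a 224.0.0.0/4 address */
	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (host >> 16) & 0x7f;	/* top bit of this byte is dropped */
	mac[4] = (host >> 8) & 0xff;
	mac[5] = host & 0xff;
}

/* Map an IPv6 multicast group to its Ethernet address:
 * 33:33 followed by the last four bytes of the group. */
static void ipv6_mc_to_mac(const struct in6_addr *group, uint8_t mac[6])
{
	assert(group != NULL);
	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &group->s6_addr[12], 4);
}
```

Because the result depends only on the key, a multicast entry that is thrown away can be rebuilt on the next send without any traffic on the wire.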

Perhaps it would make sense to widen the API to any "computed" address
rather than implicitly depending on this behavior for mcast?

I'm not an expert tho, maybe others disagree.

> +static int arp_is_multicast(const void *pkey)
> +{
> +	return IN_MULTICAST(htonl(*((u32 *)pkey)));
> +}

net/ipv4/arp.c:935:16: warning: cast from restricted __be32

s/u32/__be32/
s/htonl/ntohl/
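
For context on the two substitutions: the key is stored in network byte order, and IN_MULTICAST() expects host byte order, so the conversion should be spelled ntohl(). The sketch below is a userspace stand-in, not the kernel function; `key_is_ipv4_multicast()` is a hypothetical name and `uint32_t` takes the place of `__be32`, which userspace does not provide. Since htonl() and ntohl() perform the same byte swap, the change is about getting the type annotations right for sparse rather than about the computed result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Userspace stand-in for the helper with both substitutions applied.
 * pkey points at an IPv4 address in network byte order. */
static int key_is_ipv4_multicast(const void *pkey)
{
	uint32_t addr_be;

	assert(pkey != NULL);
	memcpy(&addr_be, pkey, sizeof(addr_be));	/* avoids alignment assumptions */
	/* Network to host order, then test for 224.0.0.0/4. */
	return IN_MULTICAST(ntohl(addr_be)) ? 1 : 0;
}
```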


* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-09  2:50 [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime Jeff Dike
  2020-11-09 19:47 ` Jakub Kicinski
@ 2020-11-10  3:55 ` David Ahern
  2020-11-10 14:24   ` Jeff Dike
  1 sibling, 1 reply; 7+ messages in thread
From: David Ahern @ 2020-11-10  3:55 UTC
  To: Jeff Dike, netdev

in addition to Jakub's comments ...

On 11/8/20 7:50 PM, Jeff Dike wrote:
> @@ -1706,6 +1708,11 @@ static void pndisc_redo(struct sk_buff *skb)
>  	kfree_skb(skb);
>  }
>  
> +static int ndisc_is_multicast(const void *pkey)
> +{
> +	return (((struct in6_addr *)pkey)->in6_u.u6_addr8[0] & 0xf0) == 0xf0;

ipv6_addr_type() and IPV6_ADDR_MULTICAST is the better way to code this.



> +}
> +
>  static bool ndisc_suppress_frag_ndisc(struct sk_buff *skb)
>  {
>  	struct inet6_dev *idev = __in6_dev_get(skb->dev);
> 



* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-09 19:47 ` Jakub Kicinski
@ 2020-11-10 14:21   ` Jeff Dike
  2020-11-10 15:05     ` David Ahern
  2020-11-10 16:03     ` Jakub Kicinski
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Dike @ 2020-11-10 14:21 UTC
  To: Jakub Kicinski; +Cc: netdev, David Ahern, Nikolay Aleksandrov

Hi Jakub,

On 11/9/20 2:47 PM, Jakub Kicinski wrote:
> This makes sense because mcast L2 addr is calculated, not discovered,
> and therefore can be recreated at a very low cost, correct?

Yes.

> Perhaps it would make sense to widen the API to any "computed" address
> rather than implicitly depending on this behavior for mcast?

I'm happy to do that, but I don't know of any other types of addresses which are computed and end up in the neighbors table.

> I'm not an expert tho, maybe others disagree.
> 
>> +static int arp_is_multicast(const void *pkey)
>> +{
>> +	return IN_MULTICAST(htonl(*((u32 *)pkey)));
>> +}
> 
> net/ipv4/arp.c:935:16: warning: cast from restricted __be32
> 
> s/u32/__be32/
> s/htonl/ntohl/

Thanks, I ran sparse, but must have missed that somehow.

Jeff



* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-10  3:55 ` David Ahern
@ 2020-11-10 14:24   ` Jeff Dike
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Dike @ 2020-11-10 14:24 UTC
  To: David Ahern, netdev

Hi David,

On 11/9/20 10:55 PM, David Ahern wrote:
> ipv6_addr_type() and IPV6_ADDR_MULTICAST is the better way to code this.

Thanks, will fix.

Jeff


* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-10 14:21   ` Jeff Dike
@ 2020-11-10 15:05     ` David Ahern
  2020-11-10 16:03     ` Jakub Kicinski
  1 sibling, 0 replies; 7+ messages in thread
From: David Ahern @ 2020-11-10 15:05 UTC
  To: Jeff Dike, Jakub Kicinski; +Cc: netdev, Nikolay Aleksandrov

On 11/10/20 7:21 AM, Jeff Dike wrote:
> Hi Jakub,
> 
> On 11/9/20 2:47 PM, Jakub Kicinski wrote:
>> This makes sense because mcast L2 addr is calculated, not discovered,
>> and therefore can be recreated at a very low cost, correct?
> 
> Yes.
> 
>> Perhaps it would make sense to widen the API to any "computed" address
>> rather than implicitly depending on this behavior for mcast?
> 
> I'm happy to do that, but I don't know of any other types of addresses which are computed and end up in the neighbors table.
> 
>> I'm not an expert tho, maybe others disagree.
>>
>>> +static int arp_is_multicast(const void *pkey)
>>> +{
>>> +	return IN_MULTICAST(htonl(*((u32 *)pkey)));
>>> +}
>>
>> net/ipv4/arp.c:935:16: warning: cast from restricted __be32
>>
>> s/u32/__be32/
>> s/htonl/ntohl/
> 
> Thanks, I ran sparse, but must have missed that somehow.
> 

I missed this yesterday -- ipv4_is_multicast() is more appropriate and
the norm for IPv4 addresses.
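
One reason this form is attractive is that it needs no byte swap on the address at all: the mask and the expected prefix are converted to network order instead, and the key is compared as stored. The following is a userspace sketch of that idea, not the kernel source; `be32_is_multicast()` and `arp_key_is_multicast()` are hypothetical names, and `uint32_t` stands in for `__be32`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Test a network-byte-order IPv4 address for 224.0.0.0/4 without
 * converting the address itself; only the constants are swapped. */
static int be32_is_multicast(uint32_t addr_be)
{
	return (addr_be & htonl(0xf0000000u)) == htonl(0xe0000000u);
}

/* Hypothetical table callback built on top of it. */
static int arp_key_is_multicast(const void *pkey)
{
	uint32_t addr_be;

	assert(pkey != NULL);
	memcpy(&addr_be, pkey, sizeof(addr_be));
	return be32_is_multicast(addr_be);
}
```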


* Re: [PATCH net V2] Exempt multicast addresses from five-second neighbor lifetime
  2020-11-10 14:21   ` Jeff Dike
  2020-11-10 15:05     ` David Ahern
@ 2020-11-10 16:03     ` Jakub Kicinski
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2020-11-10 16:03 UTC
  To: Jeff Dike; +Cc: netdev, David Ahern, Nikolay Aleksandrov

On Tue, 10 Nov 2020 09:21:53 -0500 Jeff Dike wrote:
> > Perhaps it would make sense to widen the API to any "computed" address
> > rather than implicitly depending on this behavior for mcast?  
> 
> I'm happy to do that, but I don't know of any other types of
> addresses which are computed and end up in the neighbors table.

Fair point, thinking about it again only mcast or local addresses 
could be computed but I never heard of local addresses being used
like that, so you can stick to what you have.
