[PATCH net] neigh: Force garbage collection if an entry is deleted administratively

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
@ 2013-11-12  8:57 Steffen Klassert
  2013-11-14  7:23 ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Steffen Klassert @ 2013-11-12  8:57 UTC (permalink / raw)
  To: David Miller; +Cc: YOSHIFUJI Hideaki, netdev

Since git commit 2724680 ("neigh: Keep neighbour cache entries if number
of them is small enough."), we keep all neighbour cache entries if the
number is below a threshold. But if we now delete an entry administratively
and then try to replace this by a permanent one, we get -EEXIST because the
old entry ist still in the table (in NUD_FAILED state).

So lets force a garbage collect if we delete an entry administratively.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/core/neighbour.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 6072610..ec20880 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1659,6 +1659,8 @@ static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh)
 				   NEIGH_UPDATE_F_OVERRIDE |
 				   NEIGH_UPDATE_F_ADMIN);
 		neigh_release(neigh);
+
+		neigh_forced_gc(tbl);
 		goto out;
 	}
 	read_unlock(&neigh_tbl_lock);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-12  8:57 [PATCH net] neigh: Force garbage collection if an entry is deleted administratively Steffen Klassert
@ 2013-11-14  7:23 ` David Miller
  2013-11-18 10:08   ` Steffen Klassert
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2013-11-14  7:23 UTC (permalink / raw)
  To: steffen.klassert; +Cc: yoshfuji, netdev

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Tue, 12 Nov 2013 09:57:14 +0100

> Since git commit 2724680 ("neigh: Keep neighbour cache entries if number
> of them is small enough."), we keep all neighbour cache entries if the
> number is below a threshold. But if we now delete an entry administratively
> and then try to replace this by a permanent one, we get -EEXIST because the
> old entry ist still in the table (in NUD_FAILED state).
> 
> So lets force a garbage collect if we delete an entry administratively.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

I don't think this is a sufficient fix.

If some entity refers to this entry (refcnt != 1) then the forced
GC won't do anything, and this is very much possible.

It is the difficult situation mentioned in a comment in
neigh_flush_dev() below the "refcnt != 1" test there.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-14  7:23 ` David Miller
@ 2013-11-18 10:08   ` Steffen Klassert
  2013-11-18 21:21     ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Steffen Klassert @ 2013-11-18 10:08 UTC (permalink / raw)
  To: David Miller; +Cc: yoshfuji, netdev

On Thu, Nov 14, 2013 at 02:23:56AM -0500, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Tue, 12 Nov 2013 09:57:14 +0100
> 
> > Since git commit 2724680 ("neigh: Keep neighbour cache entries if number
> > of them is small enough."), we keep all neighbour cache entries if the
> > number is below a threshold. But if we now delete an entry administratively
> > and then try to replace this by a permanent one, we get -EEXIST because the
> > old entry ist still in the table (in NUD_FAILED state).
> > 
> > So lets force a garbage collect if we delete an entry administratively.
> > 
> > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> I don't think this is a sufficient fix.

Yes, looks like that.

> 
> If some entity refers to this entry (refcnt != 1) then the forced
> GC won't do anything, and this is very much possible.
> 
> It is the difficult situation mentioned in a comment in
> neigh_flush_dev() below the "refcnt != 1" test there.

We ensured that an entry will go away if the refcount got to 1
by cyclic garbage collecting. Now we do this only if the cached
entries are above a threshold, so we have to handle this somehow
different if the cached entries are below this threshold.

Currently we do cyclic rescheduling of the gc worker but the worker
exits immediately if cached entries are below the threshold.
I'm testing an approach where we schedule the gc worker only if
the entries are above the threshold or if there is an administrative
change.

The patch I'm testing is below. It is pure RFC at the moment,
but I would be grateful for comments on the approach.


Subject: [PATCH RFC] neigh: Fix garbage collection if the cached entries are
 below the threshold

Since git commit 2724680 ("neigh: Keep neighbour cache entries if number
of them is small enough."), we keep all neighbour cache entries if the
number is below a threshold. But if we now delete an entry administratively
and then try to replace this by a permanent one, we get -EEXIST because the
old entry ist still in the table (in NUD_FAILED state).

So remove the threshold check in neigh_periodic_work() and schedule the
gc_work only when needed, i.e. if gc_thresh1 is reached or if there is
an administrative change. We reschedule gc_work either if the number of
cache entries is still above gc_thresh1 or if there are invalid entries
with "refcnt != 1" cached.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/core/neighbour.c |   29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ca15f32..39c2a24 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -239,6 +239,9 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
 				else
 					n->nud_state = NUD_NONE;
 				neigh_dbg(2, "neigh %p is stray\n", n);
+				schedule_delayed_work(&tbl->gc_work,
+						      tbl->parms.base_reachable_time >> 1);
+
 			}
 			write_unlock(&n->lock);
 			neigh_cleanup_and_release(n);
@@ -282,6 +285,9 @@ static struct neighbour *neigh_alloc(struct neigh_table *tbl, struct net_device
 			goto out_entries;
 	}
 
+	if (entries >= tbl->gc_thresh1)
+		schedule_delayed_work(&tbl->gc_work, 0);
+
 	n = kzalloc(tbl->entry_size + dev->neigh_priv_len, GFP_ATOMIC);
 	if (!n)
 		goto out_entries;
@@ -757,6 +763,7 @@ static void neigh_periodic_work(struct work_struct *work)
 	struct neighbour __rcu **np;
 	unsigned int i;
 	struct neigh_hash_table *nht;
+	bool schedule = false;
 
 	NEIGH_CACHE_STAT_INC(tbl, periodic_gc_runs);
 
@@ -764,9 +771,6 @@ static void neigh_periodic_work(struct work_struct *work)
 	nht = rcu_dereference_protected(tbl->nht,
 					lockdep_is_held(&tbl->lock));
 
-	if (atomic_read(&tbl->entries) < tbl->gc_thresh1)
-		goto out;
-
 	/*
 	 *	periodically recompute ReachableTime from random function
 	 */
@@ -785,6 +789,7 @@ static void neigh_periodic_work(struct work_struct *work)
 		while ((n = rcu_dereference_protected(*np,
 				lockdep_is_held(&tbl->lock))) != NULL) {
 			unsigned int state;
+			int refcnt;
 
 			write_lock(&n->lock);
 
@@ -797,7 +802,8 @@ static void neigh_periodic_work(struct work_struct *work)
 			if (time_before(n->used, n->confirmed))
 				n->used = n->confirmed;
 
-			if (atomic_read(&n->refcnt) == 1 &&
+			refcnt = atomic_read(&n->refcnt);
+			if (refcnt == 1 &&
 			    (state == NUD_FAILED ||
 			     time_after(jiffies, n->used + n->parms->gc_staletime))) {
 				*np = n->next;
@@ -805,7 +811,8 @@ static void neigh_periodic_work(struct work_struct *work)
 				write_unlock(&n->lock);
 				neigh_cleanup_and_release(n);
 				continue;
-			}
+			} else if (refcnt != 1 && !(state & NUD_VALID))
+				schedule = true;
 			write_unlock(&n->lock);
 
 next_elt:
@@ -821,13 +828,15 @@ next_elt:
 		nht = rcu_dereference_protected(tbl->nht,
 						lockdep_is_held(&tbl->lock));
 	}
-out:
-	/* Cycle through all hash buckets every base_reachable_time/2 ticks.
+
+	/* Cycle through all hash buckets every base_reachable_time/2 ticks
+	 * as long as we have more than gc_thresh1 entries cached.
 	 * ARP entry timeouts range from 1/2 base_reachable_time to 3/2
 	 * base_reachable_time.
 	 */
-	schedule_delayed_work(&tbl->gc_work,
-			      tbl->parms.base_reachable_time >> 1);
+	if (schedule == true || atomic_read(&tbl->entries) >= tbl->gc_thresh1)
+		schedule_delayed_work(&tbl->gc_work,
+				      tbl->parms.base_reachable_time >> 1);
 	write_unlock_bh(&tbl->lock);
 }
 
@@ -1659,6 +1668,8 @@ static int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh)
 				   NEIGH_UPDATE_F_OVERRIDE |
 				   NEIGH_UPDATE_F_ADMIN);
 		neigh_release(neigh);
+
+		schedule_delayed_work(&tbl->gc_work, 0);
 		goto out;
 	}
 	read_unlock(&neigh_tbl_lock);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-18 10:08   ` Steffen Klassert
@ 2013-11-18 21:21     ` David Miller
  2013-11-19 11:54       ` Steffen Klassert
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2013-11-18 21:21 UTC (permalink / raw)
  To: steffen.klassert; +Cc: yoshfuji, netdev

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Mon, 18 Nov 2013 11:08:43 +0100

> Subject: [PATCH RFC] neigh: Fix garbage collection if the cached entries are
>  below the threshold
> 
> Since git commit 2724680 ("neigh: Keep neighbour cache entries if number
> of them is small enough."), we keep all neighbour cache entries if the
> number is below a threshold. But if we now delete an entry administratively
> and then try to replace this by a permanent one, we get -EEXIST because the
> old entry ist still in the table (in NUD_FAILED state).
> 
> So remove the threshold check in neigh_periodic_work() and schedule the
> gc_work only when needed, i.e. if gc_thresh1 is reached or if there is
> an administrative change. We reschedule gc_work either if the number of
> cache entries is still above gc_thresh1 or if there are invalid entries
> with "refcnt != 1" cached.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

I think the main issue is that after this patch, the problem is really
still there.

Let's say some device holds onto the neigh for a long time, then during
this time an administrative replacement will still get that -EEXIST
failure.

My conclusion is that the management of the state is the problem.
Specifically, if we invalidate an entry then we should remove it's
visisbility.  This means the table should operate by unhashing the
entry unconditionally during such operations.

If some stray references exist, that's fine, the entity holding the
reference will perform the final neigh cleanup at release time.

Does this make sense to you?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-18 21:21     ` David Miller
@ 2013-11-19 11:54       ` Steffen Klassert
  2013-11-19 12:08         ` David Laight
  0 siblings, 1 reply; 8+ messages in thread
From: Steffen Klassert @ 2013-11-19 11:54 UTC (permalink / raw)
  To: David Miller; +Cc: yoshfuji, netdev

On Mon, Nov 18, 2013 at 04:21:15PM -0500, David Miller wrote:
> 
> I think the main issue is that after this patch, the problem is really
> still there.
> 
> Let's say some device holds onto the neigh for a long time, then during
> this time an administrative replacement will still get that -EEXIST
> failure.

Well, I think we had this problem even before we stopped to do
cyclic garbage collecting. This patch would just restore the
behaviour we had before this change.

> 
> My conclusion is that the management of the state is the problem.
> Specifically, if we invalidate an entry then we should remove it's
> visisbility.  This means the table should operate by unhashing the
> entry unconditionally during such operations.
> 
> If some stray references exist, that's fine, the entity holding the
> reference will perform the final neigh cleanup at release time.
> 
> Does this make sense to you?

Yes, makes sense :-)

I'm not sure how invasive this will be, so maybe we should do
one patch to restore the old behaviour (for net) and one on
top of this to improve the situation as you described
(for net-next).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-19 11:54       ` Steffen Klassert
@ 2013-11-19 12:08         ` David Laight
  2013-11-19 12:41           ` Steffen Klassert
  0 siblings, 1 reply; 8+ messages in thread
From: David Laight @ 2013-11-19 12:08 UTC (permalink / raw)
  To: Steffen Klassert, David Miller; +Cc: yoshfuji, netdev

> > My conclusion is that the management of the state is the problem.
> > Specifically, if we invalidate an entry then we should remove it's
> > visisbility.  This means the table should operate by unhashing the
> > entry unconditionally during such operations.
> >
> > If some stray references exist, that's fine, the entity holding the
> > reference will perform the final neigh cleanup at release time.
> >
> > Does this make sense to you?
> 
> Yes, makes sense :-)

Isn't it enough to act as if the entry were not in the hash tables.
So an attempt to add such an entry wouldn't fail.

I've not looked at these code paths, but it can easily be that when
the entry is invalidated the hash table isn't (and can't easily be)
locked - just having the entry locked may make it difficult.

Whereas the code path to add an entry can easily delete old entries.

	David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-19 12:08         ` David Laight
@ 2013-11-19 12:41           ` Steffen Klassert
  2013-11-21 19:18             ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Steffen Klassert @ 2013-11-19 12:41 UTC (permalink / raw)
  To: David Laight; +Cc: David Miller, yoshfuji, netdev

On Tue, Nov 19, 2013 at 12:08:06PM -0000, David Laight wrote:
> > > My conclusion is that the management of the state is the problem.
> > > Specifically, if we invalidate an entry then we should remove it's
> > > visisbility.  This means the table should operate by unhashing the
> > > entry unconditionally during such operations.
> > >
> > > If some stray references exist, that's fine, the entity holding the
> > > reference will perform the final neigh cleanup at release time.
> > >
> > > Does this make sense to you?
> > 
> > Yes, makes sense :-)
> 
> Isn't it enough to act as if the entry were not in the hash tables.
> So an attempt to add such an entry wouldn't fail.

Hm, how you want to do that?

> 
> I've not looked at these code paths, but it can easily be that when
> the entry is invalidated the hash table isn't (and can't easily be)
> locked - just having the entry locked may make it difficult.
> 

We have the table locked in neigh_periodic_work() so we can unlink
invalidated entries there. This function could then periodically
check and remove the unlinked entries if they lost their references.
Unlinking with neigh_periodic_work() would have some seconds delay
of course, but I think this is acceptable.

> Whereas the code path to add an entry can easily delete old entries.
> 

Well, it can not if the old entry is still referenced. That's why
we need to look periodically for stale entries and remove them
when they lost their references.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] neigh: Force garbage collection if an entry is deleted administratively
  2013-11-19 12:41           ` Steffen Klassert
@ 2013-11-21 19:18             ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2013-11-21 19:18 UTC (permalink / raw)
  To: steffen.klassert; +Cc: David.Laight, yoshfuji, netdev

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Tue, 19 Nov 2013 13:41:02 +0100

> We have the table locked in neigh_periodic_work() so we can unlink
> invalidated entries there. This function could then periodically
> check and remove the unlinked entries if they lost their references.
> Unlinking with neigh_periodic_work() would have some seconds delay
> of course, but I think this is acceptable.

Create an "invalidated_list", and at the moment the neigh is
administratively deleted you can remove it from the hash table and
link it into this new list.

Then neigh_periodic_work() can scan both the hash tables and this
"invalidated_list" for refcnt==1 entries.

It seems a good solution, and should avoid the need to handle
forcibly freeing a neigh which still has external references.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-11-21 19:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-12  8:57 [PATCH net] neigh: Force garbage collection if an entry is deleted administratively Steffen Klassert
2013-11-14  7:23 ` David Miller
2013-11-18 10:08   ` Steffen Klassert
2013-11-18 21:21     ` David Miller
2013-11-19 11:54       ` Steffen Klassert
2013-11-19 12:08         ` David Laight
2013-11-19 12:41           ` Steffen Klassert
2013-11-21 19:18             ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.