All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
@ 2016-07-21 17:58 Chunhui He
  2016-07-22  7:20   ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Chunhui He @ 2016-07-21 17:58 UTC (permalink / raw)
  To: David S. Miller
  Cc: Chunhui He, David Ahern, Nicolas Dichtel, Roopa Prabhu,
	Robert Shearman, David Barroso, Martin Zhang, Rick Jones,
	Konstantin Khlebnikov, Eric Dumazet, Thomas Graf,
	Eric W. Biederman, YOSHIFUJI Hideaki, netdev, linux-kernel

If neigh entry was CONNECTED and address is not changed, and if new state is
STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
possible to change state from DELAY to STALE.

That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
is referenced to send packets, so goes to DELAY state. If the entry is not
confirmed by upper layer, it goes to PROBE state, and sends ARP request.
The neigh host sends ARP reply, then the entry goes to REACHABLE state.
But the entry state may be reseted to STALE by broadcast ARP packets, before
the entry goes to PROBE state. So it's possible that the entry will never go
to REACHABLE state, without external confirmation.

In my case, the gateway refuses to send unicast packets to me, before it sees
my ARP request. So it's critical to enter REACHABLE state by sending ARP
request, but not by external confirmation.

This fixes neigh_update() not to change to STALE if old state is CONNECTED or
DELAY.

Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
---
 net/core/neighbour.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 510cd62..29429eb 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 		} else {
 			if (lladdr == neigh->ha && new == NUD_STALE &&
 			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
-			     (old & NUD_CONNECTED))
+			     (old & (NUD_CONNECTED | NUD_DELAY)))
 			    )
 				new = old;
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-21 17:58 [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update() Chunhui He
@ 2016-07-22  7:20   ` Julian Anastasov
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2016-07-22  7:20 UTC (permalink / raw)
  To: Chunhui He
  Cc: David S. Miller, David Ahern, Nicolas Dichtel, Roopa Prabhu,
	Robert Shearman, David Barroso, Martin Zhang, Rick Jones,
	Konstantin Khlebnikov, Eric Dumazet, Thomas Graf,
	Eric W. Biederman, YOSHIFUJI Hideaki, netdev, linux-kernel


	Hello,

On Thu, 21 Jul 2016, Chunhui He wrote:

> If neigh entry was CONNECTED and address is not changed, and if new state is
> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
> possible to change state from DELAY to STALE.
> 
> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
> is referenced to send packets, so goes to DELAY state. If the entry is not
> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
> But the entry state may be reseted to STALE by broadcast ARP packets, before
> the entry goes to PROBE state. So it's possible that the entry will never go
> to REACHABLE state, without external confirmation.
> 
> In my case, the gateway refuses to send unicast packets to me, before it sees
> my ARP request. So it's critical to enter REACHABLE state by sending ARP
> request, but not by external confirmation.
> 
> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
> DELAY.
> 
> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
> ---
>  net/core/neighbour.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 510cd62..29429eb 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  		} else {
>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
> -			     (old & NUD_CONNECTED))
> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>  			    )
>  				new = old;
>  		}

	You change looks correct to me. But this place
has more problems. There is no good reason to set NUD_STALE
for any state that is NUD_VALID if address is not changed.
This matches perfectly the comment above this code:
NUD_STALE should change a NUD_VALID state only when
address changes. It also means that IPv6 does not need
to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
NEIGH_UPDATE_F_OVERRIDE is also present.

	By this way the state machine can continue with
the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
while the address is not changed. Your change covers only
NUD_DELAY, not NUD_PROBE, so it is better to allow more
retries to send. We should not give up until success (NUD_REACHABLE).

	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
change from NUD_PERMANENT to NUD_STALE:

# ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
# ip neigh show to 192.168.168.111
192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
# ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
# ip neigh show to 192.168.168.111
192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT

	IMHO, here is how this place should look:

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 5cdc62a..2b1cb91 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 				goto out;
 		} else {
 			if (lladdr == neigh->ha && new == NUD_STALE &&
-			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
-			     (old & NUD_CONNECTED))
-			    )
-				new = old;
+			    !(flags & NEIGH_UPDATE_F_ADMIN))
+				goto out;
 		}
 	}

	Any thoughts?
 
Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
@ 2016-07-22  7:20   ` Julian Anastasov
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2016-07-22  7:20 UTC (permalink / raw)
  To: Chunhui He
  Cc: David S. Miller, David Ahern, Nicolas Dichtel, Roopa Prabhu,
	Robert Shearman, David Barroso, Martin Zhang, Rick Jones,
	Konstantin Khlebnikov, Eric Dumazet, Thomas Graf,
	Eric W. Biederman, YOSHIFUJI Hideaki, netdev, linux-kernel


	Hello,

On Thu, 21 Jul 2016, Chunhui He wrote:

> If neigh entry was CONNECTED and address is not changed, and if new state is
> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
> possible to change state from DELAY to STALE.
> 
> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
> is referenced to send packets, so goes to DELAY state. If the entry is not
> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
> But the entry state may be reseted to STALE by broadcast ARP packets, before
> the entry goes to PROBE state. So it's possible that the entry will never go
> to REACHABLE state, without external confirmation.
> 
> In my case, the gateway refuses to send unicast packets to me, before it sees
> my ARP request. So it's critical to enter REACHABLE state by sending ARP
> request, but not by external confirmation.
> 
> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
> DELAY.
> 
> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
> ---
>  net/core/neighbour.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 510cd62..29429eb 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  		} else {
>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
> -			     (old & NUD_CONNECTED))
> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>  			    )
>  				new = old;
>  		}

	You change looks correct to me. But this place
has more problems. There is no good reason to set NUD_STALE
for any state that is NUD_VALID if address is not changed.
This matches perfectly the comment above this code:
NUD_STALE should change a NUD_VALID state only when
address changes. It also means that IPv6 does not need
to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
NEIGH_UPDATE_F_OVERRIDE is also present.

	By this way the state machine can continue with
the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
while the address is not changed. Your change covers only
NUD_DELAY, not NUD_PROBE, so it is better to allow more
retries to send. We should not give up until success (NUD_REACHABLE).

	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
change from NUD_PERMANENT to NUD_STALE:

# ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
# ip neigh show to 192.168.168.111
192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
# ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
# ip neigh show to 192.168.168.111
192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT

	IMHO, here is how this place should look:

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 5cdc62a..2b1cb91 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
 				goto out;
 		} else {
 			if (lladdr == neigh->ha && new == NUD_STALE &&
-			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
-			     (old & NUD_CONNECTED))
-			    )
-				new = old;
+			    !(flags & NEIGH_UPDATE_F_ADMIN))
+				goto out;
 		}
 	}

	Any thoughts?
 
Regards

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-22  7:20   ` Julian Anastasov
  (?)
@ 2016-07-22  9:41   ` Hannes Frederic Sowa
  -1 siblings, 0 replies; 16+ messages in thread
From: Hannes Frederic Sowa @ 2016-07-22  9:41 UTC (permalink / raw)
  To: Julian Anastasov, Chunhui He
  Cc: David S. Miller, David Ahern, Nicolas Dichtel, Roopa Prabhu,
	Robert Shearman, David Barroso, Martin Zhang, Rick Jones,
	Konstantin Khlebnikov, Eric Dumazet, Thomas Graf,
	Eric W. Biederman, YOSHIFUJI Hideaki, netdev, linux-kernel

On 22.07.2016 09:20, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Thu, 21 Jul 2016, Chunhui He wrote:
> 
>> If neigh entry was CONNECTED and address is not changed, and if new state is
>> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
>> possible to change state from DELAY to STALE.
>>
>> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
>> is referenced to send packets, so goes to DELAY state. If the entry is not
>> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
>> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
>> But the entry state may be reseted to STALE by broadcast ARP packets, before
>> the entry goes to PROBE state. So it's possible that the entry will never go
>> to REACHABLE state, without external confirmation.
>>
>> In my case, the gateway refuses to send unicast packets to me, before it sees
>> my ARP request. So it's critical to enter REACHABLE state by sending ARP
>> request, but not by external confirmation.
>>
>> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
>> DELAY.
>>
>> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
>> ---
>>  net/core/neighbour.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 510cd62..29429eb 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>  		} else {
>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>> -			     (old & NUD_CONNECTED))
>> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>>  			    )
>>  				new = old;
>>  		}
> 
> 	You change looks correct to me. But this place
> has more problems. There is no good reason to set NUD_STALE
> for any state that is NUD_VALID if address is not changed.
> This matches perfectly the comment above this code:
> NUD_STALE should change a NUD_VALID state only when
> address changes. It also means that IPv6 does not need
> to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
> NEIGH_UPDATE_F_OVERRIDE is also present.
> 
> 	By this way the state machine can continue with
> the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
> NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
> while the address is not changed. Your change covers only
> NUD_DELAY, not NUD_PROBE, so it is better to allow more
> retries to send. We should not give up until success (NUD_REACHABLE).
> 
> 	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
> priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
> change from NUD_PERMANENT to NUD_STALE:
> 
> # ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
> # ip neigh show to 192.168.168.111
> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
> # ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
> # ip neigh show to 192.168.168.111
> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
> 
> 	IMHO, here is how this place should look:
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 5cdc62a..2b1cb91 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  				goto out;
>  		} else {
>  			if (lladdr == neigh->ha && new == NUD_STALE &&
> -			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
> -			     (old & NUD_CONNECTED))
> -			    )
> -				new = old;
> +			    !(flags & NEIGH_UPDATE_F_ADMIN))
> +				goto out;
>  		}
>  	}
> 
> 	Any thoughts?

This change makes perfectly sense to me.

Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Thanks,
Hannes

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-22  7:20   ` Julian Anastasov
  (?)
  (?)
@ 2016-07-22 12:47   ` Chunhui He
  2016-07-23  6:17     ` Julian Anastasov
  2016-07-25  5:20     ` YOSHIFUJI Hideaki/吉藤英明
  -1 siblings, 2 replies; 16+ messages in thread
From: Chunhui He @ 2016-07-22 12:47 UTC (permalink / raw)
  To: ja
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel

Hi,

On Fri, 22 Jul 2016 10:20:01 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
> 
> 	Hello,
> 
> On Thu, 21 Jul 2016, Chunhui He wrote:
> 
>> If neigh entry was CONNECTED and address is not changed, and if new state is
>> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
>> possible to change state from DELAY to STALE.
>> 
>> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
>> is referenced to send packets, so goes to DELAY state. If the entry is not
>> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
>> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
>> But the entry state may be reseted to STALE by broadcast ARP packets, before
>> the entry goes to PROBE state. So it's possible that the entry will never go
>> to REACHABLE state, without external confirmation.
>> 
>> In my case, the gateway refuses to send unicast packets to me, before it sees
>> my ARP request. So it's critical to enter REACHABLE state by sending ARP
>> request, but not by external confirmation.
>> 
>> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
>> DELAY.
>> 
>> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
>> ---
>>  net/core/neighbour.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 510cd62..29429eb 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>  		} else {
>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>> -			     (old & NUD_CONNECTED))
>> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>>  			    )
>>  				new = old;
>>  		}
> 
> 	You change looks correct to me. But this place
> has more problems. There is no good reason to set NUD_STALE
> for any state that is NUD_VALID if address is not changed.
> This matches perfectly the comment above this code:
> NUD_STALE should change a NUD_VALID state only when
> address changes. It also means that IPv6 does not need
> to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
> NEIGH_UPDATE_F_OVERRIDE is also present.
>

The NEIGH_UPDATE_F_WEAK_OVERRIDE is confusing to me, so I choose not to deal
with the flag.

> 	By this way the state machine can continue with
> the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
> NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
> while the address is not changed. Your change covers only
> NUD_DELAY, not NUD_PROBE, so it is better to allow more
> retries to send. We should not give up until success (NUD_REACHABLE).
>

I have thought about this.
The origin code allows NUD_DELAY -> NUD_STALE and NUD_PROBE -> NUD_STALE.
This part was imported to kernel since v2.1.79, I don't know clearly why it
allows that.

My analysis:
(1) As shown in my previous mail, NUD_DELAY -> NUD_STALE may cause "dead loop",
so it should be fixed.

(2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP request
has been sent, it is sufficient to break the "dead loop".
More attempts are accomplished by the following sequence:
NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->
NUD_STALE --> NUD_DELAY -(send req again)-> ... -->
NUD_REACHABLE


But I also agree your change.

> 	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
> priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
> change from NUD_PERMANENT to NUD_STALE:
>
> # ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
> # ip neigh show to 192.168.168.111
> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
> # ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
> # ip neigh show to 192.168.168.111
> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
> 
> 	IMHO, here is how this place should look:
> 
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 5cdc62a..2b1cb91 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>  				goto out;
>  		} else {
>  			if (lladdr == neigh->ha && new == NUD_STALE &&
> -			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
> -			     (old & NUD_CONNECTED))
> -			    )
> -				new = old;
> +			    !(flags & NEIGH_UPDATE_F_ADMIN))
> +				goto out;
>  		}
>  	}
> 
> 	Any thoughts?
>  
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>

Regards,
Chunhui He

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-22 12:47   ` Chunhui He
@ 2016-07-23  6:17     ` Julian Anastasov
  2016-07-23 11:24       ` Chunhui He
  2016-07-25  5:20     ` YOSHIFUJI Hideaki/吉藤英明
  1 sibling, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2016-07-23  6:17 UTC (permalink / raw)
  To: Chunhui He
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


	Hello,

On Fri, 22 Jul 2016, Chunhui He wrote:

> The origin code allows NUD_DELAY -> NUD_STALE and NUD_PROBE -> NUD_STALE.
> This part was imported to kernel since v2.1.79, I don't know clearly why it
> allows that.
> 
> My analysis:
> (1) As shown in my previous mail, NUD_DELAY -> NUD_STALE may cause "dead loop",
> so it should be fixed.

	Yes, because we stay in NUD_DELAY for many seconds
which is enough for remote host to reset our resolving.

	BTW, you said:

<quote>
In my case, the gateway refuses to send unicast packets to me, before it sees
my ARP request. So it's critical to enter REACHABLE state by sending ARP
request, but not by external confirmation.
</quote>

	What kind of problem is this? Remote host wants to
see a recent probe from us, otherwise it refuses to resolve
our address before its traffic to us and it is not sent?
Can you explain this in more detail because after looking
again I have some doubts what actually happens, see below.

> (2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP request
> has been sent, it is sufficient to break the "dead loop".
> More attempts are accomplished by the following sequence:
> NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->

	I think, when entering NUD_DELAY we do not send
any ARP probe: for NUD_STALE __neigh_event_send is called on
outgoing traffic to change state to NUD_DELAY and to start
timer (it was stopped in NUD_STALE) to detect if address is
still alive before probing it again. Now in this period of
5 seconds (delay_first_probe_time) two things can happen:

1. Unexpected Unicast ARP reply (immediate switch to NUD_REACHABLE)
or protocol indication (dst_confirm) causing delayed switch to
NUD_REACHABLE on next outgoing packet. On sporadic
request+reply we may not switch immediately to NUD_REACHABLE.
Even if the reply called dst_confirm, the change happens
next time when new request is sent and dst_neigh_output is called.

2. Remote host is fast enough to reset us again to NUD_STALE
before we change state to ->NUD_PROBE->NUD_REACHABLE.

	To summarize: currently the change to NUD_STALE serves the
purpose to avoid/delay our hwaddr refreshing probes. They are
avoided if protocols indicate progress with the current hwaddr.
Outgoing IP traffic that does not trigger confirmation
from replies (for example TCP ACK calling dst_confirm) or
from applications (MSG_CONFIRM) surely will cause a
switch to NUD_PROBE.

	Now the main question: is reaching a NUD_REACHABLE
state a good enough goal (if we ignore the NUD_STALE in
NUD_DELAY | NUD_PROBE state) or we prefer traffic that does
not provide confirmation indications to use the current
hwaddr based only on indications from received ARP broadcasts
or requests, in which case we avoid our ARP probes. In the
latter case remote hosts do not see fresh probes from us
and we may cycle between NUD_STALE and NUD_DELAY if
such remote packets come more often.

	So, the question is, to avoid probes or to refresh
frequently? Is there a good reason to ignore this NUD_STALE
event in NUD_DELAY | NUD_PROBE state?

> NUD_STALE --> NUD_DELAY -(send req again)-> ... -->
> NUD_REACHABLE

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-23  6:17     ` Julian Anastasov
@ 2016-07-23 11:24       ` Chunhui He
  2016-07-23 14:09           ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: Chunhui He @ 2016-07-23 11:24 UTC (permalink / raw)
  To: ja
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


On Sat, 23 Jul 2016 09:17:59 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
> <quote>
> In my case, the gateway refuses to send unicast packets to me, before it sees
> my ARP request. So it's critical to enter REACHABLE state by sending ARP
> request, but not by external confirmation.
> </quote>
> 
> 	What kind of problem is this? Remote host wants to
> see a recent probe from us, otherwise it refuses to resolve
> our address before its traffic to us and it is not sent?
> Can you explain this in more detail because after looking
> again I have some doubts what actually happens, see below.
>

The remote host is configured to refuse to send any packets to a host it doesn't
"know" (but broadcast is allowed), and it can only "learn" from ARP packets.

When I send packets, if broadcast ARP requests from the remote host are received
and set the state to NUD_STALE, then I stuck.

>> (2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP request
>> has been sent, it is sufficient to break the "dead loop".
>> More attempts are accomplished by the following sequence:
>> NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->
> 
> 	I think, when entering NUD_DELAY we do not send
> any ARP probe: for NUD_STALE __neigh_event_send is called on
> outgoing traffic to change state to NUD_DELAY and to start
> timer (it was stopped in NUD_STALE) to detect if address is
> still alive before probing it again. Now in this period of
> 5 seconds (delay_first_probe_time) two things can happen:
>

Yes, we do not send any probe when entering NUD_DELAY.
"NUD_DELAY -(send req)-> NUD_PROBE" means when entering NUD_PROBE, send req.

> 1. Unexpected Unicast ARP reply (immediate switch to NUD_REACHABLE)
> or protocol indication (dst_confirm) causing delayed switch to
> NUD_REACHABLE on next outgoing packet. On sporadic
> request+reply we may not switch immediately to NUD_REACHABLE.
> Even if the reply called dst_confirm, the change happens
> next time when new request is sent and dst_neigh_output is called.
> 
> 2. Remote host is fast enough to reset us again to NUD_STALE
> before we change state to ->NUD_PROBE->NUD_REACHABLE.
> 
> 	To summarize: currently the change to NUD_STALE serves the
> purpose to avoid/delay our hwaddr refreshing probes. They are
> avoided if protocols indicate progress with the current hwaddr.
> Outgoing IP traffic that does not trigger confirmation
> from replies (for example TCP ACK calling dst_confirm) or
> from applications (MSG_CONFIRM) surely will cause a
> switch to NUD_PROBE.
>

Yes, I agree.
But now it is possible to delay the probes *forever*, and at the same time we
get no positive response from the remote host.

When entering NUD_DELAY, it means we need some confirmations to ensure the
address we fill is not stale. If we get no evidence, it's our responsibility
to ensure reachability. So some probes are unavoidable, and delay by
NUD_STALE(can not proof fresh) is unacceptable.

> 	Now the main question: is reaching a NUD_REACHABLE
> state a good enough goal (if we ignore the NUD_STALE in
> NUD_DELAY | NUD_PROBE state) or we prefer traffic that does
> not provide confirmation indications to use the current
> hwaddr based only on indications from received ARP broadcasts
> or requests, in which case we avoid our ARP probes. In the
> latter case remote hosts do not see fresh probes from us
> and we may cycle between NUD_STALE and NUD_DELAY if
> such remote packets come more often.
>
> 	So, the question is, to avoid probes or to refresh
> frequently? Is there a good reason to ignore this NUD_STALE
> event in NUD_DELAY | NUD_PROBE state?
>

So reaching a NUD_REACHABLE state in not our goal. It's to ensure correctness.
Cycle between NUD_STALE and NUD_DELAY is not correct.

Maybe it is enough to ignore NUD_STALE?

Regards,
Chunhui

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-23 11:24       ` Chunhui He
@ 2016-07-23 14:09           ` Julian Anastasov
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2016-07-23 14:09 UTC (permalink / raw)
  To: Chunhui He
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


	Hello,

On Sat, 23 Jul 2016, Chunhui He wrote:

> On Sat, 23 Jul 2016 09:17:59 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
> > 
> > 	What kind of problem is this? Remote host wants to
> > see a recent probe from us, otherwise it refuses to resolve
> > our address before its traffic to us and it is not sent?
> > Can you explain this in more detail because after looking
> > again I have some doubts what actually happens, see below.
> >
> 
> The remote host is configured to refuse to send any packets to a host it doesn't
> "know" (but broadcast is allowed), and it can only "learn" from ARP packets.

	Can it learn from our unicast ARP replies that we
should sent in response to its broadcast probes? Or it
expects only ARP requests?

> When I send packets, if broadcast ARP requests from the remote host are received
> and set the state to NUD_STALE, then I stuck.

	So, this is a special case. Is it possible to
solve it from user space?:

1.1. echo 0 > delay_first_probe_time. This can help if
remote hosts sends broadcast ARP probes every second and
if we send IP packets too.

1.2. reduce base_reachable_time if needed to send ARP probes
more often

2. Send ARP probe by using the arping tool, eg. from cron

	Note that solution 1 is not good. If we do not
have traffic to send there will be no ARP probe and the
remote host can not send to us.

> > 	To summarize: currently the change to NUD_STALE serves the
> > purpose to avoid/delay our hwaddr refreshing probes. They are
> > avoided if protocols indicate progress with the current hwaddr.
> > Outgoing IP traffic that does not trigger confirmation
> > from replies (for example TCP ACK calling dst_confirm) or
> > from applications (MSG_CONFIRM) surely will cause a
> > switch to NUD_PROBE.
> >
> 
> Yes, I agree.
> But now it is possible to delay the probes *forever*, and at the same time we
> get no positive response from the remote host.

	What happens if we do not send traffic and the
neigh entry is removed? How the remote host will learn
our address? If remote host sends ARP broadcasts even
arp_accept=1 will create NUD_STALE entry and without any
traffic we can stay in this state, no chance for NUD_DELAY.

> > 	So, the question is, to avoid probes or to refresh
> > frequently? Is there a good reason to ignore this NUD_STALE
> > event in NUD_DELAY | NUD_PROBE state?
> >
> 
> So reaching a NUD_REACHABLE state in not our goal. It's to ensure correctness.
> Cycle between NUD_STALE and NUD_DELAY is not correct.

	The main goal looks to be the reduced ARP traffic. If
we learned the neigh address recently (even if from remote ARP
broadcast probes or from TCP ACKs) we do not need to send
probes. Looks like the goal "always stay present in remote
ARP caches" is not listed as our goal. Even "always update
remote ARP cache" is not implemented, no outgoing traffic =>
no ARP probes.

> Maybe it is enough to ignore NUD_STALE?

	But you in this case rely on traffic to enter
NUD_DELAY state. Note that looking at neigh_timer_handler
NUD_DELAY state is not guaranteed: if there is no
recent outgoing traffic the NUD_REACHABLE state can be changed
to NUD_STALE, not to NUD_DELAY, so no chance for probes
that will keep the entry refreshed forever.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
@ 2016-07-23 14:09           ` Julian Anastasov
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2016-07-23 14:09 UTC (permalink / raw)
  To: Chunhui He
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


	Hello,

On Sat, 23 Jul 2016, Chunhui He wrote:

> On Sat, 23 Jul 2016 09:17:59 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
> > 
> > 	What kind of problem is this? Remote host wants to
> > see a recent probe from us, otherwise it refuses to resolve
> > our address before its traffic to us and it is not sent?
> > Can you explain this in more detail because after looking
> > again I have some doubts what actually happens, see below.
> >
> 
> The remote host is configured to refuse to send any packets to a host it doesn't
> "know" (but broadcast is allowed), and it can only "learn" from ARP packets.

	Can it learn from our unicast ARP replies that we
should sent in response to its broadcast probes? Or it
expects only ARP requests?

> When I send packets, if broadcast ARP requests from the remote host are received
> and set the state to NUD_STALE, then I stuck.

	So, this is a special case. Is it possible to
solve it from user space?:

1.1. echo 0 > delay_first_probe_time. This can help if
remote hosts sends broadcast ARP probes every second and
if we send IP packets too.

1.2. reduce base_reachable_time if needed to send ARP probes
more often

2. Send ARP probe by using the arping tool, eg. from cron

	Note that solution 1 is not good. If we do not
have traffic to send there will be no ARP probe and the
remote host can not send to us.

> > 	To summarize: currently the change to NUD_STALE serves the
> > purpose to avoid/delay our hwaddr refreshing probes. They are
> > avoided if protocols indicate progress with the current hwaddr.
> > Outgoing IP traffic that does not trigger confirmation
> > from replies (for example TCP ACK calling dst_confirm) or
> > from applications (MSG_CONFIRM) surely will cause a
> > switch to NUD_PROBE.
> >
> 
> Yes, I agree.
> But now it is possible to delay the probes *forever*, and at the same time we
> get no positive response from the remote host.

	What happens if we do not send traffic and the
neigh entry is removed? How the remote host will learn
our address? If remote host sends ARP broadcasts even
arp_accept=1 will create NUD_STALE entry and without any
traffic we can stay in this state, no chance for NUD_DELAY.

> > 	So, the question is, to avoid probes or to refresh
> > frequently? Is there a good reason to ignore this NUD_STALE
> > event in NUD_DELAY | NUD_PROBE state?
> >
> 
> So reaching a NUD_REACHABLE state in not our goal. It's to ensure correctness.
> Cycle between NUD_STALE and NUD_DELAY is not correct.

	The main goal looks to be the reduced ARP traffic. If
we learned the neigh address recently (even if from remote ARP
broadcast probes or from TCP ACKs) we do not need to send
probes. Looks like the goal "always stay present in remote
ARP caches" is not listed as our goal. Even "always update
remote ARP cache" is not implemented, no outgoing traffic =>
no ARP probes.

> Maybe it is enough to ignore NUD_STALE?

	But you in this case rely on traffic to enter
NUD_DELAY state. Note that looking at neigh_timer_handler
NUD_DELAY state is not guaranteed: if there is no
recent outgoing traffic the NUD_REACHABLE state can be changed
to NUD_STALE, not to NUD_DELAY, so no chance for probes
that will keep the entry refreshed forever.

Regards

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-23 14:09           ` Julian Anastasov
  (?)
@ 2016-07-23 16:52           ` Chunhui He
  2016-07-23 19:09             ` Julian Anastasov
  -1 siblings, 1 reply; 16+ messages in thread
From: Chunhui He @ 2016-07-23 16:52 UTC (permalink / raw)
  To: ja
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


Hello, Julian.

My case is special, so I think the detail(provided below, if you are intresting)
is not very important. *It only trigers the real problem*.

The neigh system is to reduce ARP traffic, that is good. The problem is it fails
to handle some coner cases.

The coner case is (let's forget my case above):
In NUD_DELAY, the neigh system is waiting for a proof of reachablity. If there
is no proof, the neigh system must prove by itself, so goes to NUD_PROBE and
sends request. But when some other part of kernel gives a non-proof by
neigh_update()(STALE is a *hint*, not a proof of reachablity), the neigh system
will leave NUD_DELAY, and will *"forget"* to prove by itself. So it's possiable
to send traffic to a non-reachable address. That's definitely wrong, even it
"saves" traffic.

And the fix is to disallow NUD_DELAY -> NUD_STALE.

Regrads,
Chunhui

On Sat, 23 Jul 2016 17:09:12 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
>> 
>> The remote host is configured to refuse to send any packets to a host it doesn't
>> "know" (but broadcast is allowed), and it can only "learn" from ARP packets.
> 
> 	Can it learn from our unicast ARP replies that we
> should sent in response to its broadcast probes? Or it
> expects only ARP requests?

All the broadcast probes I have seen are not "who has <our ip>". they are about
other hosts, so we are not expected to answer.
So I'm not sure if it can learn from ARP reply.

> 
>> When I send packets, if broadcast ARP requests from the remote host are received
>> and set the state to NUD_STALE, then I stuck.
> 
> 	So, this is a special case. Is it possible to
> solve it from user space?:
> 
> 1.1. echo 0 > delay_first_probe_time. This can help if
> remote hosts sends broadcast ARP probes every second and
> if we send IP packets too.
> 
> 1.2. reduce base_reachable_time if needed to send ARP probes
> more often
>
> 2. Send ARP probe by using the arping tool, eg. from cron
>

Solution 2 works. But I think it is a workaround.

> 	What happens if we do not send traffic and the
> neigh entry is removed? How the remote host will learn
> our address? If remote host sends ARP broadcasts even
> arp_accept=1 will create NUD_STALE entry and without any
> traffic we can stay in this state, no chance for NUD_DELAY.
>

The remote host is a gateway, traffic initiated from outside is forbidden.
So we always initiate traffic.
If we don't send traffic and arp_accept=0, no entry is created.

The entry is created when we send traffic.
Normally the state is set to NUD_STALE immediately, then we enter
the "NUD_STALE -> NUD_DELAY -> NUD_STALE" loop.

> 
> 	The main goal looks to be the reduced ARP traffic. If
> we learned the neigh address recently (even if from remote ARP
> broadcast probes or from TCP ACKs) we do not need to send
> probes. Looks like the goal "always stay present in remote
> ARP caches" is not listed as our goal. Even "always update
> remote ARP cache" is not implemented, no outgoing traffic =>
> no ARP probes.
>

Please see the top.

> 	But you in this case rely on traffic to enter
> NUD_DELAY state. Note that looking at neigh_timer_handler
> NUD_DELAY state is not guaranteed: if there is no
> recent outgoing traffic the NUD_REACHABLE state can be changed
> to NUD_STALE, not to NUD_DELAY, so no chance for probes
> that will keep the entry refreshed forever.
> 

No. When I send traffic, the entry will enter NUD_DELAY agagin.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-23 16:52           ` Chunhui He
@ 2016-07-23 19:09             ` Julian Anastasov
  2016-07-25  7:52               ` Chunhui He
  0 siblings, 1 reply; 16+ messages in thread
From: Julian Anastasov @ 2016-07-23 19:09 UTC (permalink / raw)
  To: Chunhui He
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


	Hello,

On Sat, 23 Jul 2016, Chunhui He wrote:

> The neigh system is to reduce ARP traffic, that is good. The problem is it fails
> to handle some coner cases.
> 
> The coner case is (let's forget my case above):
> In NUD_DELAY, the neigh system is waiting for a proof of reachablity. If there
> is no proof, the neigh system must prove by itself, so goes to NUD_PROBE and
> sends request. But when some other part of kernel gives a non-proof by
> neigh_update()(STALE is a *hint*, not a proof of reachablity), the neigh system
> will leave NUD_DELAY, and will *"forget"* to prove by itself. So it's possiable
> to send traffic to a non-reachable address. That's definitely wrong, even it
> "saves" traffic.
> 
> And the fix is to disallow NUD_DELAY -> NUD_STALE.

	But NUD_STALE event happens only for received
packet, for the concerned remote IP address, for same or
different hwaddr, for any kind of tip (target IP). Examples:

- Received ARP request who-has LOCAL_IP tell NEIGH_IP:
	neigh_event_ns is called for the RTN_LOCAL case,
	for sip/sha. Reply is sent.

- Received ARP request who-has UNICAST_IP tell NEIGH_IP:
	neigh_event_ns is called for the IN_DEV_FORWARD case,
	for sip/sha, i.e. if we use proxy_arp. Deferred
	or immediate reply is sent.

- Received ARP request who-has UNICAST_IP tell NEIGH_IP:
	neigh_update is called for existing entry when
	proxy_arp=0, i.e. request not catched by above case.
	No reply is sent.

- Received Gratuitous ARP request who-has NEIGH_IP tell NEIGH_IP:
	neigh_update is called for broadcast request when
	arp_accept=1 or when arp_accept=0 while cache entry exists.
	No reply is sent.

- Received Gratuitous ARP reply NEIGH_IP is-at hwaddr:
	neigh_update is called for the received broadcast reply

	This was all for NUD_STALE. There is only one
ARP case where NUD_REACHABLE is set, usually in response
to our request:

- Received unicast reply NEIGH_IP is-at hwaddr

- the second non-ARP case for NUD_REACHABLE is from dst_confirm

> > 	Can it learn from our unicast ARP replies that we
> > should sent in response to its broadcast probes? Or it
> > expects only ARP requests?
> 
> All the broadcast probes I have seen are not "who has <our ip>". they are about
> other hosts, so we are not expected to answer.

	May be that is the problem: we receive such packet,
ip_route_input_noref detects that we allow such packet
from NEIGH_IP on this interface, tip is not RTN_LOCAL (no
ARP reply from us), tip is RTN_UNICAST but proxy_arp is not
allowed, so we continue and reach __neigh_lookup which finds
the existing cache entry because we talked to GW before that.
As this is an ARP request, neigh_update is called with NUD_STALE.
No reply is sent because request was not for us but we
just learned that NEIGH_IP is alive because it lookups
for someone else. This is common to observe with broadcasts,
GW lookups for other hosts and has to expose its IP+hwaddr.
More difficult to happen with unicast packets, you need hub,
not switch, to detect such packets.

	It is possible that you miss the packet that tries
to set NUD_STALE. May be you can add some printk's to catch
what kind of packet causes this. This can help too:

tcpdump -lnnn -s0 arp and host GW_IP

	If you see such packet, that is it. Our cache is
updated with NUD_STALE.

> So I'm not sure if it can learn from ARP reply.

	See above, received broadcast GARP reply can set
NUD_STALE. But the most trivial case of GW exposing its
IP while looking for other hosts should be the culprit.
It probably happens often, that is why we have no chance
to send ARP requests, GW is more ARP-active than us and
updates our cache and we are happy.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-22 12:47   ` Chunhui He
  2016-07-23  6:17     ` Julian Anastasov
@ 2016-07-25  5:20     ` YOSHIFUJI Hideaki/吉藤英明
  2016-07-25  8:13       ` Chunhui He
  1 sibling, 1 reply; 16+ messages in thread
From: YOSHIFUJI Hideaki/吉藤英明 @ 2016-07-25  5:20 UTC (permalink / raw)
  To: Chunhui He, ja
  Cc: hideaki.yoshifuji, davem, dsa, nicolas.dichtel, roopa, rshearma,
	dbarroso, martinbj2008, rick.jones2, koct9i, edumazet, tgraf,
	ebiederm, yoshfuji, netdev, linux-kernel

Hi,

Chunhui He wrote:
> Hi,
> 
> On Fri, 22 Jul 2016 10:20:01 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
>>
>> 	Hello,
>>
>> On Thu, 21 Jul 2016, Chunhui He wrote:
>>
>>> If neigh entry was CONNECTED and address is not changed, and if new state is
>>> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
>>> possible to change state from DELAY to STALE.
>>>
>>> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
>>> is referenced to send packets, so goes to DELAY state. If the entry is not
>>> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
>>> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
>>> But the entry state may be reseted to STALE by broadcast ARP packets, before
>>> the entry goes to PROBE state. So it's possible that the entry will never go
>>> to REACHABLE state, without external confirmation.
>>>
>>> In my case, the gateway refuses to send unicast packets to me, before it sees
>>> my ARP request. So it's critical to enter REACHABLE state by sending ARP
>>> request, but not by external confirmation.
>>>
>>> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
>>> DELAY.
>>>
>>> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
>>> ---
>>>  net/core/neighbour.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index 510cd62..29429eb 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>>  		} else {
>>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>>>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>>> -			     (old & NUD_CONNECTED))
>>> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>>>  			    )
>>>  				new = old;
>>>  		}
>>
>> 	You change looks correct to me. But this place
>> has more problems. There is no good reason to set NUD_STALE
>> for any state that is NUD_VALID if address is not changed.
>> This matches perfectly the comment above this code:
>> NUD_STALE should change a NUD_VALID state only when
>> address changes. It also means that IPv6 does not need
>> to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
>> NEIGH_UPDATE_F_OVERRIDE is also present.
>>
> 
> The NEIGH_UPDATE_F_WEAK_OVERRIDE is confusing to me, so I choose not to deal
> with the flag.

IPv6 depends on WEAK_OVERRIDE.  Please do not change.

> 
>> 	By this way the state machine can continue with
>> the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
>> NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
>> while the address is not changed. Your change covers only
>> NUD_DELAY, not NUD_PROBE, so it is better to allow more
>> retries to send. We should not give up until success (NUD_REACHABLE).
>>
> 
> I have thought about this.
> The origin code allows NUD_DELAY -> NUD_STALE and NUD_PROBE -> NUD_STALE.
> This part was imported to kernel since v2.1.79, I don't know clearly why it
> allows that.
> 
> My analysis:
> (1) As shown in my previous mail, NUD_DELAY -> NUD_STALE may cause "dead loop",
> so it should be fixed.
> 
> (2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP request
> has been sent, it is sufficient to break the "dead loop".
> More attempts are accomplished by the following sequence:
> NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->
> NUD_STALE --> NUD_DELAY -(send req again)-> ... -->
> NUD_REACHABLE
> 
> 
> But I also agree your change.
> 
>> 	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
>> priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
>> change from NUD_PERMANENT to NUD_STALE:
>>
>> # ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
>> # ip neigh show to 192.168.168.111
>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>> # ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
>> # ip neigh show to 192.168.168.111
>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>>
>> 	IMHO, here is how this place should look:
>>
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 5cdc62a..2b1cb91 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>  				goto out;
>>  		} else {
>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>> -			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>> -			     (old & NUD_CONNECTED))
>> -			    )
>> -				new = old;
>> +			    !(flags & NEIGH_UPDATE_F_ADMIN))
>> +				goto out;
>>  		}
>>  	}
>>
>> 	Any thoughts?
>>  
>> Regards
>>
>> --
>> Julian Anastasov <ja@ssi.bg>
> 
> Regards,
> Chunhui He
> 

-- 
吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
ミラクル・リナックス株式会社 技術本部 サポート部

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-23 19:09             ` Julian Anastasov
@ 2016-07-25  7:52               ` Chunhui He
  0 siblings, 0 replies; 16+ messages in thread
From: Chunhui He @ 2016-07-25  7:52 UTC (permalink / raw)
  To: ja
  Cc: davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel


Hello,

On Sat, 23 Jul 2016 22:09:43 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
> 
> 	May be that is the problem: we receive such packet,
> ip_route_input_noref detects that we allow such packet
> from NEIGH_IP on this interface, tip is not RTN_LOCAL (no
> ARP reply from us), tip is RTN_UNICAST but proxy_arp is not
> allowed, so we continue and reach __neigh_lookup which finds
> the existing cache entry because we talked to GW before that.
> As this is an ARP request, neigh_update is called with NUD_STALE.
> No reply is sent because request was not for us but we
> just learned that NEIGH_IP is alive because it lookups
> for someone else. This is common to observe with broadcasts,
> GW lookups for other hosts and has to expose its IP+hwaddr.
> More difficult to happen with unicast packets, you need hub,
> not switch, to detect such packets.
> 
> 	It is possible that you miss the packet that tries
> to set NUD_STALE. May be you can add some printk's to catch
> what kind of packet causes this. This can help too:
> 
> tcpdump -lnnn -s0 arp and host GW_IP
> 
> 	If you see such packet, that is it. Our cache is
> updated with NUD_STALE.
> 
>> So I'm not sure if it can learn from ARP reply.
> 
> 	See above, received broadcast GARP reply can set
> NUD_STALE. But the most trivial case of GW exposing its
> IP while looking for other hosts should be the culprit.
> It probably happens often, that is why we have no chance
> to send ARP requests, GW is more ARP-active than us and
> updates our cache and we are happy.
>

Thank you for your analysis.
I think the same, except that I don't think GW can update
our cache via broadcast ARP.

Please the comment in arp_process():
>                int state = NUD_REACHABLE;
...
>                /* Broadcast replies and request packets
>                   do not assert neighbour reachability.
>                 */
>                if (arp->ar_op != htons(ARPOP_REPLY) ||
>                    skb->pkt_type != PACKET_HOST)
>                        state = NUD_STALE;
>                neigh_update(n, sha, state,
...

Broadcast packets do not assert reachability, so they
should not interference our state machine. They are
just a hint.

Regards,
Chunhui

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-25  5:20     ` YOSHIFUJI Hideaki/吉藤英明
@ 2016-07-25  8:13       ` Chunhui He
  2016-07-25 12:45         ` 吉藤英明
  0 siblings, 1 reply; 16+ messages in thread
From: Chunhui He @ 2016-07-25  8:13 UTC (permalink / raw)
  To: hideaki.yoshifuji
  Cc: ja, davem, dsa, nicolas.dichtel, roopa, rshearma, dbarroso,
	martinbj2008, rick.jones2, koct9i, edumazet, tgraf, ebiederm,
	yoshfuji, netdev, linux-kernel

Hi,
On Mon, 25 Jul 2016 14:20:29 +0900, YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> wrote:
> Hi,
> 
> Chunhui He wrote:
>> Hi,
>> 
>> On Fri, 22 Jul 2016 10:20:01 +0300 (EEST), Julian Anastasov <ja@ssi.bg> wrote:
>>>
>>> 	Hello,
>>>
>>> On Thu, 21 Jul 2016, Chunhui He wrote:
>>>
>>>> If neigh entry was CONNECTED and address is not changed, and if new state is
>>>> STALE, entry state will not change. Because DELAY is not in CONNECTED, it's
>>>> possible to change state from DELAY to STALE.
>>>>
>>>> That is bad. Consider a host in IPv4 nerwork, a neigh entry in STALE state
>>>> is referenced to send packets, so goes to DELAY state. If the entry is not
>>>> confirmed by upper layer, it goes to PROBE state, and sends ARP request.
>>>> The neigh host sends ARP reply, then the entry goes to REACHABLE state.
>>>> But the entry state may be reseted to STALE by broadcast ARP packets, before
>>>> the entry goes to PROBE state. So it's possible that the entry will never go
>>>> to REACHABLE state, without external confirmation.
>>>>
>>>> In my case, the gateway refuses to send unicast packets to me, before it sees
>>>> my ARP request. So it's critical to enter REACHABLE state by sending ARP
>>>> request, but not by external confirmation.
>>>>
>>>> This fixes neigh_update() not to change to STALE if old state is CONNECTED or
>>>> DELAY.
>>>>
>>>> Signed-off-by: Chunhui He <hchunhui@mail.ustc.edu.cn>
>>>> ---
>>>>  net/core/neighbour.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>>> index 510cd62..29429eb 100644
>>>> --- a/net/core/neighbour.c
>>>> +++ b/net/core/neighbour.c
>>>> @@ -1152,7 +1152,7 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>>>  		} else {
>>>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>>>>  			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>>>> -			     (old & NUD_CONNECTED))
>>>> +			     (old & (NUD_CONNECTED | NUD_DELAY)))
>>>>  			    )
>>>>  				new = old;
>>>>  		}
>>>
>>> 	You change looks correct to me. But this place
>>> has more problems. There is no good reason to set NUD_STALE
>>> for any state that is NUD_VALID if address is not changed.
>>> This matches perfectly the comment above this code:
>>> NUD_STALE should change a NUD_VALID state only when
>>> address changes. It also means that IPv6 does not need
>>> to provide NEIGH_UPDATE_F_WEAK_OVERRIDE anymore when
>>> NEIGH_UPDATE_F_OVERRIDE is also present.
>>>
>> 
>> The NEIGH_UPDATE_F_WEAK_OVERRIDE is confusing to me, so I choose not to deal
>> with the flag.
> 
> IPv6 depends on WEAK_OVERRIDE.  Please do not change.
>

It seems like IPv6 always sets WEAK_OVERRIDE.

As Julian said, maybe there is no good reason to set NUD_STALE for any state
that is NUD_VALID if address is not changed, even WEAK_OVERRIDE is not set.
So we may eliminate WEAK_OVERRIDE in that branch.

I think this change should not break IPv6.

>> 
>>> 	By this way the state machine can continue with
>>> the resolving: NUD_STALE -> NUD_DELAY (traffic) ->
>>> NUD_PROBE (retries) -> NUD_REACHABLE (unicast reply)
>>> while the address is not changed. Your change covers only
>>> NUD_DELAY, not NUD_PROBE, so it is better to allow more
>>> retries to send. We should not give up until success (NUD_REACHABLE).
>>>
>> 
>> I have thought about this.
>> The origin code allows NUD_DELAY -> NUD_STALE and NUD_PROBE -> NUD_STALE.
>> This part was imported to kernel since v2.1.79, I don't know clearly why it
>> allows that.
>> 
>> My analysis:
>> (1) As shown in my previous mail, NUD_DELAY -> NUD_STALE may cause "dead loop",
>> so it should be fixed.
>> 
>> (2) But NUD_PROBE -> NUD_STALE is acceptable, because in NUD_PROBE, ARP request
>> has been sent, it is sufficient to break the "dead loop".
>> More attempts are accomplished by the following sequence:
>> NUD_STALE --> NUD_DELAY -(sent req)-> NUD_PROBE -(reset by neigh_update())->
>> NUD_STALE --> NUD_DELAY -(send req again)-> ... -->
>> NUD_REACHABLE
>> 
>> 
>> But I also agree your change.
>> 
>>> 	Second problem: NEIGH_UPDATE_F_WEAK_OVERRIDE has no
>>> priority over NEIGH_UPDATE_F_ADMIN. For example, now I can not
>>> change from NUD_PERMANENT to NUD_STALE:
>>>
>>> # ip neigh add 192.168.168.111 lladdr 00:11:22:33:44:55 nud perm dev wlan0
>>> # ip neigh show to 192.168.168.111
>>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>>> # ip neigh change 192.168.168.111 lladdr 00:11:22:33:44:55 nud stale dev wlan0
>>> # ip neigh show to 192.168.168.111
>>> 192.168.168.111 dev wlan0 lladdr 00:11:22:33:44:55 PERMANENT
>>>
>>> 	IMHO, here is how this place should look:
>>>
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index 5cdc62a..2b1cb91 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -1151,10 +1151,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
>>>  				goto out;
>>>  		} else {
>>>  			if (lladdr == neigh->ha && new == NUD_STALE &&
>>> -			    ((flags & NEIGH_UPDATE_F_WEAK_OVERRIDE) ||
>>> -			     (old & NUD_CONNECTED))
>>> -			    )
>>> -				new = old;
>>> +			    !(flags & NEIGH_UPDATE_F_ADMIN))
>>> +				goto out;
>>>  		}
>>>  	}
>>>
>>> 	Any thoughts?
>>>  
>>> Regards
>>>
>>> --
>>> Julian Anastasov <ja@ssi.bg>
>> 
>> Regards,
>> Chunhui He
>> 
> 
> -- 
> 吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
> ミラクル・リナックス株式会社 技術本部 サポート部

Regards,
Chunhui

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-25  8:13       ` Chunhui He
@ 2016-07-25 12:45         ` 吉藤英明
  2016-07-25 18:57           ` Julian Anastasov
  0 siblings, 1 reply; 16+ messages in thread
From: 吉藤英明 @ 2016-07-25 12:45 UTC (permalink / raw)
  To: Chunhui He
  Cc: ja, David Miller, dsa, nicolas.dichtel, roopa, rshearma,
	dbarroso, martinbj2008, rick.jones2, koct9i, edumazet, tgraf,
	ebiederm, YOSHIFUJI Hideaki, network dev,
	Linux Kernel Mailing List

Hi,

2016-07-25 17:13 GMT+09:00 Chunhui He <hchunhui@mail.ustc.edu.cn>:
>>> The NEIGH_UPDATE_F_WEAK_OVERRIDE is confusing to me, so I choose not to deal
>>> with the flag.
>>
>> IPv6 depends on WEAK_OVERRIDE.  Please do not change.
>>
>
> It seems like IPv6 always sets WEAK_OVERRIDE.
>

Yes.

> As Julian said, maybe there is no good reason to set NUD_STALE for any state
> that is NUD_VALID if address is not changed, even WEAK_OVERRIDE is not set.
> So we may eliminate WEAK_OVERRIDE in that branch.
>
> I think this change should not break IPv6.

OK, following blocks are "no-op" and we will get same result.

Well, please do not try changing several things at the same time and
you could say:

if (ladder == neigh->ha && new == NUD_STALE &&
    !(flags & NUD_UPDATE_F_ADMIN))
        new = old;

I think I tried to maintain our traditional IPv4 behavior (e.g. as of
2.2 era), BTW...

--yoshfuji

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update()
  2016-07-25 12:45         ` 吉藤英明
@ 2016-07-25 18:57           ` Julian Anastasov
  0 siblings, 0 replies; 16+ messages in thread
From: Julian Anastasov @ 2016-07-25 18:57 UTC (permalink / raw)
  To: 吉藤英明
  Cc: Chunhui He, David Miller, dsa, nicolas.dichtel, roopa, rshearma,
	dbarroso, martinbj2008, rick.jones2, koct9i, edumazet, tgraf,
	ebiederm, YOSHIFUJI Hideaki, network dev,
	Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 741 bytes --]


	Hello,

On Mon, 25 Jul 2016, 吉藤英明 wrote:

> OK, following blocks are "no-op" and we will get same result.
> 
> Well, please do not try changing several things at the same time and
> you could say:
> 
> if (ladder == neigh->ha && new == NUD_STALE &&
>     !(flags & NUD_UPDATE_F_ADMIN))
>         new = old;

	OK, lets do it with 2 patches then.

	Chunhui He, can you modify your patch to delete the
both lines and explain that we prefer to resolve the
remote address, even while remote packets try to set NUD_STALE
state. If your patch is accepted, I'll post second patch that
adds the line with the ADMIN check. As result, the code will
look like the example from Yoshifuji Hideaki above.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-07-25 18:58 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-21 17:58 [PATCH] net: neigh: disallow state transition DELAY->STALE in neigh_update() Chunhui He
2016-07-22  7:20 ` Julian Anastasov
2016-07-22  7:20   ` Julian Anastasov
2016-07-22  9:41   ` Hannes Frederic Sowa
2016-07-22 12:47   ` Chunhui He
2016-07-23  6:17     ` Julian Anastasov
2016-07-23 11:24       ` Chunhui He
2016-07-23 14:09         ` Julian Anastasov
2016-07-23 14:09           ` Julian Anastasov
2016-07-23 16:52           ` Chunhui He
2016-07-23 19:09             ` Julian Anastasov
2016-07-25  7:52               ` Chunhui He
2016-07-25  5:20     ` YOSHIFUJI Hideaki/吉藤英明
2016-07-25  8:13       ` Chunhui He
2016-07-25 12:45         ` 吉藤英明
2016-07-25 18:57           ` Julian Anastasov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.