linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: vrf: remove redundant vrf neigh entry
@ 2019-03-22 14:10 linmiaohe
  2019-03-22 15:50 ` David Ahern
  0 siblings, 1 reply; 5+ messages in thread
From: linmiaohe @ 2019-03-22 14:10 UTC (permalink / raw)
  To: David Ahern, Shrijeet Mukherjee, David S. Miller, netdev, linux-kernel
  Cc: Mingfangsen

From: Miaohe Lin <linmiaohe@huawei.com>

When vrf->rth is created, it wouldn't change in his lifetime.And in
func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
as key because rth->rt_gateway is equal to 0. But I think we only need
one vrf neigh entry because we strip the ethernet header and reset the
dst_entry in vrf_process_v4_outbound.
So I set rth->rt_gateway to loopback addr(It's ok to set any other
valid ip address, just choose one.). And we would only create one vrf
neigh entry. This helps saving some memory and improving the hitting
rate of neigh lookup.
If there is something I misunderstand, it's very nice of you to
let me know. Thanks a lot.

Signed-off-by: linmiaohe <linmiaohe@huawei.com>
---
 drivers/net/vrf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 7c1430ed0244..2b0227fb8f53 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
 		return -ENOMEM;

 	rth->dst.output	= vrf_output;
+	rth->rt_gateway = htonl(INADDR_LOOPBACK);

 	rcu_assign_pointer(vrf->rth, rth);

-- 
2.16.2



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: vrf: remove redundant vrf neigh entry
  2019-03-22 14:10 [PATCH net] net: vrf: remove redundant vrf neigh entry linmiaohe
@ 2019-03-22 15:50 ` David Ahern
  2019-03-23  1:58   ` linmiaohe
  2019-04-11  3:39   ` linmiaohe
  0 siblings, 2 replies; 5+ messages in thread
From: David Ahern @ 2019-03-22 15:50 UTC (permalink / raw)
  To: linmiaohe, Shrijeet Mukherjee, David S. Miller, netdev, linux-kernel
  Cc: Mingfangsen

On 3/22/19 3:10 PM, linmiaohe wrote:
> From: Miaohe Lin <linmiaohe@huawei.com>
> 
> When vrf->rth is created, it wouldn't change in his lifetime.And in
> func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
> as key because rth->rt_gateway is equal to 0. But I think we only need
> one vrf neigh entry because we strip the ethernet header and reset the
> dst_entry in vrf_process_v4_outbound.
> So I set rth->rt_gateway to loopback addr(It's ok to set any other
> valid ip address, just choose one.). And we would only create one vrf
> neigh entry. This helps saving some memory and improving the hitting
> rate of neigh lookup.
> If there is something I misunderstand, it's very nice of you to
> let me know. Thanks a lot.
> 
> Signed-off-by: linmiaohe <linmiaohe@huawei.com>
> ---
>  drivers/net/vrf.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
> index 7c1430ed0244..2b0227fb8f53 100644
> --- a/drivers/net/vrf.c
> +++ b/drivers/net/vrf.c
> @@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
>  		return -ENOMEM;
> 
>  	rth->dst.output	= vrf_output;
> +	rth->rt_gateway = htonl(INADDR_LOOPBACK);
> 
>  	rcu_assign_pointer(vrf->rth, rth);
> 

Did you investigate how neighbor entries are getting created? The vrf
device has IFF_NOARP set, so neigh entries should not be created.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: vrf: remove redundant vrf neigh entry
  2019-03-22 15:50 ` David Ahern
@ 2019-03-23  1:58   ` linmiaohe
  2019-04-11  3:39   ` linmiaohe
  1 sibling, 0 replies; 5+ messages in thread
From: linmiaohe @ 2019-03-23  1:58 UTC (permalink / raw)
  To: David Ahern, Shrijeet Mukherjee, David S. Miller, netdev, linux-kernel
  Cc: Mingfangsen



On 2019/3/22 23:50, David Ahern wrote:
> On 3/22/19 3:10 PM, linmiaohe wrote:
>> From: Miaohe Lin <linmiaohe@huawei.com>
>>
>> When vrf->rth is created, it wouldn't change in his lifetime.And in
>> func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
>> as key because rth->rt_gateway is equal to 0. But I think we only need
>> one vrf neigh entry because we strip the ethernet header and reset the
>> dst_entry in vrf_process_v4_outbound.
>> So I set rth->rt_gateway to loopback addr(It's ok to set any other
>> valid ip address, just choose one.). And we would only create one vrf
>> neigh entry. This helps saving some memory and improving the hitting
>> rate of neigh lookup.
>> If there is something I misunderstand, it's very nice of you to
>> let me know. Thanks a lot.
>>
>> Signed-off-by: linmiaohe <linmiaohe@huawei.com>
>> ---
>>  drivers/net/vrf.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
>> index 7c1430ed0244..2b0227fb8f53 100644
>> --- a/drivers/net/vrf.c
>> +++ b/drivers/net/vrf.c
>> @@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
>>  		return -ENOMEM;
>>
>>  	rth->dst.output	= vrf_output;
>> +	rth->rt_gateway = htonl(INADDR_LOOPBACK);
>>
>>  	rcu_assign_pointer(vrf->rth, rth);
>>
> 
> Did you investigate how neighbor entries are getting created? The vrf
> device has IFF_NOARP set, so neigh entries should not be created.
> 
> .
> 

Thanks a lot. I searched for how IFF_NOARP works, but I haven't
investigated how neighbor entries are getting created. I will pay
more effort to study neighbor subsystem and thanks for pointing out
that.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: vrf: remove redundant vrf neigh entry
  2019-03-22 15:50 ` David Ahern
  2019-03-23  1:58   ` linmiaohe
@ 2019-04-11  3:39   ` linmiaohe
  2019-04-15 19:05     ` David Ahern
  1 sibling, 1 reply; 5+ messages in thread
From: linmiaohe @ 2019-04-11  3:39 UTC (permalink / raw)
  To: David Ahern, Shrijeet Mukherjee, David S. Miller, netdev, linux-kernel
  Cc: Mingfangsen


On 2019/3/22 23:50, David Ahern wrote:
> On 3/22/19 3:10 PM, linmiaohe wrote:
>> From: Miaohe Lin <linmiaohe@huawei.com>
>>
>> When vrf->rth is created, it wouldn't change in his lifetime.And in
>> func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
>> as key because rth->rt_gateway is equal to 0. But I think we only need
>> one vrf neigh entry because we strip the ethernet header and reset the
>> dst_entry in vrf_process_v4_outbound.
>> So I set rth->rt_gateway to loopback addr(It's ok to set any other
>> valid ip address, just choose one.). And we would only create one vrf
>> neigh entry. This helps saving some memory and improving the hitting
>> rate of neigh lookup.
>> If there is something I misunderstand, it's very nice of you to
>> let me know. Thanks a lot.
>>
>> Signed-off-by: linmiaohe <linmiaohe@huawei.com>
>> ---
>>  drivers/net/vrf.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
>> index 7c1430ed0244..2b0227fb8f53 100644
>> --- a/drivers/net/vrf.c
>> +++ b/drivers/net/vrf.c
>> @@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
>>  		return -ENOMEM;
>>
>>  	rth->dst.output	= vrf_output;
>> +	rth->rt_gateway = htonl(INADDR_LOOPBACK);
>>
>>  	rcu_assign_pointer(vrf->rth, rth);
>>
> 
> Did you investigate how neighbor entries are getting created? The vrf
> device has IFF_NOARP set, so neigh entries should not be created.
> 
> .
> 
Hi,David A.,I investigate how neighbor entries are getting created recently.
But I can't find where neigh entries is not created when vrf device has
IFF_NOARP set.
So I add some printk info,and I ping the different host, here is the output:

[root@localhost ~]# ip vrf exec vrf1 ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=1.78 ms
^C
--- 10.0.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.776/1.776/1.776/0.000 ms
[root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.2
PING 11.0.0.2 (11.0.0.2) 56(84) bytes of data.
64 bytes from 11.0.0.2: icmp_seq=1 ttl=64 time=1.59 ms
^C
--- 11.0.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.591/1.591/1.591/0.000 ms
[root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.3
PING 11.0.0.3 (11.0.0.3) 56(84) bytes of data.
^C
--- 11.0.0.3 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Apr 11 11:01:48 localhost kernel: [  337.311270] VRF: IFF_NOARP is set
Apr 11 11:01:48 localhost kernel: [  337.311279] VRF: nexthop = 200000a
Apr 11 11:01:48 localhost kernel: [  337.311284] VRF: neigh =           (null) after lookup
Apr 11 11:01:48 localhost kernel: [  337.311294] VRF: we create a neigh 000000001e8acd79
Apr 11 11:01:51 localhost kernel: [  340.026623] VRF: IFF_NOARP is set
Apr 11 11:01:51 localhost kernel: [  340.026627] VRF: nexthop = 200000b
Apr 11 11:01:51 localhost kernel: [  340.026631] VRF: neigh =           (null) after lookup
Apr 11 11:01:51 localhost kernel: [  340.026637] VRF: we create a neigh 00000000a0ad96da
Apr 11 11:01:56 localhost kernel: [  345.157529] VRF: IFF_NOARP is set
Apr 11 11:01:56 localhost kernel: [  345.157539] VRF: nexthop = 300000b
Apr 11 11:01:56 localhost kernel: [  345.157544] VRF: neigh =           (null) after lookup
Apr 11 11:01:56 localhost kernel: [  345.157556] VRF: we create a neigh 00000000a5167b56

And here is the printk code:

if (vrf_dev->flags & IFF_NOARP) {
    printk(KERN_ERR "VRF: IFF_NOARP is set\n");
    rth = rcu_dereference(vrf->rth);
    nexthop = (__force u32)rt_nexthop(rth, ip_hdr(skb)->daddr);
    printk(KERN_ERR "VRF: nexthop = %x\n", nexthop);
    neigh = __ipv4_neigh_lookup_noref(vrf_dev, nexthop);
    printk(KERN_ERR "VRF: neigh = %p after lookup\n", (void *)neigh);
    if (unlikely(!neigh)) {
        neigh = __neigh_create(&arp_tbl, &nexthop, vrf_dev, false);
        printk(KERN_ERR "VRF: we create a neigh %p\n", (void *)neigh);
    }
}

Could you please tell me if I was misunderstanding something again? It's very nice
of you if you can figure me out that. Thanks a lot.I am looking forward to your reply.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] net: vrf: remove redundant vrf neigh entry
  2019-04-11  3:39   ` linmiaohe
@ 2019-04-15 19:05     ` David Ahern
  0 siblings, 0 replies; 5+ messages in thread
From: David Ahern @ 2019-04-15 19:05 UTC (permalink / raw)
  To: linmiaohe, Shrijeet Mukherjee, David S. Miller, netdev, linux-kernel
  Cc: Mingfangsen

On 4/10/19 9:39 PM, linmiaohe wrote:
> 
> On 2019/3/22 23:50, David Ahern wrote:
>> On 3/22/19 3:10 PM, linmiaohe wrote:
>>> From: Miaohe Lin <linmiaohe@huawei.com>
>>>
>>> When vrf->rth is created, it wouldn't change in his lifetime.And in
>>> func vrf_finish_output, we always create a neigh with ip_hdr(skb)->daddr
>>> as key because rth->rt_gateway is equal to 0. But I think we only need
>>> one vrf neigh entry because we strip the ethernet header and reset the
>>> dst_entry in vrf_process_v4_outbound.
>>> So I set rth->rt_gateway to loopback addr(It's ok to set any other
>>> valid ip address, just choose one.). And we would only create one vrf
>>> neigh entry. This helps saving some memory and improving the hitting
>>> rate of neigh lookup.
>>> If there is something I misunderstand, it's very nice of you to
>>> let me know. Thanks a lot.
>>>
>>> Signed-off-by: linmiaohe <linmiaohe@huawei.com>
>>> ---
>>>  drivers/net/vrf.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
>>> index 7c1430ed0244..2b0227fb8f53 100644
>>> --- a/drivers/net/vrf.c
>>> +++ b/drivers/net/vrf.c
>>> @@ -738,6 +738,7 @@ static int vrf_rtable_create(struct net_device *dev)
>>>  		return -ENOMEM;
>>>
>>>  	rth->dst.output	= vrf_output;
>>> +	rth->rt_gateway = htonl(INADDR_LOOPBACK);
>>>
>>>  	rcu_assign_pointer(vrf->rth, rth);
>>>
>>
>> Did you investigate how neighbor entries are getting created? The vrf
>> device has IFF_NOARP set, so neigh entries should not be created.
>>
>> .
>>
> Hi,David A.,I investigate how neighbor entries are getting created recently.
> But I can't find where neigh entries is not created when vrf device has
> IFF_NOARP set.
> So I add some printk info,and I ping the different host, here is the output:
> 
> [root@localhost ~]# ip vrf exec vrf1 ping 10.0.0.2
> PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
> 64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=1.78 ms
> ^C
> --- 10.0.0.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.776/1.776/1.776/0.000 ms
> [root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.2
> PING 11.0.0.2 (11.0.0.2) 56(84) bytes of data.
> 64 bytes from 11.0.0.2: icmp_seq=1 ttl=64 time=1.59 ms
> ^C
> --- 11.0.0.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.591/1.591/1.591/0.000 ms
> [root@localhost ~]# ip vrf exec vrf1 ping 11.0.0.3
> PING 11.0.0.3 (11.0.0.3) 56(84) bytes of data.
> ^C
> --- 11.0.0.3 ping statistics ---
> 1 packets transmitted, 0 received, 100% packet loss, time 0ms
> 
> Apr 11 11:01:48 localhost kernel: [  337.311270] VRF: IFF_NOARP is set
> Apr 11 11:01:48 localhost kernel: [  337.311279] VRF: nexthop = 200000a
> Apr 11 11:01:48 localhost kernel: [  337.311284] VRF: neigh =           (null) after lookup
> Apr 11 11:01:48 localhost kernel: [  337.311294] VRF: we create a neigh 000000001e8acd79
> Apr 11 11:01:51 localhost kernel: [  340.026623] VRF: IFF_NOARP is set
> Apr 11 11:01:51 localhost kernel: [  340.026627] VRF: nexthop = 200000b
> Apr 11 11:01:51 localhost kernel: [  340.026631] VRF: neigh =           (null) after lookup
> Apr 11 11:01:51 localhost kernel: [  340.026637] VRF: we create a neigh 00000000a0ad96da
> Apr 11 11:01:56 localhost kernel: [  345.157529] VRF: IFF_NOARP is set
> Apr 11 11:01:56 localhost kernel: [  345.157539] VRF: nexthop = 300000b
> Apr 11 11:01:56 localhost kernel: [  345.157544] VRF: neigh =           (null) after lookup
> Apr 11 11:01:56 localhost kernel: [  345.157556] VRF: we create a neigh 00000000a5167b56
> 
> And here is the printk code:
> 
> if (vrf_dev->flags & IFF_NOARP) {
>     printk(KERN_ERR "VRF: IFF_NOARP is set\n");
>     rth = rcu_dereference(vrf->rth);
>     nexthop = (__force u32)rt_nexthop(rth, ip_hdr(skb)->daddr);
>     printk(KERN_ERR "VRF: nexthop = %x\n", nexthop);
>     neigh = __ipv4_neigh_lookup_noref(vrf_dev, nexthop);
>     printk(KERN_ERR "VRF: neigh = %p after lookup\n", (void *)neigh);
>     if (unlikely(!neigh)) {
>         neigh = __neigh_create(&arp_tbl, &nexthop, vrf_dev, false);
>         printk(KERN_ERR "VRF: we create a neigh %p\n", (void *)neigh);
>     }
> }
> 
> Could you please tell me if I was misunderstanding something again? It's very nice
> of you if you can figure me out that. Thanks a lot.I am looking forward to your reply.
> 

In the above statements you are passing vrf_dev to neigh_create, so of
course it is creating an entry against that device. That should not be
happening. When dst->dev is a vrf device it is for local traffic only
(locally originated to a local address) and there are no neighbor
entries for that. Otherwise, it is an enslaved device and the
neigh_create is always done with it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-04-15 19:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-22 14:10 [PATCH net] net: vrf: remove redundant vrf neigh entry linmiaohe
2019-03-22 15:50 ` David Ahern
2019-03-23  1:58   ` linmiaohe
2019-04-11  3:39   ` linmiaohe
2019-04-15 19:05     ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).