* 100% CPU load when generating traffic to destination network that nexthop is not reachable
@ 2017-08-15 16:30 Paweł Staszewski
  2017-08-15 16:57 ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-15 16:30 UTC (permalink / raw)
  To: Linux Kernel Network Developers

Hi


While running some tests I discovered that when traffic is sent by pktgen to a 
forwarding host, and the nexthop for the destination network on that 
router is not reachable, I get 100% CPU load on all cores, and perf top 
shows mostly:

     77.19%  [kernel]            [k] queued_spin_lock_slowpath
     10.20%  [kernel]            [k] acpi_processor_ffh_cstate_enter
      1.41%  [kernel]            [k] queued_write_lock_slowpath


Configuration of the forwarding host is below:

ip a

Receiving interface:

8: enp175s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
     link/ether 0c:c4:7a:d8:5d:1c brd ff:ff:ff:ff:ff:ff
     inet 10.0.0.1/30 scope global enp175s0f0
        valid_lft forever preferred_lft forever
     inet6 fe80::ec4:7aff:fed8:5d1c/64 scope link
        valid_lft forever preferred_lft forever

Transmitting VLANs (bound to enp175s0f1):
12: vlan1000@enp175s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 0c:c4:7a:d8:5d:1d brd ff:ff:ff:ff:ff:ff
     inet 10.10.0.1/30 scope global vlan1000
        valid_lft forever preferred_lft forever
     inet6 fe80::ec4:7aff:fed8:5d1d/64 scope link
        valid_lft forever preferred_lft forever
13: vlan1001@enp175s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 0c:c4:7a:d8:5d:1d brd ff:ff:ff:ff:ff:ff
     inet 10.10.1.1/30 scope global vlan1001
        valid_lft forever preferred_lft forever
     inet6 fe80::ec4:7aff:fed8:5d1d/64 scope link
        valid_lft forever preferred_lft forever
14: vlan1002@enp175s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 0c:c4:7a:d8:5d:1d brd ff:ff:ff:ff:ff:ff
     inet 10.10.2.1/30 scope global vlan1002
        valid_lft forever preferred_lft forever
     inet6 fe80::ec4:7aff:fed8:5d1d/64 scope link
        valid_lft forever preferred_lft forever

Routing table:
10.0.0.0/30 dev enp175s0f0 proto kernel scope link src 10.0.0.1
10.10.0.0/30 dev vlan1000 proto kernel scope link src 10.10.0.1
10.10.1.0/30 dev vlan1001 proto kernel scope link src 10.10.1.1
10.10.2.0/30 dev vlan1002 proto kernel scope link src 10.10.2.1
172.16.0.0/24 via 10.10.0.2 dev vlan1000
172.16.1.0/24 via 10.10.1.2 dev vlan1001
172.16.2.0/24 via 10.10.2.2 dev vlan1002


pktgen transmits packets to this forwarding host, generating random 
destinations from the IP range:
     pg_set $dev "dst_min 172.16.0.1"
     pg_set $dev "dst_max 172.16.2.255"


So when packets for destination network 172.16.0.0/24 reach the 
forwarding host, they are routed via 10.10.0.2 dev vlan1000;
packets for destination network 172.16.1.0/24 are routed via 10.10.1.2 dev vlan1001;
and the last network, 172.16.2.0/24, is routed via 10.10.2.2 dev vlan1002.


Normally, when the neighbour entries look like this:

ip neigh ls dev vlan1000
10.10.0.2 lladdr ac:1f:6b:2c:18:89 REACHABLE
ip neigh ls dev vlan1001
10.10.1.2 lladdr ac:1f:6b:2c:18:89 REACHABLE
ip neigh ls dev vlan1002
10.10.2.2 lladdr ac:1f:6b:2c:18:89 REACHABLE


There is no problem: the router is receiving ~11 Mpps and forwarding it 
equally across the VLANs:
  bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
   input: /proc/net/dev type: rate
   -         iface                   Rx Tx                Total
==============================================================================
          vlan1002:            0.00 P/s       3877006.00 P/s 3877006.00 P/s
          vlan1001:            0.00 P/s       3877234.75 P/s 3877234.75 P/s
        enp175s0f0:     11962601.00 P/s             0.00 P/s 11962601.00 P/s
          vlan1000:            0.00 P/s       3862602.00 P/s 3862602.00 P/s
------------------------------------------------------------------------------
             total:     11962601.00 P/s      11616843.00 P/s 23579444.00 P/s



And perf top shows like this:
    PerfTop:  210522 irqs/sec  kernel:99.7%  exact:  0.0% [4000Hz cycles],  (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     26.98%  [kernel]       [k] do_raw_spin_lock
      7.69%  [kernel]       [k] acpi_processor_ffh_cstate_enter
      4.92%  [kernel]       [k] fib_table_lookup
      4.28%  [mlx5_core]    [k] mlx5e_xmit
      4.01%  [mlx5_core]    [k] mlx5e_handle_rx_cqe
      2.71%  [kernel]       [k] virt_to_head_page
      2.21%  [kernel]       [k] tasklet_action
      1.87%  [mlx5_core]    [k] mlx5_eq_int
      1.58%  [kernel]       [k] ipt_do_table
      1.55%  [mlx5_core]    [k] mlx5e_poll_tx_cq
      1.53%  [kernel]       [k] irq_entries_start
      1.48%  [kernel]       [k] __dev_queue_xmit
      1.44%  [kernel]       [k] __build_skb
      1.30%  [mlx5_core]    [k] eq_update_ci
      1.20%  [kernel]       [k] read_tsc
      1.10%  [kernel]       [k] ip_finish_output2
      1.06%  [kernel]       [k] ip_rcv
      1.02%  [kernel]       [k] netif_skb_features
      1.01%  [mlx5_core]    [k] mlx5_cqwq_get_cqe
      0.95%  [kernel]       [k] __netif_receive_skb_core



But when I disable one of the VLANs on the switch - for example, doing 
this for vlan1002 (the forwarding host is connected through a switch 
that carries the VLANs towards the sink host):
root@cumulus:~# ip link set down dev vlan1002.49
root@cumulus:~# ip link set down dev vlan1002.3
root@cumulus:~# ip link set down dev brtest1002

Then I wait for the FDB entry to expire on the switch.

Now there is an incomplete ARP entry on interface vlan1002:
ip neigh ls dev vlan1002
10.10.2.2  INCOMPLETE


pktgen is still pushing traffic destined for the 172.16.2.0/24 network,




and we get 100% CPU load with the pps rates below:
   bwm-ng v0.6.1 (probing every 0.500s), press 'h' for help
   input: /proc/net/dev type: rate
   |         iface                   Rx Tx                Total
==============================================================================
          vlan1002:            0.00 P/s             1.99 P/s          1.99 P/s
          vlan1001:            0.00 P/s        717227.12 P/s 717227.12 P/s
        enp175s0f0:      2713679.25 P/s             0.00 P/s 2713679.25 P/s
          vlan1000:            0.00 P/s        716145.44 P/s 716145.44 P/s
------------------------------------------------------------------------------
             total:      2713679.25 P/s       1433374.50 P/s 4147054.00 P/s


with perf top:



    PerfTop:  218506 irqs/sec  kernel:99.7%  exact:  0.0% [4000Hz cycles],  (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     91.45%  [kernel]            [k] queued_spin_lock_slowpath
      1.71%  [kernel]            [k] queued_write_lock_slowpath
      0.46%  [kernel]            [k] ip_finish_output2
      0.44%  [mlx5_core]         [k] mlx5e_handle_rx_cqe
      0.43%  [kernel]            [k] fib_table_lookup
      0.40%  [kernel]            [k] do_raw_spin_lock
      0.35%  [kernel]            [k] __neigh_event_send
      0.33%  [kernel]            [k] dst_release
      0.26%  [kernel]            [k] queued_write_lock
      0.22%  [mlx5_core]         [k] mlx5_cqwq_get_cqe
      0.22%  [mlx5_core]         [k] mlx5e_xmit
      0.19%  [kernel]            [k] virt_to_head_page
      0.18%  [kernel]            [k] page_frag_free
[...]


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 16:30 100% CPU load when generating traffic to destination network that nexthop is not reachable Paweł Staszewski
@ 2017-08-15 16:57 ` Eric Dumazet
  2017-08-15 17:42   ` Paweł Staszewski
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2017-08-15 16:57 UTC (permalink / raw)
  To: Paweł Staszewski; +Cc: Linux Kernel Network Developers

On Tue, 2017-08-15 at 18:30 +0200, Paweł Staszewski wrote:
> Hi
> 
> 
> Doing some tests i discovered that when traffic is send by pktgen to 
> forwarding host where nexthop for destination network on forwarding 
> router is not reachable i have 100% cpu on all cores and perf top show 
> mostly:
> 
>      77.19%  [kernel]            [k] queued_spin_lock_slowpath
>      10.20%  [kernel]            [k] acpi_processor_ffh_cstate_enter
>       1.41%  [kernel]            [k] queued_write_lock_slowpath
> 

To help us come to the rescue, please run:

perf record -a -g sleep 10

perf report --stdio


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 16:57 ` Eric Dumazet
@ 2017-08-15 17:42   ` Paweł Staszewski
  2017-08-15 19:11     ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-15 17:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers


# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 2M of event 'cycles'
# Event count (approx.): 1585571545969
#
# Children      Self  Command         Shared Object         Symbol
# ........  ........  ..............  ....................  ..............................................
#
      1.82%     0.00%  ksoftirqd/43    [kernel.vmlinux]      [k] __softirqentry_text_start
              --1.82%--__softirqentry_text_start
                 --1.82%--net_rx_action
                    --1.82%--mlx5e_napi_poll
                       --1.81%--mlx5e_poll_rx_cq
                          --1.81%--mlx5e_handle_rx_cqe
                             --1.79%--napi_gro_receive
                                --1.78%--netif_receive_skb_internal
                                   --1.78%--__netif_receive_skb
                                      --1.78%--__netif_receive_skb_core
                                         --1.78%--ip_rcv
                                            --1.78%--ip_rcv_finish
                                               --1.76%--ip_forward
                                                  --1.76%--ip_forward_finish
                                                     --1.76%--ip_output
                                                        --1.76%--ip_finish_output
                                                           --1.76%--ip_finish_output2
                                                              --1.73%--neigh_resolve_output
                                                                 --1.73%--neigh_event_send
                                                                    --1.73%--__neigh_event_send
                                                                       --1.70%--_raw_write_lock_bh
                                                                                queued_write_lock
                                                                                queued_write_lock_slowpath
                                                                                  --1.67%--queued_spin_lock_slowpath



On 2017-08-15 at 18:57, Eric Dumazet wrote:
> On Tue, 2017-08-15 at 18:30 +0200, Paweł Staszewski wrote:
>> Hi
>>
>>
>> Doing some tests i discovered that when traffic is send by pktgen to
>> forwarding host where nexthop for destination network on forwarding
>> router is not reachable i have 100% cpu on all cores and perf top show
>> mostly:
>>
>>       77.19%  [kernel]            [k] queued_spin_lock_slowpath
>>       10.20%  [kernel]            [k] acpi_processor_ffh_cstate_enter
>>        1.41%  [kernel]            [k] queued_write_lock_slowpath
>>
> To the rescue (for us to help)
>
> perf record -a -g sleep 10
>
> perf report --stdio
>
>
>
>


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 17:42   ` Paweł Staszewski
@ 2017-08-15 19:11     ` Eric Dumazet
  2017-08-15 19:45       ` Julian Anastasov
  2017-08-15 20:53       ` Paweł Staszewski
  0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2017-08-15 19:11 UTC (permalink / raw)
  To: Paweł Staszewski; +Cc: Linux Kernel Network Developers

On Tue, 2017-08-15 at 19:42 +0200, Paweł Staszewski wrote:
> # To display the perf.data header info, please use 
> --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 2M of event 'cycles'
> # Event count (approx.): 1585571545969
> #
> # Children      Self  Command         Shared Object         Symbol
> # ........  ........  ..............  .................... 
> ..............................................
> #
>       1.82%     0.00%  ksoftirqd/43    [kernel.vmlinux]      [k] 
> __softirqentry_text_start
>              |
>               --1.82%--__softirqentry_text_start
>                         |
>                          --1.82%--net_rx_action
>                                    |
>                                     --1.82%--mlx5e_napi_poll
>                                               |
> --1.81%--mlx5e_poll_rx_cq
>                                                          |
> --1.81%--mlx5e_handle_rx_cqe
>                                                                     |
> --1.79%--napi_gro_receive
> |
> --1.78%--netif_receive_skb_internal
> |
> --1.78%--__netif_receive_skb
> |
> --1.78%--__netif_receive_skb_core
> |
> --1.78%--ip_rcv
> |
> --1.78%--ip_rcv_finish
> |
> --1.76%--ip_forward
> |
> --1.76%--ip_forward_finish
> |
> --1.76%--ip_output
> |
> --1.76%--ip_finish_output
> |
> --1.76%--ip_finish_output2
> |
> --1.73%--neigh_resolve_output
> |
> --1.73%--neigh_event_send
> |
> --1.73%--__neigh_event_send
> |
> --1.70%--_raw_write_lock_bh
> queued_write_lock
> queued_write_lock_slowpath
> |
> --1.67%--queued_spin_lock_slowpath
> 
> 

Please try this:
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
 
 int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 {
-	int rc;
 	bool immediate_probe = false;
+	int rc;
+
+	/* We _should_ test this under write_lock_bh(&neigh->lock),
+	 * but this is too costly.
+	 */
+	if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
+		return 0;
 
 	write_lock_bh(&neigh->lock);
 
 	rc = 0;
-	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
-		goto out_unlock_bh;
 	if (neigh->dead)
 		goto out_dead;
 


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 19:11     ` Eric Dumazet
@ 2017-08-15 19:45       ` Julian Anastasov
  2017-08-15 21:06         ` Eric Dumazet
  2017-08-15 20:53       ` Paweł Staszewski
  1 sibling, 1 reply; 13+ messages in thread
From: Julian Anastasov @ 2017-08-15 19:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paweł Staszewski, Linux Kernel Network Developers


	Hello,

On Tue, 15 Aug 2017, Eric Dumazet wrote:

> Please try this :
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
>  
>  int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>  {
> -	int rc;
>  	bool immediate_probe = false;
> +	int rc;
> +
> +	/* We _should_ test this under write_lock_bh(&neigh->lock),
> +	 * but this is too costly.
> +	 */
> +	if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> +		return 0;

	The same fast check is already done in the only caller,
neigh_event_send. Now we risk entering the
'if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {' block...

>  	write_lock_bh(&neigh->lock);
>  
>  	rc = 0;
> -	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> -		goto out_unlock_bh;
>  	if (neigh->dead)
>  		goto out_dead;
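
	For reference, neigh_event_send() in include/net/neighbour.h
currently reads roughly as follows (the same inline also shows up as
context in the patches later in this thread), so the proposed
READ_ONCE() test repeats, now without the lock, a check the caller has
just made:

static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
{
	unsigned long now = jiffies;

	/* touch the entry, then take the slow path only when the
	 * neighbour is not in CONNECTED/DELAY/PROBE state
	 */
	if (neigh->used != now)
		neigh->used = now;
	if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
		return __neigh_event_send(neigh, skb);
	return 0;
}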

Regards


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 19:11     ` Eric Dumazet
  2017-08-15 19:45       ` Julian Anastasov
@ 2017-08-15 20:53       ` Paweł Staszewski
  2017-08-15 22:00         ` Paweł Staszewski
  1 sibling, 1 reply; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-15 20:53 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers

Hi


Patch applied, but no change - still 100% CPU.

perf below:

      1.99%     0.00%  ksoftirqd/1      [kernel.vmlinux] [k] run_ksoftirqd
             ---run_ksoftirqd
                --1.99%--__softirqentry_text_start
                   --1.99%--net_rx_action
                      --1.98%--mlx5e_napi_poll
                         --1.98%--mlx5e_poll_rx_cq
                            --1.98%--mlx5e_handle_rx_cqe
                               --1.96%--napi_gro_receive
                                  --1.96%--netif_receive_skb_internal
                                     --1.96%--__netif_receive_skb
                                        --1.96%--__netif_receive_skb_core
                                           --1.95%--ip_rcv
                                              --1.95%--ip_rcv_finish
                                                 --1.94%--ip_forward
                                                    --1.94%--ip_forward_finish
                                                       --1.94%--ip_output
                                                          --1.94%--ip_finish_output
                                                             --1.94%--ip_finish_output2
                                                                --1.90%--neigh_resolve_output
                                                                   --1.90%--neigh_event_send
                                                                      --1.90%--__neigh_event_send
                                                                         --1.87%--_raw_write_lock_bh
                                                                                  queued_write_lock
                                                                                  queued_write_lock_slowpath
                                                                                    --1.84%--queued_spin_lock_slowpath


On 2017-08-15 at 21:11, Eric Dumazet wrote:
> On Tue, 2017-08-15 at 19:42 +0200, Paweł Staszewski wrote:
>> # To display the perf.data header info, please use
>> --header/--header-only options.
>> #
>> #
>> # Total Lost Samples: 0
>> #
>> # Samples: 2M of event 'cycles'
>> # Event count (approx.): 1585571545969
>> #
>> # Children      Self  Command         Shared Object         Symbol
>> # ........  ........  ..............  ....................
>> ..............................................
>> #
>>        1.82%     0.00%  ksoftirqd/43    [kernel.vmlinux]      [k]
>> __softirqentry_text_start
>>               |
>>                --1.82%--__softirqentry_text_start
>>                          |
>>                           --1.82%--net_rx_action
>>                                     |
>>                                      --1.82%--mlx5e_napi_poll
>>                                                |
>> --1.81%--mlx5e_poll_rx_cq
>>                                                           |
>> --1.81%--mlx5e_handle_rx_cqe
>>                                                                      |
>> --1.79%--napi_gro_receive
>> |
>> --1.78%--netif_receive_skb_internal
>> |
>> --1.78%--__netif_receive_skb
>> |
>> --1.78%--__netif_receive_skb_core
>> |
>> --1.78%--ip_rcv
>> |
>> --1.78%--ip_rcv_finish
>> |
>> --1.76%--ip_forward
>> |
>> --1.76%--ip_forward_finish
>> |
>> --1.76%--ip_output
>> |
>> --1.76%--ip_finish_output
>> |
>> --1.76%--ip_finish_output2
>> |
>> --1.73%--neigh_resolve_output
>> |
>> --1.73%--neigh_event_send
>> |
>> --1.73%--__neigh_event_send
>> |
>> --1.70%--_raw_write_lock_bh
>> queued_write_lock
>> queued_write_lock_slowpath
>> |
>> --1.67%--queued_spin_lock_slowpath
>>
>>
> Please try this :
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
>   
>   int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>   {
> -	int rc;
>   	bool immediate_probe = false;
> +	int rc;
> +
> +	/* We _should_ test this under write_lock_bh(&neigh->lock),
> +	 * but this is too costly.
> +	 */
> +	if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> +		return 0;
>   
>   	write_lock_bh(&neigh->lock);
>   
>   	rc = 0;
> -	if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> -		goto out_unlock_bh;
>   	if (neigh->dead)
>   		goto out_dead;
>   
>
>
>


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 19:45       ` Julian Anastasov
@ 2017-08-15 21:06         ` Eric Dumazet
  2017-08-15 21:49           ` Julian Anastasov
  2017-08-16  7:42           ` Julian Anastasov
  0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2017-08-15 21:06 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Paweł Staszewski, Linux Kernel Network Developers

On Tue, 2017-08-15 at 22:45 +0300, Julian Anastasov wrote:
> 	Hello,
> 
> On Tue, 15 Aug 2017, Eric Dumazet wrote:
> 
> > Please try this :
> > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > index 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 100644
> > --- a/net/core/neighbour.c
> > +++ b/net/core/neighbour.c
> > @@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
> >  
> >  int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
> >  {
> > -	int rc;
> >  	bool immediate_probe = false;
> > +	int rc;
> > +
> > +	/* We _should_ test this under write_lock_bh(&neigh->lock),
> > +	 * but this is too costly.
> > +	 */
> > +	if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> > +		return 0;
> 
> 	The same fast check is already done in the only caller,
> neigh_event_send. Now we risk to enter the
> 'if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {' block...


Right you are.

It must be possible to add a fast path without locks.

(say, if jiffies has not changed since the last state change)


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 21:06         ` Eric Dumazet
@ 2017-08-15 21:49           ` Julian Anastasov
  2017-08-15 22:11             ` Julian Anastasov
  2017-08-16  7:42           ` Julian Anastasov
  1 sibling, 1 reply; 13+ messages in thread
From: Julian Anastasov @ 2017-08-15 21:49 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paweł Staszewski, Linux Kernel Network Developers


	Hello,

On Tue, 15 Aug 2017, Eric Dumazet wrote:

> On Tue, 2017-08-15 at 22:45 +0300, Julian Anastasov wrote:
> > 	Hello,
> > 
> > On Tue, 15 Aug 2017, Eric Dumazet wrote:
> > 
> > > Please try this :
> > > diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> > > index 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 100644
> > > --- a/net/core/neighbour.c
> > > +++ b/net/core/neighbour.c
> > > @@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
> > >  
> > >  int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
> > >  {
> > > -	int rc;
> > >  	bool immediate_probe = false;
> > > +	int rc;
> > > +
> > > +	/* We _should_ test this under write_lock_bh(&neigh->lock),
> > > +	 * but this is too costly.
> > > +	 */
> > > +	if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> > > +		return 0;
> > 
> > 	The same fast check is already done in the only caller,
> > neigh_event_send. Now we risk to enter the
> > 'if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {' block...
> 
> 
> Right you are.
> 
> It must be possible to add a fast path without locks.
> 
> (say if jiffies has not changed before last state change)

	I thought about this; it is possible in
neigh_event_send:

        if (neigh->used != now)
                neigh->used = now;
	else if (neigh->nud_state == NUD_INCOMPLETE &&
		 neigh->arp_queue_len_bytes + skb->truesize >
		 NEIGH_VAR(neigh->parms, QUEUE_LEN_BYTES))
		return 1;

	But this is really in the fast path and I'm not sure it is
worth it. Maybe we can move it somehow into __neigh_event_send,
but as neigh->used is changed early we need a better idea of
how to reduce the arp_queue hit rate...

Regards


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 20:53       ` Paweł Staszewski
@ 2017-08-15 22:00         ` Paweł Staszewski
  0 siblings, 0 replies; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-15 22:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers

Also, after applying this patch, I am getting:

[ 3843.036247] NEIGH: BUG, double timer add, state is 1
[ 3843.036253] CPU: 24 PID: 0 Comm: swapper/24 Not tainted 
4.13.0-rc5-next-20170815 #2
[ 3843.036254] Call Trace:
[ 3843.036255]  <IRQ>
[ 3843.036263]  dump_stack+0x4d/0x63
[ 3843.036267]  neigh_add_timer+0x36/0x39
[ 3843.036269]  __neigh_event_send+0x89/0x1bd
[ 3843.036270]  neigh_event_send+0x2b/0x2d
[ 3843.036272]  neigh_resolve_output+0x18/0x122
[ 3843.036276]  ip_finish_output2+0x24b/0x28f
[ 3843.036277]  ip_finish_output+0x101/0x10d
[ 3843.036279]  ip_output+0x56/0xa7
[ 3843.036280]  ? ip_route_input_rcu+0x4dd/0x7d3
[ 3843.036282]  ip_forward_finish+0x53/0x58
[ 3843.036283]  ip_forward+0x2b8/0x309
[ 3843.036285]  ? ip_frag_mem+0x1e/0x1e
[ 3843.036286]  ip_rcv_finish+0x27c/0x287
[ 3843.036287]  ip_rcv+0x2b0/0x2fd
[ 3843.036290]  __netif_receive_skb_core+0x316/0x4ab
[ 3843.036292]  ? __netif_receive_skb_core+0x4a1/0x4ab
[ 3843.036293]  __netif_receive_skb+0x18/0x57
[ 3843.036294]  ? __netif_receive_skb+0x18/0x57
[ 3843.036296]  netif_receive_skb_internal+0x4b/0xa1
[ 3843.036297]  napi_gro_receive+0x75/0xcc
[ 3843.036315]  mlx5e_handle_rx_cqe+0x3d3/0x48f [mlx5_core]
[ 3843.036319]  ? tick_program_event+0x5d/0x64
[ 3843.036327]  mlx5e_poll_rx_cq+0x139/0x166 [mlx5_core]
[ 3843.036334]  mlx5e_napi_poll+0x87/0x26d [mlx5_core]
[ 3843.036336]  net_rx_action+0xd3/0x22d
[ 3843.036341]  __do_softirq+0xe4/0x23a
[ 3843.036346]  irq_exit+0x4d/0x5b
[ 3843.036347]  do_IRQ+0x96/0xae
[ 3843.036349]  common_interrupt+0x90/0x90
[ 3843.036350]  </IRQ>
[ 3843.036353] RIP: 0010:cpuidle_enter_state+0x134/0x189
[ 3843.036354] RSP: 0018:ffffc900033cfea0 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffb6
[ 3843.036356] RAX: 0000037ec6cf262f RBX: 0000000000000001 RCX: 
000000000000001f
[ 3843.036356] RDX: 0000000000000000 RSI: 0000000000000018 RDI: 
0000000000000000
[ 3843.036357] RBP: ffffc900033cfed0 R08: 000000000000000c R09: 
00000000000001a5
[ 3843.036358] R10: ffffc900033cfe70 R11: ffff88087f898c50 R12: 
ffff88046c3d9a00
[ 3843.036359] R13: 0000037ec6cf262f R14: 0000000000000001 R15: 
0000037ec6ce8825
[ 3843.036361]  cpuidle_enter+0x12/0x14
[ 3843.036363]  do_idle+0x113/0x16b
[ 3843.036364]  cpu_startup_entry+0x1a/0x1c
[ 3843.036367]  start_secondary+0xd0/0xd3
[ 3843.036370]  secondary_startup_64+0xa5/0xa5
[ 3843.037807] NEIGH: BUG, double timer add, state is 1
[ 3843.037811] CPU: 9 PID: 0 Comm: swapper/9 Not tainted 
4.13.0-rc5-next-20170815 #2
[ 3843.037812] Call Trace:
[ 3843.037813]  <IRQ>
[ 3843.037819]  dump_stack+0x4d/0x63
[ 3843.037822]  neigh_add_timer+0x36/0x39
[ 3843.037824]  __neigh_event_send+0x89/0x1bd
[ 3843.037825]  neigh_event_send+0x2b/0x2d
[ 3843.037826]  neigh_resolve_output+0x18/0x122
[ 3843.037829]  ip_finish_output2+0x24b/0x28f
[ 3843.037831]  ip_finish_output+0x101/0x10d
[ 3843.037832]  ip_output+0x56/0xa7
[ 3843.037834]  ? ip_route_input_rcu+0x4dd/0x7d3
[ 3843.037836]  ip_forward_finish+0x53/0x58
[ 3843.037837]  ip_forward+0x2b8/0x309
[ 3843.037838]  ? ip_frag_mem+0x1e/0x1e
[ 3843.037840]  ip_rcv_finish+0x27c/0x287
[ 3843.037841]  ip_rcv+0x2b0/0x2fd
[ 3843.037843]  __netif_receive_skb_core+0x316/0x4ab
[ 3843.037845]  __netif_receive_skb+0x18/0x57
[ 3843.037846]  ? __netif_receive_skb+0x18/0x57
[ 3843.037848]  netif_receive_skb_internal+0x4b/0xa1
[ 3843.037849]  napi_gro_receive+0x75/0xcc
[ 3843.037863]  mlx5e_handle_rx_cqe+0x3d3/0x48f [mlx5_core]
[ 3843.037871]  mlx5e_poll_rx_cq+0x139/0x166 [mlx5_core]
[ 3843.037879]  mlx5e_napi_poll+0x87/0x26d [mlx5_core]
[ 3843.037881]  net_rx_action+0xd3/0x22d
[ 3843.037885]  __do_softirq+0xe4/0x23a
[ 3843.037889]  irq_exit+0x4d/0x5b
[ 3843.037890]  do_IRQ+0x96/0xae
[ 3843.037892]  common_interrupt+0x90/0x90
[ 3843.037893]  </IRQ>
[ 3843.037895] RIP: 0010:cpuidle_enter_state+0x134/0x189
[ 3843.037896] RSP: 0018:ffffc90003357ea0 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffff98
[ 3843.037898] RAX: 0000037ec6cffd61 RBX: 0000000000000002 RCX: 
000000000000001f
[ 3843.037899] RDX: 0000000000000000 RSI: 0000000000000009 RDI: 
0000000000000000
[ 3843.037899] RBP: ffffc90003357ed0 R08: 00000000ffffffff R09: 
0000000000000003
[ 3843.037900] R10: ffffc90003357e70 R11: ffff88046fc58c50 R12: 
ffff88046c3b0400
[ 3843.037901] R13: 0000037ec6cffd61 R14: 0000000000000002 R15: 
0000037ec6cf68de
[ 3843.037903]  cpuidle_enter+0x12/0x14
[ 3843.037905]  do_idle+0x113/0x16b
[ 3843.037907]  cpu_startup_entry+0x1a/0x1c
[ 3843.037909]  start_secondary+0xd0/0xd3
[ 3843.037912]  secondary_startup_64+0xa5/0xa5
[ 3855.324247] NEIGH: BUG, double timer add, state is 1
[ 3855.324251] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 
4.13.0-rc5-next-20170815 #2
[ 3855.324252] Call Trace:
[ 3855.324254]  <IRQ>
[ 3855.324259]  dump_stack+0x4d/0x63
[ 3855.324262]  neigh_add_timer+0x36/0x39
[ 3855.324264]  __neigh_event_send+0x89/0x1bd
[ 3855.324265]  neigh_event_send+0x2b/0x2d
[ 3855.324267]  neigh_resolve_output+0x18/0x122
[ 3855.324269]  ip_finish_output2+0x24b/0x28f
[ 3855.324271]  ip_finish_output+0x101/0x10d
[ 3855.324272]  ip_output+0x56/0xa7
[ 3855.324274]  ? ip_route_input_rcu+0x4dd/0x7d3
[ 3855.324276]  ip_forward_finish+0x53/0x58
[ 3855.324277]  ip_forward+0x2b8/0x309
[ 3855.324279]  ? ip_frag_mem+0x1e/0x1e
[ 3855.324280]  ip_rcv_finish+0x27c/0x287
[ 3855.324281]  ip_rcv+0x2b0/0x2fd
[ 3855.324284]  __netif_receive_skb_core+0x316/0x4ab
[ 3855.324285]  __netif_receive_skb+0x18/0x57
[ 3855.324286]  ? __netif_receive_skb+0x18/0x57
[ 3855.324288]  netif_receive_skb_internal+0x4b/0xa1
[ 3855.324289]  napi_gro_receive+0x75/0xcc
[ 3855.324304]  mlx5e_handle_rx_cqe+0x3d3/0x48f [mlx5_core]
[ 3855.324312]  mlx5e_poll_rx_cq+0x139/0x166 [mlx5_core]
[ 3855.324320]  mlx5e_napi_poll+0x87/0x26d [mlx5_core]
[ 3855.324322]  net_rx_action+0xd3/0x22d
[ 3855.324325]  __do_softirq+0xe4/0x23a
[ 3855.324329]  irq_exit+0x4d/0x5b
[ 3855.324330]  do_IRQ+0x96/0xae
[ 3855.324332]  common_interrupt+0x90/0x90
[ 3855.324332]  </IRQ>
[ 3855.324335] RIP: 0010:cpuidle_enter_state+0x134/0x189
[ 3855.324336] RSP: 0018:ffffc90003347ea0 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffb8
[ 3855.324337] RAX: 00000381a339f6dc RBX: 0000000000000002 RCX: 
000000000000001f
[ 3855.324338] RDX: 0000000000000000 RSI: 0000000000000007 RDI: 
0000000000000000
[ 3855.324339] RBP: ffffc90003347ed0 R08: 0000000000000007 R09: 
0000000000000003
[ 3855.324339] R10: ffffc90003347e70 R11: ffff88046fbd8c50 R12: 
ffff88046c398a00
[ 3855.324340] R13: 00000381a339f6dc R14: 0000000000000002 R15: 
00000381a3396ac5
[ 3855.324342]  cpuidle_enter+0x12/0x14
[ 3855.324344]  do_idle+0x113/0x16b
[ 3855.324345]  cpu_startup_entry+0x1a/0x1c
[ 3855.324347]  start_secondary+0xd0/0xd3
[ 3855.324350]  secondary_startup_64+0xa5/0xa5
[ 3855.326261] NEIGH: BUG, double timer add, state is 1
[ 3855.326265] CPU: 45 PID: 0 Comm: swapper/45 Not tainted 
4.13.0-rc5-next-20170815 #2
[ 3855.326267] Call Trace:
[ 3855.326268]  <IRQ>
[ 3855.326275]  dump_stack+0x4d/0x63
[ 3855.326279]  neigh_add_timer+0x36/0x39
[ 3855.326281]  __neigh_event_send+0x89/0x1bd
[ 3855.326282]  neigh_event_send+0x2b/0x2d
[ 3855.326284]  neigh_resolve_output+0x18/0x122
[ 3855.326287]  ip_finish_output2+0x24b/0x28f
[ 3855.326289]  ip_finish_output+0x101/0x10d
[ 3855.326290]  ip_output+0x56/0xa7
[ 3855.326292]  ? ip_route_input_rcu+0x4dd/0x7d3
[ 3855.326294]  ip_forward_finish+0x53/0x58
[ 3855.326295]  ip_forward+0x2b8/0x309
[ 3855.326296]  ? ip_frag_mem+0x1e/0x1e
[ 3855.326297]  ip_rcv_finish+0x27c/0x287
[ 3855.326299]  ip_rcv+0x2b0/0x2fd
[ 3855.326302]  __netif_receive_skb_core+0x316/0x4ab
[ 3855.326303]  __netif_receive_skb+0x18/0x57
[ 3855.326304]  ? __netif_receive_skb+0x18/0x57
[ 3855.326306]  netif_receive_skb_internal+0x4b/0xa1
[ 3855.326307]  napi_gro_receive+0x75/0xcc
[ 3855.326323]  mlx5e_handle_rx_cqe+0x3d3/0x48f [mlx5_core]
[ 3855.326328]  ? handle_irq_event+0x35/0x46
[ 3855.326336]  mlx5e_poll_rx_cq+0x139/0x166 [mlx5_core]
[ 3855.326344]  mlx5e_napi_poll+0x87/0x26d [mlx5_core]
[ 3855.326345]  net_rx_action+0xd3/0x22d
[ 3855.326349]  __do_softirq+0xe4/0x23a
[ 3855.326353]  ? tick_program_event+0x5d/0x64
[ 3855.326356]  irq_exit+0x4d/0x5b
[ 3855.326358]  smp_apic_timer_interrupt+0x29/0x34
[ 3855.326360]  apic_timer_interrupt+0x90/0xa0
[ 3855.326360]  </IRQ>
[ 3855.326363] RIP: 0010:cpuidle_enter_state+0x134/0x189
[ 3855.326364] RSP: 0018:ffffc90003477ea0 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffff10
[ 3855.326366] RAX: 00000381a33b2cd6 RBX: 0000000000000002 RCX: 
000000000000001f
[ 3855.326366] RDX: 0000000000000000 RSI: 000000000000002d RDI: 
0000000000000000
[ 3855.326367] RBP: ffffc90003477ed0 R08: 00000000ffffffe2 R09: 
0000000000000252
[ 3855.326368] R10: ffffc90003477e70 R11: ffff88087fa58c50 R12: 
ffff88046b918200
[ 3855.326368] R13: 00000381a33b2cd6 R14: 0000000000000002 R15: 
00000381a33a9c4a
[ 3855.326371]  cpuidle_enter+0x12/0x14
[ 3855.326372]  do_idle+0x113/0x16b
[ 3855.326373]  cpu_startup_entry+0x1a/0x1c
[ 3855.326376]  start_secondary+0xd0/0xd3
[ 3855.326379]  secondary_startup_64+0xa5/0xa5
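
For reference, the "NEIGH: BUG, double timer add" message is printed by
neigh_add_timer() in net/core/neighbour.c when the neighbour timer is
re-armed while it is already pending - roughly (a sketch of the
4.13-era helper, quoted from memory rather than from this exact tree):

static void neigh_add_timer(struct neighbour *n, unsigned long when)
{
	neigh_hold(n);
	/* mod_timer() returns 1 if the timer was already pending */
	if (unlikely(mod_timer(&n->timer, when))) {
		printk("NEIGH: BUG, double timer add, state is %x\n",
		       n->nud_state);
		dump_stack();
	}
}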



On 2017-08-15 at 22:53, Paweł Staszewski wrote:
> Hi
>
>
> Patch applied but no change still 100% CPU
>
> perf below:
>
>      1.99%     0.00%  ksoftirqd/1      [kernel.vmlinux] [k] run_ksoftirqd
>             |
>             ---run_ksoftirqd
>                |
>                 --1.99%--__softirqentry_text_start
>                           |
>                            --1.99%--net_rx_action
>                                      |
>                                       --1.98%--mlx5e_napi_poll
>                                                 |
> --1.98%--mlx5e_poll_rx_cq
>                                                            |
> --1.98%--mlx5e_handle_rx_cqe
>                                                                       |
> --1.96%--napi_gro_receive
> |
> --1.96%--netif_receive_skb_internal
> |
> --1.96%--__netif_receive_skb
> |
> --1.96%--__netif_receive_skb_core
> |
> --1.95%--ip_rcv
> |
> --1.95%--ip_rcv_finish
> |
> --1.94%--ip_forward
> |
> --1.94%--ip_forward_finish
> |
> --1.94%--ip_output
> |
> --1.94%--ip_finish_output
> |
> --1.94%--ip_finish_output2
> |
> --1.90%--neigh_resolve_output
> |
> --1.90%--neigh_event_send
> |
> --1.90%--__neigh_event_send
> |
> --1.87%--_raw_write_lock_bh
> queued_write_lock
> queued_write_lock_slowpath
> |
> --1.84%--queued_spin_lock_slowpath
>
>
> On 2017-08-15 at 21:11, Eric Dumazet wrote:
>> On Tue, 2017-08-15 at 19:42 +0200, Paweł Staszewski wrote:
>>> # To display the perf.data header info, please use
>>> --header/--header-only options.
>>> #
>>> #
>>> # Total Lost Samples: 0
>>> #
>>> # Samples: 2M of event 'cycles'
>>> # Event count (approx.): 1585571545969
>>> #
>>> # Children      Self  Command         Shared Object Symbol
>>> # ........  ........  ..............  ....................
>>> ..............................................
>>> #
>>>        1.82%     0.00%  ksoftirqd/43    [kernel.vmlinux] [k]
>>> __softirqentry_text_start
>>>               |
>>>                --1.82%--__softirqentry_text_start
>>>                          |
>>>                           --1.82%--net_rx_action
>>>                                     |
>>>                                      --1.82%--mlx5e_napi_poll
>>>                                                |
>>> --1.81%--mlx5e_poll_rx_cq
>>>                                                           |
>>> --1.81%--mlx5e_handle_rx_cqe
>>>                                                                      |
>>> --1.79%--napi_gro_receive
>>> |
>>> --1.78%--netif_receive_skb_internal
>>> |
>>> --1.78%--__netif_receive_skb
>>> |
>>> --1.78%--__netif_receive_skb_core
>>> |
>>> --1.78%--ip_rcv
>>> |
>>> --1.78%--ip_rcv_finish
>>> |
>>> --1.76%--ip_forward
>>> |
>>> --1.76%--ip_forward_finish
>>> |
>>> --1.76%--ip_output
>>> |
>>> --1.76%--ip_finish_output
>>> |
>>> --1.76%--ip_finish_output2
>>> |
>>> --1.73%--neigh_resolve_output
>>> |
>>> --1.73%--neigh_event_send
>>> |
>>> --1.73%--__neigh_event_send
>>> |
>>> --1.70%--_raw_write_lock_bh
>>> queued_write_lock
>>> queued_write_lock_slowpath
>>> |
>>> --1.67%--queued_spin_lock_slowpath
>>>
>>>
>> Please try this :
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 
>> 16a1a4c4eb57fa1147f230916e2e62e18ef89562..95e0d7702029b583de8229e3c3eb923f6395b072 
>> 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -991,14 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
>>     int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>>   {
>> -    int rc;
>>       bool immediate_probe = false;
>> +    int rc;
>> +
>> +    /* We _should_ test this under write_lock_bh(&neigh->lock),
>> +     * but this is too costly.
>> +     */
>> +    if (READ_ONCE(neigh->nud_state) & (NUD_CONNECTED | NUD_DELAY | 
>> NUD_PROBE))
>> +        return 0;
>>         write_lock_bh(&neigh->lock);
>>         rc = 0;
>> -    if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
>> -        goto out_unlock_bh;
>>       if (neigh->dead)
>>           goto out_dead;
>>
>>
>>
>
>


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 21:49           ` Julian Anastasov
@ 2017-08-15 22:11             ` Julian Anastasov
  0 siblings, 0 replies; 13+ messages in thread
From: Julian Anastasov @ 2017-08-15 22:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paweł Staszewski, Linux Kernel Network Developers


	Hello,

On Wed, 16 Aug 2017, Julian Anastasov wrote:

> 	I thought about this, it is possible in
> neigh_event_send:
> 
>         if (neigh->used != now)
>                 neigh->used = now;
> 	else if (neigh->nud_state == NUD_INCOMPLETE &&
> 		 neigh->arp_queue_len_bytes + skb->truesize >
> 		 NEIGH_VAR(neigh->parms, QUEUE_LEN_BYTES)

	With kfree_skb(skb) here, of course...

> 		return 1;
> 
> 	But this is really in fast path and not sure it is
> worth it. May be if we can move it somehow in __neigh_event_send
> but as neigh->used is changed early we need a better idea
> how to reduce the arp_queue hit rate...

Regards


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-15 21:06         ` Eric Dumazet
  2017-08-15 21:49           ` Julian Anastasov
@ 2017-08-16  7:42           ` Julian Anastasov
  2017-08-16 10:07             ` Paweł Staszewski
  1 sibling, 1 reply; 13+ messages in thread
From: Julian Anastasov @ 2017-08-16  7:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paweł Staszewski, Linux Kernel Network Developers


	Hello,

On Tue, 15 Aug 2017, Eric Dumazet wrote:

> It must be possible to add a fast path without locks.
> 
> (say if jiffies has not changed before last state change)

	New day - new idea. Something like this? But it
has a bug: without checking neigh->dead under the lock we don't
have the right to access neigh->parms; it can be destroyed
immediately by neigh_release->neigh_destroy->neigh_parms_put->
neigh_parms_destroy->kfree. Not sure, maybe kfree_rcu can help
with this...

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 9816df2..f52763c 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -428,10 +428,10 @@ static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 {
 	unsigned long now = jiffies;
 	
-	if (neigh->used != now)
-		neigh->used = now;
 	if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
 		return __neigh_event_send(neigh, skb);
+	if (neigh->used != now)
+		neigh->used = now;
 	return 0;
 }
 
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 16a1a4c..52a8718 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -991,8 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
 
 int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 {
-	int rc;
 	bool immediate_probe = false;
+	unsigned long now = jiffies;
+	int rc;
+
+	if (neigh->used != now) {
+		neigh->used = now;
+	} else if (neigh->nud_state == NUD_INCOMPLETE &&
+		   (!skb || neigh->arp_queue_len_bytes + skb->truesize >
+		    NEIGH_VAR(neigh->parms, QUEUE_LEN_BYTES))) {
+		kfree_skb(skb);
+		return 1;
+	}
 
 	write_lock_bh(&neigh->lock);
 
@@ -1005,7 +1015,7 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
 	if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
 		if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
 		    NEIGH_VAR(neigh->parms, APP_PROBES)) {
-			unsigned long next, now = jiffies;
+			unsigned long next;
 
 			atomic_set(&neigh->probes,
 				   NEIGH_VAR(neigh->parms, UCAST_PROBES));

Regards


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-16  7:42           ` Julian Anastasov
@ 2017-08-16 10:07             ` Paweł Staszewski
  2017-08-17 12:52               ` Paweł Staszewski
  0 siblings, 1 reply; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-16 10:07 UTC (permalink / raw)
  To: Julian Anastasov, Eric Dumazet; +Cc: Linux Kernel Network Developers

Hi


Patch applied - but no big change: from 0.7 Mpps per VLAN to 1.2 Mpps 
per VLAN.

Previously (without the patch), at 100% CPU load:

   bwm-ng v0.6.1 (probing every 0.500s), press 'h' for help
   input: /proc/net/dev type: rate
   |         iface                   Rx Tx                Total
==============================================================================
          vlan1002:            0.00 P/s             1.99 P/s          1.99 P/s
          vlan1001:            0.00 P/s        717227.12 P/s 717227.12 P/s
        enp175s0f0:      2713679.25 P/s             0.00 P/s 2713679.25 P/s
          vlan1000:            0.00 P/s        716145.44 P/s 716145.44 P/s
------------------------------------------------------------------------------
             total:      2713679.25 P/s       1433374.50 P/s 4147054.00 P/s


With the patch (still 100% CPU load, but slightly better pps performance):

  bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
   input: /proc/net/dev type: rate
   |         iface                   Rx Tx                Total
==============================================================================
          vlan1002:            0.00 P/s             1.00 P/s          1.00 P/s
          vlan1001:            0.00 P/s       1202161.50 P/s 1202161.50 P/s
        enp175s0f0:      3699864.50 P/s             0.00 P/s 3699864.50 P/s
          vlan1000:            0.00 P/s       1196870.38 P/s 1196870.38 P/s
------------------------------------------------------------------------------
             total:      3699864.50 P/s       2399033.00 P/s 6098897.50 P/s


perf top attached below:

      1.90%     0.00%  ksoftirqd/39    [kernel.vmlinux] [k] run_ksoftirqd
              --1.90%--run_ksoftirqd
                 --1.90%--__softirqentry_text_start
                    --1.90%--net_rx_action
                       --1.90%--mlx5e_napi_poll
                          --1.89%--mlx5e_poll_rx_cq
                             --1.88%--mlx5e_handle_rx_cqe
                                --1.85%--napi_gro_receive
                                   --1.85%--netif_receive_skb_internal
                                      --1.85%--__netif_receive_skb
                                         --1.85%--__netif_receive_skb_core
                                            --1.85%--ip_rcv
                                               --1.85%--ip_rcv_finish
                                                  --1.83%--ip_forward
                                                     --1.82%--ip_forward_finish
                                                        --1.82%--ip_output
                                                           --1.82%--ip_finish_output
                                                              --1.82%--ip_finish_output2
                                                                 --1.79%--neigh_resolve_output
                                                                    --1.77%--neigh_event_send
                                                                       --1.77%--__neigh_event_send
                                                                          --1.74%--_raw_write_lock_bh
                                                                             --1.74%--queued_write_lock
                                                                                      queued_write_lock_slowpath
                                                                                        --1.70%--queued_spin_lock_slowpath

      1.90%     0.00%  ksoftirqd/34    [kernel.vmlinux] [k] __softirqentry_text_start
             ---__softirqentry_text_start
                --1.90%--net_rx_action
                   --1.90%--mlx5e_napi_poll
                      --1.89%--mlx5e_poll_rx_cq
                         --1.88%--mlx5e_handle_rx_cqe
                            --1.86%--napi_gro_receive
                               --1.85%--netif_receive_skb_internal
                                  --1.85%--__netif_receive_skb
                                     --1.85%--__netif_receive_skb_core
                                        --1.85%--ip_rcv
                                           --1.85%--ip_rcv_finish
                                              --1.83%--ip_forward
                                                 --1.82%--ip_forward_finish
                                                    --1.82%--ip_output
                                                       --1.82%--ip_finish_output
                                                          --1.82%--ip_finish_output2
                                                             --1.79%--neigh_resolve_output
                                                                --1.77%--neigh_event_send
                                                                   --1.77%--__neigh_event_send
                                                                      --1.74%--_raw_write_lock_bh
                                                                               queued_write_lock
                                                                               queued_write_lock_slowpath
                                                                                 --1.71%--queued_spin_lock_slowpath

      1.85%     0.00%  ksoftirqd/38    [kernel.vmlinux]          [k] ip_rcv_finish
              --1.85%--ip_rcv_finish
                 --1.83%--ip_forward
                    --1.82%--ip_forward_finish
                       --1.82%--ip_output
                          --1.82%--ip_finish_output
                             --1.82%--ip_finish_output2
                                --1.79%--neigh_resolve_output
                                   --1.77%--neigh_event_send
                                      --1.77%--__neigh_event_send
                                         --1.74%--_raw_write_lock_bh
                                                  queued_write_lock
                                                  queued_write_lock_slowpath
                                                    --1.71%--queued_spin_lock_slowpath

      1.85%     0.00%  ksoftirqd/22    [kernel.vmlinux] [k] ip_rcv
              --1.85%--ip_rcv
                 --1.85%--ip_rcv_finish
                    --1.83%--ip_forward
                       --1.82%--ip_forward_finish
                          --1.82%--ip_output
                             --1.82%--ip_finish_output
                                --1.82%--ip_finish_output2
                                   --1.79%--neigh_resolve_output
                                      --1.77%--neigh_event_send
                                         --1.77%--__neigh_event_send
                                            --1.73%--_raw_write_lock_bh
                                                     queued_write_lock
                                                     queued_write_lock_slowpath
                                                       --1.70%--queued_spin_lock_slowpath

      1.83%     0.00%  ksoftirqd/9     [kernel.vmlinux]          [k] ip_forward
              --1.83%--ip_forward
                 --1.82%--ip_forward_finish
                    --1.82%--ip_output
                       --1.82%--ip_finish_output
                          --1.82%--ip_finish_output2
                             --1.79%--neigh_resolve_output
                                --1.77%--neigh_event_send
                                   --1.77%--__neigh_event_send
                                      --1.74%--_raw_write_lock_bh
                                               queued_write_lock
                                               queued_write_lock_slowpath
                                                 --1.70%--queued_spin_lock_slowpath

      1.82%     0.00%  ksoftirqd/35    [kernel.vmlinux]          [k] ip_output
              --1.82%--ip_output
                 --1.82%--ip_finish_output
                    --1.82%--ip_finish_output2
                       --1.79%--neigh_resolve_output
                          --1.77%--neigh_event_send
                             --1.77%--__neigh_event_send
                                --1.74%--_raw_write_lock_bh
                                         queued_write_lock
                                         queued_write_lock_slowpath
                                           --1.71%--queued_spin_lock_slowpath

      1.82%     0.00%  ksoftirqd/38    [kernel.vmlinux]          [k] ip_finish_output
              --1.82%--ip_finish_output
                 --1.82%--ip_finish_output2
                    --1.79%--neigh_resolve_output
                       --1.77%--neigh_event_send
                          --1.77%--__neigh_event_send
                             --1.74%--_raw_write_lock_bh
                                      queued_write_lock
                                      queued_write_lock_slowpath
                                        --1.71%--queued_spin_lock_slowpath

      1.82%     0.00%  ksoftirqd/37    [kernel.vmlinux]          [k] ip_forward_finish
              --1.82%--ip_forward_finish
                        ip_output
                         --1.82%--ip_finish_output
                            --1.82%--ip_finish_output2
                               --1.79%--neigh_resolve_output
                                  --1.76%--neigh_event_send
                                            __neigh_event_send
                                              --1.73%--_raw_write_lock_bh
                                                       queued_write_lock
                                                       queued_write_lock_slowpath
                                                         --1.70%--queued_spin_lock_slowpath


On 2017-08-16 at 09:42, Julian Anastasov wrote:
> 	Hello,
>
> On Tue, 15 Aug 2017, Eric Dumazet wrote:
>
>> It must be possible to add a fast path without locks.
>>
>> (say if jiffies has not changed before last state change)
> 	New day - new idea. Something like this? But it
> has bug: without checking neigh->dead under lock we don't
> have the right to access neigh->parms, it can be destroyed
> immediately by neigh_release->neigh_destroy->neigh_parms_put->
> neigh_parms_destroy->kfree. Not sure, may be kfree_rcu can help
> for this...
>
> diff --git a/include/net/neighbour.h b/include/net/neighbour.h
> index 9816df2..f52763c 100644
> --- a/include/net/neighbour.h
> +++ b/include/net/neighbour.h
> @@ -428,10 +428,10 @@ static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>   {
>   	unsigned long now = jiffies;
>   	
> -	if (neigh->used != now)
> -		neigh->used = now;
>   	if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
>   		return __neigh_event_send(neigh, skb);
> +	if (neigh->used != now)
> +		neigh->used = now;
>   	return 0;
>   }
>   
> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index 16a1a4c..52a8718 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -991,8 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
>   
>   int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>   {
> -	int rc;
>   	bool immediate_probe = false;
> +	unsigned long now = jiffies;
> +	int rc;
> +
> +	if (neigh->used != now) {
> +		neigh->used = now;
> +	} else if (neigh->nud_state == NUD_INCOMPLETE &&
> +		   (!skb || neigh->arp_queue_len_bytes + skb->truesize >
> +		    NEIGH_VAR(neigh->parms, QUEUE_LEN_BYTES))) {
> +		kfree_skb(skb);
> +		return 1;
> +	}
>   
>   	write_lock_bh(&neigh->lock);
>   
> @@ -1005,7 +1015,7 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>   	if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
>   		if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
>   		    NEIGH_VAR(neigh->parms, APP_PROBES)) {
> -			unsigned long next, now = jiffies;
> +			unsigned long next;
>   
>   			atomic_set(&neigh->probes,
>   				   NEIGH_VAR(neigh->parms, UCAST_PROBES));
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
>


* Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable
  2017-08-16 10:07             ` Paweł Staszewski
@ 2017-08-17 12:52               ` Paweł Staszewski
  0 siblings, 0 replies; 13+ messages in thread
From: Paweł Staszewski @ 2017-08-17 12:52 UTC (permalink / raw)
  To: Julian Anastasov, Eric Dumazet; +Cc: Linux Kernel Network Developers

Hi


Wondering if someone has an idea how to optimise this?

From a real-life perspective it is really important to optimise this 
behaviour. Imagine a perfectly normal situation with Linux acting as a 
router:

Let's say we have a Linux router with 3 connected customers and one 
upstream, and a DDoS comes in.

1. DDoS traffic arrives from the upstream (and let's say our forwarding 
router is handling this DDoS at 50% CPU load), and the router forwards 
this traffic to one of the customers.

(Remember we have 3 customers - one of them is now getting a DDoS 
directed at some of his IPs.)

2. Customer X is getting DDoSed - his router is at 100% CPU load 
(because of low-end hardware, or 100% load on the uplink).

3. Customer X's router stops responding, or some watchdog restarts it, 
or customer X shuts the router down for security reasons.

4. After the FDB entry expires on our service router, the two other 
customers start to have problems, because our router goes from 50% to 
100% on all cores and everybody now experiences packet drops and 
bandwidth drops.


And this doesn't need to be a DDoS - it can be many servers connected 
via a Linux router, with one server pushing a UDP stream (IPTV or some 
filesystem-syncing protocol) to another through the forwarding Linux 
router. If the receiving server goes down and disappears from ARP, all 
other streams forwarded by the Linux router will suffer from this.



On 2017-08-16 at 12:07, Paweł Staszewski wrote:
> Hi
>
>
> Patch applied - but no big change - from 0.7Mpps per vlan to 1.2Mpps 
> per vlan
>
> previously(without patch) 100% cpu load:
>
>   bwm-ng v0.6.1 (probing every 0.500s), press 'h' for help
>   input: /proc/net/dev type: rate
>   |         iface                   Rx Tx                Total
> ============================================================================== 
>
>          vlan1002:            0.00 P/s             1.99 
> P/s             1.99 P/s
>          vlan1001:            0.00 P/s        717227.12 P/s 717227.12 P/s
>        enp175s0f0:      2713679.25 P/s             0.00 P/s 2713679.25 
> P/s
>          vlan1000:            0.00 P/s        716145.44 P/s 716145.44 P/s
> ------------------------------------------------------------------------------ 
>
>             total:      2713679.25 P/s       1433374.50 P/s 4147054.00 
> P/s
>
>
> With patch (100% cpu load a little better pps performance)
>
>  bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
>   input: /proc/net/dev type: rate
>   |         iface                   Rx Tx                Total
> ============================================================================== 
>
>          vlan1002:            0.00 P/s             1.00 
> P/s             1.00 P/s
>          vlan1001:            0.00 P/s       1202161.50 P/s 1202161.50 
> P/s
>        enp175s0f0:      3699864.50 P/s             0.00 P/s 3699864.50 
> P/s
>          vlan1000:            0.00 P/s       1196870.38 P/s 1196870.38 
> P/s
> ------------------------------------------------------------------------------ 
>
>             total:      3699864.50 P/s       2399033.00 P/s 6098897.50 
> P/s
>
>
> perf top attached below:
>
>      1.90%     0.00%  ksoftirqd/39    [kernel.vmlinux] [k] run_ksoftirqd
>             |
>              --1.90%--run_ksoftirqd
>                        |
>                         --1.90%--__softirqentry_text_start
>                                   |
>                                    --1.90%--net_rx_action
>                                              |
> --1.90%--mlx5e_napi_poll
>                                                         |
> --1.89%--mlx5e_poll_rx_cq
> |
> --1.88%--mlx5e_handle_rx_cqe
> |
> --1.85%--napi_gro_receive
> |
> --1.85%--netif_receive_skb_internal
> |
> --1.85%--__netif_receive_skb
> |
> --1.85%--__netif_receive_skb_core
> |
> --1.85%--ip_rcv
> |
> --1.85%--ip_rcv_finish
> |
> --1.83%--ip_forward
> |
> --1.82%--ip_forward_finish
> |
> --1.82%--ip_output
> |
> --1.82%--ip_finish_output
> |
> --1.82%--ip_finish_output2
> |
> --1.79%--neigh_resolve_output
> |
> --1.77%--neigh_event_send
> |
> --1.77%--__neigh_event_send
> |
> --1.74%--_raw_write_lock_bh
> |
> --1.74%--queued_write_lock
> queued_write_lock_slowpath
> |
> --1.70%--queued_spin_lock_slowpath
>
>
>      1.90%     0.00%  ksoftirqd/34    [kernel.vmlinux]  [k] __softirqentry_text_start
>             ---__softirqentry_text_start
>               --1.90%--net_rx_action
>                --1.90%--mlx5e_napi_poll
>                 --1.89%--mlx5e_poll_rx_cq
>                  --1.88%--mlx5e_handle_rx_cqe
>                   --1.86%--napi_gro_receive
>                    --1.85%--netif_receive_skb_internal
>                     --1.85%--__netif_receive_skb
>                      --1.85%--__netif_receive_skb_core
>                       --1.85%--ip_rcv
>                        --1.85%--ip_rcv_finish
>                         --1.83%--ip_forward
>                          --1.82%--ip_forward_finish
>                           --1.82%--ip_output
>                            --1.82%--ip_finish_output
>                             --1.82%--ip_finish_output2
>                              --1.79%--neigh_resolve_output
>                               --1.77%--neigh_event_send
>                                --1.77%--__neigh_event_send
>                                 --1.74%--_raw_write_lock_bh
>                                          queued_write_lock
>                                          queued_write_lock_slowpath
>                                  --1.71%--queued_spin_lock_slowpath
>
>      1.85%     0.00%  ksoftirqd/38    [kernel.vmlinux]  [k] ip_rcv_finish
>              --1.85%--ip_rcv_finish
>               --1.83%--ip_forward
>                --1.82%--ip_forward_finish
>                 --1.82%--ip_output
>                  --1.82%--ip_finish_output
>                   --1.82%--ip_finish_output2
>                    --1.79%--neigh_resolve_output
>                     --1.77%--neigh_event_send
>                      --1.77%--__neigh_event_send
>                       --1.74%--_raw_write_lock_bh
>                                queued_write_lock
>                                queued_write_lock_slowpath
>                        --1.71%--queued_spin_lock_slowpath
>
>      1.85%     0.00%  ksoftirqd/22    [kernel.vmlinux]  [k] ip_rcv
>              --1.85%--ip_rcv
>               --1.85%--ip_rcv_finish
>                --1.83%--ip_forward
>                 --1.82%--ip_forward_finish
>                  --1.82%--ip_output
>                   --1.82%--ip_finish_output
>                    --1.82%--ip_finish_output2
>                     --1.79%--neigh_resolve_output
>                      --1.77%--neigh_event_send
>                       --1.77%--__neigh_event_send
>                        --1.73%--_raw_write_lock_bh
>                                 queued_write_lock
>                                 queued_write_lock_slowpath
>                         --1.70%--queued_spin_lock_slowpath
>
>      1.83%     0.00%  ksoftirqd/9     [kernel.vmlinux]  [k] ip_forward
>              --1.83%--ip_forward
>               --1.82%--ip_forward_finish
>                --1.82%--ip_output
>                 --1.82%--ip_finish_output
>                  --1.82%--ip_finish_output2
>                   --1.79%--neigh_resolve_output
>                    --1.77%--neigh_event_send
>                     --1.77%--__neigh_event_send
>                      --1.74%--_raw_write_lock_bh
>                               queued_write_lock
>                               queued_write_lock_slowpath
>                       --1.70%--queued_spin_lock_slowpath
>
>
>      1.82%     0.00%  ksoftirqd/35    [kernel.vmlinux]  [k] ip_output
>              --1.82%--ip_output
>               --1.82%--ip_finish_output
>                --1.82%--ip_finish_output2
>                 --1.79%--neigh_resolve_output
>                  --1.77%--neigh_event_send
>                   --1.77%--__neigh_event_send
>                    --1.74%--_raw_write_lock_bh
>                             queued_write_lock
>                             queued_write_lock_slowpath
>                     --1.71%--queued_spin_lock_slowpath
>
>      1.82%     0.00%  ksoftirqd/38    [kernel.vmlinux]  [k] ip_finish_output
>              --1.82%--ip_finish_output
>               --1.82%--ip_finish_output2
>                --1.79%--neigh_resolve_output
>                 --1.77%--neigh_event_send
>                  --1.77%--__neigh_event_send
>                   --1.74%--_raw_write_lock_bh
>                            queued_write_lock
>                            queued_write_lock_slowpath
>                    --1.71%--queued_spin_lock_slowpath
>
>      1.82%     0.00%  ksoftirqd/37    [kernel.vmlinux]  [k] ip_forward_finish
>              --1.82%--ip_forward_finish
>                        ip_output
>               --1.82%--ip_finish_output
>                --1.82%--ip_finish_output2
>                 --1.79%--neigh_resolve_output
>                  --1.76%--neigh_event_send
>                           __neigh_event_send
>                   --1.73%--_raw_write_lock_bh
>                            queued_write_lock
>                            queued_write_lock_slowpath
>                    --1.70%--queued_spin_lock_slowpath
>
>
> On 2017-08-16 at 09:42, Julian Anastasov wrote:
>>     Hello,
>>
>> On Tue, 15 Aug 2017, Eric Dumazet wrote:
>>
>>> It must be possible to add a fast path without locks.
>>>
>>> (say if jiffies has not changed before last state change)
>>     New day - new idea. Something like this? But it
>> has a bug: without checking neigh->dead under the lock we don't
>> have the right to access neigh->parms; it can be destroyed
>> immediately by neigh_release->neigh_destroy->neigh_parms_put->
>> neigh_parms_destroy->kfree. Not sure, maybe kfree_rcu can help
>> with this...
>>
>> diff --git a/include/net/neighbour.h b/include/net/neighbour.h
>> index 9816df2..f52763c 100644
>> --- a/include/net/neighbour.h
>> +++ b/include/net/neighbour.h
>> @@ -428,10 +428,10 @@ static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>>   {
>>       unsigned long now = jiffies;
>>
>> -    if (neigh->used != now)
>> -        neigh->used = now;
>>       if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
>>           return __neigh_event_send(neigh, skb);
>> +    if (neigh->used != now)
>> +        neigh->used = now;
>>       return 0;
>>   }
>>
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index 16a1a4c..52a8718 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -991,8 +991,18 @@ static void neigh_timer_handler(unsigned long arg)
>>
>>   int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>>   {
>> -    int rc;
>>       bool immediate_probe = false;
>> +    unsigned long now = jiffies;
>> +    int rc;
>> +
>> +    if (neigh->used != now) {
>> +        neigh->used = now;
>> +    } else if (neigh->nud_state == NUD_INCOMPLETE &&
>> +           (!skb || neigh->arp_queue_len_bytes + skb->truesize >
>> +            NEIGH_VAR(neigh->parms, QUEUE_LEN_BYTES))) {
>> +        kfree_skb(skb);
>> +        return 1;
>> +    }
>>         write_lock_bh(&neigh->lock);
>>
>> @@ -1005,7 +1015,7 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
>>       if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
>>           if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
>>               NEIGH_VAR(neigh->parms, APP_PROBES)) {
>> -            unsigned long next, now = jiffies;
>> +            unsigned long next;
>>                 atomic_set(&neigh->probes,
>>                      NEIGH_VAR(neigh->parms, UCAST_PROBES));
>>
>> Regards
>>
>> -- 
>> Julian Anastasov <ja@ssi.bg>
>>
>
>
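
On Julian's kfree_rcu remark quoted above: the concern is that a lockless
fast path may dereference neigh->parms on one CPU while another CPU frees it
via neigh_release()->neigh_destroy(). A hedged sketch of the kfree_rcu idea
(an assumption for illustration, not a patch posted in this thread) would
defer the free past an RCU grace period, so an RCU-protected reader cannot
see the structure disappear underneath it:

/* Hypothetical sketch only: the struct/function names mirror
 * net/core/neighbour.c, but the rcu_head member and the deferred free
 * are assumed additions, not a posted fix. */
struct neigh_parms {
        /* ... existing members ... */
        struct rcu_head rcu_head;       /* assumed addition */
};

static void neigh_parms_destroy(struct neigh_parms *parms)
{
        /* Free only after an RCU grace period, so a fast-path reader that
         * fetched neigh->parms inside an RCU read-side critical section
         * cannot have it kfree()d underneath it. */
        kfree_rcu(parms, rcu_head);
}

The fast-path reader would then still have to keep the neigh->parms
dereference inside an RCU read-side critical section.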



Thread overview: 13+ messages
2017-08-15 16:30 100% CPU load when generating traffic to destination network that nexthop is not reachable Paweł Staszewski
2017-08-15 16:57 ` Eric Dumazet
2017-08-15 17:42   ` Paweł Staszewski
2017-08-15 19:11     ` Eric Dumazet
2017-08-15 19:45       ` Julian Anastasov
2017-08-15 21:06         ` Eric Dumazet
2017-08-15 21:49           ` Julian Anastasov
2017-08-15 22:11             ` Julian Anastasov
2017-08-16  7:42           ` Julian Anastasov
2017-08-16 10:07             ` Paweł Staszewski
2017-08-17 12:52               ` Paweł Staszewski
2017-08-15 20:53       ` Paweł Staszewski
2017-08-15 22:00         ` Paweł Staszewski
