* High CPU load when running smartdns: __ipv6_dev_get_saddr
@ 2023-08-27 14:20 Martin Zaharinov
2023-08-27 16:51 ` David Ahern
0 siblings, 1 reply; 5+ messages in thread
From: Martin Zaharinov @ 2023-08-27 14:20 UTC (permalink / raw)
To: netdev, Eric Dumazet
Hi Eric,
I need your help to determine whether this is a bug or not.
I talked with the smartdns team and tried to research their code, but so far we have not found anything.
The test system has 5k PPP users on a PPPoE device.
After starting smartdns, the service goes to 100% CPU load.
Normally, when running either of two other DNS servers (ISC BIND or Knot), everything is fine.
But when running smartdns, perf shows:
PerfTop: 4223 irqs/sec kernel:96.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (target_pid: 1208268)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
28.48% [kernel] [k] __ipv6_dev_get_saddr
12.31% [kernel] [k] l3mdev_master_ifindex_rcu
6.63% [pppoe] [k] pppoe_rcv
3.82% [kernel] [k] ipv6_dev_get_saddr
2.07% [kernel] [k] __dev_queue_xmit
1.46% [ixgbe] [k] ixgbe_clean_rx_irq
1.42% [kernel] [k] memcmp
1.37% [nf_tables] [k] nft_do_chain
1.35% [nf_tables] [k] __nft_rbtree_lookup
1.29% [kernel] [k] __netif_receive_skb_core.constprop.0
1.01% [kernel] [k] strncpy
0.93% [kernel] [k] fib_table_lookup
0.91% [kernel] [k] dev_queue_xmit_nit
0.89% [kernel] [k] csum_partial_copy_generic
0.89% [kernel] [k] skb_clone
0.86% [kernel] [k] __skb_flow_dissect
0.83% [kernel] [k] __copy_skb_header
0.80% [kernel] [k] kmem_cache_free
0.73% [nf_tables] [k] nft_rhash_lookup
0.70% [nf_conntrack] [k] __nf_conntrack_find_get.isra.0
0.70% [kernel] [k] skb_release_data
0.64% [ixgbe] [k] ixgbe_tx_map
0.53% [kernel] [k] kmem_cache_alloc
0.51% [kernel] [k] kfree_skb_reason
0.48% [kernel] [k] ip_route_input_slow
0.48% [kernel] [k] dev_hard_start_xmit
0.48% [kernel] [k] ip_finish_output2
0.47% [vlan_mon] [k] vlan_pt_recv
0.42% [kernel] [k] nf_hook_slow
0.40% [kernel] [k] __siphash_unaligned
0.39% [kernel] [k] ___slab_alloc.isra.0
0.38% [nf_tables] [k] nft_lookup_eval
0.38% [ixgbe] [k] ixgbe_xmit_frame_ring
0.36% [kernel] [k] netif_skb_features
0.33% [kernel] [k] dev_gro_receive
0.33% [kernel] [k] ip_rcv_core.constprop.0
0.32% [kernel] [k] vlan_do_receive
0.30% [kernel] [k] ip_forward
0.30% [kernel] [k] get_rps_cpu
0.30% [kernel] [k] process_backlog
0.30% [kernel] [k] ktime_get_with_offset
0.30% [kernel] [k] _raw_spin_lock_irqsave
0.30% [kernel] [k] validate_xmit_skb.isra.0
0.30% [kernel] [k] __rcu_read_unlock
0.30% [kernel] [k] sch_direct_xmit
0.29% [kernel] [k] page_frag_free
0.29% [nf_conntrack] [k] nf_conntrack_in
0.28% [kernel] [k] netdev_core_pick_tx
0.28% [nf_tables] [k] nft_meta_get_eval
0.27% [kernel] [k] kmem_cache_free_bulk.part.0
0.26% [kernel] [k] ip_output
0.26% [kernel] [k] _raw_spin_lock_bh
0.26% [kernel] [k] __local_bh_enable_ip
0.25% [kernel] [k] netdev_pick_tx
0.23% [ppp_generic] [k] __ppp_xmit_process
0.23% [nf_nat] [k] l4proto_manip_pkt
0.23% [ixgbe] [k] ixgbe_process_skb_fields
0.23% [pppoe] [k] pppoe_xmit
0.23% [kernel] [k] skb_network_protocol
0.22% [kernel] [k] inet_gro_receive
0.22% [ppp_generic] [k] ppp_start_xmit
0.22% [kernel] [k] __list_del_entry_valid
0.20% [kernel] [k] __slab_free.isra.0
0.20% [kernel] [k] _raw_spin_lock
0.20% [kernel] [k] __rcu_read_lock
0.20% [kernel] [k] csum_partial
0.20% [kernel] [k] read_tsc
0.19% [nf_nat] [k] nf_nat_ipv4_manip_pkt
0.18% [kernel] [k] napi_build_skb
0.18% [ixgbe] [k] ixgbe_clean_tx_irq
0.18% [kernel] [k] dma_map_page_attrs
0.17% [ppp_generic] [k] ppp_push
0.16% [kernel] [k] vlan_dev_hard_start_xmit
0.15% [kernel] [k] skb_segment
0.14% [kernel] [k] napi_consume_skb
0.14% [kernel] [k] enqueue_to_backlog
0.13% [kernel] [k] kmem_cache_alloc_bulk
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: High CPU load when running smartdns: __ipv6_dev_get_saddr
2023-08-27 14:20 High CPU load when running smartdns: __ipv6_dev_get_saddr Martin Zaharinov
@ 2023-08-27 16:51 ` David Ahern
2023-08-27 20:17 ` Martin Zaharinov
0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2023-08-27 16:51 UTC (permalink / raw)
To: Martin Zaharinov, netdev, Eric Dumazet
On 8/27/23 7:20 AM, Martin Zaharinov wrote:
> Hi Eric
>
>
> I need your help to determine whether this is a bug or not.
>
> I talked with the smartdns team and tried to research their code, but so far we have not found anything.
>
> The test system has 5k PPP users on a PPPoE device.
>
> After starting smartdns, the service goes to 100% CPU load.
>
> Normally, when running either of two other DNS servers (ISC BIND or Knot), everything is fine.
>
> But when running smartdns, perf shows:
>
>
> PerfTop: 4223 irqs/sec kernel:96.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (target_pid: 1208268)
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> 28.48% [kernel] [k] __ipv6_dev_get_saddr
> 12.31% [kernel] [k] l3mdev_master_ifindex_rcu
> 6.63% [pppoe] [k] pppoe_rcv
> 3.82% [kernel] [k] ipv6_dev_get_saddr
> 2.07% [kernel] [k] __dev_queue_xmit
Can you post stack traces for the top 5 symbols?
What is the packet rate when the above is taken?
4,223 irqs/sec is not much of a load; can you add some details on the
hardware and networking setup (e.g., l3mdev reference suggests you are
using VRF)?
* Re: High CPU load when running smartdns: __ipv6_dev_get_saddr
2023-08-27 16:51 ` David Ahern
@ 2023-08-27 20:17 ` Martin Zaharinov
2023-08-28 2:42 ` David Ahern
0 siblings, 1 reply; 5+ messages in thread
From: Martin Zaharinov @ 2023-08-27 20:17 UTC (permalink / raw)
To: David Ahern; +Cc: netdev, Eric Dumazet, pymumu
Hi David,
> On 27 Aug 2023, at 19:51, David Ahern <dsahern@kernel.org> wrote:
>
> On 8/27/23 7:20 AM, Martin Zaharinov wrote:
>> Hi Eric
>>
>>
>> I need your help to determine whether this is a bug or not.
>>
>> I talked with the smartdns team and tried to research their code, but so far we have not found anything.
>>
>> The test system has 5k PPP users on a PPPoE device.
>>
>> After starting smartdns, the service goes to 100% CPU load.
>>
>> Normally, when running either of two other DNS servers (ISC BIND or Knot), everything is fine.
>>
>> But when running smartdns, perf shows:
>>
>>
>> PerfTop: 4223 irqs/sec kernel:96.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (target_pid: 1208268)
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>> 28.48% [kernel] [k] __ipv6_dev_get_saddr
>> 12.31% [kernel] [k] l3mdev_master_ifindex_rcu
>> 6.63% [pppoe] [k] pppoe_rcv
>> 3.82% [kernel] [k] ipv6_dev_get_saddr
>> 2.07% [kernel] [k] __dev_queue_xmit
>
> Can you post stack traces for the top 5 symbols?
If you tell me how, I will get them.
>
> What is the packet rate when the above is taken?
It's a normal rate of DNS queries… with both of the other DNS servers everything is fine.
>
> 4,223 irqs/sec is not much of a load; can you add some details on the
> hardware and networking setup (e.g., l3mdev reference suggests you are
> using VRF)?
No, the system is very simple:
eth0 (Internet) - Router (smartdns + PPPoE server) - eth1 (user side, with the PPPoE server); there are 5000 ppp interfaces.
With both of the other services I don't see this; everything works fine.
I will also CC Nick Peng, the developer of SmartDNS, in case he has an idea…
m.
* Re: High CPU load when running smartdns: __ipv6_dev_get_saddr
2023-08-27 20:17 ` Martin Zaharinov
@ 2023-08-28 2:42 ` David Ahern
2023-08-28 5:06 ` Martin Zaharinov
0 siblings, 1 reply; 5+ messages in thread
From: David Ahern @ 2023-08-28 2:42 UTC (permalink / raw)
To: Martin Zaharinov; +Cc: netdev, Eric Dumazet, pymumu
On 8/27/23 1:17 PM, Martin Zaharinov wrote:
> Hi David,
>
>
>
>> On 27 Aug 2023, at 19:51, David Ahern <dsahern@kernel.org> wrote:
>>
>> On 8/27/23 7:20 AM, Martin Zaharinov wrote:
>>> Hi Eric
>>>
>>>
>>> I need your help to determine whether this is a bug or not.
>>>
>>> I talked with the smartdns team and tried to research their code, but so far we have not found anything.
>>>
>>> The test system has 5k PPP users on a PPPoE device.
>>>
>>> After starting smartdns, the service goes to 100% CPU load.
>>>
>>> Normally, when running either of two other DNS servers (ISC BIND or Knot), everything is fine.
>>>
>>> But when running smartdns, perf shows:
>>>
>>>
>>> PerfTop: 4223 irqs/sec kernel:96.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (target_pid: 1208268)
>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>
>>> 28.48% [kernel] [k] __ipv6_dev_get_saddr
>>> 12.31% [kernel] [k] l3mdev_master_ifindex_rcu
>>> 6.63% [pppoe] [k] pppoe_rcv
>>> 3.82% [kernel] [k] ipv6_dev_get_saddr
>>> 2.07% [kernel] [k] __dev_queue_xmit
>>
>> Can you post stack traces for the top 5 symbols?
>
> If you tell me how, I will get them.
While running traffic load:
perf record -a -g -- sleep 5
perf report --stdio
>
>>
>> What is the packet rate when the above is taken?
>
> It's a normal rate of DNS queries… with both of the other DNS servers everything is fine.
That means nothing to me. You will need to post packet rates.
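A rough way to sample them is to diff the kernel's own per-device packet counters over a short interval. This is only a sketch: the interface name is a placeholder, so pass the device actually carrying the traffic (e.g. eth0).

```shell
#!/bin/sh
# Rough packets/sec for one interface, taken by diffing the
# rx/tx packet counters in /sys/class/net over a short interval.
# The default "lo" is a placeholder; pass the real uplink device.
IFACE=${1:-lo}
SECS=2
sample() {
    rx=$(cat /sys/class/net/"$IFACE"/statistics/rx_packets)
    tx=$(cat /sys/class/net/"$IFACE"/statistics/tx_packets)
    echo $(( rx + tx ))
}
p1=$(sample)
sleep "$SECS"
p2=$(sample)
echo "$IFACE: $(( (p2 - p1) / SECS )) pkts/sec (rx+tx)"
```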
>
>>
>> 4,223 irqs/sec is not much of a load; can you add some details on the
>> hardware and networking setup (e.g., l3mdev reference suggests you are
>> using VRF)?
> No, the system is very simple:
>
> eth0 (Internet) - Router (smartdns + PPPoE server) - eth1 (user side, with the PPPoE server); there are 5000 ppp interfaces.
>
> With both of the other services I don't see this; everything works fine.
ip link sh type vrf
--> that does not show any devices? It should, because the majority of the
work done in l3mdev_master_ifindex_rcu is for VRF port devices; i.e., it
should not appear in the perf-top data you posted unless VRF devices are
in play.
* Re: High CPU load when running smartdns: __ipv6_dev_get_saddr
2023-08-28 2:42 ` David Ahern
@ 2023-08-28 5:06 ` Martin Zaharinov
0 siblings, 0 replies; 5+ messages in thread
From: Martin Zaharinov @ 2023-08-28 5:06 UTC (permalink / raw)
To: David Ahern; +Cc: netdev, Eric Dumazet, pymumu
Hi David
> On 28 Aug 2023, at 5:42, David Ahern <dsahern@kernel.org> wrote:
>
> On 8/27/23 1:17 PM, Martin Zaharinov wrote:
>> Hi David,
>>
>>
>>
>>> On 27 Aug 2023, at 19:51, David Ahern <dsahern@kernel.org> wrote:
>>>
>>> On 8/27/23 7:20 AM, Martin Zaharinov wrote:
>>>> Hi Eric
>>>>
>>>>
>>>> I need your help to determine whether this is a bug or not.
>>>>
>>>> I talked with the smartdns team and tried to research their code, but so far we have not found anything.
>>>>
>>>> The test system has 5k PPP users on a PPPoE device.
>>>>
>>>> After starting smartdns, the service goes to 100% CPU load.
>>>>
>>>> Normally, when running either of two other DNS servers (ISC BIND or Knot), everything is fine.
>>>>
>>>> But when running smartdns, perf shows:
>>>>
>>>>
>>>> PerfTop: 4223 irqs/sec kernel:96.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (target_pid: 1208268)
>>>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>>>
>>>> 28.48% [kernel] [k] __ipv6_dev_get_saddr
>>>> 12.31% [kernel] [k] l3mdev_master_ifindex_rcu
>>>> 6.63% [pppoe] [k] pppoe_rcv
>>>> 3.82% [kernel] [k] ipv6_dev_get_saddr
>>>> 2.07% [kernel] [k] __dev_queue_xmit
>>>
>>> Can you post stack traces for the top 5 symbols?
>>
>> If you tell me how, I will get them.
>
> While running traffic load:
> perf record -a -g -- sleep 5
> perf report --stdio
>
Here is the perf.data file: https://easyupload.io/k3ep8l
>>
>>>
>>> What is the packet rate when the above is taken?
>>
>> It's a normal rate of DNS queries… with both of the other DNS servers everything is fine.
>
> That means nothing to me. You will need to post packet rates.
I honestly don't know how to measure it, but I don't think it is more than 10k QPS; the system has 5-5.5k users.
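One rough way to estimate it, assuming almost all queries are plain UDP, is to diff the UDP datagram counter in /proc/net/snmp over an interval. This is a sketch and counts all UDP traffic on the box, so on a router that forwards other UDP it overestimates.

```shell
#!/bin/sh
# Rough upper bound on DNS QPS: delta of UDP datagrams received over
# a short interval. /proc/net/snmp has a header line ("Udp: InDatagrams
# ...") and a value line; the numeric test on $2 selects the value line.
SECS=2
udp_in() { awk '/^Udp:/ && $2 ~ /^[0-9]+$/ {print $2}' /proc/net/snmp; }
a=$(udp_in)
sleep "$SECS"
b=$(udp_in)
echo "~$(( (b - a) / SECS )) UDP datagrams/sec"
```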
>
>>
>>>
>>> 4,223 irqs/sec is not much of a load; can you add some details on the
>>> hardware and networking setup (e.g., l3mdev reference suggests you are
>>> using VRF)?
>> No, the system is very simple:
>>
>> eth0 (Internet) - Router (smartdns + PPPoE server) - eth1 (user side, with the PPPoE server); there are 5000 ppp interfaces.
>>
>> With both of the other services I don't see this; everything works fine.
>
> ip link sh type vrf
> --> that does not show any devices? It should because the majority of
> work done in l3mdev_master_ifindex_rcu is for vrf port devices. ie., it
> should not appear in the perf-top data you posted unless vrf devices are
> in play.
VRF is disabled in the kernel config.
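For reference, l3mdev_master_ifindex_rcu only exists as a real (profilable) symbol when CONFIG_NET_L3_MASTER_DEV=y, so that option is worth checking alongside CONFIG_NET_VRF. A sketch for doing so, with the caveat that the config file location varies by distro:

```shell
#!/bin/sh
# Check whether l3mdev/VRF support is compiled in. CONFIG_NET_L3_MASTER_DEV=y
# alone is enough for l3mdev_master_ifindex_rcu to show up in profiles.
cfg="/boot/config-$(uname -r)"
[ -r "$cfg" ] || cfg=/proc/config.gz   # fallback when /boot has no config
if [ -r "$cfg" ]; then
    # zcat -f handles both plain and gzipped config files
    matches=$(zcat -f "$cfg" | grep -cE '^CONFIG_NET_(L3_MASTER_DEV|VRF)=y' || true)
else
    matches=0
    echo "no readable kernel config found"
fi
echo "l3mdev/VRF options enabled: $matches"
```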
end of thread, other threads:[~2023-08-28 5:06 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
2023-08-27 14:20 High CPU load when running smartdns: __ipv6_dev_get_saddr Martin Zaharinov
2023-08-27 16:51 ` David Ahern
2023-08-27 20:17 ` Martin Zaharinov
2023-08-28 2:42 ` David Ahern
2023-08-28 5:06 ` Martin Zaharinov