* [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
@ 2023-10-17  1:47 Coco Li
  2023-10-17  1:47 ` [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li

Currently, variable-heavy structs in the networking stack are organized
chronologically, logically and sometimes by cache line access.

This patch series attempts to reorganize the core networking stack
variables to minimize cacheline consumption during the phase of data
transfer. Specifically, we looked at the TCP/IP stack and the fast
path definition in TCP.
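
As a rough illustration of why field order matters, the sketch below
counts how many cache lines a set of "hot" fields touches in two toy
layouts. The struct and field names (Scattered, Grouped, hot_a, ...)
are hypothetical and are not taken from the kernel; a 64-byte cache
line is assumed.

```python
import ctypes

CACHELINE = 64  # assumed cache line size in bytes (typical for x86)

def lines_touched(struct_type, hot_fields):
    """Return the set of cache line indices covered by the named fields."""
    touched = set()
    for name in hot_fields:
        field = getattr(struct_type, name)  # ctypes field descriptor
        first = field.offset // CACHELINE
        last = (field.offset + field.size - 1) // CACHELINE
        touched.update(range(first, last + 1))
    return touched

class Scattered(ctypes.Structure):
    # Hypothetical layout: hot fields interleaved with cold ones.
    _fields_ = [
        ("hot_a", ctypes.c_uint32),
        ("cold_a", ctypes.c_uint8 * 100),
        ("hot_b", ctypes.c_uint32),
        ("cold_b", ctypes.c_uint8 * 100),
        ("hot_c", ctypes.c_uint32),
    ]

class Grouped(ctypes.Structure):
    # Same fields, with the hot ones moved to the front.
    _fields_ = [
        ("hot_a", ctypes.c_uint32),
        ("hot_b", ctypes.c_uint32),
        ("hot_c", ctypes.c_uint32),
        ("cold_a", ctypes.c_uint8 * 100),
        ("cold_b", ctypes.c_uint8 * 100),
    ]

HOT = ["hot_a", "hot_b", "hot_c"]
print(len(lines_touched(Scattered, HOT)))  # 3 cache lines
print(len(lines_touched(Grouped, HOT)))    # 1 cache line
```

Both layouts contain the same fields; only the order differs. The
kernel structs are far larger, but the counting is the same idea.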

For documentation purposes, we also added new files for each core data
structure we considered, although not all ended up being modified due
to the number of cache lines they already span in the fast path. In
the documentation, we recorded all variables we identified on the
fast path and the reasons. We also hope that in the future when
variables are added/modified, the document can be referred to and
updated accordingly to reflect the latest variable organization.

Tested:
Our tests were run with neper tcp_rr using TCP traffic. The tests use
$cpu threads and a variable number of flows (see below).

Tests were run on 6.5-rc1

Efficiency is computed as cpu seconds / throughput (one tcp_rr round trip).
The following results show the efficiency delta before and after the patch
series is applied; a negative percentage means less CPU time per round trip.

On AMD platforms with 100Gb/s NIC and 256MB L3 cache:
IPv4
Flows   with patches    clean kernel    Percent reduction
30k     0.0001736538065 0.0002741191042 -36.65%
20k     0.0001583661752 0.0002712559158 -41.62%
10k     0.0001639148817 0.0002951800751 -44.47%
5k      0.0001859683866 0.0003320642536 -44.00%
1k      0.0002035190546 0.0003152056382 -35.43%

IPv6
Flows   with patches    clean kernel    Percent reduction
30k     0.000202535503  0.0003275329163 -38.16%
20k     0.0002020654777 0.0003411304786 -40.77%
10k     0.0002122427035 0.0003803674705 -44.20%
5k      0.0002348776729 0.0004030403953 -41.72%
1k      0.0002237384583 0.0002813646157 -20.48%

On Intel platforms with 200Gb/s NIC and 105MB L3 cache:
IPv6
Flows   with patches    clean kernel    Percent reduction
30k     0.0006296537873 0.0006370427753 -1.16%
20k     0.0003451029365 0.0003628016076 -4.88%
10k     0.0003187646958 0.0003346835645 -4.76%
5k      0.0002954676348 0.000311807592  -5.24%
1k      0.0001909169342 0.0001848069709 3.31%
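
The last column of the tables above can be reproduced from the two
efficiency columns. The sketch below is illustrative only; the helper
name percent_change is ours, not part of neper or the patch series.

```python
def percent_change(with_patches, clean):
    """Relative change of the patched efficiency versus the clean kernel.

    Negative values mean fewer cpu seconds per tcp_rr round trip.
    """
    return (with_patches - clean) / clean * 100.0

# AMD, IPv4, 30k flows (values copied from the table above)
print(f"{percent_change(0.0001736538065, 0.0002741191042):.2f}%")  # -36.65%

# Intel, IPv6, 1k flows: the only row where the patched kernel is slower
print(f"{percent_change(0.0001909169342, 0.0001848069709):.2f}%")  # 3.31%
```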

Chao Wu (1):
  net-smnp: reorganize SNMP fast path variables

Coco Li (4):
  Documentations: Analyze heavily used Networking related structs
  netns-ipv4: reorganize netns_ipv4 fast path variables
  net-device: reorganize net_device fast path variables
  tcp: reorganize tcp_sock fast path variables

 .../net_cachelines/inet_connection_sock.rst   |  42 ++++
 .../networking/net_cachelines/inet_sock.rst   |  37 +++
 .../networking/net_cachelines/net_device.rst  | 167 ++++++++++++
 .../net_cachelines/netns_ipv4_sysctl.rst      | 151 +++++++++++
 .../networking/net_cachelines/snmp.rst        | 128 ++++++++++
 .../networking/net_cachelines/tcp_sock.rst    | 148 +++++++++++
 include/linux/netdevice.h                     |  99 ++++----
 include/linux/tcp.h                           | 238 +++++++++---------
 include/net/netns/ipv4.h                      |  41 +--
 include/uapi/linux/snmp.h                     |  34 ++-
 10 files changed, 896 insertions(+), 189 deletions(-)
 create mode 100644 Documentation/networking/net_cachelines/inet_connection_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/net_device.rst
 create mode 100644 Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
 create mode 100644 Documentation/networking/net_cachelines/snmp.rst
 create mode 100644 Documentation/networking/net_cachelines/tcp_sock.rst

-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
@ 2023-10-17  1:47 ` Coco Li
  2023-10-17 14:57   ` Jonathan Corbet
  2023-10-17  1:47 ` [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables Coco Li
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li

Analyzed a few structs in the networking stack by looking at variables
within them that are used in the TCP/IP fast path.

The fast path is defined as the TCP path where data is transferred from
sender to receiver unidirectionally. It doesn't include phases other than
TCP_ESTABLISHED, nor does it look at error paths.

We hope to reorganize structs that span many cachelines and whose fast
path variables are also spread out, and this document can help future
developers keep networking fast path cachelines small.

The Optimized_cacheline field is computed as
(Fastpath_Bytes/L3_cacheline_size_x86); it is not the actual result of
the reorganization (see the patches to come for those).
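
For illustration, the sketch below computes the smallest number of
cache lines that can hold a given number of fast path bytes. It assumes
a 64-byte x86 cache line and rounds up; the rounding rule is our
assumption, since the text does not state one, and the table below does
not necessarily follow it for every row. The helper name
min_cachelines is hypothetical.

```python
CACHELINE_BYTES = 64  # assumed x86 cache line size

def min_cachelines(nbytes, line=CACHELINE_BYTES):
    """Smallest number of cache lines that can hold nbytes of fields."""
    return -(-nbytes // line)  # ceiling division on integers

# Fastpath_Bytes values taken from the table in this message
for name, fastpath_bytes in [("net_device", 234),
                             ("netns_ipv4", 77),
                             ("linux_mib", 104)]:
    print(name, min_cachelines(fastpath_bytes))
# net_device 4
# netns_ipv4 2
# linux_mib 2
```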

Note that the optimization is not cache line size dependent; we use
x86 as an example of the improvements.

Investigation is done on 6.5

Name                  Struct_Cachelines  Cur_fastpath_cache  Fastpath_Bytes  Optimized_cacheline
tcp_sock              42 (2664 bytes)    12                  396             8
net_device            39 (2240 bytes)    12                  234             4
inet_sock             15 (960 bytes)     14                  922             14
inet_connection_sock  22 (1368 bytes)    18                  1166            18
netns_ipv4 (sysctls)  12 (768 bytes)     4                   77              2
linux_mib             16 (1060 bytes)    6                   104             2

Note how there isn't much room for improvement in inet_sock and
inet_connection_sock, because sk and icsk_inet respectively take up so
much of the struct that the rest of the variables become a small portion
of the struct size.

So, we decided to reorganize tcp_sock, net_device, netns_ipv4 and linux_mib.

Signed-off-by: Coco Li <lixiaoyan@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
---
 .../net_cachelines/inet_connection_sock.rst   |  42 +++++
 .../networking/net_cachelines/inet_sock.rst   |  37 ++++
 .../networking/net_cachelines/net_device.rst  | 167 ++++++++++++++++++
 .../net_cachelines/netns_ipv4_sysctl.rst      | 151 ++++++++++++++++
 .../networking/net_cachelines/snmp.rst        | 128 ++++++++++++++
 .../networking/net_cachelines/tcp_sock.rst    | 148 ++++++++++++++++
 6 files changed, 673 insertions(+)
 create mode 100644 Documentation/networking/net_cachelines/inet_connection_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/net_device.rst
 create mode 100644 Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
 create mode 100644 Documentation/networking/net_cachelines/snmp.rst
 create mode 100644 Documentation/networking/net_cachelines/tcp_sock.rst

diff --git a/Documentation/networking/net_cachelines/inet_connection_sock.rst b/Documentation/networking/net_cachelines/inet_connection_sock.rst
new file mode 100644
index 0000000000000..121c706393733
--- /dev/null
+++ b/Documentation/networking/net_cachelines/inet_connection_sock.rst
@@ -0,0 +1,42 @@
+..struct                            inet_connection_sock   fastpath_tx_access  fastpath_rx_access  comment
+struct_inet_sock                    icsk_inet              read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct_request_sock_queue           icsk_accept_queue      -                   -                   
+struct_inet_bind_bucket             icsk_bind_hash         read_mostly         -                   tcp_set_state
+struct_inet_bind2_bucket            icsk_bind2_hash        read_mostly         -                   tcp_set_state,inet_put_port
+unsigned_long                       icsk_timeout           read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+struct_timer_list                   icsk_retransmit_timer  read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+struct_timer_list                   icsk_delack_timer      read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+u32                                 icsk_rto               read_write          -                   tcp_cwnd_validate,tcp_schedule_loss_probe,tcp_connect_init,tcp_connect,tcp_write_xmit,tcp_push_one
+u32                                 icsk_rto_min           -                   -                   
+u32                                 icsk_delack_max        -                   -                   
+u32                                 icsk_pmtu_cookie       read_write          -                   tcp_sync_mss,tcp_current_mss,tcp_send_syn_data,tcp_connect_init,tcp_connect
+struct_tcp_congestion_ops           icsk_ca_ops            read_write          -                   tcp_cwnd_validate,tcp_tso_segs,tcp_ca_dst_init,tcp_connect_init,tcp_connect,tcp_write_xmit
+struct_inet_connection_sock_af_ops  icsk_af_ops            read_mostly         -                   tcp_finish_connect,tcp_send_syn_data,tcp_mtup_init,tcp_mtu_check_reprobe,tcp_mtu_probe,tcp_connect_init,tcp_connect,__tcp_transmit_skb
+struct_tcp_ulp_ops*                 icsk_ulp_ops           -                   -                   
+void*                               icsk_ulp_data          -                   -                   
+u8:5                                icsk_ca_state          read_write          -                   tcp_cwnd_application_limited,tcp_set_ca_state,tcp_enter_cwr,tcp_tso_should_defer,tcp_mtu_probe,tcp_schedule_loss_probe,tcp_write_xmit,__tcp_transmit_skb
+u8:1                                icsk_ca_initialized    read_write          -                   tcp_init_transfer,tcp_init_congestion_control,tcp_init_transfer,tcp_finish_connect,tcp_connect
+u8:1                                icsk_ca_setsockopt     -                   -                   
+u8:1                                icsk_ca_dst_locked     write_mostly        -                   tcp_ca_dst_init,tcp_connect_init,tcp_connect
+u8                                  icsk_retransmits       write_mostly        -                   tcp_connect_init,tcp_connect
+u8                                  icsk_pending           read_write          -                   inet_csk_reset_xmit_timer,tcp_connect,tcp_check_probe_timer,__tcp_push_pending_frames,tcp_rearm_rto,tcp_event_new_data_sent,tcp_event_new_data_sent
+u8                                  icsk_backoff           write_mostly        -                   tcp_write_queue_purge,tcp_connect_init
+u8                                  icsk_syn_retries       -                   -                   
+u8                                  icsk_probes_out        -                   -                   
+u16                                 icsk_ext_hdr_len       read_mostly         -                   __tcp_mtu_to_mss,tcp_mtu_to_rss,tcp_mtu_probe,tcp_write_xmit,tcp_mtu_to_mss,
+struct_icsk_ack_u8                  pending                read_write          read_write          inet_csk_ack_scheduled,__tcp_cleanup_rbuf,tcp_cleanup_rbuf,inet_csk_clear_xmit_timer,tcp_event_ack_sent,inet_csk_reset_xmit_timer
+struct_icsk_ack_u8                  quick                  read_write          write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_select_window,__tcp_cleanup_rbuf
+struct_icsk_ack_u8                  pingpong               -                   -                   
+struct_icsk_ack_u8                  retry                  write_mostly        read_write          inet_csk_clear_xmit_timer,tcp_rearm_rto,tcp_event_new_data_sent,tcp_write_xmit,__tcp_send_ack,tcp_send_ack,
+struct_icsk_ack_u8                  ato                    read_mostly         write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_send_ack,tcp_send_ack
+struct_icsk_ack_unsigned_long       timeout                read_write          read_write          inet_csk_reset_xmit_timer,tcp_connect
+struct_icsk_ack_u32                 lrcvtime               read_write          -                   tcp_finish_connect,tcp_connect,tcp_event_data_sent,__tcp_transmit_skb
+struct_icsk_ack_u16                 rcv_mss                write_mostly        read_mostly         __tcp_select_window,__tcp_cleanup_rbuf,tcp_initialize_rcv_mss,tcp_connect_init
+struct_icsk_mtup_int                search_high            read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_check_reprobe,tcp_write_xmit
+struct_icsk_mtup_int                search_low             read_write          -                   tcp_mtu_probe,tcp_mtu_check_reprobe,tcp_write_xmit,tcp_sync_mss,tcp_connect_init,tcp_mtup_init
+struct_icsk_mtup_u32:31             probe_size             read_write          -                   tcp_mtup_init,tcp_connect_init,__tcp_transmit_skb
+struct_icsk_mtup_u32:1              enabled                read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_probe,tcp_write_xmit
+struct_icsk_mtup_u32                probe_timestamp        read_write          -                   tcp_mtup_init,tcp_connect_init,tcp_mtu_check_reprobe,tcp_mtu_probe
+u32                                 icsk_probes_tstamp     -                   -                   
+u32                                 icsk_user_timeout      -                   -                   
+u64[104/sizeof(u64)]                icsk_ca_priv           -                   -                   
diff --git a/Documentation/networking/net_cachelines/inet_sock.rst b/Documentation/networking/net_cachelines/inet_sock.rst
new file mode 100644
index 0000000000000..7d5d645c1d3ae
--- /dev/null
+++ b/Documentation/networking/net_cachelines/inet_sock.rst
@@ -0,0 +1,37 @@
+Type                    Name                  fastpath_tx_access  fastpath_rx_access  comment
+..struct                inet_sock                                                     
+struct_sock             sk                    read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct_ipv6_pinfo*      pinet6                -                   -                   
+be16                    inet_sport            read_mostly         -                   __tcp_transmit_skb
+be32                    inet_daddr            read_mostly         -                   ip_select_ident_segs
+be32                    inet_rcv_saddr        -                   -                   
+be16                    inet_dport            read_mostly         -                   __tcp_transmit_skb
+u16                     inet_num              -                   -                   
+be32                    inet_saddr            -                   -                   
+s16                     uc_ttl                read_mostly         -                   __ip_queue_xmit/ip_select_ttl
+u16                     cmsg_flags            -                   -                   
+struct_ip_options_rcu*  inet_opt              read_mostly         -                   __ip_queue_xmit
+u16                     inet_id               read_mostly         -                   ip_select_ident_segs
+u8                      tos                   read_mostly         -                   ip_queue_xmit
+u8                      min_ttl               -                   -                   
+u8                      mc_ttl                -                   -                   
+u8                      pmtudisc              -                   -                   
+u8:1                    recverr               -                   -                   
+u8:1                    is_icsk               -                   -                   
+u8:1                    freebind              -                   -                   
+u8:1                    hdrincl               -                   -                   
+u8:1                    mc_loop               -                   -                   
+u8:1                    transparent           -                   -                   
+u8:1                    mc_all                -                   -                   
+u8:1                    nodefrag              -                   -                   
+u8:1                    bind_address_no_port  -                   -                   
+u8:1                    recverr_rfc4884       -                   -                   
+u8:1                    defer_connect         read_mostly         -                   tcp_sendmsg_fastopen
+u8                      rcv_tos               -                   -                   
+u8                      convert_csum          -                   -                   
+int                     uc_index              -                   -                   
+int                     mc_index              -                   -                   
+be32                    mc_addr               -                   -                   
+struct_ip_mc_socklist*  mc_list               -                   -                   
+struct_inet_cork_full   cork                  read_mostly         -                   __tcp_transmit_skb
+struct                  local_port_range      -                   -                   
diff --git a/Documentation/networking/net_cachelines/net_device.rst b/Documentation/networking/net_cachelines/net_device.rst
new file mode 100644
index 0000000000000..b6c85300be914
--- /dev/null
+++ b/Documentation/networking/net_cachelines/net_device.rst
@@ -0,0 +1,167 @@
+Type                                Name                    fastpath_tx_access  fastpath_rx_access  Comments
+..struct                            ..net_device                                                    
+char                                name[16]                -                   -                   
+struct_netdev_name_node*            name_node                                                       
+struct_dev_ifalias*                 ifalias                                                         
+unsigned_long                       mem_end                                                         
+unsigned_long                       mem_start                                                       
+unsigned_long                       base_addr                                                       
+unsigned_long                       state                                                           
+struct_list_head                    dev_list                                                        
+struct_list_head                    napi_list                                                       
+struct_list_head                    unreg_list                                                      
+struct_list_head                    close_list                                                      
+struct_list_head                    ptype_all               read_mostly         -                   dev_nit_active(tx)
+struct_list_head                    ptype_specific                              read_mostly         deliver_ptype_list_skb/__netif_receive_skb_core(rx)
+struct                              adj_list                                                        
+unsigned_int                        flags                   read_mostly         read_mostly         __dev_queue_xmit,__dev_xmit_skb,ip6_output,__ip6_finish_output(tx);ip6_rcv_core(rx)
+xdp_features_t                      xdp_features                                                    
+unsigned_long_long                  priv_flags              read_mostly         -                   __dev_queue_xmit(tx)
+struct_net_device_ops*              netdev_ops              read_mostly         -                   netdev_core_pick_tx,netdev_start_xmit(tx)
+struct_xdp_metadata_ops*            xdp_metadata_ops                                                
+int                                 ifindex                 -                   read_mostly         ip6_rcv_core
+unsigned_short                      gflags                                                          
+unsigned_short                      hard_header_len         read_mostly         read_mostly         ip6_xmit(tx);gro_list_prepare(rx)
+unsigned_int                        mtu                     read_mostly         -                   ip_finish_output2
+unsigned_short                      needed_headroom         read_mostly         -                   LL_RESERVED_SPACE/ip_finish_output2
+unsigned_short                      needed_tailroom                                                 
+netdev_features_t                   features                read_mostly         read_mostly         HARD_TX_LOCK,netif_skb_features,sk_setup_caps(tx);netif_elide_gro(rx)
+netdev_features_t                   hw_features                                                     
+netdev_features_t                   wanted_features                                                 
+netdev_features_t                   vlan_features                                                   
+netdev_features_t                   hw_enc_features         -                   -                   netif_skb_features
+netdev_features_t                   mpls_features                                                   
+netdev_features_t                   gso_partial_features                                            
+unsigned_int                        min_mtu                                                         
+unsigned_int                        max_mtu                                                         
+unsigned_short                      type                                                            
+unsigned_char                       min_header_len                                                  
+unsigned_char                       name_assign_type                                                
+int                                 group                                                           
+struct_net_device_stats             stats                                                           
+struct_net_device_core_stats*       core_stats                                                      
+atomic_t                            carrier_up_count                                                
+atomic_t                            carrier_down_count                                              
+struct_iw_handler_def*              wireless_handlers                                               
+struct_iw_public_data*              wireless_data                                                   
+struct_ethtool_ops*                 ethtool_ops                                                     
+struct_l3mdev_ops*                  l3mdev_ops                                                      
+struct_ndisc_ops*                   ndisc_ops                                                       
+struct_xfrmdev_ops*                 xfrmdev_ops                                                     
+struct_tlsdev_ops*                  tlsdev_ops                                                      
+struct_header_ops*                  header_ops              read_mostly         -                   ip_finish_output2,ip6_finish_output2(tx)
+unsigned_char                       operstate                                                       
+unsigned_char                       link_mode                                                       
+unsigned_char                       if_port                                                         
+unsigned_char                       dma                                                             
+unsigned_char                       perm_addr[32]                                                   
+unsigned_char                       addr_assign_type                                                
+unsigned_char                       addr_len                                                        
+unsigned_char                       upper_level                                                     
+unsigned_char                       lower_level                                                     
+unsigned_short                      neigh_priv_len                                                  
+unsigned_short                      padded                                                          
+spinlock_t                          addr_list_lock                                                  
+int                                 irq                                                             
+struct_netdev_hw_addr_list          uc                                                              
+struct_netdev_hw_addr_list          mc                                                              
+struct_netdev_hw_addr_list          dev_addrs                                                       
+struct_kset*                        queues_kset                                                     
+struct_list_head                    unlink_list                                                     
+unsigned_int                        promiscuity                                                     
+unsigned_int                        allmulti                                                        
+bool                                uc_promisc                                                      
+unsigned_char                       nested_level                                                    
+struct_in_device*                   ip_ptr                  read_mostly         read_mostly         __in_dev_get
+struct_inet6_dev*                   ip6_ptr                 read_mostly         read_mostly         __in6_dev_get
+struct_vlan_info*                   vlan_info                                                       
+struct_dsa_port*                    dsa_ptr                                                         
+struct_tipc_bearer*                 tipc_ptr                                                        
+void*                               atalk_ptr                                                       
+void*                               ax25_ptr                                                        
+struct_wireless_dev*                ieee80211_ptr                                                   
+struct_wpan_dev*                    ieee802154_ptr                                                  
+struct_mpls_dev*                    mpls_ptr                                                        
+struct_mctp_dev*                    mctp_ptr                                                        
+unsigned_char*                      dev_addr                                                        
+struct_netdev_queue*                _rx                     read_mostly         -                   netdev_get_rx_queue(rx)
+unsigned_int                        num_rx_queues                                                   
+unsigned_int                        real_num_rx_queues      -                   read_mostly         get_rps_cpu
+struct_bpf_prog*                    xdp_prog                                                        
+unsigned_long                       gro_flush_timeout       -                   read_mostly         napi_complete_done
+int                                 napi_defer_hard_irqs    -                   read_mostly         napi_complete_done
+unsigned_int                        gro_max_size            -                   read_mostly         skb_gro_receive
+unsigned_int                        gro_ipv4_max_size       -                   read_mostly         skb_gro_receive
+rx_handler_func_t*                  rx_handler              read_mostly         -                   __netif_receive_skb_core
+void*                               rx_handler_data         read_mostly         -                   
+struct_mini_Qdisc*                  miniq_ingress                                                   
+struct_netdev_queue*                ingress_queue           read_mostly         -                   
+struct_nf_hook_entries*             nf_hooks_ingress                                                
+unsigned_char                       broadcast[32]                                                   
+struct_cpu_rmap*                    rx_cpu_rmap                                                     
+struct_hlist_node                   index_hlist                                                     
+struct_netdev_queue*                _tx                     read_mostly         -                   netdev_get_tx_queue(tx)
+unsigned_int                        num_tx_queues           -                   -                   
+unsigned_int                        real_num_tx_queues      read_mostly         -                   skb_tx_hash,netdev_core_pick_tx(tx)
+unsigned_int                        tx_queue_len                                                    
+spinlock_t                          tx_global_lock                                                  
+struct_xdp_dev_bulk_queue__percpu*  xdp_bulkq                                                       
+struct_xps_dev_maps*                xps_maps[2]             read_mostly         -                   __netif_set_xps_queue
+struct_mini_Qdisc*                  miniq_egress                                                    
+struct_nf_hook_entries*             nf_hooks_egress         read_mostly         -                   
+struct_hlist_head                   qdisc_hash[16]                                                  
+struct_timer_list                   watchdog_timer                                                  
+int                                 watchdog_timeo                                                  
+u32                                 proto_down_reason                                               
+struct_list_head                    todo_list                                                       
+int__percpu*                        pcpu_refcnt                                                     
+refcount_t                          dev_refcnt                                                      
+struct_ref_tracker_dir              refcnt_tracker                                                  
+struct_list_head                    link_watch_list                                                 
+enum:8                              reg_state                                                       
+bool                                dismantle                                                       
+enum:16                             rtnl_link_state                                                 
+bool                                needs_free_netdev                                               
+void*priv_destructor                struct_net_device                                               
+struct_netpoll_info*                npinfo                  -                   read_mostly         napi_poll/napi_poll_lock
+possible_net_t                      nd_net                  -                   read_mostly         (dev_net)napi_busy_loop,tcp_v(4/6)_rcv,ip(v6)_rcv,ip(6)_input,ip(6)_input_finish
+void*                               ml_priv                                                         
+enum_netdev_ml_priv_type            ml_priv_type                                                    
+struct_pcpu_lstats__percpu*         lstats                                                          
+struct_pcpu_sw_netstats__percpu*    tstats                                                          
+struct_pcpu_dstats__percpu*         dstats                                                          
+struct_garp_port*                   garp_port                                                       
+struct_mrp_port*                    mrp_port                                                        
+struct_dm_hw_stat_delta*            dm_private                                                      
+struct_device                       dev                     -                   -                   
+struct_attribute_group*             sysfs_groups[4]                                                 
+struct_attribute_group*             sysfs_rx_queue_group                                            
+struct_rtnl_link_ops*               rtnl_link_ops                                                   
+unsigned_int                        gso_max_size            read_mostly         -                   sk_dst_gso_max_size
+unsigned_int                        tso_max_size                                                    
+u16                                 gso_max_segs            read_mostly         -                   gso_max_segs
+u16                                 tso_max_segs                                                    
+unsigned_int                        gso_ipv4_max_size       read_mostly         -                   sk_dst_gso_max_size
+struct_dcbnl_rtnl_ops*              dcbnl_ops                                                       
+s16                                 num_tc                  read_mostly         -                   skb_tx_hash
+struct_netdev_tc_txq                tc_to_txq[16]           read_mostly         -                   skb_tx_hash
+u8                                  prio_tc_map[16]                                                 
+unsigned_int                        fcoe_ddp_xid                                                    
+struct_netprio_map*                 priomap                                                         
+struct_phy_device*                  phydev                                                          
+struct_sfp_bus*                     sfp_bus                                                         
+struct_lock_class_key*              qdisc_tx_busylock                                               
+bool                                proto_down                                                      
+unsigned:1                          wol_enabled                                                     
+unsigned:1                          threaded                -                   -                   napi_poll(napi_enable,dev_set_threaded)
+struct_list_head                    net_notifier_list                                               
+struct_macsec_ops*                  macsec_ops                                                      
+struct_udp_tunnel_nic_info*         udp_tunnel_nic_info                                             
+struct_udp_tunnel_nic*              udp_tunnel_nic                                                  
+struct_bpf_xdp_entity               xdp_state[3]                                                    
+u8                                  dev_addr_shadow[32]                                             
+netdevice_tracker                   linkwatch_dev_tracker                                           
+netdevice_tracker                   watchdog_dev_tracker                                            
+netdevice_tracker                   dev_registered_tracker                                          
+struct_rtnl_hw_stats64*             offload_xstats_l3                                               
+struct_devlink_port*                devlink_port                                                    
diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
new file mode 100644
index 0000000000000..b51619422ce99
--- /dev/null
+++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
@@ -0,0 +1,151 @@
+Type                            Name                                         fastpath_tx_access  fastpath_rx_access  comment
+..struct                        ..netns_ipv4                                                                         
+struct_inet_timewait_death_row  tcp_death_row                                                                        
+struct_udp_table*               udp_table                                                                            
+struct_ctl_table_header*        forw_hdr                                                                             
+struct_ctl_table_header*        frags_hdr                                                                            
+struct_ctl_table_header*        ipv4_hdr                                                                             
+struct_ctl_table_header*        route_hdr                                                                            
+struct_ctl_table_header*        xfrm4_hdr                                                                            
+struct_ipv4_devconf*            devconf_all                                                                          
+struct_ipv4_devconf*            devconf_dflt                                                                         
+struct_ip_ra_chain              ra_chain                                                                             
+struct_mutex                    ra_mutex                                                                             
+struct_fib_rules_ops*           rules_ops                                                                            
+struct_fib_table                fib_main                                                                             
+struct_fib_table                fib_default                                                                          
+unsigned_int                    fib_rules_require_fldissect                                                          
+bool                            fib_has_custom_rules                                                                 
+bool                            fib_has_custom_local_routes                                                          
+bool                            fib_offload_disabled                                                                 
+atomic_t                        fib_num_tclassid_users                                                               
+struct_hlist_head*              fib_table_hash                                                                       
+struct_sock*                    fibnl                                                                                
+struct_sock*                    mc_autojoin_sk                                                                       
+struct_inet_peer_base*          peers                                                                                
+struct_fqdir*                   fqdir                                                                                
+u8                              sysctl_icmp_echo_ignore_all                                                          
+u8                              sysctl_icmp_echo_enable_probe                                                        
+u8                              sysctl_icmp_echo_ignore_broadcasts                                                   
+u8                              sysctl_icmp_ignore_bogus_error_responses                                             
+u8                              sysctl_icmp_errors_use_inbound_ifaddr                                                
+int                             sysctl_icmp_ratelimit                                                                
+int                             sysctl_icmp_ratemask                                                                 
+u32                             ip_rt_min_pmtu                               -                   -                   
+int                             ip_rt_mtu_expires                            -                   -                   
+int                             ip_rt_min_advmss                             -                   -                   
+struct_local_ports              ip_local_ports                               -                   -                   
+u8                              sysctl_tcp_ecn                               -                   -                   
+u8                              sysctl_tcp_ecn_fallback                      -                   -                   
+u8                              sysctl_ip_default_ttl                        -                   -                   ip4_dst_hoplimit/ip_select_ttl
+u8                              sysctl_ip_no_pmtu_disc                       -                   -                   
+u8                              sysctl_ip_fwd_use_pmtu                       read_mostly         -                   ip_dst_mtu_maybe_forward/ip_skb_dst_mtu
+u8                              sysctl_ip_fwd_update_priority                -                   -                   ip_forward
+u8                              sysctl_ip_nonlocal_bind                      -                   -                   
+u8                              sysctl_ip_autobind_reuse                     -                   -                   
+u8                              sysctl_ip_dynaddr                            -                   -                   
+u8                              sysctl_ip_early_demux                        -                   read_mostly         ip(6)_rcv_finish_core
+u8                              sysctl_raw_l3mdev_accept                     -                   -                   
+u8                              sysctl_tcp_early_demux                       -                   read_mostly         ip(6)_rcv_finish_core
+u8                              sysctl_udp_early_demux                                                               
+u8                              sysctl_nexthop_compat_mode                   -                   -                   
+u8                              sysctl_fwmark_reflect                        -                   -                   
+u8                              sysctl_tcp_fwmark_accept                     -                   -                   
+u8                              sysctl_tcp_l3mdev_accept                     -                   -                   
+u8                              sysctl_tcp_mtu_probing                       -                   -                   
+int                             sysctl_tcp_mtu_probe_floor                   -                   -                   
+int                             sysctl_tcp_base_mss                          -                   -                   
+int                             sysctl_tcp_min_snd_mss                       read_mostly         -                   __tcp_mtu_to_mss(tcp_write_xmit)
+int                             sysctl_tcp_probe_threshold                   -                   -                   tcp_mtu_probe(tcp_write_xmit)
+u32                             sysctl_tcp_probe_interval                    -                   -                   tcp_mtu_check_reprobe(tcp_write_xmit)
+int                             sysctl_tcp_keepalive_time                    -                   -                   
+int                             sysctl_tcp_keepalive_intvl                   -                   -                   
+u8                              sysctl_tcp_keepalive_probes                  -                   -                   
+u8                              sysctl_tcp_syn_retries                       -                   -                   
+u8                              sysctl_tcp_synack_retries                    -                   -                   
+u8                              sysctl_tcp_syncookies                        -                   -                   generated_on_syn
+u8                              sysctl_tcp_migrate_req                       -                   -                   reuseport
+u8                              sysctl_tcp_comp_sack_nr                      -                   -                   __tcp_ack_snd_check
+int                             sysctl_tcp_reordering                        -                   read_mostly         tcp_may_raise_cwnd/tcp_cong_control
+u8                              sysctl_tcp_retries1                          -                   -                   
+u8                              sysctl_tcp_retries2                          -                   -                   
+u8                              sysctl_tcp_orphan_retries                    -                   -                   
+u8                              sysctl_tcp_tw_reuse                          -                   -                   timewait_sock_ops
+int                             sysctl_tcp_fin_timeout                       -                   -                   TCP_LAST_ACK/tcp_rcv_state_process
+unsigned_int                    sysctl_tcp_notsent_lowat                     read_mostly         -                   tcp_notsent_lowat/tcp_stream_memory_free
+u8                              sysctl_tcp_sack                              -                   -                   tcp_syn_options
+u8                              sysctl_tcp_window_scaling                    -                   -                   tcp_syn_options,tcp_parse_options
+u8                              sysctl_tcp_timestamps                                                                
+u8                              sysctl_tcp_early_retrans                     read_mostly         -                   tcp_schedule_loss_probe(tcp_write_xmit)
+u8                              sysctl_tcp_recovery                          -                   -                   tcp_fastretrans_alert
+u8                              sysctl_tcp_thin_linear_timeouts              -                   -                   tcp_retrans_timer(on_thin_streams)
+u8                              sysctl_tcp_slow_start_after_idle             -                   -                   unlikely(tcp_cwnd_validate-network-not-starved)
+u8                              sysctl_tcp_retrans_collapse                  -                   -                   
+u8                              sysctl_tcp_stdurg                            -                   -                   unlikely(tcp_check_urg)
+u8                              sysctl_tcp_rfc1337                           -                   -                   
+u8                              sysctl_tcp_abort_on_overflow                 -                   -                   
+u8                              sysctl_tcp_fack                              -                   -                   
+int                             sysctl_tcp_max_reordering                    -                   -                   tcp_check_sack_reordering
+int                             sysctl_tcp_adv_win_scale                     -                   -                   tcp_init_buffer_space
+u8                              sysctl_tcp_dsack                             -                   -                   partial_packet_or_retrans_in_tcp_data_queue
+u8                              sysctl_tcp_app_win                           -                   -                   tcp_win_from_space
+u8                              sysctl_tcp_frto                              -                   -                   tcp_enter_loss
+u8                              sysctl_tcp_nometrics_save                    -                   -                   TCP_LAST_ACK/tcp_update_metrics
+u8                              sysctl_tcp_no_ssthresh_metrics_save          -                   -                   TCP_LAST_ACK/tcp_(update/init)_metrics
+u8                              sysctl_tcp_moderate_rcvbuf                   read_mostly         read_mostly         tcp_tso_should_defer(tx);tcp_rcv_space_adjust(rx)
+u8                              sysctl_tcp_tso_win_divisor                   read_mostly         -                   tcp_tso_should_defer(tcp_write_xmit)
+u8                              sysctl_tcp_workaround_signed_windows         -                   -                   tcp_select_window
+int                             sysctl_tcp_limit_output_bytes                read_mostly         -                   tcp_small_queue_check(tcp_write_xmit)
+int                             sysctl_tcp_challenge_ack_limit               -                   -                   
+int                             sysctl_tcp_min_rtt_wlen                      read_mostly         -                   tcp_ack_update_rtt
+u8                              sysctl_tcp_min_tso_segs                      -                   -                   unlikely(icsk_ca_ops-written)
+u8                              sysctl_tcp_tso_rtt_log                       read_mostly         -                   tcp_tso_autosize
+u8                              sysctl_tcp_autocorking                       read_mostly         -                   tcp_push/tcp_should_autocork
+u8                              sysctl_tcp_reflect_tos                       -                   -                   tcp_v(4/6)_send_synack
+int                             sysctl_tcp_invalid_ratelimit                 -                   -                   
+int                             sysctl_tcp_pacing_ss_ratio                   -                   -                   default_cong_cont(tcp_update_pacing_rate)
+int                             sysctl_tcp_pacing_ca_ratio                   -                   -                   default_cong_cont(tcp_update_pacing_rate)
+int                             sysctl_tcp_wmem[3]                           read_mostly         -                   tcp_wmem_schedule(sendmsg/sendpage)
+int                             sysctl_tcp_rmem[3]                           -                   read_mostly         __tcp_grow_window(tx),tcp_rcv_space_adjust(rx)
+unsigned_int                    sysctl_tcp_child_ehash_entries                                                       
+unsigned_long                   sysctl_tcp_comp_sack_delay_ns                -                   -                   __tcp_ack_snd_check
+unsigned_long                   sysctl_tcp_comp_sack_slack_ns                -                   -                   __tcp_ack_snd_check
+int                             sysctl_max_syn_backlog                       -                   -                   
+int                             sysctl_tcp_fastopen                          -                   -                   
+struct_tcp_congestion_ops       tcp_congestion_control                       -                   -                   init_cc
+struct_tcp_fastopen_context     tcp_fastopen_ctx                             -                   -                   
+unsigned_int                    sysctl_tcp_fastopen_blackhole_timeout        -                   -                   
+atomic_t                        tfo_active_disable_times                     -                   -                   
+unsigned_long                   tfo_active_disable_stamp                     -                   -                   
+u32                             tcp_challenge_timestamp                      -                   -                   
+u32                             tcp_challenge_count                          -                   -                   
+u8                              sysctl_tcp_plb_enabled                       -                   -                   
+u8                              sysctl_tcp_plb_idle_rehash_rounds            -                   -                   
+u8                              sysctl_tcp_plb_rehash_rounds                 -                   -                   
+u8                              sysctl_tcp_plb_suspend_rto_sec               -                   -                   
+int                             sysctl_tcp_plb_cong_thresh                   -                   -                   
+int                             sysctl_udp_wmem_min                                                                  
+int                             sysctl_udp_rmem_min                                                                  
+u8                              sysctl_fib_notify_on_flag_change                                                     
+u8                              sysctl_udp_l3mdev_accept                                                             
+u8                              sysctl_igmp_llm_reports                                                              
+int                             sysctl_igmp_max_memberships                                                          
+int                             sysctl_igmp_max_msf                                                                  
+int                             sysctl_igmp_qrv                                                                      
+struct_ping_group_range         ping_group_range                                                                     
+atomic_t                        dev_addr_genid                                                                       
+unsigned_int                    sysctl_udp_child_hash_entries                                                        
+unsigned_long*                  sysctl_local_reserved_ports                                                          
+int                             sysctl_ip_prot_sock                                                                  
+struct_mr_table*                mrt                                                                                  
+struct_list_head                mr_tables                                                                            
+struct_fib_rules_ops*           mr_rules_ops                                                                         
+u32                             sysctl_fib_multipath_hash_fields                                                     
+u8                              sysctl_fib_multipath_use_neigh                                                       
+u8                              sysctl_fib_multipath_hash_policy                                                     
+struct_fib_notifier_ops*        notifier_ops                                                                         
+unsigned_int                    fib_seq                                                                              
+struct_fib_notifier_ops*        ipmr_notifier_ops                                                                    
+unsigned_int                    ipmr_seq                                                                             
+atomic_t                        rt_genid                                                                             
+siphash_key_t                   ip_id_key                                                                                      
diff --git a/Documentation/networking/net_cachelines/snmp.rst b/Documentation/networking/net_cachelines/snmp.rst
new file mode 100644
index 0000000000000..1918a91d45e55
--- /dev/null
+++ b/Documentation/networking/net_cachelines/snmp.rst
@@ -0,0 +1,128 @@
+Type           Name                                  fastpath_tx_access  fastpath_rx_access  comment
+..Timer                                                                                      
+unsigned_long  LINUX_MIB_TCPKEEPALIVE                write_mostly        -                   tcp_keepalive_timer
+unsigned_long  LINUX_MIB_DELAYEDACKS                 write_mostly        -                   tcp_delack_timer_handler,tcp_delack_timer
+unsigned_long  LINUX_MIB_DELAYEDACKLOCKED            write_mostly        -                   tcp_delack_timer_handler,tcp_delack_timer
+unsigned_long  LINUX_MIB_TCPAUTOCORKING              write_mostly        -                   tcp_push,tcp_sendmsg_locked
+unsigned_long  LINUX_MIB_TCPFROMZEROWINDOWADV        write_mostly        -                   tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPTOZEROWINDOWADV          write_mostly        -                   tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPWANTZEROWINDOWADV        write_mostly        -                   tcp_select_window,tcp_transmit_skb
+unsigned_long  LINUX_MIB_TCPORIGDATASENT             write_mostly        -                   tcp_write_xmit
+unsigned_long  LINUX_MIB_TCPHPHITS                   -                   write_mostly        tcp_rcv_established,tcp_v4_do_rcv,tcp_v6_do_rcv
+unsigned_long  LINUX_MIB_TCPRCVCOALESCE              -                   write_mostly        tcp_try_coalesce,tcp_queue_rcv,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPPUREACKS                 -                   write_mostly        tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPHPACKS                   -                   write_mostly        tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_TCPDELIVERED                -                   write_mostly        tcp_newly_delivered,tcp_ack,tcp_rcv_established
+unsigned_long  LINUX_MIB_SYNCOOKIESSENT                                                      
+unsigned_long  LINUX_MIB_SYNCOOKIESRECV                                                      
+unsigned_long  LINUX_MIB_SYNCOOKIESFAILED                                                    
+unsigned_long  LINUX_MIB_EMBRYONICRSTS                                                       
+unsigned_long  LINUX_MIB_PRUNECALLED                                                         
+unsigned_long  LINUX_MIB_RCVPRUNED                                                           
+unsigned_long  LINUX_MIB_OFOPRUNED                                                           
+unsigned_long  LINUX_MIB_OUTOFWINDOWICMPS                                                    
+unsigned_long  LINUX_MIB_LOCKDROPPEDICMPS                                                    
+unsigned_long  LINUX_MIB_ARPFILTER                                                           
+unsigned_long  LINUX_MIB_TIMEWAITED                                                          
+unsigned_long  LINUX_MIB_TIMEWAITRECYCLED                                                    
+unsigned_long  LINUX_MIB_TIMEWAITKILLED                                                      
+unsigned_long  LINUX_MIB_PAWSACTIVEREJECTED                                                  
+unsigned_long  LINUX_MIB_PAWSESTABREJECTED                                                   
+unsigned_long  LINUX_MIB_DELAYEDACKLOST                                                      
+unsigned_long  LINUX_MIB_LISTENOVERFLOWS                                                     
+unsigned_long  LINUX_MIB_LISTENDROPS                                                         
+unsigned_long  LINUX_MIB_TCPRENORECOVERY                                                     
+unsigned_long  LINUX_MIB_TCPSACKRECOVERY                                                     
+unsigned_long  LINUX_MIB_TCPSACKRENEGING                                                     
+unsigned_long  LINUX_MIB_TCPSACKREORDER                                                      
+unsigned_long  LINUX_MIB_TCPRENOREORDER                                                      
+unsigned_long  LINUX_MIB_TCPTSREORDER                                                        
+unsigned_long  LINUX_MIB_TCPFULLUNDO                                                         
+unsigned_long  LINUX_MIB_TCPPARTIALUNDO                                                      
+unsigned_long  LINUX_MIB_TCPDSACKUNDO                                                        
+unsigned_long  LINUX_MIB_TCPLOSSUNDO                                                         
+unsigned_long  LINUX_MIB_TCPLOSTRETRANSMIT                                                   
+unsigned_long  LINUX_MIB_TCPRENOFAILURES                                                     
+unsigned_long  LINUX_MIB_TCPSACKFAILURES                                                     
+unsigned_long  LINUX_MIB_TCPLOSSFAILURES                                                     
+unsigned_long  LINUX_MIB_TCPFASTRETRANS                                                      
+unsigned_long  LINUX_MIB_TCPSLOWSTARTRETRANS                                                 
+unsigned_long  LINUX_MIB_TCPTIMEOUTS                                                         
+unsigned_long  LINUX_MIB_TCPLOSSPROBES                                                       
+unsigned_long  LINUX_MIB_TCPLOSSPROBERECOVERY                                                
+unsigned_long  LINUX_MIB_TCPRENORECOVERYFAIL                                                 
+unsigned_long  LINUX_MIB_TCPSACKRECOVERYFAIL                                                 
+unsigned_long  LINUX_MIB_TCPRCVCOLLAPSED                                                     
+unsigned_long  LINUX_MIB_TCPDSACKOLDSENT                                                     
+unsigned_long  LINUX_MIB_TCPDSACKOFOSENT                                                     
+unsigned_long  LINUX_MIB_TCPDSACKRECV                                                        
+unsigned_long  LINUX_MIB_TCPDSACKOFORECV                                                     
+unsigned_long  LINUX_MIB_TCPABORTONDATA                                                      
+unsigned_long  LINUX_MIB_TCPABORTONCLOSE                                                     
+unsigned_long  LINUX_MIB_TCPABORTONMEMORY                                                    
+unsigned_long  LINUX_MIB_TCPABORTONTIMEOUT                                                   
+unsigned_long  LINUX_MIB_TCPABORTONLINGER                                                    
+unsigned_long  LINUX_MIB_TCPABORTFAILED                                                      
+unsigned_long  LINUX_MIB_TCPMEMORYPRESSURES                                                  
+unsigned_long  LINUX_MIB_TCPMEMORYPRESSURESCHRONO                                            
+unsigned_long  LINUX_MIB_TCPSACKDISCARD                                                      
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDOLD                                                  
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDNOUNDO                                               
+unsigned_long  LINUX_MIB_TCPSPURIOUSRTOS                                                     
+unsigned_long  LINUX_MIB_TCPMD5NOTFOUND                                                      
+unsigned_long  LINUX_MIB_TCPMD5UNEXPECTED                                                    
+unsigned_long  LINUX_MIB_TCPMD5FAILURE                                                       
+unsigned_long  LINUX_MIB_SACKSHIFTED                                                         
+unsigned_long  LINUX_MIB_SACKMERGED                                                          
+unsigned_long  LINUX_MIB_SACKSHIFTFALLBACK                                                   
+unsigned_long  LINUX_MIB_TCPBACKLOGDROP                                                      
+unsigned_long  LINUX_MIB_PFMEMALLOCDROP                                                      
+unsigned_long  LINUX_MIB_TCPMINTTLDROP                                                       
+unsigned_long  LINUX_MIB_TCPDEFERACCEPTDROP                                                  
+unsigned_long  LINUX_MIB_IPRPFILTER                                                          
+unsigned_long  LINUX_MIB_TCPTIMEWAITOVERFLOW                                                 
+unsigned_long  LINUX_MIB_TCPREQQFULLDOCOOKIES                                                
+unsigned_long  LINUX_MIB_TCPREQQFULLDROP                                                     
+unsigned_long  LINUX_MIB_TCPRETRANSFAIL                                                      
+unsigned_long  LINUX_MIB_TCPBACKLOGCOALESCE                                                  
+unsigned_long  LINUX_MIB_TCPOFOQUEUE                                                         
+unsigned_long  LINUX_MIB_TCPOFODROP                                                          
+unsigned_long  LINUX_MIB_TCPOFOMERGE                                                         
+unsigned_long  LINUX_MIB_TCPCHALLENGEACK                                                     
+unsigned_long  LINUX_MIB_TCPSYNCHALLENGE                                                     
+unsigned_long  LINUX_MIB_TCPFASTOPENACTIVE                                                   
+unsigned_long  LINUX_MIB_TCPFASTOPENACTIVEFAIL                                               
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVE                                                  
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEFAIL                                              
+unsigned_long  LINUX_MIB_TCPFASTOPENLISTENOVERFLOW                                           
+unsigned_long  LINUX_MIB_TCPFASTOPENCOOKIEREQD                                               
+unsigned_long  LINUX_MIB_TCPFASTOPENBLACKHOLE                                                
+unsigned_long  LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES                                          
+unsigned_long  LINUX_MIB_BUSYPOLLRXPACKETS                                                   
+unsigned_long  LINUX_MIB_TCPSYNRETRANS                                                       
+unsigned_long  LINUX_MIB_TCPHYSTARTTRAINDETECT                                               
+unsigned_long  LINUX_MIB_TCPHYSTARTTRAINCWND                                                 
+unsigned_long  LINUX_MIB_TCPHYSTARTDELAYDETECT                                               
+unsigned_long  LINUX_MIB_TCPHYSTARTDELAYCWND                                                 
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDSYNRECV                                                
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDPAWS                                                   
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDSEQ                                                    
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDFINWAIT2                                               
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDTIMEWAIT                                               
+unsigned_long  LINUX_MIB_TCPACKSKIPPEDCHALLENGE                                              
+unsigned_long  LINUX_MIB_TCPWINPROBE                                                         
+unsigned_long  LINUX_MIB_TCPMTUPFAIL                                                         
+unsigned_long  LINUX_MIB_TCPMTUPSUCCESS                                                      
+unsigned_long  LINUX_MIB_TCPDELIVEREDCE                                                      
+unsigned_long  LINUX_MIB_TCPACKCOMPRESSED                                                    
+unsigned_long  LINUX_MIB_TCPZEROWINDOWDROP                                                   
+unsigned_long  LINUX_MIB_TCPRCVQDROP                                                         
+unsigned_long  LINUX_MIB_TCPWQUEUETOOBIG                                                     
+unsigned_long  LINUX_MIB_TCPFASTOPENPASSIVEALTKEY                                            
+unsigned_long  LINUX_MIB_TCPTIMEOUTREHASH                                                    
+unsigned_long  LINUX_MIB_TCPDUPLICATEDATAREHASH                                              
+unsigned_long  LINUX_MIB_TCPDSACKRECVSEGS                                                    
+unsigned_long  LINUX_MIB_TCPDSACKIGNOREDDUBIOUS                                              
+unsigned_long  LINUX_MIB_TCPMIGRATEREQSUCCESS                                                
+unsigned_long  LINUX_MIB_TCPMIGRATEREQFAILURE                                                
+unsigned_long  __LINUX_MIB_MAX                                                               
diff --git a/Documentation/networking/net_cachelines/tcp_sock.rst b/Documentation/networking/net_cachelines/tcp_sock.rst
new file mode 100644
index 0000000000000..ff7c5e933fc35
--- /dev/null
+++ b/Documentation/networking/net_cachelines/tcp_sock.rst
@@ -0,0 +1,148 @@
+Type                          Name                    fastpath_tx_access  fastpath_rx_access  Comments
+..struct                      tcp_sock                                                        
+struct_inet_connection_sock   inet_conn                                                       
+u16                           tcp_header_len          read_mostly         read_mostly         tcp_bound_to_half_wnd,tcp_current_mss(tx);tcp_rcv_established(rx)
+u16                           gso_segs                read_mostly         -                   tcp_xmit_size_goal
+__be32                        pred_flags              read_write          read_mostly         tcp_select_window(tx);tcp_rcv_established(rx)
+u64                           bytes_received          -                   read_write          tcp_rcv_nxt_update(rx)
+u32                           segs_in                 -                   read_write          tcp_v6_rcv(rx)
+u32                           data_segs_in            -                   read_write          tcp_v6_rcv(rx)
+u32                           rcv_nxt                 read_mostly         read_write          tcp_cleanup_rbuf,tcp_send_ack,tcp_inq_hint,tcp_transmit_skb,tcp_receive_window(tx);tcp_v6_do_rcv,tcp_rcv_established,tcp_data_queue,tcp_receive_window,tcp_rcv_nxt_update(write)(rx)
+u32                           copied_seq              -                   read_mostly         tcp_cleanup_rbuf,tcp_rcv_space_adjust,tcp_inq_hint
+u32                           rcv_wup                 -                   read_write          __tcp_cleanup_rbuf,tcp_receive_window,tcp_rcv_established
+u32                           snd_nxt                 read_write          read_mostly         tcp_rate_check_app_limited,__tcp_transmit_skb,tcp_event_new_data_sent(write)(tx);tcp_rcv_established,tcp_ack,tcp_clean_rtx_queue(rx)
+u32                           segs_out                read_write          -                   __tcp_transmit_skb
+u32                           data_segs_out           read_write          -                   __tcp_transmit_skb,tcp_update_skb_after_send
+u64                           bytes_sent              read_write          -                   __tcp_transmit_skb
+u64                           bytes_acked             -                   read_write          tcp_snd_una_update/tcp_ack
+u32                           dsack_dups                                                      
+u32                           snd_una                 read_mostly         read_write          tcp_wnd_end,tcp_urg_mode,tcp_minshall_check,tcp_cwnd_validate(tx);tcp_ack,tcp_may_update_window,tcp_clean_rtx_queue(write),tcp_ack_tstamp(rx)
+u32                           snd_sml                 read_write          -                   tcp_minshall_check,tcp_minshall_update
+u32                           rcv_tstamp              -                   read_mostly         tcp_ack
+u32                           lsndtime                read_write          -                   tcp_slow_start_after_idle_check,tcp_event_data_sent
+u32                           last_oow_ack_time                                               
+u32                           compressed_ack_rcv_nxt                                          
+u32                           tsoffset                read_mostly         read_mostly         tcp_established_options(tx);tcp_fast_parse_options(rx)
+struct_list_head              tsq_node                -                   -                   
+struct_list_head              tsorted_sent_queue      read_write          -                   tcp_update_skb_after_send
+u32                           snd_wl1                 -                   read_mostly         tcp_may_update_window
+u32                           snd_wnd                 read_mostly         read_mostly         tcp_wnd_end,tcp_tso_should_defer(tx);tcp_fast_path_on(rx)
+u32                           max_window              read_mostly         -                   tcp_bound_to_half_wnd,forced_push
+u32                           mss_cache               read_mostly         read_mostly         tcp_rate_check_app_limited,tcp_current_mss,tcp_sync_mss,tcp_sndbuf_expand,tcp_tso_should_defer(tx);tcp_update_pacing_rate,tcp_clean_rtx_queue(rx)
+u32                           window_clamp            read_mostly         read_write          tcp_rcv_space_adjust,__tcp_select_window
+u32                           rcv_ssthresh            read_mostly         -                   __tcp_select_window
+struct                        tcp_rack                                                        
+u16                           advmss                  -                   read_mostly         tcp_rcv_space_adjust
+u8                            compressed_ack                                                  
+u8:2                          dup_ack_counter                                                 
+u8:1                          tlp_retrans                                                     
+u32                           chrono_start            read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u32[3]                        chrono_stat             read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u8:2                          chrono_type             read_write          -                   tcp_chrono_start/stop(tcp_write_xmit,tcp_cwnd_validate,tcp_send_syn_data)
+u8:1                          rate_app_limited        -                   read_write          tcp_rate_gen
+u8:1                          fastopen_connect                                                
+u8:1                          fastopen_no_cookie                                              
+u8:1                          is_sack_reneg           -                   read_mostly         tcp_skb_entail,tcp_ack
+u8:2                          fastopen_client_fail                                            
+u8:4                          nonagle                 read_write          -                   tcp_skb_entail,tcp_push_pending_frames
+u8:1                          thin_lto                                                        
+u8:1                          recvmsg_inq                                                     
+u8:1                          repair                  read_mostly         -                   tcp_write_xmit
+u8:1                          frto                                                            
+u8                            repair_queue            -                   -                   
+u8:2                          save_syn                                                        
+u8:1                          syn_data                                                        
+u8:1                          syn_fastopen                                                    
+u8:1                          syn_fastopen_exp                                                
+u8:1                          syn_fastopen_ch                                                 
+u8:1                          syn_data_acked                                                  
+u8:1                          is_cwnd_limited         read_mostly         -                   tcp_cwnd_validate,tcp_is_cwnd_limited
+u32                           tlp_high_seq            -                   read_mostly         tcp_ack
+u32                           tcp_tx_delay                                                    
+u64                           tcp_wstamp_ns           read_write          -                   tcp_pacing_check,tcp_tso_should_defer,tcp_update_skb_after_send
+u64                           tcp_clock_cache         read_write          read_write          tcp_mstamp_refresh(tcp_write_xmit/tcp_rcv_space_adjust),__tcp_transmit_skb,tcp_tso_should_defer;timer
+u64                           tcp_mstamp              read_write          read_write          tcp_mstamp_refresh(tcp_write_xmit/tcp_rcv_space_adjust)(tx);tcp_rcv_space_adjust,tcp_rate_gen,tcp_clean_rtx_queue,tcp_ack_update_rtt/tcp_time_stamp(rx);timer
+u32                           srtt_us                 read_mostly         read_write          tcp_tso_should_defer(tx);tcp_update_pacing_rate,__tcp_set_rto,tcp_rtt_estimator(rx)
+u32                           mdev_us                 read_write          -                   tcp_rtt_estimator
+u32                           mdev_max_us                                                     
+u32                           rttvar_us               -                   read_mostly         __tcp_set_rto
+u32                           rtt_seq                 read_write                              tcp_rtt_estimator
+struct_minmax                 rtt_min                 -                   read_mostly         tcp_min_rtt/tcp_rate_gen,tcp_min_rtt,tcp_update_rtt_min
+u32                           packets_out             read_write          read_write          tcp_packets_in_flight(tx/rx);tcp_slow_start_after_idle_check,tcp_nagle_check,tcp_rate_skb_sent,tcp_event_new_data_sent,tcp_cwnd_validate,tcp_write_xmit(tx);tcp_ack,tcp_clean_rtx_queue,tcp_update_pacing_rate(rx)
+u32                           retrans_out             -                   read_mostly         tcp_packets_in_flight,tcp_rate_check_app_limited
+u32                           max_packets_out         -                   read_write          tcp_cwnd_validate
+u32                           cwnd_usage_seq          -                   read_write          tcp_cwnd_validate
+u16                           urg_data                -                   read_mostly         tcp_fast_path_check
+u8                            ecn_flags               read_write          -                   tcp_ecn_send
+u8                            keepalive_probes                                                
+u32                           reordering              read_mostly         -                   tcp_sndbuf_expand
+u32                           reord_seen                                                      
+u32                           snd_up                  read_write          read_mostly         tcp_mark_urg,tcp_urg_mode,__tcp_transmit_skb(tx);tcp_clean_rtx_queue(rx)
+struct_tcp_options_received   rx_opt                  read_mostly         read_write          tcp_established_options(tx);tcp_fast_path_on,tcp_ack_update_window,tcp_is_sack,tcp_data_queue,tcp_rcv_established,tcp_ack_update_rtt(rx)
+u32                           snd_ssthresh            -                   read_mostly         tcp_update_pacing_rate
+u32                           snd_cwnd                read_mostly         read_mostly         tcp_snd_cwnd,tcp_rate_check_app_limited,tcp_tso_should_defer(tx);tcp_update_pacing_rate
+u32                           snd_cwnd_cnt                                                    
+u32                           snd_cwnd_clamp                                                  
+u32                           snd_cwnd_used                                                   
+u32                           snd_cwnd_stamp                                                  
+u32                           prior_cwnd                                                      
+u32                           prr_delivered                                                   
+u32                           prr_out                 read_mostly         read_mostly         tcp_rate_skb_sent,tcp_newly_delivered(tx);tcp_ack,tcp_rate_gen,tcp_clean_rtx_queue(rx)
+u32                           delivered               read_mostly         read_write          tcp_rate_skb_sent, tcp_newly_delivered(tx);tcp_ack, tcp_rate_gen, tcp_clean_rtx_queue (rx)
+u32                           delivered_ce            read_mostly         read_write          tcp_rate_skb_sent(tx);tcp_rate_gen(rx)
+u32                           lost                    -                   read_mostly         tcp_ack
+u32                           app_limited             read_write          read_mostly         tcp_rate_check_app_limited,tcp_rate_skb_sent(tx);tcp_rate_gen(rx)
+u64                           first_tx_mstamp         read_write          -                   tcp_rate_skb_sent
+u64                           delivered_mstamp        read_write          -                   tcp_rate_skb_sent
+u32                           rate_delivered          -                   read_mostly         tcp_rate_gen
+u32                           rate_interval_us        -                   read_mostly         rate_delivered,rate_app_limited
+u32                           rcv_wnd                 read_write          read_mostly         tcp_select_window,tcp_receive_window,tcp_fast_path_check
+u32                           write_seq               read_write          -                   tcp_rate_check_app_limited,tcp_write_queue_empty,tcp_skb_entail,forced_push,tcp_mark_push
+u32                           notsent_lowat           read_mostly         -                   tcp_stream_memory_free
+u32                           pushed_seq              read_write          -                   tcp_mark_push,forced_push
+u32                           lost_out                read_mostly         read_mostly         tcp_left_out(tx);tcp_packets_in_flight(tx/rx);tcp_rate_check_app_limited(rx)
+u32                           sacked_out              read_mostly         read_mostly         tcp_left_out(tx);tcp_packets_in_flight(tx/rx);tcp_clean_rtx_queue(rx)
+struct_hrtimer                pacing_timer                                                    
+struct_hrtimer                compressed_ack_timer                                            
+struct_sk_buff*               lost_skb_hint           read_mostly                             tcp_clean_rtx_queue
+struct_sk_buff*               retransmit_skb_hint     read_mostly         -                   tcp_clean_rtx_queue
+struct_rb_root                out_of_order_queue      -                   read_mostly         tcp_data_queue,tcp_fast_path_check
+struct_sk_buff*               ooo_last_skb                                                    
+struct_tcp_sack_block[1]      duplicate_sack                                                  
+struct_tcp_sack_block[4]      selective_acks                                                  
+struct_tcp_sack_block[4]      recv_sack_cache                                                 
+struct_sk_buff*               highest_sack            read_write          -                   tcp_event_new_data_sent
+int                           lost_cnt_hint                                                   
+u32                           prior_ssthresh                                                  
+u32                           high_seq                                                        
+u32                           retrans_stamp                                                   
+u32                           undo_marker                                                     
+int                           undo_retrans                                                    
+u64                           bytes_retrans                                                   
+u32                           total_retrans                                                   
+u32                           rto_stamp                                                       
+u16                           total_rto                                                       
+u16                           total_rto_recoveries                                            
+u32                           total_rto_time                                                  
+u32                           urg_seq                 -                   -                   
+unsigned_int                  keepalive_time                                                  
+unsigned_int                  keepalive_intvl                                                 
+int                           linger2                                                         
+u8                            bpf_sock_ops_cb_flags                                           
+u8:1                          bpf_chg_cc_inprogress                                           
+u16                           timeout_rehash                                                  
+u32                           rcv_ooopack                                                     
+u32                           rcv_rtt_last_tsecr                                              
+struct                        rcv_rtt_est             -                   read_write          tcp_rcv_space_adjust,tcp_rcv_established
+struct                        rcvq_space              -                   read_write          tcp_rcv_space_adjust
+struct                        mtu_probe                                                       
+u32                           plb_rehash                                                      
+u32                           mtu_info                                                        
+bool                          is_mptcp                                                        
+bool                          smc_hs_congested                                                
+bool                          syn_smc                                                         
+struct_tcp_sock_af_ops*       af_specific                                                     
+struct_tcp_md5sig_info*       md5sig_info                                                     
+struct_tcp_fastopen_request*  fastopen_req                                                    
+struct_request_sock*          fastopen_rsk                                                    
+struct_saved_syn*             saved_syn                                                        
\ No newline at end of file
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
  2023-10-17  1:47 ` [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
@ 2023-10-17  1:47 ` Coco Li
  2023-10-17  1:57   ` Andrew Lunn
  2023-10-17  1:47 ` [PATCH v2 net-next 3/5] netns-ipv4: reorganize netns_ipv4 " Coco Li
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC (permalink / raw)
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li, David Ahern

From: Chao Wu <wwchao@google.com>

Reorganize fast path variables in tx-txrx-rx order.
The fast path cacheline ends after LINUX_MIB_DELAYEDACKLOCKED.
There are only read-write variables here.

The data below was generated with pahole on the x86 architecture.

Fast path variables span cache lines before change: 12
Fast path variables span cache lines after change: 2
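
As a rough cross-check of the "after" figure, the sketch below maps an
enum index to the cache line that holds its counter. It is only an
illustration, not part of the patch: it assumes one 8-byte counter per
enum value (unsigned long on LP64), a counter array that starts on a
cache line boundary, and 64-byte cache lines. The names mib_counter_t,
mib_cacheline() and mib_lines_spanned() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u	/* assumption: x86-64 cache line size */

/* Assumption: one counter per enum value, 8 bytes wide (unsigned long
 * on LP64), stored in an array that starts on a cache line boundary.
 */
typedef uint64_t mib_counter_t;

/* Index of the cache line that holds the counter for enum value idx. */
static size_t mib_cacheline(size_t idx)
{
	return (idx * sizeof(mib_counter_t)) / CACHELINE_BYTES;
}

/* Number of cache lines covered by counters first..last, inclusive. */
static size_t mib_lines_spanned(size_t first, size_t last)
{
	assert(first <= last);
	return mib_cacheline(last) - mib_cacheline(first) + 1;
}
```

With this patch LINUX_MIB_TCPAUTOCORKING is 1 and
LINUX_MIB_DELAYEDACKLOCKED is 13, so mib_lines_spanned(1, 13) covers
byte offsets 8 through 111 and evaluates to 2 under these assumptions.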

Signed-off-by: Chao Wu <wwchao@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
---
 include/uapi/linux/snmp.h | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 26f33a4c253d7..aefb39edb87c6 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -169,7 +169,28 @@ enum
 /* linux mib definitions */
 enum
 {
+	/* Cacheline organization is documented in
+	 * Documentation/networking/net_cachelines/snmp.rst.
+	 * Please update the document when adding new fields.
+	 */
+
 	LINUX_MIB_NUM = 0,
+	/* TX hotpath */
+	LINUX_MIB_TCPAUTOCORKING,		/* TCPAutoCorking */
+	LINUX_MIB_TCPFROMZEROWINDOWADV,		/* TCPFromZeroWindowAdv */
+	LINUX_MIB_TCPTOZEROWINDOWADV,		/* TCPToZeroWindowAdv */
+	LINUX_MIB_TCPWANTZEROWINDOWADV,		/* TCPWantZeroWindowAdv */
+	LINUX_MIB_TCPORIGDATASENT,		/* TCPOrigDataSent */
+	LINUX_MIB_TCPPUREACKS,			/* TCPPureAcks */
+	LINUX_MIB_TCPHPACKS,			/* TCPHPAcks */
+	LINUX_MIB_TCPDELIVERED,			/* TCPDelivered */
+	/* RX hotpath */
+	LINUX_MIB_TCPHPHITS,			/* TCPHPHits */
+	LINUX_MIB_TCPRCVCOALESCE,		/* TCPRcvCoalesce */
+	LINUX_MIB_TCPKEEPALIVE,			/* TCPKeepAlive */
+	LINUX_MIB_DELAYEDACKS,			/* DelayedACKs */
+	LINUX_MIB_DELAYEDACKLOCKED,		/* DelayedACKLocked */
+	/* End of hotpath variables */
 	LINUX_MIB_SYNCOOKIESSENT,		/* SyncookiesSent */
 	LINUX_MIB_SYNCOOKIESRECV,		/* SyncookiesRecv */
 	LINUX_MIB_SYNCOOKIESFAILED,		/* SyncookiesFailed */
@@ -185,14 +206,9 @@ enum
 	LINUX_MIB_TIMEWAITKILLED,		/* TimeWaitKilled */
 	LINUX_MIB_PAWSACTIVEREJECTED,		/* PAWSActiveRejected */
 	LINUX_MIB_PAWSESTABREJECTED,		/* PAWSEstabRejected */
-	LINUX_MIB_DELAYEDACKS,			/* DelayedACKs */
-	LINUX_MIB_DELAYEDACKLOCKED,		/* DelayedACKLocked */
 	LINUX_MIB_DELAYEDACKLOST,		/* DelayedACKLost */
 	LINUX_MIB_LISTENOVERFLOWS,		/* ListenOverflows */
 	LINUX_MIB_LISTENDROPS,			/* ListenDrops */
-	LINUX_MIB_TCPHPHITS,			/* TCPHPHits */
-	LINUX_MIB_TCPPUREACKS,			/* TCPPureAcks */
-	LINUX_MIB_TCPHPACKS,			/* TCPHPAcks */
 	LINUX_MIB_TCPRENORECOVERY,		/* TCPRenoRecovery */
 	LINUX_MIB_TCPSACKRECOVERY,		/* TCPSackRecovery */
 	LINUX_MIB_TCPSACKRENEGING,		/* TCPSACKReneging */
@@ -246,7 +262,6 @@ enum
 	LINUX_MIB_TCPREQQFULLDOCOOKIES,		/* TCPReqQFullDoCookies */
 	LINUX_MIB_TCPREQQFULLDROP,		/* TCPReqQFullDrop */
 	LINUX_MIB_TCPRETRANSFAIL,		/* TCPRetransFail */
-	LINUX_MIB_TCPRCVCOALESCE,		/* TCPRcvCoalesce */
 	LINUX_MIB_TCPBACKLOGCOALESCE,		/* TCPBacklogCoalesce */
 	LINUX_MIB_TCPOFOQUEUE,			/* TCPOFOQueue */
 	LINUX_MIB_TCPOFODROP,			/* TCPOFODrop */
@@ -262,12 +277,7 @@ enum
 	LINUX_MIB_TCPFASTOPENBLACKHOLE,		/* TCPFastOpenBlackholeDetect */
 	LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES, /* TCPSpuriousRtxHostQueues */
 	LINUX_MIB_BUSYPOLLRXPACKETS,		/* BusyPollRxPackets */
-	LINUX_MIB_TCPAUTOCORKING,		/* TCPAutoCorking */
-	LINUX_MIB_TCPFROMZEROWINDOWADV,		/* TCPFromZeroWindowAdv */
-	LINUX_MIB_TCPTOZEROWINDOWADV,		/* TCPToZeroWindowAdv */
-	LINUX_MIB_TCPWANTZEROWINDOWADV,		/* TCPWantZeroWindowAdv */
 	LINUX_MIB_TCPSYNRETRANS,		/* TCPSynRetrans */
-	LINUX_MIB_TCPORIGDATASENT,		/* TCPOrigDataSent */
 	LINUX_MIB_TCPHYSTARTTRAINDETECT,	/* TCPHystartTrainDetect */
 	LINUX_MIB_TCPHYSTARTTRAINCWND,		/* TCPHystartTrainCwnd */
 	LINUX_MIB_TCPHYSTARTDELAYDETECT,	/* TCPHystartDelayDetect */
@@ -279,10 +289,8 @@ enum
 	LINUX_MIB_TCPACKSKIPPEDTIMEWAIT,	/* TCPACKSkippedTimeWait */
 	LINUX_MIB_TCPACKSKIPPEDCHALLENGE,	/* TCPACKSkippedChallenge */
 	LINUX_MIB_TCPWINPROBE,			/* TCPWinProbe */
-	LINUX_MIB_TCPKEEPALIVE,			/* TCPKeepAlive */
 	LINUX_MIB_TCPMTUPFAIL,			/* TCPMTUPFail */
 	LINUX_MIB_TCPMTUPSUCCESS,		/* TCPMTUPSuccess */
-	LINUX_MIB_TCPDELIVERED,			/* TCPDelivered */
 	LINUX_MIB_TCPDELIVEREDCE,		/* TCPDeliveredCE */
 	LINUX_MIB_TCPACKCOMPRESSED,		/* TCPAckCompressed */
 	LINUX_MIB_TCPZEROWINDOWDROP,		/* TCPZeroWindowDrop */
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 net-next 3/5] netns-ipv4: reorganize netns_ipv4 fast path variables
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
  2023-10-17  1:47 ` [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
  2023-10-17  1:47 ` [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables Coco Li
@ 2023-10-17  1:47 ` Coco Li
  2023-10-17  1:47 ` [PATCH v2 net-next 4/5] net-device: reorganize net_device " Coco Li
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC (permalink / raw)
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li, David Ahern

Reorganize fast path variables in tx-txrx-rx order.
The fast path cacheline ends after sysctl_tcp_rmem.
There are only read-only variables here (writes happen on the control
path and are not considered in this case).

The data below was generated with pahole on the x86 architecture.
Fast path variables span cache lines before change: 4
Fast path variables span cache lines after change: 2
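
For a rough feel of the grouped layout, the sketch below mirrors only
the hot fields this patch moves to the head of struct netns_ipv4 and
counts how many 64-byte lines they touch for a given starting offset.
It is an illustration under assumptions, not the pahole output: struct
hot_netns_ipv4 and lines_touched() are hypothetical, u8 is spelled
uint8_t, int is assumed to be 4 bytes with 4-byte alignment, and the
real result depends on where netns_ipv4 sits inside its enclosing
structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u	/* assumption: x86-64 cache line size */

/* Hypothetical stand-in for the head of struct netns_ipv4 after this
 * patch. Field names and order are copied from the diff.
 */
struct hot_netns_ipv4 {
	/* TX */
	uint8_t sysctl_tcp_early_retrans;
	uint8_t sysctl_tcp_tso_win_divisor;
	uint8_t sysctl_tcp_tso_rtt_log;
	uint8_t sysctl_tcp_autocorking;
	int sysctl_tcp_min_snd_mss;
	unsigned int sysctl_tcp_notsent_lowat;
	int sysctl_tcp_limit_output_bytes;
	int sysctl_tcp_min_rtt_wlen;
	int sysctl_tcp_wmem[3];
	uint8_t sysctl_ip_fwd_use_pmtu;
	/* TXRX */
	uint8_t sysctl_tcp_moderate_rcvbuf;
	/* RX */
	uint8_t sysctl_ip_early_demux;
	uint8_t sysctl_tcp_early_demux;
	int sysctl_tcp_reordering;
	int sysctl_tcp_rmem[3];
};

/* Cache lines touched by `len` bytes that start `base` bytes into a
 * cache-line-aligned region.
 */
static size_t lines_touched(size_t base, size_t len)
{
	assert(len > 0);
	return (base + len - 1) / CACHELINE_BYTES - base / CACHELINE_BYTES + 1;
}
```

Under these assumptions the group is 52 bytes with no padding holes, so
lines_touched(0, sizeof(struct hot_netns_ipv4)) is 1 when the group is
line aligned, and any other starting offset touches at most 2 lines.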

Signed-off-by: Coco Li <lixiaoyan@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
---
 include/net/netns/ipv4.h | 41 +++++++++++++++++++++++++---------------
 1 file changed, 26 insertions(+), 15 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 73f43f6991999..809e8cef87f64 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -42,6 +42,32 @@ struct inet_timewait_death_row {
 struct tcp_fastopen_context;
 
 struct netns_ipv4 {
+	/* Cacheline organization is documented in
+	 * Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst.
+	 * Please update the document when adding new fields.
+	 */
+
+	/* TX readonly hotpath cache lines */
+	u8 sysctl_tcp_early_retrans;
+	u8 sysctl_tcp_tso_win_divisor;
+	u8 sysctl_tcp_tso_rtt_log;
+	u8 sysctl_tcp_autocorking;
+	int sysctl_tcp_min_snd_mss;
+	unsigned int sysctl_tcp_notsent_lowat;
+	int sysctl_tcp_limit_output_bytes;
+	int sysctl_tcp_min_rtt_wlen;
+	int sysctl_tcp_wmem[3];
+	u8 sysctl_ip_fwd_use_pmtu;
+
+	/* TXRX readonly hotpath cache lines */
+	u8 sysctl_tcp_moderate_rcvbuf;
+
+	/* RX readonly hotpath cache line */
+	u8 sysctl_ip_early_demux;
+	u8 sysctl_tcp_early_demux;
+	int sysctl_tcp_reordering;
+	int sysctl_tcp_rmem[3];
+
 	struct inet_timewait_death_row tcp_death_row;
 	struct udp_table *udp_table;
 
@@ -96,17 +122,14 @@ struct netns_ipv4 {
 
 	u8 sysctl_ip_default_ttl;
 	u8 sysctl_ip_no_pmtu_disc;
-	u8 sysctl_ip_fwd_use_pmtu;
 	u8 sysctl_ip_fwd_update_priority;
 	u8 sysctl_ip_nonlocal_bind;
 	u8 sysctl_ip_autobind_reuse;
 	/* Shall we try to damage output packets if routing dev changes? */
 	u8 sysctl_ip_dynaddr;
-	u8 sysctl_ip_early_demux;
 #ifdef CONFIG_NET_L3_MASTER_DEV
 	u8 sysctl_raw_l3mdev_accept;
 #endif
-	u8 sysctl_tcp_early_demux;
 	u8 sysctl_udp_early_demux;
 
 	u8 sysctl_nexthop_compat_mode;
@@ -119,7 +142,6 @@ struct netns_ipv4 {
 	u8 sysctl_tcp_mtu_probing;
 	int sysctl_tcp_mtu_probe_floor;
 	int sysctl_tcp_base_mss;
-	int sysctl_tcp_min_snd_mss;
 	int sysctl_tcp_probe_threshold;
 	u32 sysctl_tcp_probe_interval;
 
@@ -135,17 +157,14 @@ struct netns_ipv4 {
 	u8 sysctl_tcp_backlog_ack_defer;
 	u8 sysctl_tcp_pingpong_thresh;
 
-	int sysctl_tcp_reordering;
 	u8 sysctl_tcp_retries1;
 	u8 sysctl_tcp_retries2;
 	u8 sysctl_tcp_orphan_retries;
 	u8 sysctl_tcp_tw_reuse;
 	int sysctl_tcp_fin_timeout;
-	unsigned int sysctl_tcp_notsent_lowat;
 	u8 sysctl_tcp_sack;
 	u8 sysctl_tcp_window_scaling;
 	u8 sysctl_tcp_timestamps;
-	u8 sysctl_tcp_early_retrans;
 	u8 sysctl_tcp_recovery;
 	u8 sysctl_tcp_thin_linear_timeouts;
 	u8 sysctl_tcp_slow_start_after_idle;
@@ -161,21 +180,13 @@ struct netns_ipv4 {
 	u8 sysctl_tcp_frto;
 	u8 sysctl_tcp_nometrics_save;
 	u8 sysctl_tcp_no_ssthresh_metrics_save;
-	u8 sysctl_tcp_moderate_rcvbuf;
-	u8 sysctl_tcp_tso_win_divisor;
 	u8 sysctl_tcp_workaround_signed_windows;
-	int sysctl_tcp_limit_output_bytes;
 	int sysctl_tcp_challenge_ack_limit;
-	int sysctl_tcp_min_rtt_wlen;
 	u8 sysctl_tcp_min_tso_segs;
-	u8 sysctl_tcp_tso_rtt_log;
-	u8 sysctl_tcp_autocorking;
 	u8 sysctl_tcp_reflect_tos;
 	int sysctl_tcp_invalid_ratelimit;
 	int sysctl_tcp_pacing_ss_ratio;
 	int sysctl_tcp_pacing_ca_ratio;
-	int sysctl_tcp_wmem[3];
-	int sysctl_tcp_rmem[3];
 	unsigned int sysctl_tcp_child_ehash_entries;
 	unsigned long sysctl_tcp_comp_sack_delay_ns;
 	unsigned long sysctl_tcp_comp_sack_slack_ns;
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 net-next 4/5] net-device: reorganize net_device fast path variables
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
                   ` (2 preceding siblings ...)
  2023-10-17  1:47 ` [PATCH v2 net-next 3/5] netns-ipv4: reorganize netns_ipv4 " Coco Li
@ 2023-10-17  1:47 ` Coco Li
  2023-10-17  8:52   ` Daniel Borkmann
  2023-10-17  1:47 ` [PATCH v2 net-next 5/5] tcp: reorganize tcp_sock " Coco Li
  2023-10-17  3:46 ` [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Florian Fainelli
  5 siblings, 1 reply; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC (permalink / raw)
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li, David Ahern

Reorganize fast path variables in tx-txrx-rx order.
Fast path variables end after npinfo.

The data below was generated with pahole on the x86 architecture.

Fast path variables span cache lines before change: 12
Fast path variables span cache lines after change: 4

Signed-off-by: Coco Li <lixiaoyan@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
---
 include/linux/netdevice.h | 99 ++++++++++++++++++++-------------------
 1 file changed, 52 insertions(+), 47 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1c7681263d302..d72b71b76bf82 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2053,6 +2053,58 @@ enum netdev_ml_priv_type {
  */
 
 struct net_device {
+	/* Cacheline organization can be found documented in
+	 * Documentation/networking/net_cachelines/net_device.rst.
+	 * Please update the document when adding new fields.
+	 */
+
+	/* TX read-mostly hotpath */
+	unsigned long long	priv_flags;
+	const struct net_device_ops *netdev_ops;
+	const struct header_ops *header_ops;
+	struct netdev_queue	*_tx;
+	unsigned int		real_num_tx_queues;
+	unsigned int		gso_max_size;
+	unsigned int		gso_ipv4_max_size;
+	u16			gso_max_segs;
+	s16			num_tc;
+	/* Note : dev->mtu is often read without holding a lock.
+	 * Writers usually hold RTNL.
+	 * It is recommended to use READ_ONCE() to annotate the reads,
+	 * and to use WRITE_ONCE() to annotate the writes.
+	 */
+	unsigned int		mtu;
+	unsigned short		needed_headroom;
+	struct netdev_tc_txq	tc_to_txq[TC_MAX_QUEUE];
+#ifdef CONFIG_XPS
+	struct xps_dev_maps __rcu *xps_maps[XPS_MAPS_MAX];
+#endif
+#ifdef CONFIG_NETFILTER_EGRESS
+	struct nf_hook_entries __rcu *nf_hooks_egress;
+#endif
+
+	/* TXRX read-mostly hotpath */
+	unsigned int		flags;
+	unsigned short		hard_header_len;
+	netdev_features_t	features;
+	struct inet6_dev __rcu	*ip6_ptr;
+
+	/* RX read-mostly hotpath */
+	struct list_head	ptype_specific;
+	int			ifindex;
+	unsigned int		real_num_rx_queues;
+	struct netdev_rx_queue	*_rx;
+	unsigned long		gro_flush_timeout;
+	int			napi_defer_hard_irqs;
+	unsigned int		gro_max_size;
+	unsigned int		gro_ipv4_max_size;
+	rx_handler_func_t __rcu	*rx_handler;
+	void __rcu		*rx_handler_data;
+	possible_net_t			nd_net;
+#ifdef CONFIG_NETPOLL
+	struct netpoll_info __rcu	*npinfo;
+#endif
+
 	char			name[IFNAMSIZ];
 	struct netdev_name_node	*name_node;
 	struct dev_ifalias	__rcu *ifalias;
@@ -2077,7 +2129,6 @@ struct net_device {
 	struct list_head	unreg_list;
 	struct list_head	close_list;
 	struct list_head	ptype_all;
-	struct list_head	ptype_specific;
 
 	struct {
 		struct list_head upper;
@@ -2085,25 +2136,12 @@ struct net_device {
 	} adj_list;
 
 	/* Read-mostly cache-line for fast-path access */
-	unsigned int		flags;
 	xdp_features_t		xdp_features;
-	unsigned long long	priv_flags;
-	const struct net_device_ops *netdev_ops;
 	const struct xdp_metadata_ops *xdp_metadata_ops;
-	int			ifindex;
 	unsigned short		gflags;
-	unsigned short		hard_header_len;
 
-	/* Note : dev->mtu is often read without holding a lock.
-	 * Writers usually hold RTNL.
-	 * It is recommended to use READ_ONCE() to annotate the reads,
-	 * and to use WRITE_ONCE() to annotate the writes.
-	 */
-	unsigned int		mtu;
-	unsigned short		needed_headroom;
 	unsigned short		needed_tailroom;
 
-	netdev_features_t	features;
 	netdev_features_t	hw_features;
 	netdev_features_t	wanted_features;
 	netdev_features_t	vlan_features;
@@ -2147,8 +2185,6 @@ struct net_device {
 	const struct tlsdev_ops *tlsdev_ops;
 #endif
 
-	const struct header_ops *header_ops;
-
 	unsigned char		operstate;
 	unsigned char		link_mode;
 
@@ -2189,9 +2225,7 @@ struct net_device {
 
 
 	/* Protocol-specific pointers */
-
 	struct in_device __rcu	*ip_ptr;
-	struct inet6_dev __rcu	*ip6_ptr;
 #if IS_ENABLED(CONFIG_VLAN_8021Q)
 	struct vlan_info __rcu	*vlan_info;
 #endif
@@ -2226,23 +2260,14 @@ struct net_device {
 	/* Interface address info used in eth_type_trans() */
 	const unsigned char	*dev_addr;
 
-	struct netdev_rx_queue	*_rx;
 	unsigned int		num_rx_queues;
-	unsigned int		real_num_rx_queues;
-
 	struct bpf_prog __rcu	*xdp_prog;
-	unsigned long		gro_flush_timeout;
-	int			napi_defer_hard_irqs;
 #define GRO_LEGACY_MAX_SIZE	65536u
 /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
  * and shinfo->gso_segs is a 16bit field.
  */
 #define GRO_MAX_SIZE		(8 * 65535u)
-	unsigned int		gro_max_size;
-	unsigned int		gro_ipv4_max_size;
 	unsigned int		xdp_zc_max_segs;
-	rx_handler_func_t __rcu	*rx_handler;
-	void __rcu		*rx_handler_data;
 #ifdef CONFIG_NET_XGRESS
 	struct bpf_mprog_entry __rcu *tcx_ingress;
 #endif
@@ -2260,24 +2285,15 @@ struct net_device {
 /*
  * Cache lines mostly used on transmit path
  */
-	struct netdev_queue	*_tx ____cacheline_aligned_in_smp;
 	unsigned int		num_tx_queues;
-	unsigned int		real_num_tx_queues;
 	struct Qdisc __rcu	*qdisc;
 	unsigned int		tx_queue_len;
 	spinlock_t		tx_global_lock;
 
 	struct xdp_dev_bulk_queue __percpu *xdp_bulkq;
-
-#ifdef CONFIG_XPS
-	struct xps_dev_maps __rcu *xps_maps[XPS_MAPS_MAX];
-#endif
 #ifdef CONFIG_NET_XGRESS
 	struct bpf_mprog_entry __rcu *tcx_egress;
 #endif
-#ifdef CONFIG_NETFILTER_EGRESS
-	struct nf_hook_entries __rcu *nf_hooks_egress;
-#endif
 
 #ifdef CONFIG_NET_SCHED
 	DECLARE_HASHTABLE	(qdisc_hash, 4);
@@ -2317,12 +2333,6 @@ struct net_device {
 	bool needs_free_netdev;
 	void (*priv_destructor)(struct net_device *dev);
 
-#ifdef CONFIG_NETPOLL
-	struct netpoll_info __rcu	*npinfo;
-#endif
-
-	possible_net_t			nd_net;
-
 	/* mid-layer private */
 	void				*ml_priv;
 	enum netdev_ml_priv_type	ml_priv_type;
@@ -2356,20 +2366,15 @@ struct net_device {
  */
 #define GSO_MAX_SIZE		(8 * GSO_MAX_SEGS)
 
-	unsigned int		gso_max_size;
 #define TSO_LEGACY_MAX_SIZE	65536
 #define TSO_MAX_SIZE		UINT_MAX
 	unsigned int		tso_max_size;
-	u16			gso_max_segs;
 #define TSO_MAX_SEGS		U16_MAX
 	u16			tso_max_segs;
-	unsigned int		gso_ipv4_max_size;
 
 #ifdef CONFIG_DCB
 	const struct dcbnl_rtnl_ops *dcbnl_ops;
 #endif
-	s16			num_tc;
-	struct netdev_tc_txq	tc_to_txq[TC_MAX_QUEUE];
 	u8			prio_tc_map[TC_BITMASK + 1];
 
 #if IS_ENABLED(CONFIG_FCOE)
-- 
2.42.0.655.g421f12c284-goog



* [PATCH v2 net-next 5/5] tcp: reorganize tcp_sock fast path variables
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
                   ` (3 preceding siblings ...)
  2023-10-17  1:47 ` [PATCH v2 net-next 4/5] net-device: reorganize net_device " Coco Li
@ 2023-10-17  1:47 ` Coco Li
  2023-10-17  3:46 ` [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Florian Fainelli
  5 siblings, 0 replies; 19+ messages in thread
From: Coco Li @ 2023-10-17  1:47 UTC (permalink / raw)
  To: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li, David Ahern

The variables are organized in the following way:

- TX read-mostly hotpath cache lines
- TXRX read-mostly hotpath cache lines
- RX read-mostly hotpath cache lines
- TX read-write hotpath cache line
- TXRX read-write hotpath cache line
- RX read-write hotpath cache line

Fastpath cachelines end after rcvq_space.

Cache line boundaries are enforced only between read-mostly and
read-write. That is, if read-mostly tx cachelines bleed into
read-mostly txrx cachelines, we do not care. We care about the
boundaries between read and write cachelines because we want
to prevent false sharing.
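
For illustration only, here is a minimal userspace sketch of that boundary
rule. The struct is a hypothetical miniature, not the real tcp_sock; the
64-byte line size is an assumption, and C11 _Alignas stands in for the
kernel's ____cacheline_aligned. The read-mostly members are packed with no
alignment between them, while the first read-write member is pushed onto a
fresh line:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64 /* assumed line size for this sketch */

/* Hypothetical miniature of the layout idea, not the real tcp_sock. */
struct demo_sock {
	/* read-mostly hot fields: tx, txrx and rx may share lines freely */
	uint32_t max_window;
	uint32_t snd_wnd;
	uint32_t copied_seq;

	/* read-write hot fields start on a fresh cache line */
	_Alignas(CACHELINE) uint32_t segs_out;
	uint64_t bytes_sent;
};

/* Index of the cache line that holds the byte at a given struct offset. */
static inline size_t cacheline_of(size_t offset)
{
	return offset / CACHELINE;
}

/* Build-time check that the read-write group begins on a line boundary. */
static_assert(offsetof(struct demo_sock, segs_out) % CACHELINE == 0,
	      "read-write group must start on its own cache line");
```

With this layout, a writer updating segs_out does not dirty the line that
readers of snd_wnd or copied_seq are using, which is the false sharing the
boundary is meant to avoid.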

Fast path variables span cache lines before change: 12
Fast path variables span cache lines after change: 8

Signed-off-by: Coco Li <lixiaoyan@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Wei Wang <weiwan@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
---
 include/linux/tcp.h | 238 +++++++++++++++++++++++---------------------
 1 file changed, 124 insertions(+), 114 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index e15452df9804f..5195a0657997a 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -175,23 +175,110 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
 #define TCP_RMEM_TO_WIN_SCALE 8
 
 struct tcp_sock {
+	/* Cacheline organization can be found documented in
+	 * Documentation/networking/net_cachelines/tcp_sock.rst.
+	 * Please update the document when adding new fields.
+	 */
+
 	/* inet_connection_sock has to be the first member of tcp_sock */
 	struct inet_connection_sock	inet_conn;
-	u16	tcp_header_len;	/* Bytes of tcp header to send		*/
+
+	/* TX read-mostly hotpath cache lines */
+	/* timestamp of last sent data packet (for restart window) */
+	u32	max_window;	/* Maximal window ever seen from peer	*/
+	u32	rcv_ssthresh;	/* Current window clamp			*/
+	u32	reordering;	/* Packet reordering metric.		*/
+	u32	notsent_lowat;	/* TCP_NOTSENT_LOWAT */
 	u16	gso_segs;	/* Max number of segs per GSO packet	*/
+	/* from STCP, retrans queue hinting */
+	struct sk_buff *lost_skb_hint;
+	struct sk_buff *retransmit_skb_hint;
+
+	/* TXRX read-mostly hotpath cache lines */
+	u32	tsoffset;	/* timestamp offset */
+	u32	snd_wnd;	/* The window we expect to receive	*/
+	u32	mss_cache;	/* Cached effective mss, not including SACKS */
+	u32	snd_cwnd;	/* Sending congestion window		*/
+	u32	prr_out;	/* Total number of pkts sent during Recovery. */
+	u32	lost_out;	/* Lost packets			*/
+	u32	sacked_out;	/* SACK'd packets			*/
+	u16	tcp_header_len;	/* Bytes of tcp header to send		*/
+	u8	chrono_type : 2,	/* current chronograph type */
+		repair      : 1,
+		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
+		is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
+
+	/* RX read-mostly hotpath cache lines */
+	u32	copied_seq;	/* Head of yet unread data */
+	u32	rcv_tstamp;	/* timestamp of last received ACK (for keepalives) */
+	u32	snd_wl1;	/* Sequence for window update		*/
+	u32	tlp_high_seq;	/* snd_nxt at the time of TLP */
+	u32	rttvar_us;	/* smoothed mdev_max			*/
+	u32	retrans_out;	/* Retransmitted packets out		*/
+	u16	advmss;		/* Advertised MSS			*/
+	u16	urg_data;	/* Saved octet of OOB data and control flags */
+	u32	lost;		/* Total data packets lost incl. rexmits */
+	struct  minmax rtt_min;
+	/* OOO segments go in this rbtree. Socket lock must be held. */
+	struct rb_root	out_of_order_queue;
+	u32	snd_ssthresh;	/* Slow start size threshold		*/
 
+	/* TX read-write hotpath cache lines */
+	u32	segs_out ____cacheline_aligned;	/* RFC4898 tcpEStatsPerfSegsOut
+				 * The total number of segments sent.
+				 */
+	u32	data_segs_out;	/* RFC4898 tcpEStatsPerfDataSegsOut
+				 * total number of data segments sent.
+				 */
+	u64	bytes_sent;	/* RFC4898 tcpEStatsPerfHCDataOctetsOut
+				 * total number of data bytes sent.
+				 */
+	u32	snd_sml;	/* Last byte of the most recently transmitted small packet */
+	u32	chrono_start;	/* Start time in jiffies of a TCP chrono */
+	u32	chrono_stat[3];	/* Time in jiffies for chrono_stat stats */
+	u32	write_seq;	/* Tail(+1) of data held in tcp send buffer */
+	u32	pushed_seq;	/* Last pushed seq, required to talk to windows */
+	u32	lsndtime;
+	u32	mdev_us;	/* medium deviation			*/
+	u64	tcp_wstamp_ns;	/* departure time for next sent data packet */
+	u64	tcp_clock_cache; /* cache last tcp_clock_ns() (see tcp_mstamp_refresh()) */
+	u64	tcp_mstamp;	/* most recent packet received/sent */
+	u32	rtt_seq;	/* sequence number to update rttvar	*/
+	struct list_head tsorted_sent_queue; /* time-sorted sent but un-SACKed skbs */
+	struct sk_buff *highest_sack;   /* skb just after the highest
+					 * skb with SACKed bit set
+					 * (validity guaranteed only if
+					 * sacked_out > 0)
+					 */
+	u8	ecn_flags;	/* ECN status bits.			*/
+
+	/* TXRX read-write hotpath cache lines */
 /*
  *	Header prediction flags
  *	0x5?10 << 16 + snd_wnd in net byte order
  */
 	__be32	pred_flags;
-
+	u32	rcv_nxt;	/* What we want to receive next 	*/
+	u32	snd_nxt;	/* Next sequence we send		*/
+	u32	snd_una;	/* First byte we want an ack for	*/
+	u32	window_clamp;	/* Maximal window to advertise		*/
+	u32	srtt_us;	/* smoothed round trip time << 3 in usecs */
+	u32	packets_out;	/* Packets which are "in flight"	*/
+	u32	snd_up;		/* Urgent pointer		*/
+	u32	delivered;	/* Total data packets delivered incl. rexmits */
+	u32	delivered_ce;	/* Like the above but only ECE marked packets */
+	u32	app_limited;	/* limited until "delivered" reaches this val */
+	u32	rcv_wnd;	/* Current receiver window		*/
 /*
- *	RFC793 variables by their proper names. This means you can
- *	read the code and the spec side by side (and laugh ...)
- *	See RFC793 and RFC1122. The RFC writes these in capitals.
+ *      Options received (usually on last packet, some only on SYN packets).
  */
-	u64	bytes_received;	/* RFC4898 tcpEStatsAppHCThruOctetsReceived
+	struct tcp_options_received rx_opt;
+	u8	nonagle     : 4,/* Disable Nagle algorithm?             */
+		rate_app_limited:1;  /* rate_{delivered,interval_us} limited? */
+
+	/* RX read-write hotpath cache lines */
+	u64	bytes_received;
+				/* RFC4898 tcpEStatsAppHCThruOctetsReceived
 				 * sum(delta(rcv_nxt)), or how many bytes
 				 * were acked.
 				 */
@@ -201,45 +288,44 @@ struct tcp_sock {
 	u32	data_segs_in;	/* RFC4898 tcpEStatsPerfDataSegsIn
 				 * total number of data segments in.
 				 */
- 	u32	rcv_nxt;	/* What we want to receive next 	*/
-	u32	copied_seq;	/* Head of yet unread data		*/
 	u32	rcv_wup;	/* rcv_nxt on last window update sent	*/
- 	u32	snd_nxt;	/* Next sequence we send		*/
-	u32	segs_out;	/* RFC4898 tcpEStatsPerfSegsOut
-				 * The total number of segments sent.
-				 */
-	u32	data_segs_out;	/* RFC4898 tcpEStatsPerfDataSegsOut
-				 * total number of data segments sent.
-				 */
-	u64	bytes_sent;	/* RFC4898 tcpEStatsPerfHCDataOctetsOut
-				 * total number of data bytes sent.
-				 */
+	u32	max_packets_out;  /* max packets_out in last window */
+	u32	cwnd_usage_seq;  /* right edge of cwnd usage tracking flight */
+	u32	rate_delivered;    /* saved rate sample: packets delivered */
+	u32	rate_interval_us;  /* saved rate sample: time elapsed */
+	u32	rcv_rtt_last_tsecr;
+	u64	first_tx_mstamp;  /* start of window send phase */
+	u64	delivered_mstamp; /* time we reached "delivered" */
 	u64	bytes_acked;	/* RFC4898 tcpEStatsAppHCThruOctetsAcked
 				 * sum(delta(snd_una)), or how many bytes
 				 * were acked.
 				 */
+	struct {
+		u32	rtt_us;
+		u32	seq;
+		u64	time;
+	} rcv_rtt_est;
+/* Receiver queue space */
+	struct {
+		u32	space;
+		u32	seq;
+		u64	time;
+	} rcvq_space;
+
+	/* End of Hot Path */
+
+/*
+ *	RFC793 variables by their proper names. This means you can
+ *	read the code and the spec side by side (and laugh ...)
+ *	See RFC793 and RFC1122. The RFC writes these in capitals.
+ */
 	u32	dsack_dups;	/* RFC4898 tcpEStatsStackDSACKDups
 				 * total number of DSACK blocks received
 				 */
- 	u32	snd_una;	/* First byte we want an ack for	*/
- 	u32	snd_sml;	/* Last byte of the most recently transmitted small packet */
-	u32	rcv_tstamp;	/* timestamp of last received ACK (for keepalives) */
-	u32	lsndtime;	/* timestamp of last sent data packet (for restart window) */
 	u32	last_oow_ack_time;  /* timestamp of last out-of-window ACK */
 	u32	compressed_ack_rcv_nxt;
-
-	u32	tsoffset;	/* timestamp offset */
-
 	struct list_head tsq_node; /* anchor in tsq_tasklet.head list */
-	struct list_head tsorted_sent_queue; /* time-sorted sent but un-SACKed skbs */
-
-	u32	snd_wl1;	/* Sequence for window update		*/
-	u32	snd_wnd;	/* The window we expect to receive	*/
-	u32	max_window;	/* Maximal window ever seen from peer	*/
-	u32	mss_cache;	/* Cached effective mss, not including SACKS */
 
-	u32	window_clamp;	/* Maximal window to advertise		*/
-	u32	rcv_ssthresh;	/* Current window clamp			*/
 	u8	scaling_ratio;	/* see tcp_win_from_space() */
 	/* Information of the most recently (s)acked skb */
 	struct tcp_rack {
@@ -253,23 +339,16 @@ struct tcp_sock {
 		   dsack_seen:1, /* Whether DSACK seen after last adj */
 		   advanced:1;	 /* mstamp advanced since last lost marking */
 	} rack;
-	u16	advmss;		/* Advertised MSS			*/
 	u8	compressed_ack;
 	u8	dup_ack_counter:2,
 		tlp_retrans:1,	/* TLP is a retransmission */
 		unused:5;
-	u32	chrono_start;	/* Start time in jiffies of a TCP chrono */
-	u32	chrono_stat[3];	/* Time in jiffies for chrono_stat stats */
-	u8	chrono_type:2,	/* current chronograph type */
-		rate_app_limited:1,  /* rate_{delivered,interval_us} limited? */
+
+	u8	thin_lto    : 1,/* Use linear timeouts for thin streams */
+		recvmsg_inq : 1,/* Indicate # of bytes in queue upon recvmsg */
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
-		is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
-		fastopen_client_fail:2; /* reason why fastopen failed */
-	u8	nonagle     : 4,/* Disable Nagle algorithm?             */
-		thin_lto    : 1,/* Use linear timeouts for thin streams */
-		recvmsg_inq : 1,/* Indicate # of bytes in queue upon recvmsg */
-		repair      : 1,
+		fastopen_client_fail:2, /* reason why fastopen failed */
 		frto        : 1;/* F-RTO (RFC5682) activated in CA_Loss */
 	u8	repair_queue;
 	u8	save_syn:2,	/* Save headers of SYN packet */
@@ -277,45 +356,19 @@ struct tcp_sock {
 		syn_fastopen:1,	/* SYN includes Fast Open option */
 		syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
 		syn_fastopen_ch:1, /* Active TFO re-enabling probe */
-		syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
-		is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
-	u32	tlp_high_seq;	/* snd_nxt at the time of TLP */
+		syn_data_acked:1;/* data in SYN is acked by SYN-ACK */
 
 	u32	tcp_tx_delay;	/* delay (in usec) added to TX packets */
-	u64	tcp_wstamp_ns;	/* departure time for next sent data packet */
-	u64	tcp_clock_cache; /* cache last tcp_clock_ns() (see tcp_mstamp_refresh()) */
 
 /* RTT measurement */
-	u64	tcp_mstamp;	/* most recent packet received/sent */
-	u32	srtt_us;	/* smoothed round trip time << 3 in usecs */
-	u32	mdev_us;	/* medium deviation			*/
 	u32	mdev_max_us;	/* maximal mdev for the last rtt period	*/
-	u32	rttvar_us;	/* smoothed mdev_max			*/
-	u32	rtt_seq;	/* sequence number to update rttvar	*/
-	struct  minmax rtt_min;
 
-	u32	packets_out;	/* Packets which are "in flight"	*/
-	u32	retrans_out;	/* Retransmitted packets out		*/
-	u32	max_packets_out;  /* max packets_out in last window */
-	u32	cwnd_usage_seq;  /* right edge of cwnd usage tracking flight */
-
-	u16	urg_data;	/* Saved octet of OOB data and control flags */
-	u8	ecn_flags;	/* ECN status bits.			*/
 	u8	keepalive_probes; /* num of allowed keep alive probes	*/
-	u32	reordering;	/* Packet reordering metric.		*/
 	u32	reord_seen;	/* number of data packet reordering events */
-	u32	snd_up;		/* Urgent pointer		*/
-
-/*
- *      Options received (usually on last packet, some only on SYN packets).
- */
-	struct tcp_options_received rx_opt;
 
 /*
  *	Slow start and congestion control (see also Nagle, and Karn & Partridge)
  */
- 	u32	snd_ssthresh;	/* Slow start size threshold		*/
- 	u32	snd_cwnd;	/* Sending congestion window		*/
 	u32	snd_cwnd_cnt;	/* Linear increase counter		*/
 	u32	snd_cwnd_clamp; /* Do not allow snd_cwnd to grow above this */
 	u32	snd_cwnd_used;
@@ -323,32 +376,10 @@ struct tcp_sock {
 	u32	prior_cwnd;	/* cwnd right before starting loss recovery */
 	u32	prr_delivered;	/* Number of newly delivered packets to
 				 * receiver in Recovery. */
-	u32	prr_out;	/* Total number of pkts sent during Recovery. */
-	u32	delivered;	/* Total data packets delivered incl. rexmits */
-	u32	delivered_ce;	/* Like the above but only ECE marked packets */
-	u32	lost;		/* Total data packets lost incl. rexmits */
-	u32	app_limited;	/* limited until "delivered" reaches this val */
-	u64	first_tx_mstamp;  /* start of window send phase */
-	u64	delivered_mstamp; /* time we reached "delivered" */
-	u32	rate_delivered;    /* saved rate sample: packets delivered */
-	u32	rate_interval_us;  /* saved rate sample: time elapsed */
-
- 	u32	rcv_wnd;	/* Current receiver window		*/
-	u32	write_seq;	/* Tail(+1) of data held in tcp send buffer */
-	u32	notsent_lowat;	/* TCP_NOTSENT_LOWAT */
-	u32	pushed_seq;	/* Last pushed seq, required to talk to windows */
-	u32	lost_out;	/* Lost packets			*/
-	u32	sacked_out;	/* SACK'd packets			*/
 
 	struct hrtimer	pacing_timer;
 	struct hrtimer	compressed_ack_timer;
 
-	/* from STCP, retrans queue hinting */
-	struct sk_buff* lost_skb_hint;
-	struct sk_buff *retransmit_skb_hint;
-
-	/* OOO segments go in this rbtree. Socket lock must be held. */
-	struct rb_root	out_of_order_queue;
 	struct sk_buff	*ooo_last_skb; /* cache rb_last(out_of_order_queue) */
 
 	/* SACKs data, these 2 need to be together (see tcp_options_write) */
@@ -357,12 +388,6 @@ struct tcp_sock {
 
 	struct tcp_sack_block recv_sack_cache[4];
 
-	struct sk_buff *highest_sack;   /* skb just after the highest
-					 * skb with SACKed bit set
-					 * (validity guaranteed only if
-					 * sacked_out > 0)
-					 */
-
 	int     lost_cnt_hint;
 
 	u32	prior_ssthresh; /* ssthresh saved at recovery start	*/
@@ -413,21 +438,6 @@ struct tcp_sock {
 
 	u32 rcv_ooopack; /* Received out-of-order packets, for tcpinfo */
 
-/* Receiver side RTT estimation */
-	u32 rcv_rtt_last_tsecr;
-	struct {
-		u32	rtt_us;
-		u32	seq;
-		u64	time;
-	} rcv_rtt_est;
-
-/* Receiver queue space */
-	struct {
-		u32	space;
-		u32	seq;
-		u64	time;
-	} rcvq_space;
-
 /* TCP-specific MTU probe information. */
 	struct {
 		u32		  probe_seq_start;
-- 
2.42.0.655.g421f12c284-goog



* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17  1:47 ` [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables Coco Li
@ 2023-10-17  1:57   ` Andrew Lunn
  2023-10-17 18:10     ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2023-10-17  1:57 UTC (permalink / raw)
  To: Coco Li
  Cc: Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni, netdev, Chao Wu, Wei Wang,
	David Ahern

On Tue, Oct 17, 2023 at 01:47:13AM +0000, Coco Li wrote:
> From: Chao Wu <wwchao@google.com>
> 
> Reorganize fast path variables on tx-txrx-rx order.
> Fast path cacheline ends afer LINUX_MIB_DELAYEDACKLOCKED.
> There are only read-write variables here.
> 
> Below data generated with pahole on x86 architecture.
> 
> Fast path variables span cache lines before change: 12
> Fast path variables span cache lines after change: 2

As I pointed out for the first version, this is a UAPI file.

Please could you add some justification that this does not cause any
UAPI changes. Will old user space binaries still work after this?

Thanks
	Andrew
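
To make the concern concrete, here is a small sketch of the failure mode
being asked about. The enum names and counters are hypothetical, not the
real MIB list, and whether any user space consumer actually indexes
counters by these values is exactly the open question above. The sketch
only shows what would happen to a binary built against the old numbering
if it did:

```c
#include <assert.h>

/* Hypothetical counters; the names are illustrative, not the real MIB list. */
enum demo_mib_v1 {
	DEMO_V1_INPKTS,
	DEMO_V1_OUTPKTS,
	DEMO_V1_DELAYEDACKS,
	DEMO_V1_MAX
};

/* The same counters after a reorder that moves the hot one to the front. */
enum demo_mib_v2 {
	DEMO_V2_DELAYEDACKS,
	DEMO_V2_INPKTS,
	DEMO_V2_OUTPKTS,
	DEMO_V2_MAX
};

/* What a binary compiled against v1 reads from an array laid out as v2. */
static long old_binary_reads_inpkts(const long *stats_in_v2_order)
{
	return stats_in_v2_order[DEMO_V1_INPKTS];
}
```

If consumers only ever look counters up by name, the numeric values are
free to move; if any of them bake in the enum values, a reorder silently
returns the wrong counter.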


* Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
  2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
                   ` (4 preceding siblings ...)
  2023-10-17  1:47 ` [PATCH v2 net-next 5/5] tcp: reorganize tcp_sock " Coco Li
@ 2023-10-17  3:46 ` Florian Fainelli
  2023-10-17  9:06   ` Daniel Borkmann
  5 siblings, 1 reply; 19+ messages in thread
From: Florian Fainelli @ 2023-10-17  3:46 UTC (permalink / raw)
  To: Coco Li, Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang



On 10/16/2023 6:47 PM, Coco Li wrote:
> Currently, variable-heavy structs in the networking stack is organized
> chronologically, logically and sometimes by cache line access.
> 
> This patch series attempts to reorganize the core networking stack
> variables to minimize cacheline consumption during the phase of data
> transfer. Specifically, we looked at the TCP/IP stack and the fast
> path definition in TCP.
> 
> For documentation purposes, we also added new files for each core data
> structure we considered, although not all ended up being modified due
> to the amount of existing cache line they span in the fast path. In
> the documentation, we recorded all variables we identified on the
> fast path and the reasons. We also hope that in the future when
> variables are added/modified, the document can be referred to and
> updated accordingly to reflect the latest variable organization.

This is great stuff. Eric mentioned this work during Netconf'23, and one
concern that came up is: how can we make sure that a future change which
adds/removes/shuffles members in those structures is not going to be
detrimental to the work you just did? Is there a way to "lock" the
structure layout to avoid causing performance drops?

I suppose we could use pahole before/after for these structures and 
ensure that the layout on a cacheline basis remains preserved, but that 
means adding custom scripts to CI.
-- 
Florian
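
As a rough idea of what such a before/after check would compute, here is
a minimal sketch that counts how many distinct cache lines a set of
members touches. It is not pahole and does not parse debug info; the
member_span type, the 64-byte line size and the 256-line cap are all
assumptions made for the example, and the offsets would have to come
from offsetof()/sizeof() or from pahole output:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHELINE_BYTES 64    /* assumed line size */
#define MAX_TRACKED_LINES 256 /* arbitrary cap for this sketch */

/* Offset and size of one fast path member (hypothetical helper type). */
struct member_span {
	size_t offset;
	size_t size;
};

/* Count the distinct cache lines touched by a set of members. */
static size_t cachelines_spanned(const struct member_span *m, size_t n)
{
	bool seen[MAX_TRACKED_LINES] = { false };
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (m[i].size == 0)
			continue;

		size_t first = m[i].offset / CACHELINE_BYTES;
		size_t last = (m[i].offset + m[i].size - 1) / CACHELINE_BYTES;

		/* A member straddling a boundary marks every line it covers. */
		for (size_t line = first;
		     line <= last && line < MAX_TRACKED_LINES; line++) {
			if (!seen[line]) {
				seen[line] = true;
				count++;
			}
		}
	}
	return count;
}
```

Comparing this count for the documented fast path members before and
after a change is the kind of regression signal a CI script could flag.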


* Re: [PATCH v2 net-next 4/5] net-device: reorganize net_device fast path variables
  2023-10-17  1:47 ` [PATCH v2 net-next 4/5] net-device: reorganize net_device " Coco Li
@ 2023-10-17  8:52   ` Daniel Borkmann
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Borkmann @ 2023-10-17  8:52 UTC (permalink / raw)
  To: Coco Li, Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, David Ahern

Hi Coco,

Thanks for looking into this, awesome work!

On 10/17/23 3:47 AM, Coco Li wrote:
> Reorganize fast path variables on tx-txrx-rx order
> Fastpath variables end after npinfo.
> 
> Below data generated with pahole on x86 architecture.
> 
> Fast path variables span cache lines before change: 12
> Fast path variables span cache lines after change: 4
> 
> Signed-off-by: Coco Li <lixiaoyan@google.com>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Reviewed-by: David Ahern <dsahern@kernel.org>
> ---
>   include/linux/netdevice.h | 99 ++++++++++++++++++++-------------------
>   1 file changed, 52 insertions(+), 47 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 1c7681263d302..d72b71b76bf82 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2053,6 +2053,58 @@ enum netdev_ml_priv_type {
>    */
>   
>   struct net_device {
> +	/* Caacheline organization can be found documented in

tiny nit: typo

> +	 * Documentation/networking/net_cachelines/net_device.rst.

I think this doc is not based on current struct net_device as I saw
some members in there which are not in net_device anymore today?

For the doc, please document also tcx_ingress member as fastpath_rx_access
(sch_handle_ingress) and tcx_egress member as fastpath_tx_access
(sch_handle_egress).

> +	 * Please update the document when adding new fields.
> +	 */
> +
> +	/* TX read-mostly hotpath */
> +	unsigned long long	priv_flags;
> +	const struct net_device_ops *netdev_ops;
> +	const struct header_ops *header_ops;
> +	struct netdev_queue	*_tx;
> +	unsigned int		real_num_tx_queues;
> +	unsigned int		gso_max_size;
> +	unsigned int		gso_ipv4_max_size;
> +	u16			gso_max_segs;
> +	s16			num_tc;
> +	/* Note : dev->mtu is often read without holding a lock.
> +	 * Writers usually hold RTNL.
> +	 * It is recommended to use READ_ONCE() to annotate the reads,
> +	 * and to use WRITE_ONCE() to annotate the writes.
> +	 */
> +	unsigned int		mtu;
> +	unsigned short		needed_headroom;
> +	struct netdev_tc_txq	tc_to_txq[TC_MAX_QUEUE];
> +#ifdef CONFIG_XPS
> +	struct xps_dev_maps __rcu *xps_maps[XPS_MAPS_MAX];
> +#endif
> +#ifdef CONFIG_NETFILTER_EGRESS
> +	struct nf_hook_entries __rcu *nf_hooks_egress;
> +#endif

As mentioned above, please also add tcx_egress to TX read-mostly
hotpath cacheline.

Thanks,
Daniel


* Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
  2023-10-17  3:46 ` [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Florian Fainelli
@ 2023-10-17  9:06   ` Daniel Borkmann
  2023-10-17 16:50     ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Borkmann @ 2023-10-17  9:06 UTC (permalink / raw)
  To: Florian Fainelli, Coco Li, Jakub Kicinski, Eric Dumazet,
	Neal Cardwell, Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang

On 10/17/23 5:46 AM, Florian Fainelli wrote:
> On 10/16/2023 6:47 PM, Coco Li wrote:
>> Currently, variable-heavy structs in the networking stack is organized
>> chronologically, logically and sometimes by cache line access.
>>
>> This patch series attempts to reorganize the core networking stack
>> variables to minimize cacheline consumption during the phase of data
>> transfer. Specifically, we looked at the TCP/IP stack and the fast
>> path definition in TCP.
>>
>> For documentation purposes, we also added new files for each core data
>> structure we considered, although not all ended up being modified due
>> to the amount of existing cache line they span in the fast path. In
>> the documentation, we recorded all variables we identified on the
>> fast path and the reasons. We also hope that in the future when
>> variables are added/modified, the document can be referred to and
>> updated accordingly to reflect the latest variable organization.
> 
> This is great stuff, while Eric mentioned this work during Netconf'23 one concern that came up however is how can we make sure that a future change which adds/removes/shuffles members in those structures is not going to be detrimental to the work you just did? Is there a way to "lock" the structure layout to avoid causing performance drops?
> 
> I suppose we could use pahole before/after for these structures and ensure that the layout on a cacheline basis remains preserved, but that means adding custom scripts to CI.

It should be possible without extra CI. We could probably have zero-sized markers,
as we have in sk_buff (e.g. __cloned_offset[0]), and use some macros to force grouping.

ASSERT_CACHELINE_GROUP() could then throw a build error, for example, if the member is
not within __begin_cacheline_group and __end_cacheline_group:

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9ea3ec906b57..c664e0594da4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2059,6 +2059,7 @@ struct net_device {
          */

         /* TX read-mostly hotpath */
+       __begin_cacheline_group(tx_read_mostly);
         unsigned long long      priv_flags;
         const struct net_device_ops *netdev_ops;
         const struct header_ops *header_ops;
@@ -2085,6 +2086,7 @@ struct net_device {
  #ifdef CONFIG_NET_XGRESS
         struct bpf_mprog_entry __rcu *tcx_egress;
  #endif
+       __end_cacheline_group(tx_read_mostly);

         /* TXRX read-mostly hotpath */
         unsigned int            flags;
diff --git a/net/core/dev.c b/net/core/dev.c
index 97e7b9833db9..2a91bd4077ad 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11523,6 +11523,9 @@ static int __init net_dev_init(void)

         BUG_ON(!dev_boot_phase);

+       ASSERT_CACHELINE_GROUP(tx_read_mostly, priv_flags);
+       ASSERT_CACHELINE_GROUP(tx_read_mostly, netdev_ops);
+       [...]
+
         if (dev_proc_init())
                 goto out;
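
As a rough user-space illustration of the marker idea above (not kernel code):
the struct toy_dev and the GROUP_BEGIN/GROUP_END/ASSERT_IN_GROUP names are made
up for this sketch, and it assumes a compiler that accepts GNU C zero-length
arrays plus C11 static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Hypothetical stand-ins for the proposed markers; zero-length arrays
 * are a GNU C extension and occupy no storage. */
#define GROUP_BEGIN(g) unsigned char group_begin__##g[0]
#define GROUP_END(g)   unsigned char group_end__##g[0]

/* Offset of the first byte past MEMBER. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Fail the build unless MEMBER lies between the two markers of group G. */
#define ASSERT_IN_GROUP(type, g, member) \
	static_assert(offsetof(type, member) >= \
		      offsetof(type, group_begin__##g) && \
		      OFFSETOFEND(type, member) <= \
		      offsetof(type, group_end__##g), \
		      #member " is outside group " #g)

/* Toy structure, loosely modelled on the net_device hunk above. */
struct toy_dev {
	unsigned int cold_state;
	GROUP_BEGIN(tx_read_mostly);
	unsigned long long priv_flags;
	const void *netdev_ops;
	GROUP_END(tx_read_mostly);
	unsigned int flags;
};

ASSERT_IN_GROUP(struct toy_dev, tx_read_mostly, priv_flags);
ASSERT_IN_GROUP(struct toy_dev, tx_read_mostly, netdev_ops);
```

Moving priv_flags below GROUP_END() in this toy struct should make the first
assertion fail at compile time, which is the behaviour being asked for.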

* Re: [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs
  2023-10-17  1:47 ` [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
@ 2023-10-17 14:57   ` Jonathan Corbet
  0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Corbet @ 2023-10-17 14:57 UTC (permalink / raw)
  To: Coco Li, Jakub Kicinski, Eric Dumazet, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni
  Cc: netdev, Chao Wu, Wei Wang, Coco Li

Coco Li <lixiaoyan@google.com> writes:

> Analyzed a few structs in the networking stack by looking at variables
> within them that are used in the TCP/IP fast path.
>
> Fast path is defined as TCP path where data is transferred from sender to
> receiver unidirectionaly. It doesn't include phases other than
> TCP_ESTABLISHED, nor does it look at error paths.
>
> We hope to re-organizing variables that span many cachelines whose fast
> path variables are also spread out, and this document can help future
> developers keep networking fast path cachelines small.
>
> Optimized_cacheline field is computed as
> (Fastpath_Bytes/L3_cacheline_size_x86), and not the actual organized
> results (see patches to come for these).
>
> Note that the optimization is not cache line size dependent, we use
> x86 as an example of improvements.
>
> Investigation is done on 6.5
>
> Name	                Struct_Cachelines  Cur_fastpath_cache Fastpath_Bytes Optimized_cacheline
> tcp_sock	        42 (2664 Bytes)	   12   		396		8
> net_device	        39 (2240 bytes)	   12			234		4
> inet_sock	        15 (960 bytes)	   14			922		14
> Inet_connection_sock	22 (1368 bytes)	   18			1166		18
> Netns_ipv4 (sysctls)	12 (768 bytes)     4			77		2
> linux_mib	        16 (1060)	   6			104		2
>
> Note how there isn't much improvement space for inet_sock and
> Inet_connection_sock because sk and icsk_inet respective take up so
> much of the struct that rest of the variables become a small portion of
> the struct size.
>
> So, we decided to reorganize tcp_sock, net_device, Netns_ipv4, linux_mib
>
> Signed-off-by: Coco Li <lixiaoyan@google.com>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> ---
>  .../net_cachelines/inet_connection_sock.rst   |  42 +++++
>  .../networking/net_cachelines/inet_sock.rst   |  37 ++++
>  .../networking/net_cachelines/net_device.rst  | 167 ++++++++++++++++++
>  .../net_cachelines/netns_ipv4_sysctl.rst      | 151 ++++++++++++++++
>  .../networking/net_cachelines/snmp.rst        | 128 ++++++++++++++
>  .../networking/net_cachelines/tcp_sock.rst    | 148 ++++++++++++++++

So none of this changelog tells us anything about this documentation you
are adding or what readers are supposed to gain from it.  What are these
files?

What they are *not* is RST; you clearly have not tried a documentation
build with these files in place.  I would say that needs to be fixed,
but I do wonder if this kind of information (to the extent that I
understand what it is) isn't better placed in the source itself?  If
nothing else, I would expect it to have a somewhat higher chance of
staying current that way.

Thanks,

jon

* Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
  2023-10-17  9:06   ` Daniel Borkmann
@ 2023-10-17 16:50     ` Eric Dumazet
  2023-10-17 17:07       ` Daniel Borkmann
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2023-10-17 16:50 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Florian Fainelli, Coco Li, Jakub Kicinski, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni, netdev, Chao Wu, Wei Wang

On Tue, Oct 17, 2023 at 11:06 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 10/17/23 5:46 AM, Florian Fainelli wrote:
> > On 10/16/2023 6:47 PM, Coco Li wrote:
> >> Currently, variable-heavy structs in the networking stack is organized
> >> chronologically, logically and sometimes by cache line access.
> >>
> >> This patch series attempts to reorganize the core networking stack
> >> variables to minimize cacheline consumption during the phase of data
> >> transfer. Specifically, we looked at the TCP/IP stack and the fast
> >> path definition in TCP.
> >>
> >> For documentation purposes, we also added new files for each core data
> >> structure we considered, although not all ended up being modified due
> >> to the amount of existing cache line they span in the fast path. In
> >> the documentation, we recorded all variables we identified on the
> >> fast path and the reasons. We also hope that in the future when
> >> variables are added/modified, the document can be referred to and
> >> updated accordingly to reflect the latest variable organization.
> >
> > This is great stuff, while Eric mentioned this work during Netconf'23 one concern that came up however is how can we make sure that a future change which adds/removes/shuffles members in those structures is not going to be detrimental to the work you just did? Is there a way to "lock" the structure layout to avoid causing performance drops?
> >
> > I suppose we could use pahole before/after for these structures and ensure that the layout on a cacheline basis remains preserved, but that means adding custom scripts to CI.
>
> It should be possible without extra CI. We could probably have zero-sized markers
> as we have in sk_buff e.g. __cloned_offset[0], and use some macros to force grouping.
>
> ASSERT_CACHELINE_GROUP() could then throw a build error for example if the member is
> not within __begin_cacheline_group and __end_cacheline_group :
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 9ea3ec906b57..c664e0594da4 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2059,6 +2059,7 @@ struct net_device {
>           */
>
>          /* TX read-mostly hotpath */
> +       __begin_cacheline_group(tx_read_mostly);
>          unsigned long long      priv_flags;
>          const struct net_device_ops *netdev_ops;
>          const struct header_ops *header_ops;
> @@ -2085,6 +2086,7 @@ struct net_device {
>   #ifdef CONFIG_NET_XGRESS
>          struct bpf_mprog_entry __rcu *tcx_egress;
>   #endif
> +       __end_cacheline_group(tx_read_mostly);
>
>          /* TXRX read-mostly hotpath */
>          unsigned int            flags;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 97e7b9833db9..2a91bd4077ad 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -11523,6 +11523,9 @@ static int __init net_dev_init(void)
>
>          BUG_ON(!dev_boot_phase);
>
> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, priv_flags);
> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, netdev_ops);

Great idea, we only need to generate these automatically from the file
describing the fields (currently in Documentation/ )

I think the initial intent was to find a way to generate the layout of
the structure itself, but this looked a bit tricky.
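
One possible way to drive the assertions from a single field list, sketched in
user space: an X-macro list expands to one check per field while the struct
itself stays hand-written. Everything here (TOY_TX_FIELDS, CHECK_FIELD,
struct toy_dev, the tx_begin/tx_end markers) is hypothetical; whether the real
list would come from the Documentation/ file or from a header is left open.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* Hand-written layout; tx_begin/tx_end are zero-length markers
 * (GNU C extension). */
struct toy_dev {
	unsigned int cold_state;
	unsigned char tx_begin[0];
	unsigned long long priv_flags;
	const void *netdev_ops;
	unsigned int mtu;
	unsigned char tx_end[0];
	unsigned int flags;
};

/* Single list of the fields expected inside the TX group. */
#define TOY_TX_FIELDS(X) \
	X(priv_flags) \
	X(netdev_ops) \
	X(mtu)

/* Expands to one compile-time check per listed field. */
#define CHECK_FIELD(name) \
	static_assert(offsetof(struct toy_dev, name) >= \
		      offsetof(struct toy_dev, tx_begin) && \
		      offsetof(struct toy_dev, name) + \
		      sizeof(((struct toy_dev *)0)->name) <= \
		      offsetof(struct toy_dev, tx_end), \
		      #name " left the tx group");

TOY_TX_FIELDS(CHECK_FIELD)
```

Adding a field to the list without placing it between the markers then breaks
the build, so the list and the layout cannot drift apart silently.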

* Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
  2023-10-17 16:50     ` Eric Dumazet
@ 2023-10-17 17:07       ` Daniel Borkmann
  2023-10-17 21:21         ` Daniel Borkmann
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Borkmann @ 2023-10-17 17:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Florian Fainelli, Coco Li, Jakub Kicinski, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni, netdev, Chao Wu, Wei Wang

On 10/17/23 6:50 PM, Eric Dumazet wrote:
> On Tue, Oct 17, 2023 at 11:06 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 10/17/23 5:46 AM, Florian Fainelli wrote:
>>> On 10/16/2023 6:47 PM, Coco Li wrote:
>>>> Currently, variable-heavy structs in the networking stack is organized
>>>> chronologically, logically and sometimes by cache line access.
>>>>
>>>> This patch series attempts to reorganize the core networking stack
>>>> variables to minimize cacheline consumption during the phase of data
>>>> transfer. Specifically, we looked at the TCP/IP stack and the fast
>>>> path definition in TCP.
>>>>
>>>> For documentation purposes, we also added new files for each core data
>>>> structure we considered, although not all ended up being modified due
>>>> to the amount of existing cache line they span in the fast path. In
>>>> the documentation, we recorded all variables we identified on the
>>>> fast path and the reasons. We also hope that in the future when
>>>> variables are added/modified, the document can be referred to and
>>>> updated accordingly to reflect the latest variable organization.
>>>
>>> This is great stuff, while Eric mentioned this work during Netconf'23 one concern that came up however is how can we make sure that a future change which adds/removes/shuffles members in those structures is not going to be detrimental to the work you just did? Is there a way to "lock" the structure layout to avoid causing performance drops?
>>>
>>> I suppose we could use pahole before/after for these structures and ensure that the layout on a cacheline basis remains preserved, but that means adding custom scripts to CI.
>>
>> It should be possible without extra CI. We could probably have zero-sized markers
>> as we have in sk_buff e.g. __cloned_offset[0], and use some macros to force grouping.
>>
>> ASSERT_CACHELINE_GROUP() could then throw a build error for example if the member is
>> not within __begin_cacheline_group and __end_cacheline_group :
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 9ea3ec906b57..c664e0594da4 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2059,6 +2059,7 @@ struct net_device {
>>            */
>>
>>           /* TX read-mostly hotpath */
>> +       __begin_cacheline_group(tx_read_mostly);
>>           unsigned long long      priv_flags;
>>           const struct net_device_ops *netdev_ops;
>>           const struct header_ops *header_ops;
>> @@ -2085,6 +2086,7 @@ struct net_device {
>>    #ifdef CONFIG_NET_XGRESS
>>           struct bpf_mprog_entry __rcu *tcx_egress;
>>    #endif
>> +       __end_cacheline_group(tx_read_mostly);
>>
>>           /* TXRX read-mostly hotpath */
>>           unsigned int            flags;
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 97e7b9833db9..2a91bd4077ad 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -11523,6 +11523,9 @@ static int __init net_dev_init(void)
>>
>>           BUG_ON(!dev_boot_phase);
>>
>> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, priv_flags);
>> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, netdev_ops);

Nit: this should have been something like:

   ASSERT_CACHELINE_GROUP(struct net_device, netdev_ops, tx_read_mostly)

> Great idea, we only need to generate these automatically from the file
> describing the fields (currently in Documentation/ )
> 
> I think the initial intent was to find a way to generate the layout of
> the structure itself, but this looked a bit tricky.

Agreed. Ideally this could be scripted from the Documentation/ file of this
series, and perhaps the latter would not even be needed if the layout is
self-documented in code behind some macro magic, with a BUILD_BUG_ON assertion
that uses offsetof to check that the field lies within the markers.

* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17  1:57   ` Andrew Lunn
@ 2023-10-17 18:10     ` Eric Dumazet
  2023-10-17 19:09       ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2023-10-17 18:10 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Coco Li, Jakub Kicinski, Neal Cardwell, Mubashir Adnan Qureshi,
	Paolo Abeni, netdev, Chao Wu, Wei Wang, David Ahern

On Tue, Oct 17, 2023 at 3:57 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Tue, Oct 17, 2023 at 01:47:13AM +0000, Coco Li wrote:
> > From: Chao Wu <wwchao@google.com>
> >
> > Reorganize fast path variables on tx-txrx-rx order.
> > Fast path cacheline ends afer LINUX_MIB_DELAYEDACKLOCKED.
> > There are only read-write variables here.
> >
> > Below data generated with pahole on x86 architecture.
> >
> > Fast path variables span cache lines before change: 12
> > Fast path variables span cache lines after change: 2
>
> As i pointed out for the first version, this is a UAPI file.
>
> Please could you add some justification that this does not cause any
> UAPI changes. Will old user space binaries still work after this?
>
> Thanks
>         Andrew

I do not think the particular order is really UAPI. I am not sure why these
were put in uapi in the first place.

The kernel exports these counters with a leading line containing the names of
the metrics.

We have added fields in the past and nothing broke.

So the answer is: user space binaries that do not ignore the names of the
metrics will work as before.

nstat is one of the standard binaries.
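
To make the "look up by name" point concrete, here is a minimal sketch of a
parser that pairs a names line with a values line, in the style of
/proc/net/netstat, and finds a counter by its name rather than its position.
lookup_counter() is a made-up helper for illustration, not how nstat is
implemented, and the 8192-byte line limit is an assumption.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up counter KEY in a names/values line pair such as
 *   "TcpExt: SyncookiesSent DelayedACKLocked"
 *   "TcpExt: 0 42"
 * Returns 0 and stores the value in *out, or -1 if KEY is absent.
 * Lines longer than the local buffers are truncated (sketch only).
 */
int lookup_counter(const char *names, const char *values,
		   const char *key, unsigned long long *out)
{
	char nbuf[8192], vbuf[8192];
	char *nsave, *vsave, *n, *v;

	assert(names && values && key && out);

	/* strtok_r modifies its input, so work on private copies. */
	snprintf(nbuf, sizeof(nbuf), "%s", names);
	snprintf(vbuf, sizeof(vbuf), "%s", values);

	/* Walk both lines in lockstep: token i names value i. */
	n = strtok_r(nbuf, " \n", &nsave);
	v = strtok_r(vbuf, " \n", &vsave);
	while (n && v) {
		if (strcmp(n, key) == 0) {
			*out = strtoull(v, NULL, 10);
			return 0;
		}
		n = strtok_r(NULL, " \n", &nsave);
		v = strtok_r(NULL, " \n", &vsave);
	}
	return -1;
}
```

A reader written this way gets the same answer whether DelayedACKLocked is the
second column or the first, which is why reordering the enum should be
invisible to it.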

* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17 18:10     ` Eric Dumazet
@ 2023-10-17 19:09       ` Andrew Lunn
  2023-10-17 19:15         ` Eric Dumazet
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Lunn @ 2023-10-17 19:09 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Coco Li, Jakub Kicinski, Neal Cardwell, Mubashir Adnan Qureshi,
	Paolo Abeni, netdev, Chao Wu, Wei Wang, David Ahern

On Tue, Oct 17, 2023 at 08:10:21PM +0200, Eric Dumazet wrote:
> On Tue, Oct 17, 2023 at 3:57 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Tue, Oct 17, 2023 at 01:47:13AM +0000, Coco Li wrote:
> > > From: Chao Wu <wwchao@google.com>
> > >
> > > Reorganize fast path variables on tx-txrx-rx order.
> > > Fast path cacheline ends afer LINUX_MIB_DELAYEDACKLOCKED.
> > > There are only read-write variables here.
> > >
> > > Below data generated with pahole on x86 architecture.
> > >
> > > Fast path variables span cache lines before change: 12
> > > Fast path variables span cache lines after change: 2
> >
> > As i pointed out for the first version, this is a UAPI file.
> >
> > Please could you add some justification that this does not cause any
> > UAPI changes. Will old user space binaries still work after this?
> >
> > Thanks
> >         Andrew
> 
> I do not think the particular order is really UAPI. Not sure why they
> were pushed in uapi in the first place.
> 
> Kernel exports these counters with a leading line with the names of the metrics.
> 
> We already in the past added fields and nothing broke.
> 
> So the answer is : user space binaries not ignoring the names of the
> metrics will work as before.
> 
> nstat is one of the standard binary.

This is the sort of thing which I think should be in the commit
message. It makes it clear somebody has thought about this, and they
think the risk is minimal. Without such a comment, somebody will ask
if changing a uapi file is safe.

   Andrew

* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17 19:09       ` Andrew Lunn
@ 2023-10-17 19:15         ` Eric Dumazet
  2023-10-18 14:55           ` David Ahern
  0 siblings, 1 reply; 19+ messages in thread
From: Eric Dumazet @ 2023-10-17 19:15 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Coco Li, Jakub Kicinski, Neal Cardwell, Mubashir Adnan Qureshi,
	Paolo Abeni, netdev, Chao Wu, Wei Wang, David Ahern

On Tue, Oct 17, 2023 at 9:10 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Tue, Oct 17, 2023 at 08:10:21PM +0200, Eric Dumazet wrote:
> > On Tue, Oct 17, 2023 at 3:57 AM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Tue, Oct 17, 2023 at 01:47:13AM +0000, Coco Li wrote:
> > > > From: Chao Wu <wwchao@google.com>
> > > >
> > > > Reorganize fast path variables on tx-txrx-rx order.
> > > > Fast path cacheline ends afer LINUX_MIB_DELAYEDACKLOCKED.
> > > > There are only read-write variables here.
> > > >
> > > > Below data generated with pahole on x86 architecture.
> > > >
> > > > Fast path variables span cache lines before change: 12
> > > > Fast path variables span cache lines after change: 2
> > >
> > > As i pointed out for the first version, this is a UAPI file.
> > >
> > > Please could you add some justification that this does not cause any
> > > UAPI changes. Will old user space binaries still work after this?
> > >
> > > Thanks
> > >         Andrew
> >
> > I do not think the particular order is really UAPI. Not sure why they
> > were pushed in uapi in the first place.
> >
> > Kernel exports these counters with a leading line with the names of the metrics.
> >
> > We already in the past added fields and nothing broke.
> >
> > So the answer is : user space binaries not ignoring the names of the
> > metrics will work as before.
> >
> > nstat is one of the standard binary.
>
> This is the sort of thing which i think should be in the commit
> message. It makes it clear somebody has thought about this, and they
> think the risk is minimal. Without such a comment, somebody will ask
> if changing to a uapi file is safe.

Sure, although we never said such a thing in prior changes.

Perhaps add a big comment in the file itself, instead of repeating it
in future commit changelogs?

* Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption
  2023-10-17 17:07       ` Daniel Borkmann
@ 2023-10-17 21:21         ` Daniel Borkmann
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Borkmann @ 2023-10-17 21:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Florian Fainelli, Coco Li, Jakub Kicinski, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni, netdev, Chao Wu, Wei Wang

On 10/17/23 7:07 PM, Daniel Borkmann wrote:
> On 10/17/23 6:50 PM, Eric Dumazet wrote:
[...]
>> Great idea, we only need to generate these automatically from the file
>> describing the fields (currently in Documentation/ )
>>
>> I think the initial intent was to find a way to generate the layout of
>> the structure itself, but this looked a bit tricky.
> 
> Agree, ideally this could be scripted from the Documentation/ file of this
> series, and perhaps the latter may not even be needed then if we have it
> self-documented in code behind some macro magic with BUILD_BUG_ON assertion
> which probes offsetof wrt the field being within markers.

... I've been playing around a bit. It could perhaps be made nicer, but this
seems to do it, and pahole will also show the markers:

  include/linux/cache.h     | 26 ++++++++++++++++++++++++++
  include/linux/netdevice.h |  2 ++
  net/core/dev.c            | 25 +++++++++++++++++++++++++
  3 files changed, 53 insertions(+)

diff --git a/include/linux/cache.h b/include/linux/cache.h
index 9900d20b76c2..f7e166b2897a 100644
--- a/include/linux/cache.h
+++ b/include/linux/cache.h
@@ -85,6 +85,32 @@
  #define cache_line_size()	L1_CACHE_BYTES
  #endif

+#ifndef __cacheline_group_begin
+#define __cacheline_group_begin(GROUP) \
+	__u8 __cacheline_group_begin__##GROUP[0]
+#endif
+
+#ifndef __cacheline_group_end
+#define __cacheline_group_end(GROUP) \
+	__u8 __cacheline_group_end__##GROUP[0]
+#endif
+
+#ifndef CACHELINE_ASSERT_GROUP_MEMBER
+#define CACHELINE_ASSERT_GROUP_MEMBER(TYPE, GROUP, MEMBER) \
+	BUILD_BUG_ON(!(offsetof(TYPE, MEMBER) >= \
+		       offsetofend(TYPE, __cacheline_group_begin__##GROUP) && \
+		       offsetofend(TYPE, MEMBER) <= \
+		       offsetof(TYPE, __cacheline_group_end__##GROUP)))
+#endif
+
+#ifndef CACHELINE_ASSERT_GROUP_MAXSIZ
+#define CACHELINE_ASSERT_MIN_BOUNDARY 64
+#define CACHELINE_ASSERT_GROUP_MAXSIZ(TYPE, GROUP, NUM) \
+	BUILD_BUG_ON(offsetof(TYPE, __cacheline_group_end__##GROUP) - \
+		     offsetofend(TYPE, __cacheline_group_begin__##GROUP) > \
+		     ((NUM) * CACHELINE_ASSERT_MIN_BOUNDARY))
+#endif
+
  /*
   * Helper to add padding within a struct to ensure data fall into separate
   * cachelines.
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d72b71b76bf8..7a47d43b95de 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2059,6 +2059,7 @@ struct net_device {
  	 */

  	/* TX read-mostly hotpath */
+	__cacheline_group_begin(tx_read_mostly);
  	unsigned long long	priv_flags;
  	const struct net_device_ops *netdev_ops;
  	const struct header_ops *header_ops;
@@ -2082,6 +2083,7 @@ struct net_device {
  #ifdef CONFIG_NETFILTER_EGRESS
  	struct nf_hook_entries __rcu *nf_hooks_egress;
  #endif
+	__cacheline_group_end(tx_read_mostly);

  	/* TXRX read-mostly hotpath */
  	unsigned int		flags;
diff --git a/net/core/dev.c b/net/core/dev.c
index 4420831180c6..5f6b88c2c902 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11515,6 +11515,29 @@ static struct pernet_operations __net_initdata default_device_ops = {
   *
   */

+static void __init net_dev_struct_check(void)
+{
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, priv_flags);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, netdev_ops);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, header_ops);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, _tx);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, real_num_tx_queues);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, gso_max_size);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, gso_ipv4_max_size);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, gso_max_segs);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, num_tc);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, mtu);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, needed_headroom);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, tc_to_txq);
+#ifdef CONFIG_XPS
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, xps_maps);
+#endif
+#ifdef CONFIG_NETFILTER_EGRESS
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, tx_read_mostly, nf_hooks_egress);
+#endif
+	CACHELINE_ASSERT_GROUP_MAXSIZ(struct net_device, tx_read_mostly, 3);
+}
+
  /*
   *       This is called single threaded during boot, so no need
   *       to take the rtnl semaphore.
@@ -11525,6 +11548,8 @@ static int __init net_dev_init(void)

  	BUG_ON(!dev_boot_phase);

+	net_dev_struct_check();
+
  	if (dev_proc_init())
  		goto out;
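
A side note on what the size check above does and does not cover, as a small
sketch: CACHELINE_ASSERT_GROUP_MAXSIZ() bounds the byte size of the group, while
the number of cache lines actually touched also depends on where the group
starts. cachelines_spanned() below is a hypothetical helper, not part of the
patch, that computes the latter for a given byte range.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of LINE-sized cache lines touched by the byte range
 * [begin, end). An empty range touches none.
 */
size_t cachelines_spanned(size_t begin, size_t end, size_t line)
{
	assert(line > 0 && begin <= end);
	if (begin == end)
		return 0;
	/* Index of the last touched line minus index of the first, plus one. */
	return (end - 1) / line - begin / line + 1;
}
```

For instance, a 128-byte group starting at offset 32 would pass a size check
with NUM = 2 but touches three 64-byte lines, so an alignment assertion on the
begin marker might be worth considering as well.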


* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-17 19:15         ` Eric Dumazet
@ 2023-10-18 14:55           ` David Ahern
  2023-10-18 15:11             ` Andrew Lunn
  0 siblings, 1 reply; 19+ messages in thread
From: David Ahern @ 2023-10-18 14:55 UTC (permalink / raw)
  To: Eric Dumazet, Andrew Lunn
  Cc: Coco Li, Jakub Kicinski, Neal Cardwell, Mubashir Adnan Qureshi,
	Paolo Abeni, netdev, Chao Wu, Wei Wang

On 10/17/23 1:15 PM, Eric Dumazet wrote:
> Perhaps add a big comment in the file itself, instead of repeating it
> on future commit changelogs ?

I think a comment in the file would be better. I spent a fair amount of
time reviewing code double checking the impact of the moves; a comment
in that header file would have been helpful.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables
  2023-10-18 14:55           ` David Ahern
@ 2023-10-18 15:11             ` Andrew Lunn
  0 siblings, 0 replies; 19+ messages in thread
From: Andrew Lunn @ 2023-10-18 15:11 UTC (permalink / raw)
  To: David Ahern
  Cc: Eric Dumazet, Coco Li, Jakub Kicinski, Neal Cardwell,
	Mubashir Adnan Qureshi, Paolo Abeni, netdev, Chao Wu, Wei Wang

On Wed, Oct 18, 2023 at 08:55:44AM -0600, David Ahern wrote:
> On 10/17/23 1:15 PM, Eric Dumazet wrote:
> > Perhaps add a big comment in the file itself, instead of repeating it
> > on future commit changelogs ?
> 
> I think a comment in the file would be better. I spent a fair amount of
> time reviewing code double checking the impact of the moves; a comment
> in that header file would have been helpful.

We probably want both.

A patch to a uapi file is something which, for a reviewer, immediately
raises the question of whether it is going to break backwards
compatibility. Having it clearly mentioned in the commit message
answers that question immediately. I would say it is best practice
to do so.

Patching the header itself makes a lot of sense if we actually think
it is useless for it to be in uapi.

   Andrew

Thread overview: 19+ messages
2023-10-17  1:47 [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Coco Li
2023-10-17  1:47 ` [PATCH v2 net-next 1/5] Documentations: Analyze heavily used Networking related structs Coco Li
2023-10-17 14:57   ` Jonathan Corbet
2023-10-17  1:47 ` [PATCH v2 net-next 2/5] net-smnp: reorganize SNMP fast path variables Coco Li
2023-10-17  1:57   ` Andrew Lunn
2023-10-17 18:10     ` Eric Dumazet
2023-10-17 19:09       ` Andrew Lunn
2023-10-17 19:15         ` Eric Dumazet
2023-10-18 14:55           ` David Ahern
2023-10-18 15:11             ` Andrew Lunn
2023-10-17  1:47 ` [PATCH v2 net-next 3/5] netns-ipv4: reorganize netns_ipv4 " Coco Li
2023-10-17  1:47 ` [PATCH v2 net-next 4/5] net-device: reorganize net_device " Coco Li
2023-10-17  8:52   ` Daniel Borkmann
2023-10-17  1:47 ` [PATCH v2 net-next 5/5] tcp: reorganize tcp_sock " Coco Li
2023-10-17  3:46 ` [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption Florian Fainelli
2023-10-17  9:06   ` Daniel Borkmann
2023-10-17 16:50     ` Eric Dumazet
2023-10-17 17:07       ` Daniel Borkmann
2023-10-17 21:21         ` Daniel Borkmann
