Netdev Archive on lore.kernel.org
* [RFC PATCH 0/4] Reducing memory usage of i40e for kdump
@ 2021-02-22  7:06 Coiby Xu
  2021-02-22  7:06 ` [RFC PATCH 1/4] i40e: use minimal tx and rx pairs " Coiby Xu
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Coiby Xu @ 2021-02-22  7:06 UTC (permalink / raw)
  To: netdev
  Cc: kexec, intel-wired-lan, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh,
	open list:BPF (Safe dynamic programs and tools)

Currently, i40e consumes so much memory that kdump fails.

After reducing the allocation of tx/rx/arq/asq ring buffers to the
minimum, the memory consumption is significantly reduced:
    - x86_64: 85.1MB to 1.2MB 
    - POWER9: 15368.5MB to 20.8MB

i40iw consumes even more memory. On the above x86_64 machine, it
alone consumes 1513.7MB. So don't register an i40e client driver
in a kdump kernel.

After applying this patch set, we can still achieve a network speed of
100MB+/s, which I think is limited by the network link (1000Mb/s). This
is sufficient for kdump.
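
As a sanity check on the figures above, the page counts in the
memstrack reports below can be converted back to megabytes. This is
only a sketch: it assumes 4 KiB pages on the x86_64 machine and 64 KiB
pages on the POWER9 machine (neither page size is stated in the
reports; both are inferred from the reported totals), and
pages_to_mib() is an illustrative helper, not part of the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page sizes, inferred from the totals in the reports. */
#define PAGE_SIZE_X86_64 (4u * 1024u)   /* 4 KiB  */
#define PAGE_SIZE_POWER9 (64u * 1024u)  /* 64 KiB */

/* Convert a page count to MiB, the unit memstrack prints as "MB". */
static double pages_to_mib(uint64_t pages, uint32_t page_size)
{
	assert(page_size > 0);
	return (double)pages * page_size / (1024.0 * 1024.0);
}
```

Under these assumptions, 387507 pages of 4 KiB come to about 1513.7MB
(i40iw on x86_64) and 245896 pages of 64 KiB come to 15368.5MB (i40e on
POWER9), which matches the totals printed in the reports.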

memstrack report for the x86_64 machine
=======================================

After applying this patch set,

    ======== Report format module_summary: ========
    Module i40e using 20.8MB (332 pages), peak allocation 20.9MB (335 pages)
    Module i2c_core using 19.4MB (310 pages), peak allocation 22.8MB (365 pages)
    ======== Report format module_summary END ========
    
    ======== Report format module_top: ========
    Top stack usage of module i40e:
      (null) Pages: 332 (peak: 335)
        system_call_common (0xc00000000000d260) Pages: 267 (peak: 268)
          system_call_exception (0xc000000000034334) Pages: 267 (peak: 268)
            __sys_sendmsg (0xc000000000fd727c) Pages: 267 (peak: 268)
              ___sys_sendmsg (0xc000000000fd22ec) Pages: 267 (peak: 268)
                sock_sendmsg (0xc000000000fd0a90) Pages: 267 (peak: 268)
                  netlink_sendmsg (0xc0000000010dbd4c) Pages: 267 (peak: 268)
                    netlink_unicast (0xc0000000010db948) Pages: 267 (peak: 268)
                      rtnetlink_rcv (0xc000000001033058) Pages: 267 (peak: 268)
                        netlink_rcv_skb (0xc0000000010dc534) Pages: 267 (peak: 268)
                          rtnetlink_rcv_msg (0xc0000000010340fc) Pages: 267 (peak: 268)
                            rtnl_newlink (0xc000000001038290) Pages: 267 (peak: 268)
                              __rtnl_newlink (0xc000000001037d64) Pages: 267 (peak: 268)
                                do_setlink (0xc00000000103626c) Pages: 267 (peak: 268)
                                  dev_change_flags (0xc00000000101e2fc) Pages: 267 (peak: 268)
                                    __dev_change_flags (0xc00000000101e1fc) Pages: 267 (peak: 268)
                                      __dev_open (0xc00000000101dda8) Pages: 267 (peak: 268)
                                        i40e_open i40e (0xc00800000851a238) Pages: 267 (peak: 268)
                                          i40e_vsi_open i40e (0xc008000008519f54) Pages: 252 (peak: 252)
                                            i40e_vsi_configure i40e (0xc0080000085093ac) Pages: 252 (peak: 252)
                                              i40e_configure_rx_ring i40e (0xc0080000085055b0) Pages: 252 (peak: 252)
                                                i40e_alloc_rx_buffers i40e (0xc008000008540d1c) Pages: 252 (peak: 252)
                                                  __alloc_pages_nodemask (0xc0000000004d74e0) Pages: 252 (peak: 252)
                                                    (null) Pages: 252 (peak: 252)
                                                      __traceiter_mm_page_alloc (0xc00000000047c754) Pages: 504 (peak: 504)
    

Before applying this patch set,

    ======== Report format module_summary: ========
    Module i40iw using 1513.7MB (387507 pages), peak allocation 1513.7MB (387507 pages)
    Module i40e using 85.8MB (21977 pages), peak allocation 87.0MB (22276 pages)
    Module xfs using 1.2MB (299 pages), peak allocation 1.2MB (300 pages)
    Module rdma_ucm using 0.8MB (210 pages), peak allocation 0.8MB (211 pages)
    Module ib_uverbs using 0.5MB (131 pages), peak allocation 3.8MB (971 pages)
    Module ib_iser using 0.4MB (109 pages), peak allocation 0.4MB (109 pages)
    Module target_core_mod using 0.4MB (109 pages), peak allocation 0.4MB (111 pages)
    Module rdma_cm using 0.2MB (46 pages), peak allocation 0.2MB (46 pages)
    Module e1000e using 0.2MB (46 pages), peak allocation 0.2MB (46 pages)
    Module ib_core using 0.2MB (45 pages), peak allocation 0.2MB (45 pages)
    Module iw_cm using 0.2MB (44 pages), peak allocation 0.2MB (44 pages)
    Module scsi_transport_iscsi using 0.1MB (20 pages), peak allocation 0.1MB (20 pages)
    Module ib_isert using 0.1MB (19 pages), peak allocation 0.1MB (19 pages)
    Module iscsi_target_mod using 0.1MB (17 pages), peak allocation 0.1MB (17 pages)
    Module libiscsi using 0.1MB (15 pages), peak allocation 0.1MB (15 pages)
    Module ib_cm using 0.1MB (14 pages), peak allocation 0.1MB (14 pages)
    Module ib_srpt using 0.0MB (9 pages), peak allocation 0.0MB (9 pages)
    Module rpcrdma using 0.0MB (0 pages), peak allocation 0.0MB (0 pages)
    ======== Report format module_summary END ========
    
    ======== Report format module_top: ========
    Top stack usage of module i40iw:
      (null) Pages: 387507 (peak: 387507)
        ret_from_fork (0xffffffffb5000255) Pages: 387507 (peak: 387507)
          kthread (0xffffffffb4700696) Pages: 387507 (peak: 387507)
            worker_thread (0xffffffffb46facf0) Pages: 387507 (peak: 387507)
              process_one_work (0xffffffffb46fa627) Pages: 387507 (peak: 387507)
                i40e_service_task i40e (0xffffffffc1146183) Pages: 387507 (peak: 387507)
                  i40e_client_subtask i40e (0xffffffffc1163a34) Pages: 387507 (peak: 387507)
                    i40iw_open.part.14 i40iw (0xffffffffc11b0ea9) Pages: 344064 (peak: 344064)
                      i40iw_sc_create_hmc_obj i40iw (0xffffffffc11ae4c9) Pages: 344064 (peak: 344064)
                        i40iw_add_sd_table_entry i40iw (0xffffffffc11ae079) Pages: 344064 (peak: 344064)
                          i40iw_allocate_dma_mem i40iw (0xffffffffc11b7517) Pages: 344064 (peak: 344064)
                            dma_direct_alloc_pages (0xffffffffb4762035) Pages: 344064 (peak: 344064)
                              __dma_direct_alloc_pages (0xffffffffb4761f04) Pages: 344064 (peak: 344064)
                                __alloc_pages_nodemask (0xffffffffb48b7367) Pages: 344064 (peak: 344064)
                                  __alloc_pages_nodemask (0xffffffffb48b7367) Pages: 688128 (peak: 688128)
                    i40iw_open.part.14 i40iw (0xffffffffc11b1278) Pages: 25883 (peak: 25883)
                      i40iw_puda_create_rsrc i40iw (0xffffffffc11b47da) Pages: 24576 (peak: 24576)
                        i40iw_allocate_dma_mem i40iw (0xffffffffc11b7517) Pages: 24576 (peak: 24576)
                          dma_direct_alloc_pages (0xffffffffb4762035) Pages: 24576 (peak: 24576)
                            __dma_direct_alloc_pages (0xffffffffb4761f04) Pages: 24576 (peak: 24576)
                              __alloc_pages_nodemask (0xffffffffb48b7367) Pages: 24576 (peak: 24576)
                                __alloc_pages_nodemask (0xffffffffb48b7367) Pages: 49152 (peak: 49152)
                      i40iw_puda_create_rsrc i40iw (0xffffffffc11b485a) Pages: 731 (peak: 731)
                        i40iw_allocate_virt_mem i40iw (0xffffffffc11b758d) Pages: 731 (peak: 731)
    



memstrack report for the POWER9 machine
=======================================

After applying this patch set,

    ======== Report format module_summary: ========
    Module i40e using 1.2MB (316 pages), peak allocation 1.4MB (369 pages)
    ======== Report format module_summary END ========
    
    ======== Report format module_top: ========
    Top stack usage of module i40e:
      (null) Pages: 316 (peak: 369)
        i40e_init_interrupt_scheme i40e (0xffffffffc03f85f8) Pages: 79 (peak: 79)
          __pci_enable_msix_range.part.0 (0xffffffff966b9878) Pages: 69 (peak: 69)
            __msi_domain_alloc_irqs (0xffffffff9614c31b) Pages: 69 (peak: 69)
              __irq_domain_alloc_irqs (0xffffffff96149fb5) Pages: 35 (peak: 35)
                irq_domain_alloc_descs.part.0 (0xffffffff961489e5) Pages: 35 (peak: 35)
                  __irq_alloc_descs (0xffffffff96bc0583) Pages: 22 (peak: 22)
                    kobject_add (0xffffffff9666292e) Pages: 22 (peak: 22)
                      kobject_add_internal (0xffffffff966622e2) Pages: 22 (peak: 22)
                        internal_create_groups.part.0 (0xffffffff963d962d) Pages: 22 (peak: 22)
                          internal_create_group (0xffffffff963d8f56) Pages: 22 (peak: 22)
                            sysfs_add_file_mode_ns (0xffffffff963d837e) Pages: 22 (peak: 22)
                              __kernfs_create_file (0xffffffff963d7865) Pages: 22 (peak: 22)
                                kernfs_new_node (0xffffffff963d5ad3) Pages: 22 (peak: 22)
                                  __kernfs_new_node (0xffffffff963d4e4e) Pages: 18 (peak: 18)
                                    kmem_cache_alloc (0xffffffff962e8b04) Pages: 18 (peak: 18)
                                      __slab_alloc (0xffffffff962e891c) Pages: 18 (peak: 18)
                                        ___slab_alloc (0xffffffff962e875c) Pages: 18 (peak: 18)
                                          allocate_slab (0xffffffff962e6223) Pages: 18 (peak: 18)
                                            __alloc_pages_nodemask (0xffffffff962c07f5) Pages: 18 (peak: 18)
                                              __alloc_pages_nodemask (0xffffffff962c07f5) Pages: 36 (peak: 36)
    ======== Report format module_top END ========


Before applying this patch set,


    ======== Report format module_summary: ========
    Module i40e using 15368.5MB (245896 pages), peak allocation 15368.6MB (245897 pages)
    Module bpf using 5.8MB (92 pages), peak allocation 7.4MB (118 pages)
    Module xfs using 0.8MB (12 pages), peak allocation 0.8MB (12 pages)
    ======== Report format module_summary END ========
    
    ======== Report format module_top: ========
    Top stack usage of module i40e:
      (null) Pages: 245896 (peak: 245897)
        system_call_common (0xc00000000000d260) Pages: 243801 (peak: 243801)
          system_call_exception (0xc000000000034254) Pages: 243801 (peak: 243801)
            __sys_sendmsg (0xc000000000fd221c) Pages: 243801 (peak: 243801)
              ___sys_sendmsg (0xc000000000fcd28c) Pages: 243801 (peak: 243801)
                sock_sendmsg (0xc000000000fcba30) Pages: 243801 (peak: 243801)
                  netlink_sendmsg (0xc0000000010d69ac) Pages: 243801 (peak: 243801)
                    netlink_unicast (0xc0000000010d65a8) Pages: 243801 (peak: 243801)
                      rtnetlink_rcv (0xc00000000102de78) Pages: 243801 (peak: 243801)
                        netlink_rcv_skb (0xc0000000010d7194) Pages: 243801 (peak: 243801)
                          rtnetlink_rcv_msg (0xc00000000102ef1c) Pages: 243801 (peak: 243801)
                            rtnl_newlink (0xc0000000010330b0) Pages: 243801 (peak: 243801)
                              __rtnl_newlink (0xc000000001032b84) Pages: 243801 (peak: 243801)
                                do_setlink (0xc00000000103108c) Pages: 243801 (peak: 243801)
                                  dev_change_flags (0xc00000000101917c) Pages: 243801 (peak: 243801)
                                    __dev_change_flags (0xc00000000101907c) Pages: 243801 (peak: 243801)
                                      __dev_open (0xc000000001018c28) Pages: 243801 (peak: 243801)
                                        i40e_open i40e (0xc0080000083be3c8) Pages: 243801 (peak: 243801)
                                          i40e_vsi_open i40e (0xc0080000083be074) Pages: 242655 (peak: 242655)
                                            i40e_vsi_configure i40e (0xc0080000083a8efc) Pages: 242647 (peak: 242647)
                                              i40e_configure_rx_ring i40e (0xc0080000083a6a60) Pages: 242583 (peak: 242583)
                                                i40e_alloc_rx_buffers i40e (0xc0080000083e9280) Pages: 242583 (peak: 242583)
                                                  __alloc_pages_nodemask (0xc0000000004d66c0) Pages: 242583 (peak: 242583)
                                                    (null) Pages: 242583 (peak: 242583)
                                                      __traceiter_mm_page_alloc (0xc00000000047bab4) Pages: 485166 (peak: 485166)
                                              i40e_configure_rx_ring i40e (0xc0080000083a6850) Pages: 64 (peak: 64)
                                                i40e_alloc_rx_bi i40e (0xc0080000083e8cec) Pages: 64 (peak: 64)
                                                  __kmalloc (0xc00000000051cc54) Pages: 64 (peak: 64)
                                                    ___slab_alloc (0xc00000000051c4a0) Pages: 64 (peak: 64)
                                                      allocate_slab (0xc000000000517f94) Pages: 64 (peak: 64)
                                                        alloc_pages_current (0xc0000000005054f0) Pages: 64 (peak: 64)
                                                          __alloc_pages_nodemask (0xc0000000004d66c0) Pages: 64 (peak: 64)
                                                            (null) Pages: 64 (peak: 64)
                                                              __traceiter_mm_page_alloc (0xc00000000047bab4) Pages: 128 (peak: 128)
                                            i40e_vsi_configure i40e (0xc0080000083a8e4c) Pages: 8 (peak: 8)
                                              i40e_configure_tx_ring i40e (0xc0080000083a8b10) Pages: 8 (peak: 8)
                                                netif_set_xps_queue (0xc00000000100d5b4) Pages: 8 (peak: 8)
                                                  __netif_set_xps_queue (0xc00000000100c77c) Pages: 6 (peak: 6)
                                                    __kmalloc (0xc00000000051cc54) Pages: 6 (peak: 6)
                                                      ___slab_alloc (0xc00000000051c4a0) Pages: 6 (peak: 6)
                                                        allocate_slab (0xc000000000517f94) Pages: 6 (peak: 6)
                                                          alloc_pages_current (0xc0000000005054f0) Pages: 6 (peak: 6)
                                                            __alloc_pages_nodemask (0xc0000000004d66c0) Pages: 6 (peak: 6)
                                                              (null) Pages: 6 (peak: 6)
                                                                __traceiter_mm_page_alloc (0xc00000000047bab4) Pages: 12 (peak: 12)
    ...
    ======== Report format module_top END ========

Coiby Xu (4):
  i40e: use minimal tx and rx pairs for kdump
  i40e: use minimal rx and tx ring buffers for kdump
  i40e: use minimal admin queue for kdump
  i40e: don't start i40iw client for kdump

 drivers/net/ethernet/intel/i40e/i40e.h        |  2 ++
 drivers/net/ethernet/intel/i40e/i40e_client.c |  7 ++++++
 drivers/net/ethernet/intel/i40e/i40e_main.c   | 23 +++++++++++++++++--
 3 files changed, 30 insertions(+), 2 deletions(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/4] i40e: use minimal tx and rx pairs for kdump
  2021-02-22  7:06 [RFC PATCH 0/4] Reducing memory usage of i40e for kdump Coiby Xu
@ 2021-02-22  7:06 ` Coiby Xu
  2021-02-22  7:06 ` [RFC PATCH 2/4] i40e: use minimal rx and tx ring buffers " Coiby Xu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Coiby Xu @ 2021-02-22  7:06 UTC (permalink / raw)
  To: netdev
  Cc: kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, open list

Set the number of the MSI-X vectors to 1. When MSI-X is enabled,
it's not allowed to use more TC queue pairs than MSI-X vectors
(pf->num_lan_msix) exist. Thus the number of tx and rx pairs
(vsi->num_queue_pairs) will be equal to the number of MSI-X vectors,
i.e., 1.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1db482d310c2..069c86e2f69d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6,6 +6,7 @@
 #include <linux/pci.h>
 #include <linux/bpf.h>
 #include <generated/utsrelease.h>
+#include <linux/crash_dump.h>
 
 /* Local includes */
 #include "i40e.h"
@@ -15000,6 +15001,14 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if (err)
 		goto err_switch_setup;
 
+	/* Reduce tx and rx pairs for kdump
+	 * When MSI-X is enabled, it's not allowed to use more TC queue
+	 * pairs than MSI-X vectors (pf->num_lan_msix) exist. Thus
+	 * vsi->num_queue_pairs will be equal to pf->num_lan_msix, i.e., 1.
+	 */
+	if (is_kdump_kernel())
+		pf->num_lan_msix = 1;
+
 	pf->udp_tunnel_nic.set_port = i40e_udp_tunnel_set_port;
 	pf->udp_tunnel_nic.unset_port = i40e_udp_tunnel_unset_port;
 	pf->udp_tunnel_nic.flags = UDP_TUNNEL_NIC_INFO_MAY_SLEEP;
-- 
2.30.0
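
The constraint described in the commit message above can be modelled in
a few lines of userspace C. This is a simplified sketch, not driver
code: struct pf_model, limit_vectors_for_kdump() and queue_pairs() are
hypothetical names, and the real driver derives the queue count in
several steps that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-PF state. */
struct pf_model {
	int num_lan_msix; /* MSI-X vectors available for LAN queues */
};

/* Mirrors the patch: a kdump kernel keeps a single LAN vector. */
static void limit_vectors_for_kdump(struct pf_model *pf, bool is_kdump)
{
	if (is_kdump)
		pf->num_lan_msix = 1;
}

/* Queue pairs may not exceed the number of MSI-X vectors. */
static int queue_pairs(const struct pf_model *pf, int wanted)
{
	assert(wanted > 0 && pf->num_lan_msix > 0);
	return wanted < pf->num_lan_msix ? wanted : pf->num_lan_msix;
}
```

In this model, a PF with 16 vectors that asks for 32 queue pairs gets
16, and the same request after limit_vectors_for_kdump(&pf, true)
gets 1.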


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/4] i40e: use minimal rx and tx ring buffers for kdump
  2021-02-22  7:06 [RFC PATCH 0/4] Reducing memory usage of i40e for kdump Coiby Xu
  2021-02-22  7:06 ` [RFC PATCH 1/4] i40e: use minimal tx and rx pairs " Coiby Xu
@ 2021-02-22  7:06 ` Coiby Xu
  2021-02-22  7:07 ` [RFC PATCH 3/4] i40e: use minimal admin queue " Coiby Xu
  2021-02-22  7:07 ` [RFC PATCH 4/4] i40e: don't open i40iw client " Coiby Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Coiby Xu @ 2021-02-22  7:06 UTC (permalink / raw)
  To: netdev
  Cc: kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, open list

Use the minimum number of descriptors so that only minimal ring buffers
are allocated for kdump.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 069c86e2f69d..5307f1744766 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10552,6 +10552,11 @@ static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
 		return -ENODATA;
 	}
 
+	if (is_kdump_kernel()) {
+		vsi->num_tx_desc = I40E_MIN_NUM_DESCRIPTORS;
+		vsi->num_rx_desc = I40E_MIN_NUM_DESCRIPTORS;
+	}
+
 	return 0;
 }
 
-- 
2.30.0
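
To see why the descriptor count matters, here is a rough userspace
estimate of the pages pinned by rx buffers. It is a sketch under stated
assumptions: one page per rx descriptor (the real driver's buffer
layout depends on page size and configuration), and the two descriptor
counts below are what I recall from the i40e headers (64 minimum, 512
default), so treat them as assumptions rather than values taken from
this patch. rx_buffer_pages() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; check i40e.h before relying on them. */
#define MODEL_MIN_NUM_DESCRIPTORS     64u
#define MODEL_DEFAULT_NUM_DESCRIPTORS 512u

/*
 * Estimate pages held by rx buffers for one VSI, assuming each rx
 * descriptor is backed by exactly one page.
 */
static uint64_t rx_buffer_pages(unsigned int queue_pairs,
				unsigned int num_rx_desc)
{
	assert(queue_pairs > 0 && num_rx_desc > 0);
	return (uint64_t)queue_pairs * num_rx_desc;
}
```

Under this model, 8 queue pairs with the assumed default of 512
descriptors pin 4096 pages, while a single queue pair with the assumed
minimum of 64 descriptors pins 64 pages.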


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 3/4] i40e: use minimal admin queue for kdump
  2021-02-22  7:06 [RFC PATCH 0/4] Reducing memory usage of i40e for kdump Coiby Xu
  2021-02-22  7:06 ` [RFC PATCH 1/4] i40e: use minimal tx and rx pairs " Coiby Xu
  2021-02-22  7:06 ` [RFC PATCH 2/4] i40e: use minimal rx and tx ring buffers " Coiby Xu
@ 2021-02-22  7:07 ` Coiby Xu
  2021-02-22  7:07 ` [RFC PATCH 4/4] i40e: don't open i40iw client " Coiby Xu
  3 siblings, 0 replies; 11+ messages in thread
From: Coiby Xu @ 2021-02-22  7:07 UTC (permalink / raw)
  To: netdev
  Cc: kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, open list

The minimum sizes of the admin receive and send queues are 1 and 2,
respectively. The admin send queue can't be set to 1 because in that
case the firmware would fail to initialize.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h      | 2 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c | 9 +++++++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 118473dfdcbd..e106c33ff958 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -66,6 +66,8 @@
 #define I40E_FDIR_RING_COUNT		32
 #define I40E_MAX_AQ_BUF_SIZE		4096
 #define I40E_AQ_LEN			256
+#define I40E_MIN_ARQ_LEN		1
+#define I40E_MIN_ASQ_LEN		2
 #define I40E_AQ_WORK_LIMIT		66 /* max number of VFs + a little */
 #define I40E_MAX_USER_PRIORITY		8
 #define I40E_DEFAULT_TRAFFIC_CLASS	BIT(0)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5307f1744766..2fd8db80b585 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -14847,8 +14847,13 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	i40e_check_recovery_mode(pf);
 
-	hw->aq.num_arq_entries = I40E_AQ_LEN;
-	hw->aq.num_asq_entries = I40E_AQ_LEN;
+	if (is_kdump_kernel()) {
+		hw->aq.num_arq_entries = I40E_MIN_ARQ_LEN;
+		hw->aq.num_asq_entries = I40E_MIN_ASQ_LEN;
+	} else {
+		hw->aq.num_arq_entries = I40E_AQ_LEN;
+		hw->aq.num_asq_entries = I40E_AQ_LEN;
+	}
 	hw->aq.arq_buf_size = I40E_MAX_AQ_BUF_SIZE;
 	hw->aq.asq_buf_size = I40E_MAX_AQ_BUF_SIZE;
 	pf->adminq_work_limit = I40E_AQ_WORK_LIMIT;
-- 
2.30.0
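
The saving from this patch can be estimated from the values visible in
the diff above. The sketch assumes that every admin queue entry is
backed by one buffer of I40E_MAX_AQ_BUF_SIZE (4096) bytes and ignores
the descriptor rings themselves; aq_buffer_bytes() is a hypothetical
helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>

#define AQ_BUF_SIZE 4096u /* I40E_MAX_AQ_BUF_SIZE in the patch context */

/*
 * Bytes of data buffers for the admin receive and send queues,
 * assuming one AQ_BUF_SIZE buffer per queue entry.
 */
static size_t aq_buffer_bytes(unsigned int arq_entries,
			      unsigned int asq_entries)
{
	assert(arq_entries > 0 && asq_entries > 0);
	return ((size_t)arq_entries + asq_entries) * AQ_BUF_SIZE;
}
```

With the default I40E_AQ_LEN of 256 for both queues this gives 2MiB of
buffers; with I40E_MIN_ARQ_LEN (1) and I40E_MIN_ASQ_LEN (2) it gives
12KiB.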


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-22  7:06 [RFC PATCH 0/4] Reducing memory usage of i40e for kdump Coiby Xu
                   ` (2 preceding siblings ...)
  2021-02-22  7:07 ` [RFC PATCH 3/4] i40e: use minimal admin queue " Coiby Xu
@ 2021-02-22  7:07 ` Coiby Xu
  2021-02-23 20:22   ` Jakub Kicinski
  2021-02-25 10:11   ` Bhupesh SHARMA
  3 siblings, 2 replies; 11+ messages in thread
From: Coiby Xu @ 2021-02-22  7:07 UTC (permalink / raw)
  To: netdev
  Cc: kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, Jakub Kicinski, open list

i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722 for
1GbE while "crashkernel=auto" only reserved 160M. With the module
parameter "resource_profile=2", we can reduce the memory usage of i40iw
to ~300M which is still too much for kdump.

Disabling the client registration would spare us the client interface
operation open, i.e., i40iw_open for iwarp/uda device. Thus memory is
saved for kdump.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 drivers/net/ethernet/intel/i40e/i40e_client.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c b/drivers/net/ethernet/intel/i40e/i40e_client.c
index a2dba32383f6..aafc2587f389 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_client.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
@@ -4,6 +4,7 @@
 #include <linux/list.h>
 #include <linux/errno.h>
 #include <linux/net/intel/i40e_client.h>
+#include <linux/crash_dump.h>
 
 #include "i40e.h"
 #include "i40e_prototype.h"
@@ -741,6 +742,12 @@ int i40e_register_client(struct i40e_client *client)
 {
 	int ret = 0;
 
+	/* Don't open i40iw client for kdump because i40iw will consume huge
+	 * amounts of memory.
+	 */
+	if (is_kdump_kernel())
+		return ret;
+
 	if (!client) {
 		ret = -EIO;
 		goto out;
-- 
2.30.1
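
The control flow added by this patch can be illustrated with a small
userspace model. This is a sketch only: struct client_model and
register_client() are hypothetical stand-ins, and the kdump state is
passed in as a flag instead of calling is_kdump_kernel().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct i40e_client. */
struct client_model {
	const char *name;
	bool registered;
};

/*
 * Same ordering as the patch: the kdump check comes first and returns
 * 0, so a skipped registration is reported as success.
 */
static int register_client(struct client_model *client, bool is_kdump)
{
	if (is_kdump)
		return 0;
	if (!client)
		return -EIO;
	client->registered = true;
	return 0;
}
```

One consequence visible in the model is that the caller cannot tell a
skipped registration from a successful one, since both return 0.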


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-22  7:07 ` [RFC PATCH 4/4] i40e: don't open i40iw client " Coiby Xu
@ 2021-02-23 20:22   ` Jakub Kicinski
  2021-02-24 11:41     ` Coiby Xu
  2021-02-25 10:11   ` Bhupesh SHARMA
  1 sibling, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-02-23 20:22 UTC (permalink / raw)
  To: Coiby Xu
  Cc: netdev, kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, open list

On Mon, 22 Feb 2021 15:07:01 +0800 Coiby Xu wrote:
> i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
> i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722 for
> 1GbE while "crashkernel=auto" only reserved 160M. With the module
> parameter "resource_profile=2", we can reduce the memory usage of i40iw
> to ~300M which is still too much for kdump.
> 
> Disabling the client registration would spare us the client interface
> operation open, i.e., i40iw_open for iwarp/uda device. Thus memory is
> saved for kdump.
> 
> Signed-off-by: Coiby Xu <coxu@redhat.com>

Is i40iw or whatever the client is not itself under a CONFIG which
kdump() kernels could be reasonably expected to disable?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-23 20:22   ` Jakub Kicinski
@ 2021-02-24 11:41     ` Coiby Xu
  2021-02-24 16:48       ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Coiby Xu @ 2021-02-24 11:41 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, open list

Hi Jakub,

Thank you for reviewing the patch!

On Tue, Feb 23, 2021 at 12:22:07PM -0800, Jakub Kicinski wrote:
>On Mon, 22 Feb 2021 15:07:01 +0800 Coiby Xu wrote:
>> i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
>> i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722 for
>> 1GbE while "crashkernel=auto" only reserved 160M. With the module
>> parameter "resource_profile=2", we can reduce the memory usage of i40iw
>> to ~300M which is still too much for kdump.
>>
>> Disabling the client registration would spare us the client interface
>> operation open, i.e., i40iw_open for iwarp/uda device. Thus memory is
>> saved for kdump.
>>
>> Signed-off-by: Coiby Xu <coxu@redhat.com>
>
>Is i40iw or whatever the client is not itself under a CONFIG which
>kdump() kernels could be reasonably expected to disable?
>

I'm not sure if I understand you correctly. Do you mean we shouldn't
disable i40iw for kdump?

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-24 11:41     ` Coiby Xu
@ 2021-02-24 16:48       ` Jakub Kicinski
  2021-02-25  0:21         ` Coiby Xu
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-02-24 16:48 UTC (permalink / raw)
  To: Coiby Xu
  Cc: netdev, kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, open list

On Wed, 24 Feb 2021 19:41:41 +0800 Coiby Xu wrote:
> On Tue, Feb 23, 2021 at 12:22:07PM -0800, Jakub Kicinski wrote:
> >On Mon, 22 Feb 2021 15:07:01 +0800 Coiby Xu wrote:  
> >> i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
> >> i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722 for
> >> 1GbE while "crashkernel=auto" only reserved 160M. With the module
> >> parameter "resource_profile=2", we can reduce the memory usage of i40iw
> >> to ~300M which is still too much for kdump.
> >>
> >> Disabling the client registration would spare us the client interface
> >> operation open, i.e., i40iw_open for iwarp/uda device. Thus memory is
> >> saved for kdump.
> >>
> >> Signed-off-by: Coiby Xu <coxu@redhat.com>  
> >
> >Is i40iw or whatever the client is not itself under a CONFIG which
> >kdump() kernels could be reasonably expected to disable?
> >  
> 
> I'm not sure if I understand you correctly. Do you mean we shouldn't
> disable i40iw for kdump?

Forgive my ignorance - are the kdump kernels separate builds?

If they are it'd be better to leave the choice of enabling RDMA 
to the user - through appropriate Kconfig options.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-24 16:48       ` Jakub Kicinski
@ 2021-02-25  0:21         ` Coiby Xu
  2021-02-25  0:47           ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Coiby Xu @ 2021-02-25  0:21 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, open list

On Wed, Feb 24, 2021 at 08:48:41AM -0800, Jakub Kicinski wrote:
>On Wed, 24 Feb 2021 19:41:41 +0800 Coiby Xu wrote:
>> On Tue, Feb 23, 2021 at 12:22:07PM -0800, Jakub Kicinski wrote:
>> >On Mon, 22 Feb 2021 15:07:01 +0800 Coiby Xu wrote:
>> >> i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
>> >> i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722 for
>> >> 1GbE while "crashkernel=auto" only reserved 160M. With the module
>> >> parameter "resource_profile=2", we can reduce the memory usage of i40iw
>> >> to ~300M which is still too much for kdump.
>> >>
>> >> Disabling the client registration would spare us the client interface
>> >> operation open, i.e., i40iw_open for iwarp/uda device. Thus memory is
>> >> saved for kdump.
>> >>
>> >> Signed-off-by: Coiby Xu <coxu@redhat.com>
>> >
>> >Is i40iw or whatever the client is not itself under a CONFIG which
>> >kdump() kernels could be reasonably expected to disable?
>> >
>>
>> I'm not sure if I understand you correctly. Do you mean we shouldn't
>> disable i40iw for kdump?
>
>Forgive my ignorance - are the kdump kernels separate builds?
>

AFAIK we don't build a kernel exclusively for kdump. 

>If they are it'd be better to leave the choice of enabling RDMA
>to the user - through appropriate Kconfig options.
>

i40iw is usually built as a loadable module. So if we want to leave the
choice of enabling RDMA to the user, we could exclude this driver when
building the initramfs for kdump; for example, dracut provides the
omit_drivers option for this purpose.

On the other hand, users expect "crashkernel=auto" to work out of
the box, and i40iw defeats this purpose.

I'll discuss with my Red Hat team and the Intel team about whether RDMA
is needed for kdump. Thanks for bringing up this issue!

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-25  0:21         ` Coiby Xu
@ 2021-02-25  0:47           ` Jakub Kicinski
  0 siblings, 0 replies; 11+ messages in thread
From: Jakub Kicinski @ 2021-02-25  0:47 UTC (permalink / raw)
  To: Coiby Xu
  Cc: netdev, kexec, intel-wired-lan, Jesse Brandeburg, Tony Nguyen,
	David S. Miller, open list

On Thu, 25 Feb 2021 08:21:01 +0800 Coiby Xu wrote:
> On Wed, Feb 24, 2021 at 08:48:41AM -0800, Jakub Kicinski wrote:
> >On Wed, 24 Feb 2021 19:41:41 +0800 Coiby Xu wrote:  
> >> I'm not sure if I understand you correctly. Do you mean we shouldn't
> >> disable i40iw for kdump?  
> >
> >Forgive my ignorance - are the kdump kernels separate builds?
> 
> AFAIK we don't build a kernel exclusively for kdump. 
> 
> >If they are it'd be better to leave the choice of enabling RDMA
> >to the user - through appropriate Kconfig options.
> 
> i40iw is usually built as a loadable module. So if we want to leave the
> choice of enabling RDMA to the user, we could exclude this driver when
> building the initramfs for kdump, for example, dracut provides the 
> omit_drivers option for this purpose. 
> 
> On the other hand, the users expect "crashkernel=auto" to work out of
> the box. So i40iw defeats this purpose. 
> 
> I'll discuss with my Red Hat team and the Intel team about whether RDMA
> is needed for kdump. Thanks for bringing up this issue!

Great, talking to experts here at FB it seems that building a cut-down
kernel for kdump is easier than chasing all the drivers to react to
is_kdump_kernel(). But if you guys need it and Intel is fine with 
the change I won't complain.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] i40e: don't open i40iw client for kdump
  2021-02-22  7:07 ` [RFC PATCH 4/4] i40e: don't open i40iw client " Coiby Xu
  2021-02-23 20:22   ` Jakub Kicinski
@ 2021-02-25 10:11   ` Bhupesh SHARMA
  1 sibling, 0 replies; 11+ messages in thread
From: Bhupesh SHARMA @ 2021-02-25 10:11 UTC (permalink / raw)
  To: Coiby Xu
  Cc: netdev, kexec, Jesse Brandeburg, open list, intel-wired-lan,
	Jakub Kicinski, Tony Nguyen, David S. Miller

Hello Coiby,

On Mon, Feb 22, 2021 at 12:40 PM Coiby Xu <coxu@redhat.com> wrote:
>
> i40iw consumes huge amounts of memory. For example, on an x86_64 machine,
> i40iw consumed 1.5GB for Intel Corporation Ethernet Connection X722
> for 1GbE while "crashkernel=auto" only reserved 160M. With the module
> parameter "resource_profile=2", we can reduce the memory usage of i40iw
> to ~300M, which is still too much for kdump.
>
> Disabling the client registration would spare us the client interface's
> open operation, i.e., i40iw_open for the iwarp/uda device. Thus memory is
> saved for kdump.
>
> Signed-off-by: Coiby Xu <coxu@redhat.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_client.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c b/drivers/net/ethernet/intel/i40e/i40e_client.c
> index a2dba32383f6..aafc2587f389 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_client.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
> @@ -4,6 +4,7 @@
>  #include <linux/list.h>
>  #include <linux/errno.h>
>  #include <linux/net/intel/i40e_client.h>
> +#include <linux/crash_dump.h>
>
>  #include "i40e.h"
>  #include "i40e_prototype.h"
> @@ -741,6 +742,12 @@ int i40e_register_client(struct i40e_client *client)
>  {
>         int ret = 0;
>
> +       /* Don't open i40iw client for kdump because i40iw will consume huge
> +        * amounts of memory.
> +        */
> +       if (is_kdump_kernel())
> +               return ret;
> +

Since the crashkernel size can be set manually on the command line by a
user, and some users might be fine with ~300M of memory usage by the i40iw
client [with "resource_profile=2"], in my view, disabling the client
for all kdump cases seems too restrictive.

We can probably check the allocated crash kernel size
($ cat /sys/kernel/kexec_crash_size) and then make a decision
accordingly, for example something like:

 +       if (is_kdump_kernel() && kexec_crash_size < 512M)
 +               return ret;
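
To make the idea a bit more concrete, here is a minimal user-space sketch
of the decision, not a kernel patch. The helper names
(skip_i40iw_client, read_crash_size) and the I40IW_MIN_CRASH_BYTES
constant are hypothetical and only stand in for the pseudo-condition
above. The only real interface used is the sysfs file
/sys/kernel/kexec_crash_size, which reports the reserved size in bytes.
Whether that value is meaningful from inside the capture kernel is
something I have not verified, so treat this as an illustration of the
threshold logic only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical threshold, taken from the 512M figure suggested above. */
#define I40IW_MIN_CRASH_BYTES (512ULL * 1024 * 1024)

/*
 * Decide whether registering the i40iw client should be skipped.
 * Skip only when running as a kdump kernel AND the crash kernel
 * reservation is below the threshold.
 */
static bool skip_i40iw_client(bool in_kdump_kernel, uint64_t crash_size_bytes)
{
	return in_kdump_kernel && crash_size_bytes < I40IW_MIN_CRASH_BYTES;
}

/*
 * Read the reserved crash kernel size (in bytes) from a sysfs-style file,
 * e.g. /sys/kernel/kexec_crash_size.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
static int read_crash_size(const char *path, uint64_t *out)
{
	FILE *f = fopen(path, "r");
	unsigned long long value;
	int parsed;

	if (!f)
		return -1;

	parsed = (fscanf(f, "%llu", &value) == 1);
	fclose(f);

	if (!parsed)
		return -1;

	*out = value;
	return 0;
}
```

With the 160M reservation mentioned in the commit message the client
would be skipped, while a user who manually reserves 512M or more would
still get it.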

What do you think?

Regards,
Bhupesh

>         if (!client) {
>                 ret = -EIO;
>                 goto out;
> --
> 2.30.1
>

Thread overview: 11+ messages
2021-02-22  7:06 [RFC PATCH 0/4] Reducing memory usage of i40e for kdump Coiby Xu
2021-02-22  7:06 ` [RFC PATCH 1/4] i40e: use minimal tx and rx pairs " Coiby Xu
2021-02-22  7:06 ` [RFC PATCH 2/4] i40e: use minimal rx and tx ring buffers " Coiby Xu
2021-02-22  7:07 ` [RFC PATCH 3/4] i40e: use minimal admin queue " Coiby Xu
2021-02-22  7:07 ` [RFC PATCH 4/4] i40e: don't open i40iw client " Coiby Xu
2021-02-23 20:22   ` Jakub Kicinski
2021-02-24 11:41     ` Coiby Xu
2021-02-24 16:48       ` Jakub Kicinski
2021-02-25  0:21         ` Coiby Xu
2021-02-25  0:47           ` Jakub Kicinski
2021-02-25 10:11   ` Bhupesh SHARMA
