* [PATCH 00/41] Memory Hotplug for DPDK
@ 2018-03-03 13:45 Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                   ` (87 more replies)
  0 siblings, 88 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). It is based upon the RFC submitted in December [1].

Dependencies (to be applied in specified order):
- IPC bugfixes patchset [2]
- IPC improvements patchset [3]
- IPC asynchronous request API patch [4]
- Function to return number of sockets [5]

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [6]
- EAL NUMA node count changes [7]

The vast majority of changes are in the EAL and malloc; the external API
disruption is minimal: a new set of APIs is added for contiguous memory
allocation in rte_memzone, and there are a few API additions in rte_memory
due to the switch to memseg lists as opposed to memsegs. Every other API
change is internal to EAL, and all memory allocation/freeing is handled
through rte_malloc, with no externally visible API changes.
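
For illustration, here is a minimal sketch of asking for physically
contiguous memory at reserve time. The rte_memzone_reserve_contig() name
is an assumption taken from the title of patch 21 and may not match the
actual declarations added to rte_memzone.h:

#include <rte_memzone.h>

/* hypothetical sketch: rte_memzone_reserve_contig() is assumed to mirror
 * rte_memzone_reserve(); check the updated rte_memzone.h for real names.
 */
static const struct rte_memzone *
reserve_dma_zone(const char *name, size_t len, int socket_id)
{
	/* a plain rte_memzone_reserve() is still VA-contiguous, but is no
	 * longer guaranteed to be physically contiguous; drivers that need
	 * DMA-safe memory now ask for contiguity explicitly at reserve time.
	 */
	return rte_memzone_reserve_contig(name, len, socket_id, 0);
}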

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Page alloc/free now happens as needed on rte_malloc/rte_free (see the
   sketch right after this list)
 * Added contiguous memory allocation APIs for rte_memzone
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [8]
 * Callbacks for registering memory allocations
 * Multiprocess support done via DPDK IPC introduced in 18.02
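
To illustrate the rte_malloc/rte_free point above, a minimal sketch using
only the existing public rte_malloc API (nothing new is needed by callers):
with memory hotplug, a large allocation can map additional hugepages on
demand instead of failing when the preallocated heap is too small, and
freeing may release those pages back to the system.

#include <rte_malloc.h>

static void
dynamic_alloc_example(void)
{
	/* may trigger mapping of additional hugepages at runtime */
	void *buf = rte_malloc("example", 64 * 1024 * 1024, 0);

	if (buf != NULL) {
		/* ... use the buffer ... */
		rte_free(buf); /* backing pages may be unmapped again */
	}
}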

The biggest difference is that a "memseg" now represents a single page (as
opposed to being a big contiguous block of pages). As a consequence, both
memzones and malloc elements are no longer guaranteed to be physically
contiguous, unless the user asks for it at reserve time. To preserve whatever
functionality was dependent on the previous behavior, a legacy memory option
is also provided, although it is expected (or perhaps vainly hoped) to be a
temporary solution.

Why multiple memseg lists instead of one? Since a memseg is now a single page,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk one big list and
allocate a single contiguous chunk of VA space for all memsegs, but this
implementation uses separate lists instead, in order to speed up many
operations on memseg lists.
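
As a purely conceptual sketch (not the actual rte_eal_memconfig.h layout),
each memseg list can be thought of as one preallocated VA region for a given
page size and socket, with one descriptor per page, so that locating the
descriptor for an address becomes simple arithmetic rather than a walk over
every page in the system:

#include <stddef.h>
#include <stdint.h>

#include <rte_memory.h>

/* illustrative only -- names and layout here are assumptions */
struct example_memseg_list {
	void *base_va;            /* start of the preallocated VA region */
	uint64_t page_sz;         /* page size backing this list */
	int socket_id;            /* NUMA node of the pages in this list */
	struct rte_memseg *pages; /* one entry per page */
};

static struct rte_memseg *
example_addr_to_memseg(struct example_memseg_list *msl, void *addr)
{
	size_t idx = ((uintptr_t)addr - (uintptr_t)msl->base_va) /
			msl->page_sz;

	return &msl->pages[idx];
}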

For v1, the following limitations are present:
- FreeBSD does not even compile, let alone run
- No 32-bit support
- There are some minor quality-of-life improvements planned that aren't
  ready yet and will be part of v2
- VFIO support is only smoke-tested (but is expected to work), VFIO support
  with secondary processes is not tested; work is ongoing to validate VFIO
  for all use cases
- Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
  IOMMU mode - help from sPAPR maintainers requested

Nevertheless, this patchset should be testable under 64-bit Linux, and
should work for all use cases bar those mentioned above.

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
[3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
[4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
[5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
[6] http://dpdk.org/dev/patchwork/patch/34002/
[7] http://dpdk.org/dev/patchwork/patch/33853/
[8] http://dpdk.org/dev/patchwork/patch/24484/

Anatoly Burakov (41):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  eal: move all locking to heap
  eal: make malloc heap a doubly-linked list
  eal: add function to dump malloc heap contents
  test: add command to dump malloc heap contents
  eal: make malloc_elem_join_adjacent_free public
  eal: make malloc free list remove public
  eal: make malloc free return resulting malloc element
  eal: add rte_fbarray
  eal: add "single file segments" command-line option
  eal: add "legacy memory" option
  eal: read hugepage counts from node-specific sysfs path
  eal: replace memseg with memseg lists
  eal: add support for mapping hugepages at runtime
  eal: add support for unmapping pages at runtime
  eal: make use of memory hotplug for init
  eal: enable memory hotplug support in rte_malloc
  test: fix malloc autotest to support memory hotplug
  eal: add API to check if memory is contiguous
  eal: add backend support for contiguous allocation
  eal: enable reserving physically contiguous memzones
  eal: replace memzone array with fbarray
  mempool: add support for the new allocation methods
  vfio: allow to map other memory regions
  eal: map/unmap memory with VFIO when alloc/free pages
  eal: prepare memseg lists for multiprocess sync
  eal: add multiprocess init with memory hotplug
  eal: add support for multiprocess memory hotplug
  eal: add support for callbacks on memory hotplug
  eal: enable callbacks on malloc/free and mp sync
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory

 config/common_base                                |   15 +-
 drivers/bus/pci/linux/pci.c                       |   29 +-
 drivers/crypto/qat/qat_qp.c                       |    4 +-
 drivers/net/avf/avf_ethdev.c                      |    2 +-
 drivers/net/bnx2x/bnx2x.c                         |    2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/ena/base/ena_plat_dpdk.h              |    7 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/enic_main.c                      |    4 +-
 drivers/net/i40e/i40e_ethdev.c                    |    2 +-
 drivers/net/i40e/i40e_rxtx.c                      |    2 +-
 drivers/net/qede/base/bcm_osal.c                  |    5 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |  108 ++-
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    7 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   25 +
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |    7 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   33 +
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 +++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  181 ++++
 lib/librte_eal/common/eal_common_memory.c         |  512 +++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  275 ++++--
 lib/librte_eal/common/eal_common_options.c        |    8 +
 lib/librte_eal/common/eal_filesystem.h            |   13 +
 lib/librte_eal/common/eal_hugepages.h             |    7 +
 lib/librte_eal/common/eal_internal_cfg.h          |   10 +-
 lib/librte_eal/common/eal_memalloc.h              |   41 +
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   29 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  352 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |    9 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |   79 +-
 lib/librte_eal/common/include/rte_memzone.h       |  155 ++-
 lib/librte_eal/common/include/rte_vfio.h          |   39 +
 lib/librte_eal/common/malloc_elem.c               |  436 +++++++--
 lib/librte_eal/common/malloc_elem.h               |   41 +-
 lib/librte_eal/common/malloc_heap.c               |  694 +++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  723 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   75 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  102 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  155 ++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1049 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          |  516 ++++++----
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  318 +++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   11 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   23 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/rte_mempool.c                  |   87 +-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   71 +-
 test/test/test_memory.c                           |   43 +-
 test/test/test_memzone.c                          |   26 +-
 63 files changed, 6631 insertions(+), 751 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

* [PATCH 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 02/41] eal: move all locking to heap Anatoly Burakov
                   ` (86 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code can reserve virtual areas
as well.
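
As a rough usage sketch (the signature and EAL_VIRTUAL_AREA_* flags below
are the ones added by this patch; the wrapper function itself is
hypothetical), an EAL-internal caller reserving an address range to map
over later could do:

#include <stdint.h>

#include "eal_private.h"

static void *
reserve_va(uint64_t len, uint64_t page_sz)
{
	uint64_t size = len;

	/* reserve (and immediately unmap) a page-aligned VA range, shrinking
	 * the request page by page if the full size cannot be obtained;
	 * on success, size holds the length actually reserved.
	 */
	return eal_get_virtual_area(NULL, &size, page_sz,
			EAL_VIRTUAL_AREA_ALLOW_SHRINK |
			EAL_VIRTUAL_AREA_UNMAP, 0);
}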

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..042881b 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, uint64_t *size,
+		uint64_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..96cebb7 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, uint64_t *size,
+		uint64_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 38853b7..5c11d77 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1339,7 +1271,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1350,11 +1282,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1364,6 +1291,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		uint64_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1372,35 +1301,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1465,7 +1385,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1474,8 +1393,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

* [PATCH 02/41] eal: move all locking to heap
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
                   ` (85 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating/freeing OS memory, which would
involve growing/shrinking the heap.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

* [PATCH 03/41] eal: make malloc heap a doubly-linked list
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 02/41] eal: move all locking to heap Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-19 17:33   ` Olivier Matz
  2018-03-03 13:45 ` [PATCH 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
                   ` (84 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to
a doubly linked list and preparing the infrastructure to support it.

Since our heap is now aware of where its first and last elements
are, there is no longer any need to have a dummy element at the end
of each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..9ec4b62 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *first;
+	struct malloc_elem *last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere inbetween start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

* [PATCH 04/41] eal: add function to dump malloc heap contents
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (2 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 05/41] test: add command " Anatoly Burakov
                   ` (83 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.
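
A minimal usage sketch of the new call added below (the wrapper function
is hypothetical, for illustration only):

#include <stdio.h>

#include <rte_malloc.h>

static void
dump_heaps_example(void)
{
	/* walk every socket's heap and print each malloc element's address,
	 * state, size and neighbours */
	rte_malloc_dump_heaps(stdout);
}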

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc.h |  9 +++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 16 ++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 81 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a3fc83e 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -278,6 +278,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to retrieve data for heap on given socket
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..80fb6cc 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,22 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int socket;
+
+	for (socket = 0; socket < rte_num_sockets(); socket++) {
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 52f5940..18b8bf5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -215,6 +215,7 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_malloc_dump_heaps;
 
 } DPDK_18.02;
 
-- 
2.7.4

* [PATCH 05/41] test: add command to dump malloc heap contents
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (3 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
                   ` (82 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

* [PATCH 06/41] eal: make malloc_elem_join_adjacent_free public
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (4 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 05/41] test: add command " Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 07/41] eal: make malloc free list remove public Anatoly Burakov
                   ` (81 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

We need this function to join newly allocated segments with the heap.
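
As an illustrative sketch of the intended use (the helper below is
hypothetical; the actual hotplug code arrives in later patches), hooking a
freshly mapped segment into an existing heap would look roughly like this:

#include <rte_memory.h>

#include "malloc_elem.h"
#include "malloc_heap.h"

/* hypothetical helper, for illustration only */
static void
add_new_segment_to_heap(struct malloc_heap *heap,
		const struct rte_memseg *ms, size_t elem_size)
{
	struct malloc_elem *elem = ms->addr;

	malloc_elem_init(elem, heap, ms, elem_size);
	malloc_elem_insert(elem);

	/* merge the new element with any adjacent free space already
	 * present in the heap */
	elem = malloc_elem_join_adjacent_free(elem);

	malloc_elem_free_list_insert(elem);
}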

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

* [PATCH 07/41] eal: make malloc free list remove public
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (5 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
                   ` (80 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

* [PATCH 08/41] eal: make malloc free return resulting malloc element
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (6 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 07/41] eal: make malloc free list remove public Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-19 17:34   ` Olivier Matz
  2018-03-03 13:45 ` [PATCH 09/41] eal: add rte_fbarray Anatoly Burakov
                   ` (79 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 09/41] eal: add rte_fbarray
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (7 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 10/41] eal: add "single file segments" command-line option Anatoly Burakov
                   ` (78 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

rte_fbarray is a simple indexed array stored in shared memory
by mapping a file into memory. The rationale for its existence is
as follows: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, the page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that would involve reallocating memory (which is a big no-no
in multiprocess). What we can do instead is set the maximum capacity
to something very large, and decide at allocation time how big the
array is going to be. We map the entire file into memory, which
makes it possible to use fbarray as shared memory, provided the
structure itself is allocated in shared memory. Per-fbarray locking
is also used to avoid index data races (but not contents data
races - synchronizing those is up to the user application).

In addition, since we will frequently need to scan this array for
free space, and iterating over the array linearly can become slow,
rte_fbarray provides facilities to index the array's usage. The
following use cases are covered:
 - find next free/used slot (useful either for adding new elements
   to fbarray, or walking the list)
 - find starting index for next N free/used slots (useful for when
   we want to allocate a chunk of VA-contiguous memory composed of
   several pages)
 - find how many contiguous free/used slots there are, starting
   from specified index (useful for when we want to figure out
   how many pages we have until the next hole in allocated memory,
   to speed up some bulk operations where we would otherwise have
   to walk the array and add pages one by one)

This is accomplished by storing a usage mask in-memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
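
To make the intended workflow concrete, here is a minimal usage
sketch based on the API added in this patch. The element type,
array name and length below are illustrative only, error handling
is trimmed, and the rte_fbarray structure itself would have to
live in shared memory for multiprocess use:

    #include <rte_fbarray.h>

    /* illustrative payload type - not part of this patch */
    struct my_elem { int socket_id; };

    static int
    fbarray_example(void)
    {
        struct rte_fbarray arr;
        struct my_elem *elem;
        int idx;

        /* back 1024 fixed-size slots with a file */
        if (rte_fbarray_init(&arr, "example", 1024,
                sizeof(struct my_elem)) < 0)
            return -1;

        /* find a free slot, fill it in, mark it as used */
        idx = rte_fbarray_find_next_free(&arr, 0);
        if (idx < 0)
            return -1;
        elem = rte_fbarray_get(&arr, idx);
        elem->socket_id = 0;
        rte_fbarray_set_used(&arr, idx);

        /* how many used slots follow idx without a hole? */
        return rte_fbarray_find_contig_used(&arr, idx);
    }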

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    The initial version of this had resizing capability, however it
    was removed because, in a multiprocess scenario, each fbarray
    would have its own view of mapped memory, which might not
    correspond with the others' if some other process performed a
    resize that the current process didn't know about.
    
    It was therefore decided that, to avoid the cost of synchronizing
    on each and every operation (to make sure the array wasn't
    resized), the resizing feature should be dropped.

 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 352 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  17 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..76d86c3
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) (((idx) << MASK_SHIFT) + (mod))
+
+/*
+ * This is a mask that is always stored at the end of array, to provide fast
+ * way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		RTE_LOG(ERR, EAL, "ftruncate() failed: %s\n",
+			strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod;
+	uint64_t last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit in within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod;
+	uint64_t last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod;
+	uint64_t last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk;
+	uint64_t msk_bit;
+	int msk_idx;
+	bool already_used;
+	int ret = 0;
+
+	/* validate arguments before dereferencing arr */
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t mmap_len, page_sz;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(len);
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	uint64_t mmap_len, page_sz;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as two values we need (element
+	 * size and array length) are constant for the duration of life of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as two values we need (element
+	 * size and total capacity) are constant for the duration of life of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it, things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. repeated call from previously
+	 * failed destroy(), but this is on user, we can't (easily) know if this
+	 * is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		rte_errno = errno;
+		return -1;
+	}
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void *
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		goto out;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) {
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..4e1d207
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,352 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * data pointed to by ``arr`` pointer must itself be allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with data
+ * pointed to by ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill data pointed to by ``arr`` pointer and remove the underlying file
+ * backing the data, so it is expected that by the time this function is called,
+ * all other processes have detached from this ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void *
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to element to find index to.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to check as used.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many more free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many more used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 18b8bf5..a938a2f 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -216,6 +216,23 @@ DPDK_18.05 {
 
 	rte_num_sockets;
 	rte_malloc_dump_heaps;
+	rte_fbarray_init;
+	rte_fbarray_destroy;
+	rte_fbarray_attach;
+	rte_fbarray_detach;
+	rte_fbarray_get;
+	rte_fbarray_find_idx;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
+	rte_fbarray_is_used;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_dump_metadata;
 
 } DPDK_18.02;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 10/41] eal: add "single file segments" command-line option
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (8 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 09/41] eal: add rte_fbarray Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:45 ` [PATCH 11/41] eal: add "legacy memory" option Anatoly Burakov
                   ` (77 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For now, this option does nothing, but it will be useful in
dynamic memory allocation down the line. Currently, DPDK stores
all pages as separate files in hugetlbfs. This option will allow
storing all pages in one file (one file per socket, per page size).
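
For illustration, this is just another EAL flag; a hypothetical
invocation (application name and other arguments below are
placeholders) would look like:

    ./dpdk_app -l 0-3 --socket-mem 1024,1024 --single-file-segments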

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_options.c | 4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   | 4 ++++
 lib/librte_eal/common/eal_options.h        | 2 ++
 lib/librte_eal/linuxapp/eal/eal.c          | 1 +
 4 files changed, 11 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 0be80cb..dbc3fb5 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1161,6 +1162,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 4e2c2e6..3e31ac6 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..a4b80d5 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..c84e6bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 11/41] eal: add "legacy memory" option
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (9 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 10/41] eal: add "single file segments" command-line option Anatoly Burakov
@ 2018-03-03 13:45 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                   ` (76 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:45 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy
memory init sequence will be added later.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..45e5670 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -531,6 +531,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index dbc3fb5..3e92551 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1165,6 +1166,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_SINGLE_FILE_SEGMENTS_NUM:
 		conf->single_file_segments = 1;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 3e31ac6..c8a0676 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true if storing all pages within single files (per-page-size,
 	 * per-node).
 	 */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a4b80d5..f9a679d 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
 	OPT_SINGLE_FILE_SEGMENTS_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index c84e6bf..5207713 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5c11d77..b9bcb75 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -919,8 +919,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1262,8 +1262,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1399,6 +1399,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 12/41] eal: read hugepage counts from node-specific sysfs path
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (10 preceding siblings ...)
  2018-03-03 13:45 ` [PATCH 11/41] eal: add "legacy memory" option Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 13/41] eal: replace memseg with memseg lists Anatoly Burakov
                   ` (75 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node
for hugepage counts. Note that the per-NUMA node paths do not
provide information about reserved pages, so the counts we get
from them may not be the most accurate. However, since we no longer
require our memory to be physically contiguous, this saves us from
the whole mapping/remapping business otherwise needed before we're
able to tell which page is on which socket.

Legacy memory init will not use this.
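
As an example of the paths involved (page size and node number are
illustrative), for 2MB pages on node 0 the free page count is now
read from:

    /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages

instead of only the generic counters under
/sys/kernel/mm/hugepages/hugepages-2048kB/.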

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 79 +++++++++++++++++++++++--
 1 file changed, 73 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..706b6d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -30,6 +30,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -70,6 +71,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -248,7 +288,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -302,9 +342,27 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if first attempts fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_num_sockets(); i++) {
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, i);
+				hpi->num_pages[i] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * we failed to sort memory from the get go, so fall
+		 * back to old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -328,10 +386,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 13/41] eal: replace memseg with memseg lists
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (11 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-19 17:39   ` Olivier Matz
  2018-03-03 13:46 ` [PATCH 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
                   ` (74 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
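
For illustration only (not part of this patch), a rough sketch of
walking the new layout using just the rte_memseg_list fields
introduced here (includes and error handling omitted):

	struct rte_mem_config *mcfg =
			rte_eal_get_configuration()->mem_config;
	int i;

	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
		struct rte_memseg_list *msl = &mcfg->memsegs[i];

		/* skip lists that have no segments allocated yet */
		if (msl->memseg_arr.count == 0)
			continue;
		/* all segments in a list share page size and socket */
		printf("list %d: socket %d, %luk pages\n", i,
			msl->socket_id,
			(unsigned long)(msl->hugepage_sz >> 10));
	}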

In order to support dynamic memory allocation, we reserve all
VA space in advance. That is, we do an anonymous mmap() of the entire
maximum size of memory per hugepage size, per socket (which is
limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_PER_TYPE gigabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_PER_LIST
gigabytes per list, whichever is the smaller one).

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only
for now), and largely consists of copied EAL memory init code.
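
To make those defaults concrete (using the values this patch adds
to config/common_base): for 2M pages, each list is capped at
min(8192 pages * 2M, 32G) = 16G and each type at
min(32768 pages * 2M, 128G) = 64G, i.e. four lists per socket;
for 1G pages, each list is capped at min(8192 * 1G, 32G) = 32G
and each type at min(32768 * 1G, 128G) = 128G, again four lists
per socket.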

Pages in a list are also indexed by address. That is, in
non-legacy mode, in order to figure out which memseg a page
belongs to, one only needs the base address of its memseg list.
Similarly, figuring out the IOVA address of a memzone is a matter
of finding the right memseg list, taking the offset from the
list's base address and dividing it by the page size to get the
appropriate memseg. In legacy mode, the old behavior of walking
the memseg list remains.
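
As a sketch of that non-legacy lookup (the actual helpers added
below are rte_mem_virt2memseg_list() and rte_mem_virt2memseg()):

	/* given an address known to fall within 'msl' */
	int ms_idx = RTE_PTR_DIFF(addr, msl->base_va) /
			msl->hugepage_sz;
	struct rte_memseg *ms = rte_fbarray_get(&msl->memseg_arr,
			ms_idx);
	/* IOVA is the page's IOVA plus the offset within the page */
	rte_iova_t iova = ms->iova + RTE_PTR_DIFF(addr, ms->addr);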

Due to the switch to fbarray, and to avoid any intrusive changes,
secondary processes are not supported in this commit. Also, one
particular API call (rte_eal_get_physmem_layout) no longer makes
sense and was removed, in accordance with deprecation notice [1].

In legacy mode, no VA space is preallocated and memory is mapped
at init time as before, but each resulting segment is still
stored in the appropriate memseg list.

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists.

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 config/common_base                                |  15 +-
 drivers/bus/pci/linux/pci.c                       |  29 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     | 108 +++++---
 lib/librte_eal/common/eal_common_memory.c         | 322 +++++++++++++++++++---
 lib/librte_eal/common/eal_common_memzone.c        |  12 +-
 lib/librte_eal/common/eal_hugepages.h             |   2 +
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  33 ++-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |   8 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  92 +++++--
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  21 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 297 +++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            | 164 +++++++----
 lib/librte_eal/rte_eal_version.map                |   3 +-
 test/test/test_malloc.c                           |  29 +-
 test/test/test_memory.c                           |  43 ++-
 test/test/test_memzone.c                          |  17 +-
 21 files changed, 917 insertions(+), 331 deletions(-)

diff --git a/config/common_base b/config/common_base
index ad03cf4..e9c1d93 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=32
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_PER_LIST gigabytes worth of memory, whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_PER_LIST=32
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or RTE_MAX_MEM_PER_TYPE
+# gigabytes of memory (split over multiple lists of RTE_MAX_MEM_PER_LIST),
+# whichever is the smallest
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_PER_TYPE=128
+# legacy mem mode only
+CONFIG_RTE_MAX_LEGACY_MEMSEG=256
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..ec05d7c 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -119,19 +119,30 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 void *
 pci_find_max_end_va(void)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *cur_end, *max_end = NULL;
+	int i = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int ms_idx = 0;
 
-		if (seg->addr > last->addr)
-			last = seg;
+		if (arr->count == 0)
+			continue;
 
+		/*
+		 * we need to handle legacy mem case, so don't rely on page size
+		 * to calculate max VA end
+		 */
+		while ((ms_idx = rte_fbarray_find_next_used(arr,
+				ms_idx)) >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			cur_end = RTE_PTR_ADD(ms->addr, ms->len);
+			if (cur_end > max_end)
+				max_end = cur_end;
+			ms_idx++;
+		}
 	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	return max_end;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 8d0a1ab..23c5e1c 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,42 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+/* returns index of the first memseg not merged into this region */
+static int
+add_memory_region(struct vhost_memory_region *mr, struct rte_fbarray *arr,
+		int reg_start_idx, int max)
+{
+	const struct rte_memseg *ms;
+	void *start_addr, *expected_addr;
+	uint64_t len;
+	int idx;
+
+	idx = reg_start_idx;
+	len = 0;
+	start_addr = NULL;
+	expected_addr = NULL;
+
+	/* we could've relied on page size, but we have to support legacy mem */
+	while (idx < max) {
+		ms = rte_fbarray_get(arr, idx);
+		if (expected_addr == NULL)
+			start_addr = ms->addr;
+		else if (ms->addr != expected_addr)
+			break;
+		/* expect the next memseg to directly follow this one */
+		expected_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		len += ms->len;
+		idx++;
+	}
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return idx;
+}
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,8 +113,7 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
+	uint32_t list_idx, region_nr = 0;
 	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
 
@@ -88,52 +123,41 @@ prepare_vhost_memory_kernel(void)
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
+	for (list_idx = 0; list_idx < RTE_MAX_MEMSEG_LISTS; ++list_idx) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[list_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int reg_start_idx, search_idx;
 
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
-
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
+		/* skip empty segment lists */
+		if (arr->count == 0)
 			continue;
 
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
+		search_idx = 0;
+		while ((reg_start_idx = rte_fbarray_find_next_used(arr,
+				search_idx)) >= 0) {
+			int reg_n_pages;
+			if (region_nr >= max_regions) {
+				free(vm);
+				return NULL;
+			}
+			mr = &vm->regions[region_nr++];
+
+			/*
+			 * we know memseg starts at search_idx, check how many
+			 * segments there are
+			 */
+			reg_n_pages = rte_fbarray_find_contig_used(arr,
+					search_idx);
+
+			/* process at most reg_n_pages memsegs */
+			search_idx = add_memory_region(mr, arr, reg_start_idx,
+					search_idx + reg_n_pages);
 		}
 	}
 
-	vm->nregions = k;
+	vm->nregions = region_nr;
 	vm->padding = 0;
 	return vm;
 }
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 042881b..457e239 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%luk-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,245 @@ eal_get_virtual_area(void *requested_addr, uint64_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz)
+{
+	uint64_t area_sz, max_pages;
+
+	max_pages = internal_config.legacy_mem || internal_config.no_hugetlbfs ?
+			RTE_MAX_LEGACY_MEMSEG : RTE_MAX_MEMSEG_PER_LIST;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_PER_LIST GB */
+	area_sz = RTE_MIN(page_sz * max_pages,
+			(uint64_t) RTE_MAX_MEM_PER_LIST << 30);
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return rte_align64pow2(area_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	int max_pages;
+	uint64_t mem_amount;
+	void *addr;
+
+	if (!internal_config.legacy_mem) {
+		mem_amount = get_mem_amount(page_sz);
+		max_pages = mem_amount / page_sz;
+
+		addr = eal_get_virtual_area(NULL, &mem_amount, page_sz, 0, 0);
+		if (addr == NULL) {
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+			return -1;
+		}
+	} else {
+		addr = NULL;
+		/* number of memsegs in each list; these are not single-page
+		 * segments, so RTE_MAX_LEGACY_MEMSEG matches the old default.
+		 */
+		max_pages = RTE_MAX_LEGACY_MEMSEG;
+	}
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_pages,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->hugepage_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = addr;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+memseg_init(void)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		RTE_LOG(ERR, EAL, "Secondary process not supported\n");
+		return -1;
+	}
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (socket_id = 0; socket_id < (int) rte_num_sockets();
+				socket_id++) {
+			uint64_t max_mem, total_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			max_mem = (uint64_t)RTE_MAX_MEM_PER_TYPE << 30;
+			/* no-huge behaves the same as legacy */
+			max_segs = internal_config.legacy_mem ||
+					internal_config.no_hugetlbfs ?
+					RTE_MAX_LEGACY_MEMSEG :
+					RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_mem < max_mem && total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase CONFIG_RTE_MAX_MEMSEG_LISTS\n");
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						socket_id, type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_mem = total_segs * msl->hugepage_sz;
+				type_msl_idx++;
+			}
+		}
+	}
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	const struct rte_fbarray *arr;
+	int ms_idx;
+
+	/* a memseg list was specified, check if it's the right one */
+	void *start, *end;
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, msl->hugepage_sz *
+			msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->hugepage_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start, msl->hugepage_sz *
+				msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+static struct rte_memseg_list *
+virt2memseg_list_legacy(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int msl_idx, ms_idx;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		ms_idx = 0;
+		while ((ms_idx =
+				rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {
+			const struct rte_memseg *ms;
+			void *start, *end;
+			ms = rte_fbarray_get(arr, ms_idx);
+			start = ms->addr;
+			end = RTE_PTR_ADD(start, ms->len);
+			if (addr >= start && addr < end)
+				return msl;
+			ms_idx++;
+		}
+	}
+	return NULL;
+}
+
+static struct rte_memseg *
+virt2memseg_legacy(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int msl_idx, ms_idx;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		ms_idx = 0;
+		while ((ms_idx =
+				rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {
+			struct rte_memseg *ms;
+			void *start, *end;
+			ms = rte_fbarray_get(arr, ms_idx);
+			start = ms->addr;
+			end = RTE_PTR_ADD(start, ms->len);
+			if (addr >= start && addr < end)
+				return ms;
+			ms_idx++;
+		}
+	}
+	return NULL;
+}
+
+struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	/* for legacy memory, we just walk the list, like in the old days. */
+	if (internal_config.legacy_mem)
+		return virt2memseg_list_legacy(addr);
+	else
+		return virt2memseg_list(addr);
+}
+
+struct rte_memseg *
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	/* for legacy memory, we just walk the list, like in the old days. */
+	if (internal_config.legacy_mem)
+		/* ignore msl value */
+		return virt2memseg_legacy(addr);
+
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 
@@ -136,18 +369,32 @@ rte_eal_get_physmem_layout(void)
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
 	unsigned i = 0;
 	uint64_t total_len = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		total_len += mcfg->memseg[i].len;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		/* for legacy mem mode, walk the memsegs */
+		if (internal_config.legacy_mem) {
+			struct rte_fbarray *arr = &msl->memseg_arr;
+			int ms_idx = 0;
+
+			while ((ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx)) >= 0) {
+				const struct rte_memseg *ms =
+						rte_fbarray_get(arr, ms_idx);
+				total_len += ms->len;
+				ms_idx++;
+			}
+		} else
+			total_len += msl->hugepage_sz * msl->memseg_arr.count;
 	}
 
 	return total_len;
@@ -157,27 +404,35 @@ rte_eal_get_physmem_size(void)
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int m_idx = 0;
+
+		if (arr->count == 0)
+			continue;
+
+		while ((m_idx = rte_fbarray_find_next_used(arr, m_idx)) >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, m_idx);
+			fprintf(f, "Page %u-%u: iova:0x%"PRIx64", len:%zu, "
+			       "virt:%p, socket_id:%"PRId32", "
+			       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			       "nrank:%"PRIx32"\n", i, m_idx,
+			       ms->iova,
+			       ms->len,
+			       ms->addr,
+			       ms->socket_id,
+			       ms->hugepage_sz,
+			       ms->nchannel,
+			       ms->nrank);
+			m_idx++;
+		}
 	}
 }
 
@@ -222,9 +477,14 @@ rte_mem_lock_page(const void *virt)
 int
 rte_eal_memory_init(void)
 {
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	retval = memseg_init();
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..ed36174 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -226,10 +226,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->hugepage_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -382,7 +381,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -391,12 +389,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..f963ae5 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -23,6 +23,8 @@ struct hugepage_file {
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
 	int memseg_id;      /**< the memory segment to which page belongs */
+	int memseg_list_id;
+	/**< the memory segment list to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index c8a0676..eea8b66 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..31fc8e7 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * memseg list is a special case as we need to store a bunch of other data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t hugepage_sz; /**< page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..674d4cb 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -22,6 +22,9 @@ extern "C" {
 #include <rte_common.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -130,21 +133,27 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
- * Get the layout of the available physical memory.
+ * Get memseg corresponding to virtual memory address.
  *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * @param virt
+ *   The virtual address.
+ * @param msl
+ *   Memseg list in which to look for memsegs (can be NULL).
+ * @return
+ *   Memseg to which this virtual address belongs to.
+ */
+struct rte_memseg *rte_mem_virt2memseg(const void *virt,
+		const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
  *
+ * @param virt
+ *   The virtual address.
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
- */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+ *   Memseg list to which this virtual address belongs to.
+ */
+struct rte_memseg_list *rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Dump the physical memory layout to a file.
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..a69f068 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -66,7 +66,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..701bffd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -26,11 +26,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -145,7 +145,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..388c16f 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,7 +5,7 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -24,7 +24,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -111,7 +111,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..058ad75 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,22 +63,25 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += elem_size;
+	heap->total_size += len;
+
+	return elem;
 }
 
 /*
@@ -98,7 +102,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -243,16 +248,65 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
+	int msl_idx;
+	struct rte_memseg_list *msl;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		int start;
+		struct rte_fbarray *arr;
+		struct malloc_heap *heap;
+
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+		heap = &mcfg->malloc_heaps[msl->socket_id];
+
+		if (arr->count == 0)
+			continue;
+
+		/* for legacy mode, just walk the list */
+		if (internal_config.legacy_mem) {
+			int ms_idx = 0;
+			while ((ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx)) >= 0) {
+				struct rte_memseg *ms =
+						rte_fbarray_get(arr, ms_idx);
+				malloc_heap_add_memory(heap, msl,
+						ms->addr, ms->len);
+				ms_idx++;
+				RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+					msl->socket_id, ms->len >> 20ULL);
+			}
+			continue;
+		}
+
+		/* find first segment */
+		start = rte_fbarray_find_next_used(arr, 0);
+
+		while (start >= 0) {
+			int contig_segs;
+			struct rte_memseg *start_seg;
+			size_t len, hugepage_sz = msl->hugepage_sz;
+
+			/* find how many pages we can lump in together */
+			contig_segs = rte_fbarray_find_contig_used(arr, start);
+			start_seg = rte_fbarray_get(arr, start);
+			len = contig_segs * hugepage_sz;
+
+			/*
+			 * we've found (hopefully) a bunch of contiguous
+			 * segments, so add them to the heap.
+			 */
+			malloc_heap_add_memory(heap, msl, start_seg->addr, len);
+
+			RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+				msl->socket_id, len >> 20ULL);
+
+			start = rte_fbarray_find_next_used(arr,
+					start + contig_segs);
+		}
 	}
 
 	return 0;
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 80fb6cc..bd7e757 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -238,17 +238,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t) addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5207713..7851a7d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -643,17 +643,20 @@ eal_parse_args(int argc, char **argv)
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
+	const struct rte_memseg_list *msl;
 	int i, socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		if (msl->socket_id != socket_id)
+			continue;
+		/* for legacy memory, check if there's anything allocated */
+		if (internal_config.legacy_mem && msl->memseg_arr.count == 0)
+			continue;
+		return;
+	}
 
 	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
 			"memory on local socket!\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index b9bcb75..9512da9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -908,6 +908,28 @@ huge_recover_sigbus(void)
 	}
 }
 
+/* in legacy mode, each combination of socket and pagesize directly map to a
+ * single memseg list.
+ */
+static struct rte_memseg_list *
+get_memseg_list(int socket, uint64_t page_sz)
+{
+	struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		if (msl->hugepage_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket)
+			continue;
+		return msl;
+	}
+	return NULL;
+}
+
 /*
  * Prepare physical memory mapping: fill configuration structure with
  * these infos, return 0 on success.
@@ -925,11 +947,14 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
 	int i, j, new_memseg;
+	int ms_idx, msl_idx;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -942,6 +967,12 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		arr = &mcfg->memsegs[0].memseg_arr;
+		ms = rte_fbarray_get(arr, 0);
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -949,14 +980,15 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
+		rte_fbarray_set_used(arr, 0);
 		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
+			ms->iova = (uintptr_t)addr;
 		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+			ms->iova = RTE_BAD_IOVA;
+		ms->addr = addr;
+		ms->hugepage_sz = RTE_PGSIZE_4K;
+		ms->len = internal_config.memory;
+		ms->socket_id = 0;
 		return 0;
 	}
 
@@ -1197,27 +1229,51 @@ eal_legacy_hugepage_init(void)
 #endif
 
 		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
+			struct rte_memseg_list *msl;
+			int socket;
+			uint64_t page_sz;
+
+			socket = hugepage[i].socket_id;
+			page_sz = hugepage[i].size;
+
+			if (page_sz == 0)
+				continue;
+
+			/* figure out where to put this memseg */
+			msl = get_memseg_list(socket, page_sz);
+			if (!msl)
+				rte_panic("Unknown socket or page sz: %i %lx\n",
+					socket, page_sz);
+			msl_idx = msl - &mcfg->memsegs[0];
+			arr = &msl->memseg_arr;
+
+			ms_idx = rte_fbarray_find_next_free(arr, arr->count);
+			if (ms_idx < 0) {
+				RTE_LOG(ERR, EAL, "No space in memseg list\n");
 				break;
+			}
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			ms->iova = hugepage[i].physaddr;
+			ms->addr = hugepage[i].final_va;
+			ms->len = page_sz;
+			ms->socket_id = socket;
+			ms->hugepage_sz = page_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
+			rte_fbarray_set_used(arr, ms_idx);
 		}
 		/* continuation of previous memseg */
 		else {
 #ifdef RTE_ARCH_PPC_64
 		/* Use the phy and virt address of the last page as segment
 		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
+			ms->iova = hugepage[i].physaddr;
+			ms->addr = hugepage[i].final_va;
 #endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
+			ms->len += ms->hugepage_sz;
 		}
-		hugepage[i].memseg_id = j;
+		hugepage[i].memseg_id = ms_idx;
+		hugepage[i].memseg_list_id = msl_idx;
 	}
 
 	if (i < nr_hugefiles) {
@@ -1227,7 +1283,7 @@ eal_legacy_hugepage_init(void)
 			"Please either increase it or request less amount "
 			"of memory.\n",
 			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
+			RTE_MAX_MEMSEG_PER_LIST);
 		goto fail;
 	}
 
@@ -1265,11 +1321,12 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i;
+	int ms_idx, msl_idx;
+	unsigned int cur_seg, max_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1289,46 +1346,57 @@ eal_legacy_hugepage_attach(void)
 	}
 
 	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
+	max_seg = 0;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
 		uint64_t mmap_sz;
 		int mmap_flags = 0;
 
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			void *base_addr;
 
-		/* get identical addresses as the primary process.
-		 */
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			/*
+			 * the first memory segment with len==0 is the one that
+			 * follows the last valid segment.
+			 */
+			if (ms->len == 0)
+				break;
+
+			/* get identical addresses as the primary process.
+			 */
 #ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
+			mmap_flags |= MAP_HUGETLB;
 #endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
+			mmap_sz = ms->len;
+			base_addr = eal_get_virtual_area(ms->addr, &mmap_sz,
+					ms->hugepage_sz, 0, mmap_flags);
+			if (base_addr == NULL) {
+				if (rte_errno == EADDRNOTAVAIL) {
+					RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+						(unsigned long long)ms->len,
+						ms->addr);
+				} else {
+					RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
+						(unsigned long long)ms->len,
+						ms->addr,
+						rte_strerror(rte_errno));
+				}
+				if (aslr_enabled() > 0) {
+					RTE_LOG(ERR, EAL, "It is recommended to "
+						"disable ASLR in the kernel "
+						"and retry running both primary "
+						"and secondary processes\n");
+				}
+				goto error;
 			}
-			goto error;
+			max_seg++;
+			ms_idx++;
+
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx);
 		}
 	}
 
@@ -1342,46 +1410,67 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
+	/* map the actual hugepage files into the space reserved above */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			void *addr, *base_addr;
+			uintptr_t offset = 0;
+			size_t mapping_size;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+			/*
+			 * free previously mapped memory so we can map the
+			 * hugepages into the space
+			 */
+			base_addr = ms->addr;
+			munmap(base_addr, ms->len);
+
+			/*
+			 * find the hugepages for this segment and map them
+			 * we don't need to worry about order, as the server
+			 * sorted the entries before it did the second mmap of
+			 * them
+			 */
+			for (i = 0; i < num_hp && offset < ms->len; i++) {
+				if (hp[i].memseg_id == ms_idx &&
+						hp[i].memseg_list_id ==
+						msl_idx) {
+					fd = open(hp[i].filepath, O_RDWR);
+					if (fd < 0) {
+						RTE_LOG(ERR, EAL, "Could not open %s\n",
+							hp[i].filepath);
+						goto error;
+					}
+					mapping_size = hp[i].size;
+					addr = mmap(RTE_PTR_ADD(base_addr,
+							offset),
+							mapping_size,
+							PROT_READ | PROT_WRITE,
+							MAP_SHARED, fd, 0);
+					/*
+					 * close file both on success and on
+					 * failure
+					 */
+					close(fd);
+					if (addr == MAP_FAILED ||
+							addr != RTE_PTR_ADD(
+							base_addr, offset)) {
+						RTE_LOG(ERR, EAL, "Could not mmap %s\n",
+							hp[i].filepath);
+						goto error;
+					}
+					offset += mapping_size;
 				}
-				offset+=mapping_size;
 			}
+			RTE_LOG(DEBUG, EAL, "Mapped segment of size 0x%llx\n",
+					(unsigned long long)ms->len);
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1389,8 +1478,28 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unmap whatever we mapped before the failure */
+	cur_seg = 0;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+
+		if (cur_seg >= max_seg)
+			break;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+
+			if (cur_seg >= max_seg)
+				break;
+			munmap(ms->addr, ms->len);
+
+			cur_seg++;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index e44ae4d..5192763 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -667,33 +667,53 @@ vfio_get_group_no(const char *sysfs_base,
 static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
 	int i, ret;
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		int ms_idx, next_idx;
 
-		if (ms[i].addr == NULL)
-			break;
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		arr = &msl->memseg_arr;
 
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			dma_map.iova = dma_map.vaddr;
-		else
-			dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+		/* skip empty memseg lists */
+		if (arr->count == 0)
+			continue;
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		next_idx = 0;
 
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
+		while ((ms_idx = rte_fbarray_find_next_used(arr,
+				next_idx)) >= 0) {
+			uint64_t addr, len, hw_addr;
+			const struct rte_memseg *ms;
+			next_idx = ms_idx + 1;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			addr = ms->addr_64;
+			len = ms->hugepage_sz;
+			hw_addr = ms->iova;
+
+			memset(&dma_map, 0, sizeof(dma_map));
+			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+			dma_map.vaddr = addr;
+			dma_map.size = len;
+			dma_map.iova = hw_addr;
+			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+					VFIO_DMA_MAP_FLAG_WRITE;
+
+			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
+					&dma_map);
+
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+						  "error %i (%s)\n", errno,
+						  strerror(errno));
+				return -1;
+			}
 		}
 	}
 
@@ -703,8 +723,8 @@ vfio_type1_dma_map(int vfio_container_fd)
 static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
 	int i, ret;
+	uint64_t hugepage_sz = 0;
 
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
@@ -738,17 +758,31 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int idx, next_idx;
+
+		if (msl->base_va == NULL)
+			continue;
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		next_idx = 0;
+		while ((idx = rte_fbarray_find_next_used(arr, next_idx)) >= 0) {
+			const struct rte_memseg *ms = rte_fbarray_get(arr, idx);
+			hugepage_sz = RTE_MAX(hugepage_sz, ms->hugepage_sz);
+			create.window_size = RTE_MAX(create.window_size,
+					ms->iova + ms->len);
+			next_idx = idx + 1;
+		}
 	}
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.page_shift = __builtin_ctzll(hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -764,41 +798,61 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		int ms_idx, next_idx;
 
-		if (ms[i].addr == NULL)
-			break;
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		arr = &msl->memseg_arr;
 
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
+		/* skip empty memseg lists */
+		if (arr->count == 0)
+			continue;
 
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			dma_map.iova = dma_map.vaddr;
-		else
-			dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
+		next_idx = 0;
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		while ((ms_idx = rte_fbarray_find_next_used(arr,
+				next_idx)) >= 0) {
+			uint64_t addr, len, hw_addr;
+			const struct rte_memseg *ms;
+			next_idx = ms_idx + 1;
 
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			addr = ms->addr_64;
+			len = ms->hugepage_sz;
+			hw_addr = ms->iova;
 
+			reg.vaddr = (uintptr_t) addr;
+			reg.size = len;
+			ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+						errno, strerror(errno));
+				return -1;
+			}
+
+			memset(&dma_map, 0, sizeof(dma_map));
+			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+			dma_map.vaddr = addr;
+			dma_map.size = len;
+			dma_map.iova = hw_addr;
+			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+					VFIO_DMA_MAP_FLAG_WRITE;
+
+			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
+					&dma_map);
+
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+						  "error %i (%s)\n", errno,
+						  strerror(errno));
+				return -1;
+			}
+		}
 	}
 
 	return 0;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index a938a2f..4c2e959 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -215,6 +214,8 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_malloc_dump_heaps;
 	rte_fbarray_init;
 	rte_fbarray_destroy;
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index d23192c..8484fb6 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -705,15 +706,23 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
-/* Check if memory is available on a specific socket */
+/* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
 	unsigned i;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		const struct rte_memseg_list *msl =
+				&mcfg->memsegs[i];
+		const struct rte_fbarray *arr = &msl->memseg_arr;
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (arr->count)
 			return 1;
 	}
 	return 0;
@@ -726,16 +735,8 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..8cb52d7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -25,10 +28,12 @@
 static int
 test_memory(void)
 {
+	const struct rte_memzone *mz = NULL;
 	uint64_t s;
 	unsigned i;
 	size_t j;
-	const struct rte_memseg *mem;
+	struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -40,20 +45,42 @@ test_memory(void)
 	/* check that memory size is != 0 */
 	s = rte_eal_get_physmem_size();
 	if (s == 0) {
-		printf("No memory detected\n");
-		return -1;
+		printf("No memory detected, attempting to allocate\n");
+		mz = rte_memzone_reserve("tmp", 1000, SOCKET_ID_ANY, 0);
+
+		if (!mz) {
+			printf("Failed to allocate a memzone\n");
+			return -1;
+		}
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int search_idx, cur_idx;
+
+		if (arr->count == 0)
+			continue;
+
+		search_idx = 0;
 
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
+		while ((cur_idx = rte_fbarray_find_next_used(arr,
+				search_idx)) >= 0) {
+			const struct rte_memseg *ms;
+
+			ms = rte_fbarray_get(arr, cur_idx);
+
+			/* check memory */
+			for (j = 0; j < ms->len; j++)
+				*((volatile uint8_t *) ms->addr + j);
+			search_idx = cur_idx + 1;
 		}
 	}
 
+	if (mz)
+		rte_memzone_free(mz);
+
 	return 0;
 }
 
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..47f4de8 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -108,22 +108,25 @@ static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
 	int hugepage_2MB_avail = 0;
 	int hugepage_1GB_avail = 0;
 	int hugepage_16MB_avail = 0;
 	int hugepage_16GB_avail = 0;
 	const size_t size = 100;
 	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->hugepage_sz == RTE_PGSIZE_2M)
 			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
+		if (msl->hugepage_sz == RTE_PGSIZE_1G)
 			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
+		if (msl->hugepage_sz == RTE_PGSIZE_16M)
 			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
+		if (msl->hugepage_sz == RTE_PGSIZE_16G)
 			hugepage_16GB_avail = 1;
 	}
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 14/41] eal: add support for mapping hugepages at runtime
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (12 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 13/41] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-19 17:42   ` Olivier Matz
  2018-03-03 13:46 ` [PATCH 15/41] eal: add support for unmapping pages " Anatoly Burakov
                   ` (73 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Nothing uses this code yet. The bulk of it is copied from old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing that we'll get contiguous VA for all of the pages
that we requested.
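
For illustration, an EAL-internal caller could request a VA-contiguous
run of pages roughly like this (the wrapper function, the page size and
the socket below are made up for the example, not part of the patch):

    #include <stdbool.h>
    #include <rte_memory.h>

    #include "eal_memalloc.h"

    /* ask for exactly eight 2MB pages on socket 0; on success, the VA
     * of the returned memsegs is contiguous
     */
    static int
    grab_pages_example(struct rte_memseg *pages[8])
    {
        return eal_memalloc_alloc_page_bulk(pages, 8, RTE_PGSIZE_2M,
                0, true);
    }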

For single-file segments, we will use fallocate() to grow and
shrink memory segments; however, fallocate() is not supported
on all kernel versions, so we will fall back to using
ftruncate() to grow the file, and disable shrinking, as there's
little we can do there. This will enable vhost use cases where
having single-file segments is of great value even without
support for hot-unplugging memory.
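
For reference, the grow/shrink logic looks roughly like the sketch
below (simplified; the helper name, and the omission of locking and of
caching whether fallocate() is supported, are liberties taken here,
not the actual code):

    #define _GNU_SOURCE /* for fallocate() and FALLOC_FL_* */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* grow or shrink one page-sized region of a hugepage file */
    static int
    resize_region(int fd, uint64_t offset, uint64_t page_sz, bool grow)
    {
        int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
        struct stat st;

        if (fallocate(fd, flags, offset, page_sz) == 0)
            return 0;
        if (errno != ENOTSUP || !grow)
            return -1; /* can't punch holes without fallocate() */

        /* fall back to ftruncate(), which can only ever grow the file */
        if (fstat(fd, &st) < 0)
            return -1;
        if (st.st_size < (off_t)(offset + page_sz) &&
                ftruncate(fd, offset + page_sz) < 0)
            return -1;
        return 0;
    }

The actual patch additionally caches whether fallocate() is supported,
so the fallback is decided only once.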

Not supported on FreeBSD.

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't
have to keep the original fd's around. Plus, using fcntl() gives
us the ability to lock parts of a file, which is useful for
single-file segments.
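
As a rough sketch of the idea (a simplified, illustrative helper, not
the exact code):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdint.h>
    #include <unistd.h>

    /* returns 1 if the lock was taken, 0 if another process holds a
     * conflicting lock, and -1 on unexpected error
     */
    static int
    lock_range(int fd, uint64_t offset, uint64_t len, int type)
    {
        struct flock fl = {
            .l_type = type, /* F_RDLCK, F_WRLCK or F_UNLCK */
            .l_whence = SEEK_SET,
            .l_start = (off_t)offset,
            .l_len = (off_t)len,
        };

        if (fcntl(fd, F_SETLK, &fl) == 0)
            return 1;
        return (errno == EAGAIN || errno == EACCES) ? 0 : -1;
    }

At allocation time a page-sized range is locked with F_RDLCK; at
deallocation time, being able to take a non-blocking F_WRLCK on that
range means no other process is still using the page.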

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  19 +
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 609 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 659 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..be8340b
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n, uint64_t __rte_unused size,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t __rte_unused size, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..c1076cf
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t size, int socket);
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
+		int socket, bool exact);
+
+#endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..1ba1201
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,609 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+/*
+ * not all kernel versions support fallocate() on hugetlbfs, so fall back to
+ * ftruncate() and disallow deallocation if fallocate() is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's, since we need each fd
+ * only once. However, in single file segments mode, we use a single fd for an
+ * entire memseg list, so we need to store these fd's somewhere. Each fd is
+ * different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Doubly linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrapper function to avoid compile errors. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	bool have_numa = true;
+
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		have_numa = false;
+	}
+
+	if (have_numa) {
+		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+		if (get_mempolicy(oldpolicy, oldmask->maskp,
+				  oldmask->size + 1, 0, 0) < 0) {
+			RTE_LOG(ERR, EAL,
+				"Failed to get current mempolicy: %s. "
+				"Assuming MPOL_DEFAULT.\n", strerror(errno));
+			*oldpolicy = MPOL_DEFAULT;
+		}
+		RTE_LOG(DEBUG, EAL,
+			"Setting policy MPOL_PREFERRED for socket %d\n",
+			socket_id);
+		numa_set_preferred(socket_id);
+	}
+	return have_numa;
+}
+
+static void
+resotre_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+getFileSize(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if a file has no data blocks allocated on disk (st_size
+ * won't reflect holes punched by fallocate(), so check st_blocks instead)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
+static int
+get_page_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck = {0};
+	int ret;
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = getFileSize(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
+alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+	int cur_socket_id = 0;
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+
+	fd = get_page_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * size;
+		ret = resize_hugefile(fd, map_offset, size, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, size) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with
+		 * flock() we'd have to use the original fd, as the lock is
+		 * associated with that particular fd. with fcntl(), this is
+		 * not necessary - we can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, size, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
+	}
+
+	/*
+	 * map the segment and populate page tables; the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, size, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In Linux, hugetlb limitations, like cgroup, are
+	 * enforced at fault time instead of mmap() time, even
+	 * with the option of MAP_POPULATE. The kernel will send
+	 * a SIGBUS signal. To avoid being killed, save the stack
+	 * environment here; if SIGBUS happens, we can jump
+	 * back to it.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(size / 0x100000));
+		goto mapped;
+	}
+	*(int *)addr = *(int *) addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = size;
+	ms->len = size;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, size);
+resized:
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, size, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
+	return -1;
+}
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
+		uint64_t size, int socket, bool exact)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *addr;
+	unsigned int msl_idx;
+	int cur_idx, end_idx, i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa;
+	int oldpolicy;
+	struct bitmask *oldmask = numa_allocate_nodemask();
+#endif
+	struct hugepage_info *hi = NULL;
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		goto restore_numa;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (size ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		goto restore_numa;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	have_numa = prepare_numa(&oldpolicy, oldmask, socket);
+#endif
+
+	/* there may be several memseg lists for this page size and socket ID,
+	 * so try allocating on all of them.
+	 */
+
+	/* find our memseg list */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *cur_msl = &mcfg->memsegs[msl_idx];
+
+		if (cur_msl->hugepage_sz != size)
+			continue;
+		if (cur_msl->socket_id != socket)
+			continue;
+		msl = cur_msl;
+
+		/* try finding space in memseg list */
+		cur_idx = rte_fbarray_find_next_n_free(&msl->memseg_arr, 0, n);
+
+		if (cur_idx < 0)
+			continue;
+
+		end_idx = cur_idx + n;
+
+		for (i = 0; cur_idx < end_idx; cur_idx++, i++) {
+			struct rte_memseg *cur;
+
+			cur = rte_fbarray_get(&msl->memseg_arr, cur_idx);
+			addr = RTE_PTR_ADD(msl->base_va,
+					cur_idx * msl->hugepage_sz);
+
+			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
+					cur_idx)) {
+				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
+					n, i);
+
+				/* if exact number wasn't requested, stop */
+				if (!exact)
+					ret = i;
+				goto restore_numa;
+			}
+			if (ms)
+				ms[i] = cur;
+
+			rte_fbarray_set_used(&msl->memseg_arr, cur_idx);
+		}
+		ret = n;
+
+		break;
+	}
+	/* we didn't break */
+	if (!msl) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+	}
+
+restore_numa:
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		resotre_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t size, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_page_bulk(&ms, 1, size, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 15/41] eal: add support for unmapping pages at runtime
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (13 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 16/41] eal: make use of memory hotplug for init Anatoly Burakov
                   ` (72 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This isn't used anywhere yet, but the support is now there. Also,
add cleanup to the allocation procedures, so that if we fail to
allocate everything we asked for, we can free it all back.
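
The core trick for releasing a page without giving up its VA slot is to
map anonymous memory over it, so that the address range stays reserved
for later allocations. A minimal sketch (the helper name is made up and
error handling is trimmed):

    #include <stdint.h>
    #include <sys/mman.h>

    /* release the hugepage backing 'addr', but keep the VA range
     * reserved for future page allocations
     */
    static int
    release_page(void *addr, uint64_t page_sz)
    {
        /* atomically replace the hugepage mapping with a plain
         * anonymous one; the kernel frees the hugepage, while the
         * VA range stays ours
         */
        if (mmap(addr, page_sz, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                -1, 0) == MAP_FAILED)
            return -1;
        return 0;
    }

After that, the backing file (or, in single-file segments mode, the
relevant part of it) can be shrunk or unlinked, as the code below does.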

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_memalloc.h       |   3 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 148 ++++++++++++++++++++++++++++-
 2 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index c1076cf..adf59c4 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -16,4 +16,7 @@ int
 eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
 		int socket, bool exact);
 
+int
+eal_memalloc_free_page(struct rte_memseg *ms);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 1ba1201..bbeeeba 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -499,6 +499,64 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	return -1;
 }
 
+static int
+free_page(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int fd, ret;
+
+	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_page_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->hugepage_sz;
+		if (resize_hugefile(fd, map_offset, ms->hugepage_sz, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			unlink(path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->hugepage_sz, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->hugepage_sz, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
+	}
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 int
 eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 		uint64_t size, int socket, bool exact)
@@ -507,7 +565,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 	struct rte_memseg_list *msl = NULL;
 	void *addr;
 	unsigned int msl_idx;
-	int cur_idx, end_idx, i, ret = -1;
+	int cur_idx, start_idx, end_idx, i, j, ret = -1;
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	bool have_numa;
 	int oldpolicy;
@@ -557,6 +615,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 			continue;
 
 		end_idx = cur_idx + n;
+		start_idx = cur_idx;
 
 		for (i = 0; cur_idx < end_idx; cur_idx++, i++) {
 			struct rte_memseg *cur;
@@ -567,25 +626,56 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 
 			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
 					cur_idx)) {
+
 				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
 					n, i);
 
-				/* if exact number wasn't requested, stop */
-				if (!exact)
+				/* if exact number of pages wasn't requested,
+				 * failing to allocate is not an error. we could
+				 * of course try other lists to see if there are
+				 * better fits, but a bird in the hand...
+				 */
+				if (!exact) {
 					ret = i;
-				goto restore_numa;
+					goto restore_numa;
+				}
+				RTE_LOG(DEBUG, EAL, "exact amount of pages was requested, so returning %i allocated pages\n",
+					i);
+
+				/* clean up */
+				for (j = start_idx; j < cur_idx; j++) {
+					struct rte_memseg *tmp;
+					struct rte_fbarray *arr =
+							&msl->memseg_arr;
+
+					tmp = rte_fbarray_get(arr, j);
+					if (free_page(tmp, hi,
+							msl_idx, j))
+						rte_panic("Cannot free page\n");
+
+					rte_fbarray_set_free(arr, j);
+				}
+				/* clear the list */
+				if (ms)
+					memset(ms, 0, sizeof(*ms) * n);
+
+				/* try next list */
+				goto next_list;
 			}
 			if (ms)
 				ms[i] = cur;
 
 			rte_fbarray_set_used(&msl->memseg_arr, cur_idx);
 		}
+		/* we allocated all pages */
 		ret = n;
 
 		break;
+next_list:
+		/* dummy semi-colon to make label work */;
 	}
 	/* we didn't break */
-	if (!msl) {
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
 	}
@@ -607,3 +697,51 @@ eal_memalloc_alloc_page(uint64_t size, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_page(struct rte_memseg *ms)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	unsigned int msl_idx, seg_idx;
+	struct hugepage_info *hi = NULL;
+	int i;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (ms->hugepage_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		uintptr_t start_addr, end_addr;
+		struct rte_memseg_list *cur = &mcfg->memsegs[msl_idx];
+
+		start_addr = (uintptr_t) cur->base_va;
+		end_addr = start_addr + cur->memseg_arr.len * cur->hugepage_sz;
+
+		if ((uintptr_t) ms->addr < start_addr ||
+				(uintptr_t) ms->addr >= end_addr) {
+			continue;
+		}
+		msl = cur;
+		seg_idx = RTE_PTR_DIFF(ms->addr, start_addr) / ms->hugepage_sz;
+		break;
+	}
+	if (!msl) {
+		RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		return -1;
+	}
+	rte_fbarray_set_free(&msl->memseg_arr, seg_idx);
+	return free_page(ms, hi, msl_idx, seg_idx);
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 16/41] eal: make use of memory hotplug for init
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (14 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 15/41] eal: add support for unmapping pages " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
                   ` (71 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Add a new (non-legacy) memory init path for EAL. It uses the
new memory hotplug facilities, although it's only being run
at startup.

If no -m or --socket-mem switches were specified, the new init
will not allocate anything, whereas if those switches were passed,
the appropriate number of pages will be requested, just like for
legacy init.

Since rte_malloc support for dynamic allocation comes in later
patches, running DPDK without --socket-mem or -m switches will
fail in this patch.

Also, allocated pages will be physically discontiguous (or rather,
they're not guaranteed to be physically contiguous - they may still
be, by accident) unless IOVA_AS_VA mode is used.

Since the memory hotplug subsystem relies on partial file locking,
replace flock() locks with fcntl() locks.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    This commit shows "the world as it could have been". The rest of the
    monstrous amount of code in eal_memory.c is there because of the
    legacy init option. Do we *really* want to keep it around, and make
    DPDK init and the memory system suffer from a split personality?

 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 25 ++++++++-
 lib/librte_eal/linuxapp/eal/eal_memory.c        | 74 +++++++++++++++++++++++--
 2 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 706b6d5..7e2475f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -200,6 +201,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+getFileSize(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -229,6 +242,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -245,11 +260,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = getFileSize(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 9512da9..e0b4988 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -260,6 +261,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	void *virtaddr;
 	void *vma_addr = NULL;
 	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -434,8 +436,12 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -1300,6 +1306,62 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %luM on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_page_bulk(NULL,
+					num_pages,
+					hpi->hugepage_sz, socket_id,
+					true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1510,9 +1572,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
@@ -1520,6 +1582,8 @@ rte_eal_hugepage_attach(void)
 {
 	if (internal_config.legacy_mem)
 		return eal_legacy_hugepage_attach();
+	else
+		RTE_LOG(ERR, EAL, "Secondary processes aren't supported yet\n");
 	return -1;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 17/41] eal: enable memory hotplug support in rte_malloc
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (15 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 16/41] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-19 17:46   ` Olivier Matz
  2018-03-03 13:46 ` [PATCH 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
                   ` (70 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This set of changes enables rte_malloc to allocate and free memory
as needed. The way it works is, first malloc checks if there is
enough memory already allocated to satisfy the user's request. If
there isn't, we try to allocate more memory. The reverse happens
with free - we free an element, check its size (including free
element merging due to adjacency) and see if it's bigger than the
hugepage size and that its start and end span a hugepage or more.
Then we remove the area from the malloc heap (adjusting element
lengths where appropriate), and deallocate the page.
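
Only the part of a freed element that covers whole pages can actually
be given back; a sketch of that check, using the existing RTE_PTR_*
helpers (the function below is purely illustrative and not part of the
patch):

    #include <stdbool.h>
    #include <stddef.h>

    #include <rte_common.h>

    /* can any whole pages of the free element [elem, elem + len) be
     * returned to the system?
     */
    static bool
    can_shrink(void *elem, size_t len, size_t page_sz)
    {
        void *end, *aligned_start, *aligned_end;
        size_t aligned_len;

        if (len < page_sz)
            return false;

        end = RTE_PTR_ADD(elem, len);
        aligned_start = RTE_PTR_ALIGN_CEIL(elem, page_sz);
        aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
        aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);

        return aligned_len >= page_sz;
    }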

For legacy mode, runtime alloc/free of pages is disabled.

It is worth noting that memseg lists are sorted by page size, and
that we try our best to satisfy the user's request. That is, if
the user requests an element from 2MB-page memory, we will check
if we can satisfy that request from existing memory; if not, we
try to allocate more 2MB pages. If that fails and the user also
specified a "size is hint" flag, we then check other page sizes
and try to allocate from there. If that fails too, then, depending
on flags, we may try allocating from other sockets. In other
words, we try our best to give the user what they asked for, but
going to other sockets is a last resort - first we try to allocate
more memory on the same socket.
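
In terms of the public API, an illustrative call (the zone name and
size here are arbitrary) that exercises this fallback order would be:

    #include <rte_memory.h>
    #include <rte_memzone.h>

    /* prefer 2MB pages, but, thanks to the hint flag, accept other
     * page sizes (and, with SOCKET_ID_ANY, other sockets) before
     * giving up
     */
    static const struct rte_memzone *
    reserve_example_zone(void)
    {
        return rte_memzone_reserve("example_zone", 16 << 20,
                SOCKET_ID_ANY,
                RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
    }

With SOCKET_ID_ANY and the size hint flag set, the allocator will first
try the preferred page size on the local socket, then other page sizes
on that socket, and only then other sockets.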

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  23 +-
 lib/librte_eal/common/malloc_elem.c        |  85 ++++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 332 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 416 insertions(+), 62 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index ed36174..718dee8 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -103,7 +103,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
-	int socket, i;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -181,27 +180,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound);
 
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 701bffd..eabad66 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -400,6 +400,91 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	size_t len_before, len_after;
+	struct malloc_elem *prev, *next;
+	void *end, *elem_end;
+
+	end = RTE_PTR_ADD(start, len);
+	elem_end = RTE_PTR_ADD(elem, elem->size);
+	len_before = RTE_PTR_DIFF(start, elem);
+	len_after = RTE_PTR_DIFF(elem_end, end);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split after */
+		struct malloc_elem *split_after = end;
+
+		split_elem(elem, split_after);
+
+		next = split_after;
+
+		malloc_elem_free_list_insert(split_after);
+	} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+		struct malloc_elem *pad_elem = end;
+
+		/* shrink current element */
+		elem->size -= len_after;
+		memset(pad_elem, 0, sizeof(*pad_elem));
+
+		/* copy next element's data to our pad */
+		memcpy(pad_elem, next, sizeof(*pad_elem));
+
+		/* pad next element */
+		next->state = ELEM_PAD;
+		next->pad = len_after;
+
+		/* next element is busy, would've been merged otherwise */
+		pad_elem->pad = len_after;
+		pad_elem->size += len_after;
+
+		/* adjust pointers to point to our new pad */
+		pad_elem->next->prev = pad_elem;
+		elem->next = pad_elem;
+	} else if (len_after > 0) {
+		rte_panic("Unaligned element, heap is probably corrupt\n");
+	}
+
+	if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split before */
+		struct malloc_elem *split_before = start;
+
+		split_elem(elem, split_before);
+
+		prev = elem;
+		elem = split_before;
+
+		malloc_elem_free_list_insert(prev);
+	} else if (len_before > 0) {
+		/*
+		 * unlike with elements after the current one, here we don't
+		 * need to pad elements, but just increase the size of the
+		 * previous element, copy the old header and set up the trailer.
+		 */
+		void *trailer = RTE_PTR_ADD(prev,
+				prev->size - MALLOC_ELEM_TRAILER_LEN);
+		struct malloc_elem *new_elem = start;
+
+		memcpy(new_elem, elem, sizeof(*elem));
+		new_elem->size -= len_before;
+
+		prev->size += len_before;
+		set_trailer(prev);
+
+		elem = new_elem;
+
+		/* erase old trailer */
+		memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 388c16f..6d979d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -152,6 +152,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 058ad75..87dc9ad 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -123,48 +125,356 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	size_t map_len;
+	int i, n_pages, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_pages = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz, socket,
+			true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages != n_pages)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+	for (i = 0; i < n_pages; i++)
+		eal_memalloc_free_page(ms[i]);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->hugepage_sz;
+	uint64_t pg_sz_b = mslb->hugepage_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->hugepage_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->hugepage_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->hugepage_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			rte_panic("Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound)
+{
+	int socket, i;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_num_sockets(); i++) {
+		if (i == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, i, flags,
+				align, bound);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len;
+	struct rte_memseg_list *msl;
+	int n_pages, page_idx, max_page_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	elem = malloc_elem_free(elem);
 
-	rte_spinlock_unlock(&(heap->lock));
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < msl->hugepage_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up a full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, msl->hugepage_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, msl->hugepage_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < msl->hugepage_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_pages = aligned_len / msl->hugepage_sz;
+	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / msl->hugepage_sz;
+	max_page_idx = page_idx + n_pages;
+
+	for (; page_idx < max_page_idx; page_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
+		eal_memalloc_free_page(ms);
+		heap->total_size -= msl->hugepage_sz;
+	}
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..292d578 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -24,8 +24,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index bd7e757..b0fe11c 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,10 +39,6 @@ void rte_free(void *addr)
 void *
 rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -50,33 +46,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 18/41] test: fix malloc autotest to support memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (16 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-19 17:49   ` Olivier Matz
  2018-03-03 13:46 ` [PATCH 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
                   ` (69 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The test expected memory to already be allocated on all sockets, and
thus failed, because calling rte_malloc could trigger a memory hotplug
event and allocate memory where there was none before.

Fix it to instead report availability of memory on specific sockets by
attempting to allocate a page and checking whether that succeeds.
Technically, this can still cause a failure: memory might not be
available at the time of the check, but become available by the time
the test is run. However, this is a corner case not worth considering.
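
The probing idea boils down to the following sketch, using the internal
EAL allocator API that the test below relies on (page_sz and socket are
assumed to be already known):

	/* probe a socket by allocating one page and freeing it again */
	struct rte_memseg *ms = eal_memalloc_alloc_page(page_sz, socket);

	if (ms != NULL) {
		eal_memalloc_free_page(ms);
		return 1; /* the socket can currently supply memory */
	}
	return 0;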

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/test_malloc.c | 52 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 8484fb6..2aaf1b8 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -22,6 +22,8 @@
 #include <rte_random.h>
 #include <rte_string_fns.h>
 
+#include "../../lib/librte_eal/common/eal_memalloc.h"
+
 #include "test.h"
 
 #define N 10000
@@ -708,22 +710,56 @@ test_malloc_bad_params(void)
 
 /* Check if memory is available on a specific socket */
 static int
-is_mem_on_socket(int32_t socket)
+is_mem_on_socket(unsigned int socket)
 {
+	struct rte_malloc_socket_stats stats;
 	const struct rte_mem_config *mcfg =
 			rte_eal_get_configuration()->mem_config;
-	unsigned i;
+	uint64_t prev_pgsz;
+	unsigned int i;
+
+	/* we cannot know if there's memory on a specific socket, since it might
+	 * be available, but not yet allocated. so, in addition to checking
+	 * already mapped memory, we will attempt to allocate a page from that
+	 * socket and see if it works.
+	 */
+	if (socket >= rte_num_sockets())
+		return 0;
 
+	rte_malloc_get_socket_stats(socket, &stats);
+
+	/* if heap has memory allocated, stop */
+	if (stats.heap_totalsz_bytes > 0)
+		return 1;
+
+	/* to allocate a page, we will have to know its size, so go through all
+	 * supported page sizes and try with each one.
+	 */
+	prev_pgsz = 0;
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
-		const struct rte_memseg_list *msl =
-				&mcfg->memsegs[i];
-		const struct rte_fbarray *arr = &msl->memseg_arr;
+		const struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		uint64_t page_sz;
 
-		if (msl->socket_id != socket)
+		/* skip unused memseg lists */
+		if (msl->memseg_arr.len == 0)
 			continue;
+		page_sz = msl->hugepage_sz;
 
-		if (arr->count)
-			return 1;
+		/* skip page sizes we've tried already */
+		if (prev_pgsz == page_sz)
+			continue;
+
+		prev_pgsz = page_sz;
+
+		struct rte_memseg *ms = eal_memalloc_alloc_page(page_sz,
+				socket);
+
+		if (ms == NULL)
+			continue;
+
+		eal_memalloc_free_page(ms);
+
+		return 1;
 	}
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 19/41] eal: add API to check if memory is contiguous
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (17 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
                   ` (68 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This will be helpful down the line when we implement support for
allocating physically contiguous memory. We can no longer guarantee
physically contiguous memory unless we're in IOVA_AS_VA mode, but we
can certainly try and see if we succeed. In addition, this would be
useful for e.g. PMDs that allocate chunks smaller than the page size
which must not cross a page boundary; with this check, we will be able
to accommodate such requests.
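
For illustration, the malloc code later in this series uses the new
helper in essentially the following way to check whether a candidate
data region is physically contiguous (a sketch, not the exact code):

	static bool
	region_is_contig(struct rte_memseg_list *msl, void *start, size_t size)
	{
		uint64_t page_sz = msl->hugepage_sz;
		void *aligned_start = RTE_PTR_ALIGN_FLOOR(start, page_sz);
		void *end = RTE_PTR_ADD(start, size);
		void *aligned_end = RTE_PTR_ALIGN_CEIL(end, page_sz);

		/* widen to whole pages, then check their IOVAs are consecutive */
		return eal_memalloc_is_contig(msl, aligned_start,
				RTE_PTR_DIFF(aligned_end, aligned_start));
	}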

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 49 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        |  5 +++
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 5 files changed, 57 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..62e8c16
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	const struct rte_memseg *ms;
+	uint64_t page_sz;
+	void *end;
+	int start_page, end_page, cur_page;
+	rte_iova_t expected;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	/* figure out how many pages we need to fit in current data */
+	page_sz = msl->hugepage_sz;
+	end = RTE_PTR_ADD(start, len);
+
+	start_page = RTE_PTR_DIFF(start, msl->base_va) / page_sz;
+	end_page = RTE_PTR_DIFF(end, msl->base_va) / page_sz;
+
+	/* now, look for contiguous memory */
+	ms = rte_fbarray_get(&msl->memseg_arr, start_page);
+	expected = ms->iova + page_sz;
+
+	for (cur_page = start_page + 1; cur_page < end_page;
+			cur_page++, expected += page_sz) {
+		ms = rte_fbarray_get(&msl->memseg_arr, cur_page);
+
+		if (ms->iova != expected)
+			return false;
+	}
+
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index adf59c4..08ba70e 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 struct rte_memseg *
 eal_memalloc_alloc_page(uint64_t size, int socket);
@@ -19,4 +20,8 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
 int
 eal_memalloc_free_page(struct rte_memseg *ms);
 
+bool
+eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 20/41] eal: add backend support for contiguous allocation
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (18 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
                   ` (67 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

No major changes, just add some checks in a few key places, and
a new parameter to pass around.
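
For reference, after this change an internal caller that needs
physically contiguous memory would invoke the allocator roughly as in
the sketch below (the requested size is illustrative):

	/* 64 KB, cache-line aligned, no boundary, physically contiguous */
	void *p = malloc_heap_alloc(NULL, 64 * 1024, SOCKET_ID_ANY, 0,
			RTE_CACHE_LINE_SIZE, 0, true);

	if (p == NULL) {
		/* either out of memory, or no IOVA-contiguous run was found */
	}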

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  20 +++---
 lib/librte_eal/common/malloc_elem.c        | 101 ++++++++++++++++++++++-------
 lib/librte_eal/common/malloc_elem.h        |   4 +-
 lib/librte_eal/common/malloc_heap.c        |  57 ++++++++++------
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   6 +-
 6 files changed, 134 insertions(+), 58 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 718dee8..75c7dd9 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -182,7 +183,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
-			align, bound);
+			align, bound, contig);
 
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
@@ -215,9 +216,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -228,7 +229,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -245,7 +246,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -257,7 +258,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -269,7 +270,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eabad66..d2dba35 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -17,6 +17,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -94,33 +95,88 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(struct rte_memseg_list *msl, void *start, size_t size)
+{
+	uint64_t page_sz;
+	void *aligned_start, *end, *aligned_end;
+	size_t aligned_len;
+
+	/* figure out how many pages we need to fit in current data */
+	page_sz = msl->hugepage_sz;
+	aligned_start = RTE_PTR_ALIGN_FLOOR(start, page_sz);
+	end = RTE_PTR_ADD(start, size);
+	aligned_end = RTE_PTR_ALIGN_CEIL(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	return eal_memalloc_is_contig(msl, aligned_start, aligned_len);
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	/*
+	 * we're allocating from the end, so adjust the size of element by page
+	 * size each time
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+
+		/* if the new start point is before the exist start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
+
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->msl,
+					(void *) new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *) new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +185,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +315,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 6d979d2..798472e 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -123,7 +123,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +131,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 87dc9ad..984e027 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -94,7 +94,7 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -103,7 +103,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags,
 						elem->msl->hugepage_sz))
 					return elem;
@@ -127,16 +128,16 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  */
 static void *
 heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
@@ -147,14 +148,15 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 
 static int
 try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
-		int socket, unsigned int flags, size_t align, size_t bound)
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
 {
+	size_t map_len, data_start_offset;
 	struct rte_memseg_list *msl;
 	struct rte_memseg **ms;
 	struct malloc_elem *elem;
-	size_t map_len;
 	int i, n_pages, allocd_pages;
-	void *ret, *map_addr;
+	void *ret, *map_addr, *data_start;
 
 	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
 	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
@@ -175,11 +177,22 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
 
+	/* check if we wanted contiguous memory but didn't get it */
+	data_start_offset = RTE_ALIGN(MALLOC_ELEM_HEADER_LEN, align);
+	data_start = RTE_PTR_ADD(ms[0]->addr, data_start_offset);
+	if (contig && !eal_memalloc_is_contig(msl, data_start,
+			n_pages * msl->hugepage_sz)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
 	/* add newly minted memsegs to malloc heap */
 	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
 
 	/* try once more, as now we have allocated new memory */
-	ret = find_suitable_element(heap, elt_size, flags, align, bound);
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
 
 	if (ret == NULL)
 		goto free_elem;
@@ -196,6 +209,7 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	malloc_elem_hide_region(elem, map_addr, map_len);
 	heap->total_size -= map_len;
 
+free_pages:
 	for (i = 0; i < n_pages; i++)
 		eal_memalloc_free_page(ms[i]);
 free_ms:
@@ -223,7 +237,7 @@ compare_pagesz(const void *a, const void *b)
 
 static int
 alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound)
+		size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
@@ -304,14 +318,14 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 		 * sizes first, before resorting to best effort allocation.
 		 */
 		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
-				align, bound))
+				align, bound, contig))
 			return 0;
 	}
 	if (n_other_pg_sz == 0)
 		return -1;
 
 	/* now, check if we can reserve anything with size hint */
-	ret = find_suitable_element(heap, size, flags, align, bound);
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (ret != NULL)
 		return 0;
 
@@ -323,7 +337,7 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 		uint64_t pg_sz = other_pg_sz[i];
 
 		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
-				align, bound))
+				align, bound, contig))
 			return 0;
 	}
 	return -1;
@@ -332,7 +346,7 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 /* this will try lower page sizes first */
 static void *
 heap_alloc_on_socket(const char *type, size_t size, int socket,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
@@ -345,7 +359,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 	/* for legacy mode, try once and with all flags */
 	if (internal_config.legacy_mem) {
-		ret = heap_alloc(heap, type, size, flags, align, bound);
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 		goto alloc_unlock;
 	}
 
@@ -354,12 +368,12 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	 * we may still be able to allocate memory from appropriate page sizes,
 	 * we just need to request more memory first.
 	 */
-	ret = heap_alloc(heap, type, size, size_flags, align, bound);
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound)) {
-		ret = heap_alloc(heap, type, size, flags, align, bound);
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
 		if (ret == NULL)
@@ -372,7 +386,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 void *
 malloc_heap_alloc(const char *type, size_t size, int socket_arg,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	int socket, i;
 	void *ret;
@@ -393,7 +407,8 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	if (socket >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound);
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -402,7 +417,7 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 		if (i == socket)
 			continue;
 		ret = heap_alloc_on_socket(type, size, i, flags,
-				align, bound);
+				align, bound, contig);
 		if (ret != NULL)
 			return ret;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index 292d578..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
-		size_t align, size_t bound);
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index b0fe11c..5cd92d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
@@ -50,8 +51,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	return malloc_heap_alloc(type, size, socket_arg, 0,
-			align == 0 ? 1 : align, 0);
+	return malloc_heap_alloc(type, size, socket_arg, 0, align, 0, false);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 21/41] eal: enable reserving physically contiguous memzones
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (19 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 22/41] eal: replace memzone array with fbarray Anatoly Burakov
                   ` (66 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This adds a new set of _contig APIs to rte_memzone.
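
For context, a usage sketch of the new API (the zone name, length and
the device-programming helper are made up for the example):

	const struct rte_memzone *mz;

	/* reserve 1 MB of IOVA-contiguous memory, e.g. for a descriptor ring */
	mz = rte_memzone_reserve_contig("example_ring", 1 << 20, rte_socket_id(),
			RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
	if (mz == NULL)
		return -rte_errno; /* e.g. ENOMEM if no contiguous run exists */

	/* the whole zone is IOVA-contiguous, so one base/len pair suffices */
	hw_setup_ring(mz->iova, mz->len); /* hypothetical device helper */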

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c  |  44 ++++++++
 lib/librte_eal/common/include/rte_memzone.h | 154 ++++++++++++++++++++++++++++
 2 files changed, 198 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 75c7dd9..8c9aa28 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -170,6 +170,12 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		socket_id = SOCKET_ID_ANY;
 
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -251,6 +257,19 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 
 /*
  * Return a pointer to a correctly filled memzone descriptor (with a
+ * specified alignment and boundary). If the allocation cannot be done,
+ * return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_bounded_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, bound, true);
+}
+
+/*
+ * Return a pointer to a correctly filled memzone descriptor (with a
  * specified alignment). If the allocation cannot be done, return NULL.
  */
 const struct rte_memzone *
@@ -262,6 +281,18 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 }
 
 /*
+ * Return a pointer to a correctly filled, IOVA-contiguous memzone descriptor
+ * (with a specified alignment). If the allocation cannot be done, return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_aligned_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, 0, true);
+}
+
+/*
  * Return a pointer to a correctly filled memzone descriptor. If the
  * allocation cannot be done, return NULL.
  */
@@ -274,6 +305,19 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 					       false);
 }
 
+/*
+ * Return a pointer to a correctly filled, IOVA-contiguous memzone descriptor.
+ * If the allocation cannot be done, return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id,
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       true);
+}
+
 int
 rte_memzone_free(const struct rte_memzone *mz)
 {
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index a69f068..5f1293f 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -227,6 +227,160 @@ const struct rte_memzone *rte_memzone_reserve_bounded(const char *name,
 			unsigned flags, unsigned align, unsigned bound);
 
 /**
+ * Reserve an IOVA-contiguous portion of physical memory.
+ *
+ * This function reserves some IOVA-contiguous memory and returns a pointer to a
+ * correctly filled memzone descriptor. If the allocation cannot be
+ * done, return NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with alignment on a
+ * specified boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with alignment on a
+ * specified boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * is not a power of 2, returns NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_aligned_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with specified
+ * alignment and boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with specified alignment
+ * and boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * or boundary are not a power of 2, returns NULL.
+ * The memory buffer is reserved in a way that it wouldn't cross the
+ * specified boundary. That implies that the requested length should be
+ * less than or equal to the boundary.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @param bound
+ *   Boundary for resulting memzone. Must be a power of 2 or zero.
+ *   Zero value implies no boundary condition.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_bounded_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align, unsigned int bound);
+
+/**
  * Free a memzone.
  *
  * @param mz
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 22/41] eal: replace memzone array with fbarray
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (20 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 23/41] mempool: add support for the new allocation methods Anatoly Burakov
                   ` (65 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The fbarray infrastructure is already there, so we might as well use
it; some operations will be sped up as a result.

Since we have to allocate an fbarray for memzones, we have to do it
before we initialize the memory subsystem: in secondary processes,
memory init will (later) allocate more fbarrays than the primary
process did, which would make it impossible to attach to the memzone
fbarray if we did it after the fact.
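
For readers unfamiliar with fbarray, the lookup/walk pattern this patch
switches to is essentially the following sketch:

	struct rte_fbarray *arr = &mcfg->memzones;
	int i = 0;

	/* visit every occupied memzone slot */
	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
		struct rte_memzone *mz = rte_fbarray_get(arr, i);

		/* ... inspect or act on mz ... */
		i++; /* move past the slot we just visited */
	}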

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    The code for the ENA driver makes little sense to me, but I've
    attempted to keep the same semantics as the old code.

 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |   6 +
 lib/librte_eal/common/eal_common_memzone.c        | 180 +++++++++++++++-------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 +-
 test/test/test_memzone.c                          |   9 +-
 7 files changed, 157 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 45e5670..3b06e21 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -608,6 +608,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_tailqs_init() < 0) {
 		rte_eal_init_alert("Cannot init tail queues for objects\n");
 		rte_errno = EFAULT;
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 8c9aa28..a7cfdaf 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,29 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		mz = rte_fbarray_get(arr, i++);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
 	}
 
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,13 +90,16 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
+	int idx;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -199,7 +189,14 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, idx);
+		mz = rte_fbarray_get(arr, idx);
+	}
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
@@ -209,7 +206,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -322,6 +318,8 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
 	void *addr;
 	unsigned idx;
@@ -330,21 +328,26 @@ rte_memzone_free(const struct rte_memzone *mz)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		if (addr == NULL)
+			ret = -EINVAL;
+		else if (arr->count == 0) {
+			rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
+					__func__);
+		} else {
+			memset(found_mz, 0, sizeof(*found_mz));
+			rte_fbarray_set_free(arr, idx);
+		}
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
@@ -378,25 +381,79 @@ rte_memzone_lookup(const char *name)
 void
 rte_memzone_dump(FILE *f)
 {
+	struct rte_fbarray *arr;
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
 	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		void *cur_addr, *mz_end;
+		struct rte_memzone *mz;
+		struct rte_memseg_list *msl = NULL;
+		struct rte_memseg *ms;
+		int ms_idx;
+
+		mz = rte_fbarray_get(arr, i);
+
+		/*
+		 * memzones can span multiple physical pages, so dump addresses
+		 * of all physical pages this memzone spans.
+		 */
+
+		fprintf(f, "Zone %u: name:<%s>, len:0x%zx"
 		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
+		       mz->name,
+		       mz->len,
+		       mz->addr,
+		       mz->socket_id,
+		       mz->flags);
+
+		msl = rte_mem_virt2memseg_list(mz->addr);
+		if (!msl) {
+			RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+			continue;
+		}
+
+		cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, mz->hugepage_sz);
+		mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+		fprintf(f, "physical segments used:\n");
+		if (msl->base_va == NULL) {
+			/* if memseg list base VA is NULL, we're in legacy mem mode,
+			 * which means we have only one memseg.
+			 */
+			ms = rte_mem_virt2memseg(mz->addr, msl);
+
+			fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+					"len: 0x%" PRIx64 " "
+					"pagesz: 0x%" PRIx64 "\n",
+				cur_addr, ms->iova, ms->len, ms->hugepage_sz);
+		} else {
+			ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) /
+					msl->hugepage_sz;
+			ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+			do {
+				fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+						"len: 0x%" PRIx64 " "
+						"pagesz: 0x%" PRIx64 "\n",
+					cur_addr, ms->iova, ms->len,
+					ms->hugepage_sz);
+
+				/* advance VA to next page */
+				cur_addr = RTE_PTR_ADD(cur_addr,
+						ms->hugepage_sz);
+
+				/* memzones occupy contiguous segments */
+				++ms;
+			} while (cur_addr < mz_end);
+		}
+		i++;
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
@@ -412,19 +469,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -432,14 +493,19 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
+
+	i = 0;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i++;
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 31fc8e7..b6bdb21 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 984e027..7a3d0f3 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -579,6 +579,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary processes don't need to initialize heap */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
 		int start;
 		struct rte_fbarray *arr;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 7851a7d..d336c96 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -857,6 +857,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* memzone_init maps rte_fbarrays, which has to be done before hugepage
+	 * init/attach, because attach creates extra fbarrays in secondary
+	 * process, resulting in inability to map memzone fbarray.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -867,8 +876,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 47f4de8..4b49d61 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -893,7 +893,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -917,7 +917,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -996,7 +996,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1017,7 +1017,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 23/41] mempool: add support for the new allocation methods
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (21 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 22/41] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 24/41] vfio: allow to map other memory regions Anatoly Burakov
                   ` (64 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal

If a user has specified that the zone should have contiguous memory,
use the new _contig allocation APIs instead of the regular ones.
Otherwise, account for the fact that, unless we are in IOVA-as-VA
mode, we cannot guarantee that the pages will be physically
contiguous, so we calculate the memzone size and alignment as if we
were getting the smallest page size available.
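
To make the decision above concrete, here is a condensed sketch of the
page-size selection (illustrative only: get_min_page_size() is the
static helper added by this patch, while pick_mempool_page_size() is a
hypothetical wrapper that does not exist in the code):

#include <unistd.h>
#include <rte_eal.h>
#include <rte_mempool.h>

/* hypothetical wrapper around the logic in rte_mempool_populate_default() */
static size_t
pick_mempool_page_size(unsigned int mp_flags)
{
	/* no per-object contiguity needed, whole-zone contiguity requested,
	 * or IOVA-as-VA mode: page size is irrelevant for sizing
	 */
	if ((mp_flags & (MEMPOOL_F_NO_PHYS_CONTIG |
			MEMPOOL_F_CAPA_PHYS_CONTIG)) ||
			rte_eal_iova_mode() == RTE_IOVA_VA)
		return 0;

	/* otherwise, size the reservation as if we were getting the
	 * smallest page size available
	 */
	return rte_eal_has_hugepages() ?
			get_min_page_size() : (size_t)getpagesize();
}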

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_mempool/rte_mempool.c | 87 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 9 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..5c4d3fd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -98,6 +98,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		const struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		if (msl->hugepage_sz < min_pagesz)
+			min_pagesz = msl->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -549,6 +570,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter as well.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 */
+
+	if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
 		pg_sz = 0;
+		pg_shift = 0;
 		align = RTE_CACHE_LINE_SIZE;
+	} else if (rte_eal_has_hugepages()) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
+		align = pg_sz;
 	} else {
 		pg_sz = getpagesize();
 		pg_shift = rte_bsf32(pg_sz);
@@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
-			mz = rte_memzone_reserve_aligned(mz_name, 0,
+		if (force_contig) {
+			/*
+			 * if contiguous memory for entire mempool memory was
+			 * requested, don't try reserving again if we fail.
+			 */
+			mz = rte_memzone_reserve_aligned_contig(mz_name, size,
+				mp->socket_id, mz_flags, align);
+		} else {
+			mz = rte_memzone_reserve_aligned(mz_name, size,
 				mp->socket_id, mz_flags, align);
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
+			if (mz == NULL)
+				mz = rte_memzone_reserve_aligned(mz_name, 0,
+					mp->socket_id, mz_flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (rte_eal_has_hugepages() && force_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 24/41] vfio: allow to map other memory regions
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (22 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 23/41] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
                   ` (63 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. Such a scenario might arise in vhost applications (like
SPDK), where the guest sends its own memory table. To fill this gap,
provide an API to allow registering an arbitrary address range in the
VFIO container.
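
As a hedged usage sketch (not part of the patch), an application that
obtained a buffer outside of DPDK could register it for DMA roughly as
follows; the helper name and the way the IOVA is obtained are
illustrative assumptions:

#include <stdint.h>
#include <stddef.h>
#include <rte_vfio.h>

/* hypothetical helper: register an externally allocated buffer with the
 * VFIO container so a device can DMA to/from it, then unregister it
 */
static int
dma_map_external_buf(void *buf, uint64_t iova, size_t len)
{
	uint64_t vaddr = (uint64_t)(uintptr_t)buf;

	if (rte_vfio_dma_map(vaddr, iova, len) < 0)
		return -1;	/* container not set up, or ioctl failed */

	/* ... hand the buffer to the device for DMA ... */

	/* remove the mapping once the buffer is no longer used for DMA */
	return rte_vfio_dma_unmap(vaddr, iova, len);
}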

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c          |  16 ++++
 lib/librte_eal/common/include/rte_vfio.h |  39 ++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 153 ++++++++++++++++++++++++++-----
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  11 +++
 4 files changed, 196 insertions(+), 23 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 3b06e21..5a7f436 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -755,6 +755,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -790,3 +792,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index e981a62..093c309 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -123,6 +123,45 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #endif /* VFIO_PRESENT */
 
 #endif /* _RTE_VFIO_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 5192763..8fe8984 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -22,17 +22,35 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = NULL
+		// TODO: work with PPC64 people on enabling this, window size!
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
 int
@@ -333,9 +351,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +372,8 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
 		}
 	}
 
@@ -665,13 +686,54 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	int i, ret;
+	int i;
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
 		struct rte_memseg_list *msl;
 		struct rte_fbarray *arr;
 		int ms_idx, next_idx;
@@ -697,23 +759,9 @@ vfio_type1_dma_map(int vfio_container_fd)
 			len = ms->hugepage_sz;
 			hw_addr = ms->iova;
 
-			memset(&dma_map, 0, sizeof(dma_map));
-			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-			dma_map.vaddr = addr;
-			dma_map.size = len;
-			dma_map.iova = hw_addr;
-			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-					VFIO_DMA_MAP_FLAG_WRITE;
-
-			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
-					&dma_map);
-
-			if (ret) {
-				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-						  "error %i (%s)\n", errno,
-						  strerror(errno));
+			if (vfio_type1_dma_mem_map(vfio_container_fd, addr,
+					hw_addr, len, 1))
 				return -1;
-			}
 		}
 	}
 
@@ -865,6 +913,49 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region maping not supported by IOMMU %s\n",
+			t->name);
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 1);
+}
+
+int
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 0);
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -897,4 +988,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..b68703e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -110,6 +111,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +121,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address, length and
+ * operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ */
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 25/41] eal: map/unmap memory with VFIO when alloc/free pages
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (23 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 24/41] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
                   ` (62 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index bbeeeba..c03e7bc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -34,6 +34,7 @@
 #include <rte_eal.h>
 #include <rte_memory.h>
 #include <rte_spinlock.h>
+#include <rte_vfio.h>
 
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
@@ -476,6 +477,10 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	ms->iova = iova;
 	ms->socket_id = socket_id;
 
+	/* map the segment so that VFIO has access to it */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
+			rte_vfio_dma_map(ms->addr_64, iova, size))
+		RTE_LOG(DEBUG, EAL, "Cannot register segment with VFIO\n");
 	return 0;
 
 mapped:
@@ -507,6 +512,12 @@ free_page(struct rte_memseg *ms, struct hugepage_info *hi,
 	char path[PATH_MAX];
 	int fd, ret;
 
+	/* unmap the segment from VFIO */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
+			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len)) {
+		RTE_LOG(DEBUG, EAL, "Cannot unregister segment with VFIO\n");
+	}
+
 	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
 				MAP_FAILED) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 26/41] eal: prepare memseg lists for multiprocess sync
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (24 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
                   ` (61 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

In preparation for implementing multiprocess support, we are adding
a version number and write locks to memseg lists.

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation will opt for the latter option: primary process
shared mappings will be authoritative, and each secondary process
will use its own internal view of mapped memory, and will attempt
to synchronize on these mappings using versioning.

Under this model, only primary process will decide which pages get
mapped, and secondary processes will only copy primary's page
maps and get notified of the changes via IPC mechanism (coming
in later commits).

To avoid race conditions, memseg lists will also have write locks -
that is, it will be possible for several secondary processes to
initialize concurrently, but it will not be possible for several
processes to request memory allocation unless all other allocations
were complete (on a single socket - it is OK to allocate/free memory
on different sockets concurrently).

In principle, it is possible for multiple processes to request
allocation/deallocation on multiple sockets, but we will only allow
one such request to be active at any one time.
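
A condensed sketch of how a secondary process would use the new version
field and per-list lock (the full logic is eal_memalloc_sync_with_primary()
in the diff below; sync_sketch() and the "replay" step are illustrative):

#include <rte_eal.h>
#include <rte_eal_memconfig.h>
#include <rte_rwlock.h>

/* per-process copy of the memseg lists, as added by this patch */
static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];

static int
sync_sketch(void)
{
	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
	int i;

	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
		struct rte_memseg_list *p = &mcfg->memsegs[i]; /* primary's map */
		struct rte_memseg_list *l = &local_memsegs[i]; /* our local map */

		if (p->base_va == NULL)
			continue;

		rte_rwlock_read_lock(&p->mplock);	/* block new allocations */
		rte_rwlock_write_lock(&l->mplock);

		if (l->version != p->version) {
			/* replay the primary's page map into the local list
			 * here, then adopt the primary's version number
			 */
			l->version = p->version;
		}

		rte_rwlock_write_unlock(&l->mplock);
		rte_rwlock_read_unlock(&p->mplock);
	}
	return 0;
}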

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 288 +++++++++++++++++++++-
 4 files changed, 295 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index be8340b..255aedc 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,10 @@ eal_memalloc_alloc_page(uint64_t __rte_unused size, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 08ba70e..beac296 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -24,4 +24,8 @@ bool
 eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b6bdb21..d653d57 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,8 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t hugepage_sz; /**< page size for all memsegs in this list. */
+	rte_rwlock_t mplock; /**< read-write lock for multiprocess sync. */
+	uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index c03e7bc..227d703 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -65,6 +65,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -619,11 +622,14 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 			continue;
 		msl = cur_msl;
 
+		/* lock memseg list */
+		rte_rwlock_write_lock(&msl->mplock);
+
 		/* try finding space in memseg list */
 		cur_idx = rte_fbarray_find_next_n_free(&msl->memseg_arr, 0, n);
 
 		if (cur_idx < 0)
-			continue;
+			goto next_list;
 
 		end_idx = cur_idx + n;
 		start_idx = cur_idx;
@@ -637,7 +643,6 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 
 			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
 					cur_idx)) {
-
 				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
 					n, i);
 
@@ -648,7 +653,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 				 */
 				if (!exact) {
 					ret = i;
-					goto restore_numa;
+					goto success;
 				}
 				RTE_LOG(DEBUG, EAL, "exact amount of pages was requested, so returning %i allocated pages\n",
 					i);
@@ -680,10 +685,13 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 		}
 		/* we allocated all pages */
 		ret = n;
+success:
+		msl->version++;
+		rte_rwlock_write_unlock(&msl->mplock);
 
 		break;
 next_list:
-		/* dummy semi-colon to make label work */;
+		rte_rwlock_write_unlock(&msl->mplock);
 	}
 	/* we didn't break */
 	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
@@ -716,7 +724,7 @@ eal_memalloc_free_page(struct rte_memseg *ms)
 	struct rte_memseg_list *msl = NULL;
 	unsigned int msl_idx, seg_idx;
 	struct hugepage_info *hi = NULL;
-	int i;
+	int ret, i;
 
 	/* dynamic free not supported in legacy mode */
 	if (internal_config.legacy_mem)
@@ -753,6 +761,274 @@ eal_memalloc_free_page(struct rte_memseg *ms)
 		RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
 		return -1;
 	}
+	rte_rwlock_write_lock(&msl->mplock);
+
 	rte_fbarray_set_free(&msl->memseg_arr, seg_idx);
-	return free_page(ms, hi, msl_idx, seg_idx);
+
+	/* increment version number */
+	msl->version++;
+
+	ret = free_page(ms, hi, msl_idx, seg_idx);
+
+	rte_rwlock_write_unlock(&msl->mplock);
+
+	return ret;
+}
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_page(l_ms, p_ms->addr,
+					p_ms->hugepage_sz,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_page(l_ms, hi, msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_free(l_arr, seg_idx);
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case scenario - no differences (or bigger, which will be
+		 * fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int msl_idx;
+	int i;
+
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool new_msl = false;
+		bool fail = false;
+
+		primary_msl = &mcfg->memsegs[msl_idx];
+		local_msl = &local_memsegs[msl_idx];
+
+		if (primary_msl->base_va == 0)
+			continue;
+
+		/* this is a valid memseg list, so read-lock it */
+		rte_rwlock_read_lock(&primary_msl->mplock);
+
+		/* write-lock local memseg list */
+		rte_rwlock_write_lock(&local_msl->mplock);
+
+		/* check if secondary has this memseg list set up */
+		if (local_msl->base_va == 0) {
+			char name[PATH_MAX];
+			int ret;
+			new_msl = true;
+
+			/* create distinct fbarrays for each secondary */
+			snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+				primary_msl->memseg_arr.name, getpid());
+
+			ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+				primary_msl->memseg_arr.len,
+				primary_msl->memseg_arr.elt_sz);
+			if (ret < 0) {
+				RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+				fail = true;
+				goto endloop;
+			}
+
+			local_msl->base_va = primary_msl->base_va;
+		}
+
+		for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info);
+					i++) {
+			uint64_t cur_sz =
+				internal_config.hugepage_info[i].hugepage_sz;
+			uint64_t msl_sz = primary_msl->hugepage_sz;
+			if (msl_sz == cur_sz) {
+				hi = &internal_config.hugepage_info[i];
+				break;
+			}
+		}
+		if (!hi) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			fail = true;
+			goto endloop;
+		}
+
+		/* if versions don't match or if we have just allocated a new
+		 * memseg list, synchronize everything
+		 */
+		if ((new_msl || local_msl->version != primary_msl->version) &&
+				sync_existing(primary_msl, local_msl, hi,
+				msl_idx)) {
+			fail = true;
+			goto endloop;
+		}
+endloop:
+		rte_rwlock_write_unlock(&local_msl->mplock);
+		rte_rwlock_read_unlock(&primary_msl->mplock);
+		if (fail)
+			return -1;
+	}
+	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 27/41] eal: add multiprocess init with memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (25 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 28/41] eal: add support for multiprocess " Anatoly Burakov
                   ` (60 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For legacy memory mode, attach to the primary's memseg list and map
hugepages as before.

For non-legacy mode, preallocate all VA space and then do a sync of
the local memory map.
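
A condensed sketch of the non-legacy secondary startup described above,
combining pieces from the diff below (memseg_secondary_init() plus the
final sync); secondary_attach_sketch() itself is illustrative and
assumes the internal helpers from this patch set are visible:

static int
secondary_attach_sketch(void)
{
	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
	int i;

	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
		struct rte_memseg_list *msl = &mcfg->memsegs[i];

		if (msl->memseg_arr.len == 0)
			continue;	/* unused list */

		/* attach to the fbarray the primary created */
		if (rte_fbarray_attach(&msl->memseg_arr))
			return -1;

		/* reserve the same VA window the primary uses */
		if (alloc_va_space(msl))
			return -1;
	}

	/* finally, replay the primary's current page map */
	return eal_memalloc_sync_with_primary();
}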

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  7 ++
 lib/librte_eal/common/eal_common_memory.c       | 99 +++++++++++++++++++++----
 lib/librte_eal/common/eal_hugepages.h           |  5 ++
 lib/librte_eal/linuxapp/eal/eal.c               | 18 +++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 53 ++++++++-----
 lib/librte_eal/linuxapp/eal/eal_memory.c        | 24 ++++--
 6 files changed, 159 insertions(+), 47 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..18e6e5e 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -103,3 +103,10 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* memory hotplug is not supported in FreeBSD, so no need to implement this */
+int
+eal_hugepage_info_read(void)
+{
+	return 0;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 457e239..a571e24 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
@@ -147,19 +148,11 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 	char name[RTE_FBARRAY_NAME_LEN];
 	int max_pages;
 	uint64_t mem_amount;
-	void *addr;
 
 	if (!internal_config.legacy_mem) {
 		mem_amount = get_mem_amount(page_sz);
 		max_pages = mem_amount / page_sz;
-
-		addr = eal_get_virtual_area(NULL, &mem_amount, page_sz, 0, 0);
-		if (addr == NULL) {
-			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
-			return -1;
-		}
 	} else {
-		addr = NULL;
 		/* numer of memsegs in each list, these will not be single-page
 		 * segments, so RTE_MAX_LEGACY_MEMSEG is like old default.
 		 */
@@ -177,7 +170,7 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 
 	msl->hugepage_sz = page_sz;
 	msl->socket_id = socket_id;
-	msl->base_va = addr;
+	msl->base_va = NULL;
 
 	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
 			page_sz >> 10, socket_id);
@@ -186,16 +179,46 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 }
 
 static int
-memseg_init(void)
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t mem_sz, page_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->hugepage_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+
+static int
+memseg_primary_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket_id, hpi_idx, msl_idx = 0;
 	struct rte_memseg_list *msl;
 
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-		RTE_LOG(ERR, EAL, "Secondary process not supported\n");
-		return -1;
-	}
+	/* if we start allocating memory segments for pages straight away, VA
+	 * space will become fragmented, reducing chances of success when
+	 * secondary process maps the same addresses. to fix this, allocate
+	 * fbarrays first, and then allocate VA space for them.
+	 */
 
 	/* create memseg lists */
 	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
@@ -235,12 +258,55 @@ memseg_init(void)
 				total_segs += msl->memseg_arr.len;
 				total_mem = total_segs * msl->hugepage_sz;
 				type_msl_idx++;
+
+				/* no need to preallocate VA in legacy mode */
+				if (internal_config.legacy_mem)
+					continue;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
 			}
 		}
 	}
 	return 0;
 }
 
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* no need to preallocate VA space in legacy mode */
+		if (internal_config.legacy_mem)
+			continue;
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 static struct rte_memseg *
 virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
@@ -480,7 +546,10 @@ rte_eal_memory_init(void)
 	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	retval = memseg_init();
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+			memseg_primary_init() :
+			memseg_secondary_init();
+
 	if (retval < 0)
 		return -1;
 
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index f963ae5..38d0b04 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -34,4 +34,9 @@ struct hugepage_file {
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read information about hugepages on Linux, but don't clear them out.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d336c96..7a0d742 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -805,13 +805,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 7e2475f..7a4adce 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -6,6 +6,7 @@
 #include <sys/types.h>
 #include <sys/file.h>
 #include <dirent.h>
+#include <stdbool.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdio.h>
@@ -299,15 +300,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(bool clear_hugepages)
+{	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -350,18 +345,20 @@ eal_hugepage_info_init(void)
 			continue;
 		}
 
-		/* try to obtain a writelock */
-		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
+		if (clear_hugepages) {
+			/* try to obtain a writelock */
+			hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
 
-		/* if blocking lock failed */
-		if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
-			RTE_LOG(CRIT, EAL,
-				"Failed to lock hugepage directory!\n");
-			break;
+			/* if blocking lock failed */
+			if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
+				RTE_LOG(CRIT, EAL,
+					"Failed to lock hugepage directory!\n");
+				break;
+			}
+			/* clear out the hugepages dir from unused pages */
+			if (clear_hugedir(hpi->hugedir) == -1)
+				break;
 		}
-		/* clear out the hugepages dir from unused pages */
-		if (clear_hugedir(hpi->hugedir) == -1)
-			break;
 
 		/*
 		 * first, try to put all hugepages into relevant sockets, but
@@ -417,10 +414,26 @@ eal_hugepage_info_init(void)
 			num_pages += hpi->num_pages[j];
 		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-				num_pages > 0)
+				(num_pages > 0 || !clear_hugepages))
 			return 0;
 	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+int eal_hugepage_info_read(void)
+{
+	return hugepage_info_init(false);
+}
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	return hugepage_info_init(true);
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index e0b4988..f74291f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1569,6 +1569,22 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0) {
+			RTE_LOG(ERR, EAL, "It is recommended to "
+				"disable ASLR in the kernel "
+				"and retry running both primary "
+				"and secondary processes\n");
+		}
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1580,11 +1596,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	else
-		RTE_LOG(ERR, EAL, "Secondary processes aren't supported yet\n");
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 28/41] eal: add support for multiprocess memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (26 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 29/41] eal: add support for callbacks on " Anatoly Burakov
                   ` (59 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

Basic workflow is the following. Primary process always does initial
mapping and unmapping, and secondary processes always follow primary
page map. Only one allocation request can be active at any one time.

When primary allocates memory, it ensures that all other processes
have allocated the same set of hugepages successfully, otherwise
any allocations made are rolled back, and the memory is freed back.
Heap is locked throughout the process, so no race conditions can
happen.

When primary frees memory, it frees the heap, deallocates affected
pages, and notifies other processes of deallocations. Since heap is
freed from that memory chunk, the area basically becomes invisible
to other processes even if they happen to fail to unmap that
specific set of pages, so it's completely safe to ignore results of
sync requests.

When secondary allocates memory, it does not do so by itself.
Instead, it sends a request to primary process to try and allocate
pages of specified size and on specified socket, such that a
specified heap allocation request could complete. Primary process
then sends all secondaries (including the requestor) a separate
notification of allocated pages, and expects all secondary
processes to report success before considering pages as "allocated".

Only after primary process ensures that all memory has been
successfully allocated in all secondary process, it will respond
positively to the initial request, and let secondary proceed with
the allocation. Since the heap now has memory that can satisfy
allocation request, and it was locked all this time (so no other
allocations could take place), secondary process will be able to
allocate memory from the heap.

When secondary frees memory, it hides pages to be deallocated from
the heap. Then, it sends a deallocation request to primary process,
so that it deallocates pages itself, and then sends a separate sync
request to all other processes (including the requestor) to unmap
the same pages. This way, even if secondary fails to notify other
processes of this deallocation, that memory will become invisible
to other processes, and will not be allocated from again.

So, to summarize: address space will only become part of the heap
if primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the
worst thing that could happen is that a page will "leak" and will
not be available to either DPDK or the system, as some process
will still hold onto it. It's not an actual leak, as we can account
for the page - it's just that none of the processes will be able
to use this page for anything useful, until it gets allocated from
by the primary.

Due to the underlying DPDK IPC implementation being single-threaded,
some asynchronous magic had to be done, as we need to complete
several requests before we can definitively allow secondary process
to use allocated memory (namely, it has to be present in all other
secondary processes before it can be used). Additionally, only
one allocation request is allowed to be submitted at once.

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that,
a shared rwlock is used: it is taken for reading on init (so that
several secondaries can initialize concurrently), and for writing
when making allocation requests (so that either secondary init has
to wait, or the allocation request has to wait until all processes
have initialized).

To reduce the possibility of not releasing the lock when init fails,
replace all rte_panic calls with an init alert followed by return -1.
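
A hedged sketch of the secondary-side allocation request described
above (the real code is try_expand_heap_secondary() in the diff below;
only the fields relevant to the flow are shown, and
request_alloc_sketch() is illustrative). The caller is assumed to hold
memory_hotplug_lock for writing around this call:

#include <stdint.h>
#include <string.h>
#include "malloc_mp.h"	/* new header added by this patch */

static int
request_alloc_sketch(struct malloc_heap *heap, uint64_t pg_sz,
		size_t elt_size, int socket)
{
	struct malloc_mp_req req;

	memset(&req, 0, sizeof(req));
	req.t = REQ_TYPE_ALLOC;		/* ask primary to map the pages */
	req.alloc_req.page_sz = pg_sz;
	req.alloc_req.elt_size = elt_size;
	req.alloc_req.socket = socket;
	req.alloc_req.heap = heap;	/* the heap lives in shared memory */

	/* blocks until the primary has confirmed that every process has
	 * mapped the new pages, or the whole attempt was rolled back
	 */
	if (request_to_primary(&req) != 0)
		return -1;

	return req.result == REQ_RESULT_SUCCESS ? 0 : -1;
}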

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    This problem is evidently complex to solve without multithreaded
    IPC implementation. An alternative approach would be to process
    each individual message in its own thread (or at least spawn a
    thread per incoming request) - that way, we can send requests
    while responding to another request, and this problem becomes
    trivial to solve (and in fact it was solved that way initially,
    before my aversion to certain other programming languages kicked
    in).
    
    Is the added complexity worth saving a couple of thread spin-ups
    here and there?

 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_heap.c               | 250 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 723 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  50 +-
 8 files changed, 1054 insertions(+), 61 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index d653d57..c4b36f6 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -60,6 +60,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< indicates whether memory hotplug request is in progress. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7a3d0f3..9109555 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -146,33 +146,42 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_pages,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	int i;
+
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	for (i = 0; i < n_pages; i++)
+		eal_memalloc_free_page(ms[i]);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_pages)
 {
 	size_t map_len, data_start_offset;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int i, n_pages, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	int allocd_pages;
 	void *ret, *map_addr, *data_start;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_pages = map_len / pg_sz;
+	map_len = n_pages * pg_sz;
 
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_pages);
-
-	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz, socket,
-			true);
+	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz,
+			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages != n_pages)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
@@ -184,7 +193,7 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 			n_pages * msl->hugepage_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
@@ -195,7 +204,53 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t map_len;
+	int n_pages;
+
+	map_len = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_pages = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	if (ms == NULL)
+		return -1;
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_pages);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += map_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 		socket, map_len >> 20ULL);
@@ -205,13 +260,9 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
 
-free_pages:
-	for (i = 0; i < n_pages; i++)
-		eal_memalloc_free_page(ms[i]);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -219,6 +270,57 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -236,11 +338,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -355,7 +456,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 	rte_spinlock_lock(&(heap->lock));
 
-	align = align == 0 ? 1 : align;
+	align = RTE_MAX(align == 0 ? 1 : align, MALLOC_ELEM_HEADER_LEN);
 
 	/* for legacy mode, try once and with all flags */
 	if (internal_config.legacy_mem) {
@@ -372,7 +473,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -424,14 +526,40 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_pages, page_idx, max_page_idx;
+	struct rte_memseg_list *msl;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	n_pages = aligned_len / msl->hugepage_sz;
+	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+			msl->hugepage_sz;
+	max_page_idx = page_idx + n_pages;
+
+	for (; page_idx < max_page_idx; page_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
+		eal_memalloc_free_page(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len;
 	struct rte_memseg_list *msl;
-	int n_pages, page_idx, max_page_idx, ret;
+	int n_pages, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -463,30 +591,60 @@ malloc_heap_free(struct malloc_elem *elem)
 	aligned_end = RTE_PTR_ALIGN_FLOOR(end, msl->hugepage_sz);
 
 	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_pages = aligned_len / msl->hugepage_sz;
 
 	/* can't free anything */
-	if (aligned_len < msl->hugepage_sz)
+	if (n_pages == 0)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so: the heap is shared across all
+	 * processes, so even if notifications about unmapped pages don't
+	 * reach some of them, the memory is gone from the heap anyway, and
+	 * nothing can allocate it back until the primary process manages to
+	 * deliver an allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_pages = aligned_len / msl->hugepage_sz;
-	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / msl->hugepage_sz;
-	max_page_idx = page_idx + n_pages;
+	heap->total_size -= n_pages * msl->hugepage_sz;
 
-	for (; page_idx < max_page_idx; page_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
-		eal_memalloc_free_page(ms);
-		heap->total_size -= msl->hugepage_sz;
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we ask the primary to deallocate the pages, but we do not
+		 * unmap them in this thread. instead, the primary will send a
+		 * sync request back, and this process will unmap the pages
+		 * when that request is handled on the IPC thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -579,6 +737,11 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
+		return -1;
+	}
+
 	/* secondary processes don't need to initialize heap */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
@@ -604,6 +767,7 @@ rte_eal_malloc_heap_init(void)
 						rte_fbarray_get(arr, ms_idx);
 				malloc_heap_add_memory(heap, msl,
 						ms->addr, ms->len);
+				heap->total_size += ms->len;
 				ms_idx++;
 				RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 					msl->socket_id, ms->len >> 20ULL);
@@ -630,6 +794,8 @@ rte_eal_malloc_heap_init(void)
 			 */
 			malloc_heap_add_memory(heap, msl, start_seg->addr, len);
 
+			heap->total_size += len;
+
 			RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 				msl->socket_id, len >> 20ULL);
 
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..8052680
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,723 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to ask secondaries to resync their memory
+ * map after a failed allocation. this is essentially a regular sync request,
+ * but since we cannot send a sync request while another one may still be in
+ * progress, it is registered as a separate action.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to time out earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
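
To make the flow above concrete, here is a minimal illustrative sketch (an editor's example, not part of this patch; the function name is hypothetical) of the secondary side of a deallocation request, using only the structures and functions declared in malloc_mp.h. The real caller of this path is malloc_heap_free() in malloc_heap.c, which issues the same request but ignores the result:

static int
example_secondary_free(void *aligned_start, size_t aligned_len)
{
	struct malloc_mp_req req;

	memset(&req, 0, sizeof(req));
	req.t = REQ_TYPE_FREE;
	req.free_req.addr = aligned_start;
	req.free_req.len = aligned_len;

	/* blocks until the primary replies or the request times out */
	if (request_to_primary(&req) != 0)
		return -1;

	return req.result == REQ_RESULT_SUCCESS ? 0 : -1;
}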
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* pick a random ID that is not currently in use by any outstanding request */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* handler with which secondary processes respond to sync requests */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply = {0};
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t map_len;
+	int n_pages;
+
+	map_len = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_pages = map_len / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_pages);
+
+	if (elem == NULL)
+		goto fail;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_pages;
+	req->alloc_state.map_addr = ms[0]->addr;
+	req->alloc_state.map_len = map_len;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		/* entry is still on the request list - don't free it */
+		entry = NULL;
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg = {0};
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg = {0};
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg = {0};
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg = {0};
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, "Sync response for a request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg = {0};
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer  __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request memory map sync, this is only called whenever primary
+ * process initiates the allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg = {0};
+	struct rte_mp_reply reply = {0};
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be stray timeout still waiting */
+	do {
+		ret = rte_mp_request(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a bunch of asynchronous requests to
+ * primary process. this will initiate a request and wait until responses come.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg = {0};
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts = {0};
+	struct timeval now;
+	int ret;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+		if (rte_mp_async_reply_register(MP_ACTION_SYNC,
+				handle_sync_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_async_reply_register(MP_ACTION_ROLLBACK,
+				handle_rollback_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..9c79d31
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_pages);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_pages,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif /* MALLOC_MP_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 7a0d742..4bf8828 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -314,6 +314,8 @@ rte_config_init(void)
 	case RTE_PROC_INVALID:
 		rte_panic("Invalid process type\n");
 	}
+	/* disallow memory hotplug while init is active */
+	rte_rwlock_read_lock(&rte_config.mem_config->memory_hotplug_lock);
 }
 
 /* Unlocks hugepage directories that were locked by eal_hugepage_info_init */
@@ -676,6 +678,7 @@ rte_eal_mcfg_complete(void)
 		rte_config.mem_config->magic = RTE_MAGIC;
 
 	internal_config.init_complete = 1;
+	rte_rwlock_read_unlock(&rte_config.mem_config->memory_hotplug_lock);
 }
 
 /*
@@ -842,14 +845,14 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot init logging.");
 		rte_errno = ENOMEM;
 		rte_atomic32_clear(&run_once);
-		return -1;
+		goto fail;
 	}
 
 	if (rte_mp_channel_init() < 0) {
 		rte_eal_init_alert("failed to init mp channel\n");
 		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 			rte_errno = EFAULT;
-			return -1;
+			goto fail;
 		}
 	}
 
@@ -858,7 +861,7 @@ rte_eal_init(int argc, char **argv)
 		rte_eal_init_alert("Cannot init VFIO\n");
 		rte_errno = EAGAIN;
 		rte_atomic32_clear(&run_once);
-		return -1;
+		goto fail;
 	}
 #endif
 	/* memzone_init maps rte_fbarrays, which has to be done before hugepage
@@ -868,13 +871,13 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_memzone_init() < 0) {
 		rte_eal_init_alert("Cannot init memzone\n");
 		rte_errno = ENODEV;
-		return -1;
+		goto fail;
 	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
-		return -1;
+		goto fail;
 	}
 
 	/* the directories are locked during eal_hugepage_info_init */
@@ -883,25 +886,25 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_malloc_heap_init() < 0) {
 		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
-		return -1;
+		goto fail;
 	}
 
 	if (rte_eal_tailqs_init() < 0) {
 		rte_eal_init_alert("Cannot init tail queues for objects\n");
 		rte_errno = EFAULT;
-		return -1;
+		goto fail;
 	}
 
 	if (rte_eal_alarm_init() < 0) {
 		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
 		/* rte_eal_alarm_init sets rte_errno on failure. */
-		return -1;
+		goto fail;
 	}
 
 	if (rte_eal_timer_init() < 0) {
 		rte_eal_init_alert("Cannot init HPET or TSC timers\n");
 		rte_errno = ENOTSUP;
-		return -1;
+		goto fail;
 	}
 
 	eal_check_mem_on_local_socket();
@@ -916,7 +919,7 @@ rte_eal_init(int argc, char **argv)
 
 	if (rte_eal_intr_init() < 0) {
 		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
-		return -1;
+		goto fail;
 	}
 
 	RTE_LCORE_FOREACH_SLAVE(i) {
@@ -925,18 +928,24 @@ rte_eal_init(int argc, char **argv)
 		 * create communication pipes between master thread
 		 * and children
 		 */
-		if (pipe(lcore_config[i].pipe_master2slave) < 0)
-			rte_panic("Cannot create pipe\n");
-		if (pipe(lcore_config[i].pipe_slave2master) < 0)
-			rte_panic("Cannot create pipe\n");
+		if (pipe(lcore_config[i].pipe_master2slave) < 0) {
+			rte_eal_init_alert("Cannot create pipe\n");
+			goto fail;
+		}
+		if (pipe(lcore_config[i].pipe_slave2master) < 0) {
+			rte_eal_init_alert("Cannot create pipe\n");
+			goto fail;
+		}
 
 		lcore_config[i].state = WAIT;
 
 		/* create a thread for each lcore */
 		ret = pthread_create(&lcore_config[i].thread_id, NULL,
 				     eal_thread_loop, NULL);
-		if (ret != 0)
-			rte_panic("Cannot create thread\n");
+		if (ret != 0) {
+			rte_eal_init_alert("Cannot create thread\n");
+			goto fail;
+		}
 
 		/* Set thread_name for aid in debugging. */
 		snprintf(thread_name, RTE_MAX_THREAD_NAME_LEN,
@@ -960,14 +969,14 @@ rte_eal_init(int argc, char **argv)
 	if (ret) {
 		rte_eal_init_alert("rte_service_init() failed\n");
 		rte_errno = ENOEXEC;
-		return -1;
+		goto fail;
 	}
 
 	/* Probe all the buses and devices/drivers on them */
 	if (rte_bus_probe()) {
 		rte_eal_init_alert("Cannot probe devices\n");
 		rte_errno = ENOTSUP;
-		return -1;
+		goto fail;
 	}
 
 	/* initialize default service/lcore mappings and start running. Ignore
@@ -976,12 +985,15 @@ rte_eal_init(int argc, char **argv)
 	ret = rte_service_start_with_defaults();
 	if (ret < 0 && ret != -ENOTSUP) {
 		rte_errno = ENOEXEC;
-		return -1;
+		goto fail;
 	}
 
 	rte_eal_mcfg_complete();
 
 	return fctret;
+fail:
+	rte_rwlock_read_unlock(&rte_config.mem_config->memory_hotplug_lock);
+	return -1;
 }
 
 int __rte_experimental
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 29/41] eal: add support for callbacks on memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (27 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 28/41] eal: add support for multiprocess " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
                   ` (58 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Each process will have its own callbacks. Callbacks will indicate
whether it is an allocation or a deallocation that has happened, and
will also provide the start VA address and length of the affected
memory area.

Since memory hotplug is not supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either of
those cases.
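
As a rough usage sketch (the callback body and helper names below are
hypothetical; only the rte_mem_event_* API and types come from this
patch), an application could hook memory events along these lines:

#include <stdio.h>

#include <rte_errno.h>
#include <rte_memory.h>

static void
my_mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len)
{
	/* invoked for every hotplug allocation or deallocation */
	printf("mem event %s: addr %p, len %zu\n",
		event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
		addr, len);
}

/* after rte_eal_init(), in each process that wants notifications */
static int
setup_mem_event_cb(void)
{
	if (rte_mem_event_register_callback("my_mem_event_cb",
			my_mem_event_cb) < 0) {
		/* e.g. rte_errno == ENOTSUP in legacy mem mode */
		printf("cannot register mem event callback: %d\n", rte_errno);
		return -1;
	}
	return 0;
}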

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memalloc.c | 132 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 +++
 lib/librte_eal/common/include/rte_memory.h  |  48 ++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 220 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 62e8c16..4fb55f2 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Double linked list of actions. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list callback_list =
+	TAILQ_HEAD_INITIALIZER(callback_list);
+
+static rte_rwlock_t rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -47,3 +77,105 @@ eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 
 	return true;
 }
+
+int
+eal_memalloc_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&rwlock);
+
+	entry = find_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&rwlock);
+
+	entry = find_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_notify(enum rte_mem_event event, const void *start, size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&rwlock);
+
+	TAILQ_FOREACH(entry, &callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback '%s'\n",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index a571e24..dcba099 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -466,6 +466,34 @@ rte_eal_get_physmem_size(void)
 	return total_len;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int
+rte_mem_event_register_callback(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_callback_register(name, clb);
+}
+
+int
+rte_mem_event_unregister_callback(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index beac296..499cf58 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,14 @@ eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_callback_unregister(const char *name);
+
+void
+eal_memalloc_notify(enum rte_mem_event event, const void *start, size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 674d4cb..1c8ffa6 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -200,6 +200,54 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback registration
+ *   -1 on unsuccessful callback registration, with rte_errno value
+ *   indicating the reason for failure.
+ */
+int rte_mem_event_register_callback(const char *name,
+		rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback unregistration
+ *   -1 on unsuccessful callback unregistration, with rte_errno value
+ *   indicating the reason for failure.
+ */
+int rte_mem_event_unregister_callback(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 4c2e959..b2a2d37 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -214,6 +214,8 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_mem_event_callback_register;
+	rte_mem_event_callback_unregister;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
 	rte_malloc_dump_heaps;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 30/41] eal: enable callbacks on malloc/free and mp sync
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (28 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 29/41] eal: add support for callbacks on " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                   ` (57 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Also, rewrite VFIO to rely on memory event callbacks instead of
manually registering memory with VFIO. The callback will only be
registered if VFIO is enabled.
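
The VFIO hunk below is one instance of a more general pattern: a
subsystem that keeps per-page DMA mappings registers a memory event
callback and walks the affected range one page at a time. The
following is a hedged sketch of such a callback; my_dma_map() and
my_dma_unmap() are hypothetical stand-ins for calls like
rte_vfio_dma_map()/rte_vfio_dma_unmap(), and registration would
normally be done only when the subsystem is actually enabled, as the
VFIO code does in rte_vfio_enable():

#include <stdint.h>

#include <rte_common.h>
#include <rte_memory.h>

/* hypothetical hooks: a real subsystem would program its IOMMU here */
static void
my_dma_map(uint64_t va, uint64_t iova, uint64_t len)
{
	(void)va; (void)iova; (void)len;
}

static void
my_dma_unmap(uint64_t va, uint64_t iova, uint64_t len)
{
	(void)va; (void)iova; (void)len;
}

static void
my_dma_mem_event_cb(enum rte_mem_event type, const void *addr, size_t len)
{
	struct rte_memseg_list *msl = rte_mem_virt2memseg_list(addr);
	size_t cur_len = 0;

	/* outside legacy mode, every memseg is a single page */
	while (cur_len < len) {
		const void *va = RTE_PTR_ADD(addr, cur_len);
		struct rte_memseg *ms = rte_mem_virt2memseg(va, msl);

		if (type == RTE_MEM_EVENT_ALLOC)
			my_dma_map((uint64_t)(uintptr_t)va, ms->iova,
					msl->hugepage_sz);
		else
			my_dma_unmap((uint64_t)(uintptr_t)va, ms->iova,
					msl->hugepage_sz);

		cur_len += msl->hugepage_sz;
	}
}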

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 37 +++++++++++++++++++++---------
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 35 ++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9109555..9d055c8 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -223,6 +223,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t map_len;
 	int n_pages;
+	bool callback_triggered = false;
 
 	map_len = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -242,14 +243,25 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_notify(RTE_MEM_EVENT_ALLOC, map_addr, map_len);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
+
 	heap->total_size += map_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
@@ -260,6 +272,9 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE, map_addr, map_len);
+
 	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
 
 	request_sync();
@@ -615,6 +630,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= n_pages * msl->hugepage_sz;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -637,6 +656,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 227d703..1008fae 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -34,7 +34,6 @@
 #include <rte_eal.h>
 #include <rte_memory.h>
 #include <rte_spinlock.h>
-#include <rte_vfio.h>
 
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
@@ -480,10 +479,6 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	ms->iova = iova;
 	ms->socket_id = socket_id;
 
-	/* map the segment so that VFIO has access to it */
-	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
-			rte_vfio_dma_map(ms->addr_64, iova, size))
-		RTE_LOG(DEBUG, EAL, "Cannot register segment with VFIO\n");
 	return 0;
 
 mapped:
@@ -515,12 +510,6 @@ free_page(struct rte_memseg *ms, struct hugepage_info *hi,
 	char path[PATH_MAX];
 	int fd, ret;
 
-	/* unmap the segment from VFIO */
-	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
-			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len)) {
-		RTE_LOG(DEBUG, EAL, "Cannot unregister segment with VFIO\n");
-	}
-
 	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
 				MAP_FAILED) {
@@ -808,6 +797,19 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		len = ms->len * diff_len;
+
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE, start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -834,6 +836,19 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		len = ms->len * diff_len;
+
+		eal_memalloc_notify(RTE_MEM_EVENT_ALLOC, start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 8fe8984..d3c3b70 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -214,6 +214,37 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+	uint64_t pgsz;
+
+	msl = rte_mem_virt2memseg_list(addr);
+	pgsz = msl->hugepage_sz;
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+		uint64_t vfio_va, iova;
+
+		ms = rte_mem_virt2memseg(va, msl);
+		vfio_va = (uint64_t) (uintptr_t) va;
+		iova = ms->iova;
+
+		/* this never gets called in legacy mode, so we can be sure that
+		 * each segment is a single page.
+		 */
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(vfio_va, iova, pgsz);
+		else
+			rte_vfio_dma_unmap(vfio_va, iova, pgsz);
+
+		cur_len += pgsz;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -507,6 +538,10 @@ rte_vfio_enable(const char *modname)
 	if (vfio_cfg.vfio_container_fd != -1) {
 		RTE_LOG(NOTICE, EAL, "VFIO support initialized\n");
 		vfio_cfg.vfio_enabled = 1;
+
+		/* register callback for mem events */
+		rte_mem_event_register_callback("vfio_mem_event_clb",
+				vfio_mem_event_callback);
 	} else {
 		RTE_LOG(NOTICE, EAL, "VFIO support could not be initialized\n");
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (29 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 14:05   ` Andrew Rybchenko
  2018-03-03 13:46 ` [PATCH 32/41] crypto/qat: " Anatoly Burakov
                   ` (56 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0590f0c..7935230 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3401,7 +3401,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 32/41] crypto/qat: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (30 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-05 11:06   ` Trahe, Fiona
  2018-03-03 13:46 ` [PATCH 33/41] net/avf: " Anatoly Burakov
                   ` (55 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/crypto/qat/qat_qp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..3f8ed4d 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -95,8 +95,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 	default:
 		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
 	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned_contig(queue_name, queue_size,
+		socket_id, memzone_flags, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 33/41] net/avf: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (31 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 32/41] crypto/qat: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 34/41] net/bnx2x: " Anatoly Burakov
                   ` (54 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/avf/avf_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4df6617..f69d697 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,7 +1365,7 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 34/41] net/bnx2x: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (32 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 33/41] net/avf: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 35/41] net/cxgbe: " Anatoly Burakov
                   ` (53 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/bnx2x/bnx2x.c      | 2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..81f5dae 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,7 +177,7 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned_contig(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
 					0, align);
 	if (z == NULL) {
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..325b94d 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned_contig(z_name, ring_size, socket_id,
+			0, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 35/41] net/cxgbe: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (33 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 34/41] net/bnx2x: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 36/41] net/ena: " Anatoly Burakov
                   ` (52 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    It is not 100% clear if this memzone is used for DMA,
    corrections welcome.

 drivers/net/cxgbe/sge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 3d5aa59..e31474c 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1299,7 +1299,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned_contig(z_name, len, socket_id, 0,
+			4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 36/41] net/ena: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (34 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 35/41] net/cxgbe: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 37/41] net/enic: " Anatoly Burakov
                   ` (51 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/ena/base/ena_plat_dpdk.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..c1ebf00 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve_contig(z_name,			\
+				size, SOCKET_ID_ANY, 0);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +220,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 37/41] net/enic: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (35 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 36/41] net/ena: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-05 19:45   ` John Daley (johndale)
  2018-03-03 13:46 ` [PATCH 38/41] net/i40e: " Anatoly Burakov
                   ` (50 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    It is not 100% clear that second call to memzone_reserve
    is allocating DMA memory. Corrections welcome.

 drivers/net/enic/enic_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index ec9d343..cb2a7ba 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -319,7 +319,7 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
+	rz = rte_memzone_reserve_aligned_contig((const char *)name,
 					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
@@ -787,7 +787,7 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		 "vnic_cqmsg-%s-%d-%d", enic->bdf_name, queue_idx,
 		instance++);
 
-	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
+	wq->cqmsg_rz = rte_memzone_reserve_aligned_contig((const char *)name,
 						   sizeof(uint32_t),
 						   SOCKET_ID_ANY, 0,
 						   ENIC_ALIGN);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 38/41] net/i40e: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (36 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 37/41] net/enic: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 39/41] net/qede: " Anatoly Burakov
                   ` (49 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    It is not 100% clear that all users of this function
    need to allocate DMA memory. Corrections welcome.

 drivers/net/i40e/i40e_ethdev.c | 2 +-
 drivers/net/i40e/i40e_rxtx.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 508b417..0fffe2c 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4010,7 +4010,7 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..6b2b40e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,7 +2189,7 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
+	mz = rte_memzone_reserve_aligned_contig(name, len,
 					 socket_id, 0, I40E_RING_BASE_ALIGN);
 	return mz;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 39/41] net/qede: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (37 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 38/41] net/i40e: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 13:46 ` [PATCH 40/41] net/virtio: " Anatoly Burakov
                   ` (48 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Doing "grep -R rte_memzone_reserve drivers/net/qede" returns the following:
    
    drivers/net/qede/qede_fdir.c:     mz = rte_memzone_reserve_aligned(mz_name, QEDE_MAX_FDIR_PKT_LEN,
    drivers/net/qede/base/bcm_osal.c: mz = rte_memzone_reserve_aligned_contig(mz_name, size,
    drivers/net/qede/base/bcm_osal.c: mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
    
    I took a brief look at memzone in qede_fdir and it didn't look like memzone
    was used for DMA, so I left it alone. Corrections welcome.

 drivers/net/qede/base/bcm_osal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index fe42f32..707d553 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,7 +135,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = 0;
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size,
 					 socket_id, 0, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = 0;
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
+			align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 40/41] net/virtio: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (38 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 39/41] net/qede: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-03 16:52   ` Venkatesh Srinivas
  2018-03-03 13:46 ` [PATCH 41/41] net/vmxnet3: " Anatoly Burakov
                   ` (47 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Yuanhan Liu, Maxime Coquelin, Tiwei Bie, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Not sure if virtio needs to allocate DMA-capable memory,
    being a software driver and all. Corrections welcome.

 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 884f74a..35812e4 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -391,7 +391,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d",
 		     size, vq->vq_ring_size);
 
-	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
+	mz = rte_memzone_reserve_aligned_contig(vq_name, vq->vq_ring_size,
 					 SOCKET_ID_ANY,
 					 0, VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
@@ -417,9 +417,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	if (sz_hdr_mz) {
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
-		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+		hdr_mz = rte_memzone_reserve_aligned_contig(vq_hdr_name,
+				sz_hdr_mz, SOCKET_ID_ANY, 0,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH 41/41] net/vmxnet3: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (39 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 40/41] net/virtio: " Anatoly Burakov
@ 2018-03-03 13:46 ` Anatoly Burakov
  2018-03-06 11:04 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
                   ` (46 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-03 13:46 UTC (permalink / raw)
  To: dev
  Cc: Shrikrishna Khare, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Not sure if DMA-capable memzones are needed for vmxnet3.
    Corrections welcome.

 drivers/net/vmxnet3/vmxnet3_ethdev.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4e68aae..c787379 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -150,14 +150,15 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 	if (!reuse) {
 		if (mz)
 			rte_memzone_free(mz);
-		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+		return rte_memzone_reserve_aligned_contig(z_name, size,
+				socket_id, 0, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-03 13:46 ` [PATCH 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-03-03 14:05   ` Andrew Rybchenko
  2018-03-05  9:08     ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Andrew Rybchenko @ 2018-03-03 14:05 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

On 03/03/2018 04:46 PM, Anatoly Burakov wrote:
> This fixes the following drivers in one go:

Does it mean that these drivers are broken in the middle of patch set 
and fixed now?
If so, it would be good to avoid it. It breaks bisect.

> grep -Rl rte_eth_dma_zone_reserve drivers/
>
> drivers/net/avf/avf_rxtx.c
> drivers/net/thunderx/nicvf_ethdev.c
> drivers/net/e1000/igb_rxtx.c
> drivers/net/e1000/em_rxtx.c
> drivers/net/fm10k/fm10k_ethdev.c
> drivers/net/vmxnet3/vmxnet3_rxtx.c
> drivers/net/liquidio/lio_rxtx.c
> drivers/net/i40e/i40e_rxtx.c
> drivers/net/sfc/sfc.c
> drivers/net/ixgbe/ixgbe_rxtx.c
> drivers/net/nfp/nfp_net.c
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   lib/librte_ether/rte_ethdev.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 0590f0c..7935230 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -3401,7 +3401,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>   	if (mz)
>   		return mz;
>   
> -	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
> +	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
> +			align);
>   }
>   
>   int

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 40/41] net/virtio: use contiguous allocation for DMA memory
  2018-03-03 13:46 ` [PATCH 40/41] net/virtio: " Anatoly Burakov
@ 2018-03-03 16:52   ` Venkatesh Srinivas
  0 siblings, 0 replies; 471+ messages in thread
From: Venkatesh Srinivas @ 2018-03-03 16:52 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Yuanhan Liu, Maxime Coquelin, Tiwei Bie, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On Sat, Mar 3, 2018 at 7:46 AM, Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>     Not sure if virtio needs to allocate DMA-capable memory,
>     being a software driver and all. Corrections welcome.

Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>

1. The first change is correct - virtio rings need to be contiguous in
   guest physical address space.

2. The second change - virtio_tx_region contains both a virtio_net_hdr
   and an indirect table. virtio devices require the virtio_net_hdr to be
   contiguous (in pre-1.0 devices without F_ANY_LAYOUT), but do not
   require the indirect table to be contiguous with the virtio_net_hdr.
   You may still want this in order to avoid splitting up the structure.
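
For reference, a rough sketch of that layout - the types below are
simplified stand-ins, not the in-tree definitions from the virtio
driver headers:

	#include <stdint.h>

	/* stand-ins for virtio_net_hdr and vring_desc */
	struct net_hdr_sketch    { uint8_t flags; uint8_t gso_type; uint16_t hdr_len; };
	struct vring_desc_sketch { uint64_t addr; uint32_t len; uint16_t flags, next; };

	struct tx_region_sketch {
		/* the device must be able to read this header as one piece */
		struct net_hdr_sketch hdr;
		/* indirect descriptor table; the spec does not require it to be
		 * contiguous with the header, but keeping both in one allocation
		 * avoids splitting up the structure */
		struct vring_desc_sketch indir[8];
	};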

HTH,
-- vs;

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-03 14:05   ` Andrew Rybchenko
@ 2018-03-05  9:08     ` Burakov, Anatoly
  2018-03-05  9:15       ` Andrew Rybchenko
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-05  9:08 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

On 03-Mar-18 2:05 PM, Andrew Rybchenko wrote:
> On 03/03/2018 04:46 PM, Anatoly Burakov wrote:
>> This fixes the following drivers in one go:
> 
> Does it mean that these drivers are broken in the middle of patch set 
> and fixed now?
> If so, it would be good to avoid it. It breaks bisect.
> 

Depends on the definition of "broken". Legacy memory mode will still
work for all drivers throughout the patchset. As for the new memory mode,
yes, it will be "broken in the middle of the patchset", but due to the
fact that there's an enormous amount of code to review between fbarray
changes, malloc changes, contiguous allocation changes and adding new
rte_memzone API's, I favored ease of code review over bisect.

I can of course reorder and roll up several different patchsets and all
driver updates into one giant patch, but do you really want to be the
one reviewing such a patch?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-05  9:08     ` Burakov, Anatoly
@ 2018-03-05  9:15       ` Andrew Rybchenko
  2018-03-05 10:00         ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Andrew Rybchenko @ 2018-03-05  9:15 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

On 03/05/2018 12:08 PM, Burakov, Anatoly wrote:
> On 03-Mar-18 2:05 PM, Andrew Rybchenko wrote:
>> On 03/03/2018 04:46 PM, Anatoly Burakov wrote:
>>> This fixes the following drivers in one go:
>>
>> Does it mean that these drivers are broken in the middle of patch set 
>> and fixed now?
>> If so, it would be good to avoid it. It breaks bisect.
>>
>
> Depends on the definition of "broken". Legacy memory mode will still
> work for all drivers throughout the patchset. As for the new memory mode,
> yes, it will be "broken in the middle of the patchset", but due to the
> fact that there's an enormous amount of code to review between fbarray
> changes, malloc changes, contiguous allocation changes and adding new
> rte_memzone API's, I favored ease of code review over bisect.
>
> I can of course reorder and roll up several different patchsets and all
> driver updates into one giant patch, but do you really want to be the
> one reviewing such a patch?

Is it possible to:
1. Introduce _contig function
2. Switch users of the contiguous allocation to it as you do now
3. Make the old function allocate possibly non-contiguous memory
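
In caller terms, the end state would then look roughly like this
(illustrative sketch using the names introduced in this series, not a
final API definition):

	#include <rte_memzone.h>

	static const struct rte_memzone *
	reserve_ring(const char *name, size_t len, int socket_id,
		     unsigned int align, int need_iova_contig)
	{
		if (need_iova_contig)
			/* explicit request: guaranteed physically
			 * contiguous memory, e.g. for DMA rings */
			return rte_memzone_reserve_aligned_contig(name, len,
					socket_id, 0, align);
		/* default: may be backed by non-contiguous pages */
		return rte_memzone_reserve_aligned(name, len, socket_id,
				0, align);
	}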

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-05  9:15       ` Andrew Rybchenko
@ 2018-03-05 10:00         ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-05 10:00 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

On 05-Mar-18 9:15 AM, Andrew Rybchenko wrote:
> On 03/05/2018 12:08 PM, Burakov, Anatoly wrote:
>> On 03-Mar-18 2:05 PM, Andrew Rybchenko wrote:
>>> On 03/03/2018 04:46 PM, Anatoly Burakov wrote:
>>>> This fixes the following drivers in one go:
>>>
>>> Does it mean that these drivers are broken in the middle of patch set 
>>> and fixed now?
>>> If so, it would be good to avoid it. It breaks bisect.
>>>
>>
>> Depends on the definition of "broken". Legacy memory mode will still
>> work for all drivers throughout the patchset. As for the new memory mode,
>> yes, it will be "broken in the middle of the patchset", but due to the
>> fact that there's an enormous amount of code to review between fbarray
>> changes, malloc changes, contiguous allocation changes and adding new
>> rte_memzone API's, I favored ease of code review over bisect.
>>
>> I can of course reorder and roll up several different patchsets and all
>> driver updates into one giant patch, but do you really want to be the
>> one reviewing such a patch?
> 
> Is it possible to:
> 1. Introduce _contig function
> 2. Switch users of the contiguous allocation to it as you do now
> 3. Make the old function allocate possibly non-contiguous memory
> 

Good point. I'll see if I can shuffle patches around for v2. Thanks!

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 32/41] crypto/qat: use contiguous allocation for DMA memory
  2018-03-03 13:46 ` [PATCH 32/41] crypto/qat: " Anatoly Burakov
@ 2018-03-05 11:06   ` Trahe, Fiona
  0 siblings, 0 replies; 471+ messages in thread
From: Trahe, Fiona @ 2018-03-05 11:06 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Griffin, John, Jain, Deepak K, Wiles, Keith, Tan, Jianfeng,
	andras.kovacs, laszlo.vadkeri, Walker, Benjamin, Richardson,
	Bruce, thomas, Ananyev, Konstantin, Ramakrishnan, Kuralamudhan,
	Daly, Louise M, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz@6wind.com



> -----Original Message-----
> From: Burakov, Anatoly
> Sent: Saturday, March 3, 2018 1:46 PM
> To: dev@dpdk.org
> Cc: Griffin, John <john.griffin@intel.com>; Trahe, Fiona <fiona.trahe@intel.com>; Jain, Deepak K
> <deepak.k.jain@intel.com>; Wiles, Keith <keith.wiles@intel.com>; Tan, Jianfeng
> <jianfeng.tan@intel.com>; andras.kovacs@ericsson.com; laszlo.vadkeri@ericsson.com; Walker,
> Benjamin <benjamin.walker@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> thomas@monjalon.net; Ananyev, Konstantin <konstantin.ananyev@intel.com>; Ramakrishnan,
> Kuralamudhan <kuralamudhan.ramakrishnan@intel.com>; Daly, Louise M <louise.m.daly@intel.com>;
> nelio.laranjeiro@6wind.com; yskoh@mellanox.com; pepperjo@japf.ch;
> jerin.jacob@caviumnetworks.com; hemant.agrawal@nxp.com; olivier.matz@6wind.com
> Subject: [PATCH 32/41] crypto/qat: use contiguous allocation for DMA memory
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 37/41] net/enic: use contiguous allocation for DMA memory
  2018-03-03 13:46 ` [PATCH 37/41] net/enic: " Anatoly Burakov
@ 2018-03-05 19:45   ` John Daley (johndale)
  0 siblings, 0 replies; 471+ messages in thread
From: John Daley (johndale) @ 2018-03-05 19:45 UTC (permalink / raw)
  To: Anatoly Burakov, dev; +Cc: Hyong Youb Kim (hyonkim)

Hi Anatoly,
Looks good, see inline for details.
Acked-by: John Daley <johndale@cisco.com>

Thanks,
John

> -----Original Message-----
> From: Anatoly Burakov [mailto:anatoly.burakov@intel.com]
> Sent: Saturday, March 03, 2018 5:46 AM
> To: dev@dpdk.org
> Cc: John Daley (johndale) <johndale@cisco.com>; Hyong Youb Kim (hyonkim)
> <hyonkim@cisco.com>; keith.wiles@intel.com; jianfeng.tan@intel.com;
> andras.kovacs@ericsson.com; laszlo.vadkeri@ericsson.com;
> benjamin.walker@intel.com; bruce.richardson@intel.com;
> thomas@monjalon.net; konstantin.ananyev@intel.com;
> kuralamudhan.ramakrishnan@intel.com; louise.m.daly@intel.com;
> nelio.laranjeiro@6wind.com; yskoh@mellanox.com; pepperjo@japf.ch;
> jerin.jacob@caviumnetworks.com; hemant.agrawal@nxp.com;
> olivier.matz@6wind.com
> Subject: [PATCH 37/41] net/enic: use contiguous allocation for DMA memory
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>     It is not 100% clear that second call to memzone_reserve
>     is allocating DMA memory. Corrections welcome.
The 2nd call is allocating DMA memory so I believe your patch is correct.
> 
>  drivers/net/enic/enic_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c index
> ec9d343..cb2a7ba 100644
> --- a/drivers/net/enic/enic_main.c
> +++ b/drivers/net/enic/enic_main.c
> @@ -319,7 +319,7 @@ enic_alloc_consistent(void *priv, size_t size,
>  	struct enic *enic = (struct enic *)priv;
>  	struct enic_memzone_entry *mze;
> 
> -	rz = rte_memzone_reserve_aligned((const char *)name,
> +	rz = rte_memzone_reserve_aligned_contig((const char *)name,
>  					 size, SOCKET_ID_ANY, 0,
> ENIC_ALIGN);
>  	if (!rz) {
>  		pr_err("%s : Failed to allocate memory requested for %s\n",
> @@ -787,7 +787,7 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
>  		 "vnic_cqmsg-%s-%d-%d", enic->bdf_name, queue_idx,
>  		instance++);
> 
> -	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
> +	wq->cqmsg_rz = rte_memzone_reserve_aligned_contig((const char
> *)name,
>  						   sizeof(uint32_t),
>  						   SOCKET_ID_ANY, 0,
>  						   ENIC_ALIGN);
This is a send completion landing spot which is DMA'd to by the NIC, so it does have to be contiguous. However, the size is only 4 bytes, so it might not matter.
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (40 preceding siblings ...)
  2018-03-03 13:46 ` [PATCH 41/41] net/vmxnet3: " Anatoly Burakov
@ 2018-03-06 11:04 ` Burakov, Anatoly
  2018-03-07 15:27 ` Nélio Laranjeiro
                   ` (45 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-06 11:04 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On 03-Mar-18 1:45 PM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].

For those testing this patchset, there's a deadlock-at-startup issue when
DPDK is started with no memory. This will be fixed in v2 (as will the
dependent IPC patchsets), but for now the workaround is to start DPDK
with the -m/--socket-mem switches.
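
For example, something like (illustrative command line - application,
core list and per-socket amounts are placeholders):

	./testpmd -l 0-3 --socket-mem 1024,1024 -- -i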

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (41 preceding siblings ...)
  2018-03-06 11:04 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
@ 2018-03-07 15:27 ` Nélio Laranjeiro
  2018-03-07 16:05   ` Burakov, Anatoly
  2018-03-07 16:11   ` Burakov, Anatoly
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                   ` (44 subsequent siblings)
  87 siblings, 2 replies; 471+ messages in thread
From: Nélio Laranjeiro @ 2018-03-07 15:27 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

Hi Anatoly,

I am trying to run some tests with this series, but it seems to be based
on some other commits of yours. I have already identified the following
one [1], but it seems I am missing some others.

Is it possible to have a list of commits to apply on the current master
branch [2] before this series?

Thanks,

[1] https://dpdk.org/patch/35043
[2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-07 15:27 ` Nélio Laranjeiro
@ 2018-03-07 16:05   ` Burakov, Anatoly
  2018-03-08  9:37     ` Burakov, Anatoly
  2018-03-07 16:11   ` Burakov, Anatoly
  1 sibling, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-07 16:05 UTC (permalink / raw)
  To: Nélio Laranjeiro
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
> Hi Anatoly,
> 
> I am trying to run some tests with this series, but it seems to be based
> on some other commits of yours. I have already identified the following
> one [1], but it seems I am missing some others.
> 
> Is it possible to have a list of commits to apply on the current master
> branch [2] before this series?
> 
> Thanks,
> 
> [1] https://dpdk.org/patch/35043
> [2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c
> 

Hi Nelio,

Yes, my apologies. I'm aware of the apply issues. The issue is due to me 
missing a rebase on one of the dependent patchsets. I'm preparing a v2 
that will fix the issue (pending some internal processes).

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-07 15:27 ` Nélio Laranjeiro
  2018-03-07 16:05   ` Burakov, Anatoly
@ 2018-03-07 16:11   ` Burakov, Anatoly
  1 sibling, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-07 16:11 UTC (permalink / raw)
  To: Nélio Laranjeiro
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
> Hi Anatoly,
> 
> I am trying to run some tests with this series, but it seems to be based
> on some other commits of yours. I have already identified the following
> one [1], but it seems I am missing some others.
> 
> Is it possible to have a list of commits to apply on the current master
> branch [2] before this series?
> 
> Thanks,
> 
> [1] https://dpdk.org/patch/35043
> [2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c
> 
Also, the cover letter you're responding to lists the dependent patches
as well :) it's just that the current patchset does not apply cleanly
atop them due to rebase errors on my side.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (42 preceding siblings ...)
  2018-03-07 15:27 ` Nélio Laranjeiro
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-08 10:18   ` Pavan Nikhilesh
                     ` (70 more replies)
  2018-03-07 16:56 ` [PATCH v2 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                   ` (43 subsequent siblings)
  87 siblings, 71 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].

Dependencies (to be applied in specified order):
- IPC bugfixes patchset [2]
- IPC improvements patchset [3]
- IPC asynchronous request API patch [4]
- Function to return number of sockets [5]

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [6]
- EAL NUMA node count changes [7]

The vast majority of changes are in the EAL and malloc, the external API
disruption is minimal: a new set of API's are added for contiguous memory
allocation for rte_memzone, and a few API additions in rte_memory due to
switch to memseg_lists as opposed to memsegs. Every other API change is
internal to EAL, and all of the memory allocation/freeing is handled
through rte_malloc, with no externally visible API changes.

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Added alloc/free for pages happening as needed on rte_malloc/rte_free
 * Added contiguous memory allocation API's for rte_memzone
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [8]
 * Callbacks for registering memory allocations
 * Multiprocess support done via DPDK IPC introduced in 18.02

The biggest difference is a "memseg" now represents a single page (as opposed to
being a big contiguous block of pages). As a consequence, both memzones and
malloc elements are no longer guaranteed to be physically contiguous, unless
the user asks for it at reserve time. To preserve whatever functionality that
was dependent on previous behavior, a legacy memory option is also provided,
however it is expected (or perhaps vainly hoped) to be temporary solution.

Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.

For v1 and v2, the following limitations are present:
- FreeBSD does not even compile, let alone run
- No 32-bit support
- There are some minor quality-of-life improvements planned that aren't
  ready yet and will be part of v3
- VFIO support is only smoke-tested (but is expected to work), VFIO support
  with secondary processes is not tested; work is ongoing to validate VFIO
  for all use cases
- Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
  IOMMU mode - help from sPAPR maintainers requested

Nevertheless, this patchset should be testable under 64-bit Linux, and
should work for all use cases bar those mentioned above.

v2: - fixed deadlock at init
    - reverted rte_panic changes at init, this is now handled inside IPC

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
[3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
[4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
[5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
[6] http://dpdk.org/dev/patchwork/patch/34002/
[7] http://dpdk.org/dev/patchwork/patch/33853/
[8] http://dpdk.org/dev/patchwork/patch/24484/

Anatoly Burakov (41):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  eal: move all locking to heap
  eal: make malloc heap a doubly-linked list
  eal: add function to dump malloc heap contents
  test: add command to dump malloc heap contents
  eal: make malloc_elem_join_adjacent_free public
  eal: make malloc free list remove public
  eal: make malloc free return resulting malloc element
  eal: add rte_fbarray
  eal: add "single file segments" command-line option
  eal: add "legacy memory" option
  eal: read hugepage counts from node-specific sysfs path
  eal: replace memseg with memseg lists
  eal: add support for mapping hugepages at runtime
  eal: add support for unmapping pages at runtime
  eal: make use of memory hotplug for init
  eal: enable memory hotplug support in rte_malloc
  test: fix malloc autotest to support memory hotplug
  eal: add API to check if memory is contiguous
  eal: add backend support for contiguous allocation
  eal: enable reserving physically contiguous memzones
  eal: replace memzone array with fbarray
  mempool: add support for the new allocation methods
  vfio: allow to map other memory regions
  eal: map/unmap memory with VFIO when alloc/free pages
  eal: prepare memseg lists for multiprocess sync
  eal: add multiprocess init with memory hotplug
  eal: add support for multiprocess memory hotplug
  eal: add support for callbacks on memory hotplug
  eal: enable callbacks on malloc/free and mp sync
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory

 config/common_base                                |   15 +-
 drivers/bus/pci/linux/pci.c                       |   29 +-
 drivers/crypto/qat/qat_qp.c                       |    4 +-
 drivers/net/avf/avf_ethdev.c                      |    2 +-
 drivers/net/bnx2x/bnx2x.c                         |    2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/ena/base/ena_plat_dpdk.h              |    7 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/enic_main.c                      |    4 +-
 drivers/net/i40e/i40e_ethdev.c                    |    2 +-
 drivers/net/i40e/i40e_rxtx.c                      |    2 +-
 drivers/net/qede/base/bcm_osal.c                  |    5 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |  108 ++-
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    7 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   25 +
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |    7 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   33 +
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 +++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  181 ++++
 lib/librte_eal/common/eal_common_memory.c         |  526 ++++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  275 ++++--
 lib/librte_eal/common/eal_common_options.c        |    8 +
 lib/librte_eal/common/eal_filesystem.h            |   13 +
 lib/librte_eal/common/eal_hugepages.h             |    7 +
 lib/librte_eal/common/eal_internal_cfg.h          |   10 +-
 lib/librte_eal/common/eal_memalloc.h              |   41 +
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   29 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  352 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |    9 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |   79 +-
 lib/librte_eal/common/include/rte_memzone.h       |  155 ++-
 lib/librte_eal/common/include/rte_vfio.h          |   39 +
 lib/librte_eal/common/malloc_elem.c               |  436 +++++++--
 lib/librte_eal/common/malloc_elem.h               |   41 +-
 lib/librte_eal/common/malloc_heap.c               |  699 +++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  723 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   75 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |   54 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  155 ++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1049 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          |  516 ++++++----
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  318 +++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   11 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   23 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/rte_mempool.c                  |   87 +-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   71 +-
 test/test/test_memory.c                           |   43 +-
 test/test/test_memzone.c                          |   26 +-
 63 files changed, 6617 insertions(+), 736 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v2 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (43 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 02/41] eal: move all locking to heap Anatoly Burakov
                   ` (42 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..042881b 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, uint64_t *size,
+		uint64_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..96cebb7 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, uint64_t *size,
+		uint64_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 38853b7..5c11d77 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1339,7 +1271,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1350,11 +1282,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1364,6 +1291,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		uint64_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1372,35 +1301,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1465,7 +1385,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1474,8 +1393,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 02/41] eal: move all locking to heap
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (44 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
                   ` (41 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating or freeing OS memory, which
would involve growing or shrinking the heap.
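
As an illustration, here is a minimal, self-contained sketch (with
simplified types and names, not the DPDK code in the diff below) of
the pattern this patch applies: element-level routines stop taking
the lock themselves, and thin heap-level wrappers take the heap lock
around them, so that later patches can grow or shrink the heap inside
the same critical section.

#include <pthread.h>
#include <stddef.h>

struct heap;
struct elem { int busy; struct heap *heap; };
struct heap { pthread_mutex_t lock; unsigned int alloc_count; };

/* element-level routine: no locking, caller must hold heap->lock */
static int
elem_free(struct elem *e)
{
	e->busy = 0;
	e->heap->alloc_count--;
	return 0;
}

/* heap-level wrapper: validate, lock, delegate, unlock */
int
heap_free(struct elem *e)
{
	int ret;

	if (e == NULL || !e->busy)
		return -1;

	pthread_mutex_lock(&e->heap->lock);
	ret = elem_free(e);
	pthread_mutex_unlock(&e->heap->lock);

	return ret;
}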

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 03/41] eal: make malloc heap a doubly-linked list
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (45 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 02/41] eal: move all locking to heap Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
                   ` (40 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to a
doubly linked list and preparing the infrastructure to support it.

Since our heap is now aware of where its first and last elements
are, there is no longer any need to have a dummy element at the end
of each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.
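
For illustration, a minimal, self-contained sketch (simplified types,
not the DPDK code in the diff below) of the resulting list shape: the
heap tracks its first and last elements, element next/prev pointers
are NULL at the ends, and no dummy end-of-memseg marker is needed.

#include <stddef.h>

struct elem { struct elem *prev, *next; };
struct heap { struct elem *first, *last; };

/* append an element at the end of the heap's address-ordered list */
static void
heap_append(struct heap *h, struct elem *e)
{
	e->next = NULL;
	e->prev = h->last;

	if (h->last != NULL)
		h->last->next = e;	/* link after the current last */
	else
		h->first = e;		/* list was empty */
	h->last = e;
}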

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..9ec4b62 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *first;
+	struct malloc_elem *last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere inbetween start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 04/41] eal: add function to dump malloc heap contents
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (46 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 05/41] test: add command " Anatoly Burakov
                   ` (39 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.
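
A short usage sketch (assuming an application that has initialized
EAL; rte_malloc_dump_heaps() is the function added by this patch):

#include <stdio.h>

#include <rte_eal.h>
#include <rte_malloc.h>

int
main(int argc, char **argv)
{
	if (rte_eal_init(argc, argv) < 0)
		return 1;

	/* walk every heap and print each malloc element's state */
	rte_malloc_dump_heaps(stdout);

	return 0;
}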

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc.h |  9 +++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 16 ++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 81 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a3fc83e 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -278,6 +278,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to dump contents of a given heap to a file
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..80fb6cc 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,22 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int socket;
+
+	for (socket = 0; socket < rte_num_sockets(); socket++) {
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 52f5940..18b8bf5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -215,6 +215,7 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_malloc_dump_heaps;
 
 } DPDK_18.02;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 05/41] test: add command to dump malloc heap contents
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (47 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
                   ` (38 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 06/41] eal: make malloc_elem_join_adjacent_free public
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (48 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 05/41] test: add command " Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 07/41] eal: make malloc free list remove public Anatoly Burakov
                   ` (37 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

We need this function to join newly allocated segments with the heap.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 07/41] eal: make malloc free list remove public
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (49 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
                   ` (36 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 08/41] eal: make malloc free return resulting malloc element
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (50 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 07/41] eal: make malloc free list remove public Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 09/41] eal: add rte_fbarray Anatoly Burakov
                   ` (35 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 09/41] eal: add rte_fbarray
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (51 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 10/41] eal: add "single file segments" command-line option Anatoly Burakov
                   ` (34 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

rte_fbarray is a simple indexed array stored in shared memory by
mapping a file into memory. The rationale for its existence is the
following: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, the page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that involves reallocating memory (which is a big no-no in
multiprocess). What we can do instead is set the maximum capacity to
something really large, and decide at allocation time how big the
array is going to be. We map the entire file into memory, which
makes it possible to use fbarray as shared memory, provided the
structure itself is allocated in shared memory. Per-fbarray locking
is also used to avoid index data races (but not contents data
races - those are up to the user application to synchronize).

In addition, since we will frequently need to scan this array for
free space and iterating over the array linearly can become slow,
rte_fbarray provides facilities to index the array's usage. The
following use cases are covered:
 - find the next free/used slot (useful either for adding new
   elements to the fbarray, or for walking the list)
 - find the starting index for the next N free/used slots (useful
   when we want to allocate a chunk of VA-contiguous memory composed
   of several pages)
 - find how many contiguous free/used slots there are, starting
   from a specified index (useful when we want to figure out how
   many pages we have until the next hole in allocated memory, to
   speed up some bulk operations where we would otherwise have to
   walk the array and add pages one by one)

This is accomplished by storing a usage mask in memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
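
To make the intended usage concrete, here is a small single-process
sketch using the API added in this patch (for multiprocess use, the
rte_fbarray structure itself would have to live in shared memory, as
noted above; the element type and array name are just examples):

#include <rte_fbarray.h>

struct my_elt { int value; };

static int
fbarray_example(void)
{
	struct rte_fbarray arr;
	struct my_elt *elt;
	int idx;

	/* create a file-backed array with space for 1024 elements */
	if (rte_fbarray_init(&arr, "example", 1024,
			sizeof(struct my_elt)) < 0)
		return -1;

	/* find a free slot, fill it, and mark it as used */
	idx = rte_fbarray_find_next_free(&arr, 0);
	if (idx < 0)
		return -1;

	elt = rte_fbarray_get(&arr, idx);
	elt->value = 42;
	rte_fbarray_set_used(&arr, idx);

	return 0;
}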

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    The initial version of this had a resizing capability; however,
    it was removed because in a multiprocess scenario each fbarray
    would have its own view of the mapped memory, which might not
    correspond with the others' if some other process performed a
    resize that the current process didn't know about.

    It was therefore decided that, to avoid the cost of synchronizing
    on each and every operation (to make sure the array wasn't
    resized), the resizing feature should be dropped.

 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 352 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  17 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..76d86c3
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) ((idx << MASK_SHIFT) + mod)
+
+/*
+ * This is a mask that is always stored at the end of array, to provide fast
+ * way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	char path[PATH_MAX];
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		RTE_LOG(ERR, EAL, "Cannot truncate %s\n", path);
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit in within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? -ENOENT : -ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? -ENOENT : -ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	uint64_t msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	int msk_idx = MASK_LEN_TO_IDX(idx);
+	bool already_used;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	ret = 0;
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t mmap_len, page_sz;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(len);
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	uint64_t mmap_len, page_sz;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as two values we need (element
+	 * size and array length) are constant for the duration of life of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as two values we need (element
+	 * size and total capacity) are constant for the duration of life of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it, things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. a repeated call from a previously
+	 * failed destroy()), but this is on the user, as we can't (easily)
+	 * know if this is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		rte_errno = errno;
+		return -1;
+	}
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void *
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		return;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name)
+{
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..4e1d207
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,352 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
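+ * For example, a minimal single-process sketch (``struct my_elt``, ``src``
+ * and the array name are placeholders, not part of the API):
+ *
+ *     struct rte_fbarray arr;
+ *     struct my_elt src = { 0 };
+ *     int idx;
+ *
+ *     if (rte_fbarray_init(&arr, "my_array", 64, sizeof(src)) < 0)
+ *         return -1;
+ *     idx = rte_fbarray_find_next_free(&arr, 0);
+ *     if (idx < 0)
+ *         return -1;
+ *     memcpy(rte_fbarray_get(&arr, idx), &src, sizeof(src));
+ *     rte_fbarray_set_used(&arr, idx);
+ *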
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * data pointed to by ``arr`` pointer must itself be allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with data
+ * pointed to by ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill data pointed to by ``arr`` pointer and remove the underlying file
+ * backing the data, so it is expected that by the time this function is called,
+ * all other processes have detached from this ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void *
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to element to find index to.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to check as used.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many more free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many more used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 18b8bf5..a938a2f 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -216,6 +216,23 @@ DPDK_18.05 {
 
 	rte_num_sockets;
 	rte_malloc_dump_heaps;
+	rte_fbarray_init;
+	rte_fbarray_destroy;
+	rte_fbarray_attach;
+	rte_fbarray_detach;
+	rte_fbarray_resize;
+	rte_fbarray_get;
+	rte_fbarray_find_idx;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
+	rte_fbarray_is_used;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_dump_metadata;
 
 } DPDK_18.02;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 10/41] eal: add "single file segments" command-line option
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (52 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 09/41] eal: add rte_fbarray Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 11/41] eal: add "legacy memory" option Anatoly Burakov
                   ` (33 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For now, this option does nothing, but it will be used by the dynamic
memory allocation code later in this patchset. Currently, DPDK stores
each hugepage as a separate file in hugetlbfs. This option will allow
storing all pages in a single file (one file per socket, per page size).
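
For example, a hypothetical invocation ("dpdk-app" is just a placeholder
for any EAL-enabled binary) would simply add the new flag:

    ./dpdk-app -l 0-3 -n 4 --single-file-segments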

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_options.c | 4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   | 4 ++++
 lib/librte_eal/common/eal_options.h        | 2 ++
 lib/librte_eal/linuxapp/eal/eal.c          | 1 +
 4 files changed, 11 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 0be80cb..dbc3fb5 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1161,6 +1162,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index a0082d1..d4c02d6 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..a4b80d5 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..c84e6bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 11/41] eal: add "legacy memory" option
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (53 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 10/41] eal: add "single file segments" command-line option Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                   ` (32 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy
memory init sequence will be added later.
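
For illustration, a hypothetical invocation ("dpdk-app" is a placeholder)
that opts back into the old behavior:

    ./dpdk-app -l 0-3 -n 4 --legacy-mem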

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..45e5670 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -531,6 +531,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index dbc3fb5..3e92551 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1165,6 +1166,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_SINGLE_FILE_SEGMENTS_NUM:
 		conf->single_file_segments = 1;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index d4c02d6..4a43de6 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true if storing all pages within single files (per-page-size,
 	 * per-node).
 	 */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index a4b80d5..f9a679d 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
 	OPT_SINGLE_FILE_SEGMENTS_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index c84e6bf..5207713 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5c11d77..b9bcb75 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -919,8 +919,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1262,8 +1262,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1399,6 +1399,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 12/41] eal: read hugepage counts from node-specific sysfs path
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (54 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 11/41] eal: add "legacy memory" option Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 13/41] eal: replace memseg with memseg lists Anatoly Burakov
                   ` (31 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For the non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node to get
hugepage counts. Note that the per-NUMA node path does not provide
information about reserved pages, so we might not get the best info from
these paths; however, this saves us from the whole mapping/remapping
business before we're actually able to tell which page is on which
socket, because we no longer require our memory to be physically
contiguous.

Legacy memory init will not use this.
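
For reference, the per-node path read here looks like this (for 2M pages
on node 0, assuming the standard sysfs layout):

    /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages

as opposed to the global /sys/kernel/mm/hugepages path used until now.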

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 79 +++++++++++++++++++++++--
 1 file changed, 73 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..706b6d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -30,6 +30,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -70,6 +71,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -248,7 +288,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -302,9 +342,27 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if first attempts fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_num_sockets(); i++) {
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, i);
+				hpi->num_pages[i] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * we failed to sort memory from the get go, so fall
+		 * back to old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -328,10 +386,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 13/41] eal: replace memseg with memseg lists
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (55 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-24  6:01   ` santosh
  2018-03-07 16:56 ` [PATCH v2 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
                   ` (30 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all
memory in advance. As in, we do an anonymous mmap() of the entire
maximum size of memory per hugepage size, per socket (which is
limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_PER_TYPE gigabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_PER_LIST
gigabytes per list, whichever is the smaller one).

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only
for now), and largely consists of copied EAL memory init code.
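
As a worked example with the default values introduced in this patch
(RTE_MAX_MEMSEG_PER_LIST=8192, RTE_MAX_MEM_PER_LIST=32G,
RTE_MAX_MEMSEG_PER_TYPE=32768, RTE_MAX_MEM_PER_TYPE=128G): with 1G pages,
each list is capped by memory at 32G (32 segments), so a socket gets four
lists covering the full 128G; with 2M pages, each list is capped by
segment count at 8192 segments (16G), so a socket gets four lists (64G in
total) before hitting the 32768-segment per-type limit.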

Pages in the list are also indexed by address. That is, in
non-legacy mode, in order to figure out where a page belongs, one
can simply look at the base address of a memseg list. Similarly,
figuring out the IOVA address of a memzone is a matter of finding
the right memseg list, getting the offset and dividing by the page
size to get the appropriate memseg. In legacy mode, the old
behavior of walking the memseg list remains.
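
Roughly, the non-legacy lookup then boils down to the following sketch
(illustrative only, not the exact code; it assumes "addr" is already
known to fall within the VA range of "msl"):

    static rte_iova_t
    virt2iova_sketch(const struct rte_memseg_list *msl, const void *addr)
    {
        int ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->hugepage_sz;
        const struct rte_memseg *ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);

        return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
    }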

Due to the switch to fbarray, and to avoid any intrusive changes,
secondary processes are not supported in this commit. Also, one
particular API call (get physmem layout) no longer makes sense
and was removed, in accordance with the deprecation notice [1].

In legacy mode, nothing is preallocated, and all memsegs are in
a list like before, but each segment still resides in an appropriate
memseg list.

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists.

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 config/common_base                                |  15 +-
 drivers/bus/pci/linux/pci.c                       |  29 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     | 108 +++++---
 lib/librte_eal/common/eal_common_memory.c         | 322 +++++++++++++++++++---
 lib/librte_eal/common/eal_common_memzone.c        |  12 +-
 lib/librte_eal/common/eal_hugepages.h             |   2 +
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  33 ++-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |   8 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  92 +++++--
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  21 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 297 +++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            | 164 +++++++----
 lib/librte_eal/rte_eal_version.map                |   3 +-
 test/test/test_malloc.c                           |  29 +-
 test/test/test_memory.c                           |  43 ++-
 test/test/test_memzone.c                          |  17 +-
 21 files changed, 917 insertions(+), 331 deletions(-)

diff --git a/config/common_base b/config/common_base
index ad03cf4..e9c1d93 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=32
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_PER_LIST gigabytes worth of memory, whichever is the smallest
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_PER_LIST=32
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or RTE_MAX_MEM_PER_TYPE
+# gigabytes of memory (split over multiple lists of RTE_MAX_MEM_PER_LIST),
+# whichever is the smallest
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_PER_TYPE=128
+# legacy mem mode only
+CONFIG_RTE_MAX_LEGACY_MEMSEG=256
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..ec05d7c 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -119,19 +119,30 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 void *
 pci_find_max_end_va(void)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *cur_end, *max_end = NULL;
+	int i = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int ms_idx = 0;
 
-		if (seg->addr > last->addr)
-			last = seg;
+		if (arr->count == 0)
+			continue;
 
+		/*
+		 * we need to handle legacy mem case, so don't rely on page size
+		 * to calculate max VA end
+		 */
+		while ((ms_idx = rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			cur_end = RTE_PTR_ADD(ms->addr, ms->len);
+			if (cur_end > max_end)
+				max_end = cur_end;
+			ms_idx++;
+		}
 	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	return max_end;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 8d0a1ab..23c5e1c 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,42 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+/* returns number of segments processed */
+static int
+add_memory_region(struct vhost_memory_region *mr, struct rte_fbarray *arr,
+		int reg_start_idx, int max)
+{
+	const struct rte_memseg *ms;
+	void *start_addr, *expected_addr;
+	uint64_t len;
+	int idx;
+
+	idx = reg_start_idx;
+	len = 0;
+	start_addr = NULL;
+	expected_addr = NULL;
+
+	/* we could've relied on page size, but we have to support legacy mem */
+	while (idx < max) {
+		ms = rte_fbarray_get(arr, idx);
+		if (expected_addr == NULL) {
+			start_addr = ms->addr;
+			expected_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		} else if (ms->addr != expected_addr) {
+			break;
+		}
+		len += ms->len;
+		idx++;
+	}
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return idx;
+}
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,8 +113,7 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
+	uint32_t list_idx, region_nr = 0;
 	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
 
@@ -88,52 +123,41 @@ prepare_vhost_memory_kernel(void)
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
+	for (list_idx = 0; list_idx < RTE_MAX_MEMSEG_LISTS; ++list_idx) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[list_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int reg_start_idx, search_idx;
 
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
-
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
+		/* skip empty segment lists */
+		if (arr->count == 0)
 			continue;
 
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
+		search_idx = 0;
+		while ((reg_start_idx = rte_fbarray_find_next_used(arr,
+				search_idx)) >= 0) {
+			int reg_n_pages;
+			if (region_nr >= max_regions) {
+				free(vm);
+				return NULL;
+			}
+			mr = &vm->regions[region_nr++];
+
+			/*
+			 * we know memseg starts at search_idx, check how many
+			 * segments there are
+			 */
+			reg_n_pages = rte_fbarray_find_contig_used(arr,
+					search_idx);
+
+			/* look at at most reg_n_pages of memsegs */
+			search_idx = add_memory_region(mr, arr, reg_start_idx,
+					search_idx + reg_n_pages);
 		}
 	}
 
-	vm->nregions = k;
+	vm->nregions = region_nr;
 	vm->padding = 0;
 	return vm;
 }
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 042881b..457e239 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%luk-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,245 @@ eal_get_virtual_area(void *requested_addr, uint64_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz)
+{
+	uint64_t area_sz, max_pages;
+
+	max_pages = internal_config.legacy_mem || internal_config.no_hugetlbfs ?
+			RTE_MAX_LEGACY_MEMSEG : RTE_MAX_MEMSEG_PER_LIST;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_PER_LIST GB */
+	area_sz = RTE_MIN(page_sz * max_pages,
+			(uint64_t) RTE_MAX_MEM_PER_LIST << 30);
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return rte_align64pow2(area_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	int max_pages;
+	uint64_t mem_amount;
+	void *addr;
+
+	if (!internal_config.legacy_mem) {
+		mem_amount = get_mem_amount(page_sz);
+		max_pages = mem_amount / page_sz;
+
+		addr = eal_get_virtual_area(NULL, &mem_amount, page_sz, 0, 0);
+		if (addr == NULL) {
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+			return -1;
+		}
+	} else {
+		addr = NULL;
+		/* number of memsegs in each list; these will not be single-page
+		 * segments, so RTE_MAX_LEGACY_MEMSEG is like the old default.
+		 */
+		max_pages = RTE_MAX_LEGACY_MEMSEG;
+	}
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_pages,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->hugepage_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = addr;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+memseg_init(void)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		RTE_LOG(ERR, EAL, "Secondary process not supported\n");
+		return -1;
+	}
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (socket_id = 0; socket_id < (int) rte_num_sockets();
+				socket_id++) {
+			uint64_t max_mem, total_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			max_mem = (uint64_t)RTE_MAX_MEM_PER_TYPE << 30;
+			/* no-huge behaves the same as legacy */
+			max_segs = internal_config.legacy_mem ||
+					internal_config.no_hugetlbfs ?
+					RTE_MAX_LEGACY_MEMSEG :
+					RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_mem < max_mem && total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase CONFIG_RTE_MAX_MEMSEG_LISTS\n");
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						socket_id, type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_mem = total_segs * msl->hugepage_sz;
+				type_msl_idx++;
+			}
+		}
+	}
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	const struct rte_fbarray *arr;
+	int ms_idx;
+
+	/* a memseg list was specified, check if it's the right one */
+	void *start, *end;
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, msl->hugepage_sz *
+			msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->hugepage_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start, msl->hugepage_sz *
+				msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+static struct rte_memseg_list *
+virt2memseg_list_legacy(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int msl_idx, ms_idx;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		ms_idx = 0;
+		while ((ms_idx =
+				rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {
+			const struct rte_memseg *ms;
+			void *start, *end;
+			ms = rte_fbarray_get(arr, ms_idx);
+			start = ms->addr;
+			end = RTE_PTR_ADD(start, ms->len);
+			if (addr >= start && addr < end)
+				return msl;
+			ms_idx++;
+		}
+	}
+	return NULL;
+}
+
+static struct rte_memseg *
+virt2memseg_legacy(const void *addr)
+{
+	struct rte_mem_config *mcfg =
+		rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int msl_idx, ms_idx;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		ms_idx = 0;
+		while ((ms_idx =
+				rte_fbarray_find_next_used(arr, ms_idx)) >= 0) {
+			struct rte_memseg *ms;
+			void *start, *end;
+			ms = rte_fbarray_get(arr, ms_idx);
+			start = ms->addr;
+			end = RTE_PTR_ADD(start, ms->len);
+			if (addr >= start && addr < end)
+				return ms;
+			ms_idx++;
+		}
+	}
+	return NULL;
+}
+
+struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	/* for legacy memory, we just walk the list, like in the old days. */
+	if (internal_config.legacy_mem)
+		return virt2memseg_list_legacy(addr);
+	else
+		return virt2memseg_list(addr);
+}
+
+struct rte_memseg *
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	/* for legacy memory, we just walk the list, like in the old days. */
+	if (internal_config.legacy_mem)
+		/* ignore msl value */
+		return virt2memseg_legacy(addr);
+
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 
@@ -136,18 +369,32 @@ rte_eal_get_physmem_layout(void)
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
 	unsigned i = 0;
 	uint64_t total_len = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		total_len += mcfg->memseg[i].len;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		/* for legacy mem mode, walk the memsegs */
+		if (internal_config.legacy_mem) {
+			struct rte_fbarray *arr = &msl->memseg_arr;
+			int ms_idx = 0;
+
+			while ((ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx)) >= 0) {
+				const struct rte_memseg *ms =
+						rte_fbarray_get(arr, ms_idx);
+				total_len += ms->len;
+				ms_idx++;
+			}
+		} else
+			total_len += msl->hugepage_sz * msl->memseg_arr.count;
 	}
 
 	return total_len;
@@ -157,27 +404,35 @@ rte_eal_get_physmem_size(void)
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
 	unsigned i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int m_idx = 0;
+
+		if (arr->count == 0)
+			continue;
+
+		while ((m_idx = rte_fbarray_find_next_used(arr, m_idx)) >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, m_idx);
+			fprintf(f, "Page %u-%u: iova:0x%"PRIx64", len:%zu, "
+			       "virt:%p, socket_id:%"PRId32", "
+			       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			       "nrank:%"PRIx32"\n", i, m_idx,
+			       ms->iova,
+			       ms->len,
+			       ms->addr,
+			       ms->socket_id,
+			       ms->hugepage_sz,
+			       ms->nchannel,
+			       ms->nrank);
+			m_idx++;
+		}
 	}
 }
 
@@ -222,9 +477,14 @@ rte_mem_lock_page(const void *virt)
 int
 rte_eal_memory_init(void)
 {
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	retval = memseg_init();
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..ed36174 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -226,10 +226,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->hugepage_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -382,7 +381,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -391,12 +389,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..f963ae5 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -23,6 +23,8 @@ struct hugepage_file {
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
 	int memseg_id;      /**< the memory segment to which page belongs */
+	int memseg_list_id;
+	/**< the memory segment list to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 4a43de6..601ba90 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..31fc8e7 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * memseg list is a special case as we need to store a bunch of other data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t hugepage_sz; /**< page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..674d4cb 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -22,6 +22,9 @@ extern "C" {
 #include <rte_common.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -130,21 +133,27 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
- * Get the layout of the available physical memory.
+ * Get memseg corresponding to virtual memory address.
  *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * @param virt
+ *   The virtual address.
+ * @param msl
+ *   Memseg list in which to look for memsegs (can be NULL).
+ * @return
+ *   Memseg to which this virtual address belongs.
+ */
+struct rte_memseg *rte_mem_virt2memseg(const void *virt,
+		const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
  *
+ * @param virt
+ *   The virtual address.
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
- */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+ *   Memseg list to which this virtual address belongs.
+ */
+struct rte_memseg_list *rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Dump the physical memory layout to a file.
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..a69f068 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -66,7 +66,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..701bffd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -26,11 +26,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -145,7 +145,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..388c16f 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,7 +5,7 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -24,7 +24,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -111,7 +111,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..058ad75 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,22 +63,25 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += elem_size;
+	heap->total_size += len;
+
+	return elem;
 }
 
 /*
@@ -98,7 +102,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -243,16 +248,65 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
+	int msl_idx;
+	struct rte_memseg_list *msl;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		int start;
+		struct rte_fbarray *arr;
+		struct malloc_heap *heap;
+
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+		heap = &mcfg->malloc_heaps[msl->socket_id];
+
+		if (arr->count == 0)
+			continue;
+
+		/* for legacy mode, just walk the list */
+		if (internal_config.legacy_mem) {
+			int ms_idx = 0;
+			while ((ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx)) >= 0) {
+				struct rte_memseg *ms =
+						rte_fbarray_get(arr, ms_idx);
+				malloc_heap_add_memory(heap, msl,
+						ms->addr, ms->len);
+				ms_idx++;
+				RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+					msl->socket_id, ms->len >> 20ULL);
+			}
+			continue;
+		}
+
+		/* find first segment */
+		start = rte_fbarray_find_next_used(arr, 0);
+
+		while (start >= 0) {
+			int contig_segs;
+			struct rte_memseg *start_seg;
+			size_t len, hugepage_sz = msl->hugepage_sz;
+
+			/* find how many pages we can lump in together */
+			contig_segs = rte_fbarray_find_contig_used(arr, start);
+			start_seg = rte_fbarray_get(arr, start);
+			len = contig_segs * hugepage_sz;
+
+			/*
+			 * we've found (hopefully) a bunch of contiguous
+			 * segments, so add them to the heap.
+			 */
+			malloc_heap_add_memory(heap, msl, start_seg->addr, len);
+
+			RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+				msl->socket_id, len >> 20ULL);
+
+			start = rte_fbarray_find_next_used(arr,
+					start + contig_segs);
+		}
 	}
 
 	return 0;
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 80fb6cc..bd7e757 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -238,17 +238,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t) addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5207713..7851a7d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -643,17 +643,20 @@ eal_parse_args(int argc, char **argv)
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
+	const struct rte_memseg_list *msl;
 	int i, socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		if (msl->socket_id != socket_id)
+			continue;
+		/* for legacy memory, check if there's anything allocated */
+		if (internal_config.legacy_mem && msl->memseg_arr.count == 0)
+			continue;
+		return;
+	}
 
 	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
 			"memory on local socket!\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index b9bcb75..9512da9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -908,6 +908,28 @@ huge_recover_sigbus(void)
 	}
 }
 
+/* in legacy mode, each combination of socket and pagesize directly map to a
+ * single memseg list.
+ */
+static struct rte_memseg_list *
+get_memseg_list(int socket, uint64_t page_sz)
+{
+	struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		msl = &mcfg->memsegs[msl_idx];
+		if (msl->hugepage_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket)
+			continue;
+		return msl;
+	}
+	return NULL;
+}
+
 /*
  * Prepare physical memory mapping: fill configuration structure with
  * these infos, return 0 on success.
@@ -925,11 +947,14 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
 	int i, j, new_memseg;
+	int ms_idx, msl_idx;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -942,6 +967,12 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		arr = &mcfg->memsegs[0].memseg_arr;
+		ms = rte_fbarray_get(arr, 0);
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -949,14 +980,15 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
+		rte_fbarray_set_used(arr, 0);
 		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
+			ms->iova = (uintptr_t)addr;
 		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+			ms->iova = RTE_BAD_IOVA;
+		ms->addr = addr;
+		ms->hugepage_sz = RTE_PGSIZE_4K;
+		ms->len = internal_config.memory;
+		ms->socket_id = 0;
 		return 0;
 	}
 
@@ -1197,27 +1229,51 @@ eal_legacy_hugepage_init(void)
 #endif
 
 		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
+			struct rte_memseg_list *msl;
+			int socket;
+			uint64_t page_sz;
+
+			socket = hugepage[i].socket_id;
+			page_sz = hugepage[i].size;
+
+			if (page_sz == 0)
+				continue;
+
+			/* figure out where to put this memseg */
+			msl = get_memseg_list(socket, page_sz);
+			if (!msl)
+				rte_panic("Unknown socket or page sz: %i %lx\n",
+					socket, page_sz);
+			msl_idx = msl - &mcfg->memsegs[0];
+			arr = &msl->memseg_arr;
+
+			ms_idx = rte_fbarray_find_next_free(arr, arr->count);
+			if (ms_idx < 0) {
+				RTE_LOG(ERR, EAL, "No space in memseg list\n");
 				break;
+			}
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			ms->iova = hugepage[i].physaddr;
+			ms->addr = hugepage[i].final_va;
+			ms->len = page_sz;
+			ms->socket_id = socket;
+			ms->hugepage_sz = page_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
+			rte_fbarray_set_used(arr, ms_idx);
 		}
 		/* continuation of previous memseg */
 		else {
 #ifdef RTE_ARCH_PPC_64
 		/* Use the phy and virt address of the last page as segment
 		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
+			ms->iova = hugepage[i].physaddr;
+			ms->addr = hugepage[i].final_va;
 #endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
+			ms->len += ms->hugepage_sz;
 		}
-		hugepage[i].memseg_id = j;
+		hugepage[i].memseg_id = ms_idx;
+		hugepage[i].memseg_list_id = msl_idx;
 	}
 
 	if (i < nr_hugefiles) {
@@ -1227,7 +1283,7 @@ eal_legacy_hugepage_init(void)
 			"Please either increase it or request less amount "
 			"of memory.\n",
 			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
+			RTE_MAX_MEMSEG_PER_LIST);
 		goto fail;
 	}
 
@@ -1265,11 +1321,12 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i;
+	int ms_idx, msl_idx;
+	unsigned int cur_seg, max_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1289,46 +1346,57 @@ eal_legacy_hugepage_attach(void)
 	}
 
 	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
+	max_seg = 0;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
 		uint64_t mmap_sz;
 		int mmap_flags = 0;
 
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			void *base_addr;
 
-		/* get identical addresses as the primary process.
-		 */
+
+			/*
+			 * the first memory segment with len==0 is the one that
+			 * follows the last valid segment.
+			 */
+			if (ms->len == 0)
+				break;
+
+			/* get identical addresses as the primary process.
+			 */
 #ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
+			mmap_flags |= MAP_HUGETLB;
 #endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
+			mmap_sz = ms->len;
+			base_addr = eal_get_virtual_area(ms->addr, &mmap_sz,
+					ms->hugepage_sz, 0, mmap_flags);
+			if (base_addr == NULL) {
+				if (rte_errno == EADDRNOTAVAIL) {
+					RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+						(unsigned long long)ms->len,
+						ms->addr);
+				} else {
+					RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
+						(unsigned long long)ms->len,
+						ms->addr, rte_strerror(rte_errno));
+				}
+				if (aslr_enabled() > 0) {
+					RTE_LOG(ERR, EAL, "It is recommended to "
+						"disable ASLR in the kernel "
+						"and retry running both primary "
+						"and secondary processes\n");
+				}
+				goto error;
 			}
-			goto error;
+			max_seg++;
+			ms_idx++;
+
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx);
 		}
 	}
 
@@ -1342,46 +1410,67 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
+	/* map the hugepage files over the address space reserved earlier */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+			void *addr, *base_addr;
+			uintptr_t offset = 0;
+			size_t mapping_size;
+
+			/*
+			 * free previously mapped memory so we can map the
+			 * hugepages into the space
+			 */
+			base_addr = ms->addr;
+			munmap(base_addr, ms->len);
+
+			/*
+			 * find the hugepages for this segment and map them
+			 * we don't need to worry about order, as the server
+			 * sorted the entries before it did the second mmap of
+			 * them
+			 */
+			for (i = 0; i < num_hp && offset < ms->len; i++) {
+				if (hp[i].memseg_id == ms_idx &&
+						hp[i].memseg_list_id ==
+						msl_idx) {
+					fd = open(hp[i].filepath, O_RDWR);
+					if (fd < 0) {
+						RTE_LOG(ERR, EAL, "Could not open %s\n",
+							hp[i].filepath);
+						goto error;
+					}
+					mapping_size = hp[i].size;
+					addr = mmap(RTE_PTR_ADD(base_addr,
+							offset),
+							mapping_size,
+							PROT_READ | PROT_WRITE,
+							MAP_SHARED, fd, 0);
+					/*
+					 * close file both on success and on
+					 * failure
+					 */
+					close(fd);
+					if (addr == MAP_FAILED ||
+							addr != RTE_PTR_ADD(
+							base_addr, offset)) {
+						RTE_LOG(ERR, EAL, "Could not mmap %s\n",
+							hp[i].filepath);
+						goto error;
+					}
+					offset += mapping_size;
 				}
-				offset+=mapping_size;
 			}
+			RTE_LOG(DEBUG, EAL, "Mapped segment of size 0x%llx\n",
+					(unsigned long long)ms->len);
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1389,8 +1478,28 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unmap everything we mapped before the failure */
+	cur_seg = 0;
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[msl_idx];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+
+		if (cur_seg >= max_seg)
+			break;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+
+			if (cur_seg >= max_seg)
+				break;
+			munmap(ms->addr, ms->len);
+
+			cur_seg++;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index e44ae4d..5192763 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -667,33 +667,53 @@ vfio_get_group_no(const char *sysfs_base,
 static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
 	int i, ret;
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		int ms_idx, next_idx;
 
-		if (ms[i].addr == NULL)
-			break;
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		arr = &msl->memseg_arr;
 
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			dma_map.iova = dma_map.vaddr;
-		else
-			dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+		/* skip empty memseg lists */
+		if (arr->count == 0)
+			continue;
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		next_idx = 0;
 
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
+		while ((ms_idx = rte_fbarray_find_next_used(arr,
+				next_idx)) >= 0) {
+			uint64_t addr, len, hw_addr;
+			const struct rte_memseg *ms;
+			next_idx = ms_idx + 1;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			addr = ms->addr_64;
+			len = ms->len;
+			hw_addr = ms->iova;
+
+			memset(&dma_map, 0, sizeof(dma_map));
+			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+			dma_map.vaddr = addr;
+			dma_map.size = len;
+			dma_map.iova = hw_addr;
+			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+					VFIO_DMA_MAP_FLAG_WRITE;
+
+			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
+					&dma_map);
+
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+						  "error %i (%s)\n", errno,
+						  strerror(errno));
+				return -1;
+			}
 		}
 	}
 
@@ -703,8 +723,8 @@ vfio_type1_dma_map(int vfio_container_fd)
 static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
 	int i, ret;
+	uint64_t hugepage_sz = 0;
 
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
@@ -738,17 +758,31 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int idx, next_idx;
+
+		if (msl->base_va == NULL)
+			continue;
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		next_idx = 0;
+		while ((idx = rte_fbarray_find_next_used(arr, next_idx)) >= 0) {
+			const struct rte_memseg *ms = rte_fbarray_get(arr, idx);
+			hugepage_sz = RTE_MAX(hugepage_sz, ms->hugepage_sz);
+			create.window_size = RTE_MAX(create.window_size,
+					ms->iova + ms->len);
+			next_idx = idx + 1;
+		}
 	}
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.page_shift = __builtin_ctzll(hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -764,41 +798,61 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct vfio_iommu_type1_dma_map dma_map;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		int ms_idx, next_idx;
 
-		if (ms[i].addr == NULL)
-			break;
+		msl = &rte_eal_get_configuration()->mem_config->memsegs[i];
+		arr = &msl->memseg_arr;
 
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
+		/* skip empty memseg lists */
+		if (arr->count == 0)
+			continue;
 
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			dma_map.iova = dma_map.vaddr;
-		else
-			dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
+		next_idx = 0;
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		while ((ms_idx = rte_fbarray_find_next_used(arr,
+				next_idx)) >= 0) {
+			uint64_t addr, len, hw_addr;
+			const struct rte_memseg *ms;
+			next_idx = ms_idx + 1;
 
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			addr = ms->addr_64;
+			len = ms->len;
+			hw_addr = ms->iova;
 
+			reg.vaddr = (uintptr_t) addr;
+			reg.size = len;
+			ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+						errno, strerror(errno));
+				return -1;
+			}
+
+			memset(&dma_map, 0, sizeof(dma_map));
+			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+			dma_map.vaddr = addr;
+			dma_map.size = len;
+			dma_map.iova = hw_addr;
+			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+					VFIO_DMA_MAP_FLAG_WRITE;
+
+			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
+					&dma_map);
+
+			if (ret) {
+				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+						  "error %i (%s)\n", errno,
+						  strerror(errno));
+				return -1;
+			}
+		}
 	}
 
 	return 0;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index a938a2f..4c2e959 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -215,6 +214,8 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_malloc_dump_heaps;
 	rte_fbarray_init;
 	rte_fbarray_destroy;
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index d23192c..8484fb6 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -705,15 +706,23 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
-/* Check if memory is available on a specific socket */
+/* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
 	unsigned i;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		const struct rte_memseg_list *msl =
+				&mcfg->memsegs[i];
+		const struct rte_fbarray *arr = &msl->memseg_arr;
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (arr->count)
 			return 1;
 	}
 	return 0;
@@ -726,16 +735,8 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..8cb52d7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -25,10 +28,12 @@
 static int
 test_memory(void)
 {
+	const struct rte_memzone *mz = NULL;
 	uint64_t s;
 	unsigned i;
 	size_t j;
-	const struct rte_memseg *mem;
+	struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -40,20 +45,42 @@ test_memory(void)
 	/* check that memory size is != 0 */
 	s = rte_eal_get_physmem_size();
 	if (s == 0) {
-		printf("No memory detected\n");
-		return -1;
+		printf("No memory detected, attempting to allocate\n");
+		mz = rte_memzone_reserve("tmp", 1000, SOCKET_ID_ANY, 0);
+
+		if (!mz) {
+			printf("Failed to allocate a memzone\n");
+			return -1;
+		}
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		struct rte_fbarray *arr = &msl->memseg_arr;
+		int search_idx, cur_idx;
+
+		if (arr->count == 0)
+			continue;
+
+		search_idx = 0;
 
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
+		while ((cur_idx = rte_fbarray_find_next_used(arr,
+				search_idx)) >= 0) {
+			const struct rte_memseg *ms;
+
+			ms = rte_fbarray_get(arr, cur_idx);
+
+			/* check memory */
+			for (j = 0; j < ms->len; j++)
+				*((volatile uint8_t *) ms->addr + j);
+			search_idx = cur_idx + 1;
 		}
 	}
 
+	if (mz)
+		rte_memzone_free(mz);
+
 	return 0;
 }
 
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..47f4de8 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -108,22 +108,25 @@ static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
 	int hugepage_2MB_avail = 0;
 	int hugepage_1GB_avail = 0;
 	int hugepage_16MB_avail = 0;
 	int hugepage_16GB_avail = 0;
 	const size_t size = 100;
 	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_mem_config *mcfg =
+				rte_eal_get_configuration()->mem_config;
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->hugepage_sz == RTE_PGSIZE_2M)
 			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
+		if (msl->hugepage_sz == RTE_PGSIZE_1G)
 			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
+		if (msl->hugepage_sz == RTE_PGSIZE_16M)
 			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
+		if (msl->hugepage_sz == RTE_PGSIZE_16G)
 			hugepage_16GB_avail = 1;
 	}
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 14/41] eal: add support for mapping hugepages at runtime
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (56 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 13/41] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 15/41] eal: add support for unmapping pages " Anatoly Burakov
                   ` (29 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Nothing uses this code yet. The bulk of it is copied from old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing that we'll get contiguous VA for all of the pages
that we requested.

For single-file segments, we will use fallocate() to grow and
shrink memory segments; however, fallocate() is not supported
on all kernel versions, so we will fall back to using
ftruncate() to grow the file, and disable shrinking as there's
little we can do there. This will enable vhost use cases where
having single file segments is of great value even without
support for hot-unplugging memory.

Not supported on FreeBSD.

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't
have to keep the original fd's around. Plus, using fcntl() gives
us the ability to lock parts of a file, which is useful for
single-file segments.
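
As a rough usage sketch of the bulk allocation API declared in
eal_memalloc.h below (hypothetical caller; the all-or-nothing meaning
of the 'exact' flag and the negative return value on failure are
assumptions, error handling trimmed):

	#include <stdbool.h>
	#include <rte_memory.h>
	#include "eal_memalloc.h"

	static int
	grow_by_sixteen_pages(void)
	{
		struct rte_memseg *pages[16];
		uint64_t page_sz = RTE_PGSIZE_2M; /* hypothetical page size */
		int socket = 0;                   /* hypothetical NUMA node */

		/* request 16 pages with contiguous VA; 'exact' is assumed
		 * to mean "fail unless all 16 pages can be allocated"
		 */
		if (eal_memalloc_alloc_page_bulk(pages, 16, page_sz, socket,
				true) < 0)
			return -1;
		/* on success, pages[0]..pages[15] describe the new hugepages */
		return 0;
	}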

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  19 +
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 609 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 659 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..be8340b
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n, uint64_t __rte_unused size,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t __rte_unused size, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..c1076cf
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t size, int socket);
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
+		int socket, bool exact);
+
+#endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..1ba1201
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,609 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+/*
+ * not all kernel version support fallocate on hugetlbfs, so fall back to
+ * ftruncate and disallow deallocation if fallocate is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's since we need each fd
+ * only once. However, in single file segments mode, we can get away with using
+ * a single fd for entire segments, but we need to store them somewhere. Each
+ * fd is different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Double linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	bool have_numa = true;
+
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		have_numa = false;
+	}
+
+	if (have_numa) {
+		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+		if (get_mempolicy(oldpolicy, oldmask->maskp,
+				  oldmask->size + 1, 0, 0) < 0) {
+			RTE_LOG(ERR, EAL,
+				"Failed to get current mempolicy: %s. "
+				"Assuming MPOL_DEFAULT.\n", strerror(errno));
+			*oldpolicy = MPOL_DEFAULT;
+		}
+		RTE_LOG(DEBUG, EAL,
+			"Setting policy MPOL_PREFERRED for socket %d\n",
+			socket_id);
+		numa_set_preferred(socket_id);
+	}
+	return have_numa;
+}
+
+static void
+resotre_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+getFileSize(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if file size on disk is zero (regular fstat won't show
+ * true file size due to how fallocate works)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
+static int
+get_page_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry, for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck = {0};
+	int ret;
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = getFileSize(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
+alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+	int cur_socket_id = 0;
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+
+	fd = get_page_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * size;
+		ret = resize_hugefile(fd, map_offset, size, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, size) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with
+		 * flock() we'd have to use the original fd, as the lock is
+		 * associated with that particular fd. With fcntl(), this is
+		 * not necessary - we can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, size, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
+	}
+
+	/*
+	 * map the segment, and populate page tables, the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, size, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In Linux, hugetlb limitations (e.g. those imposed via cgroups)
+	 * are enforced at fault time rather than at mmap() time, even
+	 * with MAP_POPULATE - the kernel will send a SIGBUS signal
+	 * instead. To avoid being killed, save the stack environment
+	 * here; if SIGBUS happens, we can jump back to this point.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(size / 0x100000));
+		goto mapped;
+	}
+	*(int *)addr = *(int *)addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = size;
+	ms->len = size;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, size);
+resized:
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, size, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
+	return -1;
+}
+
+int
+eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
+		uint64_t size, int socket, bool exact)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *addr;
+	unsigned int msl_idx;
+	int cur_idx, end_idx, i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa;
+	int oldpolicy;
+	struct bitmask *oldmask = numa_allocate_nodemask();
+#endif
+	struct hugepage_info *hi = NULL;
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		goto restore_numa;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (size ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		goto restore_numa;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	have_numa = prepare_numa(&oldpolicy, oldmask, socket);
+#endif
+
+	/* there may be several memsegs for this page size and socket id, so try
+	 * allocating on all of them.
+	 */
+
+	/* find our memseg list */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		struct rte_memseg_list *cur_msl = &mcfg->memsegs[msl_idx];
+
+		if (cur_msl->hugepage_sz != size)
+			continue;
+		if (cur_msl->socket_id != socket)
+			continue;
+		msl = cur_msl;
+
+		/* try finding space in memseg list */
+		cur_idx = rte_fbarray_find_next_n_free(&msl->memseg_arr, 0, n);
+
+		if (cur_idx < 0)
+			continue;
+
+		end_idx = cur_idx + n;
+
+		for (i = 0; cur_idx < end_idx; cur_idx++, i++) {
+			struct rte_memseg *cur;
+
+			cur = rte_fbarray_get(&msl->memseg_arr, cur_idx);
+			addr = RTE_PTR_ADD(msl->base_va,
+					cur_idx * msl->hugepage_sz);
+
+			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
+					cur_idx)) {
+				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
+					n, i);
+
+				/* if exact number wasn't requested, stop */
+				if (!exact)
+					ret = i;
+				goto restore_numa;
+			}
+			if (ms)
+				ms[i] = cur;
+
+			rte_fbarray_set_used(&msl->memseg_arr, cur_idx);
+		}
+		ret = n;
+
+		break;
+	}
+	/* we didn't break */
+	if (!msl) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+	}
+
+restore_numa:
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		resotre_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_page(uint64_t size, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_page_bulk(&ms, 1, size, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 15/41] eal: add support for unmapping pages at runtime
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (57 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 16/41] eal: make use of memory hotplug for init Anatoly Burakov
                   ` (28 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This isn't used anywhere yet, but the support is now there. Also, add
cleanup to the allocation procedures, so that if we fail to allocate
everything we asked for, we can free it all back.
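
For illustration, a minimal sketch of how an internal EAL caller could
pair the bulk allocator with the new free path (the page size, page
count and error handling below are illustrative, not part of the patch):

    #include <rte_memory.h>
    #include "eal_memalloc.h"

    static int
    grab_and_release_pages(int socket)
    {
        struct rte_memseg *ms[4];
        int i, n;

        /* request exactly four 2M pages on the given socket */
        n = eal_memalloc_alloc_page_bulk(ms, 4, RTE_PGSIZE_2M,
                socket, true);
        if (n < 0)
            return -1; /* with exact == true, nothing stays allocated */

        /* ... use the pages via ms[i]->addr ... */

        /* hand every page back to the system */
        for (i = 0; i < n; i++)
            if (eal_memalloc_free_page(ms[i]) < 0)
                return -1;
        return 0;
    }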

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_memalloc.h       |   3 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 148 ++++++++++++++++++++++++++++-
 2 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index c1076cf..adf59c4 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -16,4 +16,7 @@ int
 eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
 		int socket, bool exact);
 
+int
+eal_memalloc_free_page(struct rte_memseg *ms);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 1ba1201..bbeeeba 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -499,6 +499,64 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	return -1;
 }
 
+static int
+free_page(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int fd, ret;
+
+	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_page_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->hugepage_sz;
+		if (resize_hugefile(fd, map_offset, ms->hugepage_sz, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			unlink(path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->hugepage_sz, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->hugepage_sz, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
+	}
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 int
 eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 		uint64_t size, int socket, bool exact)
@@ -507,7 +565,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 	struct rte_memseg_list *msl = NULL;
 	void *addr;
 	unsigned int msl_idx;
-	int cur_idx, end_idx, i, ret = -1;
+	int cur_idx, start_idx, end_idx, i, j, ret = -1;
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	bool have_numa;
 	int oldpolicy;
@@ -557,6 +615,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 			continue;
 
 		end_idx = cur_idx + n;
+		start_idx = cur_idx;
 
 		for (i = 0; cur_idx < end_idx; cur_idx++, i++) {
 			struct rte_memseg *cur;
@@ -567,25 +626,56 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 
 			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
 					cur_idx)) {
+
 				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
 					n, i);
 
-				/* if exact number wasn't requested, stop */
-				if (!exact)
+				/* if exact number of pages wasn't requested,
+				 * failing to allocate is not an error. we could
+				 * of course try other lists to see if there are
+				 * better fits, but a bird in the hand...
+				 */
+				if (!exact) {
 					ret = i;
-				goto restore_numa;
+					goto restore_numa;
+				}
+				RTE_LOG(DEBUG, EAL, "exact amount of pages was requested, so returning %i allocated pages\n",
+					i);
+
+				/* clean up */
+				for (j = start_idx; j < cur_idx; j++) {
+					struct rte_memseg *tmp;
+					struct rte_fbarray *arr =
+							&msl->memseg_arr;
+
+					tmp = rte_fbarray_get(arr, j);
+					if (free_page(tmp, hi, msl_idx,
+							start_idx + j))
+						rte_panic("Cannot free page\n");
+
+					rte_fbarray_set_free(arr, j);
+				}
+				/* clear the list */
+				if (ms)
+					memset(ms, 0, sizeof(*ms) * n);
+
+				/* try next list */
+				goto next_list;
 			}
 			if (ms)
 				ms[i] = cur;
 
 			rte_fbarray_set_used(&msl->memseg_arr, cur_idx);
 		}
+		/* we allocated all pages */
 		ret = n;
 
 		break;
+next_list:
+		/* dummy semi-colon to make label work */;
 	}
 	/* we didn't break */
-	if (!msl) {
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
 	}
@@ -607,3 +697,51 @@ eal_memalloc_alloc_page(uint64_t size, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_page(struct rte_memseg *ms)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	unsigned int msl_idx, seg_idx;
+	struct hugepage_info *hi = NULL;
+	int i;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (ms->hugepage_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		uintptr_t start_addr, end_addr;
+		struct rte_memseg_list *cur = &mcfg->memsegs[msl_idx];
+
+		start_addr = (uintptr_t) cur->base_va;
+		end_addr = start_addr + cur->memseg_arr.len * cur->hugepage_sz;
+
+		if ((uintptr_t) ms->addr < start_addr ||
+				(uintptr_t) ms->addr >= end_addr) {
+			continue;
+		}
+		msl = cur;
+		seg_idx = RTE_PTR_DIFF(ms->addr, start_addr) / ms->hugepage_sz;
+		break;
+	}
+	if (!msl) {
+		RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		return -1;
+	}
+	rte_fbarray_set_free(&msl->memseg_arr, seg_idx);
+	return free_page(ms, hi, msl_idx, seg_idx);
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 16/41] eal: make use of memory hotplug for init
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (58 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 15/41] eal: add support for unmapping pages " Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
                   ` (27 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Add a new (non-legacy) memory init path for EAL. It uses the
new memory hotplug facilities, although it's only being run
at startup.

If no -m or --socket-mem switches are specified, the new init will
not allocate anything, whereas if those switches are passed, the
appropriate number of pages will be requested, just like for legacy
init.

Since rte_malloc support for dynamic allocation comes in later
patches, running DPDK without --socket-mem or -m switches will
fail in this patch.
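
For example (an illustrative command line - the application and core
list are placeholders), at this point in the series DPDK still has to
be started with an explicit amount of memory, e.g.:

    ./testpmd -l 0-3 --socket-mem=1024,1024 -- -i

Once the later rte_malloc patches are applied, leaving out
--socket-mem/-m becomes valid and memory is allocated on demand
instead.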

Also, allocated pages will be physically discontiguous (or rather,
they're not guaranteed to be physically contiguous - they may still
be, by accident) unless IOVA_AS_VA mode is used.

Since memory hotplug subsystem relies on partial file locking,
replace flock() locks with fcntl() locks.
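
For reference, a minimal sketch of the kind of per-range fcntl() lock
this relies on (the helper name and parameters are illustrative; the
patch itself uses an internal lock() helper with equivalent
return-value semantics):

    #include <errno.h>
    #include <fcntl.h>
    #include <string.h>

    /* try to take a shared (read) lock on one page-sized range of a
     * hugepage file; returns 1 on success, 0 if another process holds
     * a conflicting lock, -1 on error.
     */
    static int
    try_rdlock_range(int fd, off_t offset, off_t page_sz)
    {
        struct flock lck;

        memset(&lck, 0, sizeof(lck));
        lck.l_type = F_RDLCK;    /* shared lock */
        lck.l_whence = SEEK_SET;
        lck.l_start = offset;    /* lock only this page's range... */
        lck.l_len = page_sz;     /* ...not the whole file */

        if (fcntl(fd, F_SETLK, &lck) == -1)
            return (errno == EAGAIN || errno == EACCES) ? 0 : -1;
        return 1;
    }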

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    This commit shows "the world as it could have been". All of the other
    monstrous amounts of code in eal_memory.c are there because of the
    legacy init option. Do we *really* want to keep it around, and make
    DPDK init and the memory system suffer from a split personality?

 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 25 ++++++++-
 lib/librte_eal/linuxapp/eal/eal_memory.c        | 74 +++++++++++++++++++++++--
 2 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 706b6d5..7e2475f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -200,6 +201,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+getFileSize(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -229,6 +242,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -245,11 +260,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = getFileSize(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 9512da9..e0b4988 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -260,6 +261,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	void *virtaddr;
 	void *vma_addr = NULL;
 	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -434,8 +436,12 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -1300,6 +1306,62 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize used_hp hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %luM on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_page_bulk(NULL,
+					num_pages,
+					hpi->hugepage_sz, socket_id,
+					true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1510,9 +1572,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
@@ -1520,6 +1582,8 @@ rte_eal_hugepage_attach(void)
 {
 	if (internal_config.legacy_mem)
 		return eal_legacy_hugepage_attach();
+	else
+		RTE_LOG(ERR, EAL, "Secondary processes aren't supported yet\n");
 	return -1;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 17/41] eal: enable memory hotplug support in rte_malloc
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (59 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 16/41] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
                   ` (26 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This set of changes enables rte_malloc to allocate and free memory
as needed. It works as follows: first, malloc checks whether there is
enough memory already allocated to satisfy the user's request. If there
isn't, we try to allocate more memory. The reverse happens on free - we
free an element, check its size (including any free-element merging due
to adjacency) and see whether it is bigger than the hugepage size and
whether its start and end span a hugepage or more. If so, we remove the
area from the malloc heap (adjusting element lengths where appropriate)
and deallocate the page.

For legacy mode, runtime alloc/free of pages is disabled.

It is worth noting that memseg lists are sorted by page size, and that
we try our best to satisfy the user's request. That is, if the user
requests an element backed by 2MB pages, we will check whether we can
satisfy that request from existing memory; if not, we try to allocate
more 2MB pages. If that fails and the user also specified the "size is
hint" flag, we then check other page sizes and try to allocate from
there. If that fails too, then, depending on flags, we may try
allocating from other sockets. In other words, we try our best to give
the user what they asked for, but going to other sockets is a last
resort - first we try to allocate more memory on the same socket.
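
From the application's point of view the API does not change; a brief
sketch of what the new behaviour means at runtime (the tag, size and
socket id below are illustrative only):

    #include <rte_malloc.h>

    static void
    malloc_example(void)
    {
        /* no hugepages need to be pre-reserved on socket 0: if the
         * heap cannot satisfy the request, more pages are allocated
         * behind the scenes (unless legacy memory mode is in use)
         */
        void *buf = rte_malloc_socket("example", 16 * 1024 * 1024,
                0 /* default alignment */, 0 /* socket */);

        /* freeing an element that covers one or more full hugepages
         * may now return those pages to the system
         */
        rte_free(buf);
    }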

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  23 +-
 lib/librte_eal/common/malloc_elem.c        |  85 ++++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 332 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 416 insertions(+), 62 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index ed36174..718dee8 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -103,7 +103,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
-	int socket, i;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -181,27 +180,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound);
 
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 701bffd..eabad66 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -400,6 +400,91 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	size_t len_before, len_after;
+	struct malloc_elem *prev, *next;
+	void *end, *elem_end;
+
+	end = RTE_PTR_ADD(start, len);
+	elem_end = RTE_PTR_ADD(elem, elem->size);
+	len_before = RTE_PTR_DIFF(start, elem);
+	len_after = RTE_PTR_DIFF(elem_end, end);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split after */
+		struct malloc_elem *split_after = end;
+
+		split_elem(elem, split_after);
+
+		next = split_after;
+
+		malloc_elem_free_list_insert(split_after);
+	} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+		struct malloc_elem *pad_elem = end;
+
+		/* shrink current element */
+		elem->size -= len_after;
+		memset(pad_elem, 0, sizeof(*pad_elem));
+
+		/* copy next element's data to our pad */
+		memcpy(pad_elem, next, sizeof(*pad_elem));
+
+		/* pad next element */
+		next->state = ELEM_PAD;
+		next->pad = len_after;
+
+		/* next element is busy, would've been merged otherwise */
+		pad_elem->pad = len_after;
+		pad_elem->size += len_after;
+
+		/* adjust pointers to point to our new pad */
+		pad_elem->next->prev = pad_elem;
+		elem->next = pad_elem;
+	} else if (len_after > 0) {
+		rte_panic("Unaligned element, heap is probably corrupt\n");
+	}
+
+	if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split before */
+		struct malloc_elem *split_before = start;
+
+		split_elem(elem, split_before);
+
+		prev = elem;
+		elem = split_before;
+
+		malloc_elem_free_list_insert(prev);
+	} else if (len_before > 0) {
+		/*
+		 * unlike with elements after current, here we don't need to
+		 * pad elements, but rather just increase the size of previous
+		 * element, copy the old header and set up the trailer.
+		 */
+		void *trailer = RTE_PTR_ADD(prev,
+				prev->size - MALLOC_ELEM_TRAILER_LEN);
+		struct malloc_elem *new_elem = start;
+
+		memcpy(new_elem, elem, sizeof(*elem));
+		new_elem->size -= len_before;
+
+		prev->size += len_before;
+		set_trailer(prev);
+
+		elem = new_elem;
+
+		/* erase old trailer */
+		memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 388c16f..6d979d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -152,6 +152,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 058ad75..87dc9ad 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -123,48 +125,356 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	size_t map_len;
+	int i, n_pages, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_pages = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz, socket,
+			true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages != n_pages)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+	for (i = 0; i < n_pages; i++)
+		eal_memalloc_free_page(ms[i]);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->hugepage_sz;
+	uint64_t pg_sz_b = mslb->hugepage_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->hugepage_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->hugepage_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->hugepage_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			rte_panic("Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound)
+{
+	int socket, i;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_num_sockets(); i++) {
+		if (i == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, i, flags,
+				align, bound);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len;
+	struct rte_memseg_list *msl;
+	int n_pages, page_idx, max_page_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	elem = malloc_elem_free(elem);
 
-	rte_spinlock_unlock(&(heap->lock));
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < msl->hugepage_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up a full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, msl->hugepage_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, msl->hugepage_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < msl->hugepage_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_pages = aligned_len / msl->hugepage_sz;
+	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / msl->hugepage_sz;
+	max_page_idx = page_idx + n_pages;
+
+	for (; page_idx < max_page_idx; page_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
+		eal_memalloc_free_page(ms);
+		heap->total_size -= msl->hugepage_sz;
+	}
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..292d578 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -24,8 +24,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index bd7e757..b0fe11c 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -39,10 +39,6 @@ void rte_free(void *addr)
 void *
 rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -50,33 +46,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 18/41] test: fix malloc autotest to support memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (60 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
                   ` (25 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The test was expecting memory to already be allocated on all sockets,
and thus was failing because calling rte_malloc could trigger a memory
hotplug event and allocate memory where there was none before.

Fix it to instead report availability of memory on specific sockets
by attempting to allocate a page and seeing if that succeeds.
Technically, this can still cause a failure, as memory might not be
available at the time of the check but become available by the time the
test is run; however, this is a corner case not worth considering.
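
The check essentially boils down to the following (a condensed sketch
of the logic in the diff below, using the internal allocator API):

    static int
    socket_has_memory(uint64_t page_sz, int socket)
    {
        struct rte_memseg *ms = eal_memalloc_alloc_page(page_sz, socket);

        if (ms == NULL)
            return 0;               /* no page could be allocated */
        eal_memalloc_free_page(ms); /* give the page straight back */
        return 1;
    }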

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/test_malloc.c | 52 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 44 insertions(+), 8 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 8484fb6..2aaf1b8 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -22,6 +22,8 @@
 #include <rte_random.h>
 #include <rte_string_fns.h>
 
+#include "../../lib/librte_eal/common/eal_memalloc.h"
+
 #include "test.h"
 
 #define N 10000
@@ -708,22 +710,56 @@ test_malloc_bad_params(void)
 
 /* Check if memory is available on a specific socket */
 static int
-is_mem_on_socket(int32_t socket)
+is_mem_on_socket(unsigned int socket)
 {
+	struct rte_malloc_socket_stats stats;
 	const struct rte_mem_config *mcfg =
 			rte_eal_get_configuration()->mem_config;
-	unsigned i;
+	uint64_t prev_pgsz;
+	unsigned int i;
+
+	/* we cannot know if there's memory on a specific socket, since it might
+	 * be available, but not yet allocated. so, in addition to checking
+	 * already mapped memory, we will attempt to allocate a page from that
+	 * socket and see if it works.
+	 */
+	if (socket >= rte_num_sockets())
+		return 0;
 
+	rte_malloc_get_socket_stats(socket, &stats);
+
+	/* if heap has memory allocated, stop */
+	if (stats.heap_totalsz_bytes > 0)
+		return 1;
+
+	/* to allocate a page, we will have to know its size, so go through all
+	 * supported page sizes and try with each one.
+	 */
+	prev_pgsz = 0;
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
-		const struct rte_memseg_list *msl =
-				&mcfg->memsegs[i];
-		const struct rte_fbarray *arr = &msl->memseg_arr;
+		const struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		uint64_t page_sz;
 
-		if (msl->socket_id != socket)
+		/* skip unused memseg lists */
+		if (msl->memseg_arr.len == 0)
 			continue;
+		page_sz = msl->hugepage_sz;
 
-		if (arr->count)
-			return 1;
+		/* skip page sizes we've tried already */
+		if (prev_pgsz == page_sz)
+			continue;
+
+		prev_pgsz = page_sz;
+
+		struct rte_memseg *ms = eal_memalloc_alloc_page(page_sz,
+				socket);
+
+		if (ms == NULL)
+			continue;
+
+		eal_memalloc_free_page(ms);
+
+		return 1;
 	}
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 19/41] eal: add API to check if memory is contiguous
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (61 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
                   ` (24 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This will be helpful down the line when we implement support for
allocating physically contiguous memory. We can no longer guarantee
physically contiguous memory unless we're in IOVA_AS_VA mode, but
we can certainly try and see if we succeed. In addition, this would
be useful for e.g. PMDs, which may allocate chunks that are smaller
than the page size but must not cross a page boundary, in which case
we will be able to accommodate that request.
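
A brief sketch of the intended use (the wrapper below is illustrative;
rte_mem_virt2memseg_list() is the lookup used elsewhere in this
series):

    #include <stdbool.h>
    #include <rte_memory.h>
    #include "eal_memalloc.h"

    /* check whether a virtual range is IOVA-contiguous before handing
     * it to hardware
     */
    static bool
    range_is_contig(void *addr, size_t len)
    {
        struct rte_memseg_list *msl = rte_mem_virt2memseg_list(addr);

        if (msl == NULL)
            return false;
        return eal_memalloc_is_contig(msl, addr, len);
    }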

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 49 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        |  5 +++
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 5 files changed, 57 insertions(+)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..62e8c16
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	const struct rte_memseg *ms;
+	uint64_t page_sz;
+	void *end;
+	int start_page, end_page, cur_page;
+	rte_iova_t expected;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	/* figure out how many pages we need to fit in current data */
+	page_sz = msl->hugepage_sz;
+	end = RTE_PTR_ADD(start, len);
+
+	start_page = RTE_PTR_DIFF(start, msl->base_va) / page_sz;
+	end_page = RTE_PTR_DIFF(end, msl->base_va) / page_sz;
+
+	/* now, look for contiguous memory */
+	ms = rte_fbarray_get(&msl->memseg_arr, start_page);
+	expected = ms->iova + page_sz;
+
+	for (cur_page = start_page + 1; cur_page < end_page;
+			cur_page++, expected += page_sz) {
+		ms = rte_fbarray_get(&msl->memseg_arr, cur_page);
+
+		if (ms->iova != expected)
+			return false;
+	}
+
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index adf59c4..08ba70e 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 struct rte_memseg *
 eal_memalloc_alloc_page(uint64_t size, int socket);
@@ -19,4 +20,8 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n, uint64_t size,
 int
 eal_memalloc_free_page(struct rte_memseg *ms);
 
+bool
+eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 20/41] eal: add backend support for contiguous allocation
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (62 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
                   ` (23 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  20 +++---
 lib/librte_eal/common/malloc_elem.c        | 101 ++++++++++++++++++++++-------
 lib/librte_eal/common/malloc_elem.h        |   4 +-
 lib/librte_eal/common/malloc_heap.c        |  57 ++++++++++------
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   6 +-
 6 files changed, 134 insertions(+), 58 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 718dee8..75c7dd9 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -182,7 +183,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
-			align, bound);
+			align, bound, contig);
 
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
@@ -215,9 +216,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -228,7 +229,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -245,7 +246,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -257,7 +258,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -269,7 +270,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eabad66..d2dba35 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -17,6 +17,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -94,33 +95,88 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(struct rte_memseg_list *msl, void *start, size_t size)
+{
+	uint64_t page_sz;
+	void *aligned_start, *end, *aligned_end;
+	size_t aligned_len;
+
+	/* figure out how many pages we need to fit in current data */
+	page_sz = msl->hugepage_sz;
+	aligned_start = RTE_PTR_ALIGN_FLOOR(start, page_sz);
+	end = RTE_PTR_ADD(start, size);
+	aligned_end = RTE_PTR_ALIGN_CEIL(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	return eal_memalloc_is_contig(msl, aligned_start, aligned_len);
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	/*
+	 * we're allocating from the end, so adjust the size of element by page
+	 * size each time
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+
+		/* if the new start point is before the exist start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
+
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->msl,
+					(void *) new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *) new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +185,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +315,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 6d979d2..798472e 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -123,7 +123,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +131,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 87dc9ad..984e027 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -94,7 +94,7 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -103,7 +103,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags,
 						elem->msl->hugepage_sz))
 					return elem;
@@ -127,16 +128,16 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  */
 static void *
 heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
@@ -147,14 +148,15 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 
 static int
 try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
-		int socket, unsigned int flags, size_t align, size_t bound)
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
 {
+	size_t map_len, data_start_offset;
 	struct rte_memseg_list *msl;
 	struct rte_memseg **ms;
 	struct malloc_elem *elem;
-	size_t map_len;
 	int i, n_pages, allocd_pages;
-	void *ret, *map_addr;
+	void *ret, *map_addr, *data_start;
 
 	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
 	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
@@ -175,11 +177,22 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
 
+	/* check if we wanted contiguous memory but didn't get it */
+	data_start_offset = RTE_ALIGN(MALLOC_ELEM_HEADER_LEN, align);
+	data_start = RTE_PTR_ADD(ms[0]->addr, data_start_offset);
+	if (contig && !eal_memalloc_is_contig(msl, data_start,
+			n_pages * msl->hugepage_sz)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
 	/* add newly minted memsegs to malloc heap */
 	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
 
 	/* try once more, as now we have allocated new memory */
-	ret = find_suitable_element(heap, elt_size, flags, align, bound);
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
 
 	if (ret == NULL)
 		goto free_elem;
@@ -196,6 +209,7 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	malloc_elem_hide_region(elem, map_addr, map_len);
 	heap->total_size -= map_len;
 
+free_pages:
 	for (i = 0; i < n_pages; i++)
 		eal_memalloc_free_page(ms[i]);
 free_ms:
@@ -223,7 +237,7 @@ compare_pagesz(const void *a, const void *b)
 
 static int
 alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound)
+		size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
@@ -304,14 +318,14 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 		 * sizes first, before resorting to best effort allocation.
 		 */
 		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
-				align, bound))
+				align, bound, contig))
 			return 0;
 	}
 	if (n_other_pg_sz == 0)
 		return -1;
 
 	/* now, check if we can reserve anything with size hint */
-	ret = find_suitable_element(heap, size, flags, align, bound);
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (ret != NULL)
 		return 0;
 
@@ -323,7 +337,7 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 		uint64_t pg_sz = other_pg_sz[i];
 
 		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
-				align, bound))
+				align, bound, contig))
 			return 0;
 	}
 	return -1;
@@ -332,7 +346,7 @@ alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
 /* this will try lower page sizes first */
 static void *
 heap_alloc_on_socket(const char *type, size_t size, int socket,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
@@ -345,7 +359,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 	/* for legacy mode, try once and with all flags */
 	if (internal_config.legacy_mem) {
-		ret = heap_alloc(heap, type, size, flags, align, bound);
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 		goto alloc_unlock;
 	}
 
@@ -354,12 +368,12 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	 * we may still be able to allocate memory from appropriate page sizes,
 	 * we just need to request more memory first.
 	 */
-	ret = heap_alloc(heap, type, size, size_flags, align, bound);
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound)) {
-		ret = heap_alloc(heap, type, size, flags, align, bound);
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
 		if (ret == NULL)
@@ -372,7 +386,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 void *
 malloc_heap_alloc(const char *type, size_t size, int socket_arg,
-		unsigned int flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	int socket, i;
 	void *ret;
@@ -393,7 +407,8 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	if (socket >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound);
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -402,7 +417,7 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 		if (i == socket)
 			continue;
 		ret = heap_alloc_on_socket(type, size, i, flags,
-				align, bound);
+				align, bound, contig);
 		if (ret != NULL)
 			return ret;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index 292d578..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
-		size_t align, size_t bound);
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index b0fe11c..5cd92d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
@@ -50,8 +51,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	return malloc_heap_alloc(type, size, socket_arg, 0,
-			align == 0 ? 1 : align, 0);
+	return malloc_heap_alloc(type, size, socket_arg, 0, align, 0, false);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 21/41] eal: enable reserving physically contiguous memzones
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (63 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 22/41] eal: replace memzone array with fbarray Anatoly Burakov
                   ` (22 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This adds a new set of _contig APIs to rte_memzone.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
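
Notes:
    A minimal usage sketch of the new contiguous reservation API, added
    here for illustration only; the zone name and length are hypothetical
    and error handling is reduced to the NULL check:

        #include <rte_memzone.h>

        static int
        reserve_dma_zone(void)
        {
            const struct rte_memzone *mz;

            /* reserve 1 MB of IOVA-contiguous memory on any NUMA node */
            mz = rte_memzone_reserve_contig("example_zone", 1 << 20,
                    SOCKET_ID_ANY, 0);
            if (mz == NULL)
                return -1;

            /* the whole zone is backed by a single IOVA-contiguous
             * region starting at mz->iova
             */
            return 0;
        }
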
 lib/librte_eal/common/eal_common_memzone.c  |  44 ++++++++
 lib/librte_eal/common/include/rte_memzone.h | 154 ++++++++++++++++++++++++++++
 2 files changed, 198 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 75c7dd9..8c9aa28 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -170,6 +170,12 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		socket_id = SOCKET_ID_ANY;
 
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -251,6 +257,19 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 
 /*
  * Return a pointer to a correctly filled memzone descriptor (with a
+ * specified alignment and boundary). If the allocation cannot be done,
+ * return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_bounded_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, bound, true);
+}
+
+/*
+ * Return a pointer to a correctly filled memzone descriptor (with a
  * specified alignment). If the allocation cannot be done, return NULL.
  */
 const struct rte_memzone *
@@ -262,6 +281,18 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 }
 
 /*
+ * Return a pointer to a correctly filled memzone descriptor (with a
+ * specified alignment). If the allocation cannot be done, return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_aligned_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, 0, true);
+}
+
+/*
  * Return a pointer to a correctly filled memzone descriptor. If the
  * allocation cannot be done, return NULL.
  */
@@ -274,6 +305,19 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 					       false);
 }
 
+/*
+ * Return a pointer to a correctly filled memzone descriptor. If the
+ * allocation cannot be done, return NULL.
+ */
+const struct rte_memzone *
+rte_memzone_reserve_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id,
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       true);
+}
+
 int
 rte_memzone_free(const struct rte_memzone *mz)
 {
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index a69f068..5f1293f 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -227,6 +227,160 @@ const struct rte_memzone *rte_memzone_reserve_bounded(const char *name,
 			unsigned flags, unsigned align, unsigned bound);
 
 /**
+ * Reserve an IOVA-contiguous portion of physical memory.
+ *
+ * This function reserves some IOVA-contiguous memory and returns a pointer to a
+ * correctly filled memzone descriptor. If the allocation cannot be
+ * done, return NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with alignment on a
+ * specified boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with alignment on a
+ * specified boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * is not a power of 2, returns NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_aligned_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with specified
+ * alignment and boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with specified alignment
+ * and boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * or boundary are not a power of 2, returns NULL.
+ * The memory buffer is reserved in a way that it won't cross the specified
+ * boundary. That implies that the requested length should be less than or
+ * equal to the boundary.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @param bound
+ *   Boundary for resulting memzone. Must be a power of 2 or zero.
+ *   Zero value implies no boundary condition.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+const struct rte_memzone *rte_memzone_reserve_bounded_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align, unsigned int bound);
+
+/**
  * Free a memzone.
  *
  * @param mz
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 22/41] eal: replace memzone array with fbarray
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (64 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 23/41] mempool: add support for the new allocation methods Anatoly Burakov
                   ` (21 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

The fbarray API is already there, so we might as well use it; it also
speeds up some operations.

Since we have to allocate an fbarray for memzones, we have to do it
before we initialize the memory subsystem: memory initialization in
secondary processes will (later) allocate more fbarrays than the
primary process did, so attaching to the memzone fbarray would fail
if we created it after the fact.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    The code for the ENA driver makes little sense to me, but I've
    attempted to keep the same semantics as the old code.
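
    As a hedged sketch (not part of this patch), the walk over the new
    memzone fbarray used throughout eal_common_memzone.c follows the
    pattern below; "mcfg" is the memory config pointer as in the hunks,
    while "do_something" is a hypothetical callback:

        struct rte_fbarray *arr = &mcfg->memzones;
        int i = 0;

        /* visit only slots that are marked as used */
        while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
            struct rte_memzone *mz = rte_fbarray_get(arr, i);

            do_something(mz);
            i++; /* resume the search after the current slot */
        }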

 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |   6 +
 lib/librte_eal/common/eal_common_memzone.c        | 180 +++++++++++++++-------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 +-
 test/test/test_memzone.c                          |   9 +-
 7 files changed, 157 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 45e5670..3b06e21 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -608,6 +608,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_tailqs_init() < 0) {
 		rte_eal_init_alert("Cannot init tail queues for objects\n");
 		rte_errno = EFAULT;
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 8c9aa28..a7cfdaf 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,29 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		mz = rte_fbarray_get(arr, i++);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
 	}
 
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,13 +90,16 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
+	int idx;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -199,7 +189,14 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, idx);
+		mz = rte_fbarray_get(arr, idx);
+	}
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
@@ -209,7 +206,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -322,6 +318,8 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
 	void *addr;
 	unsigned idx;
@@ -330,21 +328,26 @@ rte_memzone_free(const struct rte_memzone *mz)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		if (addr == NULL)
+			ret = -EINVAL;
+		else if (arr->count == 0) {
+			rte_panic("%s(): memzone address not NULL but memzone count is 0!\n",
+					__func__);
+		} else {
+			memset(found_mz, 0, sizeof(*found_mz));
+			rte_fbarray_set_free(arr, idx);
+		}
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
@@ -378,25 +381,79 @@ rte_memzone_lookup(const char *name)
 void
 rte_memzone_dump(FILE *f)
 {
+	struct rte_fbarray *arr;
 	struct rte_mem_config *mcfg;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
 	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		void *cur_addr, *mz_end;
+		struct rte_memzone *mz;
+		struct rte_memseg_list *msl = NULL;
+		struct rte_memseg *ms;
+		int ms_idx;
+
+		mz = rte_fbarray_get(arr, i);
+
+		/*
+		 * memzones can span multiple physical pages, so dump addresses
+		 * of all physical pages this memzone spans.
+		 */
+
+		fprintf(f, "Zone %u: name:<%s>, len:0x%zx"
 		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
+		       mz->name,
+		       mz->len,
+		       mz->addr,
+		       mz->socket_id,
+		       mz->flags);
+
+		msl = rte_mem_virt2memseg_list(mz->addr);
+		if (!msl) {
+			RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+			continue;
+		}
+
+		cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, mz->hugepage_sz);
+		mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+		fprintf(f, "physical segments used:\n");
+		if (msl->base_va == NULL) {
+			/* if the memseg list base VA is NULL, we're in legacy
+			 * mem mode, which means we have only one memseg.
+			 */
+			ms = rte_mem_virt2memseg(mz->addr, msl);
+
+			fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+					"len: 0x%" PRIx64 " "
+					"pagesz: 0x%" PRIx64 "\n",
+				cur_addr, ms->iova, ms->len, ms->hugepage_sz);
+		} else {
+			ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) /
+					msl->hugepage_sz;
+			ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+			do {
+				fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+						"len: 0x%" PRIx64 " "
+						"pagesz: 0x%" PRIx64 "\n",
+					cur_addr, ms->iova, ms->len,
+					ms->hugepage_sz);
+
+				/* advance VA to next page */
+				cur_addr = RTE_PTR_ADD(cur_addr,
+						ms->hugepage_sz);
+
+				/* memzones occupy contiguous segments */
+				++ms;
+			} while (cur_addr < mz_end);
+		}
+		i++;
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
@@ -412,19 +469,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -432,14 +493,19 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
+
+	i = 0;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	while ((i = rte_fbarray_find_next_used(arr, i)) >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i++;
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 31fc8e7..b6bdb21 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 984e027..7a3d0f3 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -579,6 +579,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary processes don't need to initialize heap */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
 		int start;
 		struct rte_fbarray *arr;
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 7851a7d..d336c96 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -857,6 +857,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* memzone_init maps rte_fbarrays, which has to be done before hugepage
+	 * init/attach, because attaching creates extra fbarrays in the
+	 * secondary process, making it impossible to map the memzone fbarray.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -867,8 +876,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 47f4de8..4b49d61 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -893,7 +893,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -917,7 +917,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -996,7 +996,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1017,7 +1017,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (65 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 22/41] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-19 17:11   ` Olivier Matz
  2018-03-20 11:35   ` Shreyansh Jain
  2018-03-07 16:56 ` [PATCH v2 24/41] vfio: allow to map other memory regions Anatoly Burakov
                   ` (20 subsequent siblings)
  87 siblings, 2 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal

If a user has specified that the zone should have contiguous memory,
use the new _contig allocation APIs instead of the normal ones.
Otherwise, account for the fact that, unless we're in IOVA-as-VA
mode, we cannot guarantee that the pages will be physically
contiguous, so we calculate the memzone size and alignment as if
we were getting the smallest available page size.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
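
Notes:
    A worked example of the page size / alignment selection described
    above (illustrative numbers only; 2 MB is assumed to be the smallest
    available hugepage size):

        pg_sz    = get_min_page_size();  /* 0x200000 (2 MB)           */
        pg_shift = rte_bsf32(pg_sz);     /* 21, since 2 MB == 1 << 21 */
        align    = pg_sz;                /* memzone aligned to 2 MB   */

    With IOVA-as-VA, or when the whole mempool must be one physically
    contiguous block, pg_sz and pg_shift stay 0 and the alignment falls
    back to RTE_CACHE_LINE_SIZE, as in the hunk below.
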
 lib/librte_mempool/rte_mempool.c | 87 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 9 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..5c4d3fd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -98,6 +98,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		const struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		if (msl->hugepage_sz < min_pagesz)
+			min_pagesz = msl->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -549,6 +570,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter either.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 */
+
+	if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
 		pg_sz = 0;
+		pg_shift = 0;
 		align = RTE_CACHE_LINE_SIZE;
+	} else if (rte_eal_has_hugepages()) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
+		align = pg_sz;
 	} else {
 		pg_sz = getpagesize();
 		pg_shift = rte_bsf32(pg_sz);
@@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
-			mz = rte_memzone_reserve_aligned(mz_name, 0,
+		if (force_contig) {
+			/*
+			 * if contiguous memory for entire mempool memory was
+			 * requested, don't try reserving again if we fail.
+			 */
+			mz = rte_memzone_reserve_aligned_contig(mz_name, size,
+				mp->socket_id, mz_flags, align);
+		} else {
+			mz = rte_memzone_reserve_aligned(mz_name, size,
 				mp->socket_id, mz_flags, align);
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
+			if (mz == NULL)
+				mz = rte_memzone_reserve_aligned(mz_name, 0,
+					mp->socket_id, mz_flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (rte_eal_has_hugepages() && force_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 24/41] vfio: allow to map other memory regions
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (66 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 23/41] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-30  9:42   ` Gowrishankar
  2018-04-02 11:36   ` Gowrishankar
  2018-03-07 16:56 ` [PATCH v2 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
                   ` (19 subsequent siblings)
  87 siblings, 2 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario arises in vhost applications (like SPDK),
where the guest sends its own memory table. To fill this gap, provide
an API that allows registering an arbitrary address range in the VFIO
container.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
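
Notes:
    A minimal, illustrative sketch of how an application could register
    an externally allocated buffer with VFIO using the new API; the
    buffer, its IOVA and its length are assumed to come from the caller
    (in IOVA-as-VA mode the IOVA would simply be the virtual address):

        #include <stdint.h>
        #include <rte_vfio.h>

        static int
        register_external_buf(void *va, uint64_t iova, uint64_t len)
        {
            if (rte_vfio_dma_map((uint64_t)(uintptr_t)va, iova, len) < 0)
                return -1;

            /* ... the device can now DMA to/from the buffer at 'iova' ... */

            return rte_vfio_dma_unmap((uint64_t)(uintptr_t)va, iova, len);
        }
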
 lib/librte_eal/bsdapp/eal/eal.c          |  16 ++++
 lib/librte_eal/common/include/rte_vfio.h |  39 ++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 153 ++++++++++++++++++++++++++-----
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  11 +++
 4 files changed, 196 insertions(+), 23 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 3b06e21..5a7f436 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -755,6 +755,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -790,3 +792,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index e981a62..093c309 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -123,6 +123,45 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #endif /* VFIO_PRESENT */
 
 #endif /* _RTE_VFIO_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 5192763..8fe8984 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -22,17 +22,35 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = NULL
+		/* TODO: work with PPC64 people on enabling this (window size) */
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
 int
@@ -333,9 +351,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +372,8 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
 		}
 	}
 
@@ -665,13 +686,54 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
 vfio_type1_dma_map(int vfio_container_fd)
 {
-	int i, ret;
+	int i;
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
 		struct rte_memseg_list *msl;
 		struct rte_fbarray *arr;
 		int ms_idx, next_idx;
@@ -697,23 +759,9 @@ vfio_type1_dma_map(int vfio_container_fd)
 			len = ms->hugepage_sz;
 			hw_addr = ms->iova;
 
-			memset(&dma_map, 0, sizeof(dma_map));
-			dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-			dma_map.vaddr = addr;
-			dma_map.size = len;
-			dma_map.iova = hw_addr;
-			dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-					VFIO_DMA_MAP_FLAG_WRITE;
-
-			ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA,
-					&dma_map);
-
-			if (ret) {
-				RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-						  "error %i (%s)\n", errno,
-						  strerror(errno));
+			if (vfio_type1_dma_mem_map(vfio_container_fd, addr,
+					hw_addr, len, 1))
 				return -1;
-			}
 		}
 	}
 
@@ -865,6 +913,49 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region mapping not supported by IOMMU %s\n",
+			t->name);
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 1);
+}
+
+int
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 0);
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -897,4 +988,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..b68703e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -110,6 +111,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +121,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address, length and
+ * operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ **/
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 25/41] eal: map/unmap memory with VFIO when alloc/free pages
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (67 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 24/41] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
                   ` (18 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index bbeeeba..c03e7bc 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -34,6 +34,7 @@
 #include <rte_eal.h>
 #include <rte_memory.h>
 #include <rte_spinlock.h>
+#include <rte_vfio.h>
 
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
@@ -476,6 +477,10 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	ms->iova = iova;
 	ms->socket_id = socket_id;
 
+	/* map the segment so that VFIO has access to it */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
+			rte_vfio_dma_map(ms->addr_64, iova, size))
+		RTE_LOG(DEBUG, EAL, "Cannot register segment with VFIO\n");
 	return 0;
 
 mapped:
@@ -507,6 +512,12 @@ free_page(struct rte_memseg *ms, struct hugepage_info *hi,
 	char path[PATH_MAX];
 	int fd, ret;
 
+	/* unmap the segment from VFIO */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
+			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len)) {
+		RTE_LOG(DEBUG, EAL, "Cannot unregister segment with VFIO\n");
+	}
+
 	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
 				MAP_FAILED) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 26/41] eal: prepare memseg lists for multiprocess sync
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (68 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
                   ` (17 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

In preparation for implementing multiprocess support, we are adding
a version number and write locks to memseg lists.

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation opts for the latter option: the primary process's
shared mappings are authoritative, and each secondary process keeps
its own internal view of mapped memory, which it will attempt
to synchronize with the primary's using versioning.

Under this model, only the primary process decides which pages get
mapped; secondary processes only copy the primary's page maps and
get notified of the changes via the IPC mechanism (coming in later
commits).

To avoid race conditions, memseg lists will also have write locks -
that is, it will be possible for several secondary processes to
initialize concurrently, but it will not be possible for several
processes to request memory allocation unless all other allocations
were complete (on a single socket - it is OK to allocate/free memory
on different sockets concurrently).

In principle, it is possible for multiple processes to request
allocation/deallocation on multiple sockets, but we will only allow
one such request to be active at any one time.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 288 +++++++++++++++++++++-
 4 files changed, 295 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index be8340b..255aedc 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,10 @@ eal_memalloc_alloc_page(uint64_t __rte_unused size, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 08ba70e..beac296 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -24,4 +24,8 @@ bool
 eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b6bdb21..d653d57 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,8 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t hugepage_sz; /**< page size for all memsegs in this list. */
+	rte_rwlock_t mplock; /**< read-write lock for multiprocess sync. */
+	uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index c03e7bc..227d703 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -65,6 +65,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -619,11 +622,14 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 			continue;
 		msl = cur_msl;
 
+		/* lock memseg list */
+		rte_rwlock_write_lock(&msl->mplock);
+
 		/* try finding space in memseg list */
 		cur_idx = rte_fbarray_find_next_n_free(&msl->memseg_arr, 0, n);
 
 		if (cur_idx < 0)
-			continue;
+			goto next_list;
 
 		end_idx = cur_idx + n;
 		start_idx = cur_idx;
@@ -637,7 +643,6 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 
 			if (alloc_page(cur, addr, size, socket, hi, msl_idx,
 					cur_idx)) {
-
 				RTE_LOG(DEBUG, EAL, "attempted to allocate %i pages, but only %i were allocated\n",
 					n, i);
 
@@ -648,7 +653,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 				 */
 				if (!exact) {
 					ret = i;
-					goto restore_numa;
+					goto success;
 				}
 				RTE_LOG(DEBUG, EAL, "exact amount of pages was requested, so returning %i allocated pages\n",
 					i);
@@ -680,10 +685,13 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
 		}
 		/* we allocated all pages */
 		ret = n;
+success:
+		msl->version++;
+		rte_rwlock_write_unlock(&msl->mplock);
 
 		break;
 next_list:
-		/* dummy semi-colon to make label work */;
+		rte_rwlock_write_unlock(&msl->mplock);
 	}
 	/* we didn't break */
 	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
@@ -716,7 +724,7 @@ eal_memalloc_free_page(struct rte_memseg *ms)
 	struct rte_memseg_list *msl = NULL;
 	unsigned int msl_idx, seg_idx;
 	struct hugepage_info *hi = NULL;
-	int i;
+	int ret, i;
 
 	/* dynamic free not supported in legacy mode */
 	if (internal_config.legacy_mem)
@@ -753,6 +761,274 @@ eal_memalloc_free_page(struct rte_memseg *ms)
 		RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
 		return -1;
 	}
+	rte_rwlock_write_lock(&msl->mplock);
+
 	rte_fbarray_set_free(&msl->memseg_arr, seg_idx);
-	return free_page(ms, hi, msl_idx, seg_idx);
+
+	/* increment version number */
+	msl->version++;
+
+	ret = free_page(ms, hi, msl_idx, seg_idx);
+
+	rte_rwlock_write_unlock(&msl->mplock);
+
+	return ret;
+}
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_page(l_ms, p_ms->addr,
+					p_ms->hugepage_sz,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_page(l_ms, hi, msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_free(l_arr, seg_idx);
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case scenario - no differences (or bigger, which will be
+		 * fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int msl_idx;
+	int i;
+
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool new_msl = false;
+		bool fail = false;
+
+		primary_msl = &mcfg->memsegs[msl_idx];
+		local_msl = &local_memsegs[msl_idx];
+
+		if (primary_msl->base_va == 0)
+			continue;
+
+		/* this is a valid memseg list, so read-lock it */
+		rte_rwlock_read_lock(&primary_msl->mplock);
+
+		/* write-lock local memseg list */
+		rte_rwlock_write_lock(&local_msl->mplock);
+
+		/* check if secondary has this memseg list set up */
+		if (local_msl->base_va == 0) {
+			char name[PATH_MAX];
+			int ret;
+			new_msl = true;
+
+			/* create distinct fbarrays for each secondary */
+			snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+				primary_msl->memseg_arr.name, getpid());
+
+			ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+				primary_msl->memseg_arr.len,
+				primary_msl->memseg_arr.elt_sz);
+			if (ret < 0) {
+				RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+				fail = true;
+				goto endloop;
+			}
+
+			local_msl->base_va = primary_msl->base_va;
+		}
+
+		for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info);
+					i++) {
+			uint64_t cur_sz =
+				internal_config.hugepage_info[i].hugepage_sz;
+			uint64_t msl_sz = primary_msl->hugepage_sz;
+			if (msl_sz == cur_sz) {
+				hi = &internal_config.hugepage_info[i];
+				break;
+			}
+		}
+		if (!hi) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			fail = true;
+			goto endloop;
+		}
+
+		/* if versions don't match or if we have just allocated a new
+		 * memseg list, synchronize everything
+		 */
+		if ((new_msl || local_msl->version != primary_msl->version) &&
+				sync_existing(primary_msl, local_msl, hi,
+				msl_idx)) {
+			fail = true;
+			goto endloop;
+		}
+endloop:
+		rte_rwlock_write_unlock(&local_msl->mplock);
+		rte_rwlock_read_unlock(&primary_msl->mplock);
+		if (fail)
+			return -1;
+	}
+	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 27/41] eal: add multiprocess init with memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (69 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 28/41] eal: add support for multiprocess " Anatoly Burakov
                   ` (16 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

For legacy memory mode, attach to the primary's memseg list and map
hugepages as before.

For non-legacy mode, preallocate all VA space and then synchronize
the local memory map with the primary.
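
The reason for preallocating VA space is that a secondary must be able
to map memory at the same addresses the primary published. A minimal
standalone sketch of that reservation step follows (plain mmap with
hypothetical names, not the eal_get_virtual_area()/alloc_va_space()
helpers used in the diff):

#include <stddef.h>
#include <sys/mman.h>

/* reserve 'len' bytes at the address the primary published; returns NULL
 * if that exact range could not be obtained (e.g. ASLR put something there)
 */
static void *reserve_va(void *base, size_t len)
{
	void *addr = mmap(base, len, PROT_NONE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return NULL;
	if (base != NULL && addr != base) {
		/* kernel gave us a different range - not usable for sync */
		munmap(addr, len);
		return NULL;
	}
	return addr;
}

Pages are then mapped into this reservation on demand as the local
memory map is synchronized with the primary.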

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  7 ++
 lib/librte_eal/common/eal_common_memory.c       | 99 +++++++++++++++++++++----
 lib/librte_eal/common/eal_hugepages.h           |  5 ++
 lib/librte_eal/linuxapp/eal/eal.c               | 18 +++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 53 ++++++++-----
 lib/librte_eal/linuxapp/eal/eal_memory.c        | 24 ++++--
 6 files changed, 159 insertions(+), 47 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..18e6e5e 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -103,3 +103,10 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* memory hotplug is not supported in FreeBSD, so no need to implement this */
+int
+eal_hugepage_info_read(void)
+{
+	return 0;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 457e239..a571e24 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
@@ -147,19 +148,11 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 	char name[RTE_FBARRAY_NAME_LEN];
 	int max_pages;
 	uint64_t mem_amount;
-	void *addr;
 
 	if (!internal_config.legacy_mem) {
 		mem_amount = get_mem_amount(page_sz);
 		max_pages = mem_amount / page_sz;
-
-		addr = eal_get_virtual_area(NULL, &mem_amount, page_sz, 0, 0);
-		if (addr == NULL) {
-			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
-			return -1;
-		}
 	} else {
-		addr = NULL;
 		/* number of memsegs in each list, these will not be single-page
 		 * segments, so RTE_MAX_LEGACY_MEMSEG is like old default.
 		 */
@@ -177,7 +170,7 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 
 	msl->hugepage_sz = page_sz;
 	msl->socket_id = socket_id;
-	msl->base_va = addr;
+	msl->base_va = NULL;
 
 	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
 			page_sz >> 10, socket_id);
@@ -186,16 +179,46 @@ alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
 }
 
 static int
-memseg_init(void)
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t mem_sz, page_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->hugepage_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+
+static int
+memseg_primary_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket_id, hpi_idx, msl_idx = 0;
 	struct rte_memseg_list *msl;
 
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
-		RTE_LOG(ERR, EAL, "Secondary process not supported\n");
-		return -1;
-	}
+	/* if we start allocating memory segments for pages straight away, VA
+	 * space will become fragmented, reducing chances of success when
+	 * secondary process maps the same addresses. to fix this, allocate
+	 * fbarrays first, and then allocate VA space for them.
+	 */
 
 	/* create memseg lists */
 	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
@@ -235,12 +258,55 @@ memseg_init(void)
 				total_segs += msl->memseg_arr.len;
 				total_mem = total_segs * msl->hugepage_sz;
 				type_msl_idx++;
+
+				/* no need to preallocate VA in legacy mode */
+				if (internal_config.legacy_mem)
+					continue;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
 			}
 		}
 	}
 	return 0;
 }
 
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* no need to preallocate VA space in legacy mode */
+		if (internal_config.legacy_mem)
+			continue;
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
 static struct rte_memseg *
 virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
@@ -480,7 +546,10 @@ rte_eal_memory_init(void)
 	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	retval = memseg_init();
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+			memseg_primary_init() :
+			memseg_secondary_init();
+
 	if (retval < 0)
 		return -1;
 
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index f963ae5..38d0b04 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -34,4 +34,9 @@ struct hugepage_file {
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read information about hugepages on Linux, but don't clear them out.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d336c96..7a0d742 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -805,13 +805,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 7e2475f..7a4adce 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -6,6 +6,7 @@
 #include <sys/types.h>
 #include <sys/file.h>
 #include <dirent.h>
+#include <stdbool.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdio.h>
@@ -299,15 +300,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(bool clear_hugepages)
+{
+	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -350,18 +345,20 @@ eal_hugepage_info_init(void)
 			continue;
 		}
 
-		/* try to obtain a writelock */
-		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
+		if (clear_hugepages) {
+			/* try to obtain a writelock */
+			hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
 
-		/* if blocking lock failed */
-		if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
-			RTE_LOG(CRIT, EAL,
-				"Failed to lock hugepage directory!\n");
-			break;
+			/* if blocking lock failed */
+			if (flock(hpi->lock_descriptor, LOCK_EX) == -1) {
+				RTE_LOG(CRIT, EAL,
+					"Failed to lock hugepage directory!\n");
+				break;
+			}
+			/* clear out the hugepages dir from unused pages */
+			if (clear_hugedir(hpi->hugedir) == -1)
+				break;
 		}
-		/* clear out the hugepages dir from unused pages */
-		if (clear_hugedir(hpi->hugedir) == -1)
-			break;
 
 		/*
 		 * first, try to put all hugepages into relevant sockets, but
@@ -417,10 +414,26 @@ eal_hugepage_info_init(void)
 			num_pages += hpi->num_pages[j];
 		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-				num_pages > 0)
+				(num_pages > 0 || !clear_hugepages))
 			return 0;
 	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+int eal_hugepage_info_read(void)
+{
+	return hugepage_info_init(false);
+}
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	return hugepage_info_init(true);
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index e0b4988..f74291f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1569,6 +1569,22 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0) {
+			RTE_LOG(ERR, EAL, "It is recommended to "
+				"disable ASLR in the kernel "
+				"and retry running both primary "
+				"and secondary processes\n");
+		}
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1580,11 +1596,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	else
-		RTE_LOG(ERR, EAL, "Secondary processes aren't supported yet\n");
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 28/41] eal: add support for multiprocess memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (70 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-23 15:44   ` Tan, Jianfeng
  2018-03-07 16:56 ` [PATCH v2 29/41] eal: add support for callbacks on " Anatoly Burakov
                   ` (15 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

Basic workflow is the following. Primary process always does initial
mapping and unmapping, and secondary processes always follow primary
page map. Only one allocation request can be active at any one time.

When the primary allocates memory, it ensures that all other processes
have allocated the same set of hugepages successfully; otherwise any
allocations made are rolled back and the memory is freed back from the
heap. The heap is locked throughout the process, so no race conditions
can happen.

When primary frees memory, it frees the heap, deallocates affected
pages, and notifies other processes of deallocations. Since heap is
freed from that memory chunk, the area basically becomes invisible
to other processes even if they happen to fail to unmap that
specific set of pages, so it's completely safe to ignore results of
sync requests.

When a secondary allocates memory, it does not do so by itself.
Instead, it sends a request to the primary process to try to allocate
pages of the specified size on the specified socket, such that the
requested heap allocation can complete. The primary process then
sends all secondaries (including the requestor) a separate
notification of the allocated pages, and expects all secondary
processes to report success before considering the pages "allocated".

Only after the primary process ensures that all memory has been
successfully allocated in every secondary process will it respond
positively to the initial request and let the secondary proceed with
the allocation. Since the heap now has memory that can satisfy the
allocation request, and it was locked all this time (so no other
allocations could take place), the secondary process will be able to
allocate memory from the heap.

When a secondary frees memory, it hides the pages to be deallocated
from the heap. It then sends a deallocation request to the primary
process, which deallocates the pages itself and then sends a separate
sync request to all other processes (including the requestor) to
unmap the same pages. This way, even if the secondary fails to notify
other processes of this deallocation, that memory becomes invisible
to other processes and will not be allocated from again.

So, to summarize: address space will only become part of the heap
if the primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the
worst thing that could happen is that a page will "leak" and will
not be available to either DPDK or the system, as some process
will still hold onto it. It's not an actual leak, as we can account
for the page - it's just that none of the processes will be able
to use this page for anything useful until it gets allocated from
again by the primary.

Because the underlying DPDK IPC implementation is single-threaded,
some asynchronous magic had to be done, as we need to complete
several requests before we can definitively allow the secondary
process to use the allocated memory (namely, it has to be present in
all other secondary processes before it can be used). Additionally,
only one allocation request is allowed to be submitted at once.

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that,
a shared rwlock is used: it is taken for reading on init (so that
several secondaries can initialize concurrently), and for writing
when making allocation requests (so that either secondary init will
have to wait, or the allocation request will have to wait until all
processes have initialized), as sketched below.
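
The init-versus-allocation exclusion described above is a plain
reader/writer pattern. A minimal standalone sketch (a pthread rwlock
with hypothetical function names stands in for the memory_hotplug_lock
added to rte_mem_config; the real lock lives in shared memory so it can
be taken across processes):

#include <pthread.h>

static pthread_rwlock_t hotplug_lock = PTHREAD_RWLOCK_INITIALIZER;

/* many secondaries may initialize at the same time */
static void secondary_init(void)
{
	pthread_rwlock_rdlock(&hotplug_lock);
	/* attach to shared config, sync local memory map */
	pthread_rwlock_unlock(&hotplug_lock);
}

/* an allocation/free request excludes both init and other requests */
static void hotplug_request(void)
{
	pthread_rwlock_wrlock(&hotplug_lock);
	/* allocate or free pages, synchronize all processes via IPC */
	pthread_rwlock_unlock(&hotplug_lock);
}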

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2: - fixed deadlocking on init problem
        - reverted rte_panic changes (fixed by changes in IPC instead)
    
    This problem is evidently complex to solve without a multithreaded
    IPC implementation. An alternative approach would be to process
    each individual message in its own thread (or at least spawn a
    thread per incoming request) - that way, we can send requests
    while responding to another request, and this problem becomes
    trivial to solve (and in fact it was solved that way initially,
    before my aversion to certain other programming languages kicked
    in).
    
    Is the added complexity worth saving a couple of thread spin-ups
    here and there?

 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/eal_common_memory.c         |  16 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 723 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 8 files changed, 1040 insertions(+), 46 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index a571e24..0a0aa88 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -543,24 +543,34 @@ rte_mem_lock_page(const void *virt)
 int
 rte_eal_memory_init(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
+	if (!mcfg)
+		return -1;
+
+	/* lock mem hotplug here, to prevent races while we init */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			memseg_primary_init() :
 			memseg_secondary_init();
 
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0)
-		return -1;
+		goto fail;
 
 	return 0;
+fail:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return -1;
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index d653d57..c4b36f6 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -60,6 +60,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< indicates whether memory hotplug request is in progress. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7a3d0f3..9935238 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -146,33 +146,42 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_pages,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	int i;
+
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	for (i = 0; i < n_pages; i++)
+		eal_memalloc_free_page(ms[i]);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_pages)
 {
 	size_t map_len, data_start_offset;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int i, n_pages, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	int allocd_pages;
 	void *ret, *map_addr, *data_start;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_pages = map_len / pg_sz;
+	map_len = n_pages * pg_sz;
 
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_pages);
-
-	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz, socket,
-			true);
+	allocd_pages = eal_memalloc_alloc_page_bulk(ms, n_pages, pg_sz,
+			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages != n_pages)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
@@ -184,7 +193,7 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 			n_pages * msl->hugepage_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
@@ -195,7 +204,53 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t map_len;
+	int n_pages;
+
+	map_len = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_pages = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	if (ms == NULL)
+		return -1;
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_pages);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += map_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 		socket, map_len >> 20ULL);
@@ -205,13 +260,9 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
 
-free_pages:
-	for (i = 0; i < n_pages; i++)
-		eal_memalloc_free_page(ms[i]);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -219,6 +270,57 @@ try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -236,11 +338,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -355,7 +456,7 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 
 	rte_spinlock_lock(&(heap->lock));
 
-	align = align == 0 ? 1 : align;
+	align = RTE_MAX(align == 0 ? 1 : align, MALLOC_ELEM_HEADER_LEN);
 
 	/* for legacy mode, try once and with all flags */
 	if (internal_config.legacy_mem) {
@@ -372,7 +473,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -424,14 +526,40 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_pages, page_idx, max_page_idx;
+	struct rte_memseg_list *msl;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	n_pages = aligned_len / msl->hugepage_sz;
+	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+			msl->hugepage_sz;
+	max_page_idx = page_idx + n_pages;
+
+	for (; page_idx < max_page_idx; page_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
+		eal_memalloc_free_page(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len;
 	struct rte_memseg_list *msl;
-	int n_pages, page_idx, max_page_idx, ret;
+	int n_pages, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -463,30 +591,60 @@ malloc_heap_free(struct malloc_elem *elem)
 	aligned_end = RTE_PTR_ALIGN_FLOOR(end, msl->hugepage_sz);
 
 	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_pages = aligned_len / msl->hugepage_sz;
 
 	/* can't free anything */
-	if (aligned_len < msl->hugepage_sz)
+	if (n_pages == 0)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so, as even if notifications about
+	 * unmapped pages don't make it to other processes, heap is shared
+	 * across all processes, and will become empty of this memory anyway,
+	 * and nothing can allocate it back unless primary process will be able
+	 * to deliver allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_pages = aligned_len / msl->hugepage_sz;
-	page_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / msl->hugepage_sz;
-	max_page_idx = page_idx + n_pages;
+	heap->total_size -= n_pages * msl->hugepage_sz;
 
-	for (; page_idx < max_page_idx; page_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, page_idx);
-		eal_memalloc_free_page(ms);
-		heap->total_size -= msl->hugepage_sz;
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we request primary to deallocate pages, but we don't do it
+		 * in this thread. instead, we notify primary that we would like
+		 * to deallocate pages, and this process will receive another
+		 * request (in parallel) that will do it for us on another
+		 * thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -576,8 +734,16 @@ rte_eal_malloc_heap_init(void)
 	int msl_idx;
 	struct rte_memseg_list *msl;
 
-	if (mcfg == NULL)
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
 		return -1;
+	}
+
+	/* unlock mem hotplug here. it's safe for primary as no requests can
+	 * even come before primary itself is fully initialized, and secondaries
+	 * do not need to initialize the heap.
+	 */
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 
 	/* secondary processes don't need to initialize heap */
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
@@ -604,6 +770,7 @@ rte_eal_malloc_heap_init(void)
 						rte_fbarray_get(arr, ms_idx);
 				malloc_heap_add_memory(heap, msl,
 						ms->addr, ms->len);
+				heap->total_size += ms->len;
 				ms_idx++;
 				RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 					msl->socket_id, ms->len >> 20ULL);
@@ -630,6 +797,8 @@ rte_eal_malloc_heap_init(void)
 			 */
 			malloc_heap_add_memory(heap, msl, start_seg->addr, len);
 
+			heap->total_size += len;
+
 			RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
 				msl->socket_id, len >> 20ULL);
 
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..8052680
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,723 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to notify of changes in memory map. this is
+ * essentially a regular sync request, but we cannot send sync requests while
+ * another one is in progress, and we might have to - therefore, we do this as
+ * a separate callback.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to timeout earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* this ID is, like, totally guaranteed to be absolutely unique. pinky swear. */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* secondary will respond to sync requests thusly */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply = {0};
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t map_len;
+	int n_pages;
+
+	map_len = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_pages = map_len / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_pages);
+
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_pages);
+
+	if (elem == NULL)
+		goto fail;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_pages;
+	req->alloc_state.map_addr = ms[0]->addr;
+	req->alloc_state.map_len = map_len;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg = {0};
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg = {0};
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg = {0};
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg = {0};
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected response to sync request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg = {0};
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request memory map sync; this is only called when the primary
+ * process initiates an allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg = {0};
+	struct rte_mp_reply reply = {0};
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be stray timeout still waiting */
+	do {
+		ret = rte_mp_request(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a set of asynchronous requests to the
+ * primary process. this will initiate a request and wait until a response comes.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg = {0};
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts = {0};
+	struct timeval now;
+	int ret;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+		if (rte_mp_async_reply_register(MP_ACTION_SYNC,
+				handle_sync_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_async_reply_register(MP_ACTION_ROLLBACK,
+				handle_rollback_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..9c79d31
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_pages);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_pages,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif // MALLOC_MP_H
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 29/41] eal: add support for callbacks on memory hotplug
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (71 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 28/41] eal: add support for multiprocess " Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
                   ` (14 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Each process will have its own callbacks. Callbacks will indicate
whether it is an allocation or a deallocation that has happened, and
will also provide the start VA address and length of the allocated
block.

Since memory hotplug is not supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either case.
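
A minimal registration sketch (illustrative only: the callback name and
body below are made up, while the enum, typedef and register function
are the ones added by this patch):

	#include <stdio.h>
	#include <rte_memory.h>

	/* called for every change in the memory map of this process */
	static void
	app_mem_event_cb(enum rte_mem_event event_type, const void *addr,
			size_t len)
	{
		printf("%s: %zu bytes at %p\n",
			event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
			len, addr);
	}

	/* call after rte_eal_init(); returns -1 and sets rte_errno
	 * (e.g. ENOTSUP in legacy mem mode or on FreeBSD) on failure
	 */
	static int
	app_register_mem_event_cb(void)
	{
		return rte_mem_event_register_callback("app_mem_event_cb",
				app_mem_event_cb);
	}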

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memalloc.c | 132 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 +++
 lib/librte_eal/common/include/rte_memory.h  |  48 ++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 220 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 62e8c16..4fb55f2 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Doubly linked list of memory event callbacks. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list callback_list =
+	TAILQ_HEAD_INITIALIZER(callback_list);
+
+static rte_rwlock_t rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -47,3 +77,105 @@ eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 
 	return true;
 }
+
+int
+eal_memalloc_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&rwlock);
+
+	entry = find_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&rwlock);
+
+	entry = find_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_notify(enum rte_mem_event event, const void *start, size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&rwlock);
+
+	TAILQ_FOREACH(entry, &callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback %s\n",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 0a0aa88..2d73cf3 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -466,6 +466,34 @@ rte_eal_get_physmem_size(void)
 	return total_len;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int
+rte_mem_event_register_callback(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_callback_register(name, clb);
+}
+
+int
+rte_mem_event_unregister_callback(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index beac296..499cf58 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,14 @@ eal_memalloc_is_contig(struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_callback_unregister(const char *name);
+
+void
+eal_memalloc_notify(enum rte_mem_event event, const void *start, size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 674d4cb..1c8ffa6 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -200,6 +200,54 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback registration
+ *   -1 on unsuccessful callback registration, with rte_errno value indicating
+ *   reason for failure.
+ */
+int rte_mem_event_register_callback(const char *name,
+		rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback removal
+ *   -1 on unsuccessful callback removal, with rte_errno value indicating
+ *   reason for failure.
+ */
+int rte_mem_event_unregister_callback(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 4c2e959..b2a2d37 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -214,6 +214,8 @@ DPDK_18.05 {
 	global:
 
 	rte_num_sockets;
+	rte_mem_event_register_callback;
+	rte_mem_event_unregister_callback;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
 	rte_malloc_dump_heaps;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 30/41] eal: enable callbacks on malloc/free and mp sync
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (72 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 29/41] eal: add support for callbacks on " Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:56 ` [PATCH v2 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                   ` (13 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Also, rewrite VFIO to rely on memory callbacks instead of manually
registering memory with VFIO. Callbacks will only be registered if
VFIO is enabled.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 37 +++++++++++++++++++++---------
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 35 ++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9935238..d932ead 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -223,6 +223,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t map_len;
 	int n_pages;
+	bool callback_triggered = false;
 
 	map_len = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -242,14 +243,25 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_notify(RTE_MEM_EVENT_ALLOC, map_addr, map_len);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
+
 	heap->total_size += map_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
@@ -260,6 +272,9 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE, map_addr, map_len);
+
 	rollback_expand_heap(ms, n_pages, elem, map_addr, map_len);
 
 	request_sync();
@@ -615,6 +630,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= n_pages * msl->hugepage_sz;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -637,6 +656,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 227d703..1008fae 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -34,7 +34,6 @@
 #include <rte_eal.h>
 #include <rte_memory.h>
 #include <rte_spinlock.h>
-#include <rte_vfio.h>
 
 #include "eal_filesystem.h"
 #include "eal_internal_cfg.h"
@@ -480,10 +479,6 @@ alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
 	ms->iova = iova;
 	ms->socket_id = socket_id;
 
-	/* map the segment so that VFIO has access to it */
-	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
-			rte_vfio_dma_map(ms->addr_64, iova, size))
-		RTE_LOG(DEBUG, EAL, "Cannot register segment with VFIO\n");
 	return 0;
 
 mapped:
@@ -515,12 +510,6 @@ free_page(struct rte_memseg *ms, struct hugepage_info *hi,
 	char path[PATH_MAX];
 	int fd, ret;
 
-	/* unmap the segment from VFIO */
-	if (rte_eal_iova_mode() == RTE_IOVA_VA &&
-			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len)) {
-		RTE_LOG(DEBUG, EAL, "Cannot unregister segment with VFIO\n");
-	}
-
 	if (mmap(ms->addr, ms->hugepage_sz, PROT_READ,
 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
 				MAP_FAILED) {
@@ -808,6 +797,19 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		len = ms->len * diff_len;
+
+		eal_memalloc_notify(RTE_MEM_EVENT_FREE, start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -834,6 +836,19 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		len = ms->len * diff_len;
+
+		eal_memalloc_notify(RTE_MEM_EVENT_ALLOC, start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 8fe8984..d3c3b70 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -214,6 +214,37 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+	uint64_t pgsz;
+
+	msl = rte_mem_virt2memseg_list(addr);
+	pgsz = msl->hugepage_sz;
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+		uint64_t vfio_va, iova;
+
+		ms = rte_mem_virt2memseg(va, msl);
+		vfio_va = (uint64_t) (uintptr_t) va;
+		iova = ms->iova;
+
+		/* this never gets called in legacy mode, so we can be sure that
+		 * each segment is a single page.
+		 */
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(vfio_va, iova, pgsz);
+		else
+			rte_vfio_dma_unmap(vfio_va, iova, pgsz);
+
+		cur_len += pgsz;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -507,6 +538,10 @@ rte_vfio_enable(const char *modname)
 	if (vfio_cfg.vfio_container_fd != -1) {
 		RTE_LOG(NOTICE, EAL, "VFIO support initialized\n");
 		vfio_cfg.vfio_enabled = 1;
+
+		/* register callback for mem events */
+		rte_mem_event_register_callback("vfio_mem_event_clb",
+				vfio_mem_event_callback);
 	} else {
 		RTE_LOG(NOTICE, EAL, "VFIO support could not be initialized\n");
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 31/41] ethdev: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (73 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
@ 2018-03-07 16:56 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 32/41] crypto/qat: " Anatoly Burakov
                   ` (12 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:56 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c
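
The change itself is mechanical: every reservation that backs a DMA ring
switches to the _contig variant of the same call. A hedged sketch of the
pattern (the helper name is made up, and the _contig variant is assumed
to mirror rte_memzone_reserve_aligned, taking the same arguments and
returning NULL on failure, as the diffs in this and the following
patches suggest):

	#include <rte_memzone.h>

	/* reserve an IOVA-contiguous zone suitable for device DMA */
	static const struct rte_memzone *
	reserve_dma_ring(const char *name, size_t len, int socket_id,
			unsigned int align)
	{
		return rte_memzone_reserve_aligned_contig(name, len,
				socket_id, 0, align);
	}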

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0590f0c..7935230 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3401,7 +3401,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 32/41] crypto/qat: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (74 preceding siblings ...)
  2018-03-07 16:56 ` [PATCH v2 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 33/41] net/avf: " Anatoly Burakov
                   ` (11 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
---
 drivers/crypto/qat/qat_qp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..3f8ed4d 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -95,8 +95,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 	default:
 		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
 	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned_contig(queue_name, queue_size,
+		socket_id, memzone_flags, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 33/41] net/avf: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (75 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 32/41] crypto/qat: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 34/41] net/bnx2x: " Anatoly Burakov
                   ` (10 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/avf/avf_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4df6617..f69d697 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,7 +1365,7 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 34/41] net/bnx2x: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (76 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 33/41] net/avf: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 35/41] net/cxgbe: " Anatoly Burakov
                   ` (9 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/bnx2x/bnx2x.c      | 2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..81f5dae 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,7 +177,7 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned_contig(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
 					0, align);
 	if (z == NULL) {
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..325b94d 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned_contig(z_name, ring_size, socket_id,
+			0, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 35/41] net/cxgbe: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (77 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 34/41] net/bnx2x: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 36/41] net/ena: " Anatoly Burakov
                   ` (8 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    It is not 100% clear if this memzone is used for DMA,
    corrections welcome.

 drivers/net/cxgbe/sge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 3d5aa59..e31474c 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1299,7 +1299,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned_contig(z_name, len, socket_id, 0,
+			4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 36/41] net/ena: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (78 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 35/41] net/cxgbe: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-08  9:40   ` Michał Krawczyk
  2018-03-07 16:57 ` [PATCH v2 37/41] net/enic: " Anatoly Burakov
                   ` (7 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/ena/base/ena_plat_dpdk.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..c1ebf00 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve_contig(z_name,			\
+				size, SOCKET_ID_ANY, 0);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +220,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 37/41] net/enic: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (79 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 36/41] net/ena: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 38/41] net/i40e: " Anatoly Burakov
                   ` (6 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: John Daley <johndale@cisco.com>
---

Notes:
    It is not 100% clear that the second call to memzone_reserve
    is allocating DMA memory. Corrections welcome.

 drivers/net/enic/enic_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index ec9d343..cb2a7ba 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -319,7 +319,7 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
+	rz = rte_memzone_reserve_aligned_contig((const char *)name,
 					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
@@ -787,7 +787,7 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		 "vnic_cqmsg-%s-%d-%d", enic->bdf_name, queue_idx,
 		instance++);
 
-	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
+	wq->cqmsg_rz = rte_memzone_reserve_aligned_contig((const char *)name,
 						   sizeof(uint32_t),
 						   SOCKET_ID_ANY, 0,
 						   ENIC_ALIGN);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 38/41] net/i40e: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (80 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 37/41] net/enic: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 16:57 ` [PATCH v2 39/41] net/qede: " Anatoly Burakov
                   ` (5 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    It is not 100% clear that all users of this function
    need to allocate DMA memory. Corrections welcome.

 drivers/net/i40e/i40e_ethdev.c | 2 +-
 drivers/net/i40e/i40e_rxtx.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 508b417..0fffe2c 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4010,7 +4010,7 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..6b2b40e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,7 +2189,7 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
+	mz = rte_memzone_reserve_aligned_contig(name, len,
 					 socket_id, 0, I40E_RING_BASE_ALIGN);
 	return mz;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 39/41] net/qede: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (81 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 38/41] net/i40e: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-07 22:55   ` Patil, Harish
  2018-03-07 16:57 ` [PATCH v2 40/41] net/virtio: " Anatoly Burakov
                   ` (4 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Doing "grep -R rte_memzone_reserve drivers/net/qede" returns the following:
    
    drivers/net/qede/qede_fdir.c:     mz = rte_memzone_reserve_aligned(mz_name, QEDE_MAX_FDIR_PKT_LEN,
    drivers/net/qede/base/bcm_osal.c: mz = rte_memzone_reserve_aligned_contig(mz_name, size,
    drivers/net/qede/base/bcm_osal.c: mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
    
    I took a brief look at the memzone in qede_fdir and it didn't look like it
    was used for DMA, so I left it alone. Corrections welcome.

 drivers/net/qede/base/bcm_osal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index fe42f32..707d553 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,7 +135,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = 0;
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size,
 					 socket_id, 0, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = 0;
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
+			align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 40/41] net/virtio: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (82 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 39/41] net/qede: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-28 11:58   ` Maxime Coquelin
  2018-03-07 16:57 ` [PATCH v2 41/41] net/vmxnet3: " Anatoly Burakov
                   ` (3 subsequent siblings)
  87 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Yuanhan Liu, Maxime Coquelin, Tiwei Bie, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
---

Notes:
    Not sure if virtio needs to allocate DMA-capable memory,
    being a software driver and all. Corrections welcome.

 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 884f74a..35812e4 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -391,7 +391,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d",
 		     size, vq->vq_ring_size);
 
-	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
+	mz = rte_memzone_reserve_aligned_contig(vq_name, vq->vq_ring_size,
 					 SOCKET_ID_ANY,
 					 0, VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
@@ -417,9 +417,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	if (sz_hdr_mz) {
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
-		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+		hdr_mz = rte_memzone_reserve_aligned_contig(vq_hdr_name,
+				sz_hdr_mz, SOCKET_ID_ANY, 0,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 41/41] net/vmxnet3: use contiguous allocation for DMA memory
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (83 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 40/41] net/virtio: " Anatoly Burakov
@ 2018-03-07 16:57 ` Anatoly Burakov
  2018-03-08 14:40 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
                   ` (2 subsequent siblings)
  87 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-03-07 16:57 UTC (permalink / raw)
  To: dev
  Cc: Shrikrishna Khare, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    Not sure if DMA-capable memzones are needed for vmxnet3.
    Corrections welcome.

 drivers/net/vmxnet3/vmxnet3_ethdev.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4e68aae..c787379 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -150,14 +150,15 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 	if (!reuse) {
 		if (mz)
 			rte_memzone_free(mz);
-		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+		return rte_memzone_reserve_aligned_contig(z_name, size,
+				socket_id, 0, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 39/41] net/qede: use contiguous allocation for DMA memory
  2018-03-07 16:57 ` [PATCH v2 39/41] net/qede: " Anatoly Burakov
@ 2018-03-07 22:55   ` Patil, Harish
  0 siblings, 0 replies; 471+ messages in thread
From: Patil, Harish @ 2018-03-07 22:55 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Mody, Rasesh, Shaikh, Shahed, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, Jacob,  Jerin,
	hemant.agrawal@nxp.com

-----Original Message-----
From: Anatoly Burakov <anatoly.burakov@intel.com>
Date: Wednesday, March 7, 2018 at 8:57 AM
To: "dev@dpdk.org" <dev@dpdk.org>
Cc: "Mody, Rasesh" <Rasesh.Mody@cavium.com>, Harish Patil
<Harish.Patil@cavium.com>, "Shaikh, Shahed" <Shahed.Shaikh@cavium.com>,
"keith.wiles@intel.com" <keith.wiles@intel.com>, "jianfeng.tan@intel.com"
<jianfeng.tan@intel.com>, "andras.kovacs@ericsson.com"
<andras.kovacs@ericsson.com>, "laszlo.vadkeri@ericsson.com"
<laszlo.vadkeri@ericsson.com>, "benjamin.walker@intel.com"
<benjamin.walker@intel.com>, "bruce.richardson@intel.com"
<bruce.richardson@intel.com>, "thomas@monjalon.net" <thomas@monjalon.net>,
"konstantin.ananyev@intel.com" <konstantin.ananyev@intel.com>,
"kuralamudhan.ramakrishnan@intel.com"
<kuralamudhan.ramakrishnan@intel.com>, "louise.m.daly@intel.com"
<louise.m.daly@intel.com>, "nelio.laranjeiro@6wind.com"
<nelio.laranjeiro@6wind.com>, "yskoh@mellanox.com" <yskoh@mellanox.com>,
"pepperjo@japf.ch" <pepperjo@japf.ch>, "Jacob,  Jerin"
<Jerin.JacobKollanukkaran@cavium.com>, "hemant.agrawal@nxp.com"
<hemant.agrawal@nxp.com>, "olivier.matz@6wind.com" <olivier.matz@6wind.com>
Subject: [PATCH v2 39/41] net/qede: use contiguous allocation for DMA
memory

>Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>---
>
>Notes:
>    Doing "grep -R rte_memzone_reserve drivers/net/qede" returns the
>following:
>    
>    drivers/net/qede/qede_fdir.c:     mz =
>rte_memzone_reserve_aligned(mz_name, QEDE_MAX_FDIR_PKT_LEN,
>    drivers/net/qede/base/bcm_osal.c: mz =
>rte_memzone_reserve_aligned_contig(mz_name, size,
>    drivers/net/qede/base/bcm_osal.c: mz =
>rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
>    
>    I took a brief look at memzone in qede_fdir and it didn't look like
>memzone
>    was used for DMA, so i left it alone. Corrections welcome.

That’s right.
>
> drivers/net/qede/base/bcm_osal.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/net/qede/base/bcm_osal.c
>b/drivers/net/qede/base/bcm_osal.c
>index fe42f32..707d553 100644
>--- a/drivers/net/qede/base/bcm_osal.c
>+++ b/drivers/net/qede/base/bcm_osal.c
>@@ -135,7 +135,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
> 	if (core_id == (unsigned int)LCORE_ID_ANY)
> 		core_id = 0;
> 	socket_id = rte_lcore_to_socket_id(core_id);
>-	mz = rte_memzone_reserve_aligned(mz_name, size,
>+	mz = rte_memzone_reserve_aligned_contig(mz_name, size,
> 					 socket_id, 0, RTE_CACHE_LINE_SIZE);
> 	if (!mz) {
> 		DP_ERR(p_dev, "Unable to allocate DMA memory "
>@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct
>ecore_dev *p_dev,
> 	if (core_id == (unsigned int)LCORE_ID_ANY)
> 		core_id = 0;
> 	socket_id = rte_lcore_to_socket_id(core_id);
>-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
>+	mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
>+			align);
> 	if (!mz) {
> 		DP_ERR(p_dev, "Unable to allocate DMA memory "
> 		       "of size %zu bytes - %s\n",
>-- 
>2.7.4

Acked-by: Harish Patil <harish.patil@cavium.com>

>


^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-07 16:05   ` Burakov, Anatoly
@ 2018-03-08  9:37     ` Burakov, Anatoly
  2018-03-08 10:53       ` Nélio Laranjeiro
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08  9:37 UTC (permalink / raw)
  To: Nélio Laranjeiro
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

On 07-Mar-18 4:05 PM, Burakov, Anatoly wrote:
> On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
>> Hi Anatoly,
>>
>> I am trying to run some test with this series, but it seems to be based
>> on some other commits of yours. I have already identified the following
>> one [1] it seems I am missing some others.
>>
>> It is possible to have a list of commits to apply on the current master
>> branch [2] before this series?
>>
>> Thanks,
>>
>> [1] https://dpdk.org/patch/35043
>> [2] 
>> https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c 
>>
>>
> 
> Hi Nelio,
> 
> Yes, my apologies. I'm aware of the apply issues. The issue is due to me 
> missing a rebase on one of the dependent patchsets. I'm preparing a v2 
> that will fix the issue (pending some internal processes).
> 

Hi Nelio,

The v2 is now up, with a corrected rebase. You can see the list of 
dependent patches in the cover letter [1]. Once again, apologies for the 
incorrect rebase in v1. Looking forward to your feedback!

[1] http://dpdk.org/ml/archives/dev/2018-March/092070.html

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 36/41] net/ena: use contiguous allocation for DMA memory
  2018-03-07 16:57 ` [PATCH v2 36/41] net/ena: " Anatoly Burakov
@ 2018-03-08  9:40   ` Michał Krawczyk
  0 siblings, 0 replies; 471+ messages in thread
From: Michał Krawczyk @ 2018-03-08  9:40 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Marcin Wojtas, Guy Tzalik, Evgeny Schemeilin, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

2018-03-07 17:57 GMT+01:00 Anatoly Burakov <anatoly.burakov@intel.com>:
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
> ---
>  drivers/net/ena/base/ena_plat_dpdk.h | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
> index 8cba319..c1ebf00 100644
> --- a/drivers/net/ena/base/ena_plat_dpdk.h
> +++ b/drivers/net/ena/base/ena_plat_dpdk.h
> @@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
>                 ENA_TOUCH(dmadev); ENA_TOUCH(handle);                   \
>                 snprintf(z_name, sizeof(z_name),                        \
>                                 "ena_alloc_%d", ena_alloc_cnt++);       \
> -               mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
> +               mz = rte_memzone_reserve_contig(z_name,                 \
> +                               size, SOCKET_ID_ANY, 0);                \
>                 memset(mz->addr, 0, size);                              \
>                 virt = mz->addr;                                        \
>                 phys = mz->iova;                                        \
> @@ -206,7 +207,7 @@ typedef uint64_t dma_addr_t;
>                 ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);                 \
>                 snprintf(z_name, sizeof(z_name),                        \
>                                 "ena_alloc_%d", ena_alloc_cnt++);       \
> -               mz = rte_memzone_reserve(z_name, size, node, 0); \
> +               mz = rte_memzone_reserve_contig(z_name, size, node, 0); \
>                 memset(mz->addr, 0, size);                              \
>                 virt = mz->addr;                                        \
>                 phys = mz->iova;                                        \
> @@ -219,7 +220,7 @@ typedef uint64_t dma_addr_t;
>                 ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);                 \
>                 snprintf(z_name, sizeof(z_name),                        \
>                                 "ena_alloc_%d", ena_alloc_cnt++);       \
> -               mz = rte_memzone_reserve(z_name, size, node, 0); \
> +               mz = rte_memzone_reserve_contig(z_name, size, node, 0); \
>                 memset(mz->addr, 0, size);                              \
>                 virt = mz->addr;                                        \
>         } while (0)
> --
> 2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
@ 2018-03-08 10:18   ` Pavan Nikhilesh
  2018-03-08 10:46     ` Burakov, Anatoly
  2018-03-19  8:58   ` Shreyansh Jain
                     ` (69 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Pavan Nikhilesh @ 2018-03-08 10:18 UTC (permalink / raw)
  To: Anatoly Burakov, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

Hi Anatoly,

I am trying to verify this patchset and have encountered a few issues.

There are a few -Werror=maybe-uninitialized errors in the eal_memalloc.c/
eal_memory.c/eal_common_memzone.c files.

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index a7cfdaf03..ad4413507 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -321,7 +321,7 @@ rte_memzone_free(const struct rte_memzone *mz)
        struct rte_fbarray *arr;
        struct rte_memzone *found_mz;
        int ret = 0;
-       void *addr;
+       void *addr = NULL;
        unsigned idx;

        if (mz == NULL)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 1008faed6..32b0d5133 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -570,7 +570,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
        unsigned int msl_idx;
        int cur_idx, start_idx, end_idx, i, j, ret = -1;
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-       bool have_numa;
+       bool have_numa = false;
        int oldpolicy;
        struct bitmask *oldmask = numa_allocate_nodemask();
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index f74291fb6..d37b4a59b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1386,9 +1386,9 @@ eal_legacy_hugepage_attach(void)
        struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
        struct hugepage_file *hp = NULL;
        unsigned int num_hp = 0;
-       unsigned int i;
+       unsigned int i = 0;
        int ms_idx, msl_idx;
-       unsigned int cur_seg, max_seg;
+       unsigned int cur_seg, max_seg = 0;
        off_t size = 0;
        int fd, fd_hugepage = -1;



@Hemant
Also, this patchset breaks the dpaa/dpaa2 bus drivers (they rely on
`rte_eal_get_physmem_layout`, which is deprecated:
http://dpdk.org/dev/patchwork/patch/34002/).
So the generic arm64 linuxapp build is broken.

Regards,
Pavan.

On Wed, Mar 07, 2018 at 04:56:28PM +0000, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
>
> Dependencies (to be applied in specified order):
> - IPC bugfixes patchset [2]
> - IPC improvements patchset [3]
> - IPC asynchronous request API patch [4]
> - Function to return number of sockets [5]
>
<snip>
 --
> 2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 10:18   ` Pavan Nikhilesh
@ 2018-03-08 10:46     ` Burakov, Anatoly
  2018-03-08 11:13       ` Pavan Nikhilesh
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 10:46 UTC (permalink / raw)
  To: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On 08-Mar-18 10:18 AM, Pavan Nikhilesh wrote:
> Hi Anatoly,
> 
> I am trying to verify this patchset and have encountered few issues.
> 
> Few -Werror=maybe-uninitialized errors in eal_memalloc.c/eal_memory.c/
> eal_common_memzone.c files.

Thanks for the heads up, I'll fix those in the next revision. Out of 
curiosity, which compiler version are you using?

> 
> diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> index a7cfdaf03..ad4413507 100644
> --- a/lib/librte_eal/common/eal_common_memzone.c
> +++ b/lib/librte_eal/common/eal_common_memzone.c
> @@ -321,7 +321,7 @@ rte_memzone_free(const struct rte_memzone *mz)
>          struct rte_fbarray *arr;
>          struct rte_memzone *found_mz;
>          int ret = 0;
> -       void *addr;
> +       void *addr = NULL;
>          unsigned idx;
> 
>          if (mz == NULL)
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 1008faed6..32b0d5133 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -570,7 +570,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
>          unsigned int msl_idx;
>          int cur_idx, start_idx, end_idx, i, j, ret = -1;
>   #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> -       bool have_numa;
> +       bool have_numa = false;
>          int oldpolicy;
>          struct bitmask *oldmask = numa_allocate_nodemask();
>   #endif
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index f74291fb6..d37b4a59b 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -1386,9 +1386,9 @@ eal_legacy_hugepage_attach(void)
>          struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
>          struct hugepage_file *hp = NULL;
>          unsigned int num_hp = 0;
> -       unsigned int i;
> +       unsigned int i = 0;
>          int ms_idx, msl_idx;
> -       unsigned int cur_seg, max_seg;
> +       unsigned int cur_seg, max_seg = 0;
>          off_t size = 0;
>          int fd, fd_hugepage = -1;
> 
> 
> 
> @Hemanth
> Also, this patchset breaks dpaa/dpaa2 bus drivers (they rely on
> `rte_eal_get_physmem_layout` that is depricated
> http://dpdk.org/dev/patchwork/patch/34002/)
> So, generic arm64 linuxapp build is broken.

Should the deprecation notice have been accompanied by marking that 
function as __rte_deprecated?

> 
> Regards,
> Pavan.
> 
> On Wed, Mar 07, 2018 at 04:56:28PM +0000, Anatoly Burakov wrote:
>> This patchset introduces dynamic memory allocation for DPDK (aka memory
>> hotplug). Based upon RFC submitted in December [1].
>>
>> Dependencies (to be applied in specified order):
>> - IPC bugfixes patchset [2]
>> - IPC improvements patchset [3]
>> - IPC asynchronous request API patch [4]
>> - Function to return number of sockets [5]
>>
> <snip>
>   --
>> 2.7.4
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-08  9:37     ` Burakov, Anatoly
@ 2018-03-08 10:53       ` Nélio Laranjeiro
  2018-03-08 12:12         ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Nélio Laranjeiro @ 2018-03-08 10:53 UTC (permalink / raw)
  To: Burakov, Anatoly, thomas
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

On Thu, Mar 08, 2018 at 09:37:27AM +0000, Burakov, Anatoly wrote:
> On 07-Mar-18 4:05 PM, Burakov, Anatoly wrote:
> > On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
> > > Hi Anatoly,
> > > 
> > > I am trying to run some test with this series, but it seems to be based
> > > on some other commits of yours. I have already identified the following
> > > one [1] it seems I am missing some others.
> > > 
> > > It is possible to have a list of commits to apply on the current master
> > > branch [2] before this series?
> > > 
> > > Thanks,
> > > 
> > > [1] https://dpdk.org/patch/35043
> > > [2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c
> > > 
> > > 
> > 
> > Hi Nelio,
> > 
> > Yes, my apologies. I'm aware of the apply issues. The issue is due to me
> > missing a rebase on one of the dependent patchsets. I'm preparing a v2
> > that will fix the issue (pending some internal processes).
> > 
> 
> Hi Nelio,
> 
> The v2 is now up, with corrected rebase. You can see the list of dependent
> patches in the cover letter [1]. Once again apologies for incorrect rebase
> in v1. Looking forward to your feedback!
> 
> [1] http://dpdk.org/ml/archives/dev/2018-March/092070.html
 
Hi Anatoly,

First feedback: I have some issues when compiling it on desktop/server
machines with clang and GCC, maybe due to different configuration items
depending on the machine compiling it.

Clang error
-----------

  dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
         if (!hi) {
             ^~~
   CC eal_lcore.o
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
         if (have_numa)
             ^~~~~~~~~
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:2: note: remove the 'if' if its condition is always false
         if (!hi) {
         ^~~~~~~~~~
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
         if (internal_config.legacy_mem)
             ^~~~~~~~~~~~~~~~~~~~~~~~~~
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
         if (have_numa)
             ^~~~~~~~~
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:2: note: remove the 'if' if its condition is always false
         if (internal_config.legacy_mem)
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:573:16: note: initialize the variable 'have_numa' to silence this warning
         bool have_numa;
                       ^
                        = false

GCC errors
----------

 /root/dpdk/lib/librte_eal/common/eal_common_memzone.c: In function ‘rte_memzone_free’:
 /root/dpdk/lib/librte_eal/common/eal_common_memzone.c:355:2: error: ‘addr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   rte_free(addr);
   ^~~~~~~~~~~~~~
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c: In function ‘eal_memalloc_alloc_page_bulk’:
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:5: error: ‘have_numa’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   if (have_numa)
      ^
 cc1: all warnings being treated as errors
 /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_common_memzone.o' failed
 make[5]: *** [eal_common_memzone.o] Error 1
 make[5]: *** Waiting for unfinished jobs....
 cc1: all warnings being treated as errors
 /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_memalloc.o' failed
 make[5]: *** [eal_memalloc.o] Error 1
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c: In function ‘rte_eal_hugepage_attach’:
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1556:7: error: ‘max_seg’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
     if (cur_seg >= max_seg)
        ^
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1391:24: note: ‘max_seg’ was declared here
   unsigned int cur_seg, max_seg;
                         ^~~~~~~
 /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1389:15: error: ‘i’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
   unsigned int i;
                ^

It would be worth asking Thomas for a dedicated repository/branch on DPDK,
otherwise it will be a nightmare for anyone who wants to test this if we
have to apply 54 patches each time.

Can you check with him?

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 10:46     ` Burakov, Anatoly
@ 2018-03-08 11:13       ` Pavan Nikhilesh
  2018-03-08 13:36         ` Pavan Nikhilesh
  0 siblings, 1 reply; 471+ messages in thread
From: Pavan Nikhilesh @ 2018-03-08 11:13 UTC (permalink / raw)
  To: Burakov, Anatoly, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On Thu, Mar 08, 2018 at 10:46:46AM +0000, Burakov, Anatoly wrote:
> On 08-Mar-18 10:18 AM, Pavan Nikhilesh wrote:
> > Hi Anatoly,
> >
> > I am trying to verify this patchset and have encountered few issues.
> >
> > Few -Werror=maybe-uninitialized errors in eal_memalloc.c/eal_memory.c/
> > eal_common_memzone.c files.
>
> Thanks for the heads up, i'll fix those in the next revision. Out of
> curiousity, which compiler version are you using?

I'm using gcc 5.3.0.

>
> >
> > diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> > index a7cfdaf03..ad4413507 100644
> > --- a/lib/librte_eal/common/eal_common_memzone.c
> > +++ b/lib/librte_eal/common/eal_common_memzone.c
> > @@ -321,7 +321,7 @@ rte_memzone_free(const struct rte_memzone *mz)
> >          struct rte_fbarray *arr;
> >          struct rte_memzone *found_mz;
> >          int ret = 0;
> > -       void *addr;
> > +       void *addr = NULL;
> >          unsigned idx;
> >
> >          if (mz == NULL)
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > index 1008faed6..32b0d5133 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > @@ -570,7 +570,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
> >          unsigned int msl_idx;
> >          int cur_idx, start_idx, end_idx, i, j, ret = -1;
> >   #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> > -       bool have_numa;
> > +       bool have_numa = false;
> >          int oldpolicy;
> >          struct bitmask *oldmask = numa_allocate_nodemask();
> >   #endif
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > index f74291fb6..d37b4a59b 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > @@ -1386,9 +1386,9 @@ eal_legacy_hugepage_attach(void)
> >          struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
> >          struct hugepage_file *hp = NULL;
> >          unsigned int num_hp = 0;
> > -       unsigned int i;
> > +       unsigned int i = 0;
> >          int ms_idx, msl_idx;
> > -       unsigned int cur_seg, max_seg;
> > +       unsigned int cur_seg, max_seg = 0;
> >          off_t size = 0;
> >          int fd, fd_hugepage = -1;
> >
> >
> >
> > @Hemanth
> > Also, this patchset breaks dpaa/dpaa2 bus drivers (they rely on
> > `rte_eal_get_physmem_layout` that is depricated
> > http://dpdk.org/dev/patchwork/patch/34002/)
> > So, generic arm64 linuxapp build is broken.
>
> Should the deprecation notice have been accompanied with marking that
> function as __rte_deprecated?

Yup, that's the general sequence.
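
For reference, a minimal sketch of that marking (assuming the usual
__rte_deprecated attribute from rte_common.h, applied at the function's
public declaration):

#include <rte_common.h>
#include <rte_memory.h>

/* Re-declare the legacy API with the deprecation attribute so that any
 * remaining callers (e.g. the dpaa/dpaa2 buses) get a compile-time warning
 * until they move off it. */
__rte_deprecated
const struct rte_memseg *rte_eal_get_physmem_layout(void);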

>
> >
> > Regards,
> > Pavan.
> >
> > On Wed, Mar 07, 2018 at 04:56:28PM +0000, Anatoly Burakov wrote:
> > > This patchset introduces dynamic memory allocation for DPDK (aka memory
> > > hotplug). Based upon RFC submitted in December [1].
> > >
> > > Dependencies (to be applied in specified order):
> > > - IPC bugfixes patchset [2]
> > > - IPC improvements patchset [3]
> > > - IPC asynchronous request API patch [4]
> > > - Function to return number of sockets [5]
> > >
> > <snip>
> >   --
> > > 2.7.4
> >
>
>
> --
> Thanks,
> Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-08 10:53       ` Nélio Laranjeiro
@ 2018-03-08 12:12         ` Burakov, Anatoly
  2018-03-08 12:14           ` Bruce Richardson
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 12:12 UTC (permalink / raw)
  To: Nélio Laranjeiro, thomas
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, yskoh, pepperjo,
	jerin.jacob, hemant.agrawal, olivier.matz

On 08-Mar-18 10:53 AM, Nélio Laranjeiro wrote:
> On Thu, Mar 08, 2018 at 09:37:27AM +0000, Burakov, Anatoly wrote:
>> On 07-Mar-18 4:05 PM, Burakov, Anatoly wrote:
>>> On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
>>>> Hi Anatoly,
>>>>
>>>> I am trying to run some test with this series, but it seems to be based
>>>> on some other commits of yours. I have already identified the following
>>>> one [1] it seems I am missing some others.
>>>>
>>>> It is possible to have a list of commits to apply on the current master
>>>> branch [2] before this series?
>>>>
>>>> Thanks,
>>>>
>>>> [1] https://dpdk.org/patch/35043
>>>> [2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c
>>>>
>>>>
>>>
>>> Hi Nelio,
>>>
>>> Yes, my apologies. I'm aware of the apply issues. The issue is due to me
>>> missing a rebase on one of the dependent patchsets. I'm preparing a v2
>>> that will fix the issue (pending some internal processes).
>>>
>>
>> Hi Nelio,
>>
>> The v2 is now up, with corrected rebase. You can see the list of dependent
>> patches in the cover letter [1]. Once again apologies for incorrect rebase
>> in v1. Looking forward to your feedback!
>>
>> [1] http://dpdk.org/ml/archives/dev/2018-March/092070.html
>   
> Hi Anatoly,
> 
> First feedbacks, I have some issue when compiling it on desktop/server
> machine with clang and GCC, maybe due some different configuration items
> depending on the machine compile it.
> 
> Clang error
> -----------
> 
>    dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
>           if (!hi) {
>               ^~~
>     CC eal_lcore.o
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
>           if (have_numa)
>               ^~~~~~~~~
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:2: note: remove the 'if' if its condition is always false
>           if (!hi) {
>           ^~~~~~~~~~
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
>           if (internal_config.legacy_mem)
>               ^~~~~~~~~~~~~~~~~~~~~~~~~~
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
>           if (have_numa)
>               ^~~~~~~~~
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:2: note: remove the 'if' if its condition is always false
>           if (internal_config.legacy_mem)
>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:573:16: note: initialize the variable 'have_numa' to silence this warning
>           bool have_numa;
>                         ^
>                          = false
> 
> GCC errors
> ----------
> 
>   /root/dpdk/lib/librte_eal/common/eal_common_memzone.c: In function ‘rte_memzone_free’:
>   /root/dpdk/lib/librte_eal/common/eal_common_memzone.c:355:2: error: ‘addr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>     rte_free(addr);
>     ^~~~~~~~~~~~~~
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c: In function ‘eal_memalloc_alloc_page_bulk’:
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:5: error: ‘have_numa’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>     if (have_numa)
>        ^
>   cc1: all warnings being treated as errors
>   /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_common_memzone.o' failed
>   make[5]: *** [eal_common_memzone.o] Error 1
>   make[5]: *** Waiting for unfinished jobs....
>   cc1: all warnings being treated as errors
>   /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_memalloc.o' failed
>   make[5]: *** [eal_memalloc.o] Error 1
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c: In function ‘rte_eal_hugepage_attach’:
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1556:7: error: ‘max_seg’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>       if (cur_seg >= max_seg)
>          ^
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1391:24: note: ‘max_seg’ was declared here
>     unsigned int cur_seg, max_seg;
>                           ^~~~~~~
>   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1389:15: error: ‘i’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
>     unsigned int i;
>                  ^
> 
> It worse to ask Thomas to have a dedicated repository/branch on DPDK,
> otherwise it will be a nightmare for anyone who want to test if we need
> each time to apply 54 patches.
> 
> Can you see it with him?
> 
> Thanks,
> 

Hi Nelio,

Thanks for your feedback.

We're working on merging the dependencies into the main tree. I've spoken 
with Thomas about this, and he suggested creating a GitHub repo for 
this patchset, so I'll be looking into that as well.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-08 12:12         ` Burakov, Anatoly
@ 2018-03-08 12:14           ` Bruce Richardson
  0 siblings, 0 replies; 471+ messages in thread
From: Bruce Richardson @ 2018-03-08 12:14 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Nélio Laranjeiro, thomas, dev, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On Thu, Mar 08, 2018 at 12:12:15PM +0000, Burakov, Anatoly wrote:
> On 08-Mar-18 10:53 AM, Nélio Laranjeiro wrote:
> > On Thu, Mar 08, 2018 at 09:37:27AM +0000, Burakov, Anatoly wrote:
> > > On 07-Mar-18 4:05 PM, Burakov, Anatoly wrote:
> > > > On 07-Mar-18 3:27 PM, Nélio Laranjeiro wrote:
> > > > > Hi Anatoly,
> > > > > 
> > > > > I am trying to run some test with this series, but it seems to be based
> > > > > on some other commits of yours. I have already identified the following
> > > > > one [1] it seems I am missing some others.
> > > > > 
> > > > > It is possible to have a list of commits to apply on the current master
> > > > > branch [2] before this series?
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > [1] https://dpdk.org/patch/35043
> > > > > [2] https://dpdk.org/browse/dpdk/commit/?id=c06ddf9698e0c2a9653cfa971f9ddc205065662c
> > > > > 
> > > > > 
> > > > 
> > > > Hi Nelio,
> > > > 
> > > > Yes, my apologies. I'm aware of the apply issues. The issue is due to me
> > > > missing a rebase on one of the dependent patchsets. I'm preparing a v2
> > > > that will fix the issue (pending some internal processes).
> > > > 
> > > 
> > > Hi Nelio,
> > > 
> > > The v2 is now up, with corrected rebase. You can see the list of dependent
> > > patches in the cover letter [1]. Once again apologies for incorrect rebase
> > > in v1. Looking forward to your feedback!
> > > 
> > > [1] http://dpdk.org/ml/archives/dev/2018-March/092070.html
> > Hi Anatoly,
> > 
> > First feedbacks, I have some issue when compiling it on desktop/server
> > machine with clang and GCC, maybe due some different configuration items
> > depending on the machine compile it.
> > 
> > Clang error
> > -----------
> > 
> >    dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
> >           if (!hi) {
> >               ^~~
> >     CC eal_lcore.o
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
> >           if (have_numa)
> >               ^~~~~~~~~
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:590:2: note: remove the 'if' if its condition is always false
> >           if (!hi) {
> >           ^~~~~~~~~~
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:6: error: variable 'have_numa' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
> >           if (internal_config.legacy_mem)
> >               ^~~~~~~~~~~~~~~~~~~~~~~~~~
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:6: note: uninitialized use occurs here
> >           if (have_numa)
> >               ^~~~~~~~~
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:580:2: note: remove the 'if' if its condition is always false
> >           if (internal_config.legacy_mem)
> >           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >   dpdk.org/lib/librte_eal/linuxapp/eal/eal_memalloc.c:573:16: note: initialize the variable 'have_numa' to silence this warning
> >           bool have_numa;
> >                         ^
> >                          = false
> > 
> > GCC errors
> > ----------
> > 
> >   /root/dpdk/lib/librte_eal/common/eal_common_memzone.c: In function ‘rte_memzone_free’:
> >   /root/dpdk/lib/librte_eal/common/eal_common_memzone.c:355:2: error: ‘addr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> >     rte_free(addr);
> >     ^~~~~~~~~~~~~~
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c: In function ‘eal_memalloc_alloc_page_bulk’:
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memalloc.c:693:5: error: ‘have_numa’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> >     if (have_numa)
> >        ^
> >   cc1: all warnings being treated as errors
> >   /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_common_memzone.o' failed
> >   make[5]: *** [eal_common_memzone.o] Error 1
> >   make[5]: *** Waiting for unfinished jobs....
> >   cc1: all warnings being treated as errors
> >   /root/dpdk/mk/internal/rte.compile-pre.mk:114: recipe for target 'eal_memalloc.o' failed
> >   make[5]: *** [eal_memalloc.o] Error 1
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c: In function ‘rte_eal_hugepage_attach’:
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1556:7: error: ‘max_seg’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> >       if (cur_seg >= max_seg)
> >          ^
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1391:24: note: ‘max_seg’ was declared here
> >     unsigned int cur_seg, max_seg;
> >                           ^~~~~~~
> >   /root/dpdk/lib/librte_eal/linuxapp/eal/eal_memory.c:1389:15: error: ‘i’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
> >     unsigned int i;
> >                  ^
> > 
> > It worse to ask Thomas to have a dedicated repository/branch on DPDK,
> > otherwise it will be a nightmare for anyone who want to test if we need
> > each time to apply 54 patches.
> > 
> > Can you see it with him?
> > 
> > Thanks,
> > 
> 
> Hi Nelio,
> 
> Thanks for your feedback.
> 
> We're working on merging dependencies into the main tree. I've spoken with
> Thomas about this, and he suggested to create a GitHub repo for this
> patchset, so i'll be looking into this as well.
> 
I think some of the dependent patches are already acked, so perhaps they
could be applied to the main tree soon? That would a) help test them and
b) make life easier for everyone testing this big memory rework set.
I still think we have a big issue with all patches being applied in a
"big bang" near the end of the release cycle.

/Bruce

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 11:13       ` Pavan Nikhilesh
@ 2018-03-08 13:36         ` Pavan Nikhilesh
  2018-03-08 14:36           ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Pavan Nikhilesh @ 2018-03-08 13:36 UTC (permalink / raw)
  To: Burakov, Anatoly, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

Hi Anatoly,

We are currently facing issues running testpmd on the thunderx platform.
The issue seems to be with VFIO:

EAL: Detected 24 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: No free hugepages reported in hugepages-2048kB
EAL: Multi-process socket /var/run/.rte_unix
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL:   VFIO support not initialized

<snip>

EAL:   probe driver: 177d:a053 octeontx_fpavf
EAL: PCI device 0001:01:00.1 on NUMA socket 0
EAL:   probe driver: 177d:a034 net_thunderx
EAL:   using IOMMU type 1 (Type 1)
EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
EAL: Requested device 0001:01:00.1 cannot be used
EAL: PCI device 0001:01:00.2 on NUMA socket 0
<snip>
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL:   VFIO support not initialized
EAL:   VFIO support not initialized
EAL:   VFIO support not initialized
Done


This is because rte_service_init() calls rte_calloc() before
rte_bus_probe(), and vfio_dma_mem_map() fails because the IOMMU type has
not been set yet.

Call stack:
gdb) bt
#0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152, len=536870912, do_map=1) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
#1  0x00000000004fd974 in rte_vfio_dma_map (vaddr=281439006359552, iova=11274289152, len=536870912) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
#2  0x00000000004fbe78 in vfio_mem_event_callback (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
#3  0x00000000005070ac in eal_memalloc_notify (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912) at /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
#4  0x0000000000515c98 in try_expand_heap_primary (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
#5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
#6  0x00000000005163a0 in alloc_more_mem_on_socket (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
#7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90 "rte_services", size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
#8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90 "rte_services", size=8192, socket_arg=-1, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
#9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90 "rte_services", size=8192, align=128, socket_arg=-1) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
#10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90 "rte_services", size=8192, align=128, socket=-1) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
#11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90 "rte_services", size=8192, align=128) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
#12 0x0000000000513c90 in rte_calloc (type=0x85bf90 "rte_services", num=64, size=128, align=128) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
#13 0x0000000000518cec in rte_service_init () at /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
#14 0x00000000004f55f4 in rte_eal_init (argc=3, argv=0xfffffffff488) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
#15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at /root/clean/dpdk/app/test-pmd/testpmd.c:2483
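
One way to picture it (a rough, compile-only sketch with a hypothetical flag,
not the actual eal_vfio.c code): until rte_bus_probe() has selected an IOMMU
type, the callback has nothing it can map against, so it would need a guard
roughly like this:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; in real code this would be derived from whether the
 * VFIO container has had an IOMMU type set during device probe. */
static bool vfio_iommu_type_selected;

static void
vfio_mem_event_cb_sketch(const void *addr, size_t len)
{
        if (!vfio_iommu_type_selected) {
                /* rte_bus_probe() has not picked an IOMMU type yet, so there
                 * is nothing to map the new pages against -- defer or skip. */
                return;
        }
        /* ... otherwise compute the IOVA of each new page and call
         * rte_vfio_dma_map(vaddr, iova, len), as in the backtrace above ... */
        (void)addr;
        (void)len;
}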


Also, I have tried running with --legacy-mem, but I'm stuck in the
`pci_find_max_end_va` loop because `rte_fbarray_find_next_used` always returns
0.

HugePages_Total:      15
HugePages_Free:       11
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:     524288 kB

Call Stack:
(gdb) bt
#0  find_next (arr=0xffffb7fb009c, start=0, used=true) at /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
#1  0x00000000005132a8 in rte_fbarray_find_next_used (arr=0xffffb7fb009c, start=0) at /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
#2  0x000000000052d030 in pci_find_max_end_va () at /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
#3  0x0000000000530ab8 in pci_vfio_map_resource_primary (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
#4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
#5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
#6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20 <rte_nicvf_pmd>, dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
#7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
#8  0x0000000000531f68 in rte_pci_probe () at /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
#9  0x000000000050a140 in rte_bus_probe () at /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
#10 0x00000000004f55f4 in rte_eal_init (argc=1, argv=0xfffffffff498) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
#11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at /root/clean/dpdk/app/test-pmd/testpmd.c:2483

Am I missing something here?


Thanks,
Pavan.

On Thu, Mar 08, 2018 at 04:43:38PM +0530, Pavan Nikhilesh wrote:
> On Thu, Mar 08, 2018 at 10:46:46AM +0000, Burakov, Anatoly wrote:
> > On 08-Mar-18 10:18 AM, Pavan Nikhilesh wrote:
> > > Hi Anatoly,
> > >
> > > I am trying to verify this patchset and have encountered few issues.
> > >
> > > Few -Werror=maybe-uninitialized errors in eal_memalloc.c/eal_memory.c/
> > > eal_common_memzone.c files.
> >
> > Thanks for the heads up, i'll fix those in the next revision. Out of
> > curiousity, which compiler version are you using?
>
> I'm using gcc 5.3.0.
>
> >
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> > > index a7cfdaf03..ad4413507 100644
> > > --- a/lib/librte_eal/common/eal_common_memzone.c
> > > +++ b/lib/librte_eal/common/eal_common_memzone.c
> > > @@ -321,7 +321,7 @@ rte_memzone_free(const struct rte_memzone *mz)
> > >          struct rte_fbarray *arr;
> > >          struct rte_memzone *found_mz;
> > >          int ret = 0;
> > > -       void *addr;
> > > +       void *addr = NULL;
> > >          unsigned idx;
> > >
> > >          if (mz == NULL)
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > > index 1008faed6..32b0d5133 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> > > @@ -570,7 +570,7 @@ eal_memalloc_alloc_page_bulk(struct rte_memseg **ms, int n,
> > >          unsigned int msl_idx;
> > >          int cur_idx, start_idx, end_idx, i, j, ret = -1;
> > >   #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
> > > -       bool have_numa;
> > > +       bool have_numa = false;
> > >          int oldpolicy;
> > >          struct bitmask *oldmask = numa_allocate_nodemask();
> > >   #endif
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > > index f74291fb6..d37b4a59b 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> > > @@ -1386,9 +1386,9 @@ eal_legacy_hugepage_attach(void)
> > >          struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
> > >          struct hugepage_file *hp = NULL;
> > >          unsigned int num_hp = 0;
> > > -       unsigned int i;
> > > +       unsigned int i = 0;
> > >          int ms_idx, msl_idx;
> > > -       unsigned int cur_seg, max_seg;
> > > +       unsigned int cur_seg, max_seg = 0;
> > >          off_t size = 0;
> > >          int fd, fd_hugepage = -1;
> > >
> > >
> > >
> > > @Hemanth
> > > Also, this patchset breaks dpaa/dpaa2 bus drivers (they rely on
> > > `rte_eal_get_physmem_layout` that is depricated
> > > http://dpdk.org/dev/patchwork/patch/34002/)
> > > So, generic arm64 linuxapp build is broken.
> >
> > Should the deprecation notice have been accompanied with marking that
> > function as __rte_deprecated?
>
> Yup that's the general sequence.
>
> >
> > >
> > > Regards,
> > > Pavan.
> > >
> > > On Wed, Mar 07, 2018 at 04:56:28PM +0000, Anatoly Burakov wrote:
> > > > This patchset introduces dynamic memory allocation for DPDK (aka memory
> > > > hotplug). Based upon RFC submitted in December [1].
> > > >
> > > > Dependencies (to be applied in specified order):
> > > > - IPC bugfixes patchset [2]
> > > > - IPC improvements patchset [3]
> > > > - IPC asynchronous request API patch [4]
> > > > - Function to return number of sockets [5]
> > > >
> > > <snip>
> > >   --
> > > > 2.7.4
> > >
> >
> >
> > --
> > Thanks,
> > Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 13:36         ` Pavan Nikhilesh
@ 2018-03-08 14:36           ` Burakov, Anatoly
  2018-03-08 20:11             ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 14:36 UTC (permalink / raw)
  To: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
> Hi Anatoly,
> 
> We are currently facing issues with running testpmd on thunderx platform.
> The issue seems to be with vfio
> 
> EAL: Detected 24 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: No free hugepages reported in hugepages-2048kB
> EAL: Multi-process socket /var/run/.rte_unix
> EAL: Probing VFIO support...
> EAL: VFIO support initialized
> EAL:   VFIO support not initialized
> 
> <snip>
> 
> EAL:   probe driver: 177d:a053 octeontx_fpavf
> EAL: PCI device 0001:01:00.1 on NUMA socket 0
> EAL:   probe driver: 177d:a034 net_thunderx
> EAL:   using IOMMU type 1 (Type 1)
> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
> EAL: Requested device 0001:01:00.1 cannot be used
> EAL: PCI device 0001:01:00.2 on NUMA socket 0
> <snip>
> testpmd: No probed ethernet devices
> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> EAL:   VFIO support not initialized
> EAL:   VFIO support not initialized
> EAL:   VFIO support not initialized
> Done
> 
> 
> This is because rte_service_init() calls rte_calloc() before
> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is not set.
> 
> Call stack:
> gdb) bt
> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152, len=536870912, do_map=1) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
> #1  0x00000000004fd974 in rte_vfio_dma_map (vaddr=281439006359552, iova=11274289152, len=536870912) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
> #2  0x00000000004fbe78 in vfio_mem_event_callback (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
> #3  0x00000000005070ac in eal_memalloc_notify (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912) at /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
> #4  0x0000000000515c98 in try_expand_heap_primary (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
> #6  0x00000000005163a0 in alloc_more_mem_on_socket (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90 "rte_services", size=8192, socket=0, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90 "rte_services", size=8192, socket_arg=-1, flags=0, align=128, bound=0, contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90 "rte_services", size=8192, align=128, socket_arg=-1) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90 "rte_services", size=8192, align=128, socket=-1) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90 "rte_services", size=8192, align=128) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90 "rte_services", num=64, size=128, align=128) at /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
> #13 0x0000000000518cec in rte_service_init () at /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
> #14 0x00000000004f55f4 in rte_eal_init (argc=3, argv=0xfffffffff488) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> 
> 
> Also, I have tried running with --legacy-mem but I'm stuck in
> `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used` always return
> 0. >
> HugePages_Total:      15
> HugePages_Free:       11
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:     524288 kB
> 
> Call Stack:
> (gdb) bt
> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
> #1  0x00000000005132a8 in rte_fbarray_find_next_used (arr=0xffffb7fb009c, start=0) at /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
> #2  0x000000000052d030 in pci_find_max_end_va () at /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20 <rte_nicvf_pmd>, dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
> #8  0x0000000000531f68 in rte_pci_probe () at /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
> #9  0x000000000050a140 in rte_bus_probe () at /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
> #10 0x00000000004f55f4 in rte_eal_init (argc=1, argv=0xfffffffff498) at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> 
> Am I missing something here?

I'll look into those, thanks!

Btw, I've now set up a GitHub repo with the patchset applied:

https://github.com/anatolyburakov/dpdk

I will be pushing quick fixes there before spinning new revisions, so we 
can discover and fix bugs more rapidly. I'll fix the compile issues reported 
earlier, then I'll take a look at your issues. The latter one seems like 
a typo; the former is probably a matter of moving things around a bit.
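
For reference, a hedged sketch of how a walk over an rte_fbarray is normally
expected to advance (hypothetical helper, not the actual pci_find_max_end_va()
code): if the next lookup keeps starting from 0 instead of idx + 1,
rte_fbarray_find_next_used() will keep returning the same index and the loop
spins forever, which would match the behaviour reported above.

#include <rte_fbarray.h>
#include <rte_memory.h>

static void
walk_used_entries_sketch(struct rte_fbarray *arr)
{
        int idx = rte_fbarray_find_next_used(arr, 0);

        while (idx >= 0) {
                const struct rte_memseg *ms = rte_fbarray_get(arr, idx);

                /* ... inspect ms->addr / ms->len here ... */
                (void)ms;
                idx = rte_fbarray_find_next_used(arr, idx + 1); /* advance */
        }
}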

(also, pull requests welcome if you find it easier to fix things 
yourself and submit patches against my tree!)

Thanks for testing.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (84 preceding siblings ...)
  2018-03-07 16:57 ` [PATCH v2 41/41] net/vmxnet3: " Anatoly Burakov
@ 2018-03-08 14:40 ` Burakov, Anatoly
  2018-03-19 17:30 ` Olivier Matz
  2018-03-21  9:09 ` gowrishankar muthukrishnan
  87 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 14:40 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On 03-Mar-18 1:45 PM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
> 
> Dependencies (to be applied in specified order):
> - IPC bugfixes patchset [2]
> - IPC improvements patchset [3]
> - IPC asynchronous request API patch [4]
> - Function to return number of sockets [5]
> 
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [6]
> - EAL NUMA node count changes [7]
> 
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
> 
> Quick outline of all changes done as part of this patchset:
> 
>   * Malloc heap adjusted to handle holes in address space
>   * Single memseg list replaced by multiple memseg lists
>   * VA space for hugepages is preallocated in advance
>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>   * Added contiguous memory allocation API's for rte_memzone
>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>     with VFIO [8]
>   * Callbacks for registering memory allocations
>   * Multiprocess support done via DPDK IPC introduced in 18.02
> 
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
> 
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
> 
> For v1, the following limitations are present:
> - FreeBSD does not even compile, let alone run
> - No 32-bit support
> - There are some minor quality-of-life improvements planned that aren't
>    ready yet and will be part of v2
> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>    with secondary processes is not tested; work is ongoing to validate VFIO
>    for all use cases
> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>    IOMMU mode - help from sPAPR maintainers requested
> 
> Nevertheless, this patchset should be testable under 64-bit Linux, and
> should work for all use cases bar those mentioned above.
> 
> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> [6] http://dpdk.org/dev/patchwork/patch/34002/
> [7] http://dpdk.org/dev/patchwork/patch/33853/
> [8] http://dpdk.org/dev/patchwork/patch/24484/
> 

For those wanting to test this patchset, there is now a GitHub tree with 
all of the dependent patches applied:

https://github.com/anatolyburakov/dpdk

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 14:36           ` Burakov, Anatoly
@ 2018-03-08 20:11             ` Burakov, Anatoly
  2018-03-08 20:33               ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 20:11 UTC (permalink / raw)
  To: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
> On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
>> Hi Anatoly,
>>
>> We are currently facing issues with running testpmd on thunderx platform.
>> The issue seems to be with vfio
>>
>> EAL: Detected 24 lcore(s)
>> EAL: Detected 1 NUMA nodes
>> EAL: No free hugepages reported in hugepages-2048kB
>> EAL: Multi-process socket /var/run/.rte_unix
>> EAL: Probing VFIO support...
>> EAL: VFIO support initialized
>> EAL:   VFIO support not initialized
>>
>> <snip>
>>
>> EAL:   probe driver: 177d:a053 octeontx_fpavf
>> EAL: PCI device 0001:01:00.1 on NUMA socket 0
>> EAL:   probe driver: 177d:a034 net_thunderx
>> EAL:   using IOMMU type 1 (Type 1)
>> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
>> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
>> EAL: Requested device 0001:01:00.1 cannot be used
>> EAL: PCI device 0001:01:00.2 on NUMA socket 0
>> <snip>
>> testpmd: No probed ethernet devices
>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456, 
>> size=2176, socket=0
>> testpmd: preferred mempool ops selected: ring_mp_mc
>> EAL:   VFIO support not initialized
>> EAL:   VFIO support not initialized
>> EAL:   VFIO support not initialized
>> Done
>>
>>
>> This is because rte_service_init() calls rte_calloc() before
>> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is not set.
>>
>> Call stack:
>> gdb) bt
>> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152, 
>> len=536870912, do_map=1) at 
>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
>> #1  0x00000000004fd974 in rte_vfio_dma_map (vaddr=281439006359552, 
>> iova=11274289152, len=536870912) at 
>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
>> #2  0x00000000004fbe78 in vfio_mem_event_callback 
>> (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912) at 
>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
>> #3  0x00000000005070ac in eal_memalloc_notify 
>> (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912) at 
>> /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
>> #4  0x0000000000515c98 in try_expand_heap_primary 
>> (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, 
>> flags=0, align=128, bound=0, contig=false) at 
>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
>> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c, 
>> pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, bound=0, 
>> contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
>> #6  0x00000000005163a0 in alloc_more_mem_on_socket 
>> (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128, 
>> bound=0, contig=false) at 
>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
>> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90 
>> "rte_services", size=8192, socket=0, flags=0, align=128, bound=0, 
>> contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
>> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90 
>> "rte_services", size=8192, socket_arg=-1, flags=0, align=128, bound=0, 
>> contig=false) at /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
>> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90 
>> "rte_services", size=8192, align=128, socket_arg=-1) at 
>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
>> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90 
>> "rte_services", size=8192, align=128, socket=-1) at 
>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
>> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90 "rte_services", 
>> size=8192, align=128) at 
>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
>> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90 "rte_services", 
>> num=64, size=128, align=128) at 
>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
>> #13 0x0000000000518cec in rte_service_init () at 
>> /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
>> #14 0x00000000004f55f4 in rte_eal_init (argc=3, argv=0xfffffffff488) 
>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
>> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at 
>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>
>>
>> Also, I have tried running with --legacy-mem but I'm stuck in
>> `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used` 
>> always return
>> 0. >
>> HugePages_Total:      15
>> HugePages_Free:       11
>> HugePages_Rsvd:        0
>> HugePages_Surp:        0
>> Hugepagesize:     524288 kB
>>
>> Call Stack:
>> (gdb) bt
>> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at 
>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
>> #1  0x00000000005132a8 in rte_fbarray_find_next_used 
>> (arr=0xffffb7fb009c, start=0) at 
>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
>> #2  0x000000000052d030 in pci_find_max_end_va () at 
>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
>> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary (dev=0xeae700) 
>> at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
>> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700) at 
>> /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
>> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at 
>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
>> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20 
>> <rte_nicvf_pmd>, dev=0xeae700) at 
>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
>> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700) at 
>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
>> #8  0x0000000000531f68 in rte_pci_probe () at 
>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
>> #9  0x000000000050a140 in rte_bus_probe () at 
>> /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
>> #10 0x00000000004f55f4 in rte_eal_init (argc=1, argv=0xfffffffff498) 
>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
>> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at 
>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>
>> Am I missing something here?
> 
> I'll look into those, thanks!
> 
> Btw, i've now set up a github repo with the patchset applied:
> 
> https://github.com/anatolyburakov/dpdk
> 
> I will be pushing quick fixes there before spinning new revisions, so we 
> can discover and fix bugs more rapidly. I'll fix compile issues reported 
> earlier, then i'll take a look at your issues. The latter one seems like 
> a typo, the former is probably a matter of moving things around a bit.
> 
> (also, pull requests welcome if you find it easier to fix things 
> yourself and submit patches against my tree!)
> 
> Thanks for testing.
> 

I've looked into the failures.

The VFIO one is not actually a failure. It only prints out errors 
because rte_malloc is called before VFIO is initialized. However, once 
VFIO *is* initialized, all of that memory will be added to VFIO, so 
these error messages are harmless. Regardless, I've added a check to 
see whether init has finished before printing those errors, so they 
won't be printed out any more.
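
In other words, the check is just an early-exit guard; a minimal sketch of 
the idea (the flag and helper names below are placeholders, not the exact 
code pushed to the tree):

#include <stdbool.h>
#include <stddef.h>

static bool vfio_ready; /* assumed flag: set once the VFIO container is set up */

enum mem_event { MEM_EVENT_ALLOC, MEM_EVENT_FREE }; /* stand-in event type */

/* stand-in for the real VFIO_IOMMU_MAP_DMA path */
static int dma_map_one(const void *addr, size_t len)
{
	(void)addr;
	(void)len;
	return 0;
}

static void mem_event_cb(enum mem_event type, const void *addr, size_t len)
{
	/* before VFIO init has finished, skip the map and stay silent;
	 * the memory is added to VFIO later, once the container is ready */
	if (!vfio_ready)
		return;
	if (type == MEM_EVENT_ALLOC)
		dma_map_one(addr, len);
}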

Second one is a typo on my part that got lost in one of the rebases.

I've pushed fixes for both into the github repo.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 20:11             ` Burakov, Anatoly
@ 2018-03-08 20:33               ` Burakov, Anatoly
  2018-03-09  9:15                 ` Pavan Nikhilesh
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-08 20:33 UTC (permalink / raw)
  To: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
> On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
>> On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
>>> Hi Anatoly,
>>>
>>> We are currently facing issues with running testpmd on thunderx 
>>> platform.
>>> The issue seems to be with vfio
>>>
>>> EAL: Detected 24 lcore(s)
>>> EAL: Detected 1 NUMA nodes
>>> EAL: No free hugepages reported in hugepages-2048kB
>>> EAL: Multi-process socket /var/run/.rte_unix
>>> EAL: Probing VFIO support...
>>> EAL: VFIO support initialized
>>> EAL:   VFIO support not initialized
>>>
>>> <snip>
>>>
>>> EAL:   probe driver: 177d:a053 octeontx_fpavf
>>> EAL: PCI device 0001:01:00.1 on NUMA socket 0
>>> EAL:   probe driver: 177d:a034 net_thunderx
>>> EAL:   using IOMMU type 1 (Type 1)
>>> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
>>> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
>>> EAL: Requested device 0001:01:00.1 cannot be used
>>> EAL: PCI device 0001:01:00.2 on NUMA socket 0
>>> <snip>
>>> testpmd: No probed ethernet devices
>>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456, 
>>> size=2176, socket=0
>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>> EAL:   VFIO support not initialized
>>> EAL:   VFIO support not initialized
>>> EAL:   VFIO support not initialized
>>> Done
>>>
>>>
>>> This is because rte_service_init() calls rte_calloc() before
>>> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is not 
>>> set.
>>>
>>> Call stack:
>>> gdb) bt
>>> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152, 
>>> len=536870912, do_map=1) at 
>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
>>> #1  0x00000000004fd974 in rte_vfio_dma_map (vaddr=281439006359552, 
>>> iova=11274289152, len=536870912) at 
>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
>>> #2  0x00000000004fbe78 in vfio_mem_event_callback 
>>> (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912) at 
>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
>>> #3  0x00000000005070ac in eal_memalloc_notify 
>>> (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912) at 
>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
>>> #4  0x0000000000515c98 in try_expand_heap_primary 
>>> (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0, 
>>> flags=0, align=128, bound=0, contig=false) at 
>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
>>> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c, 
>>> pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128, 
>>> bound=0, contig=false) at 
>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
>>> #6  0x00000000005163a0 in alloc_more_mem_on_socket 
>>> (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128, 
>>> bound=0, contig=false) at 
>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
>>> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90 
>>> "rte_services", size=8192, socket=0, flags=0, align=128, bound=0, 
>>> contig=false) at 
>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
>>> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90 
>>> "rte_services", size=8192, socket_arg=-1, flags=0, align=128, 
>>> bound=0, contig=false) at 
>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
>>> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90 
>>> "rte_services", size=8192, align=128, socket_arg=-1) at 
>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
>>> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90 
>>> "rte_services", size=8192, align=128, socket=-1) at 
>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
>>> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90 "rte_services", 
>>> size=8192, align=128) at 
>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
>>> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90 "rte_services", 
>>> num=64, size=128, align=128) at 
>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
>>> #13 0x0000000000518cec in rte_service_init () at 
>>> /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
>>> #14 0x00000000004f55f4 in rte_eal_init (argc=3, argv=0xfffffffff488) 
>>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
>>> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at 
>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>
>>>
>>> Also, I have tried running with --legacy-mem but I'm stuck in
>>> `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used` 
>>> always return
>>> 0. >
>>> HugePages_Total:      15
>>> HugePages_Free:       11
>>> HugePages_Rsvd:        0
>>> HugePages_Surp:        0
>>> Hugepagesize:     524288 kB
>>>
>>> Call Stack:
>>> (gdb) bt
>>> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at 
>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
>>> #1  0x00000000005132a8 in rte_fbarray_find_next_used 
>>> (arr=0xffffb7fb009c, start=0) at 
>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
>>> #2  0x000000000052d030 in pci_find_max_end_va () at 
>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
>>> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary 
>>> (dev=0xeae700) at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
>>> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700) at 
>>> /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
>>> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at 
>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
>>> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20 
>>> <rte_nicvf_pmd>, dev=0xeae700) at 
>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
>>> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700) at 
>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
>>> #8  0x0000000000531f68 in rte_pci_probe () at 
>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
>>> #9  0x000000000050a140 in rte_bus_probe () at 
>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
>>> #10 0x00000000004f55f4 in rte_eal_init (argc=1, argv=0xfffffffff498) 
>>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
>>> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at 
>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>
>>> Am I missing something here?
>>
>> I'll look into those, thanks!
>>
>> Btw, i've now set up a github repo with the patchset applied:
>>
>> https://github.com/anatolyburakov/dpdk
>>
>> I will be pushing quick fixes there before spinning new revisions, so 
>> we can discover and fix bugs more rapidly. I'll fix compile issues 
>> reported earlier, then i'll take a look at your issues. The latter one 
>> seems like a typo, the former is probably a matter of moving things 
>> around a bit.
>>
>> (also, pull requests welcome if you find it easier to fix things 
>> yourself and submit patches against my tree!)
>>
>> Thanks for testing.
>>
> 
> I've looked into the failures.
> 
> The VFIO one is not actually a failure. It only prints out errors 
> because rte_malloc is called before VFIO is initialized. However, once 
> VFIO *is* initialized, all of that memory would be added to VFIO, so 
> these error messages are harmless. Regardless, i've added a check to see 
> if init is finished before printing out those errors, so they won't be 
> printed out any more.
> 
> Second one is a typo on my part that got lost in one of the rebases.
> 
> I've pushed fixes for both into the github repo.
> 

Although I do wonder where the DMA remapping errors come from. The 
error message says "invalid argument", so that doesn't come from 
rte_service or anything to do with rte_malloc - it means we are not 
providing valid arguments to VFIO. I'm not seeing these errors on my 
system; I'll check on others to be sure.
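
For context, the call that fails there is the VFIO type 1 DMA map ioctl, 
roughly of the following shape (a sketch only - the helper below is 
hypothetical, not the EAL code):

#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* hypothetical helper, just to show where "error 22" (EINVAL) surfaces */
static int type1_map(int container_fd, uint64_t vaddr, uint64_t iova,
		uint64_t len)
{
	struct vfio_iommu_type1_dma_map dma_map;

	memset(&dma_map, 0, sizeof(dma_map));
	dma_map.argsz = sizeof(dma_map);
	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	dma_map.vaddr = vaddr;
	dma_map.iova = iova;
	dma_map.size = len;

	/* the kernel returns EINVAL when the vaddr/iova/len triple is not
	 * acceptable to the IOMMU, which is what the log above reports */
	if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &dma_map) < 0)
		return -errno;
	return 0;
}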

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-08 20:33               ` Burakov, Anatoly
@ 2018-03-09  9:15                 ` Pavan Nikhilesh
  2018-03-09 10:42                   ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Pavan Nikhilesh @ 2018-03-09  9:15 UTC (permalink / raw)
  To: Burakov, Anatoly, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On Thu, Mar 08, 2018 at 08:33:21PM +0000, Burakov, Anatoly wrote:
> On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
> > On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
> > > On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
> > > > Hi Anatoly,
> > > >
> > > > We are currently facing issues with running testpmd on thunderx
> > > > platform.
> > > > The issue seems to be with vfio
> > > >
> > > > EAL: Detected 24 lcore(s)
> > > > EAL: Detected 1 NUMA nodes
> > > > EAL: No free hugepages reported in hugepages-2048kB
> > > > EAL: Multi-process socket /var/run/.rte_unix
> > > > EAL: Probing VFIO support...
> > > > EAL: VFIO support initialized
> > > > EAL:   VFIO support not initialized
> > > >
> > > > <snip>
> > > >
> > > > EAL:   probe driver: 177d:a053 octeontx_fpavf
> > > > EAL: PCI device 0001:01:00.1 on NUMA socket 0
> > > > EAL:   probe driver: 177d:a034 net_thunderx
> > > > EAL:   using IOMMU type 1 (Type 1)
> > > > EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
> > > > EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
> > > > EAL: Requested device 0001:01:00.1 cannot be used
> > > > EAL: PCI device 0001:01:00.2 on NUMA socket 0
> > > > <snip>
> > > > testpmd: No probed ethernet devices
> > > > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456,
> > > > size=2176, socket=0
> > > > testpmd: preferred mempool ops selected: ring_mp_mc
> > > > EAL:   VFIO support not initialized
> > > > EAL:   VFIO support not initialized
> > > > EAL:   VFIO support not initialized
> > > > Done
> > > >
> > > >
> > > > This is because rte_service_init() calls rte_calloc() before
> > > > rte_bus_probe() and vfio_dma_mem_map fails because iommu type is
> > > > not set.
> > > >
> > > > Call stack:
> > > > gdb) bt
> > > > #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152,
> > > > len=536870912, do_map=1) at
> > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
> > > > #1  0x00000000004fd974 in rte_vfio_dma_map
> > > > (vaddr=281439006359552, iova=11274289152, len=536870912) at
> > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
> > > > #2  0x00000000004fbe78 in vfio_mem_event_callback
> > > > (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912)
> > > > at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
> > > > #3  0x00000000005070ac in eal_memalloc_notify
> > > > (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912)
> > > > at
> > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
> > > > #4  0x0000000000515c98 in try_expand_heap_primary
> > > > (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0,
> > > > flags=0, align=128, bound=0, contig=false) at
> > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
> > > > #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c,
> > > > pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128,
> > > > bound=0, contig=false) at
> > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
> > > > #6  0x00000000005163a0 in alloc_more_mem_on_socket
> > > > (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128,
> > > > bound=0, contig=false) at
> > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
> > > > #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90
> > > > "rte_services", size=8192, socket=0, flags=0, align=128,
> > > > bound=0, contig=false) at
> > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
> > > > #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90
> > > > "rte_services", size=8192, socket_arg=-1, flags=0, align=128,
> > > > bound=0, contig=false) at
> > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
> > > > #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90
> > > > "rte_services", size=8192, align=128, socket_arg=-1) at
> > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
> > > > #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90
> > > > "rte_services", size=8192, align=128, socket=-1) at
> > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
> > > > #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90
> > > > "rte_services", size=8192, align=128) at
> > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
> > > > #12 0x0000000000513c90 in rte_calloc (type=0x85bf90
> > > > "rte_services", num=64, size=128, align=128) at
> > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
> > > > #13 0x0000000000518cec in rte_service_init () at
> > > > /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
> > > > #14 0x00000000004f55f4 in rte_eal_init (argc=3,
> > > > argv=0xfffffffff488) at
> > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
> > > > #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at
> > > > /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> > > >
> > > >
> > > > Also, I have tried running with --legacy-mem but I'm stuck in
> > > > `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used`
> > > > always return
> > > > 0. >
> > > > HugePages_Total:      15
> > > > HugePages_Free:       11
> > > > HugePages_Rsvd:        0
> > > > HugePages_Surp:        0
> > > > Hugepagesize:     524288 kB
> > > >
> > > > Call Stack:
> > > > (gdb) bt
> > > > #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at
> > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
> > > > #1  0x00000000005132a8 in rte_fbarray_find_next_used
> > > > (arr=0xffffb7fb009c, start=0) at
> > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
> > > > #2  0x000000000052d030 in pci_find_max_end_va () at
> > > > /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
> > > > #3  0x0000000000530ab8 in pci_vfio_map_resource_primary
> > > > (dev=0xeae700) at
> > > > /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
> > > > #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700)
> > > > at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
> > > > #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at
> > > > /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
> > > > #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20
> > > > <rte_nicvf_pmd>, dev=0xeae700) at
> > > > /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
> > > > #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700)
> > > > at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
> > > > #8  0x0000000000531f68 in rte_pci_probe () at
> > > > /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
> > > > #9  0x000000000050a140 in rte_bus_probe () at
> > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
> > > > #10 0x00000000004f55f4 in rte_eal_init (argc=1,
> > > > argv=0xfffffffff498) at
> > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
> > > > #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at
> > > > /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> > > >
> > > > Am I missing something here?
> > >
> > > I'll look into those, thanks!
> > >
> > > Btw, i've now set up a github repo with the patchset applied:
> > >
> > > https://github.com/anatolyburakov/dpdk
> > >
> > > I will be pushing quick fixes there before spinning new revisions,
> > > so we can discover and fix bugs more rapidly. I'll fix compile
> > > issues reported earlier, then i'll take a look at your issues. The
> > > latter one seems like a typo, the former is probably a matter of
> > > moving things around a bit.
> > >
> > > (also, pull requests welcome if you find it easier to fix things
> > > yourself and submit patches against my tree!)
> > >
> > > Thanks for testing.
> > >
> >
> > I've looked into the failures.
> >
> > The VFIO one is not actually a failure. It only prints out errors
> > because rte_malloc is called before VFIO is initialized. However, once
> > VFIO *is* initialized, all of that memory would be added to VFIO, so
> > these error messages are harmless. Regardless, i've added a check to see
> > if init is finished before printing out those errors, so they won't be
> > printed out any more.
> >
> > Second one is a typo on my part that got lost in one of the rebases.
> >
> > I've pushed fixes for both into the github repo.
> >
>
> Although i do wonder where do the DMA remapping errors come from. The error
> message says "invalid argument", so that doesn't come from rte_service or
> anything to do with rte_malloc - this is us not providing valid arguments to
> VFIO. I'm not seeing these errors on my system. I'll check on others to be
> sure.

I have taken a look at the github tree; the issues with VFIO are gone. However,
compilation issues with dpaa/dpaa2 are still present due to their dependency on
`rte_eal_get_physmem_layout`.

>
> --
> Thanks,
> Anatoly

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-09  9:15                 ` Pavan Nikhilesh
@ 2018-03-09 10:42                   ` Burakov, Anatoly
  2018-03-12 15:58                     ` Nélio Laranjeiro
  2018-03-13  5:17                     ` Shreyansh Jain
  0 siblings, 2 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-09 10:42 UTC (permalink / raw)
  To: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz
  Cc: dev

On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
> On Thu, Mar 08, 2018 at 08:33:21PM +0000, Burakov, Anatoly wrote:
>> On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
>>> On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
>>>> On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
>>>>> Hi Anatoly,
>>>>>
>>>>> We are currently facing issues with running testpmd on thunderx
>>>>> platform.
>>>>> The issue seems to be with vfio
>>>>>
>>>>> EAL: Detected 24 lcore(s)
>>>>> EAL: Detected 1 NUMA nodes
>>>>> EAL: No free hugepages reported in hugepages-2048kB
>>>>> EAL: Multi-process socket /var/run/.rte_unix
>>>>> EAL: Probing VFIO support...
>>>>> EAL: VFIO support initialized
>>>>> EAL:   VFIO support not initialized
>>>>>
>>>>> <snip>
>>>>>
>>>>> EAL:   probe driver: 177d:a053 octeontx_fpavf
>>>>> EAL: PCI device 0001:01:00.1 on NUMA socket 0
>>>>> EAL:   probe driver: 177d:a034 net_thunderx
>>>>> EAL:   using IOMMU type 1 (Type 1)
>>>>> EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
>>>>> EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
>>>>> EAL: Requested device 0001:01:00.1 cannot be used
>>>>> EAL: PCI device 0001:01:00.2 on NUMA socket 0
>>>>> <snip>
>>>>> testpmd: No probed ethernet devices
>>>>> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456,
>>>>> size=2176, socket=0
>>>>> testpmd: preferred mempool ops selected: ring_mp_mc
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> EAL:   VFIO support not initialized
>>>>> Done
>>>>>
>>>>>
>>>>> This is because rte_service_init() calls rte_calloc() before
>>>>> rte_bus_probe() and vfio_dma_mem_map fails because iommu type is
>>>>> not set.
>>>>>
>>>>> Call stack:
>>>>> gdb) bt
>>>>> #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152,
>>>>> len=536870912, do_map=1) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
>>>>> #1  0x00000000004fd974 in rte_vfio_dma_map
>>>>> (vaddr=281439006359552, iova=11274289152, len=536870912) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
>>>>> #2  0x00000000004fbe78 in vfio_mem_event_callback
>>>>> (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912)
>>>>> at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
>>>>> #3  0x00000000005070ac in eal_memalloc_notify
>>>>> (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912)
>>>>> at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
>>>>> #4  0x0000000000515c98 in try_expand_heap_primary
>>>>> (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0,
>>>>> flags=0, align=128, bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
>>>>> #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c,
>>>>> pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
>>>>> #6  0x00000000005163a0 in alloc_more_mem_on_socket
>>>>> (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
>>>>> #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90
>>>>> "rte_services", size=8192, socket=0, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
>>>>> #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90
>>>>> "rte_services", size=8192, socket_arg=-1, flags=0, align=128,
>>>>> bound=0, contig=false) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
>>>>> #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket_arg=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
>>>>> #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90
>>>>> "rte_services", size=8192, align=128, socket=-1) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
>>>>> #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90
>>>>> "rte_services", size=8192, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
>>>>> #12 0x0000000000513c90 in rte_calloc (type=0x85bf90
>>>>> "rte_services", num=64, size=128, align=128) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
>>>>> #13 0x0000000000518cec in rte_service_init () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
>>>>> #14 0x00000000004f55f4 in rte_eal_init (argc=3,
>>>>> argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
>>>>> #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>>
>>>>> Also, I have tried running with --legacy-mem but I'm stuck in
>>>>> `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used`
>>>>> always return
>>>>> 0. >
>>>>> HugePages_Total:      15
>>>>> HugePages_Free:       11
>>>>> HugePages_Rsvd:        0
>>>>> HugePages_Surp:        0
>>>>> Hugepagesize:     524288 kB
>>>>>
>>>>> Call Stack:
>>>>> (gdb) bt
>>>>> #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
>>>>> #1  0x00000000005132a8 in rte_fbarray_find_next_used
>>>>> (arr=0xffffb7fb009c, start=0) at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
>>>>> #2  0x000000000052d030 in pci_find_max_end_va () at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
>>>>> #3  0x0000000000530ab8 in pci_vfio_map_resource_primary
>>>>> (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
>>>>> #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
>>>>> #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
>>>>> #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20
>>>>> <rte_nicvf_pmd>, dev=0xeae700) at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
>>>>> #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700)
>>>>> at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
>>>>> #8  0x0000000000531f68 in rte_pci_probe () at
>>>>> /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
>>>>> #9  0x000000000050a140 in rte_bus_probe () at
>>>>> /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
>>>>> #10 0x00000000004f55f4 in rte_eal_init (argc=1,
>>>>> argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
>>>>> #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at
>>>>> /root/clean/dpdk/app/test-pmd/testpmd.c:2483
>>>>>
>>>>> Am I missing something here?
>>>>
>>>> I'll look into those, thanks!
>>>>
>>>> Btw, i've now set up a github repo with the patchset applied:
>>>>
>>>> https://github.com/anatolyburakov/dpdk
>>>>
>>>> I will be pushing quick fixes there before spinning new revisions,
>>>> so we can discover and fix bugs more rapidly. I'll fix compile
>>>> issues reported earlier, then i'll take a look at your issues. The
>>>> latter one seems like a typo, the former is probably a matter of
>>>> moving things around a bit.
>>>>
>>>> (also, pull requests welcome if you find it easier to fix things
>>>> yourself and submit patches against my tree!)
>>>>
>>>> Thanks for testing.
>>>>
>>>
>>> I've looked into the failures.
>>>
>>> The VFIO one is not actually a failure. It only prints out errors
>>> because rte_malloc is called before VFIO is initialized. However, once
>>> VFIO *is* initialized, all of that memory would be added to VFIO, so
>>> these error messages are harmless. Regardless, i've added a check to see
>>> if init is finished before printing out those errors, so they won't be
>>> printed out any more.
>>>
>>> Second one is a typo on my part that got lost in one of the rebases.
>>>
>>> I've pushed fixes for both into the github repo.
>>>
>>
>> Although i do wonder where do the DMA remapping errors come from. The error
>> message says "invalid argument", so that doesn't come from rte_service or
>> anything to do with rte_malloc - this is us not providing valid arguments to
>> VFIO. I'm not seeing these errors on my system. I'll check on others to be
>> sure.
> 
> I have taken a look at the github tree the issues with VFIO are gone, Although
> compilation issues with dpaa/dpaa2 are still present due to their dependency on
> `rte_eal_get_physmem_layout`.

I've fixed the dpaa compile issue and pushed it to github. I've tried to 
keep the semantics the same as before, but I can't compile-test (let 
alone run-test) the changes, as I don't have access to a system with a 
dpaa bus.

Also, you might want to know that the dpaa bus driver references 
RTE_LIBRTE_DPAA_MAX_CRYPTODEV, which is only defined in 
config/common_armv8a_linuxapp and is not present in the base config. Not 
sure if that's an issue.
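
If it does turn out to matter, one way to make such a reference safe on 
configs that don't define it is a guarded default - purely an illustration 
of the idea, with a placeholder value rather than what the driver uses:

#ifndef RTE_LIBRTE_DPAA_MAX_CRYPTODEV
#define RTE_LIBRTE_DPAA_MAX_CRYPTODEV 4 /* placeholder value */
#endif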

> 
>>
>> --
>> Thanks,
>> Anatoly
> 
> Thanks,
> Pavan
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-09 10:42                   ` Burakov, Anatoly
@ 2018-03-12 15:58                     ` Nélio Laranjeiro
  2018-03-13  5:17                     ` Shreyansh Jain
  1 sibling, 0 replies; 471+ messages in thread
From: Nélio Laranjeiro @ 2018-03-12 15:58 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz, dev

On Fri, Mar 09, 2018 at 10:42:03AM +0000, Burakov, Anatoly wrote:
> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
> > On Thu, Mar 08, 2018 at 08:33:21PM +0000, Burakov, Anatoly wrote:
> > > On 08-Mar-18 8:11 PM, Burakov, Anatoly wrote:
> > > > On 08-Mar-18 2:36 PM, Burakov, Anatoly wrote:
> > > > > On 08-Mar-18 1:36 PM, Pavan Nikhilesh wrote:
> > > > > > Hi Anatoly,
> > > > > > 
> > > > > > We are currently facing issues with running testpmd on thunderx
> > > > > > platform.
> > > > > > The issue seems to be with vfio
> > > > > > 
> > > > > > EAL: Detected 24 lcore(s)
> > > > > > EAL: Detected 1 NUMA nodes
> > > > > > EAL: No free hugepages reported in hugepages-2048kB
> > > > > > EAL: Multi-process socket /var/run/.rte_unix
> > > > > > EAL: Probing VFIO support...
> > > > > > EAL: VFIO support initialized
> > > > > > EAL:   VFIO support not initialized
> > > > > > 
> > > > > > <snip>
> > > > > > 
> > > > > > EAL:   probe driver: 177d:a053 octeontx_fpavf
> > > > > > EAL: PCI device 0001:01:00.1 on NUMA socket 0
> > > > > > EAL:   probe driver: 177d:a034 net_thunderx
> > > > > > EAL:   using IOMMU type 1 (Type 1)
> > > > > > EAL:   cannot set up DMA remapping, error 22 (Invalid argument)
> > > > > > EAL:   0001:01:00.1 DMA remapping failed, error 22 (Invalid argument)
> > > > > > EAL: Requested device 0001:01:00.1 cannot be used
> > > > > > EAL: PCI device 0001:01:00.2 on NUMA socket 0
> > > > > > <snip>
> > > > > > testpmd: No probed ethernet devices
> > > > > > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=251456,
> > > > > > size=2176, socket=0
> > > > > > testpmd: preferred mempool ops selected: ring_mp_mc
> > > > > > EAL:   VFIO support not initialized
> > > > > > EAL:   VFIO support not initialized
> > > > > > EAL:   VFIO support not initialized
> > > > > > Done
> > > > > > 
> > > > > > 
> > > > > > This is because rte_service_init() calls rte_calloc() before
> > > > > > rte_bus_probe() and vfio_dma_mem_map fails because iommu type is
> > > > > > not set.
> > > > > > 
> > > > > > Call stack:
> > > > > > gdb) bt
> > > > > > #0  vfio_dma_mem_map (vaddr=281439006359552, iova=11274289152,
> > > > > > len=536870912, do_map=1) at
> > > > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:967
> > > > > > #1  0x00000000004fd974 in rte_vfio_dma_map
> > > > > > (vaddr=281439006359552, iova=11274289152, len=536870912) at
> > > > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:988
> > > > > > #2  0x00000000004fbe78 in vfio_mem_event_callback
> > > > > > (type=RTE_MEM_EVENT_ALLOC, addr=0xfff7a0000000, len=536870912)
> > > > > > at /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal_vfio.c:240
> > > > > > #3  0x00000000005070ac in eal_memalloc_notify
> > > > > > (event=RTE_MEM_EVENT_ALLOC, start=0xfff7a0000000, len=536870912)
> > > > > > at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_memalloc.c:177
> > > > > > #4  0x0000000000515c98 in try_expand_heap_primary
> > > > > > (heap=0xffffb7fb167c, pg_sz=536870912, elt_size=8192, socket=0,
> > > > > > flags=0, align=128, bound=0, contig=false) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:247
> > > > > > #5  0x0000000000515e94 in try_expand_heap (heap=0xffffb7fb167c,
> > > > > > pg_sz=536870912, elt_size=8192, socket=0, flags=0, align=128,
> > > > > > bound=0, contig=false) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:327
> > > > > > #6  0x00000000005163a0 in alloc_more_mem_on_socket
> > > > > > (heap=0xffffb7fb167c, size=8192, socket=0, flags=0, align=128,
> > > > > > bound=0, contig=false) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:455
> > > > > > #7  0x0000000000516514 in heap_alloc_on_socket (type=0x85bf90
> > > > > > "rte_services", size=8192, socket=0, flags=0, align=128,
> > > > > > bound=0, contig=false) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:491
> > > > > > #8  0x0000000000516664 in malloc_heap_alloc (type=0x85bf90
> > > > > > "rte_services", size=8192, socket_arg=-1, flags=0, align=128,
> > > > > > bound=0, contig=false) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/malloc_heap.c:527
> > > > > > #9  0x0000000000513b54 in rte_malloc_socket (type=0x85bf90
> > > > > > "rte_services", size=8192, align=128, socket_arg=-1) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:54
> > > > > > #10 0x0000000000513bc8 in rte_zmalloc_socket (type=0x85bf90
> > > > > > "rte_services", size=8192, align=128, socket=-1) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:72
> > > > > > #11 0x0000000000513c00 in rte_zmalloc (type=0x85bf90
> > > > > > "rte_services", size=8192, align=128) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:81
> > > > > > #12 0x0000000000513c90 in rte_calloc (type=0x85bf90
> > > > > > "rte_services", num=64, size=128, align=128) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/rte_malloc.c:99
> > > > > > #13 0x0000000000518cec in rte_service_init () at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/rte_service.c:81
> > > > > > #14 0x00000000004f55f4 in rte_eal_init (argc=3,
> > > > > > argv=0xfffffffff488) at
> > > > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:959
> > > > > > #15 0x000000000045af5c in main (argc=3, argv=0xfffffffff488) at
> > > > > > /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> > > > > > 
> > > > > > 
> > > > > > Also, I have tried running with --legacy-mem but I'm stuck in
> > > > > > `pci_find_max_end_va` loop  because `rte_fbarray_find_next_used`
> > > > > > always return
> > > > > > 0. >
> > > > > > HugePages_Total:      15
> > > > > > HugePages_Free:       11
> > > > > > HugePages_Rsvd:        0
> > > > > > HugePages_Surp:        0
> > > > > > Hugepagesize:     524288 kB
> > > > > > 
> > > > > > Call Stack:
> > > > > > (gdb) bt
> > > > > > #0  find_next (arr=0xffffb7fb009c, start=0, used=true) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:248
> > > > > > #1  0x00000000005132a8 in rte_fbarray_find_next_used
> > > > > > (arr=0xffffb7fb009c, start=0) at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_fbarray.c:700
> > > > > > #2  0x000000000052d030 in pci_find_max_end_va () at
> > > > > > /root/clean/dpdk/drivers/bus/pci/linux/pci.c:138
> > > > > > #3  0x0000000000530ab8 in pci_vfio_map_resource_primary
> > > > > > (dev=0xeae700) at
> > > > > > /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:499
> > > > > > #4  0x0000000000530ffc in pci_vfio_map_resource (dev=0xeae700)
> > > > > > at /root/clean/dpdk/drivers/bus/pci/linux/pci_vfio.c:601
> > > > > > #5  0x000000000052ce90 in rte_pci_map_device (dev=0xeae700) at
> > > > > > /root/clean/dpdk/drivers/bus/pci/linux/pci.c:75
> > > > > > #6  0x0000000000531a20 in rte_pci_probe_one_driver (dr=0x997e20
> > > > > > <rte_nicvf_pmd>, dev=0xeae700) at
> > > > > > /root/clean/dpdk/drivers/bus/pci/pci_common.c:164
> > > > > > #7  0x0000000000531c68 in pci_probe_all_drivers (dev=0xeae700)
> > > > > > at /root/clean/dpdk/drivers/bus/pci/pci_common.c:249
> > > > > > #8  0x0000000000531f68 in rte_pci_probe () at
> > > > > > /root/clean/dpdk/drivers/bus/pci/pci_common.c:359
> > > > > > #9  0x000000000050a140 in rte_bus_probe () at
> > > > > > /root/clean/dpdk/lib/librte_eal/common/eal_common_bus.c:98
> > > > > > #10 0x00000000004f55f4 in rte_eal_init (argc=1,
> > > > > > argv=0xfffffffff498) at
> > > > > > /root/clean/dpdk/lib/librte_eal/linuxapp/eal/eal.c:967
> > > > > > #11 0x000000000045af5c in main (argc=1, argv=0xfffffffff498) at
> > > > > > /root/clean/dpdk/app/test-pmd/testpmd.c:2483
> > > > > > 
> > > > > > Am I missing something here?
> > > > > 
> > > > > I'll look into those, thanks!
> > > > > 
> > > > > Btw, i've now set up a github repo with the patchset applied:
> > > > > 
> > > > > https://github.com/anatolyburakov/dpdk
> > > > > 
> > > > > I will be pushing quick fixes there before spinning new revisions,
> > > > > so we can discover and fix bugs more rapidly. I'll fix compile
> > > > > issues reported earlier, then i'll take a look at your issues. The
> > > > > latter one seems like a typo, the former is probably a matter of
> > > > > moving things around a bit.
> > > > > 
> > > > > (also, pull requests welcome if you find it easier to fix things
> > > > > yourself and submit patches against my tree!)
> > > > > 
> > > > > Thanks for testing.
> > > > > 
> > > > 
> > > > I've looked into the failures.
> > > > 
> > > > The VFIO one is not actually a failure. It only prints out errors
> > > > because rte_malloc is called before VFIO is initialized. However, once
> > > > VFIO *is* initialized, all of that memory would be added to VFIO, so
> > > > these error messages are harmless. Regardless, i've added a check to see
> > > > if init is finished before printing out those errors, so they won't be
> > > > printed out any more.
> > > > 
> > > > Second one is a typo on my part that got lost in one of the rebases.
> > > > 
> > > > I've pushed fixes for both into the github repo.
> > > > 
> > > 
> > > Although i do wonder where do the DMA remapping errors come from. The error
> > > message says "invalid argument", so that doesn't come from rte_service or
> > > anything to do with rte_malloc - this is us not providing valid arguments to
> > > VFIO. I'm not seeing these errors on my system. I'll check on others to be
> > > sure.
> > 
> > I have taken a look at the github tree the issues with VFIO are gone, Although
> > compilation issues with dpaa/dpaa2 are still present due to their dependency on
> > `rte_eal_get_physmem_layout`.
> 
> I've fixed the dpaa compile issue and pushed it to github. I've tried to
> keep the semantics the same as before, but i can't compile-test (let alone
> test-test) them as i don't have access to a system with dpaa bus.
> 
> Also, you might want to know that dpaa bus driver references
> RTE_LIBRTE_DPAA_MAX_CRYPTODEV which is only found in
> config/common_armv8a_linuxapp but is not present in base config. Not sure if
> that's an issue.

Hi anatoly,

I've checked out your branch on github at
 commit 61e6876fd1d2 ("remove panic from memalloc")

I am compiling it with clang:
 clang version 3.8.1-24 (tags/RELEASE_381/final)
 Target: x86_64-pc-linux-gnu
 Thread model: posix
 InstalledDir: /usr/bin

There are a lot of compilation errors that are not present on the DPDK branch.
They need to be fixed.

Thanks,

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-09 10:42                   ` Burakov, Anatoly
  2018-03-12 15:58                     ` Nélio Laranjeiro
@ 2018-03-13  5:17                     ` Shreyansh Jain
  2018-03-15 14:01                       ` Shreyansh Jain
  1 sibling, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-13  5:17 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: Pavan Nikhilesh, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, Hemant Agrawal,
	olivier.matz, dev

Hello Anatoly,

On Fri, Mar 9, 2018 at 4:12 PM, Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:

[...]

>>
>>
>> I have taken a look at the github tree the issues with VFIO are gone,
>> Although
>> compilation issues with dpaa/dpaa2 are still present due to their
>> dependency on
>> `rte_eal_get_physmem_layout`.
>
>
> I've fixed the dpaa compile issue and pushed it to github. I've tried to
> keep the semantics the same as before, but i can't compile-test (let alone
> test-test) them as i don't have access to a system with dpaa bus.

Thanks. I will have a look at this.

>
> Also, you might want to know that dpaa bus driver references
> RTE_LIBRTE_DPAA_MAX_CRYPTODEV which is only found in
> config/common_armv8a_linuxapp but is not present in base config. Not sure if
> that's an issue.
>

This might be an issue, as some patches very recently updated the base
config. I will cross-check this as well.

-
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-13  5:17                     ` Shreyansh Jain
@ 2018-03-15 14:01                       ` Shreyansh Jain
  2018-03-21 13:45                         ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-15 14:01 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

Hello Anatoly,

On Tue, Mar 13, 2018 at 10:47 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> Hello Anatoly,
>
> On Fri, Mar 9, 2018 at 4:12 PM, Burakov, Anatoly
> <anatoly.burakov@intel.com> wrote:
>> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
>
> [...]
>
>>>
>>>
>>> I have taken a look at the github tree the issues with VFIO are gone,
>>> Although
>>> compilation issues with dpaa/dpaa2 are still present due to their
>>> dependency on
>>> `rte_eal_get_physmem_layout`.
>>
>>
>> I've fixed the dpaa compile issue and pushed it to github. I've tried to
>> keep the semantics the same as before, but i can't compile-test (let alone
>> test-test) them as i don't have access to a system with dpaa bus.
>
> Thanks. I will have a look at this.

Just a heads-up, DPAA2 is broken on top-of-tree (github:
784e041f6b520) as of now:

--->8---
root@ls2088ardb:~/shreyansh/07_dpdk_memory#
./arm64-dpaa2-linuxapp-gcc/app/testpmd -c 0xE -n 1 --log-level=eal,8
--log-level=mem,8 -- -i --portmask=0x3
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 1 on socket 0
EAL: Detected lcore 4 as core 0 on socket 0
EAL: Detected lcore 5 as core 1 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Support maximum 16 logical core(s) by configuration.
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: VFIO PCI modules not loaded
EAL: DPAA Bus not present. Skipping.
EAL: Container: dprc.2 has VFIO iommu group id = 4
EAL: fslmc: Bus scan completed
EAL: Module /sys/module/rte_kni not found! error 2 (No such file or directory)
EAL: Multi-process socket /var/run/.rte_unix
EAL: Probing VFIO support...
EAL:   IOMMU type 1 (Type 1) is supported
EAL:   IOMMU type 7 (sPAPR) is not supported
EAL:   IOMMU type 8 (No-IOMMU) is not supported
EAL: VFIO support initialized
EAL: Mem event callback 'vfio_mem_event_clb' registered
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0xffff86cae000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0xffff8873f000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0xfff780000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0xffff8873e000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0xffef40000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0xffff8873d000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0xffe700000000 (size = 0x800000000)
EAL: Ask a virtual area of 0x1000 bytes
EAL: Virtual area found at 0xffff8873c000 (size = 0x1000)
EAL: Memseg list allocated: 0x100000kB at socket 0
EAL: Ask a virtual area of 0x800000000 bytes
EAL: Virtual area found at 0xffdec0000000 (size = 0x800000000)
EAL: TSC frequency is ~25000 KHz
EAL: Master lcore 1 is ready (tid=88742110;cpuset=[1])
EAL: lcore 3 is ready (tid=85cab910;cpuset=[3])
EAL: lcore 2 is ready (tid=864ab910;cpuset=[2])
EAL: eal_memalloc_alloc_page_bulk(): couldn't find suitable memseg_list
error allocating rte services array
EAL: FATAL: rte_service_init() failed

EAL: rte_service_init() failed

PANIC in main():
Cannot init EAL
1: [./arm64-dpaa2-linuxapp-gcc/app/testpmd(rte_dump_stack+0x38) [0x4f37a8]]
Aborted
--->8--

Above is an initial output - still investigating. I will keep you posted.

>
>>
>> Also, you might want to know that dpaa bus driver references
>> RTE_LIBRTE_DPAA_MAX_CRYPTODEV which is only found in
>> config/common_armv8a_linuxapp but is not present in base config. Not sure if
>> that's an issue.

A recent patch from Hemant has fixed this (yet to be merged in master).

[...]

-
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
  2018-03-08 10:18   ` Pavan Nikhilesh
@ 2018-03-19  8:58   ` Shreyansh Jain
  2018-03-20 10:07     ` Burakov, Anatoly
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                     ` (68 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-19  8:58 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev, Thomas Monjalon

Hi Anatoly,

On Wed, Mar 7, 2018 at 10:26 PM, Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
>
> Dependencies (to be applied in specified order):
> - IPC bugfixes patchset [2]
> - IPC improvements patchset [3]
> - IPC asynchronous request API patch [4]
> - Function to return number of sockets [5]
>
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [6]
> - EAL NUMA node count changes [7]
>
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
>
> Quick outline of all changes done as part of this patchset:
>
>  * Malloc heap adjusted to handle holes in address space
>  * Single memseg list replaced by multiple memseg lists
>  * VA space for hugepages is preallocated in advance
>  * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>  * Added contiguous memory allocation API's for rte_memzone
>  * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>    with VFIO [8]
>  * Callbacks for registering memory allocations
>  * Multiprocess support done via DPDK IPC introduced in 18.02
>
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
>
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
>
> For v1 and v2, the following limitations are present:
> - FreeBSD does not even compile, let alone run
> - No 32-bit support

I just read on announce@dpdk.org [1] that an early merge of this
series is expected. So, will this limitation be fixed before the merge?
Or has it already been fixed in the github repo?

[1] http://dpdk.org/ml/archives/announce/2018-March/000182.html

[...]

-
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-07 16:56 ` [PATCH v2 23/41] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-03-19 17:11   ` Olivier Matz
  2018-03-21  7:49     ` Andrew Rybchenko
  2018-03-20 11:35   ` Shreyansh Jain
  1 sibling, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:11 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, arybchenko

Hi Anatoly,

Please find some comments below.

On Wed, Mar 07, 2018 at 04:56:51PM +0000, Anatoly Burakov wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation API's instead of normal ones.
> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

[...]

> @@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  	/* update mempool capabilities */
>  	mp->flags |= mp_flags;
>  
> -	if (rte_eal_has_hugepages()) {
> -		pg_shift = 0; /* not needed, zone is physically contiguous */
> +	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
> +	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
> +
> +	/*
> +	 * there are several considerations for page size and page shift here.

I would add a few words here to describe what page size and page shift
are used for:

  These values impact the result of rte_mempool_xmem_size() (*), which
  returns the amount of memory that should be allocated to store the
  desired number of objects. When the page shift is not zero, it
  accounts for extra memory for the padding between objects, to ensure
  that an object does not cross a page boundary.

(*) it is renamed in Andrew's patchset about the mempool_ops API, but it
seems the memory rework may be applied first.
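
As a rough illustration of that effect (not the real function, and assuming
objects are no larger than one page), a non-zero pg_shift pads the computed
size up to whole pages so that no object straddles a page boundary:

#include <stddef.h>
#include <stdint.h>

static size_t
xmem_size_sketch(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
{
	size_t pg_sz, pages, objs_per_page;

	if (pg_shift == 0)
		return (size_t)elt_num * total_elt_sz; /* no page awareness */

	pg_sz = (size_t)1 << pg_shift;
	objs_per_page = pg_sz / total_elt_sz; /* assumes total_elt_sz <= pg_sz */
	pages = (elt_num + objs_per_page - 1) / objs_per_page;

	return pages << pg_shift; /* padded up to whole pages */
}
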

> +	 *
> +	 * if we don't need our mempools to have physically contiguous objects,
> +	 * then just set page shift and page size to 0, because the user has
> +	 * indicated that there's no need to care about anything.
> +	 *
> +	 * if we do need contiguous objects, there is also an option to reserve
> +	 * the entire mempool memory as one contiguous block of memory, in
> +	 * which case the page shift and alignment wouldn't matter as well.
> +	 *
> +	 * if we require contiguous objects, but not necessarily the entire
> +	 * mempool reserved space to be contiguous, then there are two options.
> +	 *
> +	 * if our IO addresses are virtual, not actual physical (IOVA as VA
> +	 * case), then no page shift needed - our memory allocation will give us
> +	 * contiguous physical memory as far as the hardware is concerned, so
> +	 * act as if we're getting contiguous memory.
> +	 *
> +	 * if our IO addresses are physical, we may get memory from bigger
> +	 * pages, or we might get memory from smaller pages, and how much of it
> +	 * we require depends on whether we want bigger or smaller pages.
> +	 * However, requesting each and every memory size is too much work, so
> +	 * what we'll do instead is walk through the page sizes available, pick
> +	 * the smallest one and set up page shift to match that one. We will be
> +	 * wasting some space this way, but it's much nicer than looping around
> +	 * trying to reserve each and every page size.
> +	 */

This comment is helpful for understanding the logic, thanks.

(by the way, reading it makes me think we should rename
MEMPOOL_F_*_PHYS_CONTIG to MEMPOOL_F_*_IOVA_CONTIG)


> +
> +	if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
>  		pg_sz = 0;
> +		pg_shift = 0;
>  		align = RTE_CACHE_LINE_SIZE;
> +	} else if (rte_eal_has_hugepages()) {
> +		pg_sz = get_min_page_size();
> +		pg_shift = rte_bsf32(pg_sz);
> +		align = pg_sz;
>  	} else {
>  		pg_sz = getpagesize();
>  		pg_shift = rte_bsf32(pg_sz);
> @@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>  			goto fail;
>  		}
>  
> -		mz = rte_memzone_reserve_aligned(mz_name, size,
> -			mp->socket_id, mz_flags, align);
> -		/* not enough memory, retry with the biggest zone we have */
> -		if (mz == NULL)
> -			mz = rte_memzone_reserve_aligned(mz_name, 0,
> +		if (force_contig) {
> +			/*
> +			 * if contiguous memory for entire mempool memory was
> +			 * requested, don't try reserving again if we fail.
> +			 */
> +			mz = rte_memzone_reserve_aligned_contig(mz_name, size,
> +				mp->socket_id, mz_flags, align);
> +		} else {
> +			mz = rte_memzone_reserve_aligned(mz_name, size,
>  				mp->socket_id, mz_flags, align);
> +			/* not enough memory, retry with the biggest zone we
> +			 * have
> +			 */
> +			if (mz == NULL)
> +				mz = rte_memzone_reserve_aligned(mz_name, 0,
> +					mp->socket_id, mz_flags, align);
> +		}

This is not wrong, but at first glance I think it is not required,
because we have this in populate_iova():

	/* Detect pool area has sufficient space for elements */
	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
		if (len < total_elt_sz * mp->size) {
			RTE_LOG(ERR, MEMPOOL,
				"pool area %" PRIx64 " not enough\n",
				(uint64_t)len);
			return -ENOSPC;
		}
	}



Thanks,
Olivier

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (85 preceding siblings ...)
  2018-03-08 14:40 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
@ 2018-03-19 17:30 ` Olivier Matz
  2018-03-20 10:27   ` Burakov, Anatoly
  2018-03-21  9:09 ` gowrishankar muthukrishnan
  87 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:30 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

Hi Anatoly,

On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
> 
> Dependencies (to be applied in specified order):
> - IPC bugfixes patchset [2]
> - IPC improvements patchset [3]
> - IPC asynchronous request API patch [4]
> - Function to return number of sockets [5]
> 
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [6]
> - EAL NUMA node count changes [7]
> 
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
> 
> Quick outline of all changes done as part of this patchset:
> 
>  * Malloc heap adjusted to handle holes in address space
>  * Single memseg list replaced by multiple memseg lists
>  * VA space for hugepages is preallocated in advance
>  * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>  * Added contiguous memory allocation API's for rte_memzone
>  * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>    with VFIO [8]
>  * Callbacks for registering memory allocations
>  * Multiprocess support done via DPDK IPC introduced in 18.02
> 
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
> 
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
> 
> For v1, the following limitations are present:
> - FreeBSD does not even compile, let alone run
> - No 32-bit support
> - There are some minor quality-of-life improvements planned that aren't
>   ready yet and will be part of v2
> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>   with secondary processes is not tested; work is ongoing to validate VFIO
>   for all use cases
> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>   IOMMU mode - help from sPAPR maintainers requested
> 
> Nevertheless, this patchset should be testable under 64-bit Linux, and
> should work for all use cases bar those mentioned above.
> 
> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> [6] http://dpdk.org/dev/patchwork/patch/34002/
> [7] http://dpdk.org/dev/patchwork/patch/33853/
> [8] http://dpdk.org/dev/patchwork/patch/24484/

I did a quick pass on your patches (unfortunately, I don't have
the time to really dive into it).

I have few questions/comments:

- This is really a big patchset. Thank you for working on this topic.
  I'll try to test our application with it as soon as possible.

- I see from patch 17 that it is possible that rte_malloc() expands
  the heap by requesting more memory from the OS? Did I understand correctly?
  Today, a good property of rte_malloc() compared to malloc() is that
  it won't interrupt the process (the worst case is a spinlock). This
  is appreciable on a dataplane core. Will it change?

- It's not a big issue, but I have the feeling that the "const" qualifier
  is often forgotten in the patchset. I think it is helpful for
  optimization and documentation, and to detect bugs that modify or free
  something that should not be touched.

I'm sending some other dummy comments as replies to patches.

Thanks,
Olivier

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 03/41] eal: make malloc heap a doubly-linked list
  2018-03-03 13:45 ` [PATCH 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-03-19 17:33   ` Olivier Matz
  2018-03-20  9:39     ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:33 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:45:51PM +0000, Anatoly Burakov wrote:
> As we are preparing for dynamic memory allocation, we need to be
> able to handle holes in our malloc heap, hence we're switching to a
> doubly linked list and preparing infrastructure to support it.
> 
> Since our heap is now aware of where its first and last elements are,
> there is no longer any need to have a dummy element at the end of
> each heap, so get rid of that as well. Instead, let insert/remove/
> join/split operations handle end-of-list conditions automatically.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
>  lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
>  lib/librte_eal/common/malloc_elem.h             |  14 +-
>  lib/librte_eal/common/malloc_heap.c             |   8 +-
>  4 files changed, 179 insertions(+), 49 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
> index ba99ed9..9ec4b62 100644
> --- a/lib/librte_eal/common/include/rte_malloc_heap.h
> +++ b/lib/librte_eal/common/include/rte_malloc_heap.h
> @@ -13,12 +13,18 @@
>  /* Number of free lists per heap, grouped by size. */
>  #define RTE_HEAP_NUM_FREELISTS  13
>  
> +/* dummy definition, for pointers */
> +struct malloc_elem;
> +
>  /**
>   * Structure to hold malloc heap
>   */
>  struct malloc_heap {
>  	rte_spinlock_t lock;
>  	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
> +	struct malloc_elem *first;
> +	struct malloc_elem *last;
> +
>  	unsigned alloc_count;
>  	size_t total_size;
>  } __rte_cache_aligned;
> diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
> index ea041e2..eb41200 100644
> --- a/lib/librte_eal/common/malloc_elem.c
> +++ b/lib/librte_eal/common/malloc_elem.c
> @@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
>  	elem->heap = heap;
>  	elem->ms = ms;
>  	elem->prev = NULL;
> +	elem->next = NULL;
>  	memset(&elem->free_list, 0, sizeof(elem->free_list));
>  	elem->state = ELEM_FREE;
>  	elem->size = size;
> @@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
>  	set_trailer(elem);
>  }
>  
> -/*
> - * Initialize a dummy malloc_elem header for the end-of-memseg marker
> - */
>  void
> -malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
> +malloc_elem_insert(struct malloc_elem *elem)
>  {
> -	malloc_elem_init(elem, prev->heap, prev->ms, 0);
> -	elem->prev = prev;
> -	elem->state = ELEM_BUSY; /* mark busy so its never merged */
> +	struct malloc_elem *prev_elem, *next_elem;
> +	struct malloc_heap *heap = elem->heap;
> +
> +	if (heap->first == NULL && heap->last == NULL) {
> +		/* if empty heap */
> +		heap->first = elem;
> +		heap->last = elem;
> +		prev_elem = NULL;
> +		next_elem = NULL;
> +	} else if (elem < heap->first) {
> +		/* if lower than start */
> +		prev_elem = NULL;
> +		next_elem = heap->first;
> +		heap->first = elem;
> +	} else if (elem > heap->last) {
> +		/* if higher than end */
> +		prev_elem = heap->last;
> +		next_elem = NULL;
> +		heap->last = elem;
> +	} else {
> +		/* the new memory is somewhere inbetween start and end */
> +		uint64_t dist_from_start, dist_from_end;
> +
> +		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
> +		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
> +
> +		/* check which is closer, and find closest list entries */
> +		if (dist_from_start < dist_from_end) {
> +			prev_elem = heap->first;
> +			while (prev_elem->next < elem)
> +				prev_elem = prev_elem->next;
> +			next_elem = prev_elem->next;
> +		} else {
> +			next_elem = heap->last;
> +			while (next_elem->prev > elem)
> +				next_elem = next_elem->prev;
> +			prev_elem = next_elem->prev;
> +		}
> +	}
> +
> +	/* insert new element */
> +	elem->prev = prev_elem;
> +	elem->next = next_elem;
> +	if (prev_elem)
> +		prev_elem->next = elem;
> +	if (next_elem)
> +		next_elem->prev = elem;
>  }

Would it be possible here to use a TAILQ? If yes, it could be
easier to read.
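For example (a rough sketch only, with made-up type and field names - not a
drop-in replacement for the code above), an address-ordered insert with a
<sys/queue.h> TAILQ could look like this:

    #include <sys/queue.h>

    struct elem {
        TAILQ_ENTRY(elem) link;    /* would replace the prev/next pointers */
        /* ... size, state, etc. ... */
    };

    TAILQ_HEAD(elem_list, elem);

    /* keep the list sorted by address: insert before the first element
     * at a higher address, or at the tail if there is none
     */
    static void
    elem_insert(struct elem_list *list, struct elem *elem)
    {
        struct elem *cur;

        TAILQ_FOREACH(cur, list, link) {
            if (cur > elem) {
                TAILQ_INSERT_BEFORE(cur, elem, link);
                return;
            }
        }
        TAILQ_INSERT_TAIL(list, elem, link);
    }

(This does lose the "walk from whichever end is closer" trick of the code
above, of course.)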

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 08/41] eal: make malloc free return resulting malloc element
  2018-03-03 13:45 ` [PATCH 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-03-19 17:34   ` Olivier Matz
  2018-03-20  9:40     ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:34 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:45:56PM +0000, Anatoly Burakov wrote:
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  lib/librte_eal/common/malloc_elem.c | 4 ++--
>  lib/librte_eal/common/malloc_elem.h | 2 +-
>  lib/librte_eal/common/malloc_heap.c | 4 ++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
> index 008f5a3..c18f050 100644
> --- a/lib/librte_eal/common/malloc_elem.c
> +++ b/lib/librte_eal/common/malloc_elem.c
> @@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
>   * blocks either immediately before or immediately after newly freed block
>   * are also free, the blocks are merged together.
>   */
> -int
> +struct malloc_elem *
>  malloc_elem_free(struct malloc_elem *elem)
>  {
>  	void *ptr;
> @@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
>  
>  	memset(ptr, 0, data_len);
>  
> -	return 0;
> +	return elem;
>  }
>  
>  /*

An explanation about why this change is needed would make sense I think.

Thanks,
Olivier

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 13/41] eal: replace memseg with memseg lists
  2018-03-03 13:46 ` [PATCH 13/41] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-03-19 17:39   ` Olivier Matz
  2018-03-20  9:47     ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:39 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:46:01PM +0000, Anatoly Burakov wrote:

[...]

> --- a/config/common_base
> +++ b/config/common_base
> @@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
>  CONFIG_RTE_LIBRTE_EAL=y
>  CONFIG_RTE_MAX_LCORE=128
>  CONFIG_RTE_MAX_NUMA_NODES=8
> -CONFIG_RTE_MAX_MEMSEG=256
> +CONFIG_RTE_MAX_MEMSEG_LISTS=32
> +# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
> +# or RTE_MAX_MEM_PER_LIST gigabytes worth of memory, whichever is the smallest
> +CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
> +CONFIG_RTE_MAX_MEM_PER_LIST=32
> +# a "type" is a combination of page size and NUMA node. total number of memseg
> +# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
> +# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or RTE_MAX_MEM_PER_TYPE
> +# gigabytes of memory (split over multiple lists of RTE_MAX_MEM_PER_LIST),
> +# whichever is the smallest
> +CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
> +CONFIG_RTE_MAX_MEM_PER_TYPE=128
> +# legacy mem mode only
> +CONFIG_RTE_MAX_LEGACY_MEMSEG=256

Would it be possible to suffix CONFIG_RTE_MAX_MEM_PER_LIST and
CONFIG_RTE_MAX_MEM_PER_TYPE with _GB? It's not that obvious that it is
gigabytes.

What is the impact of changing one of these values on the ABI? And what
would be the impact on performance? The underlying question is: shall we
increase these values to avoid changing them later?

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 14/41] eal: add support for mapping hugepages at runtime
  2018-03-03 13:46 ` [PATCH 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-03-19 17:42   ` Olivier Matz
  0 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:42 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:46:02PM +0000, Anatoly Burakov wrote:
> Nothing uses this code yet. The bulk of it is copied from old
> memory allocation code (linuxapp eal_memory.c). We provide an
> EAL-internal API to allocate either one page or multiple pages,
> guaranteeing that we'll get contiguous VA for all of the pages
> that we requested.
> 
> For single-file segments, we will use fallocate() to grow and
> shrink memory segments, however fallocate() is not supported
> on all kernel versions, so we will fall back to using
> ftruncate() to grow the file, and disable shrinking as there's
> little we can do there. This will enable vhost use cases where
> having single file segments is of great value even without
> support for hot-unplugging memory.
> 
> Not supported on FreeBSD.
> 
> Locking is done via fcntl() because that way, when it comes to
> taking out write locks or unlocking on deallocation, we don't
> have to keep original fd's around. Plus, using fcntl() gives us
> ability to lock parts of a file, which is useful for single-file
> segments.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
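(As an aside, for readers less familiar with fcntl() record locks - a minimal
standalone illustration of the mechanism the commit message relies on; this is
just standard POSIX, not the patch's code:)

    #include <fcntl.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* take a shared (read) lock on bytes [offset, offset + len) of fd;
     * returns 0 on success, -1 if a conflicting write lock is held.
     * fcntl() record locks are owned by the process, not by a particular
     * file descriptor, and can cover just a part of a file.
     */
    static int
    lock_range_read(int fd, off_t offset, off_t len)
    {
        struct flock fl = {
            .l_type = F_RDLCK,
            .l_whence = SEEK_SET,
            .l_start = offset,
            .l_len = len,
        };

        return fcntl(fd, F_SETLK, &fl);
    }

Unmapping would then be the same call with F_WRLCK - if it fails, some other
process still holds a read lock on (i.e. still maps) that part of the file.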

Few minor typos:

[...]

> +static void
> +resotre_numa(int *oldpolicy, struct bitmask *oldmask)

restore

[...]

> +static off_t
> +getFileSize(int fd)

should it be get_file_size()?

[...]

> +static int
> +alloc_page(struct rte_memseg *ms, void *addr, uint64_t size, int socket_id,
> +		struct hugepage_info *hi, unsigned int list_idx,
> +		unsigned int seg_idx)
> +{
> +	int cur_socket_id = 0;
> +	uint64_t map_offset;
> +	char path[PATH_MAX];
> +	int ret = 0;
> +	int fd;
> +
> +	fd = get_page_fd(path, sizeof(path), hi, list_idx, seg_idx);
> +	if (fd < 0)
> +		return -1;
> +
> +
> +	if (internal_config.single_file_segments) {
> +		map_offset = seg_idx * size;
> +		ret = resize_hugefile(fd, map_offset, size, true);
> +		if (ret < 1)
> +			goto resized;
> +	} else {
> +		map_offset = 0;
> +		if (ftruncate(fd, size) < 0) {
> +			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
> +				__func__, strerror(errno));
> +			goto resized;
> +		}
> +		/* we've allocated a page - take out a read lock. we're using
> +		 * fcntl() locks rather than flock() here because doing that
> +		 * gives us one huge advantage - fcntl() locks are per-process,
> +		 * not per-file descriptor, which means that we don't have to
> +		 * keep the original fd's around to keep a lock on the file.
> +		 *
> +		 * this is useful, because when it comes to unmapping pages, we
> +		 * will have to take out a write lock (to figure out if another
> +		 * process still has this page mapped), and to do itwith flock()

typo: itwith

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 17/41] eal: enable memory hotplug support in rte_malloc
  2018-03-03 13:46 ` [PATCH 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-03-19 17:46   ` Olivier Matz
  0 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:46 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:46:05PM +0000, Anatoly Burakov wrote:
> This set of changes enables rte_malloc to allocate and free memory
> as needed. The way it works is, first malloc checks if there is
> enough memory already allocated to satisfy user's request. If there
> isn't, we try and allocate more memory. The reverse happens with
> free - we free an element, check its size (including free element
> merging due to adjacency) and see if it's bigger than hugepage
> size and that its start and end span a hugepage or more. Then we
> remove the area from malloc heap (adjusting element lengths where
> appropriate), and deallocate the page.
> 
> For legacy mode, runtime alloc/free of pages is disabled.
> 
> It is worth noting that memseg lists are being sorted by page size,
> and that we try our best to satisfy user's request. That is, if
> the user requests an element from a 2MB page memory, we will check
> if we can satisfy that request from existing memory, if not we try
> and allocate more 2MB pages. If that fails and user also specified
> a "size is hint" flag, we then check other page sizes and try to
> allocate from there. If that fails too, then, depending on flags,
> we may try allocating from other sockets. In other words, we try
> our best to give the user what they asked for, but going to other
> sockets is last resort - first we try to allocate more memory on
> the same socket.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
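(To check that I'm reading the free path right - a tiny standalone illustration
of the "start and end span a hugepage or more" condition, with made-up names
and assuming pg_sz is a power of two; not the actual code:)

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* does the freed range [start, end) fully cover at least one page of
     * size pg_sz? only then is there anything to hand back to the system
     */
    static bool
    covers_whole_page(uintptr_t start, uintptr_t end, size_t pg_sz)
    {
        uintptr_t first_page = (start + pg_sz - 1) & ~((uintptr_t)pg_sz - 1);
        uintptr_t last_page_end = end & ~((uintptr_t)pg_sz - 1);

        return last_page_end > first_page;
    }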

[...]

> @@ -123,48 +125,356 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
>   * scan fails. Once the new memseg is added, it re-scans and should return
>   * the new element after releasing the lock.
>   */
> -void *
> -malloc_heap_alloc(struct malloc_heap *heap,
> -		const char *type __attribute__((unused)), size_t size, unsigned flags,
> -		size_t align, size_t bound)
> +static void *
> +heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
> +		unsigned int flags, size_t align, size_t bound)
>  {
>  	struct malloc_elem *elem;
>  
>  	size = RTE_CACHE_LINE_ROUNDUP(size);
>  	align = RTE_CACHE_LINE_ROUNDUP(align);
>  
> -	rte_spinlock_lock(&heap->lock);
> -
>  	elem = find_suitable_element(heap, size, flags, align, bound);
>  	if (elem != NULL) {
>  		elem = malloc_elem_alloc(elem, size, align, bound);
> +
>  		/* increase heap's count of allocated elements */
>  		heap->alloc_count++;
>  	}
> -	rte_spinlock_unlock(&heap->lock);
>  
>  	return elem == NULL ? NULL : (void *)(&elem[1]);
>  }

The comment on top of the function says "after releasing the lock" but
it seems it's not relevant anymore because the lock is removed.

[...]

>  int
>  malloc_heap_free(struct malloc_elem *elem)
>  {
>  	struct malloc_heap *heap;
> -	struct malloc_elem *ret;
> +	void *start, *aligned_start, *end, *aligned_end;
> +	size_t len, aligned_len;
> +	struct rte_memseg_list *msl;
> +	int n_pages, page_idx, max_page_idx, ret;
>  
>  	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
>  		return -1;
>  
>  	/* elem may be merged with previous element, so keep heap address */
>  	heap = elem->heap;
> +	msl = elem->msl;
>  
>  	rte_spinlock_lock(&(heap->lock));
>  
> -	ret = malloc_elem_free(elem);
> +	elem = malloc_elem_free(elem);
>  
> -	rte_spinlock_unlock(&(heap->lock));
> +	/* anything after this is a bonus */
> +	ret = 0;
> +

The fact that there were previously two rte_spinlock_unlock() calls
looks strange to me. Is there something wrong in a previous patch?

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 18/41] test: fix malloc autotest to support memory hotplug
  2018-03-03 13:46 ` [PATCH 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
@ 2018-03-19 17:49   ` Olivier Matz
  0 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-03-19 17:49 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Sat, Mar 03, 2018 at 01:46:06PM +0000, Anatoly Burakov wrote:
> The test was expecting memory to already be allocated on all sockets,
> and thus was failing because calling rte_malloc could trigger a memory
> hotplug event and allocate memory where there was none before.
> 
> Fix it to instead report availability of memory on specific sockets
> by attempting to allocate a page and seeing if that succeeds. Technically,
> this can still cause failure, as memory might not be available at the
> time of the check but become available by the time the test is run, but
> this is a corner case not worth considering.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>  test/test/test_malloc.c | 52 +++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 44 insertions(+), 8 deletions(-)
> 
> diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
> index 8484fb6..2aaf1b8 100644
> --- a/test/test/test_malloc.c
> +++ b/test/test/test_malloc.c
> @@ -22,6 +22,8 @@
>  #include <rte_random.h>
>  #include <rte_string_fns.h>
>  
> +#include "../../lib/librte_eal/common/eal_memalloc.h"
> +

I guess there is no way to test without importing a private EAL
function, correct? If yes, maybe it deserves a quick explanation.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 03/41] eal: make malloc heap a doubly-linked list
  2018-03-19 17:33   ` Olivier Matz
@ 2018-03-20  9:39     ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20  9:39 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 19-Mar-18 5:33 PM, Olivier Matz wrote:
> On Sat, Mar 03, 2018 at 01:45:51PM +0000, Anatoly Burakov wrote:
>> As we are preparing for dynamic memory allocation, we need to be
>> able to handle holes in our malloc heap, hence we're switching to a
>> doubly linked list and preparing infrastructure to support it.
>>
>> Since our heap is now aware of where its first and last elements are,
>> there is no longer any need to have a dummy element at the end of
>> each heap, so get rid of that as well. Instead, let insert/remove/
>> join/split operations handle end-of-list conditions automatically.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>   lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
>>   lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
>>   lib/librte_eal/common/malloc_elem.h             |  14 +-
>>   lib/librte_eal/common/malloc_heap.c             |   8 +-
>>   4 files changed, 179 insertions(+), 49 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
>> index ba99ed9..9ec4b62 100644
>> --- a/lib/librte_eal/common/include/rte_malloc_heap.h
>> +++ b/lib/librte_eal/common/include/rte_malloc_heap.h
>> @@ -13,12 +13,18 @@
>>   /* Number of free lists per heap, grouped by size. */
>>   #define RTE_HEAP_NUM_FREELISTS  13
>>   
>> +/* dummy definition, for pointers */
>> +struct malloc_elem;
>> +
>>   /**
>>    * Structure to hold malloc heap
>>    */
>>   struct malloc_heap {
>>   	rte_spinlock_t lock;
>>   	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
>> +	struct malloc_elem *first;
>> +	struct malloc_elem *last;
>> +
>>   	unsigned alloc_count;
>>   	size_t total_size;
>>   } __rte_cache_aligned;
>> diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
>> index ea041e2..eb41200 100644
>> --- a/lib/librte_eal/common/malloc_elem.c
>> +++ b/lib/librte_eal/common/malloc_elem.c
>> @@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
>>   	elem->heap = heap;
>>   	elem->ms = ms;
>>   	elem->prev = NULL;
>> +	elem->next = NULL;
>>   	memset(&elem->free_list, 0, sizeof(elem->free_list));
>>   	elem->state = ELEM_FREE;
>>   	elem->size = size;
>> @@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
>>   	set_trailer(elem);
>>   }
>>   
>> -/*
>> - * Initialize a dummy malloc_elem header for the end-of-memseg marker
>> - */
>>   void
>> -malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
>> +malloc_elem_insert(struct malloc_elem *elem)
>>   {
>> -	malloc_elem_init(elem, prev->heap, prev->ms, 0);
>> -	elem->prev = prev;
>> -	elem->state = ELEM_BUSY; /* mark busy so its never merged */
>> +	struct malloc_elem *prev_elem, *next_elem;
>> +	struct malloc_heap *heap = elem->heap;
>> +
>> +	if (heap->first == NULL && heap->last == NULL) {
>> +		/* if empty heap */
>> +		heap->first = elem;
>> +		heap->last = elem;
>> +		prev_elem = NULL;
>> +		next_elem = NULL;
>> +	} else if (elem < heap->first) {
>> +		/* if lower than start */
>> +		prev_elem = NULL;
>> +		next_elem = heap->first;
>> +		heap->first = elem;
>> +	} else if (elem > heap->last) {
>> +		/* if higher than end */
>> +		prev_elem = heap->last;
>> +		next_elem = NULL;
>> +		heap->last = elem;
>> +	} else {
>> +		/* the new memory is somewhere inbetween start and end */
>> +		uint64_t dist_from_start, dist_from_end;
>> +
>> +		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
>> +		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
>> +
>> +		/* check which is closer, and find closest list entries */
>> +		if (dist_from_start < dist_from_end) {
>> +			prev_elem = heap->first;
>> +			while (prev_elem->next < elem)
>> +				prev_elem = prev_elem->next;
>> +			next_elem = prev_elem->next;
>> +		} else {
>> +			next_elem = heap->last;
>> +			while (next_elem->prev > elem)
>> +				next_elem = next_elem->prev;
>> +			prev_elem = next_elem->prev;
>> +		}
>> +	}
>> +
>> +	/* insert new element */
>> +	elem->prev = prev_elem;
>> +	elem->next = next_elem;
>> +	if (prev_elem)
>> +		prev_elem->next = elem;
>> +	if (next_elem)
>> +		next_elem->prev = elem;
>>   }
> 
> Would it be possible here to use a TAILQ? If yes, it could be
> easier to read.
> 
Hi Olivier,

I think it would be a bit hard to make TAILQs work with pad elements 
without making the code unreadable :) I am inclined to leave it as is.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 08/41] eal: make malloc free return resulting malloc element
  2018-03-19 17:34   ` Olivier Matz
@ 2018-03-20  9:40     ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20  9:40 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 19-Mar-18 5:34 PM, Olivier Matz wrote:
> On Sat, Mar 03, 2018 at 01:45:56PM +0000, Anatoly Burakov wrote:
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>>   lib/librte_eal/common/malloc_elem.c | 4 ++--
>>   lib/librte_eal/common/malloc_elem.h | 2 +-
>>   lib/librte_eal/common/malloc_heap.c | 4 ++--
>>   3 files changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
>> index 008f5a3..c18f050 100644
>> --- a/lib/librte_eal/common/malloc_elem.c
>> +++ b/lib/librte_eal/common/malloc_elem.c
>> @@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
>>    * blocks either immediately before or immediately after newly freed block
>>    * are also free, the blocks are merged together.
>>    */
>> -int
>> +struct malloc_elem *
>>   malloc_elem_free(struct malloc_elem *elem)
>>   {
>>   	void *ptr;
>> @@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
>>   
>>   	memset(ptr, 0, data_len);
>>   
>> -	return 0;
>> +	return elem;
>>   }
>>   
>>   /*
> 
> An explanation about why this change is needed would make sense I think.
> 
> Thanks,
> Olivier
> 

Sure, I'll add this in future commits.

However, to provide some context - we need this because down the line we 
will need to know which element we created/freed in order to roll 
back the changes, should the sync fail.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 13/41] eal: replace memseg with memseg lists
  2018-03-19 17:39   ` Olivier Matz
@ 2018-03-20  9:47     ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20  9:47 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 19-Mar-18 5:39 PM, Olivier Matz wrote:
> On Sat, Mar 03, 2018 at 01:46:01PM +0000, Anatoly Burakov wrote:
> 
> [...]
> 
>> --- a/config/common_base
>> +++ b/config/common_base
>> @@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
>>   CONFIG_RTE_LIBRTE_EAL=y
>>   CONFIG_RTE_MAX_LCORE=128
>>   CONFIG_RTE_MAX_NUMA_NODES=8
>> -CONFIG_RTE_MAX_MEMSEG=256
>> +CONFIG_RTE_MAX_MEMSEG_LISTS=32
>> +# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
>> +# or RTE_MAX_MEM_PER_LIST gigabytes worth of memory, whichever is the smallest
>> +CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
>> +CONFIG_RTE_MAX_MEM_PER_LIST=32
>> +# a "type" is a combination of page size and NUMA node. total number of memseg
>> +# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
>> +# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or RTE_MAX_MEM_PER_TYPE
>> +# gigabytes of memory (split over multiple lists of RTE_MAX_MEM_PER_LIST),
>> +# whichever is the smallest
>> +CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
>> +CONFIG_RTE_MAX_MEM_PER_TYPE=128
>> +# legacy mem mode only
>> +CONFIG_RTE_MAX_LEGACY_MEMSEG=256
> 
> Would it be possible to suffix CONFIG_RTE_MAX_MEM_PER_LIST and
> CONFIG_RTE_MAX_MEM_PER_TYPE with _GB? It's not that obvious that it is
> gigabytes.

Sure, will add this.

> 
> What is the impact of changing one of these values on the ABI?

Some of them will change the ABI, some won't. MAX_MEMSEG_LISTS will 
change the ABI because it's part of rte_eal_memconfig, but the other 
values are not and are only used during init (and LEGACY_MEMSEG is 
already removed in the GitHub code).

> And what would be the impact on performance?

It depends on what you mean by performance. Generally, no impact on 
performance will be noticeable, because we're not really doing anything 
differently - a page is a page, no matter how or when it is mapped. 
These changes might also speed up some lookup operations on the memseg 
lists themselves.

> The underlying question is: shall we increase these values to avoid changing them later?
> 

I do plan to increase the MAX_MEMSEG_LISTS value to at least 64.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-19  8:58   ` Shreyansh Jain
@ 2018-03-20 10:07     ` Burakov, Anatoly
  2018-03-29 10:57       ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20 10:07 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev, Thomas Monjalon

On 19-Mar-18 8:58 AM, Shreyansh Jain wrote:
> Hi Anatoly,
> 
> On Wed, Mar 7, 2018 at 10:26 PM, Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>> This patchset introduces dynamic memory allocation for DPDK (aka memory
>> hotplug). Based upon RFC submitted in December [1].
>>
>> Dependencies (to be applied in specified order):
>> - IPC bugfixes patchset [2]
>> - IPC improvements patchset [3]
>> - IPC asynchronous request API patch [4]
>> - Function to return number of sockets [5]
>>
>> Deprecation notices relevant to this patchset:
>> - General outline of memory hotplug changes [6]
>> - EAL NUMA node count changes [7]
>>
>> The vast majority of changes are in the EAL and malloc, the external API
>> disruption is minimal: a new set of API's are added for contiguous memory
>> allocation for rte_memzone, and a few API additions in rte_memory due to
>> switch to memseg_lists as opposed to memsegs. Every other API change is
>> internal to EAL, and all of the memory allocation/freeing is handled
>> through rte_malloc, with no externally visible API changes.
>>
>> Quick outline of all changes done as part of this patchset:
>>
>>   * Malloc heap adjusted to handle holes in address space
>>   * Single memseg list replaced by multiple memseg lists
>>   * VA space for hugepages is preallocated in advance
>>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>>   * Added contiguous memory allocation API's for rte_memzone
>>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>>     with VFIO [8]
>>   * Callbacks for registering memory allocations
>>   * Multiprocess support done via DPDK IPC introduced in 18.02
>>
>> The biggest difference is a "memseg" now represents a single page (as opposed to
>> being a big contiguous block of pages). As a consequence, both memzones and
>> malloc elements are no longer guaranteed to be physically contiguous, unless
>> the user asks for it at reserve time. To preserve whatever functionality that
>> was dependent on previous behavior, a legacy memory option is also provided,
>> however it is expected (or perhaps vainly hoped) to be temporary solution.
>>
>> Why multiple memseg lists instead of one? Since memseg is a single page now,
>> the list of memsegs will get quite big, and we need to locate pages somehow
>> when we allocate and free them. We could of course just walk the list and
>> allocate one contiguous chunk of VA space for memsegs, but this
>> implementation uses separate lists instead in order to speed up many
>> operations with memseg lists.
>>
>> For v1 and v2, the following limitations are present:
>> - FreeBSD does not even compile, let alone run
>> - No 32-bit support
> 
> I just read on announce@dpdk.org [1] that an early merge of this
> series is expected. So, would this limitation be fixed before the merge?
> Or has it already been fixed in the GitHub repo?
> 
> [1] http://dpdk.org/ml/archives/announce/2018-March/000182.html
> 
> [...]
> 
> -
> Shreyansh
> 

Hi Shreyansh,

It will be fixed before the merge, yes. I would expect this code to 
arrive on GitHub in the next few days.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-19 17:30 ` Olivier Matz
@ 2018-03-20 10:27   ` Burakov, Anatoly
  2018-03-20 12:42     ` Olivier Matz
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20 10:27 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 19-Mar-18 5:30 PM, Olivier Matz wrote:
> Hi Anatoly,
> 
> On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
>> This patchset introduces dynamic memory allocation for DPDK (aka memory
>> hotplug). Based upon RFC submitted in December [1].
>>
>> Dependencies (to be applied in specified order):
>> - IPC bugfixes patchset [2]
>> - IPC improvements patchset [3]
>> - IPC asynchronous request API patch [4]
>> - Function to return number of sockets [5]
>>
>> Deprecation notices relevant to this patchset:
>> - General outline of memory hotplug changes [6]
>> - EAL NUMA node count changes [7]
>>
>> The vast majority of changes are in the EAL and malloc, the external API
>> disruption is minimal: a new set of API's are added for contiguous memory
>> allocation for rte_memzone, and a few API additions in rte_memory due to
>> switch to memseg_lists as opposed to memsegs. Every other API change is
>> internal to EAL, and all of the memory allocation/freeing is handled
>> through rte_malloc, with no externally visible API changes.
>>
>> Quick outline of all changes done as part of this patchset:
>>
>>   * Malloc heap adjusted to handle holes in address space
>>   * Single memseg list replaced by multiple memseg lists
>>   * VA space for hugepages is preallocated in advance
>>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>>   * Added contiguous memory allocation API's for rte_memzone
>>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>>     with VFIO [8]
>>   * Callbacks for registering memory allocations
>>   * Multiprocess support done via DPDK IPC introduced in 18.02
>>
>> The biggest difference is a "memseg" now represents a single page (as opposed to
>> being a big contiguous block of pages). As a consequence, both memzones and
>> malloc elements are no longer guaranteed to be physically contiguous, unless
>> the user asks for it at reserve time. To preserve whatever functionality that
>> was dependent on previous behavior, a legacy memory option is also provided,
>> however it is expected (or perhaps vainly hoped) to be temporary solution.
>>
>> Why multiple memseg lists instead of one? Since memseg is a single page now,
>> the list of memsegs will get quite big, and we need to locate pages somehow
>> when we allocate and free them. We could of course just walk the list and
>> allocate one contiguous chunk of VA space for memsegs, but this
>> implementation uses separate lists instead in order to speed up many
>> operations with memseg lists.
>>
>> For v1, the following limitations are present:
>> - FreeBSD does not even compile, let alone run
>> - No 32-bit support
>> - There are some minor quality-of-life improvements planned that aren't
>>    ready yet and will be part of v2
>> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>>    with secondary processes is not tested; work is ongoing to validate VFIO
>>    for all use cases
>> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>>    IOMMU mode - help from sPAPR maintainers requested
>>
>> Nevertheless, this patchset should be testable under 64-bit Linux, and
>> should work for all use cases bar those mentioned above.
>>
>> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
>> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
>> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
>> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
>> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
>> [6] http://dpdk.org/dev/patchwork/patch/34002/
>> [7] http://dpdk.org/dev/patchwork/patch/33853/
>> [8] http://dpdk.org/dev/patchwork/patch/24484/
> 
> I did a quick pass on your patches (unfortunately, I don't have
> the time to really dive into it).
> 
> I have few questions/comments:
> 
> - This is really a big patchset. Thank you for working on this topic.
>    I'll try to test our application with it as soon as possible.
> 
> - I see from patch 17 that it is possible that rte_malloc() expands
>    the heap by requesting more memory from the OS? Did I understand correctly?
>    Today, a good property of rte_malloc() compared to malloc() is that
>    it won't interrupt the process (the worst case is a spinlock). This
>    is appreciable on a dataplane core. Will it change?

Hi Olivier,

Not sure what you mean by "interrupt the process". The new rte_malloc 
will _mostly_ work just like the old one. There are now two levels of 
locks: the heap lock, and the system allocation lock. If your rte_malloc 
call requests an amount of memory that can be satisfied by already 
allocated memory, then only the heap lock is engaged - or, to put it in 
other words, things work as before.

When you *don't* have enough memory allocated, rte_malloc would previously 
just fail. Now, it will instead take the second lock and try to 
allocate more memory from the system. This requires IPC (to ensure all 
processes have allocated/freed the same memory), so it will take way 
longer (the timeout is set to wait up to 5 seconds, although under normal 
circumstances it takes a lot less - depending on how many processes 
you have running, but generally under 100ms), and will block other 
system allocations (i.e. if another rte_malloc call on another heap is 
trying to request more memory from the system).

So, in short - you can't allocate from the same heap in parallel (same 
as before), and you can't have parallel system memory allocation 
requests (regardless of which heap they come from). The latter 
*only* applies to system memory allocations - that is, if one heap is 
allocating system memory while another heap receives an allocation request 
*and is able to satisfy it from already allocated memory*, that request 
will not block, because the second lock is never engaged.
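
To put the same thing in (toy) code - plain pthread locks, invented names, no 
IPC, so definitely not the actual implementation - the locking structure is 
roughly:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct toy_heap {
        pthread_mutex_t lock;    /* per-heap lock, always taken */
        size_t free_space;       /* memory we can hand out right away */
    };

    /* single lock guarding "ask the system for more memory" - this is
     * where the real code does the IPC synchronization
     */
    static pthread_mutex_t system_alloc_lock = PTHREAD_MUTEX_INITIALIZER;

    static bool
    grow_heap(struct toy_heap *h, size_t need)
    {
        h->free_space += need;    /* pretend the system gave us pages */
        return true;
    }

    static bool
    toy_alloc(struct toy_heap *h, size_t size)
    {
        bool ok = false;

        pthread_mutex_lock(&h->lock);
        if (h->free_space >= size) {
            /* fast path: only this heap's lock is involved */
            h->free_space -= size;
            ok = true;
        } else {
            /* slow path: serializes against other *system* allocations,
             * but not against fast-path allocations on other heaps
             */
            pthread_mutex_lock(&system_alloc_lock);
            if (grow_heap(h, size)) {
                h->free_space -= size;
                ok = true;
            }
            pthread_mutex_unlock(&system_alloc_lock);
        }
        pthread_mutex_unlock(&h->lock);
        return ok;
    }

The important property is that system_alloc_lock is only ever taken on the 
slow path.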

> 
> - It's not a big issue, but I have the feeling that the "const" qualifier
>    is often forgotten in the patchset. I think it is helpful for
>    optimization and documentation, and to detect bugs that modify or free
>    something that should not be touched.

Generally, if things aren't const, it's for a reason :) I made 
things const by default and removed constness only where I needed to. However, 
there may have been a few places where I changed the code around but 
forgot to put constness back. I'll look into it.

Thanks for your reviews!

> 
> I'm sending some other dummy comments as replies to patches.
> 
> Thanks,
> Olivier
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-07 16:56 ` [PATCH v2 23/41] mempool: add support for the new allocation methods Anatoly Burakov
  2018-03-19 17:11   ` Olivier Matz
@ 2018-03-20 11:35   ` Shreyansh Jain
  2018-03-20 12:17     ` Burakov, Anatoly
  2018-03-23 11:25     ` Burakov, Anatoly
  1 sibling, 2 replies; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-20 11:35 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, Bruce Richardson,
	Thomas Monjalon, Ananyev, Konstantin, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	Hemant Agrawal

Hello Anatoly,

On Wed, Mar 7, 2018 at 10:26 PM, Anatoly Burakov
<anatoly.burakov@intel.com> wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation API's instead of normal ones.
> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---

[...]

>  static void
>  mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
>  {
> @@ -549,6 +570,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>         unsigned mz_id, n;
>         unsigned int mp_flags;
>         int ret;
> +       bool force_contig, no_contig;
>
>         /* mempool must not be populated */
>         if (mp->nb_mem_chunks != 0)
> @@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>         /* update mempool capabilities */
>         mp->flags |= mp_flags;
>
> -       if (rte_eal_has_hugepages()) {
> -               pg_shift = 0; /* not needed, zone is physically contiguous */
> +       no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
> +       force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
> +
> +       /*
> +        * there are several considerations for page size and page shift here.
> +        *
> +        * if we don't need our mempools to have physically contiguous objects,
> +        * then just set page shift and page size to 0, because the user has
> +        * indicated that there's no need to care about anything.

I think the above case is not handled properly here.
Reason below...

> +        *
> +        * if we do need contiguous objects, there is also an option to reserve
> +        * the entire mempool memory as one contiguous block of memory, in
> +        * which case the page shift and alignment wouldn't matter as well.
> +        *
> +        * if we require contiguous objects, but not necessarily the entire
> +        * mempool reserved space to be contiguous, then there are two options.
> +        *
> +        * if our IO addresses are virtual, not actual physical (IOVA as VA
> +        * case), then no page shift needed - our memory allocation will give us
> +        * contiguous physical memory as far as the hardware is concerned, so
> +        * act as if we're getting contiguous memory.
> +        *
> +        * if our IO addresses are physical, we may get memory from bigger
> +        * pages, or we might get memory from smaller pages, and how much of it
> +        * we require depends on whether we want bigger or smaller pages.
> +        * However, requesting each and every memory size is too much work, so
> +        * what we'll do instead is walk through the page sizes available, pick
> +        * the smallest one and set up page shift to match that one. We will be
> +        * wasting some space this way, but it's much nicer than looping around
> +        * trying to reserve each and every page size.
> +        */
> +
> +       if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
>                 pg_sz = 0;
> +               pg_shift = 0;
>                 align = RTE_CACHE_LINE_SIZE;

So, taking dpaa2 as an example, I ran testpmd. IOVA=VA is the mode, and
pg_sz = 0 is set.
This is the same as before applying the hotplug patchset, except that earlier
this decision was purely based on the availability of hugepages
(rte_eal_has_hugepages()).
Moving on...

> +       } else if (rte_eal_has_hugepages()) {
> +               pg_sz = get_min_page_size();
> +               pg_shift = rte_bsf32(pg_sz);
> +               align = pg_sz;
>         } else {
>                 pg_sz = getpagesize();
>                 pg_shift = rte_bsf32(pg_sz);
> @@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>                         goto fail;
>                 }
>
> -               mz = rte_memzone_reserve_aligned(mz_name, size,
> -                       mp->socket_id, mz_flags, align);
> -               /* not enough memory, retry with the biggest zone we have */
> -               if (mz == NULL)
> -                       mz = rte_memzone_reserve_aligned(mz_name, 0,
> +               if (force_contig) {
> +                       /*
> +                        * if contiguous memory for entire mempool memory was
> +                        * requested, don't try reserving again if we fail.
> +                        */
> +                       mz = rte_memzone_reserve_aligned_contig(mz_name, size,
> +                               mp->socket_id, mz_flags, align);
> +               } else {
> +                       mz = rte_memzone_reserve_aligned(mz_name, size,
>                                 mp->socket_id, mz_flags, align);
> +                       /* not enough memory, retry with the biggest zone we
> +                        * have
> +                        */
> +                       if (mz == NULL)
> +                               mz = rte_memzone_reserve_aligned(mz_name, 0,
> +                                       mp->socket_id, mz_flags, align);
> +               }
>                 if (mz == NULL) {
>                         ret = -rte_errno;
>                         goto fail;
>                 }
>
> -               if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
> +               if (no_contig)
>                         iova = RTE_BAD_IOVA;
>                 else
>                         iova = mz->iova;
>
> -               if (rte_eal_has_hugepages())
> +               if (rte_eal_has_hugepages() && force_contig)

So, before the hotplug patchset, the call used to enter
rte_mempool_populate_iova. But with 'force_contig' not set (in
app/test-pmd/testpmd.c:521) when calling rte_pktmbuf_pool_create,
rte_mempool_populate_virt is called instead.

>                         ret = rte_mempool_populate_iova(mp, mz->addr,
>                                 iova, mz->len,
>                                 rte_mempool_memchunk_mz_free,
> --
> 2.7.4

This is called with pg_sz = 0:
678                 else
>># 679                   ret = rte_mempool_populate_virt(mp, mz->addr,
680                                 mz->len, pg_sz,
681                                 rte_mempool_memchunk_mz_free,
682                                 (void *)(uintptr_t)mz);

In this function,

512         /* address and len must be page-aligned */
513         if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
514                 return -EINVAL;

This is where error is returned.

I don't think RTE_PTR_ALIGN_CEIL is designed to handle pg_sz = 0.

It is roughly equivalent to:
RTE_PTR_ALIGN_FLOOR(((uintptr_t)addr - 1), pg_sz), which returns NULL
when pg_sz is 0.
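
To make that concrete, here is the macro expanded by hand for align == 0
(assuming the rte_common.h definitions I am looking at are current):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uintptr_t addr = 0x100000;    /* arbitrary page-aligned address */
        uintptr_t align = 0;          /* pg_sz == 0 */

        /* RTE_PTR_ALIGN_CEIL(addr, align) boils down to roughly:
         *   (addr + align - 1) & ~(align - 1)
         * with align == 0: align - 1 == UINTPTR_MAX, so ~(align - 1) == 0
         * and the whole expression collapses to 0 (NULL)
         */
        uintptr_t ceil = (addr + align - 1) & ~(align - 1);

        printf("%#lx\n", (unsigned long)ceil);    /* prints 0, i.e. != addr */
        return 0;
    }

So the RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr check at line 513 trips and
-EINVAL is returned.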

Basically, this ends up failing rte_mempool_populate_default.

I think the assumption that rte_mempool_populate_virt can handle a page
size of 0 is wrong (there would be issues besides the above
RTE_PTR_ALIGN_CEIL as well, like the for-loop advancing by off += pg_sz).
It needs a valid, non-zero page-size value to work with.

So, basically, DPAA2 is stuck with this patch because of the above issue,
if I am comprehending it correctly.

Regards,
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-20 11:35   ` Shreyansh Jain
@ 2018-03-20 12:17     ` Burakov, Anatoly
  2018-03-23 11:25     ` Burakov, Anatoly
  1 sibling, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20 12:17 UTC (permalink / raw)
  To: Shreyansh Jain
  Cc: dev, Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, Bruce Richardson,
	Thomas Monjalon, Ananyev, Konstantin, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	Hemant Agrawal

On 20-Mar-18 11:35 AM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
> On Wed, Mar 7, 2018 at 10:26 PM, Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>> If a user has specified that the zone should have contiguous memory,
>> use the new _contig allocation API's instead of normal ones.
>> Otherwise, account for the fact that unless we're in IOVA_AS_VA
>> mode, we cannot guarantee that the pages would be physically
>> contiguous, so we calculate the memzone size and alignments as if
>> we were getting the smallest page size available.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
> 
> [...]
> 
>>   static void
>>   mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
>>   {
>> @@ -549,6 +570,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>          unsigned mz_id, n;
>>          unsigned int mp_flags;
>>          int ret;
>> +       bool force_contig, no_contig;
>>
>>          /* mempool must not be populated */
>>          if (mp->nb_mem_chunks != 0)
>> @@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>          /* update mempool capabilities */
>>          mp->flags |= mp_flags;
>>
>> -       if (rte_eal_has_hugepages()) {
>> -               pg_shift = 0; /* not needed, zone is physically contiguous */
>> +       no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
>> +       force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
>> +
>> +       /*
>> +        * there are several considerations for page size and page shift here.
>> +        *
>> +        * if we don't need our mempools to have physically contiguous objects,
>> +        * then just set page shift and page size to 0, because the user has
>> +        * indicated that there's no need to care about anything.
> 
> I think the above case is not handled properly here.
> reason below...
> 
>> +        *
>> +        * if we do need contiguous objects, there is also an option to reserve
>> +        * the entire mempool memory as one contiguous block of memory, in
>> +        * which case the page shift and alignment wouldn't matter as well.
>> +        *
>> +        * if we require contiguous objects, but not necessarily the entire
>> +        * mempool reserved space to be contiguous, then there are two options.
>> +        *
>> +        * if our IO addresses are virtual, not actual physical (IOVA as VA
>> +        * case), then no page shift needed - our memory allocation will give us
>> +        * contiguous physical memory as far as the hardware is concerned, so
>> +        * act as if we're getting contiguous memory.
>> +        *
>> +        * if our IO addresses are physical, we may get memory from bigger
>> +        * pages, or we might get memory from smaller pages, and how much of it
>> +        * we require depends on whether we want bigger or smaller pages.
>> +        * However, requesting each and every memory size is too much work, so
>> +        * what we'll do instead is walk through the page sizes available, pick
>> +        * the smallest one and set up page shift to match that one. We will be
>> +        * wasting some space this way, but it's much nicer than looping around
>> +        * trying to reserve each and every page size.
>> +        */
>> +
>> +       if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
>>                  pg_sz = 0;
>> +               pg_shift = 0;
>>                  align = RTE_CACHE_LINE_SIZE;
> 
> So, taking dpaa2 as an example, I ran testpmd. IOVA=VA is the mode,
> and pg_sz = 0 is set.
> This is the same as before applying the hotplug patchset, except that
> earlier this decision was purely based on the availability of
> hugepages (rte_eal_has_hugepages()).
> Moving on...
> 
>> +       } else if (rte_eal_has_hugepages()) {
>> +               pg_sz = get_min_page_size();
>> +               pg_shift = rte_bsf32(pg_sz);
>> +               align = pg_sz;
>>          } else {
>>                  pg_sz = getpagesize();
>>                  pg_shift = rte_bsf32(pg_sz);
>> @@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>                          goto fail;
>>                  }
>>
>> -               mz = rte_memzone_reserve_aligned(mz_name, size,
>> -                       mp->socket_id, mz_flags, align);
>> -               /* not enough memory, retry with the biggest zone we have */
>> -               if (mz == NULL)
>> -                       mz = rte_memzone_reserve_aligned(mz_name, 0,
>> +               if (force_contig) {
>> +                       /*
>> +                        * if contiguous memory for entire mempool memory was
>> +                        * requested, don't try reserving again if we fail.
>> +                        */
>> +                       mz = rte_memzone_reserve_aligned_contig(mz_name, size,
>> +                               mp->socket_id, mz_flags, align);
>> +               } else {
>> +                       mz = rte_memzone_reserve_aligned(mz_name, size,
>>                                  mp->socket_id, mz_flags, align);
>> +                       /* not enough memory, retry with the biggest zone we
>> +                        * have
>> +                        */
>> +                       if (mz == NULL)
>> +                               mz = rte_memzone_reserve_aligned(mz_name, 0,
>> +                                       mp->socket_id, mz_flags, align);
>> +               }
>>                  if (mz == NULL) {
>>                          ret = -rte_errno;
>>                          goto fail;
>>                  }
>>
>> -               if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
>> +               if (no_contig)
>>                          iova = RTE_BAD_IOVA;
>>                  else
>>                          iova = mz->iova;
>>
>> -               if (rte_eal_has_hugepages())
>> +               if (rte_eal_has_hugepages() && force_contig)
> 
> So, before the hotplug patchset, the call used to enter
> rte_mempool_populate_iova. But, with 'force_contig' not set (in
> app/test-pmd/testpmd.c:521) while calling rte_pktmbuf_pool_create,
> rte_mempool_populate_virt is called instead.
> 
>>                          ret = rte_mempool_populate_iova(mp, mz->addr,
>>                                  iova, mz->len,
>>                                  rte_mempool_memchunk_mz_free,
>> --
>> 2.7.4
> 
> This is called with pg_sz = 0:
> 678                 else
>>> # 679                   ret = rte_mempool_populate_virt(mp, mz->addr,
> 680                                 mz->len, pg_sz,
> 681                                 rte_mempool_memchunk_mz_free,
> 682                                 (void *)(uintptr_t)mz);
> 
> In this function,
> 
> 512         /* address and len must be page-aligned */
> 513         if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> 514                 return -EINVAL;
> 
> This is where error is returned.
> 
> I don't think RTE_PTR_ALIGN_CEIL is designed to handle pg_sz = 0.
> 
> It is roughly equivalent to:
> RTE_PTR_ALIGN_FLOOR(((uintptr_t)addr - 1), pg_sz), which returns NULL
> because the mask ~(pg_sz - 1) is 0 when pg_sz is 0.
> 
> Basically, this ends up failing rte_mempool_populate_default.
> 
> I think the assumption that rte_mempool_populate_virt can handle a
> page size of 0 is wrong (there would be issues besides the above
> RTE_PTR_ALIGN_CEIL as well, like the for-loop stepping by off +
> pg_sz). It needs a valid, non-zero page-size value to work with.
> 
> So, basically, DPAA2 is stuck with this patch because of above issue,
> if I am correctly comprehending it as above.
> 
> Regards,
> Shreyansh
> 

Thanks for testing this. I'll look into fixing it.
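
For reference, the failure mode reduces to the align-with-zero
arithmetic below. This is a standalone sketch of the macro math
(simplified stand-ins, not the actual rte_common.h definitions):

#include <stdint.h>
#include <stdio.h>

/* simplified stand-ins for RTE_PTR_ALIGN_FLOOR/CEIL, for illustration */
static void *align_floor(void *p, uintptr_t align)
{
	return (void *)((uintptr_t)p & ~(align - 1));
}

static void *align_ceil(void *p, uintptr_t align)
{
	return align_floor((void *)((uintptr_t)p + align - 1), align);
}

int main(void)
{
	char buf[16];

	/* with pg_sz == 0, the mask ~(align - 1) is 0, so the result is
	 * NULL and the "addr must be page-aligned" check fails (-EINVAL) */
	printf("%p -> %p\n", (void *)buf, align_ceil(buf, 0));
	return 0;
}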

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-20 10:27   ` Burakov, Anatoly
@ 2018-03-20 12:42     ` Olivier Matz
  2018-03-20 13:51       ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-20 12:42 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Tue, Mar 20, 2018 at 10:27:55AM +0000, Burakov, Anatoly wrote:
> On 19-Mar-18 5:30 PM, Olivier Matz wrote:
> > Hi Anatoly,
> > 
> > On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
> > > This patchset introduces dynamic memory allocation for DPDK (aka memory
> > > hotplug). Based upon RFC submitted in December [1].
> > > 
> > > Dependencies (to be applied in specified order):
> > > - IPC bugfixes patchset [2]
> > > - IPC improvements patchset [3]
> > > - IPC asynchronous request API patch [4]
> > > - Function to return number of sockets [5]
> > > 
> > > Deprecation notices relevant to this patchset:
> > > - General outline of memory hotplug changes [6]
> > > - EAL NUMA node count changes [7]
> > > 
> > > The vast majority of changes are in the EAL and malloc, the external API
> > > disruption is minimal: a new set of API's are added for contiguous memory
> > > allocation for rte_memzone, and a few API additions in rte_memory due to
> > > switch to memseg_lists as opposed to memsegs. Every other API change is
> > > internal to EAL, and all of the memory allocation/freeing is handled
> > > through rte_malloc, with no externally visible API changes.
> > > 
> > > Quick outline of all changes done as part of this patchset:
> > > 
> > >   * Malloc heap adjusted to handle holes in address space
> > >   * Single memseg list replaced by multiple memseg lists
> > >   * VA space for hugepages is preallocated in advance
> > >   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
> > >   * Added contiguous memory allocation API's for rte_memzone
> > >   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
> > >     with VFIO [8]
> > >   * Callbacks for registering memory allocations
> > >   * Multiprocess support done via DPDK IPC introduced in 18.02
> > > 
> > > The biggest difference is a "memseg" now represents a single page (as opposed to
> > > being a big contiguous block of pages). As a consequence, both memzones and
> > > malloc elements are no longer guaranteed to be physically contiguous, unless
> > > the user asks for it at reserve time. To preserve whatever functionality that
> > > was dependent on previous behavior, a legacy memory option is also provided,
> > > however it is expected (or perhaps vainly hoped) to be temporary solution.
> > > 
> > > Why multiple memseg lists instead of one? Since memseg is a single page now,
> > > the list of memsegs will get quite big, and we need to locate pages somehow
> > > when we allocate and free them. We could of course just walk the list and
> > > allocate one contiguous chunk of VA space for memsegs, but this
> > > implementation uses separate lists instead in order to speed up many
> > > operations with memseg lists.
> > > 
> > > For v1, the following limitations are present:
> > > - FreeBSD does not even compile, let alone run
> > > - No 32-bit support
> > > - There are some minor quality-of-life improvements planned that aren't
> > >    ready yet and will be part of v2
> > > - VFIO support is only smoke-tested (but is expected to work), VFIO support
> > >    with secondary processes is not tested; work is ongoing to validate VFIO
> > >    for all use cases
> > > - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
> > >    IOMMU mode - help from sPAPR maintainers requested
> > > 
> > > Nevertheless, this patchset should be testable under 64-bit Linux, and
> > > should work for all use cases bar those mentioned above.
> > > 
> > > [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> > > [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
> > > [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
> > > [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> > > [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> > > [6] http://dpdk.org/dev/patchwork/patch/34002/
> > > [7] http://dpdk.org/dev/patchwork/patch/33853/
> > > [8] http://dpdk.org/dev/patchwork/patch/24484/
> > 
> > I did a quick pass on your patches (unfortunately, I don't have
> > the time to really dive in it).
> > 
> > I have few questions/comments:
> > 
> > - This is really a big patchset. Thank you for working on this topic.
> >    I'll try to test our application with it as soon as possible.
> > 
> > - I see from patch 17 that it is possible that rte_malloc() expands
> >    the heap by requesting more memory to the OS? Did I understand well?
> >    Today, a good property of rte_malloc() compared to malloc() is that
> >    it won't interrupt the process (the worst case is a spinlock). This
> >    is appreciable on a dataplane core. Will it change?
> 
> Hi Olivier,
> 
> Not sure what you mean by "interrupt the process". The new rte_malloc will
> _mostly_ work just like the old one. There are now two levels of locks: the
> heap lock, and the system allocation lock. If your rte_malloc call requests
> amount of memory that can be satisfied by already allocated memory, then
> only the heap lock is engaged - or, to put it in other words, things work as
> before.
> 
> When you *don't* have enough memory allocated, previously rte_malloc would
> just fail. Now, it instead will lock the second lock and try to allocate
> more memory from the system. This requires IPC (to ensure all processes have
> allocated/freed the same memory), so this will take way longer (timeout is
> set to wait up to 5 seconds, although under normal circumstances it's taking
> a lot less - depending on how many processes you have running, but generally
> under 100ms), and will block other system allocations (i.e. if another
> rte_malloc call on another heap is trying to request more memory from the
> system).
> 
> So, in short - you can't allocate from the same heap in parallel (same as
> before), and you can't have parallel system memory allocation requests
> (regardless of from which heap it comes from). The latter *only* applies to
> system memory allocations - that is, if one heap is allocating system memory
> while another heap receives allocation request *and is able to satisfy it
> from already allocated memory*, it will not block, because the second lock
> is never engaged.

OK. Let's imagine you are using rte_malloc() on a dataplane core, and
you run out of memory. Previously, the allocation would just fail. Now,
if my understanding is correct, it can block for a long time, which can
be a problem on a dataplane core, because it will cause packet losses,
especially if it also blocks allocations on other cores during that
time. In this case, it could be useful to make the dynamic heap resizing
feature optional.

I have another question about the patchset. Today, it is not really
possible for an application to allocate a page. If you want a full page
(ex: 2M), you need to allocate 4M because the rte_malloc layer adds a
header before the allocated memory. Therefore, if the memory is
fragmented a lot with only 2M pages, you cannot allocate them as pages.

Is it possible, with your patchset or in the future, to have access
to a page-based allocator? The use case is for an application to be
able to ask for pages in DPDK memory and remap them into virtually
contiguous memory.

Thanks
Olivier

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-20 12:42     ` Olivier Matz
@ 2018-03-20 13:51       ` Burakov, Anatoly
  2018-03-20 14:18         ` Olivier Matz
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20 13:51 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 20-Mar-18 12:42 PM, Olivier Matz wrote:
> On Tue, Mar 20, 2018 at 10:27:55AM +0000, Burakov, Anatoly wrote:
>> On 19-Mar-18 5:30 PM, Olivier Matz wrote:
>>> Hi Anatoly,
>>>
>>> On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
>>>> This patchset introduces dynamic memory allocation for DPDK (aka memory
>>>> hotplug). Based upon RFC submitted in December [1].
>>>>
>>>> Dependencies (to be applied in specified order):
>>>> - IPC bugfixes patchset [2]
>>>> - IPC improvements patchset [3]
>>>> - IPC asynchronous request API patch [4]
>>>> - Function to return number of sockets [5]
>>>>
>>>> Deprecation notices relevant to this patchset:
>>>> - General outline of memory hotplug changes [6]
>>>> - EAL NUMA node count changes [7]
>>>>
>>>> The vast majority of changes are in the EAL and malloc, the external API
>>>> disruption is minimal: a new set of API's are added for contiguous memory
>>>> allocation for rte_memzone, and a few API additions in rte_memory due to
>>>> switch to memseg_lists as opposed to memsegs. Every other API change is
>>>> internal to EAL, and all of the memory allocation/freeing is handled
>>>> through rte_malloc, with no externally visible API changes.
>>>>
>>>> Quick outline of all changes done as part of this patchset:
>>>>
>>>>    * Malloc heap adjusted to handle holes in address space
>>>>    * Single memseg list replaced by multiple memseg lists
>>>>    * VA space for hugepages is preallocated in advance
>>>>    * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>>>>    * Added contiguous memory allocation API's for rte_memzone
>>>>    * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>>>>      with VFIO [8]
>>>>    * Callbacks for registering memory allocations
>>>>    * Multiprocess support done via DPDK IPC introduced in 18.02
>>>>
>>>> The biggest difference is a "memseg" now represents a single page (as opposed to
>>>> being a big contiguous block of pages). As a consequence, both memzones and
>>>> malloc elements are no longer guaranteed to be physically contiguous, unless
>>>> the user asks for it at reserve time. To preserve whatever functionality that
>>>> was dependent on previous behavior, a legacy memory option is also provided,
>>>> however it is expected (or perhaps vainly hoped) to be temporary solution.
>>>>
>>>> Why multiple memseg lists instead of one? Since memseg is a single page now,
>>>> the list of memsegs will get quite big, and we need to locate pages somehow
>>>> when we allocate and free them. We could of course just walk the list and
>>>> allocate one contiguous chunk of VA space for memsegs, but this
>>>> implementation uses separate lists instead in order to speed up many
>>>> operations with memseg lists.
>>>>
>>>> For v1, the following limitations are present:
>>>> - FreeBSD does not even compile, let alone run
>>>> - No 32-bit support
>>>> - There are some minor quality-of-life improvements planned that aren't
>>>>     ready yet and will be part of v2
>>>> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>>>>     with secondary processes is not tested; work is ongoing to validate VFIO
>>>>     for all use cases
>>>> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>>>>     IOMMU mode - help from sPAPR maintainers requested
>>>>
>>>> Nevertheless, this patchset should be testable under 64-bit Linux, and
>>>> should work for all use cases bar those mentioned above.
>>>>
>>>> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
>>>> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
>>>> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
>>>> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
>>>> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
>>>> [6] http://dpdk.org/dev/patchwork/patch/34002/
>>>> [7] http://dpdk.org/dev/patchwork/patch/33853/
>>>> [8] http://dpdk.org/dev/patchwork/patch/24484/
>>>
>>> I did a quick pass on your patches (unfortunately, I don't have
>>> the time to really dive in it).
>>>
>>> I have few questions/comments:
>>>
>>> - This is really a big patchset. Thank you for working on this topic.
>>>     I'll try to test our application with it as soon as possible.
>>>
>>> - I see from patch 17 that it is possible that rte_malloc() expands
>>>     the heap by requesting more memory to the OS? Did I understand well?
>>>     Today, a good property of rte_malloc() compared to malloc() is that
>>>     it won't interrupt the process (the worst case is a spinlock). This
>>>     is appreciable on a dataplane core. Will it change?
>>
>> Hi Olivier,
>>
>> Not sure what you mean by "interrupt the process". The new rte_malloc will
>> _mostly_ work just like the old one. There are now two levels of locks: the
>> heap lock, and the system allocation lock. If your rte_malloc call requests
>> amount of memory that can be satisfied by already allocated memory, then
>> only the heap lock is engaged - or, to put it in other words, things work as
>> before.
>>
>> When you *don't* have enough memory allocated, previously rte_malloc would
>> just fail. Now, it instead will lock the second lock and try to allocate
>> more memory from the system. This requires IPC (to ensure all processes have
>> allocated/freed the same memory), so this will take way longer (timeout is
>> set to wait up to 5 seconds, although under normal circumstances it's taking
>> a lot less - depending on how many processes you have running, but generally
>> under 100ms), and will block other system allocations (i.e. if another
>> rte_malloc call on another heap is trying to request more memory from the
>> system).
>>
>> So, in short - you can't allocate from the same heap in parallel (same as
>> before), and you can't have parallel system memory allocation requests
>> (regardless of from which heap it comes from). The latter *only* applies to
>> system memory allocations - that is, if one heap is allocating system memory
>> while another heap receives allocation request *and is able to satisfy it
>> from already allocated memory*, it will not block, because the second lock
>> is never engaged.
> 
> OK. Let's imagine you are using rte_malloc() on a dataplane core, and
> you run out of memory. Previously, the allocation would just fail. Now,
> if my understanding is correct, it can block for a long time, which can
> be a problem on a dataplane core, because it will cause packet losses,
> especially if it also blocks allocations on other cores during that
> time. In this case, it could be useful to make the dynamic heap resizing
> feature optional.

Why would anyone in their right mind call rte_malloc on fast path? If 
you're referring to mempool allocations/deallocations, then this is a 
completely separate subject, as mempool alloc/free is not handled by 
rte_malloc but is handled by rte_mempool itself - as far as rte_malloc 
is concerned, that memory is already allocated and it will not touch it.

As for "making heap resizing feature optional", i'm working on 
functionality that would essentially enable that. Specifically, i'm 
adding API's to set allocation limits and a callback which will get 
triggered once allocator tries to allocate beyond said limits, with an 
option of returning -1 and thus preventing this allocation from 
completing. While this is kind of a round-about way of doing it, it 
would have similar effect.

> 
> I have another question about the patchset. Today, it is not really
> possible for an application to allocate a page. If you want a full page
> (ex: 2M), you need to allocate 4M because the rte_malloc layer adds a
> header before the allocated memory. Therefore, if the memory is
> fragmented a lot with only 2M pages, you cannot allocate them as pages.
> 
> It is possible, with your patchset or in the future, to have an access
> to a page-based allocator? The use-case is to be able for an application
> to ask for pages in dpdk memory and remap them in a virtually contiguous
> memory.

Pages returned from our allocator are already virtually contiguous,
there is no need to do any remapping. If the user specifies the proper
size and alignment (i.e. reserves a memzone with RTE_MEMZONE_2MB and
with 2M size and alignment), it will essentially cause the allocator to
return a memzone that's exactly one page long. Yes, in the background,
it will allocate another page to store malloc metadata, and yes, memory
will become fragmented if multiple such allocations occur. It is not
possible (neither now nor in the planned future work) to do what you
describe unless we store malloc metadata separately from allocated
memory (which can be done, but is a non-trivial amount of work).

Malloc stores its metadata right in the hugepage mostly for
multiprocess purposes - so that the entire heap is always shared
between all processes. If we want to store malloc metadata separately
from allocated memory, a replacement mechanism for shared heap metadata
will need to be put in place (which, again, can be done, but is a
non-trivial amount of work - arguably for questionable gain).

That said, the use case you have described is already possible - just
allocate multiple pages from DPDK as a memzone, and overlay your own
memory allocator over that memory. This will have the same effect.
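
Roughly, such a page-sized reservation could look like the sketch below
(the zone name and the helper function are just illustrative, not
something the patchset adds):

#include <rte_memzone.h>
#include <rte_memory.h>
#include <rte_lcore.h>

/* reserve exactly one 2M page: 2M length, 2M alignment, 2M hugepages */
static const struct rte_memzone *
reserve_one_2m_page(void)
{
	return rte_memzone_reserve_aligned("example_page_mz", RTE_PGSIZE_2M,
			SOCKET_ID_ANY, RTE_MEMZONE_2MB, RTE_PGSIZE_2M);
}

The returned zone's addr is 2M-aligned virtual memory, so there is no
remapping step needed on the application side.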

> 
> Thanks
> Olivier
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-20 13:51       ` Burakov, Anatoly
@ 2018-03-20 14:18         ` Olivier Matz
  2018-03-20 14:46           ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Olivier Matz @ 2018-03-20 14:18 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

Hi,

On Tue, Mar 20, 2018 at 01:51:31PM +0000, Burakov, Anatoly wrote:
> On 20-Mar-18 12:42 PM, Olivier Matz wrote:
> > On Tue, Mar 20, 2018 at 10:27:55AM +0000, Burakov, Anatoly wrote:
> > > On 19-Mar-18 5:30 PM, Olivier Matz wrote:
> > > > Hi Anatoly,
> > > > 
> > > > On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
> > > > > This patchset introduces dynamic memory allocation for DPDK (aka memory
> > > > > hotplug). Based upon RFC submitted in December [1].
> > > > > 
> > > > > Dependencies (to be applied in specified order):
> > > > > - IPC bugfixes patchset [2]
> > > > > - IPC improvements patchset [3]
> > > > > - IPC asynchronous request API patch [4]
> > > > > - Function to return number of sockets [5]
> > > > > 
> > > > > Deprecation notices relevant to this patchset:
> > > > > - General outline of memory hotplug changes [6]
> > > > > - EAL NUMA node count changes [7]
> > > > > 
> > > > > The vast majority of changes are in the EAL and malloc, the external API
> > > > > disruption is minimal: a new set of API's are added for contiguous memory
> > > > > allocation for rte_memzone, and a few API additions in rte_memory due to
> > > > > switch to memseg_lists as opposed to memsegs. Every other API change is
> > > > > internal to EAL, and all of the memory allocation/freeing is handled
> > > > > through rte_malloc, with no externally visible API changes.
> > > > > 
> > > > > Quick outline of all changes done as part of this patchset:
> > > > > 
> > > > >    * Malloc heap adjusted to handle holes in address space
> > > > >    * Single memseg list replaced by multiple memseg lists
> > > > >    * VA space for hugepages is preallocated in advance
> > > > >    * Added alloc/free for pages happening as needed on rte_malloc/rte_free
> > > > >    * Added contiguous memory allocation API's for rte_memzone
> > > > >    * Integrated Pawel Wodkowski's patch for registering/unregistering memory
> > > > >      with VFIO [8]
> > > > >    * Callbacks for registering memory allocations
> > > > >    * Multiprocess support done via DPDK IPC introduced in 18.02
> > > > > 
> > > > > The biggest difference is a "memseg" now represents a single page (as opposed to
> > > > > being a big contiguous block of pages). As a consequence, both memzones and
> > > > > malloc elements are no longer guaranteed to be physically contiguous, unless
> > > > > the user asks for it at reserve time. To preserve whatever functionality that
> > > > > was dependent on previous behavior, a legacy memory option is also provided,
> > > > > however it is expected (or perhaps vainly hoped) to be temporary solution.
> > > > > 
> > > > > Why multiple memseg lists instead of one? Since memseg is a single page now,
> > > > > the list of memsegs will get quite big, and we need to locate pages somehow
> > > > > when we allocate and free them. We could of course just walk the list and
> > > > > allocate one contiguous chunk of VA space for memsegs, but this
> > > > > implementation uses separate lists instead in order to speed up many
> > > > > operations with memseg lists.
> > > > > 
> > > > > For v1, the following limitations are present:
> > > > > - FreeBSD does not even compile, let alone run
> > > > > - No 32-bit support
> > > > > - There are some minor quality-of-life improvements planned that aren't
> > > > >     ready yet and will be part of v2
> > > > > - VFIO support is only smoke-tested (but is expected to work), VFIO support
> > > > >     with secondary processes is not tested; work is ongoing to validate VFIO
> > > > >     for all use cases
> > > > > - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
> > > > >     IOMMU mode - help from sPAPR maintainers requested
> > > > > 
> > > > > Nevertheless, this patchset should be testable under 64-bit Linux, and
> > > > > should work for all use cases bar those mentioned above.
> > > > > 
> > > > > [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> > > > > [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
> > > > > [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
> > > > > [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> > > > > [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> > > > > [6] http://dpdk.org/dev/patchwork/patch/34002/
> > > > > [7] http://dpdk.org/dev/patchwork/patch/33853/
> > > > > [8] http://dpdk.org/dev/patchwork/patch/24484/
> > > > 
> > > > I did a quick pass on your patches (unfortunately, I don't have
> > > > the time to really dive in it).
> > > > 
> > > > I have few questions/comments:
> > > > 
> > > > - This is really a big patchset. Thank you for working on this topic.
> > > >     I'll try to test our application with it as soon as possible.
> > > > 
> > > > - I see from patch 17 that it is possible that rte_malloc() expands
> > > >     the heap by requesting more memory to the OS? Did I understand well?
> > > >     Today, a good property of rte_malloc() compared to malloc() is that
> > > >     it won't interrupt the process (the worst case is a spinlock). This
> > > >     is appreciable on a dataplane core. Will it change?
> > > 
> > > Hi Olivier,
> > > 
> > > Not sure what you mean by "interrupt the process". The new rte_malloc will
> > > _mostly_ work just like the old one. There are now two levels of locks: the
> > > heap lock, and the system allocation lock. If your rte_malloc call requests
> > > amount of memory that can be satisfied by already allocated memory, then
> > > only the heap lock is engaged - or, to put it in other words, things work as
> > > before.
> > > 
> > > When you *don't* have enough memory allocated, previously rte_malloc would
> > > just fail. Now, it instead will lock the second lock and try to allocate
> > > more memory from the system. This requires IPC (to ensure all processes have
> > > allocated/freed the same memory), so this will take way longer (timeout is
> > > set to wait up to 5 seconds, although under normal circumstances it's taking
> > > a lot less - depending on how many processes you have running, but generally
> > > under 100ms), and will block other system allocations (i.e. if another
> > > rte_malloc call on another heap is trying to request more memory from the
> > > system).
> > > 
> > > So, in short - you can't allocate from the same heap in parallel (same as
> > > before), and you can't have parallel system memory allocation requests
> > > (regardless of from which heap it comes from). The latter *only* applies to
> > > system memory allocations - that is, if one heap is allocating system memory
> > > while another heap receives allocation request *and is able to satisfy it
> > > from already allocated memory*, it will not block, because the second lock
> > > is never engaged.
> > 
> > OK. Let's imagine you are using rte_malloc() on a dataplane core, and
> > you run out of memory. Previously, the allocation would just fail. Now,
> > if my understanding is correct, it can block for a long time, which can
> > be a problem on a dataplane core, because it will cause packet losses,
> > especially if it also blocks allocations on other cores during that
> > time. In this case, it could be useful to make the dynamic heap resizing
> > feature optional.
> 
> Why would anyone in their right mind call rte_malloc on fast path? If you're
> referring to mempool allocations/deallocations, then this is a completely
> separate subject, as mempool alloc/free is not handled by rte_malloc but is
> handled by rte_mempool itself - as far as rte_malloc is concerned, that
> memory is already allocated and it will not touch it.
> 
> As for "making heap resizing feature optional", i'm working on functionality
> that would essentially enable that. Specifically, i'm adding API's to set
> allocation limits and a callback which will get triggered once allocator
> tries to allocate beyond said limits, with an option of returning -1 and
> thus preventing this allocation from completing. While this is kind of a
> round-about way of doing it, it would have similar effect.

Calling rte_malloc() in the data path may be required in case the
application needs to allocate an unknown-sized object. I'm not saying
it's a usual or an optimal use case, I'm just saying that it happens.

Waiting for a spinlock is acceptable in the datapath, if it is held by
another dataplane core.
Waiting for several hundred milliseconds is not an option in that case.

If the feature is going to be optional, it's perfectly fine for me.


> > I have another question about the patchset. Today, it is not really
> > possible for an application to allocate a page. If you want a full page
> > (ex: 2M), you need to allocate 4M because the rte_malloc layer adds a
> > header before the allocated memory. Therefore, if the memory is
> > fragmented a lot with only 2M pages, you cannot allocate them as pages.
> > 
> > It is possible, with your patchset or in the future, to have an access
> > to a page-based allocator? The use-case is to be able for an application
> > to ask for pages in dpdk memory and remap them in a virtually contiguous
> > memory.
> 
> Pages returned from our allocator are already virtually contiguous, there is
> no need to do any remapping. If user specifies proper size and alignment
> (i.e. reserve a memzone with RTE_MEMZONE_2MB and with 2M size and
> alignment), it will essentially cause the allocator to return a memzone
> that's exactly page-size long. Yes, in the background, it will allocate
> another page to store malloc metadata, and yes, memory will become
> fragmented if multiple such allocations will occur. It is not possible
> (neither now nor in the future planned work) to do what you describe unless
> we store malloc data separately from allocated memory (which can be done,
> but is a non-trivial amount of work).
> 
> Malloc stores its metadata right in the hugepage mostly for multiprocess
> purposes - so that the entire heap is always shared between all processes.
> If we want to store malloc metadata separately from allocated memory, a
> replacement mechanism to shared heap metadata will need to be put in place
> (which, again, can be done, but is a non-trivial amount of work - arguably
> for questionable gain).
> 
> That said, use case you have described is already possible - just allocate
> multiple pages from DPDK as a memzone, and overlay your own memory allocator
> over that memory. This will have the same effect.

Yes, that's currently what I'm doing: to get one 2M page, I'm allocating
2M with 2M alignment, which actually results in a 4M allocation. My
problem today is when the huge pages are already fragmented at DPDK
start (i.e. only isolated pages). So an allocation of > 2M would fail.

So your patchset mostly solves that issue, because rte_malloc() does not
request physically contiguous memory anymore, which means that
physically isolated hugepages are now virtually contiguous, right? So
rte_malloc(4M) will always be successful until the memory is virtually
fragmented (i.e. after several malloc/free).

Thank you for the clarification.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-20 14:18         ` Olivier Matz
@ 2018-03-20 14:46           ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-20 14:46 UTC (permalink / raw)
  To: Olivier Matz
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 20-Mar-18 2:18 PM, Olivier Matz wrote:
> Hi,
> 
> On Tue, Mar 20, 2018 at 01:51:31PM +0000, Burakov, Anatoly wrote:
>> On 20-Mar-18 12:42 PM, Olivier Matz wrote:
>>> On Tue, Mar 20, 2018 at 10:27:55AM +0000, Burakov, Anatoly wrote:
>>>> On 19-Mar-18 5:30 PM, Olivier Matz wrote:
>>>>> Hi Anatoly,
>>>>>
>>>>> On Sat, Mar 03, 2018 at 01:45:48PM +0000, Anatoly Burakov wrote:
>>>>>> This patchset introduces dynamic memory allocation for DPDK (aka memory
>>>>>> hotplug). Based upon RFC submitted in December [1].
>>>>>>
>>>>>> Dependencies (to be applied in specified order):
>>>>>> - IPC bugfixes patchset [2]
>>>>>> - IPC improvements patchset [3]
>>>>>> - IPC asynchronous request API patch [4]
>>>>>> - Function to return number of sockets [5]
>>>>>>
>>>>>> Deprecation notices relevant to this patchset:
>>>>>> - General outline of memory hotplug changes [6]
>>>>>> - EAL NUMA node count changes [7]
>>>>>>
>>>>>> The vast majority of changes are in the EAL and malloc, the external API
>>>>>> disruption is minimal: a new set of API's are added for contiguous memory
>>>>>> allocation for rte_memzone, and a few API additions in rte_memory due to
>>>>>> switch to memseg_lists as opposed to memsegs. Every other API change is
>>>>>> internal to EAL, and all of the memory allocation/freeing is handled
>>>>>> through rte_malloc, with no externally visible API changes.
>>>>>>
>>>>>> Quick outline of all changes done as part of this patchset:
>>>>>>
>>>>>>     * Malloc heap adjusted to handle holes in address space
>>>>>>     * Single memseg list replaced by multiple memseg lists
>>>>>>     * VA space for hugepages is preallocated in advance
>>>>>>     * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>>>>>>     * Added contiguous memory allocation API's for rte_memzone
>>>>>>     * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>>>>>>       with VFIO [8]
>>>>>>     * Callbacks for registering memory allocations
>>>>>>     * Multiprocess support done via DPDK IPC introduced in 18.02
>>>>>>
>>>>>> The biggest difference is a "memseg" now represents a single page (as opposed to
>>>>>> being a big contiguous block of pages). As a consequence, both memzones and
>>>>>> malloc elements are no longer guaranteed to be physically contiguous, unless
>>>>>> the user asks for it at reserve time. To preserve whatever functionality that
>>>>>> was dependent on previous behavior, a legacy memory option is also provided,
>>>>>> however it is expected (or perhaps vainly hoped) to be temporary solution.
>>>>>>
>>>>>> Why multiple memseg lists instead of one? Since memseg is a single page now,
>>>>>> the list of memsegs will get quite big, and we need to locate pages somehow
>>>>>> when we allocate and free them. We could of course just walk the list and
>>>>>> allocate one contiguous chunk of VA space for memsegs, but this
>>>>>> implementation uses separate lists instead in order to speed up many
>>>>>> operations with memseg lists.
>>>>>>
>>>>>> For v1, the following limitations are present:
>>>>>> - FreeBSD does not even compile, let alone run
>>>>>> - No 32-bit support
>>>>>> - There are some minor quality-of-life improvements planned that aren't
>>>>>>      ready yet and will be part of v2
>>>>>> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>>>>>>      with secondary processes is not tested; work is ongoing to validate VFIO
>>>>>>      for all use cases
>>>>>> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>>>>>>      IOMMU mode - help from sPAPR maintainers requested
>>>>>>
>>>>>> Nevertheless, this patchset should be testable under 64-bit Linux, and
>>>>>> should work for all use cases bar those mentioned above.
>>>>>>
>>>>>> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
>>>>>> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
>>>>>> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
>>>>>> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
>>>>>> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
>>>>>> [6] http://dpdk.org/dev/patchwork/patch/34002/
>>>>>> [7] http://dpdk.org/dev/patchwork/patch/33853/
>>>>>> [8] http://dpdk.org/dev/patchwork/patch/24484/
>>>>>
>>>>> I did a quick pass on your patches (unfortunately, I don't have
>>>>> the time to really dive in it).
>>>>>
>>>>> I have few questions/comments:
>>>>>
>>>>> - This is really a big patchset. Thank you for working on this topic.
>>>>>      I'll try to test our application with it as soon as possible.
>>>>>
>>>>> - I see from patch 17 that it is possible that rte_malloc() expands
>>>>>      the heap by requesting more memory to the OS? Did I understand well?
>>>>>      Today, a good property of rte_malloc() compared to malloc() is that
>>>>>      it won't interrupt the process (the worst case is a spinlock). This
>>>>>      is appreciable on a dataplane core. Will it change?
>>>>
>>>> Hi Olivier,
>>>>
>>>> Not sure what you mean by "interrupt the process". The new rte_malloc will
>>>> _mostly_ work just like the old one. There are now two levels of locks: the
>>>> heap lock, and the system allocation lock. If your rte_malloc call requests
>>>> amount of memory that can be satisfied by already allocated memory, then
>>>> only the heap lock is engaged - or, to put it in other words, things work as
>>>> before.
>>>>
>>>> When you *don't* have enough memory allocated, previously rte_malloc would
>>>> just fail. Now, it instead will lock the second lock and try to allocate
>>>> more memory from the system. This requires IPC (to ensure all processes have
>>>> allocated/freed the same memory), so this will take way longer (timeout is
>>>> set to wait up to 5 seconds, although under normal circumstances it's taking
>>>> a lot less - depending on how many processes you have running, but generally
>>>> under 100ms), and will block other system allocations (i.e. if another
>>>> rte_malloc call on another heap is trying to request more memory from the
>>>> system).
>>>>
>>>> So, in short - you can't allocate from the same heap in parallel (same as
>>>> before), and you can't have parallel system memory allocation requests
>>>> (regardless of from which heap it comes from). The latter *only* applies to
>>>> system memory allocations - that is, if one heap is allocating system memory
>>>> while another heap receives allocation request *and is able to satisfy it
>>>> from already allocated memory*, it will not block, because the second lock
>>>> is never engaged.
>>>
>>> OK. Let's imagine you are using rte_malloc() on a dataplane core, and
>>> you run out of memory. Previously, the allocation would just fail. Now,
>>> if my understanding is correct, it can block for a long time, which can
>>> be a problem on a dataplane core, because it will cause packet losses,
>>> especially if it also blocks allocations on other cores during that
>>> time. In this case, it could be useful to make the dynamic heap resizing
>>> feature optional.
>>
>> Why would anyone in their right mind call rte_malloc on fast path? If you're
>> referring to mempool allocations/deallocations, then this is a completely
>> separate subject, as mempool alloc/free is not handled by rte_malloc but is
>> handled by rte_mempool itself - as far as rte_malloc is concerned, that
>> memory is already allocated and it will not touch it.
>>
>> As for "making heap resizing feature optional", i'm working on functionality
>> that would essentially enable that. Specifically, i'm adding API's to set
>> allocation limits and a callback which will get triggered once allocator
>> tries to allocate beyond said limits, with an option of returning -1 and
>> thus preventing this allocation from completing. While this is kind of a
>> round-about way of doing it, it would have similar effect.
> 
> Calling rte_malloc() in the data path may be required in case the
> application needs to allocate an unknown-sized object. I'm not saying
> it's a usual or an optimal use case, I just say that it happens.
> 
> Waiting for a spinlock is acceptable in datapath, if it is held by
> another dataplane core.
> Waiting for several hundreds of ms is not an option in that case.
> 
> If the feature is going to be optional, it's perfectly fine for me.

Well, there's always an option of running in "legacy mem" mode, which 
disables memory hotplug completely and will essentially behave like it 
does right now (allocate VA and IOVA-contiguous segments).

But yes, with said allocation limits API you will essentially be able to 
control which allocations succeed and which don't. It's not exactly 
"making it optional", but you can have control over system memory 
allocations that would enable that. For example, at init you allocate 
all your necessary data structures, and then you set the memory 
allocation limits in such a way that you can neither allocate nor 
deallocate any pages whatsoever once you start up your fast-path. This 
way, regular malloc will still work, but any page 
allocation/deallocation request will not go through.
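
As a rough illustration of the kind of hook I have in mind (the
callback shape and the registration call below are hypothetical
placeholders - the actual API is not finalized yet):

#include <stddef.h>
#include <stdio.h>

/* hypothetical validator: called when total allocated memory on a
 * socket would grow past 'limit'; returning -1 vetoes the allocation */
static int
deny_heap_growth(int socket_id, size_t limit, size_t new_total)
{
	printf("denying heap growth on socket %d: %zu > %zu\n",
			socket_id, new_total, limit);
	return -1;
}

int main(void)
{
	/* after init-time allocations are done, the application would
	 * register the callback with a limit, e.g. (placeholder name,
	 * not a real API call):
	 *
	 *     set_alloc_limit_callback(socket_id, limit_bytes,
	 *                              deny_heap_growth);
	 *
	 * from then on, rte_malloc still works from already-allocated
	 * memory, but requests for more pages from the system fail.
	 * Here we just simulate one such denied request. */
	size_t limit = (size_t)1 << 30;

	return deny_heap_growth(0, limit, limit + (2 << 20)) == -1 ? 0 : 1;
}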

> 
> 
>>> I have another question about the patchset. Today, it is not really
>>> possible for an application to allocate a page. If you want a full page
>>> (ex: 2M), you need to allocate 4M because the rte_malloc layer adds a
>>> header before the allocated memory. Therefore, if the memory is
>>> fragmented a lot with only 2M pages, you cannot allocate them as pages.
>>>
>>> It is possible, with your patchset or in the future, to have an access
>>> to a page-based allocator? The use-case is to be able for an application
>>> to ask for pages in dpdk memory and remap them in a virtually contiguous
>>> memory.
>>
>> Pages returned from our allocator are already virtually contiguous, there is
>> no need to do any remapping. If user specifies proper size and alignment
>> (i.e. reserve a memzone with RTE_MEMZONE_2MB and with 2M size and
>> alignment), it will essentially cause the allocator to return a memzone
>> that's exactly page-size long. Yes, in the background, it will allocate
>> another page to store malloc metadata, and yes, memory will become
>> fragmented if multiple such allocations will occur. It is not possible
>> (neither now nor in the future planned work) to do what you describe unless
>> we store malloc data separately from allocated memory (which can be done,
>> but is a non-trivial amount of work).
>>
>> Malloc stores its metadata right in the hugepage mostly for multiprocess
>> purposes - so that the entire heap is always shared between all processes.
>> If we want to store malloc metadata separately from allocated memory, a
>> replacement mechanism to shared heap metadata will need to be put in place
>> (which, again, can be done, but is a non-trivial amount of work - arguably
>> for questionable gain).
>>
>> That said, use case you have described is already possible - just allocate
>> multiple pages from DPDK as a memzone, and overlay your own memory allocator
>> over that memory. This will have the same effect.
> 
> Yes, that's currently what I'm doing: to get one 2M page, I'm allocating
> more 2M with 2M alignement, which actually results in 4M allocation. My
> problem today is when the huge pages are already fragmented at dpdk
> start (i.e. only isolated pages). So an allocation of > 2M would fail.
> 
> So your patchset mostly solves that issue, because rte_malloc() does not
> request physically contiguous memory anymore, which means that
> physically isolated hugepages are now virtually contiguous, right? So
> rte_malloc(4M) will always be succesful until the memory is virtually
> fragmented (i.e. after several malloc/free).

Yes, that is correct. We preallocate all VA space in advance, so unless 
you fragment your VA space by making multiple allocations in this way up 
to a point where you run out of pages, you should be OK.

As i said, it is possible to rewrite the heap in a way that will do away 
with storing metadata in-place, and that will solve some of the tricky 
issues with memory allocator (such as pad elements, which require 
special handling everywhere), however this metadata still has to be 
stored somewhere in shared memory in order to be shared across 
processes, and that poses a problem because at some point we may hit a 
condition where we have plenty of free space but have exhausted our 
malloc element list and cannot allocate more (and we can't realloc 
because, well, multiprocess). So, such a scenario will come with its own 
set of challenges. Sadly, there's no free lunch :(

> 
> Thank you for the clarification.
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-19 17:11   ` Olivier Matz
@ 2018-03-21  7:49     ` Andrew Rybchenko
  2018-03-21  8:32       ` Olivier Matz
  0 siblings, 1 reply; 471+ messages in thread
From: Andrew Rybchenko @ 2018-03-21  7:49 UTC (permalink / raw)
  To: Olivier Matz, Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal

On 03/19/2018 08:11 PM, Olivier Matz wrote:
>> +	 *
>> +	 * if we don't need our mempools to have physically contiguous objects,
>> +	 * then just set page shift and page size to 0, because the user has
>> +	 * indicated that there's no need to care about anything.
>> +	 *
>> +	 * if we do need contiguous objects, there is also an option to reserve
>> +	 * the entire mempool memory as one contiguous block of memory, in
>> +	 * which case the page shift and alignment wouldn't matter as well.
>> +	 *
>> +	 * if we require contiguous objects, but not necessarily the entire
>> +	 * mempool reserved space to be contiguous, then there are two options.
>> +	 *
>> +	 * if our IO addresses are virtual, not actual physical (IOVA as VA
>> +	 * case), then no page shift needed - our memory allocation will give us
>> +	 * contiguous physical memory as far as the hardware is concerned, so
>> +	 * act as if we're getting contiguous memory.
>> +	 *
>> +	 * if our IO addresses are physical, we may get memory from bigger
>> +	 * pages, or we might get memory from smaller pages, and how much of it
>> +	 * we require depends on whether we want bigger or smaller pages.
>> +	 * However, requesting each and every memory size is too much work, so
>> +	 * what we'll do instead is walk through the page sizes available, pick
>> +	 * the smallest one and set up page shift to match that one. We will be
>> +	 * wasting some space this way, but it's much nicer than looping around
>> +	 * trying to reserve each and every page size.
>> +	 */
> This comment is helpful to understand, thanks.
>
> (by the way, reading it makes me think we should rename
> MEMPOOL_F_*_PHYS_CONTIG as MEMPOOL_F_*_IOVA_CONTIG)

I'll take care of the renaming in my patchset about the mempool_ops API.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-21  7:49     ` Andrew Rybchenko
@ 2018-03-21  8:32       ` Olivier Matz
  0 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-03-21  8:32 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Anatoly Burakov, dev, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal

On Wed, Mar 21, 2018 at 10:49:55AM +0300, Andrew Rybchenko wrote:
> On 03/19/2018 08:11 PM, Olivier Matz wrote:
> > > +	 *
> > > +	 * if we don't need our mempools to have physically contiguous objects,
> > > +	 * then just set page shift and page size to 0, because the user has
> > > +	 * indicated that there's no need to care about anything.
> > > +	 *
> > > +	 * if we do need contiguous objects, there is also an option to reserve
> > > +	 * the entire mempool memory as one contiguous block of memory, in
> > > +	 * which case the page shift and alignment wouldn't matter as well.
> > > +	 *
> > > +	 * if we require contiguous objects, but not necessarily the entire
> > > +	 * mempool reserved space to be contiguous, then there are two options.
> > > +	 *
> > > +	 * if our IO addresses are virtual, not actual physical (IOVA as VA
> > > +	 * case), then no page shift needed - our memory allocation will give us
> > > +	 * contiguous physical memory as far as the hardware is concerned, so
> > > +	 * act as if we're getting contiguous memory.
> > > +	 *
> > > +	 * if our IO addresses are physical, we may get memory from bigger
> > > +	 * pages, or we might get memory from smaller pages, and how much of it
> > > +	 * we require depends on whether we want bigger or smaller pages.
> > > +	 * However, requesting each and every memory size is too much work, so
> > > +	 * what we'll do instead is walk through the page sizes available, pick
> > > +	 * the smallest one and set up page shift to match that one. We will be
> > > +	 * wasting some space this way, but it's much nicer than looping around
> > > +	 * trying to reserve each and every page size.
> > > +	 */
> > This comment is helpful to understand, thanks.
> > 
> > (by the way, reading it makes me think we should rename
> > MEMPOOL_F_*_PHYS_CONTIG as MEMPOOL_F_*_IOVA_CONTIG)
> 
> I'll care about renaming in my patchset about mempool_ops API.

Great, thanks!
Please also keep the old ones for now, we will remove them later.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH 00/41] Memory Hotplug for DPDK
  2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
                   ` (86 preceding siblings ...)
  2018-03-19 17:30 ` Olivier Matz
@ 2018-03-21  9:09 ` gowrishankar muthukrishnan
  87 siblings, 0 replies; 471+ messages in thread
From: gowrishankar muthukrishnan @ 2018-03-21  9:09 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	Pradeep Satyanarayana, Chao Zhu

On Saturday 03 March 2018 07:15 PM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
Hi Anatoly,

I am able to bring up a PMD with these patches on powerpc. I am
continuing to validate the memory limits that this patch set has
(e.g. pre-allocating anonymous mappings for the largest memory possible
as per the default mem and memseg values). I'll keep posting my
observations.

Thanks for the patches,
Gowrishankar

> Dependencies (to be applied in specified order):
> - IPC bugfixes patchset [2]
> - IPC improvements patchset [3]
> - IPC asynchronous request API patch [4]
> - Function to return number of sockets [5]
>
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [6]
> - EAL NUMA node count changes [7]
>
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
>
> Quick outline of all changes done as part of this patchset:
>
>   * Malloc heap adjusted to handle holes in address space
>   * Single memseg list replaced by multiple memseg lists
>   * VA space for hugepages is preallocated in advance
>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>   * Added contiguous memory allocation API's for rte_memzone
>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>     with VFIO [8]
>   * Callbacks for registering memory allocations
>   * Multiprocess support done via DPDK IPC introduced in 18.02
>
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
>
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
>
> For v1, the following limitations are present:
> - FreeBSD does not even compile, let alone run
> - No 32-bit support
> - There are some minor quality-of-life improvements planned that aren't
>    ready yet and will be part of v2
> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>    with secondary processes is not tested; work is ongoing to validate VFIO
>    for all use cases
> - Dynamic mapping/unmapping memory with VFIO is not supported in sPAPR
>    IOMMU mode - help from sPAPR maintainers requested
>
> Nevertheless, this patchset should be testable under 64-bit Linux, and
> should work for all use cases bar those mentioned above.
>
> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Fixes/
> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Improvements/
> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> [5] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> [6] http://dpdk.org/dev/patchwork/patch/34002/
> [7] http://dpdk.org/dev/patchwork/patch/33853/
> [8] http://dpdk.org/dev/patchwork/patch/24484/
>
> Anatoly Burakov (41):
>    eal: move get_virtual_area out of linuxapp eal_memory.c
>    eal: move all locking to heap
>    eal: make malloc heap a doubly-linked list
>    eal: add function to dump malloc heap contents
>    test: add command to dump malloc heap contents
>    eal: make malloc_elem_join_adjacent_free public
>    eal: make malloc free list remove public
>    eal: make malloc free return resulting malloc element
>    eal: add rte_fbarray
>    eal: add "single file segments" command-line option
>    eal: add "legacy memory" option
>    eal: read hugepage counts from node-specific sysfs path
>    eal: replace memseg with memseg lists
>    eal: add support for mapping hugepages at runtime
>    eal: add support for unmapping pages at runtime
>    eal: make use of memory hotplug for init
>    eal: enable memory hotplug support in rte_malloc
>    test: fix malloc autotest to support memory hotplug
>    eal: add API to check if memory is contiguous
>    eal: add backend support for contiguous allocation
>    eal: enable reserving physically contiguous memzones
>    eal: replace memzone array with fbarray
>    mempool: add support for the new allocation methods
>    vfio: allow to map other memory regions
>    eal: map/unmap memory with VFIO when alloc/free pages
>    eal: prepare memseg lists for multiprocess sync
>    eal: add multiprocess init with memory hotplug
>    eal: add support for multiprocess memory hotplug
>    eal: add support for callbacks on memory hotplug
>    eal: enable callbacks on malloc/free and mp sync
>    ethdev: use contiguous allocation for DMA memory
>    crypto/qat: use contiguous allocation for DMA memory
>    net/avf: use contiguous allocation for DMA memory
>    net/bnx2x: use contiguous allocation for DMA memory
>    net/cxgbe: use contiguous allocation for DMA memory
>    net/ena: use contiguous allocation for DMA memory
>    net/enic: use contiguous allocation for DMA memory
>    net/i40e: use contiguous allocation for DMA memory
>    net/qede: use contiguous allocation for DMA memory
>    net/virtio: use contiguous allocation for DMA memory
>    net/vmxnet3: use contiguous allocation for DMA memory
>
>   config/common_base                                |   15 +-
>   drivers/bus/pci/linux/pci.c                       |   29 +-
>   drivers/crypto/qat/qat_qp.c                       |    4 +-
>   drivers/net/avf/avf_ethdev.c                      |    2 +-
>   drivers/net/bnx2x/bnx2x.c                         |    2 +-
>   drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
>   drivers/net/cxgbe/sge.c                           |    3 +-
>   drivers/net/ena/base/ena_plat_dpdk.h              |    7 +-
>   drivers/net/ena/ena_ethdev.c                      |   10 +-
>   drivers/net/enic/enic_main.c                      |    4 +-
>   drivers/net/i40e/i40e_ethdev.c                    |    2 +-
>   drivers/net/i40e/i40e_rxtx.c                      |    2 +-
>   drivers/net/qede/base/bcm_osal.c                  |    5 +-
>   drivers/net/virtio/virtio_ethdev.c                |    8 +-
>   drivers/net/virtio/virtio_user/vhost_kernel.c     |  108 ++-
>   drivers/net/vmxnet3/vmxnet3_ethdev.c              |    7 +-
>   lib/librte_eal/bsdapp/eal/Makefile                |    4 +
>   lib/librte_eal/bsdapp/eal/eal.c                   |   25 +
>   lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |    7 +
>   lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   33 +
>   lib/librte_eal/bsdapp/eal/meson.build             |    1 +
>   lib/librte_eal/common/Makefile                    |    2 +-
>   lib/librte_eal/common/eal_common_fbarray.c        |  859 +++++++++++++++++
>   lib/librte_eal/common/eal_common_memalloc.c       |  181 ++++
>   lib/librte_eal/common/eal_common_memory.c         |  512 +++++++++-
>   lib/librte_eal/common/eal_common_memzone.c        |  275 ++++--
>   lib/librte_eal/common/eal_common_options.c        |    8 +
>   lib/librte_eal/common/eal_filesystem.h            |   13 +
>   lib/librte_eal/common/eal_hugepages.h             |    7 +
>   lib/librte_eal/common/eal_internal_cfg.h          |   10 +-
>   lib/librte_eal/common/eal_memalloc.h              |   41 +
>   lib/librte_eal/common/eal_options.h               |    4 +
>   lib/librte_eal/common/eal_private.h               |   33 +
>   lib/librte_eal/common/include/rte_eal_memconfig.h |   29 +-
>   lib/librte_eal/common/include/rte_fbarray.h       |  352 +++++++
>   lib/librte_eal/common/include/rte_malloc.h        |    9 +
>   lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
>   lib/librte_eal/common/include/rte_memory.h        |   79 +-
>   lib/librte_eal/common/include/rte_memzone.h       |  155 ++-
>   lib/librte_eal/common/include/rte_vfio.h          |   39 +
>   lib/librte_eal/common/malloc_elem.c               |  436 +++++++--
>   lib/librte_eal/common/malloc_elem.h               |   41 +-
>   lib/librte_eal/common/malloc_heap.c               |  694 +++++++++++++-
>   lib/librte_eal/common/malloc_heap.h               |   15 +-
>   lib/librte_eal/common/malloc_mp.c                 |  723 ++++++++++++++
>   lib/librte_eal/common/malloc_mp.h                 |   86 ++
>   lib/librte_eal/common/meson.build                 |    4 +
>   lib/librte_eal/common/rte_malloc.c                |   75 +-
>   lib/librte_eal/linuxapp/eal/Makefile              |    5 +
>   lib/librte_eal/linuxapp/eal/eal.c                 |  102 +-
>   lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  155 ++-
>   lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1049 +++++++++++++++++++++
>   lib/librte_eal/linuxapp/eal/eal_memory.c          |  516 ++++++----
>   lib/librte_eal/linuxapp/eal/eal_vfio.c            |  318 +++++--
>   lib/librte_eal/linuxapp/eal/eal_vfio.h            |   11 +
>   lib/librte_eal/linuxapp/eal/meson.build           |    1 +
>   lib/librte_eal/rte_eal_version.map                |   23 +-
>   lib/librte_ether/rte_ethdev.c                     |    3 +-
>   lib/librte_mempool/rte_mempool.c                  |   87 +-
>   test/test/commands.c                              |    3 +
>   test/test/test_malloc.c                           |   71 +-
>   test/test/test_memory.c                           |   43 +-
>   test/test/test_memzone.c                          |   26 +-
>   63 files changed, 6631 insertions(+), 751 deletions(-)
>   create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
>   create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
>   create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
>   create mode 100644 lib/librte_eal/common/eal_memalloc.h
>   create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
>   create mode 100644 lib/librte_eal/common/malloc_mp.c
>   create mode 100644 lib/librte_eal/common/malloc_mp.h
>   create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c
>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-15 14:01                       ` Shreyansh Jain
@ 2018-03-21 13:45                         ` Shreyansh Jain
  2018-03-21 14:48                           ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-21 13:45 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev, Hemant Agrawal

Hello Anatoly,

This is not necessarily the right chain to reply to, but I am reusing this
email for another issue in DPAA2 so that all issues can be in a single
place.

On Thu, Mar 15, 2018 at 7:31 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> Hello Anatoly,
>
> On Tue, Mar 13, 2018 at 10:47 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> Hello Anatoly,
>>
>> On Fri, Mar 9, 2018 at 4:12 PM, Burakov, Anatoly
>> <anatoly.burakov@intel.com> wrote:
>>> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
>>
>> [...]
>>
>>>>
>>>>
>>>> I have taken a look at the github tree the issues with VFIO are gone,
>>>> Although
>>>> compilation issues with dpaa/dpaa2 are still present due to their
>>>> dependency on
>>>> `rte_eal_get_physmem_layout`.
>>>
>>>
>>> I've fixed the dpaa compile issue and pushed it to github. I've tried to
>>> keep the semantics the same as before, but i can't compile-test (let alone
>>> test-test) them as i don't have access to a system with dpaa bus.
>>
>> Thanks. I will have a look at this.
>
> Just a heads-up, DPAA2 is broken on top-of-tree (github:
> 784e041f6b520) as of now:
>
> --->8---
> root@ls2088ardb:~/shreyansh/07_dpdk_memory#
> ./arm64-dpaa2-linuxapp-gcc/app/testpmd -c 0xE -n 1 --log-level=eal,8
> --log-level=mem,8 -- -i --portmask=0x3
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 1 on socket 0
> EAL: Detected lcore 2 as core 0 on socket 0
> EAL: Detected lcore 3 as core 1 on socket 0
> EAL: Detected lcore 4 as core 0 on socket 0
> EAL: Detected lcore 5 as core 1 on socket 0
> EAL: Detected lcore 6 as core 0 on socket 0
> EAL: Detected lcore 7 as core 1 on socket 0
> EAL: Support maximum 16 logical core(s) by configuration.
> EAL: Detected 8 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: VFIO PCI modules not loaded
> EAL: DPAA Bus not present. Skipping.
> EAL: Container: dprc.2 has VFIO iommu group id = 4
> EAL: fslmc: Bus scan completed
> EAL: Module /sys/module/rte_kni not found! error 2 (No such file or directory)
> EAL: Multi-process socket /var/run/.rte_unix
> EAL: Probing VFIO support...
> EAL:   IOMMU type 1 (Type 1) is supported
> EAL:   IOMMU type 7 (sPAPR) is not supported
> EAL:   IOMMU type 8 (No-IOMMU) is not supported
> EAL: VFIO support initialized
> EAL: Mem event callback 'vfio_mem_event_clb' registered
> EAL: Ask a virtual area of 0x2e000 bytes
> EAL: Virtual area found at 0xffff86cae000 (size = 0x2e000)
> EAL: Setting up physically contiguous memory...
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873f000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xfff780000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873e000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffef40000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873d000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffe700000000 (size = 0x800000000)
> EAL: Ask a virtual area of 0x1000 bytes
> EAL: Virtual area found at 0xffff8873c000 (size = 0x1000)
> EAL: Memseg list allocated: 0x100000kB at socket 0
> EAL: Ask a virtual area of 0x800000000 bytes
> EAL: Virtual area found at 0xffdec0000000 (size = 0x800000000)
> EAL: TSC frequency is ~25000 KHz
> EAL: Master lcore 1 is ready (tid=88742110;cpuset=[1])
> EAL: lcore 3 is ready (tid=85cab910;cpuset=[3])
> EAL: lcore 2 is ready (tid=864ab910;cpuset=[2])
> EAL: eal_memalloc_alloc_page_bulk(): couldn't find suitable memseg_list
> error allocating rte services array
> EAL: FATAL: rte_service_init() failed
>
> EAL: rte_service_init() failed
>
> PANIC in main():
> Cannot init EAL
> 1: [./arm64-dpaa2-linuxapp-gcc/app/testpmd(rte_dump_stack+0x38) [0x4f37a8]]
> Aborted
> --->8--
>
> Above is an initial output - still investigating. I will keep you posted.
>

While working on the issue reported in [1], I have found another issue
with which I might need your help.

[1] http://dpdk.org/ml/archives/dev/2018-March/093202.html

For [1], I worked around it by changing the mempool_add_elem code for the
time being - it now allows non-contiguous allocations (those not explicitly
demanding contiguous memory) to go through rte_mempool_populate_iova. With
that, I was able to get DPAA2 working.

The problem is:
1. When I am working with 1GB pages, I/O works fine.
2. When using 2MB pages (1024 of them), the initialization fails somewhere
after the VFIO layer.

All with IOVA=VA mode.

Some logs:

This is the output of the virtual memory layout demanded by DPDK:

--->8---
EAL: Ask a virtual area of 0x2e000 bytes
EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
EAL: Ask a virtual area of 0x59000 bytes
EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
--->8---

Then, somehow VFIO mapping is able to find only a single page to map

--->8---
EAL: Device (dpci.1) abstracted from VFIO
EAL: -->Initial SHM Virtual ADDR FFFBB6400000
EAL: -----> DMA size 0x200000
EAL: Total 1 segments found.
--->8---

Then, these logs appear, probably when the DPAA2 code requests memory.
I am not sure why the same '...expanded by 10MB' message repeats.

--->8---
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 2MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
EAL: Heap on socket 0 was expanded by 10MB
LPM or EM none selected, default LPM on
Initializing port 0 ...
--->8---

l3fwd is stuck at this point. What I observe is that the DPAA2 driver has
gone ahead and registered the queues (queue_setup) with the hardware, and
the memory has either overrun (less than the requested size was mapped) or
the addresses are corrupt (that is, not DMA-able). (I get SMMU faults,
indicating one of these cases.)

There is a change from you in the fslmc/fslmc_vfio.c file
(rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over all
the available pages for mapping, but that didn't happen and only a single
virtual area got DMA-mapped.

--->8---
EAL: Device (dpci.1) abstracted from VFIO
EAL: -->Initial SHM Virtual ADDR FFFBB6400000
EAL: -----> DMA size 0x200000
EAL: Total 1 segments found.
--->8---

I am looking into this, but any hint that comes to your mind might help.

Regards,
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-21 13:45                         ` Shreyansh Jain
@ 2018-03-21 14:48                           ` Burakov, Anatoly
  2018-03-22  5:09                             ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-21 14:48 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev, Hemant Agrawal

On 21-Mar-18 1:45 PM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
> This is not necessarily right chain to reply to, but reusing this
> email for another issue in DPAA2 so that all issues can be at a single
> place.
> 
> On Thu, Mar 15, 2018 at 7:31 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> Hello Anatoly,
>>
>> On Tue, Mar 13, 2018 at 10:47 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>>> Hello Anatoly,
>>>
>>> On Fri, Mar 9, 2018 at 4:12 PM, Burakov, Anatoly
>>> <anatoly.burakov@intel.com> wrote:
>>>> On 09-Mar-18 9:15 AM, Pavan Nikhilesh wrote:
>>>
>>> [...]
>>>
>>>>>
>>>>>
>>>>> I have taken a look at the github tree the issues with VFIO are gone,
>>>>> Although
>>>>> compilation issues with dpaa/dpaa2 are still present due to their
>>>>> dependency on
>>>>> `rte_eal_get_physmem_layout`.
>>>>
>>>>
>>>> I've fixed the dpaa compile issue and pushed it to github. I've tried to
>>>> keep the semantics the same as before, but i can't compile-test (let alone
>>>> test-test) them as i don't have access to a system with dpaa bus.
>>>
>>> Thanks. I will have a look at this.
>>
>> Just a heads-up, DPAA2 is broken on top-of-tree (github:
>> 784e041f6b520) as of now:
>>
>> --->8---
>> root@ls2088ardb:~/shreyansh/07_dpdk_memory#
>> ./arm64-dpaa2-linuxapp-gcc/app/testpmd -c 0xE -n 1 --log-level=eal,8
>> --log-level=mem,8 -- -i --portmask=0x3
>> EAL: Detected lcore 0 as core 0 on socket 0
>> EAL: Detected lcore 1 as core 1 on socket 0
>> EAL: Detected lcore 2 as core 0 on socket 0
>> EAL: Detected lcore 3 as core 1 on socket 0
>> EAL: Detected lcore 4 as core 0 on socket 0
>> EAL: Detected lcore 5 as core 1 on socket 0
>> EAL: Detected lcore 6 as core 0 on socket 0
>> EAL: Detected lcore 7 as core 1 on socket 0
>> EAL: Support maximum 16 logical core(s) by configuration.
>> EAL: Detected 8 lcore(s)
>> EAL: Detected 1 NUMA nodes
>> EAL: VFIO PCI modules not loaded
>> EAL: DPAA Bus not present. Skipping.
>> EAL: Container: dprc.2 has VFIO iommu group id = 4
>> EAL: fslmc: Bus scan completed
>> EAL: Module /sys/module/rte_kni not found! error 2 (No such file or directory)
>> EAL: Multi-process socket /var/run/.rte_unix
>> EAL: Probing VFIO support...
>> EAL:   IOMMU type 1 (Type 1) is supported
>> EAL:   IOMMU type 7 (sPAPR) is not supported
>> EAL:   IOMMU type 8 (No-IOMMU) is not supported
>> EAL: VFIO support initialized
>> EAL: Mem event callback 'vfio_mem_event_clb' registered
>> EAL: Ask a virtual area of 0x2e000 bytes
>> EAL: Virtual area found at 0xffff86cae000 (size = 0x2e000)
>> EAL: Setting up physically contiguous memory...
>> EAL: Ask a virtual area of 0x1000 bytes
>> EAL: Virtual area found at 0xffff8873f000 (size = 0x1000)
>> EAL: Memseg list allocated: 0x100000kB at socket 0
>> EAL: Ask a virtual area of 0x800000000 bytes
>> EAL: Virtual area found at 0xfff780000000 (size = 0x800000000)
>> EAL: Ask a virtual area of 0x1000 bytes
>> EAL: Virtual area found at 0xffff8873e000 (size = 0x1000)
>> EAL: Memseg list allocated: 0x100000kB at socket 0
>> EAL: Ask a virtual area of 0x800000000 bytes
>> EAL: Virtual area found at 0xffef40000000 (size = 0x800000000)
>> EAL: Ask a virtual area of 0x1000 bytes
>> EAL: Virtual area found at 0xffff8873d000 (size = 0x1000)
>> EAL: Memseg list allocated: 0x100000kB at socket 0
>> EAL: Ask a virtual area of 0x800000000 bytes
>> EAL: Virtual area found at 0xffe700000000 (size = 0x800000000)
>> EAL: Ask a virtual area of 0x1000 bytes
>> EAL: Virtual area found at 0xffff8873c000 (size = 0x1000)
>> EAL: Memseg list allocated: 0x100000kB at socket 0
>> EAL: Ask a virtual area of 0x800000000 bytes
>> EAL: Virtual area found at 0xffdec0000000 (size = 0x800000000)
>> EAL: TSC frequency is ~25000 KHz
>> EAL: Master lcore 1 is ready (tid=88742110;cpuset=[1])
>> EAL: lcore 3 is ready (tid=85cab910;cpuset=[3])
>> EAL: lcore 2 is ready (tid=864ab910;cpuset=[2])
>> EAL: eal_memalloc_alloc_page_bulk(): couldn't find suitable memseg_list
>> error allocating rte services array
>> EAL: FATAL: rte_service_init() failed
>>
>> EAL: rte_service_init() failed
>>
>> PANIC in main():
>> Cannot init EAL
>> 1: [./arm64-dpaa2-linuxapp-gcc/app/testpmd(rte_dump_stack+0x38) [0x4f37a8]]
>> Aborted
>> --->8--
>>
>> Above is an initial output - still investigating. I will keep you posted.
>>
> 
> While working on issue reported in [1], I have found another issue
> which I might need you help.
> 
> [1] http://dpdk.org/ml/archives/dev/2018-March/093202.html
> 
> For [1], I bypassed by changing the mempool_add_elem code for time
> being - it now allows non-contiguous (not explicitly demanded
> contiguous) allocations to go through rte_mempool_populate_iova. With
> that, I was able to get DPAA2 working.
> 
> Problem is:
> 1. When I am working with 1GB pages, I/O is working fine.
> 2. When using 2MB pages (1024 num), the initialization somewhere after
> VFIO layer fails.
> 
> All with IOVA=VA mode.
> 
> Some logs:
> 
> This is the output of the virtual memory layout demanded by DPDK:
> 
> --->8---
> EAL: Ask a virtual area of 0x2e000 bytes
> EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
> EAL: Setting up physically contiguous memory...
> EAL: Ask a virtual area of 0x59000 bytes
> EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
> EAL: Memseg list allocated: 0x800kB at socket 0
> EAL: Ask a virtual area of 0x400000000 bytes
> EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
> EAL: Ask a virtual area of 0x59000 bytes
> EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
> EAL: Memseg list allocated: 0x800kB at socket 0
> EAL: Ask a virtual area of 0x400000000 bytes
> EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
> EAL: Ask a virtual area of 0x59000 bytes
> EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
> EAL: Memseg list allocated: 0x800kB at socket 0
> EAL: Ask a virtual area of 0x400000000 bytes
> EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
> EAL: Ask a virtual area of 0x59000 bytes
> EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
> EAL: Memseg list allocated: 0x800kB at socket 0
> EAL: Ask a virtual area of 0x400000000 bytes
> EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
> --->8---
> 
> Then, somehow VFIO mapping is able to find only a single page to map
> 
> --->8---
> EAL: Device (dpci.1) abstracted from VFIO
> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
> EAL: -----> DMA size 0x200000
> EAL: Total 1 segments found.
> --->8---
> 
> Then, these logs appear probably when DPAA2 code requests for memory.
> I am not sure why it repeats the same '...expanded by 10MB'.
> 
> --->8---
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 2MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> EAL: Calling mem event callback vfio_mem_event_clbEAL: request: mp_malloc_sync
> EAL: Heap on socket 0 was expanded by 10MB
> LPM or EM none selected, default LPM on
> Initializing port 0 ...
> --->8---
> 
> l3fwd is stuck at this point. What I observe is that DPAA2 driver has
> gone ahead to register the queues (queue_setup) with hardware and the
> memory has either overrun (smaller than requested size mapped) or the
> addresses are corrupt (that is, not dma-able). (I get SMMU faults,
> indicating one of these cases)
> 
> There is some change from you in the fslmc/fslmc_vfio.c file
> (rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over
> all the available pages for mapping but that didn't happen and only a
> single virtual area got dma-mapped.
> 
> --->8---
> EAL: Device (dpci.1) abstracted from VFIO
> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
> EAL: -----> DMA size 0x200000
> EAL: Total 1 segments found.
> --->8---
> 
> I am looking into this but if there is some hint which come to your
> mind, it might help.
> 
> Regards,
> Shreyansh
> 

Hi Shreyansh,

Thanks for the feedback.

The "heap on socket 0 was expanded by 10MB" has to do with 
synchronization requests in primary/secondary processes. I can see 
you're allocating LPM tables - that's most likely what these allocations 
are about (it's hotplugging memory).

I think I might have an idea of what is going on. I am assuming that you
are starting up your DPDK application without any -m or --socket-mem
flags, which means you are starting with an empty heap.

During initialization, certain DPDK features (such as service cores and
PMDs) allocate memory. Most likely you have essentially started up with
a single 2MB page, which is what you see in the fslmc logs: this page gets
mapped for VFIO.

Then, you allocate a bunch of LPM tables, which trigger more memory
allocations, and those trigger the memory allocation callbacks registered
through rte_mem_event_register_callback(). One of these callbacks is a VFIO
callback, which is registered in eal_vfio.c:rte_vfio_enable(). However,
since the fslmc bus has its own VFIO implementation that is independent of
what happens in the EAL VFIO code, what probably happens is that the fslmc
bus misses the necessary messages from memory hotplug and does not map
additional resources for DMA.

Try adding an rte_mem_event_register_callback() somewhere in fslmc init
so that it calls the necessary map function.
eal_vfio.c:vfio_mem_event_callback() should provide a good template on 
how to approach creating such a callback. Let me know if this works!
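
As a rough sketch only (not patchset code): the snippet below assumes the
mem-event callback prototype used elsewhere in this patchset (event type,
start address, length) and a hypothetical driver helper,
fslmc_dma_map_region(), that issues the VFIO DMA map/unmap ioctls on the
fslmc container for one virtual region. Something along these lines in
drivers/bus/fslmc/fslmc_vfio.c might do:

--->8---
#include <rte_memory.h> /* rte_mem_event, callback registration */

static void
fslmc_mem_event_cb(enum rte_mem_event type, const void *addr, size_t len)
{
	if (type == RTE_MEM_EVENT_ALLOC)
		/* map the newly allocated pages for DMA */
		fslmc_dma_map_region((uint64_t)(uintptr_t)addr, len, 1);
	else
		/* RTE_MEM_EVENT_FREE: drop the mapping before pages go away */
		fslmc_dma_map_region((uint64_t)(uintptr_t)addr, len, 0);
}

/* ...and once the fslmc container is set up during bus init: */
rte_mem_event_register_callback("fslmc_mem_event_cb", fslmc_mem_event_cb);
--->8---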

(as a side note, how can we extend VFIO to move this stuff back into EAL 
and expose it as an API?)

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-21 14:48                           ` Burakov, Anatoly
@ 2018-03-22  5:09                             ` Shreyansh Jain
  2018-03-22  9:24                               ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-22  5:09 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev, Hemant Agrawal

Hello Anatoly,

> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Wednesday, March 21, 2018 8:18 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>
> Cc: dev@dpdk.org; Hemant Agrawal <hemant.agrawal@nxp.com>
> Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
> 

[...]

> >>
> >
> > While working on issue reported in [1], I have found another issue
> > which I might need you help.
> >
> > [1]
> > http://dpdk.org/ml/archives/dev/2018-March/093202.html
> >
> > For [1], I bypassed by changing the mempool_add_elem code for time
> > being - it now allows non-contiguous (not explicitly demanded
> > contiguous) allocations to go through rte_mempool_populate_iova. With
> > that, I was able to get DPAA2 working.
> >
> > Problem is:
> > 1. When I am working with 1GB pages, I/O is working fine.
> > 2. When using 2MB pages (1024 num), the initialization somewhere after
> > VFIO layer fails.
> >
> > All with IOVA=VA mode.
> >
> > Some logs:
> >
> > This is the output of the virtual memory layout demanded by DPDK:
> >
> > --->8---
> > EAL: Ask a virtual area of 0x2e000 bytes
> > EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
> > EAL: Setting up physically contiguous memory...
> > EAL: Ask a virtual area of 0x59000 bytes
> > EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
> > EAL: Memseg list allocated: 0x800kB at socket 0
> > EAL: Ask a virtual area of 0x400000000 bytes
> > EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
> > EAL: Ask a virtual area of 0x59000 bytes
> > EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
> > EAL: Memseg list allocated: 0x800kB at socket 0
> > EAL: Ask a virtual area of 0x400000000 bytes
> > EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
> > EAL: Ask a virtual area of 0x59000 bytes
> > EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
> > EAL: Memseg list allocated: 0x800kB at socket 0
> > EAL: Ask a virtual area of 0x400000000 bytes
> > EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
> > EAL: Ask a virtual area of 0x59000 bytes
> > EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
> > EAL: Memseg list allocated: 0x800kB at socket 0
> > EAL: Ask a virtual area of 0x400000000 bytes
> > EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
> > --->8---
> >
> > Then, somehow VFIO mapping is able to find only a single page to map
> >
> > --->8---
> > EAL: Device (dpci.1) abstracted from VFIO
> > EAL: -->Initial SHM Virtual ADDR FFFBB6400000
> > EAL: -----> DMA size 0x200000
> > EAL: Total 1 segments found.
> > --->8---
> >
> > Then, these logs appear probably when DPAA2 code requests for memory.
> > I am not sure why it repeats the same '...expanded by 10MB'.
> >
> > --->8---
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 2MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
> mp_malloc_sync
> > EAL: Heap on socket 0 was expanded by 10MB
> > LPM or EM none selected, default LPM on
> > Initializing port 0 ...
> > --->8---
> >
> > l3fwd is stuck at this point. What I observe is that DPAA2 driver has
> > gone ahead to register the queues (queue_setup) with hardware and the
> > memory has either overrun (smaller than requested size mapped) or the
> > addresses are corrupt (that is, not dma-able). (I get SMMU faults,
> > indicating one of these cases)
> >
> > There is some change from you in the fslmc/fslmc_vfio.c file
> > (rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over
> > all the available pages for mapping but that didn't happen and only a
> > single virtual area got dma-mapped.
> >
> > --->8---
> > EAL: Device (dpci.1) abstracted from VFIO
> > EAL: -->Initial SHM Virtual ADDR FFFBB6400000
> > EAL: -----> DMA size 0x200000
> > EAL: Total 1 segments found.
> > --->8---
> >
> > I am looking into this but if there is some hint which come to your
> > mind, it might help.
> >
> > Regards,
> > Shreyansh
> >
> 
> Hi Shreyansh,
> 
> Thanks for the feedback.
> 
> The "heap on socket 0 was expanded by 10MB" has to do with
> synchronization requests in primary/secondary processes. I can see
> you're allocating LPM tables - that's most likely what these allocations
> are about (it's hotplugging memory).

I get that, but why does the same message appear multiple times without any change in the expansion? Further, I don't have multiple processes - in fact, I'm working with a single datapath thread.
Anyway, I will look through the code for this.

> 
> I think i might have an idea what is going on. I am assuming that you
> are starting up your DPDK application without any -m or --socket-mem
> flags, which means you are starting with empty heap.

Yes, no specific --socket-mem passed as argument.

> 
> During initialization, certain DPDK features (such as service cores,
> PMD's) allocate memory. Most likely you have essentially started up with
> 1 2M page, which is what you see in fslmc logs: this page gets mapped
> for VFIO.

Agree.

> 
> Then, you allocate a bunch of LPM tables, which trigger more memory
> allocation, and trigger memory allocation callbacks registered through
> rte_mem_event_register_callback(). One of these callbacks is a VFIO
> callback, which is registered in eal_vfio.c:rte_vfio_enable(). However,
> since fslmc bus has its own VFIO implementation that is independent of
> what happens in EAL VFIO code, what probably happens is that the fslmc
> bus misses the necessary messages from the memory hotplug to map
> additional resources for DMA.

Makes sense

> 
> Try adding a rte_mem_event_register_callback() somewhere in fslmc init
> so that it calls necessary map function.
> eal_vfio.c:vfio_mem_event_callback() should provide a good template on
> how to approach creating such a callback. Let me know if this works!

OK. I will give this a try and update you.

> 
> (as a side note, how can we extend VFIO to move this stuff back into EAL
> and expose it as an API?)

The problem is that the FSLMC VFIO driver is slightly different from the generic VFIO layer, in the sense that a device in a VFIO container is actually another level of container. Anyway, I will have a look at how much generalization is possible; otherwise, I will work with the vfio_mem_event_callback() approach as suggested above.

Thanks for the suggestions.

> 
> --
> Thanks,
> Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-22  5:09                             ` Shreyansh Jain
@ 2018-03-22  9:24                               ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-22  9:24 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev, Hemant Agrawal

On 22-Mar-18 5:09 AM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
>> -----Original Message-----
>> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
>> Sent: Wednesday, March 21, 2018 8:18 PM
>> To: Shreyansh Jain <shreyansh.jain@nxp.com>
>> Cc: dev@dpdk.org; Hemant Agrawal <hemant.agrawal@nxp.com>
>> Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
>>
> 
> [...]
> 
>>>>
>>>
>>> While working on issue reported in [1], I have found another issue
>>> which I might need you help.
>>>
>>> [1]
>>> http://dpdk.org/ml/archives/dev/2018-March/093202.html
>>>
>>> For [1], I bypassed by changing the mempool_add_elem code for time
>>> being - it now allows non-contiguous (not explicitly demanded
>>> contiguous) allocations to go through rte_mempool_populate_iova. With
>>> that, I was able to get DPAA2 working.
>>>
>>> Problem is:
>>> 1. When I am working with 1GB pages, I/O is working fine.
>>> 2. When using 2MB pages (1024 num), the initialization somewhere after
>>> VFIO layer fails.
>>>
>>> All with IOVA=VA mode.
>>>
>>> Some logs:
>>>
>>> This is the output of the virtual memory layout demanded by DPDK:
>>>
>>> --->8---
>>> EAL: Ask a virtual area of 0x2e000 bytes
>>> EAL: Virtual area found at 0xffffb6561000 (size = 0x2e000)
>>> EAL: Setting up physically contiguous memory...
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xffffb6508000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfffbb6400000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfffbb62af000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff7b6200000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff7b6056000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xfff3b6000000 (size = 0x400000000)
>>> EAL: Ask a virtual area of 0x59000 bytes
>>> EAL: Virtual area found at 0xfff3b5dfd000 (size = 0x59000)
>>> EAL: Memseg list allocated: 0x800kB at socket 0
>>> EAL: Ask a virtual area of 0x400000000 bytes
>>> EAL: Virtual area found at 0xffefb5c00000 (size = 0x400000000)
>>> --->8---
>>>
>>> Then, somehow VFIO mapping is able to find only a single page to map
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
>>>
>>> Then, these logs appear probably when DPAA2 code requests for memory.
>>> I am not sure why it repeats the same '...expanded by 10MB'.
>>>
>>> --->8---
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 2MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> EAL: Calling mem event callback vfio_mem_event_clbEAL: request:
>> mp_malloc_sync
>>> EAL: Heap on socket 0 was expanded by 10MB
>>> LPM or EM none selected, default LPM on
>>> Initializing port 0 ...
>>> --->8---
>>>
>>> l3fwd is stuck at this point. What I observe is that DPAA2 driver has
>>> gone ahead to register the queues (queue_setup) with hardware and the
>>> memory has either overrun (smaller than requested size mapped) or the
>>> addresses are corrupt (that is, not dma-able). (I get SMMU faults,
>>> indicating one of these cases)
>>>
>>> There is some change from you in the fslmc/fslmc_vfio.c file
>>> (rte_fslmc_vfio_dmamap()). Ideally, that code should have walked over
>>> all the available pages for mapping but that didn't happen and only a
>>> single virtual area got dma-mapped.
>>>
>>> --->8---
>>> EAL: Device (dpci.1) abstracted from VFIO
>>> EAL: -->Initial SHM Virtual ADDR FFFBB6400000
>>> EAL: -----> DMA size 0x200000
>>> EAL: Total 1 segments found.
>>> --->8---
>>>
>>> I am looking into this but if there is some hint which come to your
>>> mind, it might help.
>>>
>>> Regards,
>>> Shreyansh
>>>
>>
>> Hi Shreyansh,
>>
>> Thanks for the feedback.
>>
>> The "heap on socket 0 was expanded by 10MB" has to do with
>> synchronization requests in primary/secondary processes. I can see
>> you're allocating LPM tables - that's most likely what these allocations
>> are about (it's hotplugging memory).
> 
> I get that but why same message multiple times without any change in the expansion. Further, I don't have multiple process - in fact, I'm working with a single datapath thread.
> Anyways, I will look through the code for this.
> 

Hi Shreyansh,

I misspoke - this has nothing to do with multiprocess. The "request:
mp_malloc_sync" part does, but it's an attempt to notify other processes of
the allocation - if there are no other processes, nothing happens.

However, multiple heap expansions do correspond to multiple allocations.
If you allocate an LPM table that takes up 10MB of hugepage memory, the
heap is expanded by 10MB. If you do that multiple times (e.g. per NIC?),
you get multiple heap expansions. This message is triggered on every heap
expansion.
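
To illustrate (hypothetical snippet, not l3fwd code): with an initially
empty heap, every allocation that cannot be satisfied from already-mapped
pages maps new pages, expands the heap and fires the registered mem event
callbacks, so a loop of large allocations typically produces one such log
line each:

--->8---
#include <rte_malloc.h>

static void
make_big_tables(void)
{
	void *tbl[3];
	int i;

	for (i = 0; i < 3; i++)
		/* each ~10MB request that can't be served from mapped pages
		 * triggers its own heap expansion and callback invocation */
		tbl[i] = rte_zmalloc("big_table", 10 * 1024 * 1024, 0);

	/* ... use the tables, rte_free() them when done ... */
}
--->8---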

>>
>> I think i might have an idea what is going on. I am assuming that you
>> are starting up your DPDK application without any -m or --socket-mem
>> flags, which means you are starting with empty heap.
> 
> Yes, no specific --socket-mem passed as argument.
> 
>>
>> During initialization, certain DPDK features (such as service cores,
>> PMD's) allocate memory. Most likely you have essentially started up with
>> 1 2M page, which is what you see in fslmc logs: this page gets mapped
>> for VFIO.
> 
> Agree.
> 
>>
>> Then, you allocate a bunch of LPM tables, which trigger more memory
>> allocation, and trigger memory allocation callbacks registered through
>> rte_mem_event_register_callback(). One of these callbacks is a VFIO
>> callback, which is registered in eal_vfio.c:rte_vfio_enable(). However,
>> since fslmc bus has its own VFIO implementation that is independent of
>> what happens in EAL VFIO code, what probably happens is that the fslmc
>> bus misses the necessary messages from the memory hotplug to map
>> additional resources for DMA.
> 
> Makes sense
> 
>>
>> Try adding a rte_mem_event_register_callback() somewhere in fslmc init
>> so that it calls necessary map function.
>> eal_vfio.c:vfio_mem_event_callback() should provide a good template on
>> how to approach creating such a callback. Let me know if this works!
> 
> OK. I will give this a try and update you.
> 
>>
>> (as a side note, how can we extend VFIO to move this stuff back into EAL
>> and expose it as an API?)
> 
> The problem is that FSLMC VFIO driver is slightly different from generic VFIO layer in the sense that device in a VFIO container is actually another level of container. Anyways, I will have a look how much generalization is possible. Or else, I will work with the vfio_mem_event_callback() as suggested above.

This can wait :) The callback is probably the proper way to do it right now.

> 
> Thanks for suggestions.
> 
>>
>> --
>> Thanks,
>> Anatoly


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 23/41] mempool: add support for the new allocation methods
  2018-03-20 11:35   ` Shreyansh Jain
  2018-03-20 12:17     ` Burakov, Anatoly
@ 2018-03-23 11:25     ` Burakov, Anatoly
  1 sibling, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-23 11:25 UTC (permalink / raw)
  To: Shreyansh Jain
  Cc: dev, Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, Bruce Richardson,
	Thomas Monjalon, Ananyev, Konstantin, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	Hemant Agrawal

On 20-Mar-18 11:35 AM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
> On Wed, Mar 7, 2018 at 10:26 PM, Anatoly Burakov
> <anatoly.burakov@intel.com> wrote:
>> If a user has specified that the zone should have contiguous memory,
>> use the new _contig allocation API's instead of normal ones.
>> Otherwise, account for the fact that unless we're in IOVA_AS_VA
>> mode, we cannot guarantee that the pages would be physically
>> contiguous, so we calculate the memzone size and alignments as if
>> we were getting the smallest page size available.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
> 
> [...]
> 
>>   static void
>>   mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
>>   {
>> @@ -549,6 +570,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>          unsigned mz_id, n;
>>          unsigned int mp_flags;
>>          int ret;
>> +       bool force_contig, no_contig;
>>
>>          /* mempool must not be populated */
>>          if (mp->nb_mem_chunks != 0)
>> @@ -563,10 +585,46 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>          /* update mempool capabilities */
>>          mp->flags |= mp_flags;
>>
>> -       if (rte_eal_has_hugepages()) {
>> -               pg_shift = 0; /* not needed, zone is physically contiguous */
>> +       no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
>> +       force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
>> +
>> +       /*
>> +        * there are several considerations for page size and page shift here.
>> +        *
>> +        * if we don't need our mempools to have physically contiguous objects,
>> +        * then just set page shift and page size to 0, because the user has
>> +        * indicated that there's no need to care about anything.
> 
> I think the above case is not handled properly here.
> reason below...
> 
>> +        *
>> +        * if we do need contiguous objects, there is also an option to reserve
>> +        * the entire mempool memory as one contiguous block of memory, in
>> +        * which case the page shift and alignment wouldn't matter as well.
>> +        *
>> +        * if we require contiguous objects, but not necessarily the entire
>> +        * mempool reserved space to be contiguous, then there are two options.
>> +        *
>> +        * if our IO addresses are virtual, not actual physical (IOVA as VA
>> +        * case), then no page shift needed - our memory allocation will give us
>> +        * contiguous physical memory as far as the hardware is concerned, so
>> +        * act as if we're getting contiguous memory.
>> +        *
>> +        * if our IO addresses are physical, we may get memory from bigger
>> +        * pages, or we might get memory from smaller pages, and how much of it
>> +        * we require depends on whether we want bigger or smaller pages.
>> +        * However, requesting each and every memory size is too much work, so
>> +        * what we'll do instead is walk through the page sizes available, pick
>> +        * the smallest one and set up page shift to match that one. We will be
>> +        * wasting some space this way, but it's much nicer than looping around
>> +        * trying to reserve each and every page size.
>> +        */
>> +
>> +       if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA) {
>>                  pg_sz = 0;
>> +               pg_shift = 0;
>>                  align = RTE_CACHE_LINE_SIZE;
> 
> So, assuming dpaa2 as example, I ran testpmd. IOVA=VA is the mode.
> pg_sz = 0 is set.
> same as before applying the hotplug patchset except that earlier this
> decision was purely based on availability of hugepages
> (rte_eal_has_hugepages()).
> Moving on...
> 
>> +       } else if (rte_eal_has_hugepages()) {
>> +               pg_sz = get_min_page_size();
>> +               pg_shift = rte_bsf32(pg_sz);
>> +               align = pg_sz;
>>          } else {
>>                  pg_sz = getpagesize();
>>                  pg_shift = rte_bsf32(pg_sz);
>> @@ -585,23 +643,34 @@ rte_mempool_populate_default(struct rte_mempool *mp)
>>                          goto fail;
>>                  }
>>
>> -               mz = rte_memzone_reserve_aligned(mz_name, size,
>> -                       mp->socket_id, mz_flags, align);
>> -               /* not enough memory, retry with the biggest zone we have */
>> -               if (mz == NULL)
>> -                       mz = rte_memzone_reserve_aligned(mz_name, 0,
>> +               if (force_contig) {
>> +                       /*
>> +                        * if contiguous memory for entire mempool memory was
>> +                        * requested, don't try reserving again if we fail.
>> +                        */
>> +                       mz = rte_memzone_reserve_aligned_contig(mz_name, size,
>> +                               mp->socket_id, mz_flags, align);
>> +               } else {
>> +                       mz = rte_memzone_reserve_aligned(mz_name, size,
>>                                  mp->socket_id, mz_flags, align);
>> +                       /* not enough memory, retry with the biggest zone we
>> +                        * have
>> +                        */
>> +                       if (mz == NULL)
>> +                               mz = rte_memzone_reserve_aligned(mz_name, 0,
>> +                                       mp->socket_id, mz_flags, align);
>> +               }
>>                  if (mz == NULL) {
>>                          ret = -rte_errno;
>>                          goto fail;
>>                  }
>>
>> -               if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
>> +               if (no_contig)
>>                          iova = RTE_BAD_IOVA;
>>                  else
>>                          iova = mz->iova;
>>
>> -               if (rte_eal_has_hugepages())
>> +               if (rte_eal_has_hugepages() && force_contig)
> 
> So, pre-hotplugging patch, call used to enter mempool_populate_iova.
> But, with the 'force_contig' not set (in app/test-pmd/testpmd.c:521)
> while calling rte_pktmbuf_pool_create, rte_mempool_populate_va is
> called instead.
> 
>>                          ret = rte_mempool_populate_iova(mp, mz->addr,
>>                                  iova, mz->len,
>>                                  rte_mempool_memchunk_mz_free,
>> --
>> 2.7.4
> 
> This is called with pg_sz = 0:
> 678                 else
>>> # 679                   ret = rte_mempool_populate_virt(mp, mz->addr,
> 680                                 mz->len, pg_sz,
> 681                                 rte_mempool_memchunk_mz_free,
> 682                                 (void *)(uintptr_t)mz);
> 
> In this function,
> 
> 512         /* address and len must be page-aligned */
> 513         if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
> 514                 return -EINVAL;
> 
> This is where error is returned.
> 
> I don't think RTE_PTR_ALIGN_CEIL is designed to handle pg_sz = 0.
> 
> It is roughly equivalent to:
> RTE_PTR_ALIGN_FLOOR(((uintptr_t)addr - 1), pg_sz) which returns NULL
> (0 ~ pg_sz).
> 
> Basically, this ends up failing rte_mempool_populate_default.
> 
> I think the reason is the assumption that when
> rte_mempool_populate_virt is called, it can handle 0 page sizes (there
> would issues besides the above RTE_PTR_ALIGN_CEIL as well, like a
> for-loop looping over off+pg_sz), is wrong. It needs a valid page-size
> value to work with (!0).
> 
> So, basically, DPAA2 is stuck with this patch because of above issue,
> if I am correctly comprehending it as above.
> 
> Regards,
> Shreyansh
> 

Thanks for finding this issue. A fix is now pushed to github for testing.
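
For completeness, here is a tiny stand-alone illustration of the pg_sz == 0
behaviour described above (only an illustration built on the stock
rte_common.h alignment macros, not the pushed fix): with align == 0,
(align - 1) is all-ones, so ~(align - 1) is 0 and the "aligned" pointer is
always NULL, which can never equal addr.

--->8---
#include <stdio.h>
#include <stddef.h>
#include <rte_common.h>

int main(void)
{
	char buf[64];
	void *addr = &buf[0];
	size_t pg_sz = 0;

	/* mirrors the check at the top of rte_mempool_populate_virt() */
	void *aligned = RTE_PTR_ALIGN_CEIL(addr, pg_sz);

	printf("addr=%p aligned=%p -> %s\n", addr, aligned,
	       aligned != addr ? "-EINVAL path taken" : "ok");
	return 0;
}
--->8---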

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 28/41] eal: add support for multiprocess memory hotplug
  2018-03-07 16:56 ` [PATCH v2 28/41] eal: add support for multiprocess " Anatoly Burakov
@ 2018-03-23 15:44   ` Tan, Jianfeng
  0 siblings, 0 replies; 471+ messages in thread
From: Tan, Jianfeng @ 2018-03-23 15:44 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Bruce Richardson, keith.wiles, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz



On 3/8/2018 12:56 AM, Anatoly Burakov wrote:
> This enables multiprocess synchronization for memory hotplug
> requests at runtime (as opposed to initialization).
>
> Basic workflow is the following. Primary process always does initial
> mapping and unmapping, and secondary processes always follow primary
> page map. Only one allocation request can be active at any one time.
>
> When primary allocates memory, it ensures that all other processes
> have allocated the same set of hugepages successfully, otherwise
> any allocations made are being rolled back, and heap is freed back.
> Heap is locked throughout the process, so no race conditions can
> happen.
>
> When primary frees memory, it frees the heap, deallocates affected
> pages, and notifies other processes of deallocations. Since heap is
> freed from that memory chunk, the area basically becomes invisible
> to other processes even if they happen to fail to unmap that
> specific set of pages, so it's completely safe to ignore results of
> sync requests.
>
> When secondary allocates memory, it does not do so by itself.
> Instead, it sends a request to primary process to try and allocate
> pages of specified size and on specified socket, such that a
> specified heap allocation request could complete. Primary process
> then sends all secondaries (including the requestor) a separate
> notification of allocated pages, and expects all secondary
> processes to report success before considering pages as "allocated".
>
> Only after primary process ensures that all memory has been
> successfully allocated in all secondary process, it will respond
> positively to the initial request, and let secondary proceed with
> the allocation. Since the heap now has memory that can satisfy
> allocation request, and it was locked all this time (so no other
> allocations could take place), secondary process will be able to
> allocate memory from the heap.
>
> When secondary frees memory, it hides pages to be deallocated from
> the heap. Then, it sends a deallocation request to primary process,
> so that it deallocates pages itself, and then sends a separate sync
> request to all other processes (including the requestor) to unmap
> the same pages. This way, even if secondary fails to notify other
> processes of this deallocation, that memory will become invisible
> to other processes, and will not be allocated from again.
>
> So, to summarize: address space will only become part of the heap
> if primary process can ensure that all other processes have
> allocated this memory successfully. If anything goes wrong, the
> worst thing that could happen is that a page will "leak" and will
> not be available to neither DPDK nor the system, as some process
> will still hold onto it. It's not an actual leak, as we can account
> for the page - it's just that none of the processes will be able
> to use this page for anything useful, until it gets allocated from
> by the primary.
>
> Due to underlying DPDK IPC implementation being single-threaded,
> some asynchronous magic had to be done, as we need to complete
> several requests before we can definitively allow secondary process
> to use allocated memory (namely, it has to be present in all other
> secondary processes before it can be used). Additionally, only
> one allocation request is allowed to be submitted at once.
>
> Memory allocation requests are only allowed when there are no
> secondary processes currently initializing. To enforce that,
> a shared rwlock is used, that is set to read lock on init (so that
> several secondaries could initialize concurrently), and write lock
> on making allocation requests (so that either secondary init will
> have to wait, or allocation request will have to wait until all
> processes have initialized).
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>
> Notes:
>      v2: - fixed deadlocking on init problem
>          - reverted rte_panic changes (fixed by changes in IPC instead)
>      
>      This problem is evidently complex to solve without multithreaded
>      IPC implementation. An alternative approach would be to process
>      each individual message in its own thread (or at least spawn a
>      thread per incoming request) - that way, we can send requests
>      while responding to another request, and this problem becomes
>      trivial to solve (and in fact it was solved that way initially,
>      before my aversion to certain other programming languages kicked
>      in).
>      
>      Is the added complexity worth saving a couple of thread spin-ups
>      here and there?
>
>   lib/librte_eal/bsdapp/eal/Makefile                |   1 +
>   lib/librte_eal/common/eal_common_memory.c         |  16 +-
>   lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
>   lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
>   lib/librte_eal/common/malloc_mp.c                 | 723 ++++++++++++++++++++++
>   lib/librte_eal/common/malloc_mp.h                 |  86 +++
>   lib/librte_eal/common/meson.build                 |   1 +
>   lib/librte_eal/linuxapp/eal/Makefile              |   1 +
>   8 files changed, 1040 insertions(+), 46 deletions(-)
>   create mode 100644 lib/librte_eal/common/malloc_mp.c
>   create mode 100644 lib/librte_eal/common/malloc_mp.h
...
> +/* callback for asynchronous sync requests for primary. this will either do a
> + * sendmsg with results, or trigger rollback request.
> + */
> +static int
> +handle_sync_response(const struct rte_mp_msg *request,

Rename to handle_async_response()?

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 13/41] eal: replace memseg with memseg lists
  2018-03-07 16:56 ` [PATCH v2 13/41] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-03-24  6:01   ` santosh
  2018-03-24 11:08     ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: santosh @ 2018-03-24  6:01 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

Hi Anatoly,


On Wednesday 07 March 2018 10:26 PM, Anatoly Burakov wrote:
> Before, we were aggregating multiple pages into one memseg, so the
> number of memsegs was small. Now, each page gets its own memseg,
> so the list of memsegs is huge. To accommodate the new memseg list
> size and to keep the under-the-hood workings sane, the memseg list
> is now not just a single list, but multiple lists. To be precise,
> each hugepage size available on the system gets one or more memseg
> lists, per socket.
>
> In order to support dynamic memory allocation, we reserve all
> memory in advance. That is, we do an anonymous mmap() of the entire
> maximum size of memory per hugepage size, per socket (which is
> limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
> RTE_MAX_MEM_PER_TYPE gigabytes worth of memory, whichever is
> smaller), split over multiple lists (which are limited to either
> RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_PER_LIST gigabytes
> per list, whichever is smaller).
>
> So, for each hugepage size, we get (by default) up to 128G worth
> of memory, per socket, split into chunks of up to 32G in size.
> The address space is claimed at the start, in eal_common_memory.c.
> The actual page allocation code is in eal_memalloc.c (Linux-only
> for now), and largely consists of copied EAL memory init code.
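
A back-of-the-envelope illustration of those limits (using only the figures
quoted above; the actual RTE_MAX_* values are build-time configurable):

/* Illustrative arithmetic only, not DPDK code. */
enum {
	MEM_PER_TYPE_GB = 128, /* cap per (hugepage size, socket) pair */
	MEM_PER_LIST_GB = 32,  /* cap per individual memseg list       */
	LISTS_PER_TYPE = MEM_PER_TYPE_GB / MEM_PER_LIST_GB /* == 4 */
};
/* With 1G hugepages that means up to 4 lists of 32 pages each per socket;
 * with 2M pages the per-list memseg count limit may be reached before the
 * 32G cap, depending on RTE_MAX_MEMSEG_PER_LIST.
 */
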
>
> Pages in the list are also indexed by address. That is, in
> non-legacy mode, in order to figure out where a page belongs, one
> can simply look at the base address of a memseg list. Similarly,
> figuring out the IOVA address of a memzone is a matter of finding
> the right memseg list, getting the offset and dividing by the page
> size to get the appropriate memseg. For legacy mode, the old
> behavior of walking the memseg list remains.
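
A minimal sketch of that address-to-memseg lookup, with simplified stand-in
types (the real code uses fbarray-backed memseg lists and different struct
layouts):

#include <stdint.h>
#include <stddef.h>

/* Simplified stand-ins for the real memseg list structures. */
struct seg_list {
	void *base_va;     /* start of the pre-reserved VA block      */
	size_t page_sz;    /* hugepage size backing this list         */
	size_t len;        /* total VA length covered by the list     */
	uint64_t *iova;    /* per-page IOVA, indexed by page number   */
};

/* Resolve a virtual address to (list, page index) with a range check and
 * simple arithmetic - no list walking needed in non-legacy mode.
 */
static int
find_page(const struct seg_list *lists, int n_lists, const void *addr,
		int *list_idx, size_t *page_idx)
{
	int i;

	for (i = 0; i < n_lists; i++) {
		uintptr_t start = (uintptr_t)lists[i].base_va;
		uintptr_t a = (uintptr_t)addr;

		if (a < start || a >= start + lists[i].len)
			continue;
		*list_idx = i;
		*page_idx = (a - start) / lists[i].page_sz;
		return 0; /* IOVA would then be lists[i].iova[*page_idx] */
	}
	return -1; /* legacy mode falls back to walking the segment list */
}
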
>
> Due to switch to fbarray and to avoid any intrusive changes,
> secondary processes are not supported in this commit. Also, one
> particular API call (dump physmem layout) no longer makes sense
> and was removed, according to deprecation notice [1].
>
> In legacy mode, nothing is preallocated, and all memsegs are in
> a list like before, but each segment still resides in an appropriate
> memseg list.
>
> The rest of the changes are really ripple effects from the memseg
> change - heap changes, compile fixes, and rewrites to support
> fbarray-backed memseg lists.
>
> [1] http://dpdk.org/dev/patchwork/patch/34002/
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---

Thanks for the good work!
A few observations:
# Noticed a performance regression on the thunderx platform for the l3fwd
application; it drops by 3%. git bisect shows this changeset is the offending
commit. I'm still investigating the reason for the perf dip.
Would like to know - have you noticed any regression on the x86 platform?

# In the next version, please make sure that each individual patch builds
successfully. Right now, some patches depend on others, which leads to build
breaks, observed while git-bisecting.

A few examples:
>> fa71cdef6963ed795fdd7e7f35085170bb300e39
>> 1037fcd989176c5cc83db6223534205cac469765
>> befdec10759d30275a17a829919ee45228d91d3c
>> 495e60f4e02af8a344c0f817a60d1ee9b9322df4
[the above commits are from your github repo.]

# Nits:
Perhaps you could club all the commits below into one single patch,
as the changes are identical... that way you'd reduce the patch count by quite a few.
9a1e2a7bd9f6248c680ad3e444b6f173eb92d457 net/vmxnet3: use contiguous allocation for DMA memory
46388b194cd559b5cf7079e01b04bf67a99b64d7 net/virtio: use contiguous allocation for DMA memory
a3d2eb10bd998ba3ae3a3d39adeaff38d2e53a9d net/qede: use contiguous allocation for DMA memory
6f16b23ef1f472db475edf05159dea5ae741dbf8 net/i40e: use contiguous allocation for DMA memory
f9f7576eed35cb6aa50793810cdda43bcc0f4642 net/enic: use contiguous allocation for DMA memory
2af6c33009b8008da7028a351efed2932b1a13d0 net/ena: use contiguous allocation for DMA memory
18003e22bd7087e5e2e03543cb662d554f7bec52 net/cxgbe: use contiguous allocation for DMA memory
59f79182502dcb3634dfa3e7b918195829777460 net/bnx2x: use contiguous allocation for DMA memory
f481a321e41da82ddfa00f5ddbcb42fc29e6ae76 net/avf: use contiguous allocation for DMA memory
5253e9b757c1855a296656d939f5c28e651fea69 crypto/qat: use contiguous allocation for DMA memory
297ab037b4c0d9d725aa6cfdd2c33f7cd9396899 ethdev: use contiguous allocation for DMA memory

Thanks.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 13/41] eal: replace memseg with memseg lists
  2018-03-24  6:01   ` santosh
@ 2018-03-24 11:08     ` Burakov, Anatoly
  2018-03-24 12:23       ` santosh
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-24 11:08 UTC (permalink / raw)
  To: santosh, dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On 24-Mar-18 6:01 AM, santosh wrote:
> Hi Anatoly,
> 
> Thanks for the good work!
> A few observations:
> # Noticed a performance regression on the thunderx platform for the l3fwd
> application; it drops by 3%. git bisect shows this changeset is the offending
> commit. I'm still investigating the reason for the perf dip.
> Would like to know - have you noticed any regression on the x86 platform?

I haven't noticed any regressions on x86. Would it by any chance be due 
to the fact that memory segments are now non-contiguous or are allocated 
from smaller page sizes first?

I am in the process of preparing a v3, which moves some things around 
and is better at git bisect (and fixes all compile issues I am or was
made aware of). Does the performance regression also happen in legacy mode?

Thanks for testing!

> Perhaps you could club all the commits below into one single patch,
> as the changes are identical... that way you'd reduce the patch count by quite a few.
> 9a1e2a7bd9f6248c680ad3e444b6f173eb92d457 net/vmxnet3: use contiguous allocation for DMA memory
> 46388b194cd559b5cf7079e01b04bf67a99b64d7 net/virtio: use contiguous allocation for DMA memory
> a3d2eb10bd998ba3ae3a3d39adeaff38d2e53a9d net/qede: use contiguous allocation for DMA memory
> 6f16b23ef1f472db475edf05159dea5ae741dbf8 net/i40e: use contiguous allocation for DMA memory
> f9f7576eed35cb6aa50793810cdda43bcc0f4642 net/enic: use contiguous allocation for DMA memory
> 2af6c33009b8008da7028a351efed2932b1a13d0 net/ena: use contiguous allocation for DMA memory
> 18003e22bd7087e5e2e03543cb662d554f7bec52 net/cxgbe: use contiguous allocation for DMA memory
> 59f79182502dcb3634dfa3e7b918195829777460 net/bnx2x: use contiguous allocation for DMA memory
> f481a321e41da82ddfa00f5ddbcb42fc29e6ae76 net/avf: use contiguous allocation for DMA memory
> 5253e9b757c1855a296656d939f5c28e651fea69 crypto/qat: use contiguous allocation for DMA memory
> 297ab037b4c0d9d725aa6cfdd2c33f7cd9396899 ethdev: use contiguous allocation for DMA memory

I would like to keep these as separate patches. It makes it easier to 
track which changes were accepted by maintainers of respective drivers, 
and which weren't.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 13/41] eal: replace memseg with memseg lists
  2018-03-24 11:08     ` Burakov, Anatoly
@ 2018-03-24 12:23       ` santosh
  2018-03-24 12:32         ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: santosh @ 2018-03-24 12:23 UTC (permalink / raw)
  To: Burakov, Anatoly, dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz



On Saturday 24 March 2018 04:38 PM, Burakov, Anatoly wrote:
> On 24-Mar-18 6:01 AM, santosh wrote:
>> Hi Anatoly,
>>
>> Thanks for the good work!
>> A few observations:
>> # Noticed a performance regression on the thunderx platform for the l3fwd
>> application; it drops by 3%. git bisect shows this changeset is the offending
>> commit. I'm still investigating the reason for the perf dip.
>> Would like to know - have you noticed any regression on the x86 platform?
>
> I haven't noticed any regressions on x86. Would it by any chance be due to the fact that memory segments are now non-contiguous or are allocated from smaller page sizes first?
>
> I am in the process of preparing a v3, which moves some things around and is better at git bisect (and fixes all compile issues I am or was made aware of). Does the performance regression also happen in legacy mode?
>
Test was run in legacy mode only, and the performance regression was observed there.

Thanks.
[..]

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 13/41] eal: replace memseg with memseg lists
  2018-03-24 12:23       ` santosh
@ 2018-03-24 12:32         ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-03-24 12:32 UTC (permalink / raw)
  To: santosh, dev
  Cc: Thomas Monjalon, Yuanhan Liu, Maxime Coquelin, Tiwei Bie,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz

On 24-Mar-18 12:23 PM, santosh wrote:
> 
> 
> On Saturday 24 March 2018 04:38 PM, Burakov, Anatoly wrote:
>> On 24-Mar-18 6:01 AM, santosh wrote:
>>> Hi Anatoly,
>>>
>>> Thanks for the good work!
>>> A few observations:
>>> # Noticed a performance regression on the thunderx platform for the l3fwd
>>> application; it drops by 3%. git bisect shows this changeset is the offending
>>> commit. I'm still investigating the reason for the perf dip.
>>> Would like to know - have you noticed any regression on the x86 platform?
>>
>> I haven't noticed any regressions on x86. Would it by any chance be due to the fact that memory segments are now non-contiguous or are allocated from smaller page sizes first?
>>
>> I am in the process of preparing a v3, which moves some things around and is better at git bisect (and fixes all compile issues I am or was made aware of). Does the performance regression also happen in legacy mode?
>>
> Test was run in legacy mode only, and the performance regression was observed there.
> 
> Thanks.
> [..]
> 
> 

Legacy mode does not do IPC memory allocation, so that is out of the 
question. Does thunderx do any address translation or other memory 
lookups on the fast path? That is the only thing that comes to mind that 
could affect performance once all allocations are complete.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 40/41] net/virtio: use contiguous allocation for DMA memory
  2018-03-07 16:57 ` [PATCH v2 40/41] net/virtio: " Anatoly Burakov
@ 2018-03-28 11:58   ` Maxime Coquelin
  0 siblings, 0 replies; 471+ messages in thread
From: Maxime Coquelin @ 2018-03-28 11:58 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Yuanhan Liu, Tiwei Bie, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz

Hi Anatoly,

On 03/07/2018 05:57 PM, Anatoly Burakov wrote:
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
> ---
> 
> Notes:
>      Not sure if virtio needs to allocate DMA-capable memory,
>      being a software driver and all. Corrections welcome.

Yes, we need the ring memory to be contiguous in physical address space.

>   drivers/net/virtio/virtio_ethdev.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)

So:

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v2 00/41] Memory Hotplug for DPDK
  2018-03-20 10:07     ` Burakov, Anatoly
@ 2018-03-29 10:57       ` Shreyansh Jain
  0 siblings, 0 replies; 471+ messages in thread
From: Shreyansh Jain @ 2018-03-29 10:57 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev, Thomas Monjalon, Hemant Agrawal

Hello Anatoly,

> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Tuesday, March 20, 2018 3:38 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>
> Subject: Re: [dpdk-dev] [PATCH v2 00/41] Memory Hotplug for DPDK
> 
> On 19-Mar-18 8:58 AM, Shreyansh Jain wrote:
> > Hi Anatoly,
> >

[...]

> >
> > [...]
> >
> > -
> > Shreyansh
> >
> 
> Hi Shreyansh,
> 
> It will be fixed before merge, yes. I would expect this code to arrive
> in Github in the next few days.

I have been integrating with this patchset for a considerable time now - attempting to get it working on dpaa2 and dpaa.

Thanks to your quick responses on IRC, at least the VA part of DPAA2 is functional now. 

Unfortunately, I am observing a degradation in performance with the patches.

Compared to master (20526313ba4)  |  % Impact
----------------------------------------------
DPAA2 IOVA=VA (hotplug patch)     |  ~ -4.0%
DPAA2 IOVA=VA --legacy-mem        |  ~ -4.5%
DPAA2 (Physical)                  |  Pending
DPAA  (Physical)                  |  Pending

Theoretically, I agree with the discussion on a similar observation in [1] - but, at least as of writing this, I have no idea why the drop is occurring.

[1] http://dpdk.org/ml/archives/dev/2018-March/093685.html

As a side note - the base for the hotplug memory patches and master have some differences (since c06ddf9698e0) - though I don't see any patch in there which might impact performance. But it might be a good idea to rebase - just to remove that doubt.

I noticed that you have already fixed the issues related to IOVA==PA mapping in your recent code. I haven't had the chance to completely integrate my code against those. I will look through the patches in the coming days.

Regards,
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v2 24/41] vfio: allow to map other memory regions
  2018-03-07 16:56 ` [PATCH v2 24/41] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-03-30  9:42   ` Gowrishankar
  2018-04-02 11:36   ` Gowrishankar
  1 sibling, 0 replies; 471+ messages in thread
From: Gowrishankar @ 2018-03-30  9:42 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: Jonas Pfefferle1, Chao Zhu, dev, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

The patch below adds powerpc arch-specific changes.

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 63 +++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 4e9e296..063982c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -24,6 +24,7 @@
 static int vfio_type1_dma_map(int);
 static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
 static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
@@ -41,8 +42,7 @@
 		.type_id = RTE_VFIO_SPAPR,
 		.name = "sPAPR",
 		.dma_map_func = &vfio_spapr_dma_map,
-		.dma_user_map_func = NULL
-		// TODO: work with PPC64 people on enabling this, window size!
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
 	},
 	/* IOMMU-less mode */
 	{
@@ -801,10 +801,51 @@ struct spapr_create_window_walk_param {
 }
 
 static int
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
 vfio_spapr_dma_map_walk(const struct rte_memseg_list *msl __rte_unused,
 		const struct rte_memseg *ms, void *arg)
 {
-	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
@@ -828,17 +869,8 @@ struct spapr_create_window_walk_param {
 		return -1;
 	}
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = addr;
-	dma_map.size = len;
-	dma_map.iova = hw_addr;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-			VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA,
-			&dma_map);
-
+	ret = vfio_spapr_dma_mem_map(*vfio_container_fd, addr,
+			hw_addr, len, 1);
 	if (ret) {
 		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
@@ -852,7 +884,6 @@ struct spapr_create_window_walk_param {
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	int ret;
-	uint64_t hugepage_sz = 0;
 	struct spapr_create_window_walk_param wa;
 
 	struct vfio_iommu_spapr_tce_info info = {
@@ -890,7 +921,7 @@ struct spapr_create_window_walk_param {
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(hugepage_sz);
+	create.page_shift = __builtin_ctzll(wa.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v2 24/41] vfio: allow to map other memory regions
  2018-03-07 16:56 ` [PATCH v2 24/41] vfio: allow to map other memory regions Anatoly Burakov
  2018-03-30  9:42   ` Gowrishankar
@ 2018-04-02 11:36   ` Gowrishankar
  1 sibling, 0 replies; 471+ messages in thread
From: Gowrishankar @ 2018-04-02 11:36 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: Jonas Pfefferle1, Chao Zhu, dev, Gowrishankar Muthukrishnan

From: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

The patch below adds powerpc arch-specific changes.

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 110 +++++++++++++++++++++++++++++++--
 1 file changed, 105 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 4e9e296..985acf4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -24,6 +24,7 @@
 static int vfio_type1_dma_map(int);
 static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
 static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
@@ -41,8 +42,7 @@
 		.type_id = RTE_VFIO_SPAPR,
 		.name = "sPAPR",
 		.dma_map_func = &vfio_spapr_dma_map,
-		.dma_user_map_func = NULL
-		// TODO: work with PPC64 people on enabling this, window size!
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
 	},
 	/* IOMMU-less mode */
 	{
@@ -838,7 +838,6 @@ struct spapr_create_window_walk_param {
 
 	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA,
 			&dma_map);
-
 	if (ret) {
 		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
@@ -852,7 +851,6 @@ struct spapr_create_window_walk_param {
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	int ret;
-	uint64_t hugepage_sz = 0;
 	struct spapr_create_window_walk_param wa;
 
 	struct vfio_iommu_spapr_tce_info info = {
@@ -890,7 +888,7 @@ struct spapr_create_window_walk_param {
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(hugepage_sz);
+	create.page_shift = __builtin_ctzll(wa.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -912,6 +910,108 @@ struct spapr_create_window_walk_param {
 }
 
 static int
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	int ret;
+	struct spapr_create_window_walk_param wa = {
+		.hugepage_sz = 0,
+	};
+	struct vfio_iommu_spapr_tce_create create = {
+		.argsz = sizeof(create),
+	};
+
+	/* check if DMA window is from 0 to max(phys_addr + len) */
+	wa.create = &create;
+	rte_memseg_walk(vfio_spapr_create_window_walk, &wa);
+	create.window_size = rte_align64pow2(create.window_size);
+	if (iova > create.window_size) {
+		struct vfio_iommu_spapr_tce_info info = {
+			.argsz = sizeof(info),
+		};
+		struct vfio_iommu_spapr_tce_remove remove = {
+			.argsz = sizeof(remove),
+		};
+
+		/* query spapr iommu info */
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+					"error %i (%s)\n", errno, strerror(errno));
+			return -1;
+		}
+
+		/* remove old DMA window */
+		remove.start_addr = info.dma32_window_start;
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+					"error %i (%s)\n", errno, strerror(errno));
+			return -1;
+		}
+		create.page_shift = __builtin_ctzll(wa.hugepage_sz);
+		create.levels = 1;
+
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+					"error %i (%s)\n", errno, strerror(errno));
+			return -1;
+		}
+
+		if (create.start_addr != 0) {
+			RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
+			return -1;
+		}
+
+	}
+
+	if (do_map != 0) {
+		if (rte_memseg_walk(vfio_spapr_dma_map_walk, &vfio_container_fd))
+			return -1;
+	} else {
+		struct vfio_iommu_type1_dma_unmap dma_unmap;
+		struct vfio_iommu_spapr_register_memory reg = {
+			.argsz = sizeof(reg),
+			.flags = 0
+		};
+
+		/* for unmap, check if iova within DMA window */
+		if (iova > create.window_size) {
+			RTE_LOG(ERR, EAL, "  iova beyond DMA window, cannot unmap\n");
+			return -1;
+		}
+
+		reg.vaddr = (uintptr_t) vaddr;
+		reg.size = len;
+		ret = ioctl(vfio_container_fd,
+			VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY, &reg);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot unregister vaddr for IOMMU, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+	return 0;
+}
+
+static int
 vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 {
 	/* No-IOMMU mode does not need DMA mapping */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
  2018-03-08 10:18   ` Pavan Nikhilesh
  2018-03-19  8:58   ` Shreyansh Jain
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-05 14:24     ` Shreyansh Jain
                       ` (72 more replies)
  2018-04-03 23:21   ` [PATCH v3 01/68] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                     ` (67 subsequent siblings)
  70 siblings, 73 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].

Dependencies (to be applied in specified order):
- IPC asynchronous request API patch [2]
- Function to return number of sockets [3]
- EAL IOVA fix [4]

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [5]
- EAL NUMA node count changes [6]

The vast majority of changes are in the EAL and malloc, the external API
disruption is minimal: a new set of API's are added for contiguous memory
allocation for rte_memzone, and a few API additions in rte_memory due to
switch to memseg_lists as opposed to memsegs. Every other API change is
internal to EAL, and all of the memory allocation/freeing is handled
through rte_malloc, with no externally visible API changes.

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Added alloc/free for pages happening as needed on rte_malloc/rte_free
 * Added contiguous memory allocation API's for rte_memzone
 * Added convenience API calls to walk over memsegs
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [7]
 * Callbacks for registering memory allocations
 * Callbacks for allowing/disallowing allocations above specified limit
 * Multiprocess support done via DPDK IPC introduced in 18.02

The biggest difference is a "memseg" now represents a single page (as opposed to
being a big contiguous block of pages). As a consequence, both memzones and
malloc elements are no longer guaranteed to be physically contiguous, unless
the user asks for it at reserve time. To preserve whatever functionality that
was dependent on previous behavior, a legacy memory option is also provided,
however it is expected (or perhaps vainly hoped) to be temporary solution.

Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.

For v3, the following limitations are present:
- VFIO support is only smoke-tested (but is expected to work), VFIO support
  with secondary processes is not tested; work is ongoing to validate VFIO
  for all use cases
- FSLMC bus VFIO code is not yet integrated, work is in progress

For testing, it is recommended to use the GitHub repository [8], as it will
have all of the dependencies already integrated.

v3:
    - Lots of compile fixes
    - Fixes for multiprocess synchronization
    - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
    - Fixes for mempool size calculation
    - Added convenience memseg walk() API's
    - Added alloc validation callback

v2: - fixed deadlock at init
    - reverted rte_panic changes at init, this is now handled inside IPC

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
[3] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
[4] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
[5] http://dpdk.org/dev/patchwork/patch/34002/
[6] http://dpdk.org/dev/patchwork/patch/33853/
[7] http://dpdk.org/dev/patchwork/patch/24484/
[8] https://github.com/anatolyburakov/dpdk

Anatoly Burakov (68):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  eal: move all locking to heap
  eal: make malloc heap a doubly-linked list
  eal: add function to dump malloc heap contents
  test: add command to dump malloc heap contents
  eal: make malloc_elem_join_adjacent_free public
  eal: make malloc free list remove public
  eal: make malloc free return resulting malloc element
  eal: replace panics with error messages in malloc
  eal: add backend support for contiguous allocation
  eal: enable reserving physically contiguous memzones
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory
  net/bnxt: use contiguous allocation for DMA memory
  mempool: add support for the new allocation methods
  eal: add function to walk all memsegs
  bus/fslmc: use memseg walk instead of iteration
  bus/pci: use memseg walk instead of iteration
  net/mlx5: use memseg walk instead of iteration
  eal: use memseg walk instead of iteration
  mempool: use memseg walk instead of iteration
  test: use memseg walk instead of iteration
  vfio/type1: use memseg walk instead of iteration
  vfio/spapr: use memseg walk instead of iteration
  eal: add contig walk function
  virtio: use memseg contig walk instead of iteration
  eal: add iova2virt function
  bus/dpaa: use iova2virt instead of memseg iteration
  bus/fslmc: use iova2virt instead of memseg iteration
  crypto/dpaa_sec: use iova2virt instead of memseg iteration
  eal: add virt2memseg function
  bus/fslmc: use virt2memseg instead of iteration
  net/mlx4: use virt2memseg instead of iteration
  net/mlx5: use virt2memseg instead of iteration
  crypto/dpaa_sec: use virt2memseg instead of iteration
  eal: use memzone walk instead of iteration
  vfio: allow to map other memory regions
  eal: add "legacy memory" option
  eal: add rte_fbarray
  eal: replace memseg with memseg lists
  eal: replace memzone array with fbarray
  eal: add support for mapping hugepages at runtime
  eal: add support for unmapping pages at runtime
  eal: add "single file segments" command-line option
  eal: add API to check if memory is contiguous
  eal: prepare memseg lists for multiprocess sync
  eal: read hugepage counts from node-specific sysfs path
  eal: make use of memory hotplug for init
  eal: share hugepage info primary and secondary
  eal: add secondary process init with memory hotplug
  eal: enable memory hotplug support in rte_malloc
  eal: add support for multiprocess memory hotplug
  eal: add support for callbacks on memory hotplug
  eal: enable callbacks on malloc/free and mp sync
  vfio: enable support for mem event callbacks
  eal: enable non-legacy memory mode
  eal: add memory validator callback
  eal: enable validation before new page allocation
  eal: prevent preallocated pages from being freed

 config/common_base                                |   15 +-
 config/defconfig_i686-native-linuxapp-gcc         |    3 +
 config/defconfig_i686-native-linuxapp-icc         |    3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |    3 +
 config/rte_config.h                               |    7 +-
 doc/guides/rel_notes/deprecation.rst              |    9 -
 drivers/bus/dpaa/rte_dpaa_bus.h                   |   12 +-
 drivers/bus/fslmc/fslmc_vfio.c                    |   80 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   27 +-
 drivers/bus/pci/Makefile                          |    3 +
 drivers/bus/pci/linux/pci.c                       |   28 +-
 drivers/bus/pci/meson.build                       |    3 +
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   30 +-
 drivers/crypto/qat/Makefile                       |    3 +
 drivers/crypto/qat/meson.build                    |    3 +
 drivers/crypto/qat/qat_qp.c                       |   23 +-
 drivers/event/dpaa/Makefile                       |    3 +
 drivers/event/dpaa2/Makefile                      |    3 +
 drivers/mempool/dpaa/Makefile                     |    3 +
 drivers/mempool/dpaa2/Makefile                    |    3 +
 drivers/net/avf/Makefile                          |    3 +
 drivers/net/avf/avf_ethdev.c                      |    2 +-
 drivers/net/bnx2x/Makefile                        |    3 +
 drivers/net/bnx2x/bnx2x.c                         |    2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/bnxt/Makefile                         |    3 +
 drivers/net/bnxt/bnxt_ethdev.c                    |    6 +-
 drivers/net/bnxt/bnxt_ring.c                      |    3 +-
 drivers/net/bnxt/bnxt_vnic.c                      |    2 +-
 drivers/net/cxgbe/Makefile                        |    3 +
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/dpaa/Makefile                         |    3 +
 drivers/net/dpaa2/Makefile                        |    3 +
 drivers/net/dpaa2/meson.build                     |    3 +
 drivers/net/ena/Makefile                          |    3 +
 drivers/net/ena/base/ena_plat_dpdk.h              |    7 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/Makefile                         |    3 +
 drivers/net/enic/enic_main.c                      |    4 +-
 drivers/net/i40e/Makefile                         |    3 +
 drivers/net/i40e/i40e_ethdev.c                    |    2 +-
 drivers/net/i40e/i40e_rxtx.c                      |    2 +-
 drivers/net/i40e/meson.build                      |    3 +
 drivers/net/mlx4/mlx4_mr.c                        |   17 +-
 drivers/net/mlx5/Makefile                         |    3 +
 drivers/net/mlx5/mlx5.c                           |   25 +-
 drivers/net/mlx5/mlx5_mr.c                        |   18 +-
 drivers/net/octeontx/Makefile                     |    3 +
 drivers/net/qede/Makefile                         |    3 +
 drivers/net/qede/base/bcm_osal.c                  |    5 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   83 +-
 drivers/net/vmxnet3/Makefile                      |    3 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    7 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   83 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |   65 +-
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   48 +
 lib/librte_eal/bsdapp/eal/eal_memory.c            |  222 +++-
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 ++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  359 +++++++
 lib/librte_eal/common/eal_common_memory.c         |  804 ++++++++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  274 +++--
 lib/librte_eal/common/eal_common_options.c        |   13 +-
 lib/librte_eal/common/eal_filesystem.h            |   30 +
 lib/librte_eal/common/eal_hugepages.h             |   11 +-
 lib/librte_eal/common/eal_internal_cfg.h          |   12 +-
 lib/librte_eal/common/eal_memalloc.h              |   80 ++
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   28 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  353 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |   10 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |  232 ++++-
 lib/librte_eal/common/include/rte_memzone.h       |  159 ++-
 lib/librte_eal/common/include/rte_vfio.h          |   39 +
 lib/librte_eal/common/malloc_elem.c               |  433 ++++++--
 lib/librte_eal/common/malloc_elem.h               |   43 +-
 lib/librte_eal/common/malloc_heap.c               |  704 ++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  744 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   85 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |   62 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  218 +++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1124 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 1119 ++++++++++++--------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  491 +++++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   12 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   33 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/Makefile                       |    3 +
 lib/librte_mempool/meson.build                    |    3 +
 lib/librte_mempool/rte_mempool.c                  |  138 ++-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   30 +-
 test/test/test_memory.c                           |   27 +-
 test/test/test_memzone.c                          |   62 +-
 104 files changed, 8434 insertions(+), 1263 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v3 01/68] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (2 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 02/68] eal: move all locking to heap Anatoly Burakov
                     ` (66 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.
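
As a usage illustration (an editorial sketch based on the declaration and
flags added by this patch, not code taken from it), reserving a VA region
for later hugepage mapping could look like this:

#include <stdio.h>
#include <rte_memory.h>   /* RTE_PGSIZE_2M */
#include <rte_errno.h>

#include "eal_private.h"  /* eal_get_virtual_area() and its flags */

static void *
reserve_va_block(void)
{
	size_t size = RTE_PGSIZE_2M * 512; /* ask for 1G worth of VA */
	void *va;

	/* No particular address requested; allow the area to shrink if the
	 * full size is unavailable, and keep the reservation mapped (no
	 * EAL_VIRTUAL_AREA_UNMAP) so it can be backed by hugepages later.
	 */
	va = eal_get_virtual_area(NULL, &size, RTE_PGSIZE_2M,
			EAL_VIRTUAL_AREA_ALLOW_SHRINK, 0);
	if (va == NULL)
		printf("reservation failed: %s\n", rte_strerror(rte_errno));
	else
		printf("reserved %zu bytes at %p\n", size, va);
	return va;
}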

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: replace uint64_t with size_t for size variables

 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..5b8ced4 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				(size_t)baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..3fed436 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index ecf375b..5642cc8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1343,7 +1275,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1354,11 +1286,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1368,6 +1295,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		size_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1376,35 +1305,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
 					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1469,7 +1389,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1478,8 +1397,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 02/68] eal: move all locking to heap
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (3 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 01/68] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 03/68] eal: make malloc heap a doubly-linked list Anatoly Burakov
                     ` (65 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating or freeing OS memory, which
would involve growing or shrinking the heap.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 03/68] eal: make malloc heap a doubly-linked list
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (4 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 02/68] eal: move all locking to heap Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:32     ` Stephen Hemminger
  2018-04-03 23:21   ` [PATCH v3 04/68] eal: add function to dump malloc heap contents Anatoly Burakov
                     ` (64 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we are switching to
a doubly linked list and preparing the infrastructure to support it.

Since our heap now knows where its first and last elements are,
there is no longer any need to have a dummy element at the end of
each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.
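
A minimal sketch of what removing the dummy element means in practice
(field names as in the diff below): end-of-list is now a NULL next/prev
pointer plus the heap's first/last bookkeeping, so joining two elements
at the tail of the heap looks like this:

    /* join elem2 into elem1 (cf. join_elem() below) */
    elem1->size += elem2->size;
    if (elem2->next != NULL)
        elem2->next->prev = elem1;
    else
        elem1->heap->last = elem1;  /* elem2 was the last element */
    elem1->next = elem2->next;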

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..d43fa90 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *volatile first;
+	struct malloc_elem *volatile last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere in between start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 04/68] eal: add function to dump malloc heap contents
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (5 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 03/68] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 05/68] test: add command " Anatoly Burakov
                     ` (63 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.
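
In other words, walking a heap is now a plain pointer chase starting
from heap->first (a condensed view of malloc_heap_dump() from the diff
below):

    struct malloc_elem *elem;

    rte_spinlock_lock(&heap->lock);
    for (elem = heap->first; elem != NULL; elem = elem->next)
        malloc_elem_dump(elem, f);  /* FREE, BUSY and PAD alike */
    rte_spinlock_unlock(&heap->lock);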

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: mark function as experimental

 lib/librte_eal/common/include/rte_malloc.h | 10 ++++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 17 +++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 83 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a9fb7e4 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -13,6 +13,7 @@
 
 #include <stdio.h>
 #include <stddef.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 
 #ifdef __cplusplus
@@ -278,6 +279,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to dump contents of a given heap to a file
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..f11a822 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,23 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int idx;
+
+	for (idx = 0; idx < rte_socket_count(); idx++) {
+		unsigned int socket = rte_socket_id_by_idx(idx);
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dd38783..d9fc458 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -222,6 +222,7 @@ EXPERIMENTAL {
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
+	rte_malloc_dump_heaps;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 05/68] test: add command to dump malloc heap contents
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (6 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 04/68] eal: add function to dump malloc heap contents Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 06/68] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
                     ` (62 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 06/68] eal: make malloc_elem_join_adjacent_free public
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (7 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 05/68] test: add command " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 07/68] eal: make malloc free list remove public Anatoly Burakov
                     ` (61 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We need this function to join newly allocated segments with the heap.
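
To illustrate the intended use, a later patch could hook up a freshly
mapped segment roughly like this; this is a sketch under that
assumption, not code from this patch:

    /* hypothetical: memory for new memseg 'ms' has just been mapped */
    struct malloc_elem *elem = ms->addr;

    malloc_elem_init(elem, heap, ms, ms->len - MALLOC_ELEM_OVERHEAD);
    malloc_elem_insert(elem);                     /* link into heap list */
    elem = malloc_elem_join_adjacent_free(elem);  /* merge with neighbours */
    malloc_elem_free_list_insert(elem);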

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 07/68] eal: make malloc free list remove public
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (8 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 06/68] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 08/68] eal: make malloc free return resulting malloc element Anatoly Burakov
                     ` (60 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 08/68] eal: make malloc free return resulting malloc element
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (9 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 07/68] eal: make malloc free list remove public Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 09/68] eal: replace panics with error messages in malloc Anatoly Burakov
                     ` (59 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This will be needed because we need to know how big the new empty
space is, to check whether we can free some pages as a result.
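
A hedged sketch of how a caller might use the returned element (the
actual page-freeing logic arrives later in the series; only the part
marked hypothetical is not from this series):

    struct malloc_elem *free_elem = malloc_elem_free(elem);

    if (free_elem != NULL) {
        /* size of the merged free block, element overhead included */
        size_t free_len = free_elem->size;

        /* hypothetical follow-up: if [free_elem, free_elem + free_len)
         * covers whole pages, those pages could be unmapped and given
         * back to the OS */
    }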

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 09/68] eal: replace panics with error messages in malloc
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (10 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 08/68] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 10/68] eal: add backend support for contiguous allocation Anatoly Burakov
                     ` (58 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/rte_malloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index f11a822..2cda48e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -30,7 +30,7 @@ void rte_free(void *addr)
 {
 	if (addr == NULL) return;
 	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
-		rte_panic("Fatal error: Invalid memory\n");
+		RTE_LOG(ERR, EAL, "Error: Invalid memory\n");
 }
 
 /*
@@ -134,8 +134,10 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 		return rte_malloc(NULL, size, align);
 
 	struct malloc_elem *elem = malloc_elem_from_data(ptr);
-	if (elem == NULL)
-		rte_panic("Fatal error: memory corruption detected\n");
+	if (elem == NULL) {
+		RTE_LOG(ERR, EAL, "Error: memory corruption detected\n");
+		return NULL;
+	}
 
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 10/68] eal: add backend support for contiguous allocation
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (11 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 09/68] eal: replace panics with error messages in malloc Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 11/68] eal: enable reserving physically contiguous memzones Anatoly Burakov
                     ` (57 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Also, add a dummy function to check a malloc element for physical
contiguity. For now, assume hugepage memory is always contiguous,
while non-hugepage memory will be checked.
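
The non-hugepage check is a simple page walk comparing IOVAs; condensed
from elem_check_phys_contig() in the diff below:

    /* hugepage or IOVA-as-VA memory is assumed contiguous; otherwise walk
     * every page covering [start, start + size) and require each page's
     * IOVA to follow the previous one by exactly one page */
    expected = rte_mem_virt2iova(start_page) + pagesz;
    for (cur_page = RTE_PTR_ADD(start_page, pagesz); cur_page <= end_page;
            cur_page = RTE_PTR_ADD(cur_page, pagesz), expected += pagesz) {
        if (rte_mem_virt2iova(cur_page) != expected)
            return false;
    }
    return true;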

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved this patch earlier
    - Added physical contiguousness checking function

 lib/librte_eal/common/eal_common_memzone.c |  23 +++---
 lib/librte_eal/common/malloc_elem.c        | 125 ++++++++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h        |   6 +-
 lib/librte_eal/common/malloc_heap.c        |  11 +--
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   7 +-
 6 files changed, 133 insertions(+), 43 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..16a2e7a 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -188,7 +189,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
+			requested_len, flags, align, bound, contig);
 
 	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
 		/* try other heaps */
@@ -197,7 +198,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 				continue;
 
 			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
+					NULL, requested_len, flags, align,
+					bound, contig);
 			if (mz_addr != NULL)
 				break;
 		}
@@ -235,9 +237,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -248,7 +250,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -265,7 +267,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -277,7 +279,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -289,7 +291,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..87695b9 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -6,6 +6,7 @@
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>
+#include <unistd.h>
 #include <sys/queue.h>
 
 #include <rte_memory.h>
@@ -94,33 +95,112 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+		void *start, size_t size)
+{
+	rte_iova_t cur, expected;
+	void *start_page, *end_page, *cur_page;
+	size_t pagesz;
+
+	/* for hugepage memory or IOVA as VA, it's always contiguous */
+	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* otherwise, check if start and end are within the same page */
+	pagesz = getpagesize();
+
+	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
+	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
+
+	if (start_page == end_page)
+		return true;
+
+	/* if they are from different pages, check if they are contiguous */
+
+	/* if we can't access physical addresses, assume non-contiguous */
+	if (!rte_eal_using_phys_addrs())
+		return false;
+
+	/* skip first iteration */
+	cur = rte_mem_virt2iova(start_page);
+	expected = cur + pagesz;
+	cur_page = RTE_PTR_ADD(start_page, pagesz);
+
+	while (cur_page <= end_page) {
+		cur = rte_mem_virt2iova(cur_page);
+		if (cur != expected)
+			return false;
+		cur_page = RTE_PTR_ADD(cur_page, pagesz);
+		expected += pagesz;
+	}
+	return true;
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
+
+	/*
+	 * we're allocating from the end, so adjust the size of element by
+	 * alignment size.
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
+
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+		/* if the new start point is before the exist start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->ms,
+					(void *)new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *)new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +209,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +339,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..34bd268 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
+#include <stdbool.h>
+
 #include <rte_memory.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
@@ -123,7 +125,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +133,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..564b61a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -88,7 +88,7 @@ malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -97,7 +97,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
@@ -121,7 +122,7 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
 		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+		size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
@@ -130,9 +131,9 @@ malloc_heap_alloc(struct malloc_heap *heap,
 
 	rte_spinlock_lock(&heap->lock);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..c57b59a 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+		unsigned int flags, size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 2cda48e..436818a 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket, i;
@@ -60,7 +61,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -71,7 +72,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 		if (ret != NULL)
 			return ret;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 11/68] eal: enable reserving physically contiguous memzones
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (12 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 10/68] eal: add backend support for contiguous allocation Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:41     ` Stephen Hemminger
  2018-04-03 23:21   ` [PATCH v3 12/68] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                     ` (56 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a new set of _contig APIs to rte_memzone. For now,
hugepage memory is always contiguous, but we need to prepare the
drivers for the switch.
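
A minimal usage sketch of the new API (the zone name, size and flags
below are made up for illustration):

    const struct rte_memzone *mz;

    /* reserve 64 KB of IOVA-contiguous memory for a hypothetical DMA ring */
    mz = rte_memzone_reserve_contig("example_dma_ring", 64 * 1024,
            rte_socket_id(),
            RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
    if (mz == NULL)
        return -rte_errno;  /* e.g. ENOMEM, EEXIST */

    /* mz->addr is the zone's virtual address; the whole zone is
     * guaranteed to be IOVA-contiguous */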

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved this patch earlier

 lib/librte_eal/common/eal_common_memzone.c  |  44 ++++++++
 lib/librte_eal/common/include/rte_memzone.h | 158 ++++++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map          |   3 +
 3 files changed, 205 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 16a2e7a..36d2553 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -171,6 +171,12 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		socket_id = SOCKET_ID_ANY;
 
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -272,6 +278,19 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 
 /*
  * Return a pointer to a correctly filled memzone descriptor (with a
+ * specified alignment and boundary). If the allocation cannot be done,
+ * return NULL.
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_bounded_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, bound, true);
+}
+
+/*
+ * Return a pointer to a correctly filled memzone descriptor (with a
  * specified alignment). If the allocation cannot be done, return NULL.
  */
 const struct rte_memzone *
@@ -283,6 +302,18 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 }
 
 /*
+ * Return a pointer to a correctly filled memzone descriptor (with a
+ * specified alignment). If the allocation cannot be done, return NULL.
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_aligned_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
+					       align, 0, true);
+}
+
+/*
  * Return a pointer to a correctly filled memzone descriptor. If the
  * allocation cannot be done, return NULL.
  */
@@ -295,6 +326,19 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 					       false);
 }
 
+/*
+ * Return a pointer to a correctly filled memzone descriptor. If the
+ * allocation cannot be done, return NULL.
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_contig(const char *name, size_t len, int socket_id,
+		unsigned int flags)
+{
+	return rte_memzone_reserve_thread_safe(name, len, socket_id,
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       true);
+}
+
 int
 rte_memzone_free(const struct rte_memzone *mz)
 {
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..ef3a4dd 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -23,6 +23,7 @@
  */
 
 #include <stdio.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 #include <rte_common.h>
 
@@ -228,6 +229,163 @@ const struct rte_memzone *rte_memzone_reserve_bounded(const char *name,
 			unsigned flags, unsigned align, unsigned bound);
 
 /**
+ * Reserve an IOVA-contiguous portion of physical memory.
+ *
+ * This function reserves some IOVA-contiguous memory and returns a pointer to a
+ * correctly filled memzone descriptor. If the allocation cannot be
+ * done, return NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with alignment on a
+ * specified boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with alignment on a
+ * specified boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * is not a power of 2, returns NULL.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_aligned_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align);
+
+/**
+ * Reserve an IOVA-contiguous portion of physical memory with specified
+ * alignment and boundary.
+ *
+ * This function reserves some IOVA-contiguous memory with specified alignment
+ * and boundary, and returns a pointer to a correctly filled memzone
+ * descriptor. If the allocation cannot be done or if the alignment
+ * or boundary are not a power of 2, returns NULL.
+ * The memory buffer is reserved in such a way that it won't cross the
+ * specified boundary. This implies that the requested length must be
+ * less than or equal to the boundary.
+ *
+ * @param name
+ *   The name of the memzone. If it already exists, the function will
+ *   fail and return NULL.
+ * @param len
+ *   The size of the memory to be reserved.
+ * @param socket_id
+ *   The socket identifier in the case of
+ *   NUMA. The value can be SOCKET_ID_ANY if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   The flags parameter is used to request memzones to be
+ *   taken from specifically sized hugepages.
+ *   - RTE_MEMZONE_2MB - Reserved from 2MB pages
+ *   - RTE_MEMZONE_1GB - Reserved from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserved from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserved from 16GB pages
+ *   - RTE_MEMZONE_256KB - Reserved from 256KB pages
+ *   - RTE_MEMZONE_256MB - Reserved from 256MB pages
+ *   - RTE_MEMZONE_512MB - Reserved from 512MB pages
+ *   - RTE_MEMZONE_4GB - Reserved from 4GB pages
+ *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
+ *                                  the requested page size is unavailable.
+ *                                  If this flag is not set, the function
+ *                                  will return error on an unavailable size
+ *                                  request.
+ * @param align
+ *   Alignment for resulting memzone. Must be a power of 2.
+ * @param bound
+ *   Boundary for resulting memzone. Must be a power of 2 or zero.
+ *   Zero value implies no boundary condition.
+ * @return
+ *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
+ *   on error.
+ *   On error case, rte_errno will be set appropriately:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *    - EINVAL - invalid parameters
+ */
+__rte_experimental const struct rte_memzone *
+rte_memzone_reserve_bounded_contig(const char *name,
+		size_t len, int socket_id, unsigned int flags,
+		unsigned int align, unsigned int bound);
+
+/**
  * Free a memzone.
  *
  * @param mz
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d9fc458..25e00de 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,9 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memzone_reserve_contig;
+	rte_memzone_reserve_aligned_contig;
+	rte_memzone_reserve_bounded_contig;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
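
For reference, the reserve functions declared in the header above can be
exercised roughly as follows. This is a minimal sketch, not code from the
patchset: the zone name, the 1 MB size and the error handling are
illustrative, and the calling code must be built with
-DALLOW_EXPERIMENTAL_API.

#include <stdio.h>
#include <rte_memzone.h>
#include <rte_errno.h>

/* Reserve 1 MB of IOVA-contiguous memory, e.g. for a DMA descriptor ring. */
static const struct rte_memzone *
example_reserve_dma_area(int socket_id)
{
	const struct rte_memzone *mz;

	/* cache-line alignment, any page size, guaranteed IOVA-contiguous */
	mz = rte_memzone_reserve_contig("example_dma_area", 1 << 20,
					socket_id, 0);
	if (mz == NULL) {
		printf("reserve failed: %s\n", rte_strerror(rte_errno));
		return NULL;
	}
	/* mz->iova is the start of a single contiguous IOVA region */
	return mz;
}

The _aligned_ and _bounded_ variants follow the same pattern, adding the
align and bound arguments documented above.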

* [PATCH v3 12/68] ethdev: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (13 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 11/68] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 13/68] crypto/qat: " Anatoly Burakov
                     ` (55 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: moved this patch earlier in the patchset

 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2c74f7e..10cfa20 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3403,7 +3403,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
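
All of the drivers listed above allocate their rings through the
rte_eth_dma_zone_reserve() helper, which is why changing that one call site
covers them all. A typical PMD call looks roughly like the sketch below; the
variable names (dev, queue_id, ring_size, socket_id) are placeholders and
not taken from any particular driver.

	const struct rte_memzone *mz;

	mz = rte_eth_dma_zone_reserve(dev, "rx_ring", queue_id, ring_size,
				      RTE_CACHE_LINE_SIZE, socket_id);
	if (mz == NULL)
		return -ENOMEM;
	/* after this patch, mz is reserved IOVA-contiguously */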

* [PATCH v3 13/68] crypto/qat: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (14 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 12/68] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 14/68] net/avf: " Anatoly Burakov
                     ` (54 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Also, remove the weird page alignment code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
---

Notes:
    v3:
    - Move the patch earlier in the patchset
    - Fix build system files to allow experimental API's
    - Removed nonsensical memzone flags code

 drivers/crypto/qat/Makefile    |  3 +++
 drivers/crypto/qat/meson.build |  3 +++
 drivers/crypto/qat/qat_qp.c    | 23 ++---------------------
 3 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/qat/Makefile b/drivers/crypto/qat/Makefile
index 260912d..a93fad8 100644
--- a/drivers/crypto/qat/Makefile
+++ b/drivers/crypto/qat/Makefile
@@ -13,6 +13,9 @@ LIBABIVER := 1
 CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -O3
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # external library include paths
 CFLAGS += -I$(SRCDIR)/qat_adf
 LDLIBS += -lcrypto
diff --git a/drivers/crypto/qat/meson.build b/drivers/crypto/qat/meson.build
index 7b90463..ff0c239 100644
--- a/drivers/crypto/qat/meson.build
+++ b/drivers/crypto/qat/meson.build
@@ -12,3 +12,6 @@ includes += include_directories('qat_adf')
 deps += ['bus_pci']
 ext_deps += dep
 pkgconfig_extra_libs += '-lcrypto'
+
+# contig memzone allocation is not yet part of stable API
+allow_experimental_apis = true
diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..2ba28cf 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -54,8 +54,6 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 			int socket_id)
 {
 	const struct rte_memzone *mz;
-	unsigned memzone_flags = 0;
-	const struct rte_memseg *ms;
 
 	PMD_INIT_FUNC_TRACE();
 	mz = rte_memzone_lookup(queue_name);
@@ -78,25 +76,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 
 	PMD_DRV_LOG(DEBUG, "Allocate memzone for %s, size %u on socket %u",
 					queue_name, queue_size, socket_id);
-	ms = rte_eal_get_physmem_layout();
-	switch (ms[0].hugepage_sz) {
-	case(RTE_PGSIZE_2M):
-		memzone_flags = RTE_MEMZONE_2MB;
-	break;
-	case(RTE_PGSIZE_1G):
-		memzone_flags = RTE_MEMZONE_1GB;
-	break;
-	case(RTE_PGSIZE_16M):
-		memzone_flags = RTE_MEMZONE_16MB;
-	break;
-	case(RTE_PGSIZE_16G):
-		memzone_flags = RTE_MEMZONE_16GB;
-	break;
-	default:
-		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
-	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned_contig(queue_name, queue_size,
+		socket_id, 0, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
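
The build system hunks above are needed because the new *_contig functions
are marked __rte_experimental (via rte_compat.h): unless a component defines
ALLOW_EXPERIMENTAL_API (or sets allow_experimental_apis in its meson.build),
calling them fails the build, as the symbols are flagged as not yet part of
the stable ABI. An out-of-tree component wanting to use these functions would
presumably need the same opt-in, along these lines:

	# legacy make build system
	CFLAGS += -DALLOW_EXPERIMENTAL_API

	# meson build system
	allow_experimental_apis = true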

* [PATCH v3 14/68] net/avf: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (15 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 13/68] crypto/qat: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 15/68] net/bnx2x: " Anatoly Burakov
                     ` (53 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/avf/Makefile     | 3 +++
 drivers/net/avf/avf_ethdev.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/avf/Makefile b/drivers/net/avf/Makefile
index 3f815bb..678d49c 100644
--- a/drivers/net/avf/Makefile
+++ b/drivers/net/avf/Makefile
@@ -20,6 +20,9 @@ EXPORT_MAP := rte_pmd_avf_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # Add extra flags for base driver files (also known as shared code)
 # to disable warnings
diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4442c3c..739ab92 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,7 +1365,7 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 15/68] net/bnx2x: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (16 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 14/68] net/avf: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 16/68] net/cxgbe: " Anatoly Burakov
                     ` (52 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/bnx2x/Makefile     | 3 +++
 drivers/net/bnx2x/bnx2x.c      | 2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnx2x/Makefile b/drivers/net/bnx2x/Makefile
index 90ff8b1..d9b3ffd 100644
--- a/drivers/net/bnx2x/Makefile
+++ b/drivers/net/bnx2x/Makefile
@@ -17,6 +17,9 @@ EXPORT_MAP := rte_pmd_bnx2x_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 ifeq ($(CONFIG_RTE_TOOLCHAIN_ICC),y)
 CFLAGS += -wd188 #188: enumerated type mixed with another type
 endif
diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..81f5dae 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,7 +177,7 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned_contig(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
 					0, align);
 	if (z == NULL) {
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..325b94d 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned_contig(z_name, ring_size, socket_id,
+			0, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 16/68] net/cxgbe: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (17 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 15/68] net/bnx2x: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 17/68] net/ena: " Anatoly Burakov
                     ` (51 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/cxgbe/Makefile | 3 +++
 drivers/net/cxgbe/sge.c    | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 8fba1a5..0042f5e 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -18,6 +18,9 @@ EXPORT_MAP := rte_pmd_cxgbe_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 ifeq ($(CONFIG_RTE_TOOLCHAIN_ICC),y)
 #
 # CFLAGS for icc
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 83e26d0..0cd3e56 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1344,7 +1344,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned_contig(z_name, len, socket_id, 0,
+			4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 17/68] net/ena: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (18 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 16/68] net/cxgbe: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 18/68] net/enic: " Anatoly Burakov
                     ` (50 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the Makefile

 drivers/net/ena/Makefile             | 3 +++
 drivers/net/ena/base/ena_plat_dpdk.h | 7 ++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
index f9bfe05..01c3823 100644
--- a/drivers/net/ena/Makefile
+++ b/drivers/net/ena/Makefile
@@ -43,6 +43,9 @@ INCLUDES :=-I$(SRCDIR) -I$(SRCDIR)/base/ena_defs -I$(SRCDIR)/base
 EXPORT_MAP := rte_pmd_ena_version.map
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 VPATH += $(SRCDIR)/base
 #
 # all source are stored in SRCS-y
diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..c1ebf00 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve_contig(z_name,			\
+				size, SOCKET_ID_ANY, 0);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +220,7 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve_contig(z_name, size, node, 0);	\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 18/68] net/enic: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (19 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 17/68] net/ena: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 19/68] net/i40e: " Anatoly Burakov
                     ` (49 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: John Daley <johndale@cisco.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in Makefile

 drivers/net/enic/Makefile    | 3 +++
 drivers/net/enic/enic_main.c | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/Makefile b/drivers/net/enic/Makefile
index 7c6c29c..f117c96 100644
--- a/drivers/net/enic/Makefile
+++ b/drivers/net/enic/Makefile
@@ -13,6 +13,9 @@ EXPORT_MAP := rte_pmd_enic_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 CFLAGS += -I$(SRCDIR)/base/
 CFLAGS += -I$(SRCDIR)
 CFLAGS += -O3
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 69ad425..d19033e 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -343,7 +343,7 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
+	rz = rte_memzone_reserve_aligned_contig((const char *)name,
 					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
@@ -887,7 +887,7 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		 "vnic_cqmsg-%s-%d-%d", enic->bdf_name, queue_idx,
 		instance++);
 
-	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
+	wq->cqmsg_rz = rte_memzone_reserve_aligned_contig((const char *)name,
 						   sizeof(uint32_t),
 						   SOCKET_ID_ANY, 0,
 						   ENIC_ALIGN);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 19/68] net/i40e: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (20 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 18/68] net/enic: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 20/68] net/qede: " Anatoly Burakov
                     ` (48 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in the build system

 drivers/net/i40e/Makefile      | 3 +++
 drivers/net/i40e/i40e_ethdev.c | 2 +-
 drivers/net/i40e/i40e_rxtx.c   | 2 +-
 drivers/net/i40e/meson.build   | 3 +++
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/Makefile b/drivers/net/i40e/Makefile
index 5663f5b..bbc33b8 100644
--- a/drivers/net/i40e/Makefile
+++ b/drivers/net/i40e/Makefile
@@ -19,6 +19,9 @@ EXPORT_MAP := rte_pmd_i40e_version.map
 
 LIBABIVER := 2
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # Add extra flags for base driver files (also known as shared code)
 # to disable warnings
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index d0bf4e3..6d72726 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4053,7 +4053,7 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
+	mz = rte_memzone_reserve_bounded_contig(z_name, size, SOCKET_ID_ANY, 0,
 					 alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..6b2b40e 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,7 +2189,7 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
+	mz = rte_memzone_reserve_aligned_contig(name, len,
 					 socket_id, 0, I40E_RING_BASE_ALIGN);
 	return mz;
 }
diff --git a/drivers/net/i40e/meson.build b/drivers/net/i40e/meson.build
index 197e611..e418791 100644
--- a/drivers/net/i40e/meson.build
+++ b/drivers/net/i40e/meson.build
@@ -46,3 +46,6 @@ endif
 includes += include_directories('base')
 
 install_headers('rte_pmd_i40e.h')
+
+# contig memzone allocation is not yet part of stable API
+allow_experimental_apis = true
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 20/68] net/qede: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (21 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 19/68] net/i40e: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 21/68] net/virtio: " Anatoly Burakov
                     ` (47 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harish Patil <harish.patil@cavium.com>
---

Notes:
    v3:
    - Moved the patch earlier in the patchset
    - Allowed experimental API in Makefile

 drivers/net/qede/Makefile        | 3 +++
 drivers/net/qede/base/bcm_osal.c | 5 +++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/qede/Makefile b/drivers/net/qede/Makefile
index ccbffa4..83a4b8c 100644
--- a/drivers/net/qede/Makefile
+++ b/drivers/net/qede/Makefile
@@ -21,6 +21,9 @@ EXPORT_MAP := rte_pmd_qede_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # OS
 #
diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index 91017b8..3a0a9aa 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,7 +135,7 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size,
 					 socket_id, 0, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned_contig(mz_name, size, socket_id, 0,
+			align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 21/68] net/virtio: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (22 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 20/68] net/qede: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 22/68] net/vmxnet3: " Anatoly Burakov
                     ` (46 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset

 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 2ef213d..bdd5e87 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -390,7 +390,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d",
 		     size, vq->vq_ring_size);
 
-	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
+	mz = rte_memzone_reserve_aligned_contig(vq_name, vq->vq_ring_size,
 					 SOCKET_ID_ANY,
 					 0, VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
@@ -416,9 +416,9 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 	if (sz_hdr_mz) {
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
-		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+		hdr_mz = rte_memzone_reserve_aligned_contig(vq_hdr_name,
+				sz_hdr_mz, SOCKET_ID_ANY, 0,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 22/68] net/vmxnet3: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (23 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 21/68] net/virtio: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 23/68] net/bnxt: " Anatoly Burakov
                     ` (45 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Shrikrishna Khare, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in Makefile

 drivers/net/vmxnet3/Makefile         | 3 +++
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 7 ++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vmxnet3/Makefile b/drivers/net/vmxnet3/Makefile
index 6bfbf01..7f76086 100644
--- a/drivers/net/vmxnet3/Makefile
+++ b/drivers/net/vmxnet3/Makefile
@@ -45,6 +45,9 @@ EXPORT_MAP := rte_pmd_vmxnet3_version.map
 
 LIBABIVER := 1
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # all source are stored in SRCS-y
 #
diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4260087..3f323a0 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -149,14 +149,15 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 	if (!reuse) {
 		if (mz)
 			rte_memzone_free(mz);
-		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+		return rte_memzone_reserve_aligned_contig(z_name, size,
+				socket_id, 0, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned_contig(z_name, size, socket_id, 0,
+			align);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 23/68] net/bnxt: use contiguous allocation for DMA memory
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (24 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 22/68] net/vmxnet3: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 24/68] mempool: add support for the new allocation methods Anatoly Burakov
                     ` (44 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Ajit Khaparde, Somnath Kotur, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Added this patch
    
    All of these memzone reserve calls subsequently check physical
    addresses, so they appear to be reserving DMA memory.
    Corrections welcome.

 drivers/net/bnxt/Makefile      | 3 +++
 drivers/net/bnxt/bnxt_ethdev.c | 6 ++++--
 drivers/net/bnxt/bnxt_ring.c   | 3 ++-
 drivers/net/bnxt/bnxt_vnic.c   | 2 +-
 4 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 2aa0441..b443d29 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -42,6 +42,9 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 
 LIBABIVER := 2
 
+# contiguous memzone reserve API is not yet stable
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 0b21653..5a7143c 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3146,7 +3146,8 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 		total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
 				sizeof(struct rx_port_stats) + 512);
 		if (!mz) {
-			mz = rte_memzone_reserve(mz_name, total_alloc_len,
+			mz = rte_memzone_reserve_contig(mz_name,
+						 total_alloc_len,
 						 SOCKET_ID_ANY,
 						 RTE_MEMZONE_2MB |
 						 RTE_MEMZONE_SIZE_HINT_ONLY);
@@ -3181,7 +3182,8 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 		total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
 				sizeof(struct tx_port_stats) + 512);
 		if (!mz) {
-			mz = rte_memzone_reserve(mz_name, total_alloc_len,
+			mz = rte_memzone_reserve_contig(mz_name,
+						 total_alloc_len,
 						 SOCKET_ID_ANY,
 						 RTE_MEMZONE_2MB |
 						 RTE_MEMZONE_SIZE_HINT_ONLY);
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 8fb8972..e8127de 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -165,7 +165,8 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 	mz_name[RTE_MEMZONE_NAMESIZE - 1] = 0;
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
-		mz = rte_memzone_reserve_aligned(mz_name, total_alloc_len,
+		mz = rte_memzone_reserve_aligned_contig(mz_name,
+					 total_alloc_len,
 					 SOCKET_ID_ANY,
 					 RTE_MEMZONE_2MB |
 					 RTE_MEMZONE_SIZE_HINT_ONLY,
diff --git a/drivers/net/bnxt/bnxt_vnic.c b/drivers/net/bnxt/bnxt_vnic.c
index d4aeb4c..611ce66 100644
--- a/drivers/net/bnxt/bnxt_vnic.c
+++ b/drivers/net/bnxt/bnxt_vnic.c
@@ -184,7 +184,7 @@ int bnxt_alloc_vnic_attributes(struct bnxt *bp)
 	mz_name[RTE_MEMZONE_NAMESIZE - 1] = 0;
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
-		mz = rte_memzone_reserve(mz_name,
+		mz = rte_memzone_reserve_contig(mz_name,
 					 entry_length * max_vnics,
 					 SOCKET_ID_ANY,
 					 RTE_MEMZONE_2MB |
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 24/68] mempool: add support for the new allocation methods
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (25 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 23/68] net/bnxt: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 25/68] eal: add function to walk all memsegs Anatoly Burakov
                     ` (43 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

If a user has specified that the zone should have contiguous memory,
use the new _contig allocation APIs instead of the normal ones.
Otherwise, account for the fact that, unless we're in IOVA-as-VA
mode, we cannot guarantee that the pages will be physically
contiguous, so we calculate the memzone size and alignment as if
we were getting the smallest page size available.

The existing mempool size calculation function also doesn't give us
the expected results, because it returns memzone sizes aligned to
page size (e.g. a 1MB mempool will reserve an entire 1GB page if
all the user has are 1GB pages), so add a new one that gives
results more in line with what we would expect.
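
As a concrete illustration (the numbers are made up for the example): a pool
of 512 objects of 2KB each needs roughly 512 * 2KB = 1MB of memory. With
pg_shift set for 1GB pages, the existing calculation rounds the memzone
request up to a full 1GB; with pg_shift set to 0, the request stays at about
1MB, which is what the "try a contiguous reservation first, then fall back
to page-by-page sizing" logic below relies on.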

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Fixed mempool size calculation
    - Fixed handling of contiguous memzones
    - Moved earlier in the patchset

 lib/librte_mempool/Makefile      |   3 +
 lib/librte_mempool/meson.build   |   3 +
 lib/librte_mempool/rte_mempool.c | 137 ++++++++++++++++++++++++++++++++-------
 3 files changed, 121 insertions(+), 22 deletions(-)

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 24e735a..cfc69b4 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -13,6 +13,9 @@ EXPORT_MAP := rte_mempool_version.map
 
 LIBABIVER := 3
 
+# uses new contiguous memzone allocation that isn't yet in stable ABI
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index 712720f..5916a0f 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -5,3 +5,6 @@ version = 3
 sources = files('rte_mempool.c', 'rte_mempool_ops.c')
 headers = files('rte_mempool.h')
 deps += ['ring']
+
+# contig memzone allocation is not yet part of stable API
+allow_experimental_apis = true
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..e147180 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2016 6WIND S.A.
  */
 
+#include <stdbool.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdint.h>
@@ -98,6 +99,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		if (ms->hugepage_sz < min_pagesz)
+			min_pagesz = ms->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -204,7 +226,6 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 	return sz->total_size;
 }
 
-
 /*
  * Calculate maximum amount of memory required to store given number of objects.
  */
@@ -367,16 +388,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	/* update mempool capabilities */
 	mp->flags |= mp_capa_flags;
 
-	/* Detect pool area has sufficient space for elements */
-	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
-		if (len < total_elt_sz * mp->size) {
-			RTE_LOG(ERR, MEMPOOL,
-				"pool area %" PRIx64 " not enough\n",
-				(uint64_t)len);
-			return -ENOSPC;
-		}
-	}
-
 	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
 	if (memhdr == NULL)
 		return -ENOMEM;
@@ -549,6 +560,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig, try_contig, no_pageshift;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,9 +575,62 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * the following section calculates page shift and page size values.
+	 *
+	 * these values impact the result of rte_mempool_xmem_size(), which
+	 * returns the amount of memory that should be allocated to store the
+	 * desired number of objects. when not zero, it allocates more memory
+	 * for the padding between objects, to ensure that an object does not
+	 * cross a page boundary. in other words, page size/shift are to be set
+	 * to zero if mempool elements won't care about page boundaries.
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter either.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 *
+	 * However, since size calculation will produce page-aligned sizes, it
+	 * makes sense to first try and see if we can reserve the entire memzone
+	 * in one contiguous chunk as well (otherwise we might end up wasting a
+	 * 1G page on a 10MB memzone). If we fail to get enough contiguous
+	 * memory, then we'll go and reserve space page-by-page.
+	 */
+	no_pageshift = no_contig || force_contig ||
+			rte_eal_iova_mode() == RTE_IOVA_VA;
+	try_contig = !no_contig && !no_pageshift && rte_eal_has_hugepages();
+
+	if (no_pageshift) {
 		pg_sz = 0;
+		pg_shift = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else if (try_contig) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
 		align = RTE_CACHE_LINE_SIZE;
 	} else {
 		pg_sz = getpagesize();
@@ -575,8 +640,12 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
-						mp->flags);
+		if (try_contig || no_pageshift)
+			size = rte_mempool_xmem_size(n, total_elt_sz, 0,
+				mp->flags);
+		else
+			size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
+				mp->flags);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -585,23 +654,47 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
-			mz = rte_memzone_reserve_aligned(mz_name, 0,
+		mz = NULL;
+		if (force_contig || try_contig) {
+			/* if contiguous memory for entire mempool memory was
+			 * requested, don't try reserving again if we fail...
+			 */
+			mz = rte_memzone_reserve_aligned_contig(mz_name, size,
+				mp->socket_id, mz_flags, align);
+
+			/* ...unless we are doing best effort allocation, in
+			 * which case recalculate size and try again */
+			if (try_contig && mz == NULL) {
+				try_contig = false;
+				align = pg_sz;
+				size = rte_mempool_xmem_size(n, total_elt_sz,
+					pg_shift, mp->flags);
+			}
+		}
+		/* only try this if we're not trying to reserve contiguous
+		 * memory.
+		 */
+		if (!force_contig && mz == NULL) {
+			mz = rte_memzone_reserve_aligned(mz_name, size,
 				mp->socket_id, mz_flags, align);
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
+			if (mz == NULL)
+				mz = rte_memzone_reserve_aligned(mz_name, 0,
+					mp->socket_id, mz_flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (no_pageshift || try_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 25/68] eal: add function to walk all memsegs
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (26 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 24/68] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
                     ` (42 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For code that might need to iterate over the list of allocated
segments, using this API will make it more resilient to
internal API changes and will prevent copying the same
iteration code over and over again.

Additionally, locking will be implemented down the line, so
users of this API will not need to care about locking
either.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 21 +++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 25 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 47 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 5b8ced4..947db1f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -218,6 +218,27 @@ rte_mem_lock_page(const void *virt)
 	return mlock((void *)aligned, page_size);
 }
 
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		ret = func(ms, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..93eadaa 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -20,6 +20,7 @@ extern "C" {
 #endif
 
 #include <rte_common.h>
+#include <rte_compat.h>
 #include <rte_config.h>
 
 __extension__
@@ -130,6 +131,30 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Memseg walk function prototype.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+
+/**
+ * Walk list of all memsegs.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 25e00de..7e9900d 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_walk;
 	rte_memzone_reserve_contig;
 	rte_memzone_reserve_aligned_contig;
 	rte_memzone_reserve_bounded_contig;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
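
A minimal user of the walk API added above could look like the sketch below.
The callback and helper names are illustrative only; the return-value
convention (0 to continue, 1 to stop, -1 on error) is the one documented in
the new rte_memory.h comments.

#include <stdint.h>
#include <rte_memory.h>

/* Callback: accumulate the total length of all allocated memsegs. */
static int
count_mem(const struct rte_memseg *ms, void *arg)
{
	uint64_t *total = arg;

	*total += ms->len;
	return 0; /* keep walking */
}

static uint64_t
total_memseg_len(void)
{
	uint64_t total = 0;

	if (rte_memseg_walk(count_mem, &total) < 0)
		return 0; /* a callback reported an error */
	return total;
}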

* [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (27 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 25/68] eal: add function to walk all memsegs Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-05 14:06     ` Shreyansh Jain
  2018-04-05 14:14     ` [PATCH] bus/fslmc: support for hotplugging of memory Shreyansh Jain
  2018-04-03 23:21   ` [PATCH v3 27/68] bus/pci: use memseg walk instead of iteration Anatoly Burakov
                     ` (41 subsequent siblings)
  70 siblings, 2 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, Nipun Gupta, Santosh Shukla,
	Jerin Jacob, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, olivier.matz, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 78 ++++++++++++++++++++++--------------------
 drivers/event/dpaa2/Makefile   |  3 ++
 drivers/mempool/dpaa2/Makefile |  3 ++
 drivers/net/dpaa2/Makefile     |  3 ++
 drivers/net/dpaa2/meson.build  |  3 ++
 drivers/net/octeontx/Makefile  |  3 ++
 6 files changed, 56 insertions(+), 37 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 1310190..ccdbeff 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -193,17 +193,51 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 {
-	int ret;
+	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
 		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
 	};
+	int ret;
+
+	dma_map.size = ms->len;
+	dma_map.vaddr = ms->addr_64;
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	dma_map.iova = ms->iova;
+#else
+	dma_map.iova = dma_map.vaddr;
+#endif
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		FSLMC_VFIO_LOG(ERR, "Container is not connected ");
+		return -1;
+	}
+
+	FSLMC_VFIO_LOG(DEBUG, "-->Initial SHM Virtual ADDR %llX",
+		     dma_map.vaddr);
+	FSLMC_VFIO_LOG(DEBUG, "-----> DMA size 0x%llX", dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
+			&dma_map);
+	if (ret) {
+		FSLMC_VFIO_LOG(ERR, "VFIO_IOMMU_MAP_DMA API(errno = %d)",
+				errno);
+		return -1;
+	}
+	(*n_segs)++;
+	return 0;
+}
 
-	int i;
+int rte_fslmc_vfio_dmamap(void)
+{
 	const struct rte_memseg *memseg;
+	int i = 0;
 
 	if (is_dma_done)
 		return 0;
@@ -214,51 +248,21 @@ int rte_fslmc_vfio_dmamap(void)
 		return -ENODEV;
 	}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL && memseg[i].len == 0) {
-			FSLMC_VFIO_LOG(DEBUG, "Total %d segments found.", i);
-			break;
-		}
-
-		dma_map.size = memseg[i].len;
-		dma_map.vaddr = memseg[i].addr_64;
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-		dma_map.iova = memseg[i].iova;
-#else
-		dma_map.iova = dma_map.vaddr;
-#endif
-
-		/* SET DMA MAP for IOMMU */
-		group = &vfio_group;
-
-		if (!group->container) {
-			FSLMC_VFIO_LOG(ERR, "Container is not connected ");
-			return -1;
-		}
-
-		FSLMC_VFIO_LOG(DEBUG, "-->Initial SHM Virtual ADDR %llX",
-			     dma_map.vaddr);
-		FSLMC_VFIO_LOG(DEBUG, "-----> DMA size 0x%llX", dma_map.size);
-		ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			    &dma_map);
-		if (ret) {
-			FSLMC_VFIO_LOG(ERR, "VFIO_IOMMU_MAP_DMA API(errno = %d)",
-				       errno);
-			return ret;
-		}
-	}
+	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+		return -1;
 
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
 		FSLMC_VFIO_LOG(ERR, "No Segments found for VFIO Mapping");
 		return -1;
 	}
+	FSLMC_VFIO_LOG(DEBUG, "Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
 	 * the interrupt region to SMMU. This should be removed once the
 	 * support is added in the Kernel.
 	 */
-	vfio_map_irq_region(group);
+	vfio_map_irq_region(&vfio_group);
 
 	is_dma_done = 1;
 
diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile
index b26862c..a5b68b4 100644
--- a/drivers/event/dpaa2/Makefile
+++ b/drivers/event/dpaa2/Makefile
@@ -28,6 +28,9 @@ EXPORT_MAP := rte_pmd_dpaa2_event_version.map
 
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # all source are stored in SRCS-y
 #
diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile
index efaac96..c1cc2a3 100644
--- a/drivers/mempool/dpaa2/Makefile
+++ b/drivers/mempool/dpaa2/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_mempool_dpaa2_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c
diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile
index 068e9d3..cc5627c 100644
--- a/drivers/net/dpaa2/Makefile
+++ b/drivers/net/dpaa2/Makefile
@@ -33,6 +33,9 @@ EXPORT_MAP := rte_pmd_dpaa2_version.map
 # library version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += base/dpaa2_hw_dpni.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_ethdev.c
diff --git a/drivers/net/dpaa2/meson.build b/drivers/net/dpaa2/meson.build
index ad1724d..8e96b5a 100644
--- a/drivers/net/dpaa2/meson.build
+++ b/drivers/net/dpaa2/meson.build
@@ -13,3 +13,6 @@ sources = files('base/dpaa2_hw_dpni.c',
 		'mc/dpni.c')
 
 includes += include_directories('base', 'mc')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/octeontx/Makefile b/drivers/net/octeontx/Makefile
index 3e4a106..5f488b9 100644
--- a/drivers/net/octeontx/Makefile
+++ b/drivers/net/octeontx/Makefile
@@ -16,6 +16,9 @@ EXPORT_MAP := rte_pmd_octeontx_version.map
 
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c)))
 $(foreach obj, $(OBJS_BASE_DRIVER), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 27/68] bus/pci: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (28 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 28/68] net/mlx5: " Anatoly Burakov
                     ` (40 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/pci/Makefile    |  3 +++
 drivers/bus/pci/linux/pci.c | 26 ++++++++++++++------------
 drivers/bus/pci/meson.build |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/pci/Makefile b/drivers/bus/pci/Makefile
index f3df1c4..804a198 100644
--- a/drivers/bus/pci/Makefile
+++ b/drivers/bus/pci/Makefile
@@ -49,6 +49,9 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/pci/$(SYSTEM)
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/$(SYSTEM)app/eal
 
+# memseg walk is not part of stable API yet
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_pci
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..6dda054 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -116,22 +116,24 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 	}
 }
 
-void *
-pci_find_max_end_va(void)
+static int
+find_max_end_va(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	void **max_va = arg;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	if (*max_va < end_va)
+		*max_va = end_va;
+	return 0;
+}
 
-		if (seg->addr > last->addr)
-			last = seg;
+void *
+pci_find_max_end_va(void)
+{
+	void *va = NULL;
 
-	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	rte_memseg_walk(find_max_end_va, &va);
+	return va;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/bus/pci/meson.build b/drivers/bus/pci/meson.build
index 12756a4..72939e5 100644
--- a/drivers/bus/pci/meson.build
+++ b/drivers/bus/pci/meson.build
@@ -14,3 +14,6 @@ else
 	sources += files('bsd/pci.c')
 	includes += include_directories('bsd')
 endif
+
+# memseg walk is not part of stable API yet
+allow_experimental_apis = true
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 28/68] net/mlx5: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (29 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 27/68] bus/pci: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 29/68] eal: " Anatoly Burakov
                     ` (39 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx5/Makefile |  3 +++
 drivers/net/mlx5/mlx5.c   | 24 +++++++++++++++---------
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda411..25c8e10 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -92,6 +92,9 @@ CFLAGS += -Wno-error=cast-qual
 EXPORT_MAP := rte_pmd_mlx5_version.map
 LIBABIVER := 1
 
+# memseg walk is not part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # DEBUG which is usually provided on the command-line may enable
 # CONFIG_RTE_LIBRTE_MLX5_DEBUG.
 ifeq ($(DEBUG),1)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 7d58d66..1724b65 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -477,6 +477,19 @@ static struct rte_pci_driver mlx5_driver;
  */
 static void *uar_base;
 
+static int
+find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+{
+	void **addr = arg;
+
+	if (*addr == NULL)
+		*addr = ms->addr;
+	else
+		*addr = RTE_MIN(*addr, ms->addr);
+
+	return 0;
+}
+
 /**
  * Reserve UAR address space for primary process.
  *
@@ -491,21 +504,14 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 	void *addr = (void *)0;
-	int i;
-	const struct rte_mem_config *mcfg;
 
 	if (uar_base) { /* UAR address space mapped. */
 		priv->uar_base = uar_base;
 		return 0;
 	}
 	/* find out lower bound of hugepage segments */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
-		if (addr)
-			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
-		else
-			addr = mcfg->memseg[i].addr;
-	}
+	rte_memseg_walk(find_lower_va_bound, &addr);
+
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
 	/* anonymous mmap, no real memory consumption. */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 29/68] eal: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (30 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 28/68] net/mlx5: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 30/68] mempool: " Anatoly Burakov
                     ` (38 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c           | 25 +++++++-----
 lib/librte_eal/common/eal_common_memory.c | 67 ++++++++++++++++---------------
 lib/librte_eal/common/malloc_heap.c       | 33 +++++++++------
 lib/librte_eal/linuxapp/eal/eal.c         | 22 +++++-----
 4 files changed, 81 insertions(+), 66 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..8e25d78 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -429,23 +429,26 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_socket(const struct rte_memseg *ms, void *arg)
+{
+	int *socket_id = arg;
+
+	if (ms->socket_id == *socket_id)
+		return 1;
+
+	return 0;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 947db1f..4f588c7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,54 +131,57 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+static int
+physmem_size(const struct rte_memseg *ms, void *arg)
+{
+	uint64_t *total_len = arg;
+
+	*total_len += ms->len;
+
+	return 0;
+}
 
 /* get the total size of memory */
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
 	uint64_t total_len = 0;
 
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
+	rte_memseg_walk(physmem_size, &total_len);
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
+	return total_len;
+}
 
-		total_len += mcfg->memseg[i].len;
-	}
+static int
+dump_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i = ms - mcfg->memseg;
+	FILE *f = arg;
 
-	return total_len;
+	if (i < 0 || i >= RTE_MAX_MEMSEG)
+		return -1;
+
+	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+			"virt:%p, socket_id:%"PRId32", "
+			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			"nrank:%"PRIx32"\n", i,
+			mcfg->memseg[i].iova,
+			mcfg->memseg[i].len,
+			mcfg->memseg[i].addr,
+			mcfg->memseg[i].socket_id,
+			mcfg->memseg[i].hugepage_sz,
+			mcfg->memseg[i].nchannel,
+			mcfg->memseg[i].nrank);
+
+	return 0;
 }
 
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
-	}
+	rte_memseg_walk(dump_memseg, f);
 }
 
 /* return the number of memory channels */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 564b61a..79914fc 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -67,17 +67,32 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static int
+malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_elem *start_elem;
+	struct rte_memseg *found_ms;
+	struct malloc_heap *heap;
+	size_t elem_size;
+	int ms_idx;
+
+	heap = &mcfg->malloc_heaps[ms->socket_id];
+
+	/* ms is const, so find it */
+	ms_idx = ms - mcfg->memseg;
+	found_ms = &mcfg->memseg[ms_idx];
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
+	start_elem = (struct malloc_elem *)found_ms->addr;
+	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+
+	malloc_elem_init(start_elem, heap, found_ms, elem_size);
 	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
+
+	return 0;
 }
 
 /*
@@ -244,17 +259,11 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
-	}
+	rte_memseg_walk(malloc_heap_add_memseg, NULL);
 
 	return 0;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..77f6cb7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -638,23 +638,23 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_mem(const struct rte_memseg *ms, void *arg)
+{
+	int *socket = arg;
+
+	return ms->socket_id == *socket;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 30/68] mempool: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (31 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 29/68] eal: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 31/68] test: " Anatoly Burakov
                     ` (37 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_mempool/rte_mempool.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index e147180..bb33c3a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -99,23 +99,23 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static int
+find_min_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	size_t *min = arg;
+
+	if (ms->hugepage_sz < *min)
+		*min = ms->hugepage_sz;
+
+	return 0;
+}
+
 static size_t
 get_min_page_size(void)
 {
-	const struct rte_mem_config *mcfg =
-			rte_eal_get_configuration()->mem_config;
-	int i;
 	size_t min_pagesz = SIZE_MAX;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-
-		if (ms->addr == NULL)
-			continue;
-
-		if (ms->hugepage_sz < min_pagesz)
-			min_pagesz = ms->hugepage_sz;
-	}
+	rte_memseg_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 31/68] test: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (32 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 30/68] mempool: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 32/68] vfio/type1: " Anatoly Burakov
                     ` (36 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/test_malloc.c  | 40 +++++++++++++++++++++++-------------
 test/test/test_memory.c  | 23 +++++++++++----------
 test/test/test_memzone.c | 53 ++++++++++++++++++++++++++++++++----------------
 3 files changed, 74 insertions(+), 42 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index d23192c..578ad04 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -705,16 +705,34 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
+static int
+check_socket_mem(const struct rte_memseg *ms, void *arg)
+{
+	int32_t *socket = arg;
+
+	return *socket == ms->socket_id;
+}
+
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
+	return rte_memseg_walk(check_socket_mem, &socket);
+}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
-			return 1;
+struct walk_param {
+	void *addr;
+	int32_t socket;
+};
+static int
+find_socket(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_param *param = arg;
+
+	if (param->addr >= ms->addr &&
+			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
+		param->socket = ms->socket_id;
+		return 1;
 	}
 	return 0;
 }
@@ -726,15 +744,9 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
+	struct walk_param param = {.addr = addr, .socket = 0};
+	if (rte_memseg_walk(find_socket, &param) > 0)
+		return param.socket;
 	return -1;
 }
 
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..c9b287c 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -23,12 +23,20 @@
  */
 
 static int
+check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+{
+	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
+	size_t i;
+
+	for (i = 0; i < ms->len; i++, mem++)
+		*mem;
+	return 0;
+}
+
+static int
 test_memory(void)
 {
 	uint64_t s;
-	unsigned i;
-	size_t j;
-	const struct rte_memseg *mem;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -45,14 +53,7 @@ test_memory(void)
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
-
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
-		}
-	}
+	rte_memseg_walk(check_mem, NULL);
 
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..cbf0cfa 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -104,28 +104,47 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void)
 	return 0;
 }
 
+struct walk_arg {
+	int hugepage_2MB_avail;
+	int hugepage_1GB_avail;
+	int hugepage_16MB_avail;
+	int hugepage_16GB_avail;
+};
+static int
+find_available_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_arg *wa = arg;
+
+	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+		wa->hugepage_2MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+		wa->hugepage_1GB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+		wa->hugepage_16MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+		wa->hugepage_16GB_avail = 1;
+
+	return 0;
+}
+
 static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
-	int hugepage_2MB_avail = 0;
-	int hugepage_1GB_avail = 0;
-	int hugepage_16MB_avail = 0;
-	int hugepage_16GB_avail = 0;
+	struct walk_arg wa;
+	int hugepage_2MB_avail, hugepage_1GB_avail;
+	int hugepage_16MB_avail, hugepage_16GB_avail;
 	const size_t size = 100;
-	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
-			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
-			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
-			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
-			hugepage_16GB_avail = 1;
-	}
+
+	memset(&wa, 0, sizeof(wa));
+
+	rte_memseg_walk(find_available_pagesz, &wa);
+
+	hugepage_2MB_avail = wa.hugepage_2MB_avail;
+	hugepage_1GB_avail = wa.hugepage_1GB_avail;
+	hugepage_16MB_avail = wa.hugepage_16MB_avail;
+	hugepage_16GB_avail = wa.hugepage_16GB_avail;
+
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
 	if (hugepage_2MB_avail)
 		printf("2MB Huge pages available\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 32/68] vfio/type1: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (33 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 31/68] test: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 33/68] vfio/spapr: " Anatoly Burakov
                     ` (35 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 45 ++++++++++++++++------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2421d51..2a34ae9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -665,39 +665,36 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-vfio_type1_dma_map(int vfio_container_fd)
+type1_map(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	int *vfio_container_fd = arg;
+	struct vfio_iommu_type1_dma_map dma_map;
+	int ret;
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
-		if (ms[i].addr == NULL)
-			break;
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
-		}
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
 	}
-
 	return 0;
 }
 
 static int
+vfio_type1_dma_map(int vfio_container_fd)
+{
+	return rte_memseg_walk(type1_map, &vfio_container_fd);
+}
+
+static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 33/68] vfio/spapr: use memseg walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (34 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 32/68] vfio/type1: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 34/68] eal: add contig walk function Anatoly Burakov
                     ` (34 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 108 +++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 45 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2a34ae9..fb41e82 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -694,16 +694,69 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+struct spapr_walk_param {
+	uint64_t window_size;
+	uint64_t hugepage_sz;
+};
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+spapr_window_size(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	struct spapr_walk_param *param = arg;
+	uint64_t max = ms->iova + ms->len;
+
+	if (max > param->window_size) {
+		param->hugepage_sz = ms->hugepage_sz;
+		param->window_size = max;
+	}
 
+	return 0;
+}
+
+static int
+spapr_map(const struct rte_memseg *ms, void *arg)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
 	};
+	int *vfio_container_fd = arg;
+	int ret;
+
+	reg.vaddr = (uintptr_t) ms->addr;
+	reg.size = ms->len;
+	ret = ioctl(*vfio_container_fd,
+		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+			 VFIO_DMA_MAP_FLAG_WRITE;
+
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct spapr_walk_param param;
+	int ret;
 	struct vfio_iommu_spapr_tce_info info = {
 		.argsz = sizeof(info),
 	};
@@ -714,6 +767,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		.argsz = sizeof(remove),
 	};
 
+	memset(&param, 0, sizeof(param));
+
 	/* query spapr iommu info */
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
@@ -732,17 +787,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
-	}
+	rte_memseg_walk(spapr_window_size, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
-	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -758,39 +807,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
-
-		if (ms[i].addr == NULL)
-			break;
-
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
-
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-	}
+	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+		return -1;
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 34/68] eal: add contig walk function
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (35 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 33/68] vfio/spapr: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 35/68] virtio: use memseg contig walk instead of iteration Anatoly Burakov
                     ` (33 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For now, this function is identical to the memseg walk function,
but it is meant to walk over only the first segment of each
VA-contiguous group of memsegs. The difference will become
apparent when memseg lists come into play.

Again, this is done so that there is less dependency on the
internals of the mem API, and less noise in later change sets.
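
As an illustration only (not part of this patch), a caller could sum up
the lengths of all VA-contiguous areas via the new API; the callback and
helper names below are hypothetical:

#include <rte_common.h>
#include <rte_memory.h>

/* hypothetical callback: accumulate the length of each
 * VA-contiguous area into the size_t pointed to by arg */
static int
sum_contig_len(const struct rte_memseg *ms __rte_unused,
		size_t len, void *arg)
{
	size_t *total = arg;

	*total += len;
	return 0; /* 0 continues the walk, 1 stops, -1 reports error */
}

static size_t
total_contig_va_space(void)
{
	size_t total = 0;

	rte_memseg_contig_walk(sum_contig_len, &total);
	return total;
}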

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 27 ++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 65 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4f588c7..4b528b0 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -242,6 +242,43 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	return 0;
 }
 
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, j, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+		size_t total_len;
+		void *end_addr;
+
+		if (ms->addr == NULL)
+			continue;
+
+		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+
+		/* check how many more segments are contiguous to this one */
+		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
+			const struct rte_memseg *next = &mcfg->memseg[j];
+
+			if (next->addr != end_addr)
+				break;
+
+			end_addr = RTE_PTR_ADD(next->addr, next->len);
+			i++;
+		}
+		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+
+		ret = func(ms, total_len, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 93eadaa..45d067f 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -140,6 +140,18 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
 
 /**
+ * Memseg contig walk function prototype. This will trigger a callback on every
+ * VA-contiguous area starting at memseg ``ms``, so total valid VA space at each
+ * callback call will be [``ms->addr``, ``ms->addr + len``).
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
+		size_t len, void *arg);
+
+/**
  * Walk list of all memsegs.
  *
  * @param func
@@ -155,6 +167,21 @@ int __rte_experimental
 rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 
 /**
+ * Walk each VA-contiguous area.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 7e9900d..8409a3a 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_memzone_reserve_contig;
 	rte_memzone_reserve_aligned_contig;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 35/68] virtio: use memseg contig walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (36 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 34/68] eal: add contig walk function Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 36/68] eal: add iova2virt function Anatoly Burakov
                     ` (32 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/virtio/virtio_user/vhost_kernel.c | 83 +++++++++++----------------
 1 file changed, 35 insertions(+), 48 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 1711ead..93d7efe 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,32 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+struct walk_arg {
+	struct vhost_memory_kernel *vm;
+	uint32_t region_nr;
+};
+static int
+add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct walk_arg *wa = arg;
+	struct vhost_memory_region *mr;
+	void *start_addr;
+
+	if (wa->region_nr >= max_regions)
+		return -1;
+
+	mr = &wa->vm->regions[wa->region_nr++];
+	start_addr = ms->addr;
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return 0;
+}
+
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,63 +103,24 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
-	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
+	struct walk_arg wa;
 
 	vm = malloc(sizeof(struct vhost_memory_kernel) +
-		    max_regions *
-		    sizeof(struct vhost_memory_region));
+			max_regions *
+			sizeof(struct vhost_memory_region));
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
-
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
+	wa.region_nr = 0;
+	wa.vm = vm;
 
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
-			continue;
-
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
-		}
+	if (rte_memseg_contig_walk(add_memory_region, &wa) < 0) {
+		free(vm);
+		return NULL;
 	}
 
-	vm->nregions = k;
+	vm->nregions = wa.region_nr;
 	vm->padding = 0;
 	return vm;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 36/68] eal: add iova2virt function
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (37 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 35/68] virtio: use memseg contig walk instead of iteration Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 37/68] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
                     ` (31 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This is a reverse lookup of PA to VA. Using it will make
other code less dependent on the internals of the mem API.
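
A minimal round-trip sketch (illustration only, not part of this patch;
the demo function name is hypothetical, and it assumes EAL is already
initialized), using the existing rte_mem_virt2iova() as the forward
lookup:

#include <rte_malloc.h>
#include <rte_memory.h>

/* hypothetical demo: an address obtained from rte_malloc() should
 * survive a virt -> iova -> virt round trip */
static int
iova_roundtrip_demo(void)
{
	void *buf = rte_malloc(NULL, 64, 0);
	rte_iova_t iova;
	void *va;
	int ok;

	if (buf == NULL)
		return -1;

	iova = rte_mem_virt2iova(buf); /* existing forward lookup */
	va = rte_mem_iova2virt(iova);  /* new reverse lookup */
	ok = (va == buf);              /* expected for DPDK-managed memory */

	rte_free(buf);
	return ok ? 0 : -1;
}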

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 12 ++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4b528b0..ea3c5a7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,6 +131,36 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+struct virtiova {
+	rte_iova_t iova;
+	void *virt;
+};
+static int
+find_virt(const struct rte_memseg *ms, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova)
+{
+	struct virtiova vi;
+
+	memset(&vi, 0, sizeof(vi));
+
+	vi.iova = iova;
+	rte_memseg_walk(find_virt, &vi);
+
+	return vi.virt;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 45d067f..5c60b91 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -131,6 +131,18 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Get virtual memory address corresponding to iova address.
+ *
+ * @param iova
+ *   The iova address.
+ * @return
+ *   Virtual address corresponding to iova address (or NULL if address does not
+ *   exist within DPDK memory map).
+ */
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 8409a3a..83b1635 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_iova2virt;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_memzone_reserve_contig;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 37/68] bus/dpaa: use iova2virt instead of memseg iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (38 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 36/68] eal: add iova2virt function Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 38/68] bus/fslmc: " Anatoly Burakov
                     ` (30 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, Sunil Kumar Kori, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, olivier.matz, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/dpaa/rte_dpaa_bus.h | 12 +-----------
 drivers/event/dpaa/Makefile     |  3 +++
 drivers/mempool/dpaa/Makefile   |  3 +++
 drivers/net/dpaa/Makefile       |  3 +++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 718701b..89aeac2 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -98,17 +98,7 @@ struct dpaa_portal {
 /* TODO - this is costly, need to write a fast coversion routine */
 static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr != NULL; i++) {
-		if (paddr >= memseg[i].iova && paddr <
-			memseg[i].iova + memseg[i].len)
-			return (uint8_t *)(memseg[i].addr) +
-			       (paddr - memseg[i].iova);
-	}
-
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 /**
diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile
index ddd8552..df3cae8 100644
--- a/drivers/event/dpaa/Makefile
+++ b/drivers/event/dpaa/Makefile
@@ -26,6 +26,9 @@ EXPORT_MAP := rte_pmd_dpaa_event_version.map
 
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # Interfaces with DPDK
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_EVENTDEV) += dpaa_eventdev.c
 
diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile
index 4c0d7aa..da8da1e 100644
--- a/drivers/mempool/dpaa/Makefile
+++ b/drivers/mempool/dpaa/Makefile
@@ -22,6 +22,9 @@ EXPORT_MAP := rte_mempool_dpaa_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c
diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile
index 9c2a5ea..d7a0a50 100644
--- a/drivers/net/dpaa/Makefile
+++ b/drivers/net/dpaa/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa_version.map
 
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # Interfaces with DPDK
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_rxtx.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 38/68] bus/fslmc: use iova2virt instead of memseg iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (39 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 37/68] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 39/68] crypto/dpaa_sec: " Anatoly Burakov
                     ` (29 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 4a19d42..d38fc49 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -260,21 +260,10 @@ static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused));
 /* todo - this is costly, need to write a fast coversion routine */
 static void *dpaa2_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg;
-	int i;
-
 	if (dpaa2_virt_mode)
 		return (void *)(size_t)paddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64
-				+ (paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 39/68] crypto/dpaa_sec: use iova2virt instead of memseg iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (40 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 38/68] bus/fslmc: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 40/68] eal: add virt2memseg function Anatoly Burakov
                     ` (28 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index c5191ce..b04510f 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -120,16 +120,7 @@ dpaa_mem_vtop_ctx(struct dpaa_sec_op_ctx *ctx, void *vaddr)
 static inline void *
 dpaa_mem_ptov(rte_iova_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64 +
-					(paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 40/68] eal: add virt2memseg function
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (41 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 39/68] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 41/68] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
                     ` (27 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This can be used to build a virt2iova function that only looks up
memory that is owned by DPDK (as opposed to doing pagemap walks).
Using it will result in less dependency on the internals of the mem API.
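
For illustration only (not part of this patch), a DPDK-only virt-to-IOVA
helper could be layered on top of the new function; the helper name
below is hypothetical:

#include <rte_common.h>
#include <rte_memory.h>

/* hypothetical helper: translate a VA to its IOVA using only DPDK's
 * own memseg metadata, i.e. without touching /proc/self/pagemap */
static rte_iova_t
dpdk_virt2iova(const void *addr)
{
	const struct rte_memseg *ms = rte_mem_virt2memseg(addr);

	if (ms == NULL)
		return RTE_BAD_IOVA;

	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
}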

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 11 +++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 49 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index ea3c5a7..fd78d2f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -161,6 +161,43 @@ rte_mem_iova2virt(rte_iova_t iova)
 	return vi.virt;
 }
 
+struct virtms {
+	const void *virt;
+	struct rte_memseg *ms;
+};
+static int
+find_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct virtms *vm = arg;
+
+	if (vm->virt >= ms->addr && vm->virt < RTE_PTR_ADD(ms->addr, ms->len)) {
+		struct rte_memseg *memseg, *found_ms;
+		int idx;
+
+		memseg = rte_eal_get_configuration()->mem_config->memseg;
+		idx = ms - memseg;
+		found_ms = &memseg[idx];
+
+		vm->ms = found_ms;
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *addr)
+{
+	struct virtms vm;
+
+	memset(&vm, 0, sizeof(vm));
+
+	vm.virt = addr;
+
+	rte_memseg_walk(find_memseg, &vm);
+
+	return vm.ms;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 5c60b91..b3d7e61 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -143,6 +143,17 @@ __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova);
 
 /**
+ * Get memseg to which a particular virtual address belongs.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg pointer on success, or NULL on error.
+ */
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *virt);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 83b1635..70ec778 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -224,6 +224,7 @@ EXPERIMENTAL {
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
+	rte_mem_virt2memseg;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_memzone_reserve_contig;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 41/68] bus/fslmc: use virt2memseg instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (42 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 40/68] eal: add virt2memseg function Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 42/68] net/mlx4: " Anatoly Burakov
                     ` (26 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index d38fc49..45fd41e 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -270,20 +270,14 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 {
 	const struct rte_memseg *memseg;
-	int i;
 
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr >= memseg[i].addr_64 &&
-		    vaddr < memseg[i].addr_64 + memseg[i].len)
-			return memseg[i].iova
-				+ (vaddr - memseg[i].addr_64);
-	}
-	return (size_t)(NULL);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	if (memseg)
+		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
+	return (size_t)NULL;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 42/68] net/mlx4: use virt2memseg instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (43 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 41/68] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 43/68] net/mlx5: " Anatoly Burakov
                     ` (25 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx4/mlx4_mr.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 9a1e4de..47dd542 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -126,7 +126,7 @@ mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start, uintptr_t *end)
 struct mlx4_mr *
 mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
 	unsigned int i;
@@ -142,16 +142,13 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
-- 
2.7.4


* [PATCH v3 43/68] net/mlx5: use virt2memseg instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (44 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 42/68] net/mlx4: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 44/68] crypto/dpaa_sec: " Anatoly Burakov
                     ` (24 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx5/mlx5_mr.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 2bf1f9c..d8c04dc 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -234,7 +234,7 @@ struct mlx5_mr *
 mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 {
 	struct priv *priv = dev->data->dev_private;
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
 	unsigned int i;
@@ -261,17 +261,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	/* Save original addresses for exact MR lookup. */
 	mr->start = start;
 	mr->end = end;
+
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DRV_LOG(DEBUG,
 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
 		" region",
-- 
2.7.4


* [PATCH v3 44/68] crypto/dpaa_sec: use virt2memseg instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (45 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 43/68] net/mlx5: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 45/68] eal: use memzone walk " Anatoly Burakov
                     ` (23 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b04510f..a14e669 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -93,20 +93,11 @@ dpaa_sec_alloc_ctx(dpaa_sec_session *ses)
 static inline rte_iova_t
 dpaa_mem_vtop(void *vaddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	uint64_t vaddr_64, paddr;
-	int i;
-
-	vaddr_64 = (size_t)vaddr;
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr_64 >= memseg[i].addr_64 &&
-		    vaddr_64 < memseg[i].addr_64 + memseg[i].len) {
-			paddr = memseg[i].iova +
-				(vaddr_64 - memseg[i].addr_64);
-
-			return (rte_iova_t)paddr;
-		}
-	}
+	const struct rte_memseg *ms;
+
+	ms = rte_mem_virt2memseg(vaddr);
+	if (ms)
+		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
 }
 
-- 
2.7.4


* [PATCH v3 45/68] eal: use memzone walk instead of iteration
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (46 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 44/68] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:21   ` [PATCH v3 46/68] vfio: allow to map other memory regions Anatoly Burakov
                     ` (22 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c | 42 +++++++++++++++---------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 36d2553..88f401f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -395,31 +395,31 @@ rte_memzone_lookup(const char *name)
 	return memzone;
 }
 
+static void
+dump_memzone(const struct rte_memzone *mz, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	FILE *f = arg;
+	int mz_idx;
+
+	mz_idx = mz - mcfg->memzone;
+
+	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+				"socket_id:%"PRId32", flags:%"PRIx32"\n",
+			mz_idx,
+			mz->name,
+			mz->iova,
+			mz->len,
+			mz->addr,
+			mz->socket_id,
+			mz->flags);
+}
+
 /* Dump all reserved memory zones on console */
 void
 rte_memzone_dump(FILE *f)
 {
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	rte_rwlock_read_lock(&mcfg->mlock);
-	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
-		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
-	}
-	rte_rwlock_read_unlock(&mcfg->mlock);
+	rte_memzone_walk(dump_memzone, f);
 }
 
 /*
-- 
2.7.4


* [PATCH v3 46/68] vfio: allow to map other memory regions
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (47 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 45/68] eal: use memzone walk " Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-04 11:27     ` Burakov, Anatoly
  2018-04-05 11:30     ` Burakov, Anatoly
  2018-04-03 23:21   ` [PATCH v3 47/68] eal: add "legacy memory" option Anatoly Burakov
                     ` (21 subsequent siblings)
  70 siblings, 2 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario arises in vhost applications (such as SPDK),
where the guest sends its own memory table. To fill this gap, provide an
API that allows registering an arbitrary address range in the VFIO
container.
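
A minimal usage sketch of the new calls (not part of this patch; the
buffer, IOVA and length values are placeholders, and VFIO is assumed to
be already set up for the device that will do the DMA):

#include <stdint.h>
#include <rte_vfio.h>

static int
dma_map_external_buf(void *buf, uint64_t iova, uint64_t len)
{
	/* make the externally allocated buffer DMA-able through VFIO */
	if (rte_vfio_dma_map((uint64_t)(uintptr_t)buf, iova, len) < 0)
		return -1;

	/* ... device performs DMA to/from the buffer here ... */

	/* remove the mapping once the device no longer uses the buffer */
	return rte_vfio_dma_unmap((uint64_t)(uintptr_t)buf, iova, len);
}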

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---

Notes:
    v3:
    - Added PPC64, courtesy of Gowrishankar
    
    v3:
    - Moved to earlier in the patchset
    - Made API experimental
    - Do not print out error message if init isn't finished
    - SPAPR code provided by Gowrishankar

 lib/librte_eal/bsdapp/eal/eal.c          |  16 ++
 lib/librte_eal/common/include/rte_vfio.h |  39 ++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 347 ++++++++++++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  12 ++
 lib/librte_eal/rte_eal_version.map       |   2 +
 5 files changed, 341 insertions(+), 75 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 8e25d78..032a5ea 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -749,6 +749,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -784,3 +786,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index 249095e..bd4663c 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -127,6 +127,45 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int  __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index fb41e82..f6fe93e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -22,17 +22,35 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
 int
@@ -333,9 +351,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +372,8 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
 		}
 	}
 
@@ -668,23 +689,49 @@ static int
 type1_map(const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
+
+	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
+static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
 	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
 	int ret;
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
 
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
-		return -1;
+				return -1;
+		}
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
 	}
+
 	return 0;
 }
 
@@ -694,12 +741,78 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+static int
+vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		struct vfio_iommu_spapr_register_memory reg = {
+			.argsz = sizeof(reg),
+			.flags = 0
+		};
+		reg.vaddr = (uintptr_t) vaddr;
+		reg.size = len;
+
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY, &reg);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot unregister vaddr for IOMMU, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+{
+	int *vfio_container_fd = arg;
+
+	return vfio_spapr_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
 struct spapr_walk_param {
 	uint64_t window_size;
 	uint64_t hugepage_sz;
 };
 static int
-spapr_window_size(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
@@ -713,39 +826,43 @@ spapr_window_size(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-spapr_map(const struct rte_memseg *ms, void *arg)
-{
-	struct vfio_iommu_type1_dma_map dma_map;
-	struct vfio_iommu_spapr_register_memory reg = {
-		.argsz = sizeof(reg),
-		.flags = 0
+vfio_spapr_create_new_dma_window(int vfio_container_fd,
+		struct vfio_iommu_spapr_tce_create *create) {
+	struct vfio_iommu_spapr_tce_remove remove = {
+		.argsz = sizeof(remove),
+	};
+	struct vfio_iommu_spapr_tce_info info = {
+		.argsz = sizeof(info),
 	};
-	int *vfio_container_fd = arg;
 	int ret;
 
-	reg.vaddr = (uintptr_t) ms->addr;
-	reg.size = ms->len;
-	ret = ioctl(*vfio_container_fd,
-		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	/* query spapr iommu info */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+				"error %i (%s)\n", errno, strerror(errno));
 		return -1;
 	}
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-			 VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	/* remove default DMA of 32 bit window */
+	remove.start_addr = info.dma32_window_start;
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
 
+	/* create new DMA window */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, create);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
+
+	if (create->start_addr != 0) {
+		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
 		return -1;
 	}
 
@@ -753,61 +870,82 @@ spapr_map(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
 {
 	struct spapr_walk_param param;
-	int ret;
-	struct vfio_iommu_spapr_tce_info info = {
-		.argsz = sizeof(info),
-	};
 	struct vfio_iommu_spapr_tce_create create = {
 		.argsz = sizeof(create),
 	};
-	struct vfio_iommu_spapr_tce_remove remove = {
-		.argsz = sizeof(remove),
-	};
 
+	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	/* query spapr iommu info */
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
-				"error %i (%s)\n", errno, strerror(errno));
+	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		return -1;
 	}
 
-	/* remove default DMA of 32 bit window */
-	remove.start_addr = info.dma32_window_start;
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	/* sPAPR requires window size to be a power of 2 */
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
+	create.levels = 1;
+
+	if (do_map) {
+		/* re-create window and remap the entire memory */
+		if (iova > create.window_size) {
+			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
+					&create) < 0) {
+				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+				return -1;
+			}
+			if (rte_memseg_walk(vfio_spapr_map_walk,
+					&vfio_container_fd) < 0) {
+				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
+				return -1;
+			}
+		}
+		/* now that we've remapped all of the memory that was present
+		 * before, map the segment that we were requested to map.
+		 */
+		if (vfio_spapr_dma_do_map(vfio_container_fd,
+				vaddr, iova, len, 1) < 0) {
+			RTE_LOG(ERR, EAL, "Could not map segment\n");
+			return -1;
+		}
+	} else {
+
+		/* for unmap, check if iova within DMA window */
+		if (iova > create.window_size) {
+			RTE_LOG(ERR, EAL, "iova beyond DMA window for unmap");
+			return -1;
+		}
+
+		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct vfio_iommu_spapr_tce_create create = {
+		.argsz = sizeof(create),
+	};
+	struct spapr_walk_param param;
+
+	memset(&param, 0, sizeof(param));
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	rte_memseg_walk(spapr_window_size, &param);
+	rte_memseg_walk(vfio_spapr_window_size_walk, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(param.window_size);
 	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
-	}
-
-	if (create.start_addr != 0) {
-		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
-		return -1;
-	}
-
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+	if (rte_memseg_walk(vfio_spapr_map_walk, &vfio_container_fd) < 0)
 		return -1;
 
 	return 0;
@@ -820,6 +958,49 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region mapping not supported by IOMMU %s\n",
+			t->name);
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 1);
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 0);
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -852,4 +1033,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..549f442 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -26,6 +27,7 @@
 #ifndef VFIO_SPAPR_TCE_v2_IOMMU
 #define RTE_VFIO_SPAPR 7
 #define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17)
+#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
 #define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
 
@@ -110,6 +112,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +122,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address (IOVA),
+ * length and operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ */
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 70ec778..fe4a9c9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -266,5 +266,7 @@ EXPERIMENTAL {
 	rte_service_start_with_defaults;
 	rte_socket_count;
 	rte_socket_id_by_idx;
+	rte_vfio_dma_map;
+	rte_vfio_dma_unmap;
 
 } DPDK_18.02;
-- 
2.7.4


* [PATCH v3 47/68] eal: add "legacy memory" option
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (48 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 46/68] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-04-03 23:21   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 48/68] eal: add rte_fbarray Anatoly Burakov
                     ` (20 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:21 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy
memory init sequence will be added later. For FreeBSD, non-legacy
memory init will never be enabled, while for Linux it is
disabled in this patch to avoid breaking bisect, but it will be
enabled once non-legacy mode is fully operational.
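
A minimal sketch of requesting this mode from application code rather
than on the command line (the application itself and the "-l 0" core
argument are placeholders for illustration only):

#include <rte_eal.h>

int
main(int argc, char **argv)
{
	/* build an EAL argument list that forces legacy memory mode;
	 * only argv[0] is reused from the real command line.
	 */
	char *eal_argv[] = { argv[0], "-l", "0", "--legacy-mem" };

	(void)argc; /* unused in this sketch */

	if (rte_eal_init(4, eal_argv) < 0)
		return -1;

	/* ... memory is now reserved up front, as in previous releases ... */
	return 0;
}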

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Move to earlier in the patchset
    - Make Linuxapp always load in legacy mode

 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  3 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 032a5ea..f44b904 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -534,6 +534,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 8a51ade..fb5ea03 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1184,6 +1185,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index a0082d1..fda087b 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * IOVA-contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..d301d0b 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 77f6cb7..b34e57a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
@@ -767,6 +768,8 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
+	/* for now, always set legacy mem */
+	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5642cc8..1d3defe 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -922,8 +922,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1266,8 +1266,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1403,6 +1403,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4


* [PATCH v3 48/68] eal: add rte_fbarray
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (49 preceding siblings ...)
  2018-04-03 23:21   ` [PATCH v3 47/68] eal: add "legacy memory" option Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 49/68] eal: replace memseg with memseg lists Anatoly Burakov
                     ` (19 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

rte_fbarray is a simple indexed array stored in shared memory
by mapping files into memory. The rationale for its existence is the
following: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that involves reallocating memory (which is a big no-no in
multiprocess). What we can do instead is have a maximum capacity as
something really, really large, and decide at allocation time how
big the array is going to be. We map the entire file into memory,
which makes it possible to use fbarray as shared memory, provided
the structure itself is allocated in shared memory. Per-fbarray
locking is also used to avoid index data races (but not contents
data races - those are up to the user application to synchronize).

In addition, since we will frequently need to scan this array for
free space, and iterating over the array linearly can become slow,
rte_fbarray provides facilities to index the array's usage. The
following use cases are covered:
 - find next free/used slot (useful either for adding new elements
   to fbarray, or walking the list)
 - find starting index for next N free/used slots (useful for when
   we want to allocate chunk of VA-contiguous memory composed of
   several pages)
 - find how many contiguous free/used slots there are, starting
   from specified index (useful for when we want to figure out
   how many pages we have until next hole in allocated memory, to
   speed up some bulk operations where we would otherwise have to
   walk the array and add pages one by one)

This is accomplished by storing a usage mask in-memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
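
A minimal usage sketch of the API (names and sizes are illustrative and
not part of this patch; for true multiprocess sharing the rte_fbarray
structure itself would also need to live in shared memory):

#include <stdint.h>
#include <rte_fbarray.h>

struct rec { uint64_t addr; uint64_t len; };

static int
fbarray_example(void)
{
	struct rte_fbarray arr;
	struct rec *r;
	int idx;

	/* file-backed array with room for up to 1024 fixed-size records */
	if (rte_fbarray_init(&arr, "example_arr", 1024, sizeof(*r)) < 0)
		return -1;

	/* find a free slot, fill it in and mark it as used */
	idx = rte_fbarray_find_next_free(&arr, 0);
	if (idx < 0)
		return -1;
	r = rte_fbarray_get(&arr, idx);
	r->addr = 0x100000;
	r->len = 0x1000;
	rte_fbarray_set_used(&arr, idx);

	/* how many contiguous used slots are there, starting at index 0? */
	return rte_fbarray_find_contig_used(&arr, 0);
}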

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Fixed index alignment bug
    - Fixed compile issues
    
    v3:
    - MAP_POPULATE not supported on FreeBSD, removed it
    - Bugfix for index size when it is unaligned
    - Replace uint64_t with size_t for mapping sizes
    - Make API experimental
    
    The initial version of this had resizing capability; however, it was
    removed because, in a multiprocess scenario, each fbarray would have
    its own view of mapped memory, which might not correspond with
    others due to some other process performing a resize that the
    current process didn't know about.
    
    It was therefore decided that, to avoid the cost of synchronizing on
    each and every operation (to make sure the array wasn't resized),
    the resizing feature should be dropped.

 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 353 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  16 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..f65875d
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) ((idx << MASK_SHIFT) + mod)
+
+/*
+ * This is a mask that is always stored at the end of array, to provide fast
+ * way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		RTE_LOG(ERR, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit in within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? -ENOENT : -ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? -ENOENT : -ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	uint64_t msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	int msk_idx = MASK_LEN_TO_IDX(idx);
+	bool already_used;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	ret = 0;
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(RTE_ALIGN_CEIL(len, MASK_ALIGN));
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as two values we need (element
+	 * size and array length) are constant for the duration of life of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as two values we need (element
+	 * size and total capacity) are constant for the duration of life of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it, things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. repeated call from a previously
+	 * failed destroy()), but that is on the user; we can't (easily) know
+	 * if this is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		goto out;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) {
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
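As an aside for reviewers: with the default "rte" file prefix, and using one
of the memseg-list fbarrays created later in this series as an example, the
resulting path would look something like the following (illustrative only -
the actual name depends on --file-prefix and the array name):

	/tmp/rte_memseg-2048k-0-0

while non-root users get the same file name under $HOME instead of /tmp.
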
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..c45ac0b
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,353 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_compat.h>
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * data pointed to by ``arr`` pointer must itself be allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to the file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with data
+ * pointed to by ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill data pointed to by ``arr`` pointer and remove the underlying file
+ * backing the data, so it is expected that by the time this function is called,
+ * all other processes have detached from this ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to the element whose index should be found.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of the element to check.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many contiguous free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many contiguous used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
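
For reviewers, the workflow described in the file-level comment above boils
down to roughly the following (an illustrative sketch only, not part of the
patch; error handling is trimmed and "struct my_elt" is a made-up placeholder
element type):

	#include <rte_fbarray.h>

	struct my_elt { int payload; };	/* placeholder element type */

	static int
	example(struct rte_fbarray *arr)
	{
		struct my_elt *elt;
		int idx;

		/* primary process sets the array up; a secondary process
		 * would call rte_fbarray_attach(arr) instead.
		 */
		if (rte_fbarray_init(arr, "example", 1024,
				sizeof(struct my_elt)) < 0)
			return -1;

		/* find a free slot, fill it, then mark it as used */
		idx = rte_fbarray_find_next_free(arr, 0);
		if (idx < 0)
			return -1;
		elt = rte_fbarray_get(arr, idx);
		elt->payload = 42;
		return rte_fbarray_set_used(arr, idx);
	}
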
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index fe4a9c9..3a12112 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,22 @@ EXPERIMENTAL {
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
+	rte_fbarray_attach;
+	rte_fbarray_destroy;
+	rte_fbarray_detach;
+	rte_fbarray_dump_metadata;
+	rte_fbarray_find_idx;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_get;
+	rte_fbarray_init;
+	rte_fbarray_is_used;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
-- 
2.7.4


* [PATCH v3 49/68] eal: replace memseg with memseg lists
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (50 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 48/68] eal: add rte_fbarray Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 50/68] eal: replace memzone array with fbarray Anatoly Burakov
                     ` (18 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Bruce Richardson, Neil Horman, John McNamara,
	Marko Kovacevic, Hemant Agrawal, Shreyansh Jain, Akhil Goyal,
	Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh,
	Maxime Coquelin, Tiwei Bie, Olivier Matz, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	pepperjo, jerin.jacob, gowrishankar.m

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg, so
the list of memsegs is huge. To accommodate the new memseg list size
and to keep the under-the-hood workings sane, there is now not just
one single memseg list, but multiple memseg lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
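
In other words, each memseg list now looks roughly like this (a
simplified view based on how the fields are used throughout this
patch; other members omitted):

	struct rte_memseg_list {
		void *base_va;        /* start of preallocated VA space */
		uint64_t page_sz;     /* page size of all segments in the list */
		int socket_id;        /* NUMA node this list belongs to */
		struct rte_fbarray memseg_arr; /* one rte_memseg per page */
	};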

In order to support dynamic memory allocation, we reserve all VA
space in advance (unless we're in 32-bit legacy mode, in which case
we do not preallocate). That is, we do an anonymous mmap() of the
entire maximum size of memory per hugepage size, per socket (limited
to either RTE_MAX_MEMSEG_PER_TYPE pages or RTE_MAX_MEM_MB_PER_TYPE
megabytes worth of memory, whichever is smaller), split over multiple
lists (each limited to either RTE_MAX_MEMSEG_PER_LIST memsegs or
RTE_MAX_MEM_MB_PER_LIST megabytes, whichever is smaller). There is
also a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is
mainly used for 32-bit targets to limit the amount of preallocated
address space, but can also be used to place an upper limit on the
total amount of VA memory that a DPDK application can reserve.

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
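
To put concrete numbers on the defaults from common_base (these
follow directly from the limits described above):

	2M pages:  per list = min(8192 * 2M, 32G)   = 16G
	           per type = min(32768 * 2M, 128G) = 64G  -> 4 lists/socket
	1G pages:  per list = min(8192 * 1G, 32G)   = 32G
	           per type = min(32768 * 1G, 128G) = 128G -> 4 lists/socket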

Pages in the list are also indexed by address. That is, in order
to figure out where a page belongs, one can simply look at the base
address of its memseg list. Similarly, figuring out the IOVA address
of a memzone is a matter of finding the right memseg list, getting
the offset into it, and dividing by the page size to get the
appropriate memseg.
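
Concretely, the lookup described above is just pointer arithmetic
along the following lines (a sketch of what the virt2memseg() helper
added further down in this patch does; the function name here is
made up):

	static rte_iova_t
	virt2iova_sketch(const struct rte_memseg_list *msl, const void *addr)
	{
		/* assumes addr is known to fall within msl's VA range */
		size_t offset = RTE_PTR_DIFF(addr, msl->base_va);
		int ms_idx = offset / msl->page_sz;
		const struct rte_memseg *ms =
			rte_fbarray_get(&msl->memseg_arr, ms_idx);

		return ms->iova + (offset % msl->page_sz);
	}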

This commit also removes the rte_eal_get_physmem_layout() function,
in accordance with deprecation notice [1], and removes that
deprecation notice as well.

On 32-bit targets, due to limited VA space, DPDK will no longer
spread memory across different sockets like before. Instead, it will
(by default) allocate all of the memory on the socket where the
master lcore is. To override this behavior, --socket-mem must be
used.
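
For example, a 32-bit application that still wants memory on two
sockets would have to request it explicitly, along the lines of
(the amounts are hypothetical, given in megabytes per socket):

	./app -l 0-3 --socket-mem=512,256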

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to the earlier switch to _walk()
functions, most of the changes are simple fixes; however, some
of the _walk() calls were switched to memseg list walks where it
made sense to do so.

Additionally, we are switching locks from flock() to fcntl(). Down
the line, we will be introducing a single-file segments option, and
we cannot use flock() locks to lock parts of a file. Therefore, we
will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start a legacy mem primary
process alongside an already running non-legacy mem primary process.
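
The practical difference is that flock() can only lock a file as a
whole, while fcntl() record locks can cover an arbitrary byte range,
which is what per-segment locking within a single file requires. A
generic sketch (not code from this patch; fd, seg_idx and page_sz
are placeholders):

	#include <fcntl.h>

	/* lock only the byte range backing one segment */
	static int
	lock_one_segment(int fd, int seg_idx, size_t page_sz)
	{
		struct flock fl = {
			.l_type = F_WRLCK,	/* F_RDLCK for a shared lock */
			.l_whence = SEEK_SET,
			.l_start = (off_t)seg_idx * page_sz,
			.l_len = page_sz,
		};

		return fcntl(fd, F_SETLK, &fl);	/* -1 if already locked */
	}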

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - New and improved legacy mode, without (too many) crazy hacks
    - 32-bit support
    - FreeBSD support
    - Compile fixes for all platforms

 config/common_base                                |  15 +-
 config/defconfig_i686-native-linuxapp-gcc         |   3 +
 config/defconfig_i686-native-linuxapp-icc         |   3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |   3 +
 config/rte_config.h                               |   7 +-
 doc/guides/rel_notes/deprecation.rst              |   9 -
 drivers/bus/fslmc/fslmc_vfio.c                    |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   2 +-
 drivers/bus/pci/linux/pci.c                       |   8 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   2 +-
 drivers/net/mlx4/mlx4_mr.c                        |   4 +-
 drivers/net/mlx5/mlx5.c                           |   3 +-
 drivers/net/mlx5/mlx5_mr.c                        |   4 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   4 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  12 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |  17 +-
 lib/librte_eal/bsdapp/eal/eal_memory.c            | 207 ++++-
 lib/librte_eal/common/eal_common_memory.c         | 581 ++++++++++++--
 lib/librte_eal/common/eal_common_memzone.c        |  48 +-
 lib/librte_eal/common/eal_hugepages.h             |   1 -
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  56 +-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |  12 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  62 +-
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  15 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  25 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 913 +++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |   9 +-
 lib/librte_eal/rte_eal_version.map                |   3 +-
 lib/librte_mempool/rte_mempool.c                  |   9 +-
 test/test/test_malloc.c                           |  30 +-
 test/test/test_memory.c                           |  10 +-
 test/test/test_memzone.c                          |  12 +-
 37 files changed, 1563 insertions(+), 589 deletions(-)

diff --git a/config/common_base b/config/common_base
index 7abf7c6..0ca1a06 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=64
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_MB_PER_LIST=32768
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or
+# RTE_MAX_MEM_MB_PER_TYPE megabytes of memory (split over multiple lists of
+# RTE_MAX_MEM_MB_PER_LIST), whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072
+# global maximum usable amount of VA, in megabytes
+CONFIG_RTE_MAX_MEM_MB=524288
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/config/defconfig_i686-native-linuxapp-gcc b/config/defconfig_i686-native-linuxapp-gcc
index a42ba4f..1178fe3 100644
--- a/config/defconfig_i686-native-linuxapp-gcc
+++ b/config/defconfig_i686-native-linuxapp-gcc
@@ -46,3 +46,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_i686-native-linuxapp-icc b/config/defconfig_i686-native-linuxapp-icc
index 144ba0a..f096e22 100644
--- a/config/defconfig_i686-native-linuxapp-icc
+++ b/config/defconfig_i686-native-linuxapp-icc
@@ -51,3 +51,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_x86_x32-native-linuxapp-gcc b/config/defconfig_x86_x32-native-linuxapp-gcc
index b6206a5..57d000d 100644
--- a/config/defconfig_x86_x32-native-linuxapp-gcc
+++ b/config/defconfig_x86_x32-native-linuxapp-gcc
@@ -26,3 +26,6 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/rte_config.h b/config/rte_config.h
index 72c0aa2..e42be1c 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -21,7 +21,12 @@
 /****** library defines ********/
 
 /* EAL defines */
-#define RTE_MAX_MEMSEG 512
+#define RTE_MAX_MEMSEG_LISTS 128
+#define RTE_MAX_MEMSEG_PER_LIST 8192
+#define RTE_MAX_MEM_MB_PER_LIST 32768
+#define RTE_MAX_MEMSEG_PER_TYPE 32768
+#define RTE_MAX_MEM_MB_PER_TYPE 65536
+#define RTE_MAX_MEM_MB 524288
 #define RTE_MAX_MEMZONE 2560
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_LEVEL RTE_LOG_INFO
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ec70b5f..c9f2703 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -38,15 +38,6 @@ Deprecation Notices
   success and failure, respectively.  This will change to 1 and 0 for true and
   false, respectively, to make use of the function more intuitive.
 
-* eal: due to internal data layout reorganization, there will be changes to
-  several structures and functions as a result of coming changes to support
-  memory hotplug in v18.05.
-  ``rte_eal_get_physmem_layout`` will be deprecated and removed in subsequent
-  releases.
-  ``rte_mem_config`` contents will change due to switch to memseg lists.
-  ``rte_memzone`` member ``memseg_id`` will no longer serve any useful purpose
-  and will be removed.
-
 * eal: a new set of mbuf mempool ops name APIs for user, platform and best
   mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
   ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index ccdbeff..31831e3 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -194,7 +194,8 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 }
 
 static int
-fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
+fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
@@ -236,18 +237,11 @@ fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 
 int rte_fslmc_vfio_dmamap(void)
 {
-	const struct rte_memseg *memseg;
 	int i = 0;
 
 	if (is_dma_done)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		FSLMC_VFIO_LOG(ERR, "Cannot get physical layout.");
-		return -ENODEV;
-	}
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 45fd41e..72aae43 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -274,7 +274,7 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr, NULL);
 	if (memseg)
 		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
 	return (size_t)NULL;
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 6dda054..4630a80 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -117,9 +117,10 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 }
 
 static int
-find_max_end_va(const struct rte_memseg *ms, void *arg)
+find_max_end_va(const struct rte_memseg_list *msl, void *arg)
 {
-	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	size_t sz = msl->memseg_arr.len * msl->page_sz;
+	void *end_va = RTE_PTR_ADD(msl->base_va, sz);
 	void **max_va = arg;
 
 	if (*max_va < end_va)
@@ -132,10 +133,11 @@ pci_find_max_end_va(void)
 {
 	void *va = NULL;
 
-	rte_memseg_walk(find_max_end_va, &va);
+	rte_memseg_list_walk(find_max_end_va, &va);
 	return va;
 }
 
+
 /* parse one line of the "resource" sysfs file (note that the 'line'
  * string is modified)
  */
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index a14e669..b685220 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -95,7 +95,7 @@ dpaa_mem_vtop(void *vaddr)
 {
 	const struct rte_memseg *ms;
 
-	ms = rte_mem_virt2memseg(vaddr);
+	ms = rte_mem_virt2memseg(vaddr, NULL);
 	if (ms)
 		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 47dd542..2ba609e 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -142,10 +142,10 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1724b65..e228356 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -478,7 +478,8 @@ static struct rte_pci_driver mlx5_driver;
 static void *uar_base;
 
 static int
-find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	void **addr = arg;
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index d8c04dc..6638185 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -263,10 +263,10 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	mr->end = end;
 
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 93d7efe..b244409 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -75,7 +75,8 @@ struct walk_arg {
 	uint32_t region_nr;
 };
 static int
-add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+add_memory_region(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
 {
 	struct walk_arg *wa = arg;
 	struct vhost_memory_region *mr;
@@ -95,7 +96,6 @@ add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
 	return 0;
 }
 
-
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f44b904..d009cf0 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -64,8 +64,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -430,11 +430,11 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_socket(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
 	int *socket_id = arg;
 
-	if (ms->socket_id == *socket_id)
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
 		return 1;
 
 	return 0;
@@ -447,10 +447,11 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
+
 static int
 sync_func(__attribute__((unused)) void *arg)
 {
@@ -561,7 +562,6 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
 	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
 			eal_hugepage_info_init() < 0) {
 		rte_eal_init_alert("Cannot get hugepage information.");
 		rte_errno = EACCES;
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..ba44da0 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -47,12 +47,18 @@ eal_hugepage_info_init(void)
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
 
+	internal_config.num_hugepage_sizes = 1;
+
+	/* nothing more to be done for secondary */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers\n");
 		return -1;
 	}
 
@@ -61,7 +67,7 @@ eal_hugepage_info_init(void)
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size\n");
 		return -1;
 	}
 
@@ -81,22 +87,21 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	internal_config.num_hugepage_sizes = 1;
 	hpi->hugedir = CONTIGMEM_DEV;
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
 
 	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
-					sizeof(struct hugepage_info));
+			sizeof(internal_config.hugepage_info));
 	if (tmp_hpi == NULL ) {
 		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
 		return -1;
 	}
 
-	memcpy(tmp_hpi, hpi, sizeof(struct hugepage_info));
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
-	if ( munmap(tmp_hpi, sizeof(struct hugepage_info)) < 0) {
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
 	}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index bdfb882..6692b3d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -6,6 +6,8 @@
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <inttypes.h>
+#include <errno.h>
+#include <string.h>
 #include <fcntl.h>
 
 #include <rte_eal.h>
@@ -41,37 +43,135 @@ rte_eal_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	uint64_t total_mem = 0;
 	void *addr;
-	unsigned i, j, seg_idx = 0;
+	unsigned int i, j, seg_idx = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
 	/* for debug purposes, hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
-		addr = malloc(internal_config.memory);
-		mcfg->memseg[0].iova = (rte_iova_t)(uintptr_t)addr;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		struct rte_memseg *ms;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
+		addr = mmap(NULL, internal_config.memory,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is 1 page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->len = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, page_sz);
+		}
 		return 0;
 	}
 
 	/* map all hugepages and sort them */
 	for (i = 0; i < internal_config.num_hugepage_sizes; i ++){
 		struct hugepage_info *hpi;
+		uint64_t page_sz, mem_needed;
+		unsigned int n_pages, max_pages;
 
 		hpi = &internal_config.hugepage_info[i];
-		for (j = 0; j < hpi->num_pages[0]; j++) {
+		page_sz = hpi->hugepage_sz;
+		max_pages = hpi->num_pages[0];
+		mem_needed = RTE_ALIGN_CEIL(internal_config.memory - total_mem,
+				page_sz);
+
+		n_pages = RTE_MIN(mem_needed / page_sz, max_pages);
+
+		for (j = 0; j < n_pages; j++) {
+			struct rte_memseg_list *msl;
+			struct rte_fbarray *arr;
 			struct rte_memseg *seg;
+			int msl_idx, ms_idx;
 			rte_iova_t physaddr;
 			int error;
 			size_t sysctl_size = sizeof(physaddr);
 			char physaddr_str[64];
 
-			addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-				    MAP_SHARED, hpi->lock_descriptor,
-				    j * EAL_PAGE_SIZE);
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				bool empty;
+				msl = &mcfg->memsegs[msl_idx];
+				arr = &msl->memseg_arr;
+
+				if (msl->page_sz != page_sz)
+					continue;
+
+				empty = arr->count == 0;
+
+				/* we need 1, plus hole if not empty */
+				ms_idx = rte_fbarray_find_next_n_free(arr,
+						0, empty ? 1 : 2);
+
+				/* memseg list is full? */
+				if (ms_idx < 0)
+					continue;
+
+				/* leave some space between memsegs, they are
+				 * not IOVA contiguous, so they shouldn't be VA
+				 * contiguous either.
+				 */
+				if (!empty)
+					ms_idx++;
+
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+					RTE_STR(CONFIG_RTE_MAX_MEM_MB_PER_TYPE));
+				return -1;
+			}
+			arr = &msl->memseg_arr;
+			seg = rte_fbarray_get(arr, ms_idx);
+
+			addr = RTE_PTR_ADD(msl->base_va,
+					(size_t)msl->page_sz * ms_idx);
+
+			/* address is already mapped in memseg list, so using
+			 * MAP_FIXED here is safe.
+			 */
+			addr = mmap(addr, page_sz, PROT_READ|PROT_WRITE,
+					MAP_SHARED | MAP_FIXED,
+					hpi->lock_descriptor,
+					j * EAL_PAGE_SIZE);
 			if (addr == MAP_FAILED) {
 				RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 						j, hpi->hugedir);
@@ -88,33 +188,60 @@ rte_eal_hugepage_init(void)
 				return -1;
 			}
 
-			seg = &mcfg->memseg[seg_idx++];
 			seg->addr = addr;
 			seg->iova = physaddr;
-			seg->hugepage_sz = hpi->hugepage_sz;
-			seg->len = hpi->hugepage_sz;
+			seg->hugepage_sz = page_sz;
+			seg->len = page_sz;
 			seg->nchannel = mcfg->nchannel;
 			seg->nrank = mcfg->nrank;
 			seg->socket_id = 0;
 
+			rte_fbarray_set_used(arr, ms_idx);
+
 			RTE_LOG(INFO, EAL, "Mapped memory segment %u @ %p: physaddr:0x%"
 					PRIx64", len %zu\n",
-					seg_idx, addr, physaddr, hpi->hugepage_sz);
-			if (total_mem >= internal_config.memory ||
-					seg_idx >= RTE_MAX_MEMSEG)
-				break;
+					seg_idx, addr, physaddr, page_sz);
+
+			total_mem += seg->len;
 		}
+		if (total_mem >= internal_config.memory)
+			break;
+	}
+	if (total_mem < internal_config.memory) {
+		RTE_LOG(ERR, EAL, "Couldn't reserve requested memory, requested: %" PRIu64 "M available: %" PRIu64 "M\n",
+				internal_config.memory >> 20, total_mem >> 20);
+		return -1;
 	}
 	return 0;
 }
 
+struct attach_walk_args {
+	int fd_hugepage;
+	int seg_idx;
+};
+static int
+attach_segment(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
+{
+	struct attach_walk_args *wa = arg;
+	void *addr;
+
+	addr = mmap(ms->addr, ms->len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, wa->fd_hugepage,
+			wa->seg_idx * EAL_PAGE_SIZE);
+	if (addr == MAP_FAILED || addr != ms->addr)
+		return -1;
+	wa->seg_idx++;
+
+	return 0;
+}
+
 int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
 	int fd_hugepage_info, fd_hugepage = -1;
-	unsigned i = 0;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int i;
 
 	/* Obtain a file descriptor for hugepage_info */
 	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
@@ -124,41 +251,43 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(struct hugepage_info), PROT_READ, MAP_PRIVATE,
-			fd_hugepage_info, 0);
+	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
+			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
 	if (hpi == MAP_FAILED) {
 		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
 		goto error;
 	}
 
-	/* Obtain a file descriptor for contiguous memory */
-	fd_hugepage = open(hpi->hugedir, O_RDWR);
-	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", hpi->hugedir);
-		goto error;
-	}
+	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
+		const struct hugepage_info *cur_hpi = &hpi[i];
+		struct attach_walk_args wa;
 
-	/* Map the contiguous memory into each memory segment */
-	for (i = 0; i < hpi->num_pages[0]; i++) {
+		memset(&wa, 0, sizeof(wa));
 
-		void *addr;
-		struct rte_memseg *seg = &mcfg->memseg[i];
+		/* Obtain a file descriptor for contiguous memory */
+		fd_hugepage = open(cur_hpi->hugedir, O_RDWR);
+		if (fd_hugepage < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s\n",
+					cur_hpi->hugedir);
+			goto error;
+		}
+		wa.fd_hugepage = fd_hugepage;
+		wa.seg_idx = 0;
 
-		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			    MAP_SHARED|MAP_FIXED, fd_hugepage,
-			    i * EAL_PAGE_SIZE);
-		if (addr == MAP_FAILED || addr != seg->addr) {
+		/* Map the contiguous memory into each memory segment */
+		if (rte_memseg_walk(attach_segment, &wa) < 0) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
-				i, hpi->hugedir);
+				wa.seg_idx, cur_hpi->hugedir);
 			goto error;
 		}
 
+		close(fd_hugepage);
+		fd_hugepage = -1;
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(struct hugepage_info));
+	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
 	close(fd_hugepage_info);
-	close(fd_hugepage);
 	return 0;
 
 error:
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fd78d2f..0a6d678 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,393 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz, uint64_t max_mem)
+{
+	uint64_t area_sz, max_pages;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_MB_PER_LIST */
+	max_pages = RTE_MAX_MEMSEG_PER_LIST;
+	max_mem = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_LIST << 20, max_mem);
+
+	area_sz = RTE_MIN(page_sz * max_pages, max_mem);
+
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return RTE_ALIGN(area_sz, page_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		uint64_t max_mem, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	uint64_t mem_amount;
+	int max_segs;
+
+	mem_amount = get_mem_amount(page_sz, max_mem);
+	max_segs = mem_amount / page_sz;
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init_32(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int active_sockets, hpi_idx, msl_idx = 0;
+	unsigned int socket_id, i;
+	struct rte_memseg_list *msl;
+	uint64_t extra_mem_per_socket, total_extra_mem, total_requested_mem;
+	uint64_t max_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	/* this is a giant hack, but desperate times call for desperate
+	 * measures. in legacy 32-bit mode, we cannot preallocate VA space,
+	 * because having upwards of 2 gigabytes of VA space already mapped will
+	 * interfere with our ability to map and sort hugepages.
+	 *
+	 * therefore, in legacy 32-bit mode, we will be initializing memseg
+	 * lists much later - in eal_memory.c, right after we unmap all the
+	 * unneeded pages. this will not affect secondary processes, as those
+	 * should be able to mmap the space without (too many) problems.
+	 */
+	if (internal_config.legacy_mem)
+		return 0;
+
+	/* 32-bit mode is a very special case. we cannot know in advance where
+	 * the user will want to allocate their memory, so we have to do some
+	 * heuristics.
+	 */
+	active_sockets = 0;
+	total_requested_mem = 0;
+	if (internal_config.force_sockets)
+		for (i = 0; i < rte_socket_count(); i++) {
+			uint64_t mem;
+
+			socket_id = rte_socket_id_by_idx(i);
+			mem = internal_config.socket_mem[socket_id];
+
+			if (mem == 0)
+				continue;
+
+			active_sockets++;
+			total_requested_mem += mem;
+		}
+	else
+		total_requested_mem = internal_config.memory;
+
+	max_mem = (uint64_t) RTE_MAX_MEM_MB_PER_TYPE << 20;
+	if (total_requested_mem > max_mem) {
+		RTE_LOG(ERR, EAL, "Invalid parameters: 32-bit process can at most use %uM of memory\n",
+				(unsigned int)(max_mem >> 20));
+		return -1;
+	}
+	total_extra_mem = max_mem - total_requested_mem;
+	extra_mem_per_socket = active_sockets == 0 ? total_extra_mem :
+			total_extra_mem / active_sockets;
+
+	/* the allocation logic is a little bit convoluted, but here's how it
+	 * works, in a nutshell:
+	 *  - if user hasn't specified on which sockets to allocate memory via
+	 *    --socket-mem, we allocate all of our memory on master core socket.
+	 *  - if user has specified sockets to allocate memory on, there may be
+	 *    some "unused" memory left (e.g. if user has specified --socket-mem
+	 *    such that not all memory adds up to 2 gigabytes), so add it to all
+	 *    sockets that are in use equally.
+	 *
+	 * page sizes are sorted by size in descending order, so we can safely
+	 * assume that we dispense with bigger page sizes first.
+	 */
+
+	/* create memseg lists */
+	for (i = 0; i < rte_socket_count(); i++) {
+		int hp_sizes = (int) internal_config.num_hugepage_sizes;
+		uint64_t max_socket_mem, cur_socket_mem;
+		unsigned int master_lcore_socket;
+		struct rte_config *cfg = rte_eal_get_configuration();
+		bool skip;
+
+		socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+		if (socket_id > 0)
+			break;
+#endif
+
+		/* if we didn't specifically request memory on this socket */
+		skip = active_sockets != 0 &&
+				internal_config.socket_mem[socket_id] == 0;
+		/* ...or if we didn't specifically request memory on *any*
+		 * socket, and this is not master lcore
+		 */
+		master_lcore_socket = rte_lcore_to_socket_id(cfg->master_lcore);
+		skip |= active_sockets == 0 && socket_id != master_lcore_socket;
+
+		if (skip) {
+			RTE_LOG(DEBUG, EAL, "Will not preallocate memory on socket %u\n",
+					socket_id);
+			continue;
+		}
+
+		/* max amount of memory on this socket */
+		max_socket_mem = (active_sockets != 0 ?
+					internal_config.socket_mem[socket_id] :
+					internal_config.memory) +
+					extra_mem_per_socket;
+		cur_socket_mem = 0;
+
+		for (hpi_idx = 0; hpi_idx < hp_sizes; hpi_idx++) {
+			uint64_t max_pagesz_mem, cur_pagesz_mem = 0;
+			uint64_t hugepage_sz;
+			struct hugepage_info *hpi;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			hpi = &internal_config.hugepage_info[hpi_idx];
+			hugepage_sz = hpi->hugepage_sz;
+
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+			max_pagesz_mem = max_socket_mem - cur_socket_mem;
+
+			/* make it multiple of page size */
+			max_pagesz_mem = RTE_ALIGN_FLOOR(max_pagesz_mem,
+					hugepage_sz);
+
+			RTE_LOG(DEBUG, EAL, "Attempting to preallocate %" PRIu64 "M on socket %i\n",
+					max_pagesz_mem >> 20, socket_id);
+
+			type_msl_idx = 0;
+			while (cur_pagesz_mem < max_pagesz_mem &&
+					total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						max_pagesz_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				cur_pagesz_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			cur_socket_mem += cur_pagesz_mem;
+		}
+	}
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+	uint64_t max_mem, total_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	max_mem = (uint64_t)RTE_MAX_MEM_MB << 20;
+	total_mem = 0;
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (i = 0; i < (int) rte_socket_count(); i++) {
+			uint64_t max_type_mem, total_type_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+			if (socket_id > 0)
+				break;
+#endif
+
+			max_type_mem = RTE_MIN(max_mem - total_mem,
+				(uint64_t)RTE_MAX_MEM_MB_PER_TYPE << 20);
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_type_mem < max_type_mem &&
+					total_segs < max_segs) {
+				uint64_t cur_max_mem;
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				cur_max_mem = max_type_mem - total_type_mem;
+				if (alloc_memseg_list(msl, hugepage_sz,
+						cur_max_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_type_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			total_mem += total_type_mem;
+		}
+	}
+	return 0;
+}
+
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	const struct rte_fbarray *arr;
+	void *start, *end;
+	int ms_idx;
+
+	if (msl == NULL)
+		return NULL;
+
+	/* a memseg list was specified, check if it's the right one */
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, (size_t)msl->page_sz * msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start,
+				(size_t)msl->page_sz * msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	return virt2memseg_list(addr);
 }
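
The new lookups are plain index arithmetic once the owning list is known. A minimal usage sketch, assuming an initialized EAL, the usual rte_memory.h include, and a pointer buf obtained from rte_malloc() (the printf reporting is illustrative only):

	const struct rte_memseg_list *msl = rte_mem_virt2memseg_list(buf);

	if (msl == NULL) {
		printf("%p is not DPDK-managed memory\n", buf);
	} else {
		const struct rte_memseg *ms = rte_mem_virt2memseg(buf, msl);

		printf("backing page: va %p iova 0x%" PRIx64 " len %zu\n",
				ms->addr, ms->iova, ms->len);
	}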
 
 struct virtiova {
@@ -136,7 +517,8 @@ struct virtiova {
 	void *virt;
 };
 static int
-find_virt(const struct rte_memseg *ms, void *arg)
+find_virt(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct virtiova *vi = arg;
 	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
@@ -161,49 +543,19 @@ rte_mem_iova2virt(rte_iova_t iova)
 	return vi.virt;
 }
 
-struct virtms {
-	const void *virt;
-	struct rte_memseg *ms;
-};
-static int
-find_memseg(const struct rte_memseg *ms, void *arg)
-{
-	struct virtms *vm = arg;
-
-	if (arg >= ms->addr && arg < RTE_PTR_ADD(ms->addr, ms->len)) {
-		struct rte_memseg *memseg, *found_ms;
-		int idx;
-
-		memseg = rte_eal_get_configuration()->mem_config->memseg;
-		idx = ms - memseg;
-		found_ms = &memseg[idx];
-
-		vm->ms = found_ms;
-		return 1;
-	}
-	return 0;
-}
-
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *addr)
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
-	struct virtms vm;
-
-	memset(&vm, 0, sizeof(vm));
-
-	vm.virt = addr;
-
-	rte_memseg_walk(find_memseg, &vm);
-
-	return vm.ms;
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 static int
-physmem_size(const struct rte_memseg *ms, void *arg)
+physmem_size(const struct rte_memseg_list *msl, void *arg)
 {
 	uint64_t *total_len = arg;
 
-	*total_len += ms->len;
+	*total_len += msl->memseg_arr.count * msl->page_sz;
 
 	return 0;
 }
@@ -214,32 +566,39 @@ rte_eal_get_physmem_size(void)
 {
 	uint64_t total_len = 0;
 
-	rte_memseg_walk(physmem_size, &total_len);
+	rte_memseg_list_walk(physmem_size, &total_len);
 
 	return total_len;
 }
 
 static int
-dump_memseg(const struct rte_memseg *ms, void *arg)
+dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i = ms - mcfg->memseg;
+	int msl_idx, ms_idx;
 	FILE *f = arg;
 
-	if (i < 0 || i >= RTE_MAX_MEMSEG)
+	msl_idx = msl - mcfg->memsegs;
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
 		return -1;
 
-	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+	ms_idx = rte_fbarray_find_idx(&msl->memseg_arr, ms);
+	if (ms_idx < 0)
+		return -1;
+
+	fprintf(f, "Segment %i-%i: IOVA:0x%"PRIx64", len:%zu, "
 			"virt:%p, socket_id:%"PRId32", "
 			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-			"nrank:%"PRIx32"\n", i,
-			mcfg->memseg[i].iova,
-			mcfg->memseg[i].len,
-			mcfg->memseg[i].addr,
-			mcfg->memseg[i].socket_id,
-			mcfg->memseg[i].hugepage_sz,
-			mcfg->memseg[i].nchannel,
-			mcfg->memseg[i].nrank);
+			"nrank:%"PRIx32"\n",
+			msl_idx, ms_idx,
+			ms->iova,
+			ms->len,
+			ms->addr,
+			ms->socket_id,
+			ms->hugepage_sz,
+			ms->nchannel,
+			ms->nrank);
 
 	return 0;
 }
@@ -289,55 +648,89 @@ rte_mem_lock_page(const void *virt)
 }
 
 int __rte_experimental
-rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		ret = func(ms, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			int n_segs;
+			size_t len;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			/* find how many more segments there are, starting with
+			 * this one.
+			 */
+			n_segs = rte_fbarray_find_contig_used(arr, ms_idx);
+			len = n_segs * msl->page_sz;
+
+			ret = func(msl, ms, len, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx + n_segs);
+		}
 	}
 	return 0;
 }
 
 int __rte_experimental
-rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, j, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-		size_t total_len;
-		void *end_addr;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
 
-		/* check how many more segments are contiguous to this one */
-		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
-			const struct rte_memseg *next = &mcfg->memseg[j];
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
 
-			if (next->addr != end_addr)
-				break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
-			end_addr = RTE_PTR_ADD(next->addr, next->len);
-			i++;
-		}
-		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+		if (msl->base_va == NULL)
+			continue;
 
-		ret = func(ms, total_len, arg);
+		ret = func(msl, arg);
 		if (ret < 0)
 			return -1;
 		if (ret > 0)
@@ -350,9 +743,25 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 int
 rte_eal_memory_init(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	if (!mcfg)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+#ifndef RTE_ARCH_64
+			memseg_primary_init_32() :
+#else
+			memseg_primary_init() :
+#endif
+			memseg_secondary_init();
+
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 88f401f..529b36f 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -234,10 +234,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->page_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -399,20 +398,50 @@ static void
 dump_memzone(const struct rte_memzone *mz, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *cur_addr, *mz_end;
+	struct rte_memseg *ms;
+	int mz_idx, ms_idx;
+	size_t page_sz;
 	FILE *f = arg;
-	int mz_idx;
 
 	mz_idx = mz - mcfg->memzone;
 
-	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
 			mz_idx,
 			mz->name,
-			mz->iova,
 			mz->len,
 			mz->addr,
 			mz->socket_id,
 			mz->flags);
+
+	/* go through each page occupied by this memzone */
+	msl = rte_mem_virt2memseg_list(mz->addr);
+	if (!msl) {
+		RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+		return;
+	}
+	page_sz = (size_t)mz->hugepage_sz;
+	cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, page_sz);
+	mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+	fprintf(f, "physical segments used:\n");
+	ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) / page_sz;
+	ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+	do {
+		fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+				"len: 0x%zx "
+				"pagesz: 0x%zx\n",
+			cur_addr, ms->iova, ms->len, page_sz);
+
+		/* advance VA to next page */
+		cur_addr = RTE_PTR_ADD(cur_addr, page_sz);
+
+		/* memzones occupy contiguous segments */
+		++ms;
+	} while (cur_addr < mz_end);
 }
 
 /* Dump all reserved memory zones on console */
@@ -429,7 +458,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -438,12 +466,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..ad1b0b6 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -22,7 +22,6 @@ struct hugepage_file {
 	size_t size;        /**< the page size */
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
-	int memseg_id;      /**< the memory segment to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index fda087b..5cf7102 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..b745e18 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * A memseg list is a special case, as we need to store additional data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
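
Each list covers one contiguous stretch of VA space filled with equal-sized pages, so translating between an address and its fbarray index requires no search. A sketch of the arithmetic the rest of the series relies on, assuming msl points to a populated list and addr falls inside it:

	size_t off = RTE_PTR_DIFF(addr, msl->base_va);
	int ms_idx = off / msl->page_sz; /* index of the backing page */
	struct rte_memseg *ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);

	/* and back again: page start address from its index */
	void *pg_addr = RTE_PTR_ADD(msl->base_va,
			(size_t)ms_idx * msl->page_sz);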
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b3d7e61..55383c4 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -23,6 +23,9 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -151,7 +154,18 @@ rte_mem_iova2virt(rte_iova_t iova);
  *   Memseg pointer on success, or NULL on error.
  */
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *virt);
+rte_mem_virt2memseg(const void *virt, const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg list to which this virtual address belongs.
+ */
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Memseg walk function prototype.
@@ -160,7 +174,8 @@ rte_mem_virt2memseg(const void *virt);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, void *arg);
 
 /**
  * Memseg contig walk function prototype. This will trigger a callback on every
@@ -171,8 +186,19 @@ typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
-		size_t len, void *arg);
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg);
+
+/**
+ * Memseg list walk function prototype. This will trigger a callback on every
+ * allocated memseg list.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
+		void *arg);
 
 /**
  * Walk list of all memsegs.
@@ -205,21 +231,19 @@ int __rte_experimental
 rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 
 /**
- * Get the layout of the available physical memory.
- *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * Walk each allocated memseg list.
  *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
  */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
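
With rte_eal_get_physmem_layout() removed, the walk functions are the supported way to inspect the memory layout, and every callback now also receives the owning memseg list. A sketch of a per-socket accounting callback under the new signature; the accumulator array (one uint64_t per NUMA node) is an assumption of this example:

	static int
	socket_mem_cb(const struct rte_memseg_list *msl,
			const struct rte_memseg *ms, void *arg)
	{
		uint64_t *per_socket = arg; /* RTE_MAX_NUMA_NODES entries */

		per_socket[msl->socket_id] += ms->len;
		return 0; /* 0 continues the walk, 1 stops it, -1 is an error */
	}

	/* usage:
	 *	uint64_t mem[RTE_MAX_NUMA_NODES] = {0};
	 *	rte_memseg_walk(socket_mem_cb, mem);
	 */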
 
 /**
  * Dump the physical memory layout to a file.
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index ef3a4dd..6d4bdf1 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -67,7 +67,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 87695b9..685aac4 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -27,11 +27,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -100,7 +100,7 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
 		void *start, size_t size)
 {
 	rte_iova_t cur, expected;
@@ -191,7 +191,7 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
 			 * couldn't fit all data into one physically contiguous
 			 * block, try again with lower addresses.
 			 */
-			if (!elem_check_phys_contig(elem->ms,
+			if (!elem_check_phys_contig(elem->msl,
 					(void *)new_data_start,
 					new_data_size)) {
 				elem_size -= align;
@@ -225,7 +225,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 34bd268..620dd44 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -7,7 +7,7 @@
 
 #include <stdbool.h>
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -26,7 +26,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -113,7 +113,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 79914fc..0ef2c45 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,36 +63,49 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
+{
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
+
+	malloc_elem_free_list_insert(elem);
+
+	heap->total_size += len;
+
+	return elem;
+}
+
 static int
-malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
+malloc_add_seg(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg __rte_unused)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_elem *start_elem;
-	struct rte_memseg *found_ms;
+	struct rte_memseg_list *found_msl;
 	struct malloc_heap *heap;
-	size_t elem_size;
-	int ms_idx;
-
-	heap = &mcfg->malloc_heaps[ms->socket_id];
+	int msl_idx;
 
-	/* ms is const, so find it */
-	ms_idx = ms - mcfg->memseg;
-	found_ms = &mcfg->memseg[ms_idx];
+	heap = &mcfg->malloc_heaps[msl->socket_id];
 
-	start_elem = (struct malloc_elem *)found_ms->addr;
-	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	/* msl is const, so find it */
+	msl_idx = msl - mcfg->memsegs;
+
+	/* check bounds before using the index */
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
+		return -1;
 
-	malloc_elem_init(start_elem, heap, found_ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	found_msl = &mcfg->memsegs[msl_idx];
 
-	heap->total_size += elem_size;
+	malloc_heap_add_memory(heap, found_msl, ms->addr, len);
 
+	RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20,
+			msl->socket_id);
 	return 0;
 }
 
@@ -114,7 +128,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound,
 					contig)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->page_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -263,7 +278,6 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
-	rte_memseg_walk(malloc_heap_add_memseg, NULL);
-
-	return 0;
+	/* add all IOVA-contiguous areas to the heap */
+	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
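
Seeding the heap through rte_memseg_contig_walk() means each initial malloc element spans a whole contiguous run of pages rather than a single page. The same walk is useful for applications that want to know the largest allocation they can hope for; a sketch, where the result struct is an assumption of this example:

	struct contig_chunk { void *addr; size_t len; };

	static int
	biggest_chunk_cb(const struct rte_memseg_list *msl __rte_unused,
			const struct rte_memseg *ms, size_t len, void *arg)
	{
		struct contig_chunk *c = arg;

		if (len > c->len) {
			c->addr = ms->addr;
			c->len = len;
		}
		return 0;
	}

	/* usage:
	 *	struct contig_chunk c = {0};
	 *	rte_memseg_contig_walk(biggest_chunk_cb, &c);
	 */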
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 436818a..c6d3e57 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -242,17 +242,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t)addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b34e57a..ffcbd71 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -640,11 +640,14 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
-	int *socket = arg;
+	int *socket_id = arg;
 
-	return ms->socket_id == *socket;
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
+		return 1;
+
+	return 0;
 }
 
 static void
@@ -654,7 +657,7 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..afebd42 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -160,6 +161,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -189,6 +202,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -205,11 +220,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = get_file_size(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
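
Hugepage files are now protected with byte-range fcntl() locks instead of flock(), here and again when the pages are mapped in eal_memory.c, presumably so that the locked range can be made to match exactly what is mapped. The take-and-release pattern, with fd and len assumed to describe an already opened hugepage file:

	struct flock lck = {
		.l_type = F_RDLCK, /* shared lock; F_WRLCK for exclusive */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = len, /* lock only this byte range */
	};

	if (fcntl(fd, F_SETLK, &lck) == -1)
		return -1; /* another process holds the lock */

	/* ... use the file ... */

	lck.l_type = F_UNLCK;
	fcntl(fd, F_SETLK, &lck);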
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 1d3defe..d38fb68 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -253,13 +253,12 @@ void numa_error(char *where)
  */
 static unsigned
 map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
-		  uint64_t *essential_memory __rte_unused, int orig)
+		  uint64_t *essential_memory __rte_unused)
 {
 	int fd;
 	unsigned i;
 	void *virtaddr;
-	void *vma_addr = NULL;
-	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -274,7 +273,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		have_numa = false;
 	}
 
-	if (orig && have_numa) {
+	if (have_numa) {
 		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
 		if (get_mempolicy(&oldpolicy, oldmask->maskp,
 				  oldmask->size + 1, 0, 0) < 0) {
@@ -290,6 +289,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 #endif
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
+		struct hugepage_file *hf = &hugepg_tbl[i];
 		uint64_t hugepage_sz = hpi->hugepage_sz;
 
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
@@ -324,66 +324,14 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 #endif
 
-		if (orig) {
-			hugepg_tbl[i].file_id = i;
-			hugepg_tbl[i].size = hugepage_sz;
-			eal_get_hugefile_path(hugepg_tbl[i].filepath,
-					sizeof(hugepg_tbl[i].filepath), hpi->hugedir,
-					hugepg_tbl[i].file_id);
-			hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
-		}
-#ifndef RTE_ARCH_64
-		/* for 32-bit systems, don't remap 1G and 16G pages, just reuse
-		 * original map address as final map address.
-		 */
-		else if ((hugepage_sz == RTE_PGSIZE_1G)
-			|| (hugepage_sz == RTE_PGSIZE_16G)) {
-			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
-			hugepg_tbl[i].orig_va = NULL;
-			continue;
-		}
-#endif
-		else if (vma_len == 0) {
-			unsigned j, num_pages;
-
-			/* reserve a virtual area for next contiguous
-			 * physical block: count the number of
-			 * contiguous physical pages. */
-			for (j = i+1; j < hpi->num_pages[0] ; j++) {
-#ifdef RTE_ARCH_PPC_64
-				/* The physical addresses are sorted in
-				 * descending order on PPC64 */
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr - hugepage_sz)
-					break;
-#else
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr + hugepage_sz)
-					break;
-#endif
-			}
-			num_pages = j - i;
-			vma_len = num_pages * hugepage_sz;
-
-			/* get the biggest virtual memory area up to
-			 * vma_len. If it fails, vma_addr is NULL, so
-			 * let the kernel provide the address. */
-			vma_addr = eal_get_virtual_area(NULL, &vma_len,
-					hpi->hugepage_sz,
-					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
-					EAL_VIRTUAL_AREA_UNMAP,
-#ifdef RTE_ARCH_PPC_64
-					MAP_HUGETLB
-#else
-					0
-#endif
-					);
-			if (vma_addr == NULL)
-				vma_len = hugepage_sz;
-		}
+		hf->file_id = i;
+		hf->size = hugepage_sz;
+		eal_get_hugefile_path(hf->filepath, sizeof(hf->filepath),
+				hpi->hugedir, hf->file_id);
+		hf->filepath[sizeof(hf->filepath) - 1] = '\0';
 
 		/* try to create hugepage file */
-		fd = open(hugepg_tbl[i].filepath, O_CREAT | O_RDWR, 0600);
+		fd = open(hf->filepath, O_CREAT | O_RDWR, 0600);
 		if (fd < 0) {
 			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
 					strerror(errno));
@@ -391,8 +339,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		/* map the segment, and populate page tables,
-		 * the kernel fills this segment with zeros */
-		virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
+		 * the kernel fills this segment with zeros. We don't care where
+		 * this gets mapped, since we already have contiguous memory
+		 * areas ready to map pages into.
+		 */
+		virtaddr = mmap(NULL, hugepage_sz, PROT_READ | PROT_WRITE,
 				MAP_SHARED | MAP_POPULATE, fd, 0);
 		if (virtaddr == MAP_FAILED) {
 			RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
@@ -401,44 +352,38 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			goto out;
 		}
 
-		if (orig) {
-			hugepg_tbl[i].orig_va = virtaddr;
-		}
-		else {
-			/* rewrite physical addresses in IOVA as VA mode */
-			if (rte_eal_iova_mode() == RTE_IOVA_VA)
-				hugepg_tbl[i].physaddr = (uintptr_t)virtaddr;
-			hugepg_tbl[i].final_va = virtaddr;
-		}
+		hf->orig_va = virtaddr;
 
-		if (orig) {
-			/* In linux, hugetlb limitations, like cgroup, are
-			 * enforced at fault time instead of mmap(), even
-			 * with the option of MAP_POPULATE. Kernel will send
-			 * a SIGBUS signal. To avoid to be killed, save stack
-			 * environment here, if SIGBUS happens, we can jump
-			 * back here.
-			 */
-			if (huge_wrap_sigsetjmp()) {
-				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
-					"hugepages of size %u MB\n",
-					(unsigned)(hugepage_sz / 0x100000));
-				munmap(virtaddr, hugepage_sz);
-				close(fd);
-				unlink(hugepg_tbl[i].filepath);
+		/* In Linux, hugetlb limitations such as cgroup limits are
+		 * enforced at fault time rather than at mmap() time, even
+		 * with MAP_POPULATE, and the kernel sends a SIGBUS signal.
+		 * To avoid being killed, save the stack environment here;
+		 * if SIGBUS happens, we can jump back to it.
+		 */
+		if (huge_wrap_sigsetjmp()) {
+			RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
+				"hugepages of size %u MB\n",
+				(unsigned int)(hugepage_sz / 0x100000));
+			munmap(virtaddr, hugepage_sz);
+			close(fd);
+			unlink(hugepg_tbl[i].filepath);
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-				if (maxnode)
-					essential_memory[node_id] =
-						essential_prev;
+			if (maxnode)
+				essential_memory[node_id] =
+					essential_prev;
 #endif
-				goto out;
-			}
-			*(int *)virtaddr = 0;
+			goto out;
 		}
+		*(int *)virtaddr = 0;
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -446,9 +391,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		close(fd);
-
-		vma_addr = (char *)vma_addr + hugepage_sz;
-		vma_len -= hugepage_sz;
 	}
 
 out:
@@ -470,20 +412,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	return i;
 }
 
-/* Unmap all hugepages from original mapping */
-static int
-unmap_all_hugepages_orig(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
-{
-        unsigned i;
-        for (i = 0; i < hpi->num_pages[0]; i++) {
-                if (hugepg_tbl[i].orig_va) {
-                        munmap(hugepg_tbl[i].orig_va, hpi->hugepage_sz);
-                        hugepg_tbl[i].orig_va = NULL;
-                }
-        }
-        return 0;
-}
-
 /*
  * Parse /proc/self/numa_maps to get the NUMA socket ID for each huge
  * page.
@@ -623,7 +551,7 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, int dest_size,
 	int src_pos, dst_pos = 0;
 
 	for (src_pos = 0; src_pos < src_size; src_pos++) {
-		if (src[src_pos].final_va != NULL) {
+		if (src[src_pos].orig_va != NULL) {
 			/* error on overflow attempt */
 			if (dst_pos == dest_size)
 				return -1;
@@ -694,9 +622,10 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 						unmap_len = hp->size;
 
 						/* get start addr and len of the remaining segment */
-						munmap(hp->final_va, (size_t) unmap_len);
+						munmap(hp->orig_va,
+							(size_t)unmap_len);
 
-						hp->final_va = NULL;
+						hp->orig_va = NULL;
 						if (unlink(hp->filepath) == -1) {
 							RTE_LOG(ERR, EAL, "%s(): Removing %s failed: %s\n",
 									__func__, hp->filepath, strerror(errno));
@@ -715,6 +644,413 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 	return 0;
 }
 
+static int
+remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int cur_page, seg_len;
+	unsigned int msl_idx;
+	int ms_idx;
+	uint64_t page_sz;
+	size_t memseg_len;
+	int socket_id;
+
+	page_sz = hugepages[seg_start].size;
+	socket_id = hugepages[seg_start].socket_id;
+	seg_len = seg_end - seg_start;
+
+	RTE_LOG(DEBUG, EAL, "Attempting to map %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20ULL, socket_id);
+
+	/* find free space in memseg lists */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool empty;
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		if (msl->page_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket_id)
+			continue;
+
+		/* leave space for a hole if array is not empty */
+		empty = arr->count == 0;
+		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
+				seg_len + (empty ? 0 : 1));
+
+		/* memseg list is full? */
+		if (ms_idx < 0)
+			continue;
+
+		/* leave some space between memsegs, they are not IOVA
+		 * contiguous, so they shouldn't be VA contiguous either.
+		 */
+		if (!empty)
+			ms_idx++;
+		break;
+	}
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+				RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+				RTE_STR(CONFIG_RTE_MAX_MEM_PER_TYPE));
+		return -1;
+	}
+
+#ifdef RTE_ARCH_PPC_64
+	/* for PPC64 we go through the list backwards */
+	for (cur_page = seg_end - 1; cur_page >= seg_start;
+			cur_page--, ms_idx++) {
+#else
+	for (cur_page = seg_start; cur_page < seg_end; cur_page++, ms_idx++) {
+#endif
+		struct hugepage_file *hfile = &hugepages[cur_page];
+		struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+		struct flock lck;
+		void *addr;
+		int fd;
+
+		fd = open(hfile->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			return -1;
+		}
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = page_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "Could not lock '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+		memseg_len = (size_t)page_sz;
+		addr = RTE_PTR_ADD(msl->base_va, ms_idx * memseg_len);
+
+		/* we know this address is already mmapped by memseg list, so
+		 * using MAP_FIXED here is safe
+		 */
+		addr = mmap(addr, page_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Couldn't remap '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		/* we have a new address, so unmap previous one */
+#ifndef RTE_ARCH_64
+		/* in 32-bit legacy mode, we have already unmapped the page */
+		if (!internal_config.legacy_mem)
+			munmap(hfile->orig_va, page_sz);
+#else
+		munmap(hfile->orig_va, page_sz);
+#endif
+
+		hfile->orig_va = NULL;
+		hfile->final_va = addr;
+
+		/* rewrite physical addresses in IOVA as VA mode */
+		if (rte_eal_iova_mode() == RTE_IOVA_VA)
+			hfile->physaddr = (uintptr_t)addr;
+
+		/* set up memseg data */
+		ms->addr = addr;
+		ms->hugepage_sz = page_sz;
+		ms->len = memseg_len;
+		ms->iova = hfile->physaddr;
+		ms->socket_id = hfile->socket_id;
+		ms->nchannel = rte_memory_get_nchannel();
+		ms->nrank = rte_memory_get_nrank();
+
+		rte_fbarray_set_used(arr, ms_idx);
+
+		close(fd);
+	}
+	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20, socket_id);
+	return 0;
+}
+
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int n_segs, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, n_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
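
Bringing up an empty memseg list is therefore a two-step operation: create the fbarray of memseg slots, then reserve (but do not yet back) a matching stretch of VA space. A sketch of the combined sequence using the two helpers above; the wrapper name is an assumption of this example:

	static int
	setup_one_list(struct rte_memseg_list *msl, uint64_t page_sz,
			int n_segs, int socket_id, int list_idx)
	{
		if (alloc_memseg_list(msl, page_sz, n_segs, socket_id,
				list_idx) < 0)
			return -1;

		/* reserves page_sz * n_segs bytes of VA; pages come later */
		return alloc_va_space(msl);
	}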
+
+/*
+ * Our VA space is not preallocated yet, so preallocate it here. We need to know
+ * how many segments there are in order to map all pages into one address space,
+ * and leave appropriate holes between segments so that rte_malloc does not
+ * concatenate them into one big segment.
+ *
+ * We also need to unmap the original pages to free up address space.
+ */
+static int __rte_unused
+prealloc_segments(struct hugepage_file *hugepages, int n_pages)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int cur_page, seg_start_page, end_seg, new_memseg;
+	unsigned int hpi_idx, socket, i;
+	int n_contig_segs, n_segs;
+	int msl_idx;
+
+	/* before we preallocate segments, we need to free up our VA space.
+	 * we're not removing files, and we already have information about
+	 * PA-contiguousness, so it is safe to unmap everything.
+	 */
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *hpi = &hugepages[cur_page];
+		munmap(hpi->orig_va, hpi->size);
+		hpi->orig_va = NULL;
+	}
+
+	/* we do not know in advance how many page sizes and sockets were
+	 * discovered, so loop over all of them
+	 */
+	for (hpi_idx = 0; hpi_idx < internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		uint64_t page_sz =
+			internal_config.hugepage_info[hpi_idx].hugepage_sz;
+
+		for (i = 0; i < rte_socket_count(); i++) {
+			struct rte_memseg_list *msl;
+
+			socket = rte_socket_id_by_idx(i);
+			n_contig_segs = 0;
+			n_segs = 0;
+			seg_start_page = -1;
+
+			for (cur_page = 0; cur_page < n_pages; cur_page++) {
+				struct hugepage_file *prev, *cur;
+				int prev_seg_start_page = -1;
+
+				cur = &hugepages[cur_page];
+				prev = cur_page == 0 ? NULL :
+						&hugepages[cur_page - 1];
+
+				new_memseg = 0;
+				end_seg = 0;
+
+				if (cur->size == 0)
+					end_seg = 1;
+				else if (cur->socket_id != (int) socket)
+					end_seg = 1;
+				else if (cur->size != page_sz)
+					end_seg = 1;
+				else if (cur_page == 0)
+					new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+				/* On PPC64 architecture, mmap always starts
+				 * from a higher address to a lower one. Here,
+				 * physical addresses are in descending order.
+				 */
+				else if ((prev->physaddr - cur->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#else
+				else if ((cur->physaddr - prev->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#endif
+				if (new_memseg) {
+					/* if we're already inside a segment,
+					 * new segment means end of current one
+					 */
+					if (seg_start_page != -1) {
+						end_seg = 1;
+						prev_seg_start_page =
+								seg_start_page;
+					}
+					seg_start_page = cur_page;
+				}
+
+				if (end_seg) {
+					if (prev_seg_start_page != -1) {
+						/* we've found a new segment */
+						n_contig_segs++;
+						n_segs += cur_page -
+							prev_seg_start_page;
+					} else if (seg_start_page != -1) {
+						/* we didn't find new segment,
+						 * but did end current one
+						 */
+						n_contig_segs++;
+						n_segs += cur_page -
+								seg_start_page;
+						seg_start_page = -1;
+						continue;
+					} else {
+						/* we're skipping this page */
+						continue;
+					}
+				}
+				/* segment continues */
+			}
+			/* check if we missed last segment */
+			if (seg_start_page != -1) {
+				n_contig_segs++;
+				n_segs += cur_page - seg_start_page;
+			}
+
+			/* if no segments were found, do not preallocate */
+			if (n_segs == 0)
+				continue;
+
+			/* we now have total number of pages that we will
+			 * allocate for this segment list. add separator pages
+			 * to the total count, and preallocate VA space.
+			 */
+			n_segs += n_contig_segs - 1;
+
+			/* now, preallocate VA space for these segments */
+
+			/* first, find suitable memseg list for this */
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				msl = &mcfg->memsegs[msl_idx];
+
+				if (msl->base_va != NULL)
+					continue;
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Not enough space in memseg lists, please increase %s\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+				return -1;
+			}
+
+			/* now, allocate fbarray itself */
+			if (alloc_memseg_list(msl, page_sz, n_segs, socket,
+						msl_idx) < 0)
+				return -1;
+
+			/* finally, allocate VA space */
+			if (alloc_va_space(msl) < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+/*
+ * We cannot reallocate memseg lists on the fly because PPC64 stores pages
+ * backwards, so we have to process the entire segment first before
+ * remapping it into memseg list VA space.
+ */
+static int
+remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
+{
+	int cur_page, seg_start_page, new_memseg, ret;
+
+	seg_start_page = 0;
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *prev, *cur;
+
+		new_memseg = 0;
+
+		cur = &hugepages[cur_page];
+		prev = cur_page == 0 ? NULL : &hugepages[cur_page - 1];
+
+		/* if size is zero, no more pages left */
+		if (cur->size == 0)
+			break;
+
+		if (cur_page == 0)
+			new_memseg = 1;
+		else if (cur->socket_id != prev->socket_id)
+			new_memseg = 1;
+		else if (cur->size != prev->size)
+			new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+		/* On PPC64 architecture, mmap always starts from a higher
+		 * address to a lower one. Here, physical addresses are in
+		 * descending order.
+		 */
+		else if ((prev->physaddr - cur->physaddr) != cur->size)
+			new_memseg = 1;
+#else
+		else if ((cur->physaddr - prev->physaddr) != cur->size)
+			new_memseg = 1;
+#endif
+
+		if (new_memseg) {
+			/* if this isn't the first time, remap segment */
+			if (cur_page != 0) {
+				ret = remap_segment(hugepages, seg_start_page,
+						cur_page);
+				if (ret != 0)
+					return -1;
+			}
+			/* remember where we started */
+			seg_start_page = cur_page;
+		}
+		/* continuation of previous memseg */
+	}
+	/* the loop never remaps the last segment, so do it now */
+	if (cur_page != 0) {
+		ret = remap_segment(hugepages, seg_start_page,
+				cur_page);
+		if (ret != 0)
+			return -1;
+	}
+	return 0;
+}
+
 static inline uint64_t
 get_socket_mem_size(int socket)
 {
@@ -753,8 +1089,10 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 	/* if specific memory amounts per socket weren't requested */
 	if (internal_config.force_sockets == 0) {
+		size_t total_size;
+#ifdef RTE_ARCH_64
 		int cpu_per_socket[RTE_MAX_NUMA_NODES];
-		size_t default_size, total_size;
+		size_t default_size;
 		unsigned lcore_id;
 
 		/* Compute number of cores per socket */
@@ -772,7 +1110,7 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 			/* Set memory amount per socket */
 			default_size = (internal_config.memory * cpu_per_socket[socket])
-			                / rte_lcore_count();
+					/ rte_lcore_count();
 
 			/* Limit to maximum available memory on socket */
 			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
@@ -789,12 +1127,33 @@ calc_num_pages_per_socket(uint64_t * memory,
 		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
 			/* take whatever is available */
 			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
-			                       total_size);
+					       total_size);
 
 			/* Update sizes */
 			memory[socket] += default_size;
 			total_size -= default_size;
 		}
+#else
+		/* in 32-bit mode, allocate all of the memory only on master
+		 * lcore socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0;
+				socket++) {
+			struct rte_config *cfg = rte_eal_get_configuration();
+			unsigned int master_lcore_socket;
+
+			master_lcore_socket =
+				rte_lcore_to_socket_id(cfg->master_lcore);
+
+			if (master_lcore_socket != socket)
+				continue;
+
+			/* Update sizes */
+			memory[socket] = total_size;
+			break;
+		}
+#endif
 	}
 
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
@@ -842,7 +1201,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 			}
 		}
 		/* if we didn't satisfy all memory requirements per socket */
-		if (memory[socket] > 0) {
+		if (memory[socket] > 0 &&
+				internal_config.socket_mem[socket] != 0) {
 			/* to prevent icc errors */
 			requested = (unsigned) (internal_config.socket_mem[socket] /
 					0x100000);
@@ -928,11 +1288,13 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
-	int i, j, new_memseg;
+	int i, j;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -945,6 +1307,25 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		struct rte_memseg_list *msl;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				     sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -952,14 +1333,27 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
-		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is one page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->len = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, (size_t)page_sz);
+		}
 		return 0;
 	}
 
@@ -992,7 +1386,6 @@ eal_legacy_hugepage_init(void)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		memory[i] = internal_config.socket_mem[i];
 
-
 	/* map all hugepages and sort them */
 	for (i = 0; i < (int)internal_config.num_hugepage_sizes; i ++){
 		unsigned pages_old, pages_new;
@@ -1010,8 +1403,7 @@ eal_legacy_hugepage_init(void)
 
 		/* map all hugepages available */
 		pages_old = hpi->num_pages[0];
-		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi,
-					      memory, 1);
+		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, memory);
 		if (pages_new < pages_old) {
 			RTE_LOG(DEBUG, EAL,
 				"%d not %d hugepages of size %u MB allocated\n",
@@ -1054,18 +1446,6 @@ eal_legacy_hugepage_init(void)
 		qsort(&tmp_hp[hp_offset], hpi->num_pages[0],
 		      sizeof(struct hugepage_file), cmp_physaddr);
 
-		/* remap all hugepages */
-		if (map_all_hugepages(&tmp_hp[hp_offset], hpi, NULL, 0) !=
-		    hpi->num_pages[0]) {
-			RTE_LOG(ERR, EAL, "Failed to remap %u MB pages\n",
-					(unsigned)(hpi->hugepage_sz / 0x100000));
-			goto fail;
-		}
-
-		/* unmap original mappings */
-		if (unmap_all_hugepages_orig(&tmp_hp[hp_offset], hpi) < 0)
-			goto fail;
-
 		/* we have processed a num of hugepages of this size, so inc offset */
 		hp_offset += hpi->num_pages[0];
 	}
@@ -1148,7 +1528,7 @@ eal_legacy_hugepage_init(void)
 
 	/*
 	 * copy stuff from malloc'd hugepage* to the actual shared memory.
-	 * this procedure only copies those hugepages that have final_va
+	 * this procedure only copies those hugepages that have orig_va
 	 * not NULL. has overflow protection.
 	 */
 	if (copy_hugepages_to_shared_mem(hugepage, nr_hugefiles,
@@ -1157,6 +1537,23 @@ eal_legacy_hugepage_init(void)
 		goto fail;
 	}
 
+#ifndef RTE_ARCH_64
+	/* for legacy 32-bit mode, we did not preallocate VA space, so do it */
+	if (internal_config.legacy_mem &&
+			prealloc_segments(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Could not preallocate VA space for hugepages\n");
+		goto fail;
+	}
+#endif
+
+	/* remap all pages we do need into memseg list VA space, so that those
+	 * pages become first-class citizens in DPDK memory subsystem
+	 */
+	if (remap_needed_hugepages(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Couldn't remap hugepage files into memseg lists\n");
+		goto fail;
+	}
+
 	/* free the hugepage backing files */
 	if (internal_config.hugepage_unlink &&
 		unlink_hugepage_files(tmp_hp, internal_config.num_hugepage_sizes) < 0) {
@@ -1168,75 +1565,30 @@ eal_legacy_hugepage_init(void)
 	free(tmp_hp);
 	tmp_hp = NULL;
 
-	/* first memseg index shall be 0 after incrementing it below */
-	j = -1;
-	for (i = 0; i < nr_hugefiles; i++) {
-		new_memseg = 0;
-
-		/* if this is a new section, create a new memseg */
-		if (i == 0)
-			new_memseg = 1;
-		else if (hugepage[i].socket_id != hugepage[i-1].socket_id)
-			new_memseg = 1;
-		else if (hugepage[i].size != hugepage[i-1].size)
-			new_memseg = 1;
-
-#ifdef RTE_ARCH_PPC_64
-		/* On PPC64 architecture, the mmap always start from higher
-		 * virtual address to lower address. Here, both the physical
-		 * address and virtual address are in descending order */
-		else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i-1].final_va -
-		    (unsigned long)hugepage[i].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#else
-		else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i].final_va -
-		    (unsigned long)hugepage[i-1].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#endif
+	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
 
-		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
-				break;
+	/* we're not going to allocate more pages, so release VA space for
+	 * unused memseg lists
+	 */
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		size_t mem_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
-		}
-		/* continuation of previous memseg */
-		else {
-#ifdef RTE_ARCH_PPC_64
-		/* Use the phy and virt address of the last page as segment
-		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-#endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
-		}
-		hugepage[i].memseg_id = j;
-	}
+		/* skip inactive lists */
+		if (msl->base_va == NULL)
+			continue;
+		/* skip lists where there is at least one page allocated */
+		if (msl->memseg_arr.count > 0)
+			continue;
+		/* this is an unused list, deallocate it */
+		mem_sz = (size_t)msl->page_sz * msl->memseg_arr.len;
+		munmap(msl->base_va, mem_sz);
+		msl->base_va = NULL;
 
-	if (i < nr_hugefiles) {
-		RTE_LOG(ERR, EAL, "Can only reserve %d pages "
-			"from %d requested\n"
-			"Current %s=%d is not enough\n"
-			"Please either increase it or request less amount "
-			"of memory.\n",
-			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
-		goto fail;
+		/* destroy backing fbarray */
+		rte_fbarray_destroy(&msl->memseg_arr);
 	}
 
-	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
-
 	return 0;
 
 fail:
@@ -1269,11 +1621,10 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i = 0;
+	unsigned int cur_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1292,50 +1643,6 @@ eal_legacy_hugepage_attach(void)
 		goto error;
 	}
 
-	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
-		size_t mmap_sz;
-		int mmap_flags = 0;
-
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
-
-		/* get identical addresses as the primary process.
-		 */
-#ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
-#endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p]: '%s'\n",
-					(unsigned long long)mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
-			}
-			goto error;
-		}
-	}
-
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
@@ -1346,46 +1653,49 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				offset+=mapping_size;
-			}
+	/* map all segments into memory to make sure we get the addrs. the
+	 * segments themselves are already in memseg list (which is shared and
+	 * has its VA space already preallocated), so we just need to map
+	 * everything into correct addresses.
+	 */
+	for (i = 0; i < num_hp; i++) {
+		struct hugepage_file *hf = &hp[i];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+		struct flock lck;
+
+		/* if size is zero, no more pages left */
+		if (map_sz == 0)
+			break;
+
+		fd = open(hf->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
+
+		map_addr = mmap(map_addr, map_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_FIXED, fd, 0);
+		if (map_addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Could not map %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
+		}
+
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = map_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed: %s\n",
+				__func__, strerror(errno));
+			close(fd);
+			goto error;
+		}
+
+		close(fd);
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1393,8 +1703,15 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unmap whatever we managed to map before the failure */
+	cur_seg = 0;
+	for (cur_seg = 0; cur_seg < i; cur_seg++) {
+		struct hugepage_file *hf = &hp[cur_seg];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+
+		munmap(map_addr, map_sz);
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index f6fe93e..2c27063 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -686,7 +686,8 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-type1_map(const struct rte_memseg *ms, void *arg)
+type1_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -799,7 +800,8 @@ vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 }
 
 static int
-vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_map_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -812,7 +814,8 @@ struct spapr_walk_param {
 	uint64_t hugepage_sz;
 };
 static int
-vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 3a12112..df5802d 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -241,7 +240,9 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_memseg_contig_walk;
+	rte_memseg_list_walk;
 	rte_memseg_walk;
 	rte_memzone_reserve_contig;
 	rte_memzone_reserve_aligned_contig;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index bb33c3a..38fb1ba 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -100,12 +100,12 @@ static unsigned optimize_object_size(unsigned obj_size)
 }
 
 static int
-find_min_pagesz(const struct rte_memseg *ms, void *arg)
+find_min_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	size_t *min = arg;
 
-	if (ms->hugepage_sz < *min)
-		*min = ms->hugepage_sz;
+	if (msl->page_sz < *min)
+		*min = msl->page_sz;
 
 	return 0;
 }
@@ -115,11 +115,12 @@ get_min_page_size(void)
 {
 	size_t min_pagesz = SIZE_MAX;
 
-	rte_memseg_walk(find_min_pagesz, &min_pagesz);
+	rte_memseg_list_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
 
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 578ad04..805bf04 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -706,36 +707,20 @@ test_malloc_bad_params(void)
 }
 
 static int
-check_socket_mem(const struct rte_memseg *ms, void *arg)
+check_socket_mem(const struct rte_memseg_list *msl, void *arg)
 {
 	int32_t *socket = arg;
 
-	return *socket == ms->socket_id;
+	return *socket == msl->socket_id;
 }
 
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	return rte_memseg_walk(check_socket_mem, &socket);
+	return rte_memseg_list_walk(check_socket_mem, &socket);
 }
 
-struct walk_param {
-	void *addr;
-	int32_t socket;
-};
-static int
-find_socket(const struct rte_memseg *ms, void *arg)
-{
-	struct walk_param *param = arg;
-
-	if (param->addr >= ms->addr &&
-			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
-		param->socket = ms->socket_id;
-		return 1;
-	}
-	return 0;
-}
 
 /*
  * Find what socket a memory address is on. Only works for addresses within
@@ -744,10 +729,9 @@ find_socket(const struct rte_memseg *ms, void *arg)
 static int32_t
 addr_to_socket(void * addr)
 {
-	struct walk_param param = {.addr = addr, .socket = 0};
-	if (rte_memseg_walk(find_socket, &param) > 0)
-		return param.socket;
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
+
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index c9b287c..b96bca7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -23,12 +26,13 @@
  */
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+check_mem(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg __rte_unused)
 {
 	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
-	size_t i;
+	size_t i, max = ms->len;
 
-	for (i = 0; i < ms->len; i++, mem++)
+	for (i = 0; i < max; i++, mem++)
 		*mem;
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index cbf0cfa..0046f04 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -111,17 +111,17 @@ struct walk_arg {
 	int hugepage_16GB_avail;
 };
 static int
-find_available_pagesz(const struct rte_memseg *ms, void *arg)
+find_available_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	struct walk_arg *wa = arg;
 
-	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+	if (msl->page_sz == RTE_PGSIZE_2M)
 		wa->hugepage_2MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+	if (msl->page_sz == RTE_PGSIZE_1G)
 		wa->hugepage_1GB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+	if (msl->page_sz == RTE_PGSIZE_16M)
 		wa->hugepage_16MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+	if (msl->page_sz == RTE_PGSIZE_16G)
 		wa->hugepage_16GB_avail = 1;
 
 	return 0;
@@ -138,7 +138,7 @@ test_memzone_reserve_flags(void)
 
 	memset(&wa, 0, sizeof(wa));
 
-	rte_memseg_walk(find_available_pagesz, &wa);
+	rte_memseg_list_walk(find_available_pagesz, &wa);
 
 	hugepage_2MB_avail = wa.hugepage_2MB_avail;
 	hugepage_1GB_avail = wa.hugepage_1GB_avail;
-- 
2.7.4


* [PATCH v3 50/68] eal: replace memzone array with fbarray
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (51 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 49/68] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-05 14:23     ` Shreyansh Jain
  2018-04-03 23:22   ` [PATCH v3 51/68] eal: add support for mapping hugepages at runtime Anatoly Burakov
                     ` (17 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The fbarray API is already there, so we might as well use it for the
memzone descriptor table; some operations (such as finding a free
memzone slot) are sped up as a result.

Since we have to allocate an fbarray for memzones, we have to do it
before we initialize the memory subsystem: in secondary processes,
memory init will (later) allocate more fbarrays than exist in the
primary process, and attaching to the memzone fbarray after the fact
would then fail.
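
For illustration only (not part of the diff below), the fbarray walk
pattern that replaces the old fixed-size array scans boils down to
something like this, assuming the mem_config layout introduced by this
patch:

#include <rte_eal_memconfig.h>
#include <rte_fbarray.h>

/* sketch: count used memzone slots by walking the backing fbarray */
static unsigned int
count_used_memzones(struct rte_mem_config *mcfg)
{
	struct rte_fbarray *arr = &mcfg->memzones;
	unsigned int n = 0;
	int i;

	/* iterate over used slots only; -1 means no more used slots */
	for (i = rte_fbarray_find_next_used(arr, 0); i >= 0;
			i = rte_fbarray_find_next_used(arr, i + 1))
		n++;
	return n;
}

The same find_next_used/find_next_free calls are what speed up slot
allocation and walking compared to scanning a fixed-size array.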

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved earlier in patchset
    - Fixed compile issues
    - Removed rte_panic() calls

 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  14 ++-
 lib/librte_eal/common/eal_common_memzone.c        | 109 ++++++++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 ++-
 test/test/test_memzone.c                          |   9 +-
 7 files changed, 99 insertions(+), 64 deletions(-)

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d009cf0..54330e1 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -599,14 +599,24 @@ rte_eal_init(int argc, char **argv)
 		}
 	}
 
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
 		return -1;
 	}
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 529b36f..aed9331 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,31 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		mz = rte_fbarray_get(arr, i);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
 
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,14 +92,16 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i;
+	int socket, i, mz_idx;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -219,7 +210,14 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	mz_idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (mz_idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, mz_idx);
+		mz = rte_fbarray_get(arr, mz_idx);
+	}
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
@@ -229,7 +227,6 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -342,34 +339,38 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
-	void *addr;
+	void *addr = NULL;
 	unsigned idx;
 
 	if (mz == NULL)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
+		ret = -EINVAL;
+	} else if (found_mz->addr == NULL) {
+		RTE_LOG(ERR, EAL, "Memzone is not allocated\n");
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		memset(found_mz, 0, sizeof(*found_mz));
+		rte_fbarray_set_free(arr, idx);
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	rte_free(addr);
+	if (addr != NULL)
+		rte_free(addr);
 
 	return ret;
 }
@@ -405,7 +406,7 @@ dump_memzone(const struct rte_memzone *mz, void *arg)
 	size_t page_sz;
 	FILE *f = arg;
 
-	mz_idx = mz - mcfg->memzone;
+	mz_idx = rte_fbarray_find_idx(&mcfg->memzones, mz);
 
 	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
@@ -462,19 +463,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -482,14 +487,18 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b745e18..88cde8c 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 0ef2c45..d798675 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -278,6 +278,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary process does not need to initialize anything */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
 	/* add all IOVA-contiguous areas to the heap */
 	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ffcbd71..9832551 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -858,6 +858,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -868,8 +877,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 0046f04..efcf732 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -909,7 +909,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -933,7 +933,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -1012,7 +1012,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1033,7 +1033,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4


* [PATCH v3 51/68] eal: add support for mapping hugepages at runtime
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (52 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 50/68] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 52/68] eal: add support for unmapping pages " Anatoly Burakov
                     ` (16 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Nothing uses this code yet. The bulk of it is copied from old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing that we'll get contiguous VA for all of the pages
that we requested.

Not supported on FreeBSD.
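
As a usage illustration only (a hypothetical internal caller, not part
of this patch), the bulk API is meant to be used roughly like this:

#include <stdbool.h>

#include <rte_memory.h>

#include "eal_memalloc.h"

/* sketch: try to grab up to four 2M pages on socket 0, best-effort */
static int
grab_some_pages(struct rte_memseg *ms[4])
{
	/* exact == false: return however many pages could be allocated */
	int n = eal_memalloc_alloc_seg_bulk(ms, 4, RTE_PGSIZE_2M, 0, false);

	if (n < 0)
		return -1; /* e.g. legacy mem mode or no matching page size */
	return n; /* 0..4 segments, VA-contiguous among themselves */
}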

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't have
to keep the original fd's around. Plus, fcntl() gives us the ability
to lock parts of a file, which is useful for the single-file
segments mode that comes later in this patchset.
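
For reference, a minimal sketch of the fcntl() record-locking pattern
relied on here (standard POSIX, mirrored by the lock() helper in the
diff below):

#include <fcntl.h>
#include <string.h>

/* sketch: take a read lock on [offset, offset + len) of a hugepage file.
 * The lock is owned by the process, not by the fd, so it can later be
 * upgraded or released through any fd open on the same file.
 */
static int
lock_range(int fd, off_t offset, off_t len)
{
	struct flock lck;

	memset(&lck, 0, sizeof(lck));
	lck.l_type = F_RDLCK;
	lck.l_whence = SEEK_SET;
	lck.l_start = offset;
	lck.l_len = len;

	/* 0 on success; -1 with EAGAIN/EACCES if someone holds a write lock */
	return fcntl(fd, F_SETLK, &lck);
}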

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Compile fixes for various platforms
    - Split single file segments stuff into separate commit
    
    v3:
    - Split single file segments into separate patch
    - Added missing FreeBSD implementation
    - Removed rte_panic when unable to free page
    
    v3:
    - Added single file segments functionality in this
      commit, instead of later commits

 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  31 +++
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 429 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 491 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..8c30670
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n_segs, size_t __rte_unused page_sz,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..f628514
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+/*
+ * Allocate segment of specified page size.
+ */
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket);
+
+/*
+ * Allocate `n_segs` segments.
+ *
+ * Note: `ms` can be NULL.
+ *
+ * Note: it is possible to request best-effort allocation by setting `exact` to
+ * `false`, in which case allocator will return however many pages it managed to
+ * allocate successfully.
+ */
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact);
+
+#endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..45ea0ad
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,429 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+check_numa(void)
+{
+	bool ret = true;
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		ret = false;
+	}
+	return ret;
+}
+
+static void
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+	if (get_mempolicy(oldpolicy, oldmask->maskp,
+			  oldmask->size + 1, 0, 0) < 0) {
+		RTE_LOG(ERR, EAL,
+			"Failed to get current mempolicy: %s. "
+			"Assuming MPOL_DEFAULT.\n", strerror(errno));
+		*oldpolicy = MPOL_DEFAULT;
+	}
+	RTE_LOG(DEBUG, EAL,
+		"Setting policy MPOL_PREFERRED for socket %d\n",
+		socket_id);
+	numa_set_preferred(socket_id);
+}
+
+static void
+restore_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static int
+get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+	eal_get_hugefile_path(path, buflen, hi->hugedir,
+			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+				strerror(errno));
+		return -1;
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck;
+	int ret;
+
+	memset(&lck, 0, sizeof(lck));
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	int cur_socket_id = 0;
+#endif
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+	size_t alloc_sz;
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	alloc_sz = hi->hugepage_sz;
+
+	map_offset = 0;
+	if (ftruncate(fd, alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+			__func__, strerror(errno));
+		goto resized;
+	}
+	/* we've allocated a page - take out a read lock. we're using fcntl()
+	 * locks rather than flock() here because doing that gives us one huge
+	 * advantage - fcntl() locks are per-process, not per-file descriptor,
+	 * which means that we don't have to keep the original fd's around to
+	 * keep a lock on the file.
+	 *
+	 * this is useful, because when it comes to unmapping pages, we will
+	 * have to take out a write lock (to figure out if another process still
+	 * has this page mapped), and to do it with flock() we'll have to use
+	 * original fd, as lock is associated with that particular fd. with
+	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
+	 * on that.
+	 */
+	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+	/* this should not fail */
+	if (ret != 1) {
+		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+			__func__,
+			strerror(errno));
+		goto resized;
+	}
+
+	/*
+	 * map the segment and populate page tables; the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In linux, hugetlb limitations, like cgroup, are
+	 * enforced at fault time instead of mmap(), even
+	 * with the option of MAP_POPULATE. Kernel will send
+	 * a SIGBUS signal. To avoid being killed, save the
+	 * stack environment here; if SIGBUS happens, we can
+	 * jump back here.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(alloc_sz >> 20));
+		goto mapped;
+	}
+	*(int *)addr = *(int *)addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = alloc_sz;
+	ms->len = alloc_sz;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, alloc_sz);
+resized:
+	close(fd);
+	unlink(path);
+	return -1;
+}
+
+struct alloc_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg **ms;
+	size_t page_sz;
+	unsigned int segs_allocated;
+	unsigned int n_segs;
+	int socket;
+	bool exact;
+};
+static int
+alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct alloc_walk_param *wa = arg;
+	struct rte_memseg_list *cur_msl;
+	size_t page_sz;
+	int cur_idx;
+	unsigned int msl_idx, need, i;
+
+	if (msl->page_sz != wa->page_sz)
+		return 0;
+	if (msl->socket_id != wa->socket)
+		return 0;
+
+	page_sz = (size_t)msl->page_sz;
+
+	msl_idx = msl - mcfg->memsegs;
+	cur_msl = &mcfg->memsegs[msl_idx];
+
+	need = wa->n_segs;
+
+	/* try finding space in memseg list */
+	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
+	if (cur_idx < 0)
+		return 0;
+
+	for (i = 0; i < need; i++, cur_idx++) {
+		struct rte_memseg *cur;
+		void *map_addr;
+
+		cur = rte_fbarray_get(&cur_msl->memseg_arr, cur_idx);
+		map_addr = RTE_PTR_ADD(cur_msl->base_va,
+				cur_idx * page_sz);
+
+		if (alloc_seg(cur, map_addr, wa->socket, wa->hi,
+				msl_idx, cur_idx)) {
+			RTE_LOG(DEBUG, EAL, "attempted to allocate %i segments, but only %i were allocated\n",
+				need, i);
+
+			/* if exact number wasn't requested, stop */
+			if (!wa->exact)
+				goto out;
+			return -1;
+		}
+		if (wa->ms)
+			wa->ms[i] = cur;
+
+		rte_fbarray_set_used(&cur_msl->memseg_arr, cur_idx);
+	}
+out:
+	wa->segs_allocated = i;
+	return 1;
+
+}
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact)
+{
+	int i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa = false;
+	int oldpolicy;
+	struct bitmask *oldmask;
+#endif
+	struct alloc_walk_param wa;
+	struct hugepage_info *hi = NULL;
+
+	memset(&wa, 0, sizeof(wa));
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (page_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		return -1;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (check_numa()) {
+		oldmask = numa_allocate_nodemask();
+		prepare_numa(&oldpolicy, oldmask, socket);
+		have_numa = true;
+	}
+#endif
+
+	wa.exact = exact;
+	wa.hi = hi;
+	wa.ms = ms;
+	wa.n_segs = n_segs;
+	wa.page_sz = page_sz;
+	wa.socket = socket;
+	wa.segs_allocated = 0;
+
+	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	if (ret == 0) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+		ret = -1;
+	} else if (ret > 0) {
+		ret = (int)wa.segs_allocated;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		restore_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_seg_bulk(&ms, 1, page_sz, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4


* [PATCH v3 52/68] eal: add support for unmapping pages at runtime
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (53 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 51/68] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 53/68] eal: add "single file segments" command-line option Anatoly Burakov
                     ` (15 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This isn't used anywhere yet, but the support is now there. Also,
add cleanup to the allocation procedures, so that if we fail to
allocate everything we asked for, whatever was allocated is freed
back.
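
As a usage illustration only (hypothetical caller; the API itself is
declared in eal_memalloc.h below):

#include <rte_log.h>
#include <rte_memory.h>

#include "eal_memalloc.h"

/* sketch: give back whatever a best-effort bulk allocation returned */
static void
give_back_pages(struct rte_memseg **ms, int n_allocated)
{
	if (n_allocated <= 0)
		return;
	/* frees as many of the segments as it can, even if some fail */
	if (eal_memalloc_free_seg_bulk(ms, n_allocated) < 0)
		RTE_LOG(ERR, EAL, "not all segments could be freed\n");
}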

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  15 +++
 lib/librte_eal/common/eal_memalloc.h       |  14 +++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 149 ++++++++++++++++++++++++++++-
 3 files changed, 177 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index 8c30670..e7bcd2b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,18 @@ eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int n_segs __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index f628514..6017345 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,18 @@ int
 eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 		int socket, bool exact);
 
+/*
+ * Deallocate segment
+ */
+int
+eal_memalloc_free_seg(struct rte_memseg *ms);
+
+/*
+ * Deallocate `n_segs` segments. Returns 0 on successful deallocation of all
+ * segments, returns -1 on error. Any segments that could have been deallocated,
+ * will be deallocated even in case of error.
+ */
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 45ea0ad..118b12d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -289,6 +289,48 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	return -1;
 }
 
+static int
+free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	char path[PATH_MAX];
+	int fd, ret;
+
+	/* erase page data */
+	memset(ms->addr, 0, ms->len);
+
+	if (mmap(ms->addr, ms->len, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	/* if we're able to take out a write lock, we're the last one
+	 * holding onto this page.
+	 */
+
+	ret = lock(fd, 0, ms->len, F_WRLCK);
+	if (ret >= 0) {
+		/* no one else is using this page */
+		if (ret == 1)
+			unlink(path);
+		ret = lock(fd, 0, ms->len, F_UNLCK);
+		if (ret != 1)
+			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+				__func__, path);
+	}
+	close(fd);
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 struct alloc_walk_param {
 	struct hugepage_info *hi;
 	struct rte_memseg **ms;
@@ -305,7 +347,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	struct alloc_walk_param *wa = arg;
 	struct rte_memseg_list *cur_msl;
 	size_t page_sz;
-	int cur_idx;
+	int cur_idx, start_idx, j;
 	unsigned int msl_idx, need, i;
 
 	if (msl->page_sz != wa->page_sz)
@@ -324,6 +366,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
 	if (cur_idx < 0)
 		return 0;
+	start_idx = cur_idx;
 
 	for (i = 0; i < need; i++, cur_idx++) {
 		struct rte_memseg *cur;
@@ -341,6 +384,25 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 			/* if exact number wasn't requested, stop */
 			if (!wa->exact)
 				goto out;
+
+			/* clean up */
+			for (j = start_idx; j < cur_idx; j++) {
+				struct rte_memseg *tmp;
+				struct rte_fbarray *arr =
+						&cur_msl->memseg_arr;
+
+				tmp = rte_fbarray_get(arr, j);
+				if (free_seg(tmp, wa->hi, msl_idx,
+						j)) {
+					RTE_LOG(ERR, EAL, "Cannot free page\n");
+					continue;
+				}
+
+				rte_fbarray_set_free(arr, j);
+			}
+			/* clear the list */
+			if (wa->ms)
+				memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs);
 			return -1;
 		}
 		if (wa->ms)
@@ -351,7 +413,39 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 out:
 	wa->segs_allocated = i;
 	return 1;
+}
+
+struct free_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg *ms;
+};
+static int
+free_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *found_msl;
+	struct free_walk_param *wa = arg;
+	uintptr_t start_addr, end_addr;
+	int msl_idx, seg_idx;
 
+	start_addr = (uintptr_t) msl->base_va;
+	end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz;
+
+	if ((uintptr_t)wa->ms->addr < start_addr ||
+			(uintptr_t)wa->ms->addr >= end_addr)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+	seg_idx = RTE_PTR_DIFF(wa->ms->addr, start_addr) / msl->page_sz;
+
+	/* msl is const */
+	found_msl = &mcfg->memsegs[msl_idx];
+
+	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
+		return -1;
+
+	return 1;
 }
 
 int
@@ -427,3 +521,56 @@ eal_memalloc_alloc_seg(size_t page_sz, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
+{
+	int seg, ret = 0;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (seg = 0; seg < n_segs; seg++) {
+		struct rte_memseg *cur = ms[seg];
+		struct hugepage_info *hi = NULL;
+		struct free_walk_param wa;
+		int i, walk_res;
+
+		memset(&wa, 0, sizeof(wa));
+
+		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
+				i++) {
+			hi = &internal_config.hugepage_info[i];
+			if (cur->hugepage_sz == hi->hugepage_sz) {
+				break;
+			}
+		}
+		if (i == (int)RTE_DIM(internal_config.hugepage_info)) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			ret = -1;
+			continue;
+		}
+
+		wa.ms = cur;
+		wa.hi = hi;
+
+		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		if (walk_res == 1)
+			continue;
+		if (walk_res == 0)
+			RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		ret = -1;
+	}
+	return ret;
+}
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms)
+{
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	return eal_memalloc_free_seg_bulk(&ms, 1);
+}
-- 
2.7.4


* [PATCH v3 53/68] eal: add "single file segments" command-line option
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (54 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 52/68] eal: add support for unmapping pages " Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 54/68] eal: add API to check if memory is contiguous Anatoly Burakov
                     ` (14 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Currently, DPDK stores all pages as separate files in hugetlbfs.
This option will allow storing all pages in one file (one file
per memseg list).

We do this by using fallocate() calls on hugetlbfs; however, this
is only supported on fairly recent (4.3+) Linux kernels, so an
ftruncate() fallback is provided to grow (but not shrink) hugepage
files. The naming scheme is deterministic, so both primary and
secondary processes will be able to easily map the needed files
and offsets.

For multi-file segments, we can close fd's right away. For
single-file segments, we can reuse the same fd and reduce the
amount of fd's needed to map/use hugepages. However, we need to
store the fd's somewhere, so we add a tailq.
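
For illustration, a simplified sketch of the grow/punch logic (the real
code below additionally caches whether fallocate() is supported and
takes the fcntl() locks introduced earlier in this patchset):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* sketch: grow the file by one page at 'offset', or punch that page out */
static int
resize_one_page(int fd, off_t offset, off_t page_sz, int grow)
{
	int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;

	if (fallocate(fd, flags, offset, page_sz) == 0)
		return 0;

	if (!grow)
		return -1; /* cannot give the page back without fallocate() */

	/* pre-4.3 kernels: fall back to ftruncate(), which can only grow */
	struct stat st;
	if (fstat(fd, &st) < 0)
		return -1;
	if (st.st_size < offset + page_sz)
		return ftruncate(fd, offset + page_sz);
	return 0;
}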

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Split this change into a separate patch
    - Provide more explanation as to how it works

 lib/librte_eal/common/eal_common_options.c |   4 +
 lib/librte_eal/common/eal_internal_cfg.h   |   4 +
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/eal.c          |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 337 ++++++++++++++++++++++++-----
 5 files changed, 297 insertions(+), 51 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index fb5ea03..5b5da5f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1188,6 +1189,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_LEGACY_MEM_NUM:
 		conf->legacy_mem = 1;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5cf7102..9d33cf4 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true to enable legacy memory behavior (no dynamic allocation,
 	 * IOVA-contiguous segments).
 	 */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node) non-legacy mode only.
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index d301d0b..211ae06 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_LEGACY_MEM    "legacy-mem"
 	OPT_LEGACY_MEM_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9832551..2c12811 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 118b12d..545ac49 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,31 @@
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+/*
+ * not all kernel versions support fallocate on hugetlbfs, so fall back to
+ * ftruncate and disallow deallocation if fallocate is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's since we need each fd
+ * only once. However, in single file segments mode, we can get away with using
+ * a single fd for entire segments, but we need to store them somewhere. Each
+ * fd is different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Double linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -129,18 +154,100 @@ resotre_numa(int *oldpolicy, struct bitmask *oldmask)
 }
 #endif
 
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if file size on disk is zero (regular fstat won't show
+ * true file size due to how fallocate works)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
 	int fd;
-	eal_get_hugefile_path(path, buflen, hi->hugedir,
-			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
-	fd = open(path, O_CREAT | O_RDWR, 0600);
-	if (fd < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
-				strerror(errno));
-		return -1;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry, for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
 	}
 	return fd;
 }
@@ -173,6 +280,94 @@ static int lock(int fd, uint64_t offset, uint64_t len, int type)
 }
 
 static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = get_file_size(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
 alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		struct hugepage_info *hi, unsigned int list_idx,
 		unsigned int seg_idx)
@@ -191,34 +386,40 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		return -1;
 
 	alloc_sz = hi->hugepage_sz;
-
-	map_offset = 0;
-	if (ftruncate(fd, alloc_sz) < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
-			__func__, strerror(errno));
-		goto resized;
-	}
-	/* we've allocated a page - take out a read lock. we're using fcntl()
-	 * locks rather than flock() here because doing that gives us one huge
-	 * advantage - fcntl() locks are per-process, not per-file descriptor,
-	 * which means that we don't have to keep the original fd's around to
-	 * keep a lock on the file.
-	 *
-	 * this is useful, because when it comes to unmapping pages, we will
-	 * have to take out a write lock (to figure out if another process still
-	 * has this page mapped), and to do itwith flock() we'll have to use
-	 * original fd, as lock is associated with that particular fd. with
-	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
-	 * on that.
-	 */
-	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
-
-	/* this should not fail */
-	if (ret != 1) {
-		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
-			__func__,
-			strerror(errno));
-		goto resized;
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * alloc_sz;
+		ret = resize_hugefile(fd, map_offset, alloc_sz, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, alloc_sz) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with flock()
+		 * we'll have to use original fd, as lock is associated with
+		 * that particular fd. with fcntl(), this is not necessary - we
+		 * can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
 	}
 
 	/*
@@ -227,7 +428,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	 */
 	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
 			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
-	close(fd);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
 
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
@@ -284,8 +487,21 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 mapped:
 	munmap(addr, alloc_sz);
 resized:
-	close(fd);
-	unlink(path);
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, alloc_sz, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
 	return -1;
 }
 
@@ -293,6 +509,7 @@ static int
 free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
+	uint64_t map_offset;
 	char path[PATH_MAX];
 	int fd, ret;
 
@@ -310,21 +527,39 @@ free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 	if (fd < 0)
 		return -1;
 
-	/* if we're able to take out a write lock, we're the last one
-	 * holding onto this page.
-	 */
-
-	ret = lock(fd, 0, ms->len, F_WRLCK);
-	if (ret >= 0) {
-		/* no one else is using this page */
-		if (ret == 1)
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->len;
+		if (resize_hugefile(fd, map_offset, ms->len, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
 			unlink(path);
-		ret = lock(fd, 0, ms->len, F_UNLCK);
-		if (ret != 1)
-			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
-				__func__, path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->len, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->len, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
 	}
-	close(fd);
 
 	memset(ms, 0, sizeof(*ms));
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 54/68] eal: add API to check if memory is contiguous
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (55 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 53/68] eal: add "single file segments" command-line option Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 55/68] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
                     ` (13 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For now, memory is always contiguous because legacy mem mode is
enabled unconditionally, but this function will be helpful down
the line when we implement support for allocating physically
non-contiguous memory. We can no longer guarantee physically
contiguous memory unless we're in legacy or IOVA_AS_VA mode, but
we can certainly try, and see if we succeed.

In addition, this will be useful for e.g. PMDs that allocate chunks
smaller than the page size which must not cross a page boundary, in
which case we will be able to accommodate that request. This
function also supports non-hugepage memory.
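
As a rough illustration only (not part of this patch), here is a minimal
sketch of how an internal EAL caller might use the new helper; the wrapper
name is made up, and obtaining the memseg list that backs the buffer is
assumed to happen elsewhere:

#include <stdbool.h>
#include <stddef.h>

#include <rte_eal_memconfig.h>

#include "eal_memalloc.h"

/* hypothetical wrapper: `msl` must be the memseg list backing `buf` */
static bool
buf_is_iova_contig(const struct rte_memseg_list *msl, void *buf, size_t len)
{
	/* true if [buf, buf + len) maps to IOVA-contiguous memory */
	return eal_memalloc_is_contig(msl, buf, len);
}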

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved this earlier in the patchset
    - Add support for non-hugepage memory
    - Fix handling of IOVA as VA mode
    
    v2:
    - Add support for non-hugepage memory
    - Support non-page-sized segments

 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 90 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        | 10 ++++
 lib/librte_eal/common/malloc_elem.c         | 40 +------------
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 6 files changed, 106 insertions(+), 37 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..607ec3f
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	void *end, *aligned_start, *aligned_end;
+	size_t pgsz = (size_t)msl->page_sz;
+	const struct rte_memseg *ms;
+
+	/* for IOVA_VA, it's always contiguous */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	end = RTE_PTR_ADD(start, len);
+
+	/* for nohuge, we check pagemap, otherwise check memseg */
+	if (!rte_eal_has_hugepages()) {
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		cur = rte_mem_virt2iova(aligned_start);
+		expected = cur + pgsz;
+		aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+
+		while (aligned_start < aligned_end) {
+			cur = rte_mem_virt2iova(aligned_start);
+			if (cur != expected)
+				return false;
+			aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+			expected += pgsz;
+		}
+	} else {
+		int start_seg, end_seg, cur_seg;
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		start_seg = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+				pgsz;
+		end_seg = RTE_PTR_DIFF(aligned_end, msl->base_va) /
+				pgsz;
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		ms = rte_fbarray_get(&msl->memseg_arr, start_seg);
+		cur = ms->iova;
+		expected = cur + pgsz;
+
+		/* if we can't access IOVA addresses, assume non-contiguous */
+		if (cur == RTE_BAD_IOVA)
+			return false;
+
+		for (cur_seg = start_seg + 1; cur_seg < end_seg;
+				cur_seg++, expected += pgsz) {
+			ms = rte_fbarray_get(&msl->memseg_arr, cur_seg);
+
+			if (ms->iova != expected)
+				return false;
+		}
+	}
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 6017345..2413c6c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /*
  * Allocate segment of specified page size.
@@ -42,4 +43,13 @@ eal_memalloc_free_seg(struct rte_memseg *ms);
 int
 eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
 
+
+/*
+ * Check if memory pointed to by `start` and of `length` that resides in
+ * memseg list `msl` is IOVA-contiguous.
+ */
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 685aac4..9db416f 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -18,6 +18,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -100,45 +101,10 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl,
 		void *start, size_t size)
 {
-	rte_iova_t cur, expected;
-	void *start_page, *end_page, *cur_page;
-	size_t pagesz;
-
-	/* for hugepage memory or IOVA as VA, it's always contiguous */
-	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
-		return true;
-
-	/* otherwise, check if start and end are within the same page */
-	pagesz = getpagesize();
-
-	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
-	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
-
-	if (start_page == end_page)
-		return true;
-
-	/* if they are from different pages, check if they are contiguous */
-
-	/* if we can't access physical addresses, assume non-contiguous */
-	if (!rte_eal_using_phys_addrs())
-		return false;
-
-	/* skip first iteration */
-	cur = rte_mem_virt2iova(start_page);
-	expected = cur + pagesz;
-	cur_page = RTE_PTR_ADD(start_page, pagesz);
-
-	while (cur_page <= end_page) {
-		cur = rte_mem_virt2iova(cur_page);
-		if (cur != expected)
-			return false;
-		cur_page = RTE_PTR_ADD(cur_page, pagesz);
-		expected += pagesz;
-	}
-	return true;
+	return eal_memalloc_is_contig(msl, start, size);
 }
 
 /*
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 55/68] eal: prepare memseg lists for multiprocess sync
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (56 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 54/68] eal: add API to check if memory is contiguous Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 56/68] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                     ` (12 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

In preparation for implementing multiprocess support, we are adding
a version number to memseg lists. We will not need any locks, because
memory hotplug will have a global lock (so any time the memory map,
and thus the version number, might change, we will already be holding
a lock).

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation opts for the latter: the primary process'
shared mappings are authoritative, and each secondary process
uses its own internal view of mapped memory, which it attempts
to keep synchronized with the primary using versioning.

Under this model, only the primary process decides which pages get
mapped; secondary processes only copy the primary's page maps and
get notified of the changes via the IPC mechanism (coming in later
commits).
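
To make the versioning scheme concrete, here is a simplified sketch
(illustrative only, assumed to live in eal_memalloc.c next to the real
code) of the per-list decision a secondary process makes; sync_existing()
is the worker added below:

static int
sync_one_list(struct rte_memseg_list *primary_msl,
		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
		unsigned int msl_idx)
{
	/* nothing has changed since the last synchronization */
	if (local_msl->version == primary_msl->version)
		return 0;

	/* maps differ - replay the primary's allocations/frees locally;
	 * on success, sync_existing() also updates local_msl->version
	 */
	return sync_existing(primary_msl, local_msl, hi, msl_idx);
}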

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Removed per-list locks as we're using global hotplug lock

 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 250 ++++++++++++++++++++++
 4 files changed, 262 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index e7bcd2b..461732f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -39,3 +39,10 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return -1;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 2413c6c..4a7b45c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -52,4 +52,8 @@ bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 88cde8c..a781793 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,7 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	volatile uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 545ac49..ce242b1 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -64,6 +64,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -647,6 +650,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	}
 out:
 	wa->segs_allocated = i;
+	if (i > 0)
+		cur_msl->version++;
 	return 1;
 }
 
@@ -676,7 +681,10 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	/* msl is const */
 	found_msl = &mcfg->memsegs[msl_idx];
 
+	found_msl->version++;
+
 	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+
 	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
 		return -1;
 
@@ -809,3 +817,245 @@ eal_memalloc_free_seg(struct rte_memseg *ms)
 
 	return eal_memalloc_free_seg_bulk(&ms, 1);
 }
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_seg(l_ms, p_ms->addr,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_seg(l_ms, hi, msl_idx, seg_idx);
+			rte_fbarray_set_free(l_arr, seg_idx);
+			if (ret < 0)
+				return -1;
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case scenario - no differences (or bigger, which will be
+		 * fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+static int
+sync_walk(const struct rte_memseg_list *msl, void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int i;
+	int msl_idx;
+	bool new_msl = false;
+
+	msl_idx = msl - mcfg->memsegs;
+	primary_msl = &mcfg->memsegs[msl_idx];
+	local_msl = &local_memsegs[msl_idx];
+
+	/* check if secondary has this memseg list set up */
+	if (local_msl->base_va == NULL) {
+		char name[PATH_MAX];
+		int ret;
+		new_msl = true;
+
+		/* create distinct fbarrays for each secondary */
+		snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+			primary_msl->memseg_arr.name, getpid());
+
+		ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+			primary_msl->memseg_arr.len,
+			primary_msl->memseg_arr.elt_sz);
+		if (ret < 0) {
+			RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+			return -1;
+		}
+
+		local_msl->base_va = primary_msl->base_va;
+	}
+
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		uint64_t cur_sz =
+			internal_config.hugepage_info[i].hugepage_sz;
+		uint64_t msl_sz = primary_msl->page_sz;
+		if (msl_sz == cur_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	/* if versions don't match or if we have just allocated a new
+	 * memseg list, synchronize everything
+	 */
+	if ((new_msl || local_msl->version != primary_msl->version) &&
+			sync_existing(primary_msl, local_msl, hi, msl_idx))
+		return -1;
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	if (rte_memseg_list_walk(sync_walk, NULL))
+		return -1;
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 56/68] eal: read hugepage counts from node-specific sysfs path
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (57 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 55/68] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 57/68] eal: make use of memory hotplug for init Anatoly Burakov
                     ` (11 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node
for hugepage counts. Note that the per-NUMA node paths do not
provide information about reserved pages, so we might not get the
best info from them, but this saves us from the whole
mapping/remapping business otherwise needed before we can tell
which page is on which socket, which is acceptable because we no
longer require our memory to be physically contiguous.

Legacy memory init will not use this.
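
For reference, a minimal sketch of what the per-node lookup boils down to
(the helper name is made up, error handling is trimmed, and
eal_parse_sysfs_value() is the existing helper already used in this file):

#include <limits.h>
#include <stdio.h>

#include "eal_filesystem.h"

static unsigned long
free_hugepages_on_node(const char *subdir, unsigned int node)
{
	/* e.g. /sys/devices/system/node/node0/hugepages/
	 *          hugepages-2048kB/free_hugepages
	 */
	char path[PATH_MAX];
	unsigned long num_pages = 0;

	snprintf(path, sizeof(path),
		"/sys/devices/system/node/node%u/hugepages/%s/free_hugepages",
		node, subdir);
	if (eal_parse_sysfs_value(path, &num_pages) < 0)
		return 0;
	return num_pages;
}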

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 80 +++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index afebd42..2e0819f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -31,6 +31,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -71,6 +72,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -269,7 +309,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -323,9 +363,28 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if the first attempt fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_socket_count(); i++) {
+				int socket = rte_socket_id_by_idx(i);
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, socket);
+				hpi->num_pages[socket] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * we failed to sort memory from the get-go, so fall
+		 * back to the old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -349,10 +408,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 57/68] eal: make use of memory hotplug for init
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (58 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 56/68] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 58/68] eal: share hugepage info primary and secondary Anatoly Burakov
                     ` (10 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Add a new (non-legacy) memory init path for EAL. It uses the
new memory hotplug facilities.

If no -m or --socket-mem switches are specified, the new init
will not allocate anything, whereas if those switches are passed,
the appropriate amounts of pages are requested, just like for
legacy init.

Allocated pages will be physically discontiguous (or rather, they're
not guaranteed to be physically contiguous - they may still be so by
accident) unless RTE_IOVA_VA mode is used.
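
As a hedged sketch of what that means for callers (illustrative only; the
helper name is made up, the APIs used are the existing ones): physical
contiguity can only be taken for granted when IOVA addresses are virtual
addresses.

#include <stdbool.h>

#include <rte_bus.h>
#include <rte_eal.h>

static bool
can_assume_iova_contig(void)
{
	/* only IOVA-as-VA guarantees that any allocation is IOVA-contiguous;
	 * otherwise contiguity must be requested or verified explicitly
	 */
	return rte_eal_iova_mode() == RTE_IOVA_VA;
}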

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 62 ++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index d38fb68..c51d598 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -1600,6 +1601,61 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize used_hp hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int)internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+					num_pages, hpi->hugepage_sz,
+					socket_id, true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1722,9 +1778,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 58/68] eal: share hugepage info primary and secondary
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (59 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 57/68] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 59/68] eal: add secondary process init with memory hotplug Anatoly Burakov
                     ` (9 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Since we are going to need to map hugepages in both primary and
secondary processes, we need to know where we should look for
hugetlbfs mountpoints. So, share those with secondary processes,
and map them on init.
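
Schematically (a simplified sketch, not a literal excerpt from the patch),
the init-time split between process types now looks like this:

#include <rte_eal.h>

#include "eal_hugepages.h"

static int
hugepage_info_setup(void)
{
	/* primary discovers hugepage info and publishes it in a shared file;
	 * secondaries map that file instead of re-scanning hugetlbfs/sysfs
	 */
	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
		return eal_hugepage_info_init();
	return eal_hugepage_info_read();
}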

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c                 |  19 ++--
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  56 +++++++++--
 lib/librte_eal/bsdapp/eal/eal_memory.c          |  21 +---
 lib/librte_eal/common/eal_common_options.c      |   5 +-
 lib/librte_eal/common/eal_filesystem.h          |  17 ++++
 lib/librte_eal/common/eal_hugepages.h           |  10 +-
 lib/librte_eal/common/eal_internal_cfg.h        |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c               |  18 ++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 121 ++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c        |  15 +--
 10 files changed, 217 insertions(+), 67 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 54330e1..727adc5 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -289,7 +289,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -561,12 +561,17 @@ rte_eal_init(int argc, char **argv)
 	/* autodetect the iova mapping mode (default is iova_pa) */
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+			eal_hugepage_info_init() :
+			eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index ba44da0..38d143c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -19,10 +19,10 @@
  * Used in this file to store the hugepage file map on disk
  */
 static void *
-create_shared_memory(const char *filename, const size_t mem_size)
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
 {
 	void *retval;
-	int fd = open(filename, O_CREAT | O_RDWR, 0666);
+	int fd = open(filename, flags, 0666);
 	if (fd < 0)
 		return NULL;
 	if (ftruncate(fd, mem_size) < 0) {
@@ -34,6 +34,18 @@ create_shared_memory(const char *filename, const size_t mem_size)
 	return retval;
 }
 
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /*
  * No hugepage support on freebsd, but we dummy it, using contigmem driver
  */
@@ -46,13 +58,10 @@ eal_hugepage_info_init(void)
 	/* re-use the linux "internal config" structure for our memory data */
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
+	unsigned int i;
 
 	internal_config.num_hugepage_sizes = 1;
 
-	/* nothing more to be done for secondary */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
@@ -87,7 +96,7 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	hpi->hugedir = CONTIGMEM_DEV;
+	snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", CONTIGMEM_DEV);
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
@@ -101,6 +110,14 @@ eal_hugepage_info_init(void)
 
 	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
 	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
@@ -108,3 +125,28 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* copy stuff from shared info into internal config */
+int
+eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	internal_config.num_hugepage_sizes = 1;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 6692b3d..3064605 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -240,23 +240,10 @@ int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
-	int fd_hugepage_info, fd_hugepage = -1;
+	int fd_hugepage = -1;
 	unsigned int i;
 
-	/* Obtain a file descriptor for hugepage_info */
-	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
-	if (fd_hugepage_info < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
-		return -1;
-	}
-
-	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
-			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
-	if (hpi == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
-		goto error;
-	}
+	hpi = &internal_config.hugepage_info[0];
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		const struct hugepage_info *cur_hpi = &hpi[i];
@@ -286,13 +273,9 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
-	close(fd_hugepage_info);
 	return 0;
 
 error:
-	if (fd_hugepage_info >= 0)
-		close(fd_hugepage_info);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 5b5da5f..04a4476 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -179,8 +179,11 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		internal_cfg->socket_mem[i] = 0;
 	/* zero out hugedir descriptors */
-	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++)
+	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++) {
+		memset(&internal_cfg->hugepage_info[i], 0,
+				sizeof(internal_cfg->hugepage_info[0]));
 		internal_cfg->hugepage_info[i].lock_descriptor = -1;
+	}
 	internal_cfg->base_virtaddr = 0;
 
 	internal_cfg->syslog_facility = LOG_DAEMON;
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 1c6048b..ad059ef 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -85,6 +85,23 @@ eal_hugepage_info_path(void)
 	return buffer;
 }
 
+/** Path of hugepage file map. */
+#define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file"
+
+static inline const char *
+eal_hugepage_file_path(void)
+{
+	static char buffer[PATH_MAX]; /* static so auto-zeroed */
+	const char *directory = default_config_dir;
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_FILE_FMT, directory,
+			internal_config.hugefile_prefix);
+	return buffer;
+}
+
 /** String format for hugepage map files. */
 #define HUGEFILE_FMT "%s/%smap_%d"
 #define TEMP_HUGEFILE_FMT "%s/%smap_temp_%d"
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index ad1b0b6..4582f19 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -26,9 +26,15 @@ struct hugepage_file {
 };
 
 /**
- * Read the information from linux on what hugepages are available
- * for the EAL to use
+ * Read the information on what hugepages are available for the EAL to use,
+ * clearing out any unused ones.
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read whatever information primary process has shared about hugepages into
+ * secondary process.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 9d33cf4..c4cbf3a 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -21,7 +21,7 @@
  */
 struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
-	const char *hugedir;    /**< dir where hugetlbfs is mounted */
+	char hugedir[PATH_MAX];    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
 	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2c12811..e7c6dcf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -807,13 +807,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 2e0819f..fb4b667 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -14,6 +14,7 @@
 #include <stdarg.h>
 #include <unistd.h>
 #include <errno.h>
+#include <sys/mman.h>
 #include <sys/queue.h>
 #include <sys/stat.h>
 
@@ -33,6 +34,39 @@
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
 static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
+/*
+ * Uses mmap to create a shared memory area for storage of data
+ * Used in this file to store the hugepage file map on disk
+ */
+static void *
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
+{
+	void *retval;
+	int fd = open(filename, flags, 0666);
+	if (fd < 0)
+		return NULL;
+	if (ftruncate(fd, mem_size) < 0) {
+		close(fd);
+		return NULL;
+	}
+	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
+			MAP_SHARED, fd, 0);
+	close(fd);
+	return retval;
+}
+
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
 static uint32_t
@@ -299,15 +333,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(void)
+{	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -323,6 +351,7 @@ eal_hugepage_info_init(void)
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
+		const char *hugedir;
 
 		if (strncmp(dirent->d_name, dirent_start_text,
 			    dirent_start_len) != 0)
@@ -334,10 +363,10 @@ eal_hugepage_info_init(void)
 		hpi = &internal_config.hugepage_info[num_sizes];
 		hpi->hugepage_sz =
 			rte_str_to_size(&dirent->d_name[dirent_start_len]);
-		hpi->hugedir = get_hugepage_dir(hpi->hugepage_sz);
+		hugedir = get_hugepage_dir(hpi->hugepage_sz);
 
 		/* first, check if we have a mountpoint */
-		if (hpi->hugedir == NULL) {
+		if (hugedir == NULL) {
 			uint32_t num_pages;
 
 			num_pages = get_num_hugepages(dirent->d_name);
@@ -349,6 +378,7 @@ eal_hugepage_info_init(void)
 					num_pages, hpi->hugepage_sz);
 			continue;
 		}
+		snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", hugedir);
 
 		/* try to obtain a writelock */
 		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
@@ -411,13 +441,11 @@ eal_hugepage_info_init(void)
 	for (i = 0; i < num_sizes; i++) {
 		/* pages may no longer all be on socket 0, so check all */
 		unsigned int j, num_pages = 0;
+		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
 
-		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-			struct hugepage_info *hpi =
-					&internal_config.hugepage_info[i];
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++)
 			num_pages += hpi->num_pages[j];
-		}
-		if (internal_config.hugepage_info[i].hugedir != NULL &&
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0 &&
 				num_pages > 0)
 			return 0;
 	}
@@ -425,3 +453,64 @@ eal_hugepage_info_init(void)
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	struct hugepage_info *hpi, *tmp_hpi;
+	unsigned int i;
+
+	if (hugepage_info_init() < 0)
+		return -1;
+
+	hpi = &internal_config.hugepage_info[0];
+
+	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
+			sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
+		return -1;
+	}
+
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
+
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
+
+int eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index c51d598..efa1202 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1060,7 +1060,7 @@ get_socket_mem_size(int socket)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++){
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL)
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0)
 			size += hpi->hugepage_sz * hpi->num_pages[socket];
 	}
 
@@ -1160,7 +1160,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
-			hp_used[i].hugedir = hp_info[i].hugedir;
+			snprintf(hp_used[i].hugedir, sizeof(hp_used[i].hugedir),
+					"%s", hp_info[i].hugedir);
 			hp_used[i].num_pages[socket] = RTE_MIN(
 					memory[socket] / hp_info[i].hugepage_sz,
 					hp_info[i].num_pages[socket]);
@@ -1235,7 +1236,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -1508,7 +1509,7 @@ eal_legacy_hugepage_init(void)
 	}
 
 	/* create shared memory */
-	hugepage = create_shared_memory(eal_hugepage_info_path(),
+	hugepage = create_shared_memory(eal_hugepage_file_path(),
 			nr_hugefiles * sizeof(struct hugepage_file));
 
 	if (hugepage == NULL) {
@@ -1693,16 +1694,16 @@ eal_legacy_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
+	fd_hugepage = open(eal_hugepage_file_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 59/68] eal: add secondary process init with memory hotplug
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (60 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 58/68] eal: share hugepage info primary and secondary Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 60/68] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
                     ` (8 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Secondary process initialization will simply synchronize its memory
map with the primary process.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Improved handling of EAL hugepage info

 lib/librte_eal/common/eal_common_memory.c |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 0a6d678..8db085e 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index efa1202..7ec7129 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1776,6 +1776,18 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0)
+			RTE_LOG(ERR, EAL, "It is recommended to disable ASLR in the kernel and retry running both primary and secondary processes\n");
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1787,9 +1799,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 60/68] eal: enable memory hotplug support in rte_malloc
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (61 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 59/68] eal: add secondary process init with memory hotplug Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 61/68] eal: add support for multiprocess memory hotplug Anatoly Burakov
                     ` (7 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This set of changes enables rte_malloc to allocate and free memory
as needed. Currently, this capability is disabled because legacy
memory mode is enabled unconditionally.

The way it works is: first, malloc checks whether there is enough
memory already allocated to satisfy the user's request. If there
isn't, we try to allocate more memory. The reverse happens with
free - we free an element, check its size (including free element
merging due to adjacency), and see whether it is bigger than the
hugepage size and whether its start and end span a hugepage or
more. If so, we remove the area from the malloc heap (adjusting
element lengths where appropriate) and deallocate the underlying
pages.

For legacy mode, runtime alloc/free of pages is disabled.
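
To illustrate what this enables from an application's point of view,
here is a minimal sketch (the tag name, size and socket used below
are arbitrary, and legacy memory mode is assumed to be disabled):

#include <rte_eal.h>
#include <rte_malloc.h>

int main(int argc, char **argv)
{
	void *buf;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/* if the heap on socket 0 does not have enough free space, EAL
	 * will allocate more hugepages at runtime to satisfy the request
	 */
	buf = rte_malloc_socket("example", 16 * 1024 * 1024, 0, 0);
	if (buf == NULL)
		return -1;

	/* freeing may return whole hugepages back to the system, if the
	 * freed element spans one or more complete pages
	 */
	rte_free(buf);

	return 0;
}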

It is worth noting that memseg lists are sorted by page size, and
that we try our best to satisfy the user's request. That is, if the
user requests an element from 2MB page memory, we will check whether
we can satisfy that request from existing memory; if not, we try to
allocate more 2MB pages. If that fails and the user also specified
a "size is hint" flag, we then check other page sizes and try to
allocate from there. If that fails too, then, depending on flags,
we may try allocating from other sockets. In other words, we try
our best to give the user what they asked for, but going to other
sockets is a last resort - first we try to allocate more memory on
the same socket.
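
As a rough illustration of the flag semantics described above (the
memzone name and length below are arbitrary):

#include <rte_memzone.h>

static const struct rte_memzone *
reserve_example(void)
{
	/* prefer 2MB pages, but treat the page size as a hint: if no 2MB
	 * memory is available and no more can be allocated, other page
	 * sizes (and, because of SOCKET_ID_ANY, other sockets) are tried
	 */
	return rte_memzone_reserve("example_mz", 4 * 1024 * 1024,
			SOCKET_ID_ANY,
			RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
}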

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Compile fixes

 lib/librte_eal/common/eal_common_memzone.c |  26 +--
 lib/librte_eal/common/malloc_elem.c        |  86 +++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 347 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 433 insertions(+), 64 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index aed9331..d522883 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -94,7 +94,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_mem_config *mcfg;
 	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i, mz_idx;
+	int mz_idx;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -179,29 +179,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound, contig);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align,
-					bound, contig);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
-
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound, contig);
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 9db416f..4346532 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -447,6 +447,92 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	struct malloc_elem *hide_start, *hide_end, *prev, *next;
+	size_t len_before, len_after;
+
+	hide_start = start;
+	hide_end = RTE_PTR_ADD(start, len);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	len_before = RTE_PTR_DIFF(hide_start, elem);
+	len_after = RTE_PTR_DIFF(next, hide_end);
+
+	if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split after */
+		split_elem(elem, hide_end);
+
+		malloc_elem_free_list_insert(hide_end);
+	} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+		/* shrink current element */
+		elem->size -= len_after;
+		memset(hide_end, 0, sizeof(*hide_end));
+
+		/* copy next element's data to our pad */
+		memcpy(hide_end, next, sizeof(*hide_end));
+
+		/* pad next element */
+		next->state = ELEM_PAD;
+		next->pad = len_after;
+
+		/* next element is busy, would've been merged otherwise */
+		hide_end->pad = len_after;
+		hide_end->size += len_after;
+
+		/* adjust pointers to point to our new pad */
+		if (next->next)
+			next->next->prev = hide_end;
+		elem->next = hide_end;
+	} else if (len_after > 0) {
+		RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
+		rte_panic("blow up\n");
+		return;
+	}
+
+	if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+		/* split before */
+		split_elem(elem, hide_start);
+
+		prev = elem;
+		elem = hide_start;
+
+		malloc_elem_free_list_insert(prev);
+	} else if (len_before > 0) {
+		/*
+		 * unlike with elements after current, here we don't need to
+		 * pad elements, but rather just increase the size of previous
+		 * element, copy the old header and set up trailer.
+		 */
+		void *trailer = RTE_PTR_ADD(prev,
+				prev->size - MALLOC_ELEM_TRAILER_LEN);
+
+		memcpy(hide_start, elem, sizeof(*elem));
+		hide_start->size = len;
+
+		prev->size += len_before;
+		set_trailer(prev);
+
+		/* update pointers */
+		prev->next = hide_start;
+		if (next)
+			next->prev = hide_start;
+
+		elem = hide_start;
+
+		/* erase old trailer */
+		memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+		/* erase old header */
+		memset(elem, 0, sizeof(*elem));
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 620dd44..8f4aef8 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -154,6 +154,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index d798675..5f8c643 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -149,48 +151,371 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound, bool contig)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound, contig);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	size_t map_len;
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	int n_segs, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_segs = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
+			socket, true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages < 0)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* check if we wanted contiguous memory but didn't get it */
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+free_pages:
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->page_sz;
+	uint64_t pg_sz_b = mslb->page_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->page_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound, contig))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound, contig))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			RTE_LOG(ERR, EAL, "Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	int socket, i, cur_socket;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_socket_count(); i++) {
+		cur_socket = rte_socket_id_by_idx(i);
+		if (cur_socket == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, cur_socket, flags,
+				align, bound, contig);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len, page_sz;
+	struct rte_memseg_list *msl;
+	int n_segs, seg_idx, max_seg_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
+	page_sz = (size_t)msl->page_sz;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	/* mark element as free */
+	elem->state = ELEM_FREE;
 
-	rte_spinlock_unlock(&(heap->lock));
+	elem = malloc_elem_free(elem);
+
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < page_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, page_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < page_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	heap->total_size -= aligned_len;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index c57b59a..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -26,8 +26,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned int flags, size_t align, size_t bound, bool contig);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c6d3e57..b51a6d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -40,10 +40,6 @@ void *
 rte_malloc_socket(const char *type, size_t size, unsigned int align,
 		int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -51,33 +47,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align,
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0, false);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 61/68] eal: add support for multiprocess memory hotplug
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (62 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 60/68] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 62/68] eal: add support for callbacks on " Anatoly Burakov
                     ` (6 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

The basic workflow is the following. The primary process always does
the initial mapping and unmapping, and secondary processes always
follow the primary's page map. Only one allocation request can be
active at any one time.

When the primary allocates memory, it ensures that all other
processes have allocated the same set of hugepages successfully;
otherwise, any allocations made are rolled back and the memory is
returned to the system. The heap is locked throughout the process,
and there is also a global memory hotplug lock, so no race
conditions can happen.

When the primary frees memory, it removes the memory from the heap,
deallocates the affected pages, and notifies other processes of the
deallocation. Since that memory chunk is no longer part of the heap,
the area becomes invisible to other processes even if they happen
to fail to unmap that specific set of pages, so it is safe to ignore
the results of sync requests.

When a secondary allocates memory, it does not do so by itself.
Instead, it sends a request to the primary process to try to
allocate pages of the specified size and on the specified socket,
such that the pending heap allocation request can complete. The
primary process then sends all secondaries (including the requestor)
a separate notification of the allocated pages, and expects all
secondary processes to report success before considering the pages
as "allocated".

Only after the primary process ensures that all memory has been
successfully allocated in every secondary process will it respond
positively to the initial request and let the secondary proceed with
the allocation. Since the heap now has memory that can satisfy the
allocation request, and it was locked all this time (so no other
allocations could take place), the secondary process will be able
to allocate memory from the heap.
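
In code terms, the secondary's allocation request looks roughly like
this (a sketch that mirrors try_expand_heap_secondary() added further
down in this patch; the heap pointer can be passed as-is because the
heaps live in shared memory):

struct malloc_mp_req req;

memset(&req, 0, sizeof(req));

req.t = REQ_TYPE_ALLOC;
req.alloc_req.align = align;
req.alloc_req.bound = bound;
req.alloc_req.contig = contig;
req.alloc_req.flags = flags;
req.alloc_req.elt_size = elt_size;
req.alloc_req.page_sz = pg_sz;
req.alloc_req.socket = socket;
req.alloc_req.heap = heap; /* it's in shared memory */

/* block until the primary replies, then check the outcome */
if (request_to_primary(&req) != 0 || req.result != REQ_RESULT_SUCCESS)
	return -1;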

When a secondary frees memory, it first hides the pages to be
deallocated from the heap. It then sends a deallocation request to
the primary process, so that the primary deallocates the pages
itself, and then sends a separate sync request to all other
processes (including the requestor) to unmap the same pages. This
way, even if the secondary fails to notify other processes of this
deallocation, that memory becomes invisible to other processes and
will not be allocated from again.
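
The secondary-side free path, condensed (again mirroring the code
added to malloc_heap_free() in this patch; aligned_start and
aligned_len are the page-aligned boundaries of the element being
freed):

struct malloc_mp_req req;

memset(&req, 0, sizeof(req));

/* the pages were already hidden from the heap at this point, so it
 * does not matter much whether the primary and the remaining
 * secondaries succeed in unmapping them
 */
req.t = REQ_TYPE_FREE;
req.free_req.addr = aligned_start;
req.free_req.len = aligned_len;

request_to_primary(&req);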

So, to summarize: address space will only become part of the heap
if the primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the
worst that can happen is that a page will "leak" and will be
available to neither DPDK nor the system, as some process will
still hold onto it. It is not an actual leak, as we can still
account for the page - it is just that none of the processes will
be able to use this page for anything useful until it gets
allocated from again by the primary.

Because the underlying DPDK IPC implementation is single-threaded,
some asynchronous handling had to be added, as we need to complete
several requests before we can definitively allow a secondary
process to use the allocated memory (namely, it has to be present
in all other secondary processes before it can be used).
Additionally, only one allocation request may be submitted at a
time.

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that, a
shared rwlock is used: it is taken as a read lock on init (so that
several secondaries can initialize concurrently), and as a write
lock when making allocation requests (so that either secondary init
has to wait, or the allocation request has to wait until all
processes have finished initializing).

Any other function that wishes to iterate over memory or prevent
allocations should use the memory hotplug lock.
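
In outline, the locking discipline looks like this (a sketch only -
the lock is the memory_hotplug_lock field added to struct
rte_mem_config by this patch, and in the actual code the read lock
taken at init is only released once the malloc heap is set up):

/* init and memseg walks take the lock shared */
rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
/* ... initialize secondary process / iterate over memsegs ... */
rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);

/* allocation and free requests take it exclusively */
rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
/* ... expand or shrink heaps, sync memory maps across processes ... */
rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);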

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/eal_common_memory.c         |  67 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_elem.c               | 124 ++--
 lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 744 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        |  32 +-
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  40 +-
 11 files changed, 1229 insertions(+), 125 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 8db085e..0c4c1f5 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -654,6 +654,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -678,15 +681,20 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 			len = n_segs * msl->page_sz;
 
 			ret = func(msl, ms, len, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr,
 					ms_idx + n_segs);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -695,6 +703,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -709,14 +720,19 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 		while (ms_idx >= 0) {
 			ms = rte_fbarray_get(arr, ms_idx);
 			ret = func(msl, ms, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -725,6 +741,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
@@ -732,12 +751,18 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 			continue;
 
 		ret = func(msl, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		if (ret < 0) {
+			ret = -1;
+			goto out;
+		}
+		if (ret > 0) {
+			ret = 1;
+			goto out;
+		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 /* init memory subsystem */
@@ -751,6 +776,9 @@ rte_eal_memory_init(void)
 	if (!mcfg)
 		return -1;
 
+	/* lock mem hotplug here, to prevent races while we init */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 #ifndef RTE_ARCH_64
 			memseg_primary_init_32() :
@@ -760,16 +788,19 @@ rte_eal_memory_init(void)
 			memseg_secondary_init();
 
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0)
-		return -1;
+		goto fail;
 
 	return 0;
+fail:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return -1;
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index a781793..aff0688 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -59,6 +59,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< indicates whether memory hotplug request is in progress. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 4346532..ee79dcd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -460,74 +460,80 @@ malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
 	prev = elem->prev;
 	next = elem->next;
 
-	len_before = RTE_PTR_DIFF(hide_start, elem);
-	len_after = RTE_PTR_DIFF(next, hide_end);
-
-	if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
-		/* split after */
-		split_elem(elem, hide_end);
-
-		malloc_elem_free_list_insert(hide_end);
-	} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
-		/* shrink current element */
-		elem->size -= len_after;
-		memset(hide_end, 0, sizeof(*hide_end));
-
-		/* copy next element's data to our pad */
-		memcpy(hide_end, next, sizeof(*hide_end));
-
-		/* pad next element */
-		next->state = ELEM_PAD;
-		next->pad = len_after;
-
-		/* next element is busy, would've been merged otherwise */
-		hide_end->pad = len_after;
-		hide_end->size += len_after;
-
-		/* adjust pointers to point to our new pad */
-		if (next->next)
-			next->next->prev = hide_end;
-		elem->next = hide_end;
-	} else if (len_after > 0) {
-		RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
-		rte_panic("blow up\n");
-		return;
+	/* we cannot do anything with non-adjacent elements */
+	if (next && next_elem_is_adjacent(elem)) {
+		len_after = RTE_PTR_DIFF(next, hide_end);
+		if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split after */
+			split_elem(elem, hide_end);
+
+			malloc_elem_free_list_insert(hide_end);
+		} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+			/* shrink current element */
+			elem->size -= len_after;
+			memset(hide_end, 0, sizeof(*hide_end));
+
+			/* copy next element's data to our pad */
+			memcpy(hide_end, next, sizeof(*hide_end));
+
+			/* pad next element */
+			next->state = ELEM_PAD;
+			next->pad = len_after;
+			next->size -= len_after;
+
+			/* next element busy, would've been merged otherwise */
+			hide_end->pad = len_after;
+			hide_end->size += len_after;
+
+			/* adjust pointers to point to our new pad */
+			if (next->next)
+				next->next->prev = hide_end;
+			elem->next = hide_end;
+		} else if (len_after > 0) {
+			RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
+			return;
+		}
 	}
 
-	if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
-		/* split before */
-		split_elem(elem, hide_start);
+	/* we cannot do anything with non-adjacent elements */
+	if (prev && prev_elem_is_adjacent(elem)) {
+		len_before = RTE_PTR_DIFF(hide_start, elem);
+		if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split before */
+			split_elem(elem, hide_start);
 
-		prev = elem;
-		elem = hide_start;
+			prev = elem;
+			elem = hide_start;
 
-		malloc_elem_free_list_insert(prev);
-	} else if (len_before > 0) {
-		/*
-		 * unlike with elements after current, here we don't need to
-		 * pad elements, but rather just increase the size of previous
-		 * element, copy the old header and set up trailer.
-		 */
-		void *trailer = RTE_PTR_ADD(prev,
-				prev->size - MALLOC_ELEM_TRAILER_LEN);
+			malloc_elem_free_list_insert(prev);
+		} else if (len_before > 0) {
+			/*
+			 * unlike with elements after current, here we don't
+			 * need to pad elements, but rather just increase the
+			 * size of previous element, copy the old header and set
+			 * up trailer.
+			 */
+			void *trailer = RTE_PTR_ADD(prev,
+					prev->size - MALLOC_ELEM_TRAILER_LEN);
 
-		memcpy(hide_start, elem, sizeof(*elem));
-		hide_start->size = len;
+			memcpy(hide_start, elem, sizeof(*elem));
+			hide_start->size = len;
 
-		prev->size += len_before;
-		set_trailer(prev);
+			prev->size += len_before;
+			set_trailer(prev);
 
-		/* update pointers */
-		prev->next = hide_start;
-		if (next)
-			next->prev = hide_start;
+			/* update pointers */
+			prev->next = hide_start;
+			if (next)
+				next->prev = hide_start;
 
-		elem = hide_start;
+			/* erase old trailer */
+			memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+			/* erase old header */
+			memset(elem, 0, sizeof(*elem));
 
-		/* erase old trailer */
-		memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
-		/* erase old header */
-		memset(elem, 0, sizeof(*elem));
+			elem = hide_start;
+		}
 	}
 
 	remove_elem(elem);
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5f8c643..be39250 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -171,68 +171,118 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_segs)
 {
-	size_t map_len;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int n_segs, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	size_t alloc_sz;
+	int allocd_pages;
 	void *ret, *map_addr;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_segs = map_len / pg_sz;
-
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_segs);
-
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages < 0)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
+	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
-	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
-	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
 
 	/* try once more, as now we have allocated new memory */
 	ret = find_suitable_element(heap, elt_size, flags, align, bound,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t alloc_sz;
+	int n_segs;
+
+	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_segs = alloc_sz / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+	if (ms == NULL)
+		return -1;
+
+	/* clear the array before handing it to the allocator */
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_segs);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += alloc_sz;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
-		socket, map_len >> 20ULL);
+		socket, alloc_sz >> 20ULL);
 
 	free(ms);
 
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
-free_pages:
-	eal_memalloc_free_seg_bulk(ms, n_segs);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -240,6 +290,59 @@ try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	memset(&req, 0, sizeof(req));
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -257,11 +360,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -393,7 +495,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -446,14 +549,41 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_segs, seg_idx, max_seg_idx;
+	struct rte_memseg_list *msl;
+	size_t page_sz;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	page_sz = (size_t)msl->page_sz;
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
-	int n_segs, seg_idx, max_seg_idx, ret;
+	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -494,25 +624,56 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so, as even if notifications about
+	 * unmapped pages don't make it to other processes, heap is shared
+	 * across all processes, and will become empty of this memory anyway,
+	 * and nothing can allocate it back unless primary process will be able
+	 * to deliver allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_segs = aligned_len / page_sz;
-	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
-	max_seg_idx = seg_idx + n_segs;
+	heap->total_size -= aligned_len;
 
-	for (; seg_idx < max_seg_idx; seg_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
-		eal_memalloc_free_seg(ms);
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		memset(&req, 0, sizeof(req));
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we request primary to deallocate pages, but we don't do it
+		 * in this thread. instead, we notify primary that we would like
+		 * to deallocate pages, and this process will receive another
+		 * request (in parallel) that will do it for us on another
+		 * thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
-	heap->total_size -= aligned_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -600,8 +761,16 @@ rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
-	if (mcfg == NULL)
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
 		return -1;
+	}
+
+	/* unlock mem hotplug here. it's safe for primary as no requests can
+	 * even come before primary itself is fully initialized, and secondaries
+	 * do not need to initialize the heap.
+	 */
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 
 	/* secondary process does not need to initialize anything */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..72b1f4c
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,744 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to notify of changes in memory map. this is
+ * essentially a regular sync request, but we cannot send sync requests while
+ * another one is in progress, and we might have to - therefore, we do this as
+ * a separate callback.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+/* forward declarations */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to timeout earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* this ID is, like, totally guaranteed to be absolutely unique. pinky swear. */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* secondary will respond to sync requests as follows */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply;
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	memset(&reply, 0, sizeof(reply));
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t alloc_sz;
+	int n_segs;
+	void *map_addr;
+
+	alloc_sz = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_segs = alloc_sz / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_segs);
+
+	if (elem == NULL)
+		goto fail;
+
+	map_addr = ms[0]->addr;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_segs;
+	req->alloc_state.map_addr = map_addr;
+	req->alloc_state.map_len = alloc_sz;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg;
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		memset(&sr_msg, 0, sizeof(sr_msg));
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be a stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts,
+					handle_sync_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg;
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		memset(&rb_msg, 0, sizeof(rb_msg));
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be a stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts,
+					handle_rollback_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected response to sync request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	memset(&msg, 0, sizeof(msg));
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer  __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request a memory map sync. this is only called when the
+ * primary process initiates an allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg;
+	struct rte_mp_reply reply;
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&reply, 0, sizeof(reply));
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be a stray timeout still waiting */
+	do {
+		ret = rte_mp_request_sync(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a series of asynchronous requests to
+ * the primary process. it initiates a request and waits for the response.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts;
+	struct timeval now;
+	int ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&ts, 0, sizeof(ts));
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..6810b4f
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_segs);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif // MALLOC_MP_H
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index ce242b1..8202a1b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -211,6 +211,32 @@ is_zero_length(int fd)
 	return st.st_blocks == 0;
 }
 
+/* we cannot use rte_memseg_list_walk() here because we will be holding a
+ * write lock whenever we enter any function in this file, and copying the
+ * same iteration code everywhere is not ideal either. so, use a lockless
+ * copy of the memseg list walk here.
+ */
+static int
+memseg_list_walk_thread_unsafe(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		ret = func(msl, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
@@ -739,7 +765,7 @@ eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 	wa.socket = socket;
 	wa.segs_allocated = 0;
 
-	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	ret = memseg_list_walk_thread_unsafe(alloc_seg_walk, &wa);
 	if (ret == 0) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
@@ -798,7 +824,7 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		wa.ms = cur;
 		wa.hi = hi;
 
-		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		walk_res = memseg_list_walk_thread_unsafe(free_seg_walk, &wa);
 		if (walk_res == 1)
 			continue;
 		if (walk_res == 0)
@@ -1055,7 +1081,7 @@ eal_memalloc_sync_with_primary(void)
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		return 0;
 
-	if (rte_memseg_list_walk(sync_walk, NULL))
+	if (memseg_list_walk_thread_unsafe(sync_walk, NULL))
 		return -1;
 	return 0;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2c27063..cbfe3c9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -53,6 +53,42 @@ static const struct vfio_iommu_type iommu_types[] = {
 	},
 };
 
+/* for sPAPR IOMMU, we will need to walk the memseg list, but we cannot use
+ * rte_memseg_walk() because by the time we enter the callback we will be
+ * holding a write lock, so a regular rte_memseg_walk() would deadlock.
+ * copying the same iteration code everywhere is not ideal either, so use a
+ * lockless copy of the memseg walk here.
+ */
+static int
+memseg_walk_thread_unsafe(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ms_idx, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
+
 int
 vfio_get_group_fd(int iommu_group_no)
 {
@@ -884,7 +920,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+	if (memseg_walk_thread_unsafe(vfio_spapr_window_size_walk, &param) < 0) {
 		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		return -1;
 	}
@@ -902,7 +938,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
 				return -1;
 			}
-			if (rte_memseg_walk(vfio_spapr_map_walk,
+			if (memseg_walk_thread_unsafe(vfio_spapr_map_walk,
 					&vfio_container_fd) < 0) {
 				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
 				return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 62/68] eal: add support for callbacks on memory hotplug
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (63 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 61/68] eal: add support for multiprocess memory hotplug Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 63/68] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
                     ` (5 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Each process will have its own callbacks. Callbacks will indicate
whether it was an allocation or a deallocation that happened, and will
also provide the start VA address and length of the affected block.

Since memory hotplug isn't supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either case.
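
As a purely illustrative sketch (the callback name and handler below are
hypothetical examples, not part of this patch), an application could use
the new API like this in non-legacy mode:

    #include <rte_errno.h>
    #include <rte_log.h>
    #include <rte_memory.h>

    /* hypothetical handler: log every memory hotplug event */
    static void
    mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len)
    {
        RTE_LOG(INFO, USER1, "mem event: %s at %p, len %zu\n",
            event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
            addr, len);
    }

    /* called once after rte_eal_init() */
    static int
    setup_mem_event_logging(void)
    {
        if (rte_mem_event_callback_register("example-mem-event",
                mem_event_cb) < 0) {
            RTE_LOG(ERR, USER1, "cannot register callback, rte_errno %d\n",
                rte_errno);
            return -1;
        }
        return 0;
    }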

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Made API experimental
    - Compile fixes

 lib/librte_eal/common/eal_common_memalloc.c | 133 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  11 +++
 lib/librte_eal/common/include/rte_memory.h  |  49 ++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 223 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 607ec3f..2d2d46f 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Double linked list of actions. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list mem_event_callback_list =
+	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
+
+static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_mem_event_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &mem_event_callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -88,3 +118,106 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 	}
 	return true;
 }
+
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_event_callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_event_callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&mem_event_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_event_callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback '%s'\n",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&mem_event_rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 0c4c1f5..e3ce69c 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -604,6 +604,34 @@ dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	return 0;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_register(name, clb);
+}
+
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4a7b45c..4d27403 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -56,4 +56,15 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name);
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 55383c4..0de1198 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -290,6 +290,55 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index df5802d..0460edb 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_event_callback_register;
+	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 63/68] eal: enable callbacks on malloc/free and mp sync
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (64 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 62/68] eal: add support for callbacks on " Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 64/68] vfio: enable support for mem event callbacks Anatoly Burakov
                     ` (4 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Callbacks will be triggered just after allocation and just
before deallocation, to ensure that the memory address range
referenced in the callback is always valid by the time the
callback is called.
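
As an illustration of this ordering guarantee, a callback that keeps an
external device mapping in sync can rely on the range being already mapped
when the ALLOC event fires, and still mapped when the FREE event fires.
The helpers my_device_map_range()/my_device_unmap_range() below are
hypothetical placeholders rather than DPDK API, so this is only a sketch:

    #include <rte_memory.h>

    /* hypothetical driver-specific registration helpers */
    int my_device_map_range(const void *addr, size_t len);
    int my_device_unmap_range(const void *addr, size_t len);

    static void
    dev_mem_event_cb(enum rte_mem_event event_type, const void *addr,
            size_t len)
    {
        if (event_type == RTE_MEM_EVENT_ALLOC)
            /* just after allocation: the range is already mapped */
            my_device_map_range(addr, len);
        else
            /* just before deallocation: the range is still mapped */
            my_device_unmap_range(addr, len);
    }

This is the same pattern the VFIO patch later in this series uses with
rte_vfio_dma_map()/rte_vfio_dma_unmap().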

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 15 +++++++++++++--
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index be39250..18c7b69 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -241,6 +241,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t alloc_sz;
 	int n_segs;
+	bool callback_triggered = false;
 
 	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -262,12 +263,22 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC, map_addr, alloc_sz);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
 	heap->total_size += alloc_sz;
@@ -280,6 +291,10 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				map_addr, alloc_sz);
+
 	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
 	request_sync();
@@ -642,6 +657,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= aligned_len;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -666,6 +685,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8202a1b..87c1bdb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -877,6 +877,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -902,6 +917,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
+				start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index cbfe3c9..8b5f8fd 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -931,6 +931,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	create.levels = 1;
 
 	if (do_map) {
+		void *addr;
 		/* re-create window and remap the entire memory */
 		if (iova > create.window_size) {
 			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
@@ -946,9 +947,19 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 		}
 		/* now that we've remapped all of the memory that was present
 		 * before, map the segment that we were requested to map.
+		 *
+		 * however, if we were called by the callback, the memory we
+		 * were called with was already in the memseg list, so previous
+		 * mapping should've mapped that segment already.
+		 *
+		 * virt2memseg_list is a relatively cheap check, so use that. if
+		 * memory is within any memseg list, it's a memseg, so it's
+		 * already mapped.
 		 */
-		if (vfio_spapr_dma_do_map(vfio_container_fd,
-				vaddr, iova, len, 1) < 0) {
+		addr = (void *)(uintptr_t)vaddr;
+		if (rte_mem_virt2memseg_list(addr) == NULL &&
+				vfio_spapr_dma_do_map(vfio_container_fd,
+					vaddr, iova, len, 1) < 0) {
 			RTE_LOG(ERR, EAL, "Could not map segment\n");
 			return -1;
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 64/68] vfio: enable support for mem event callbacks
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (65 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 63/68] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 65/68] eal: enable non-legacy memory mode Anatoly Burakov
                     ` (3 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Enable callbacks on first device attach, disable callbacks
on last device detach.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved callbacks to attach/detach as opposed to init

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 83 +++++++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 8b5f8fd..eb1a024 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -18,6 +18,8 @@
 
 #ifdef VFIO_PRESENT
 
+#define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
+
 /* per-process VFIO config */
 static struct vfio_config vfio_cfg;
 
@@ -250,6 +252,38 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	/* for IOVA as VA mode, no need to care for IOVA addresses */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		uint64_t vfio_va = (uint64_t)(uintptr_t)addr;
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(vfio_va, vfio_va, len);
+		else
+			rte_vfio_dma_unmap(vfio_va, vfio_va, len);
+		return;
+	}
+
+	/* memsegs are contiguous in memory */
+	ms = rte_mem_virt2memseg(addr, msl);
+	while (cur_len < len) {
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(ms->addr_64, ms->iova, ms->len);
+		else
+			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len);
+
+		cur_len += ms->len;
+		++ms;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -312,6 +346,8 @@ int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		int *vfio_dev_fd, struct vfio_device_info *device_info)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -399,6 +435,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+			/* lock memory hotplug before mapping and release it
+			 * after registering callback, to prevent races
+			 */
+			rte_rwlock_read_lock(mem_lock);
 			ret = t->dma_map_func(vfio_cfg.vfio_container_fd);
 			if (ret) {
 				RTE_LOG(ERR, EAL,
@@ -406,10 +446,17 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 					dev_addr, errno, strerror(errno));
 				close(vfio_group_fd);
 				rte_vfio_clear_group(vfio_group_fd);
+				rte_rwlock_read_unlock(mem_lock);
 				return -1;
 			}
 
 			vfio_cfg.vfio_iommu_type = t;
+
+			/* register callback for mem events */
+			rte_mem_event_callback_register(VFIO_MEM_EVENT_CLB_NAME,
+					vfio_mem_event_callback);
+			/* unlock memory hotplug */
+			rte_rwlock_read_unlock(mem_lock);
 		}
 	}
 
@@ -447,6 +494,8 @@ int
 rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		    int vfio_dev_fd)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -454,13 +503,20 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	int iommu_group_no;
 	int ret;
 
+	/* we don't want any DMA mapping messages to come while we're detaching
+	 * VFIO device, because this might be the last device and we might need
+	 * to unregister the callback.
+	 */
+	rte_rwlock_read_lock(mem_lock);
+
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
 	if (ret <= 0) {
 		RTE_LOG(WARNING, EAL, "  %s not managed by VFIO driver\n",
 			dev_addr);
 		/* This is an error at this point. */
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* get the actual group fd */
@@ -468,7 +524,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (vfio_group_fd <= 0) {
 		RTE_LOG(INFO, EAL, "vfio_get_group_fd failed for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* At this point we got an active group. Closing it will make the
@@ -480,7 +537,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (close(vfio_dev_fd) < 0) {
 		RTE_LOG(INFO, EAL, "Error when closing vfio_dev_fd for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* An VFIO group can have several devices attached. Just when there is
@@ -492,17 +550,30 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		if (close(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when closing vfio_group_fd for %s\n",
 				dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 
 		if (rte_vfio_clear_group(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when clearing group for %s\n",
 					   dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 	}
 
-	return 0;
+	/* if there are no active device groups, unregister the callback to
+	 * avoid spurious attempts to map/unmap memory from VFIO.
+	 */
+	if (vfio_cfg.vfio_active_groups == 0)
+		rte_mem_event_callback_unregister(VFIO_MEM_EVENT_CLB_NAME);
+
+	/* success */
+	ret = 0;
+
+out:
+	rte_rwlock_read_unlock(mem_lock);
+	return ret;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 65/68] eal: enable non-legacy memory mode
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (66 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 64/68] vfio: enable support for mem event callbacks Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 66/68] eal: add memory validator callback Anatoly Burakov
                     ` (2 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Now that every other piece of the puzzle is in place, enable non-legacy
init mode.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e7c6dcf..99c2242 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -772,8 +772,6 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
-	/* for now, always set legacy mem */
-	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 66/68] eal: add memory validator callback
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (67 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 65/68] eal: enable non-legacy memory mode Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 67/68] eal: enable validation before new page allocation Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 68/68] eal: prevent preallocated pages from being freed Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This API will enable an application to register for notifications
on page allocations that are about to happen, giving the application
a chance to allow or deny the allocation when the resulting total
memory utilization would be above the specified limit on the specified socket.
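
As a purely illustrative sketch (the limit, callback and name below are
hypothetical examples, not part of this patch), an application could cap
hotplugged memory on a given socket like this:

    #include <rte_log.h>
    #include <rte_memory.h>

    #define EXAMPLE_SOCKET 0
    #define EXAMPLE_LIMIT (1ULL << 30) /* hypothetical 1 GB cap */

    /* validators only fire once an allocation would bring usage to or
     * above the registered limit, so returning -1 denies such allocations
     */
    static int
    limit_validator(int socket_id, size_t cur_limit, size_t new_len)
    {
        RTE_LOG(WARNING, USER1,
            "denying allocation: socket %i would use %zu bytes (limit %zu)\n",
            socket_id, new_len, cur_limit);
        return -1;
    }

    /* called once after rte_eal_init(), in non-legacy mode */
    static int
    setup_mem_limit(void)
    {
        return rte_mem_alloc_validator_register("example-limit",
                limit_validator, EXAMPLE_SOCKET, EXAMPLE_LIMIT);
    }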

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memalloc.c | 138 +++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_common_memory.c   |  26 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 ++
 lib/librte_eal/common/include/rte_memory.h  |  59 ++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 234 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 2d2d46f..49fd53c 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -22,14 +22,26 @@ struct mem_event_callback_entry {
 	rte_mem_event_callback_t clb;
 };
 
+struct mem_alloc_validator_entry {
+	TAILQ_ENTRY(mem_alloc_validator_entry) next;
+	char name[RTE_MEM_ALLOC_VALIDATOR_NAME_LEN];
+	rte_mem_alloc_validator_t clb;
+	int socket_id;
+	size_t limit;
+};
+
 /** Double linked list of actions. */
 TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+TAILQ_HEAD(mem_alloc_validator_entry_list, mem_alloc_validator_entry);
 
 static struct mem_event_callback_entry_list mem_event_callback_list =
 	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
-
 static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
 
+static struct mem_alloc_validator_entry_list mem_alloc_validator_list =
+	TAILQ_HEAD_INITIALIZER(mem_alloc_validator_list);
+static rte_rwlock_t mem_alloc_validator_rwlock = RTE_RWLOCK_INITIALIZER;
+
 static struct mem_event_callback_entry *
 find_mem_event_callback(const char *name)
 {
@@ -42,6 +54,18 @@ find_mem_event_callback(const char *name)
 	return r;
 }
 
+static struct mem_alloc_validator_entry *
+find_mem_alloc_validator(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *r;
+
+	TAILQ_FOREACH(r, &mem_alloc_validator_list, next) {
+		if (!strcmp(r->name, name) && r->socket_id == socket_id)
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -221,3 +245,115 @@ eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 
 	rte_rwlock_read_unlock(&mem_event_rwlock);
 }
+
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	entry->socket_id = socket_id;
+	entry->limit = limit;
+	snprintf(entry->name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_alloc_validator_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i with limit %zu registered\n",
+		name, socket_id, limit);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+
+	if (name == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_alloc_validator_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i unregistered\n",
+		name, socket_id);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret = 0;
+
+	rte_rwlock_read_lock(&mem_alloc_validator_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_alloc_validator_list, next) {
+		if (entry->socket_id != socket_id || entry->limit > new_len)
+			continue;
+		RTE_LOG(DEBUG, EAL, "Calling mem alloc validator '%s' on socket %i\n",
+			entry->name, entry->socket_id);
+		if (entry->clb(socket_id, entry->limit, new_len) < 0)
+			ret = -1;
+	}
+
+	rte_rwlock_read_unlock(&mem_alloc_validator_rwlock);
+
+	return ret;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index e3ce69c..d221240 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -632,6 +632,32 @@ rte_mem_event_callback_unregister(const char *name)
 	return eal_memalloc_mem_event_callback_unregister(name);
 }
 
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_register(name, clb, socket_id,
+			limit);
+}
+
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_unregister(name, socket_id);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4d27403..6bec52c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -67,4 +67,14 @@ void
 eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 		size_t len);
 
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id);
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 0de1198..0318bfe 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -339,6 +339,65 @@ rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
 int __rte_experimental
 rte_mem_event_callback_unregister(const char *name);
 
+
+#define RTE_MEM_ALLOC_VALIDATOR_NAME_LEN 64
+/**< maximum length of alloc validator name */
+/**
+ * Function typedef used to register memory allocation validation callbacks.
+ *
+ * Returning 0 will allow the allocation attempt to continue. Returning -1
+ * will prevent the allocation from succeeding.
+ */
+typedef int (*rte_mem_alloc_validator_t)(int socket_id,
+		size_t cur_limit, size_t new_len);
+
+/**
+ * @brief Register validator callback for memory allocations.
+ *
+ * Callbacks registered by this function will be called right before the
+ * memory allocator is about to trigger allocation of more pages from the
+ * system, if said allocation would bring total memory usage above the
+ * specified limit on the specified socket. The user will be able to cancel
+ * the pending allocation if the callback returns -1.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @param limit
+ *   Limit above which to trigger callbacks.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+/**
+ * @brief Unregister validator callback for memory allocations.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 0460edb..b94c48c 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_alloc_validator_register;
+	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;
 	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
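
As a usage illustration of the validator API added in the patch above, here is
a minimal sketch of registering a per-socket allocation limit. Only
rte_mem_alloc_validator_register()/unregister() and the rte_mem_alloc_validator_t
typedef come from the patch; the callback and function names, the log type and
the 1 GB limit are hypothetical example values, and registration is expected to
fail with rte_errno set to ENOTSUP under --legacy-mem.

#include <rte_memory.h>
#include <rte_errno.h>
#include <rte_log.h>

/* hypothetical application callback: the framework only invokes it once a
 * pending allocation would bring total usage on this socket to or above the
 * registered limit. Returning -1 cancels the allocation, 0 lets it proceed.
 */
static int
app_mem_limit_clb(int socket_id, size_t cur_limit, size_t new_len)
{
	RTE_LOG(WARNING, USER1,
		"socket %i: refusing to grow to %zu bytes (limit %zu)\n",
		socket_id, new_len, cur_limit);
	return -1;
}

static int
app_install_mem_limit(void)
{
	/* example values: watch socket 0, trigger at 1 GB */
	if (rte_mem_alloc_validator_register("app-limit", app_mem_limit_clb,
			0, (size_t)1 << 30) < 0)
		return -rte_errno; /* e.g. ENOTSUP under --legacy-mem */
	return 0;
}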

* [PATCH v3 67/68] eal: enable validation before new page allocation
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (68 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 66/68] eal: add memory validator callback Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  2018-04-03 23:22   ` [PATCH v3 68/68] eal: prevent preallocated pages from being freed Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 18c7b69..f8daf84 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -196,6 +196,15 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	int allocd_pages;
 	void *ret, *map_addr;
 
+	alloc_sz = (size_t)pg_sz * n_segs;
+
+	/* first, check if we're allowed to allocate this memory */
+	if (eal_memalloc_mem_alloc_validate(socket,
+			heap->total_size + alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "User has disallowed allocation\n");
+		return NULL;
+	}
+
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
@@ -205,7 +214,6 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
-	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
 	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v3 68/68] eal: prevent preallocated pages from being freed
  2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
                     ` (69 preceding siblings ...)
  2018-04-03 23:22   ` [PATCH v3 67/68] eal: enable validation before new page allocation Anatoly Burakov
@ 2018-04-03 23:22   ` Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-03 23:22 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

It is reasonable to expect that a DPDK process will not deallocate any
pages that were preallocated by the "-m" or "--socket-mem" flags - yet,
currently, the DPDK memory subsystem will do exactly that once it finds
that the pages are unused.

Fix this by marking such pages as unfreeable, and preventing malloc from
ever trying to free them.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_memory.h |  3 +++
 lib/librte_eal/common/malloc_heap.c        | 23 +++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c |  7 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 18 +++++++++++++++---
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 0318bfe..58b607b 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -83,6 +83,8 @@ typedef uint64_t rte_iova_t;
 /**
  * Physical memory segment descriptor.
  */
+#define RTE_MEMSEG_FLAG_DO_NOT_FREE (1 << 0)
+/**< Prevent this segment from being freed back to the OS. */
 struct rte_memseg {
 	RTE_STD_C11
 	union {
@@ -99,6 +101,7 @@ struct rte_memseg {
 	int32_t socket_id;          /**< NUMA socket ID. */
 	uint32_t nchannel;          /**< Number of channels. */
 	uint32_t nrank;             /**< Number of ranks. */
+	uint32_t flags;             /**< Memseg-specific flags */
 } __rte_packed;
 
 /**
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f8daf84..41c14a8 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -606,6 +606,7 @@ malloc_heap_free(struct malloc_elem *elem)
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
+	unsigned int i, n_segs;
 	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
@@ -647,6 +648,28 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	/* we can free something. however, some of these pages may be marked as
+	 * unfreeable, so also check that as well
+	 */
+	n_segs = aligned_len / page_sz;
+	for (i = 0; i < n_segs; i++) {
+		const struct rte_memseg *tmp =
+				rte_mem_virt2memseg(aligned_start, msl);
+
+		if (tmp->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			/* this is an unfreeable segment, so move start */
+			aligned_start = RTE_PTR_ADD(tmp->addr, tmp->len);
+		}
+	}
+
+	/* recalculate length and number of segments */
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_segs = aligned_len / page_sz;
+
+	/* check if we can still free some pages */
+	if (n_segs == 0)
+		goto free_unlock;
+
 	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
 
 	/*
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 87c1bdb..0da2e3c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -806,6 +806,13 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		struct free_walk_param wa;
 		int i, walk_res;
 
+		/* if this page is marked as unfreeable, fail */
+		if (cur->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			RTE_LOG(DEBUG, EAL, "Page is not allowed to be freed\n");
+			ret = -1;
+			continue;
+		}
+
 		memset(&wa, 0, sizeof(wa));
 
 		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 7ec7129..2bd9c30 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1637,21 +1637,33 @@ eal_hugepage_init(void)
 			hp_sz_idx++) {
 		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
 				socket_id++) {
+			struct rte_memseg **pages;
 			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
 			unsigned int num_pages = hpi->num_pages[socket_id];
-			int num_pages_alloc;
+			int num_pages_alloc, i;
 
 			if (num_pages == 0)
 				continue;
 
+			pages = malloc(sizeof(*pages) * num_pages);
+
 			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
 				num_pages, hpi->hugepage_sz >> 20, socket_id);
 
-			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(pages,
 					num_pages, hpi->hugepage_sz,
 					socket_id, true);
-			if (num_pages_alloc < 0)
+			if (num_pages_alloc < 0) {
+				free(pages);
 				return -1;
+			}
+
+			/* mark preallocated pages as unfreeable */
+			for (i = 0; i < num_pages_alloc; i++) {
+				struct rte_memseg *ms = pages[i];
+				ms->flags |= RTE_MEMSEG_FLAG_DO_NOT_FREE;
+			}
+			free(pages);
 		}
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
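
To make the effect of the new flag concrete, below is a small sketch, assuming
the rte_memseg_walk() API introduced earlier in this patchset, that counts the
segments pinned with RTE_MEMSEG_FLAG_DO_NOT_FREE; the function names here are
hypothetical.

#include <rte_common.h>
#include <rte_memory.h>

/* walk callback: count segments carrying the DO_NOT_FREE flag set by
 * eal_hugepage_init() for pages preallocated via -m/--socket-mem
 */
static int
count_pinned_seg(const struct rte_memseg_list *msl __rte_unused,
		const struct rte_memseg *ms, void *arg)
{
	unsigned int *cnt = arg;

	if (ms->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE)
		(*cnt)++;
	return 0; /* continue walking */
}

static unsigned int
pinned_seg_count(void)
{
	unsigned int cnt = 0;

	rte_memseg_walk(count_pinned_seg, &cnt);
	return cnt;
}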

* Re: [PATCH v3 03/68] eal: make malloc heap a doubly-linked list
  2018-04-03 23:21   ` [PATCH v3 03/68] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-04-03 23:32     ` Stephen Hemminger
  2018-04-04  8:05       ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Stephen Hemminger @ 2018-04-03 23:32 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

On Wed,  4 Apr 2018 00:21:15 +0100
Anatoly Burakov <anatoly.burakov@intel.com> wrote:

> As we are preparing for dynamic memory allocation, we need to be
> able to handle holes in our malloc heap, hence we're switching to
> doubly linked list, and prepare infrastructure to support it.
> 
> Since our heap is now aware where are our first and last elements,
> there is no longer any need to have a dummy element at the end of
> each heap, so get rid of that as well. Instead, let insert/remove/
> join/split operations handle end-of-list conditions automatically.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

A dummy element at the end of the heap could be helpful for purify/valgrind-style
verification that code does not exceed its allocation.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 11/68] eal: enable reserving physically contiguous memzones
  2018-04-03 23:21   ` [PATCH v3 11/68] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-04-03 23:41     ` Stephen Hemminger
  2018-04-04  8:01       ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Stephen Hemminger @ 2018-04-03 23:41 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

On Wed,  4 Apr 2018 00:21:23 +0100
Anatoly Burakov <anatoly.burakov@intel.com> wrote:

> This adds a new set of _contig API's to rte_memzone. For now,
> hugepage memory is always contiguous, but we need to prepare the
> drivers for the switch.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>

Why not make fragmentation an optional flag to rte_memzone_reserve()
rather than a new API? That way fewer drivers need to be changed.

#define RTE_MEMZONE_SPARSE       0x00100000 /* Allow zone to be non-contiguous */

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 11/68] eal: enable reserving physically contiguous memzones
  2018-04-03 23:41     ` Stephen Hemminger
@ 2018-04-04  8:01       ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-04  8:01 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

On 04-Apr-18 12:41 AM, Stephen Hemminger wrote:
> On Wed,  4 Apr 2018 00:21:23 +0100
> Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> 
>> This adds a new set of _contig API's to rte_memzone. For now,
>> hugepage memory is always contiguous, but we need to prepare the
>> drivers for the switch.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Why not make fragmentation an optional flag to the memzone_reserved
> rather than a new API. That way less drivers need to be changed.
> 
> #define RTE_MEMZONE_SPARSE       0x00100000 /* Allow zone to be non-contiguous */
> 
That is a good idea, however, I would perhaps do it the other way 
around. Most of the time we are reserving memzones, we do not need those 
to be IOVA-contiguous (e.g. when creating rings, hashtables etc.). So it 
would rather be an RTE_MEMZONE_CONTIG (or some such) flag.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread
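
For clarity, the sketch below puts the two alternatives discussed above side by
side. None of this is released DPDK API: RTE_MEMZONE_SPARSE is Stephen's
proposed opt-out flag, RTE_MEMZONE_CONTIG is the opt-in variant Anatoly
suggests, and the _contig call mentioned in the comment follows the API set
described in the patch, so treat all of them as illustrative.

#include <rte_memzone.h>

/* both flag values are hypothetical, mirroring the proposals above */
#define RTE_MEMZONE_SPARSE 0x00100000 /* opt out: allow non-contiguous zone */
#define RTE_MEMZONE_CONTIG 0x00200000 /* opt in: require IOVA-contiguous zone */

static const struct rte_memzone *
reserve_dma_zone(size_t len, int socket_id)
{
	/* with the patch as posted, contiguity is a separate call, e.g.
	 *   rte_memzone_reserve_contig("dma_mz", len, socket_id, 0);
	 * with a flag-based approach, the default stays non-contiguous and
	 * only drivers that need DMA-contiguous memory opt in:
	 */
	return rte_memzone_reserve("dma_mz", len, socket_id,
			RTE_MEMZONE_CONTIG);
}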

* Re: [PATCH v3 03/68] eal: make malloc heap a doubly-linked list
  2018-04-03 23:32     ` Stephen Hemminger
@ 2018-04-04  8:05       ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-04  8:05 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

On 04-Apr-18 12:32 AM, Stephen Hemminger wrote:
> On Wed,  4 Apr 2018 00:21:15 +0100
> Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> 
>> As we are preparing for dynamic memory allocation, we need to be
>> able to handle holes in our malloc heap, hence we're switching to
>> doubly linked list, and prepare infrastructure to support it.
>>
>> Since our heap is now aware where are our first and last elements,
>> there is no longer any need to have a dummy element at the end of
>> each heap, so get rid of that as well. Instead, let insert/remove/
>> join/split operations handle end-of-list conditions automatically.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> 
> Dummy element at end of heap could be helpful for purify/valgrind style verification
> that code does not exceed allocation.
> 

It would interfere with removing pages from memory :) Dummy element 
needs to be stored somewhere - if it's stored in a page, that means that 
page cannot be freed. Moreover, with pages being added/removed 
dynamically, the dummy element will have to be moved back and forth, so 
"at the end of the heap" is not a fixed location.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 46/68] vfio: allow to map other memory regions
  2018-04-03 23:21   ` [PATCH v3 46/68] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-04-04 11:27     ` Burakov, Anatoly
  2018-04-05 11:30     ` Burakov, Anatoly
  1 sibling, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-04 11:27 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

On 04-Apr-18 12:21 AM, Anatoly Burakov wrote:
> Currently it is not possible to use memory that is not owned by DPDK to
> perform DMA. This scenario might be used in vhost applications (like
> SPDK) where the guest sends its own memory table. To fill this gap, provide
> an API to allow registering an arbitrary address in the VFIO container.
> 
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> ---
> 

There's a larger issue raised by this patch. We do support hotplug for 
VFIO devices, and VFIO will drop all maps whenever the last device used 
by VFIO is unplugged. Since the primary use case for this API is mapping 
non-DPDK-owned memory for DMA, this presents a problem: because we're 
not tracking the mapped areas anywhere, on a last-device hot-unplug 
followed by a first-device hotplug event, whatever DMA maps the user 
has set up will be gone - only DPDK-owned memory will be remapped.

One way we could solve it is to store user maps in a local tailq and 
remap those on hotplug. Another way would be to not solve it and just 
document this limitation - that using this API in conjunction with 
hotplug will result in undefined behavior. I would tend to favor the 
first way.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread
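
To make the "local tailq of user maps" suggestion above concrete, here is a
minimal sketch; the structure and function names are invented for illustration
and are not part of the patchset.

#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* one user-requested DMA mapping, remembered so it can be replayed */
struct user_mem_map {
	TAILQ_ENTRY(user_mem_map) next;
	uint64_t vaddr;
	uint64_t iova;
	uint64_t len;
};
TAILQ_HEAD(user_mem_map_head, user_mem_map);

static struct user_mem_map_head user_maps =
		TAILQ_HEAD_INITIALIZER(user_maps);

/* record a mapping at the time the user requests it */
static int
user_map_track(uint64_t vaddr, uint64_t iova, uint64_t len)
{
	struct user_mem_map *m = calloc(1, sizeof(*m));

	if (m == NULL)
		return -1;
	m->vaddr = vaddr;
	m->iova = iova;
	m->len = len;
	TAILQ_INSERT_TAIL(&user_maps, m, next);
	return 0;
}

/* replay all user maps, e.g. after the VFIO container is recreated on the
 * first device hotplug following a last-device hot-unplug; 'do_map' stands
 * in for whatever low-level VFIO DMA map routine is in use
 */
static int
user_maps_replay(int (*do_map)(uint64_t vaddr, uint64_t iova, uint64_t len))
{
	struct user_mem_map *m;

	TAILQ_FOREACH(m, &user_maps, next) {
		if (do_map(m->vaddr, m->iova, m->len) < 0)
			return -1;
	}
	return 0;
}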

* Re: [PATCH v3 46/68] vfio: allow to map other memory regions
  2018-04-03 23:21   ` [PATCH v3 46/68] vfio: allow to map other memory regions Anatoly Burakov
  2018-04-04 11:27     ` Burakov, Anatoly
@ 2018-04-05 11:30     ` Burakov, Anatoly
  1 sibling, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-05 11:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

On 04-Apr-18 12:21 AM, Anatoly Burakov wrote:
> Currently it is not possible to use memory that is not owned by DPDK to
> perform DMA. This scenario might be used in vhost applications (like
> SPDK) where the guest sends its own memory table. To fill this gap, provide
> an API to allow registering an arbitrary address in the VFIO container.
> 
> Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> ---

@Gowrishankar,

We've discussed this privately already, but just to make sure it is 
publicly stated: as it is, parts of this patchset for PPC64 have 
potential issues with them.

Unmapping and remapping the entire segment list on alloc introduces a 
race condition - what if a DMA request comes in while we're in the 
middle of remapping? We cannot realistically stop NICs from doing DMA 
while some other thread is allocating memory.

There is also a larger issue that I've raised in a previous response to 
this patch [1], and PPC64 will have this problem worse, because not only 
will the described issue happen on hot-unplug/hotplug, it may also 
happen during regular allocations, since the PPC64 IOMMU will drop all 
mappings on window resize.

[1] http://dpdk.org/ml/archives/dev/2018-April/095182.html

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-05 14:06     ` Shreyansh Jain
  2018-04-05 14:14     ` [PATCH] bus/fslmc: support for hotplugging of memory Shreyansh Jain
  1 sibling, 0 replies; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-05 14:06 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Hemant Agrawal, Shreyansh Jain, Nipun Gupta, Santosh Shukla,
	Jerin Jacob, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, olivier.matz, gowrishankar.m

Hello Anatoly,

On Wednesday 04 April 2018 04:51 AM, Anatoly Burakov wrote:
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   drivers/bus/fslmc/fslmc_vfio.c | 78 ++++++++++++++++++++++--------------------
>   drivers/event/dpaa2/Makefile   |  3 ++
>   drivers/mempool/dpaa2/Makefile |  3 ++
>   drivers/net/dpaa2/Makefile     |  3 ++
>   drivers/net/dpaa2/meson.build  |  3 ++
>   drivers/net/octeontx/Makefile  |  3 ++
>   6 files changed, 56 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 1310190..ccdbeff 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -193,17 +193,51 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
>   	return -errno;

[...]

I will send an incremental patch, in reply to this, which fixes dpaa2 
for the VA cases.

Though, I think this patch can be completely replaced by that one - if you 
prefer that, let me know and I will send it non-incrementally (based on master).

> diff --git a/drivers/net/dpaa2/meson.build b/drivers/net/dpaa2/meson.build
> index ad1724d..8e96b5a 100644
> --- a/drivers/net/dpaa2/meson.build
> +++ b/drivers/net/dpaa2/meson.build
> @@ -13,3 +13,6 @@ sources = files('base/dpaa2_hw_dpni.c',
>   		'mc/dpni.c')
>   
>   includes += include_directories('base', 'mc')
> +
> +# depends on fslmc bus which uses experimental API
> +allow_experimental_apis = true
> diff --git a/drivers/net/octeontx/Makefile b/drivers/net/octeontx/Makefile
> index 3e4a106..5f488b9 100644
> --- a/drivers/net/octeontx/Makefile
> +++ b/drivers/net/octeontx/Makefile
> @@ -16,6 +16,9 @@ EXPORT_MAP := rte_pmd_octeontx_version.map
>   
>   LIBABIVER := 1
>   
> +# depends on fslmc bus which uses experimental API

I think you wanted to say "octeontx" rather than fslmc here. Also, this 
is not part of the 'bus/fslmc' patch.

> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +
>   OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c)))
>   $(foreach obj, $(OBJS_BASE_DRIVER), $(eval CFLAGS_$(obj)+=$(CFLAGS_BASE_DRIVER)))
>   
> 

If the Octeon part is removed from the above, and the incremental patch is 
merged here, please use my Ack:

Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-05 14:24     ` Shreyansh Jain
@ 2018-04-05 14:12       ` Thomas Monjalon
  2018-04-05 14:20         ` Hemant Agrawal
  0 siblings, 1 reply; 471+ messages in thread
From: Thomas Monjalon @ 2018-04-05 14:12 UTC (permalink / raw)
  To: Shreyansh Jain
  Cc: Anatoly Burakov, dev, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, gowrishankar.m

05/04/2018 16:24, Shreyansh Jain:
> Physical addressing cases for both, dpaa/dpaa2, depend heavily on the 
> fact that physical addressing was the base and was available in sorted 
> manner. This is reversed/negated with hotplugging support.
> So, rework of both the drivers is required from this perspective. There 
> are some suggestions floated by Anatoly and internally, but work still 
> needs to be done.
> It also impacts a lot of use-cases for virtualization (no-iommu).

So what is your recommendation?
Can you rework PA case in dpaa/dpaa2 drivers within 18.05 timeframe?

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH] bus/fslmc: support for hotplugging of memory
  2018-04-03 23:21   ` [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
  2018-04-05 14:06     ` Shreyansh Jain
@ 2018-04-05 14:14     ` Shreyansh Jain
  2018-04-08 17:14       ` Burakov, Anatoly
  1 sibling, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-05 14:14 UTC (permalink / raw)
  To: anatoly.burakov; +Cc: dev, Shreyansh Jain

Restructure VFIO DMA code for handling hotplug memory events
(callbacks) and --legacy case.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---

###
This is based on the 16fbfef04a3 github repository. It assumes that the
changes already exist as done in patch 26/68.
Alternatively, this can be a standalone patch replacing 26/68, though the
Makefile changes don't exist in this one.
Also, this is just a first draft. I will push any review changes after this
incrementally over v4.
###

 drivers/bus/fslmc/fslmc_bus.c    |  15 ++++
 drivers/bus/fslmc/fslmc_vfio.c   | 161 +++++++++++++++++++++++++++++++++++----
 drivers/bus/fslmc/fslmc_vfio.h   |   1 +
 drivers/net/dpaa2/dpaa2_ethdev.c |   1 -
 4 files changed, 163 insertions(+), 15 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index 5ee0beb85..50884ff3a 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -266,6 +266,21 @@ rte_fslmc_probe(void)
 		return 0;
 	}
 
+	if (rte_log_get_global_level() >= RTE_LOG_DEBUG)
+		rte_dump_physmem_layout(stdout);
+
+	/* Map existing segments as well as, in case of hotpluggable memory,
+	 * install callback handler.
+	 */
+	ret = rte_fslmc_vfio_dmamap();
+	if (ret) {
+		FSLMC_BUS_LOG(ERR, "Unable to DMA map existing VAs: (%d)",
+			      ret);
+		/* Not continuing ahead */
+		FSLMC_BUS_LOG(ERR, "FSLMC VFIO Mapping failed");
+		return 0;
+	}
+
 	ret = fslmc_vfio_process_group();
 	if (ret) {
 		FSLMC_BUS_LOG(ERR, "Unable to setup devices %d", ret);
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 31831e3ce..60725fb70 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -30,6 +30,7 @@
 #include <rte_kvargs.h>
 #include <rte_dev.h>
 #include <rte_bus.h>
+#include <rte_eal_memconfig.h>
 
 #include "rte_fslmc.h"
 #include "fslmc_vfio.h"
@@ -193,11 +194,68 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
+static int fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+static int fslmc_unmap_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+
+static void
+fslmc_memevent_cb(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0, map_len = 0;
+	uint64_t pgsz, virt_addr;
+	rte_iova_t iova_addr;
+	int ret;
+
+	msl = rte_mem_virt2memseg_list(addr);
+	pgsz = msl->page_sz;
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+
+		ms = rte_mem_virt2memseg(va, msl);
+		iova_addr = rte_mem_virt2iova(va);
+		virt_addr = ms->addr_64;
+
+		map_len = len - cur_len;
+		/* Can only work with max pg_sz */
+		if (map_len > pgsz)
+			map_len = pgsz;
+
+		FSLMC_VFIO_LOG(DEBUG, "Calling with type=%d, va=%p,"
+			" virt_addr=%lX, iova=%lu, map_len=%lu\n",
+			type, va, virt_addr, iova_addr, map_len);
+
+		if (type == RTE_MEM_EVENT_ALLOC)
+			ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
+		else
+			ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
+
+		if (ret != 0) {
+			FSLMC_VFIO_LOG(ERR, "DMA Mapping/Unmapping failed. "
+				       "Map=%d, addr=%p, len=%lu, err:(%d)",
+				       type, va, map_len, ret);
+			return;
+		}
+
+		cur_len += map_len;
+	}
+
+	if (type == RTE_MEM_EVENT_ALLOC)
+		FSLMC_VFIO_LOG(DEBUG, "Total Mapped: addr=%p, len=%lu\n",
+			       addr, len);
+	else
+		FSLMC_VFIO_LOG(DEBUG, "Total Unmapped: addr=%p, len=%lu\n",
+			       addr, len);
+}
+
 static int
-fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
-		const struct rte_memseg *ms, void *arg)
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len)
+#else
+fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
+#endif
 {
-	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
@@ -205,10 +263,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 	};
 	int ret;
 
-	dma_map.size = ms->len;
-	dma_map.vaddr = ms->addr_64;
+	dma_map.size = len;
+	dma_map.vaddr = vaddr;
+
 #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-	dma_map.iova = ms->iova;
+	dma_map.iova = iovaddr;
 #else
 	dma_map.iova = dma_map.vaddr;
 #endif
@@ -221,33 +280,102 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 		return -1;
 	}
 
-	FSLMC_VFIO_LOG(DEBUG, "-->Initial SHM Virtual ADDR %llX",
-		     dma_map.vaddr);
-	FSLMC_VFIO_LOG(DEBUG, "-----> DMA size 0x%llX", dma_map.size);
-	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			&dma_map);
+	FSLMC_VFIO_LOG(DEBUG, "--> Map address: %llX, size: 0x%llX\n",
+		       dma_map.vaddr, dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 	if (ret) {
 		FSLMC_VFIO_LOG(ERR, "VFIO_IOMMU_MAP_DMA API(errno = %d)",
 				errno);
 		return -1;
 	}
-	(*n_segs)++;
+
 	return 0;
 }
 
+static int
+fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
+{
+	struct fslmc_vfio_group *group;
+	struct vfio_iommu_type1_dma_unmap dma_unmap = {
+		.argsz = sizeof(struct vfio_iommu_type1_dma_unmap),
+		.flags = 0,
+	};
+	int ret;
+
+	dma_unmap.size = len;
+	dma_unmap.iova = vaddr;
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		FSLMC_VFIO_LOG(ERR, "Container is not connected ");
+		return -1;
+	}
+
+	FSLMC_VFIO_LOG(DEBUG, "--> Unmap address: %llX, size: 0x%llX\n",
+		       dma_unmap.iova, dma_unmap.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
+	if (ret) {
+		FSLMC_VFIO_LOG(ERR, "VFIO_IOMMU_UNMAP_DMA API(errno = %d)",
+			       errno);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
+		 const struct rte_memseg *ms, void *arg)
+{
+	int *n_segs = arg;
+	int ret;
+
+	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
+	if (ret)
+		FSLMC_VFIO_LOG(ERR, "Unable to VFIO map (addr=%p, len=%lu)\n",
+			       ms->addr, ms->len);
+	else
+		(*n_segs)++;
+
+	return ret;
+}
+
 int rte_fslmc_vfio_dmamap(void)
 {
-	int i = 0;
+	int i = 0, ret;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 
 	if (is_dma_done)
 		return 0;
 
-	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+	/* Lock before parsing and registering callback to memory subsystem */
+	rte_rwlock_read_lock(mem_lock);
+
+	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
+		rte_rwlock_read_unlock(mem_lock);
 		return -1;
+	}
+
+	ret = rte_mem_event_callback_register("fslmc_memevent_clb",
+					      fslmc_memevent_cb);
+	if (ret)
+		/* Cannot install callback handler - thus, DMA Mapping cannot
+		 * be done. But, this is possible for --legacy-mem option
+		 */
+		FSLMC_VFIO_LOG(DEBUG, "Unable to install memory handler");
 
+	FSLMC_VFIO_LOG(DEBUG, "Installed memory callback handler");
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
+		/* TODO: Is this verification required any more? What would
+		 * happen to non-legacy case where nothing was preallocated
+		 * thus causing i==0?
+		 */
 		FSLMC_VFIO_LOG(ERR, "No Segments found for VFIO Mapping");
+		rte_rwlock_read_unlock(mem_lock);
 		return -1;
 	}
 	FSLMC_VFIO_LOG(DEBUG, "Total %d segments found.", i);
@@ -258,6 +386,11 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
+	/* Existing segments have been mapped and memory callback for hotplug
+	 * has been installed.
+	 */
+	rte_rwlock_read_unlock(mem_lock);
+
 	is_dma_done = 1;
 
 	return 0;
diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
index e8fb3445f..e77e4c4ac 100644
--- a/drivers/bus/fslmc/fslmc_vfio.h
+++ b/drivers/bus/fslmc/fslmc_vfio.h
@@ -9,6 +9,7 @@
 #define _FSLMC_VFIO_H_
 
 #include <rte_vfio.h>
+#include <rte_memory.h>
 
 #include "eal_vfio.h"
 
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 2fb7b2da7..962b90035 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -1850,7 +1850,6 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->rx_pkt_burst = dpaa2_dev_prefetch_rx;
 	eth_dev->tx_pkt_burst = dpaa2_dev_tx;
-	rte_fslmc_vfio_dmamap();
 
 	RTE_LOG(INFO, PMD, "%s: netdev created\n", eth_dev->data->name);
 	return 0;
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-05 14:12       ` Thomas Monjalon
@ 2018-04-05 14:20         ` Hemant Agrawal
  2018-04-06 12:01           ` Hemant Agrawal
  0 siblings, 1 reply; 471+ messages in thread
From: Hemant Agrawal @ 2018-04-05 14:20 UTC (permalink / raw)
  To: Thomas Monjalon, Shreyansh Jain
  Cc: Anatoly Burakov, dev, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, olivier.matz,

HI Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, April 05, 2018 7:43 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>
> Cc: Anatoly Burakov <anatoly.burakov@intel.com>; dev@dpdk.org;
> keith.wiles@intel.com; jianfeng.tan@intel.com; andras.kovacs@ericsson.com;
> laszlo.vadkeri@ericsson.com; benjamin.walker@intel.com;
> bruce.richardson@intel.com; konstantin.ananyev@intel.com;
> kuralamudhan.ramakrishnan@intel.com; louise.m.daly@intel.com;
> nelio.laranjeiro@6wind.com; yskoh@mellanox.com; pepperjo@japf.ch;
> jerin.jacob@caviumnetworks.com; Hemant Agrawal
> <hemant.agrawal@nxp.com>; olivier.matz@6wind.com;
> gowrishankar.m@linux.vnet.ibm.com
> Subject: Re: [dpdk-dev] [PATCH v3 00/68] Memory Hotplug for DPDK
> Importance: High
> 
> 05/04/2018 16:24, Shreyansh Jain:
> > Physical addressing cases for both, dpaa/dpaa2, depend heavily on the
> > fact that physical addressing was the base and was available in sorted
> > manner. This is reversed/negated with hotplugging support.
> > So, rework of both the drivers is required from this perspective.
> > There are some suggestions floated by Anatoly and internally, but work
> > still needs to be done.
> > It also impacts a lot of use-cases for virtualization (no-iommu).
> 
> So what is your recommendation?
> Can you rework PA case in dpaa/dpaa2 drivers within 18.05 timeframe?
> 
We would like 2-3 more days on this before we can ack/nack this patch.
We are working on this on priority. PA case rework is not a trivial change.

Regards,
Hemant

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 50/68] eal: replace memzone array with fbarray
  2018-04-03 23:22   ` [PATCH v3 50/68] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-04-05 14:23     ` Shreyansh Jain
  0 siblings, 0 replies; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-05 14:23 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

On Wednesday 04 April 2018 04:52 AM, Anatoly Burakov wrote:
> It's there, so we might as well use it. Some operations will be
> sped up by that.
> 
> Since we have to allocate an fbarray for memzones, we have to do
> it before we initialize memory subsystem, because that, in
> secondary processes, will (later) allocate more fbarrays than the
> primary process, which will result in inability to attach to
> memzone fbarray if we do it after the fact.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> 
> Notes:
>      v3:
>      - Moved earlier in patchset
>      - Fixed compiled issues
>      - Removed rte_panic() calls
> 

[...]

> diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
> index 529b36f..aed9331 100644
> --- a/lib/librte_eal/common/eal_common_memzone.c
> +++ b/lib/librte_eal/common/eal_common_memzone.c
> @@ -28,42 +28,31 @@
>   static inline const struct rte_memzone *
>   memzone_lookup_thread_unsafe(const char *name)
>   {
> -	const struct rte_mem_config *mcfg;
> +	struct rte_mem_config *mcfg;
> +	struct rte_fbarray *arr;
>   	const struct rte_memzone *mz;
> -	unsigned i = 0;
> +	int i = 0;
>   
>   	/* get pointer to global configuration */
>   	mcfg = rte_eal_get_configuration()->mem_config;
> +	arr = &mcfg->memzones;
>   
>   	/*
>   	 * the algorithm is not optimal (linear), but there are few
>   	 * zones and this function should be called at init only
>   	 */
> -	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
> -		mz = &mcfg->memzone[i];
> -		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
> -			return &mcfg->memzone[i];
> +	i = rte_fbarray_find_next_used(arr, 0);
> +	while (i >= 0) {
> +		mz = rte_fbarray_get(arr, i++);
                                          ^^^^^^^^
As discussed offline, this needs to be changed.
Double increment of 'i' leading to skips over lookup.

> +		if (mz->addr != NULL &&
> +				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
> +			return mz;
> +		i = rte_fbarray_find_next_used(arr, i + 1);
>   	}
>   
>   	return NULL;
>   }
>   

[..]

-
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
@ 2018-04-05 14:24     ` Shreyansh Jain
  2018-04-05 14:12       ` Thomas Monjalon
  2018-04-05 18:59     ` santosh
                       ` (71 subsequent siblings)
  72 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-05 14:24 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Hello Anatoly,

On Wednesday 04 April 2018 04:51 AM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
> 
> Dependencies (to be applied in specified order):
> - IPC asynchronous request API patch [2]
> - Function to return number of sockets [3]
> - EAL IOVA fix [4]
> 
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [5]
> - EAL NUMA node count changes [6]
> 
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
> 
> Quick outline of all changes done as part of this patchset:
> 
>   * Malloc heap adjusted to handle holes in address space
>   * Single memseg list replaced by multiple memseg lists
>   * VA space for hugepages is preallocated in advance
>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>   * Added contiguous memory allocation API's for rte_memzone
>   * Added convenience API calls to walk over memsegs
>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>     with VFIO [7]
>   * Callbacks for registering memory allocations
>   * Callbacks for allowing/disallowing allocations above specified limit
>   * Multiprocess support done via DPDK IPC introduced in 18.02
> 
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
> 
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
> 
> For v3, the following limitations are present:
> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>    with secondary processes is not tested; work is ongoing to validate VFIO
>    for all use cases
> - FSLMC bus VFIO code is not yet integrated, work is in progress
> 
> For testing, it is recommended to use the GitHub repository [8], as it will
> have all of the dependencies already integrated.
> 
> v3:
>      - Lots of compile fixes
>      - Fixes for multiprocess synchronization
>      - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
>      - Fixes for mempool size calculation
>      - Added convenience memseg walk() API's
>      - Added alloc validation callback
> 
> v2: - fixed deadlock at init
>      - reverted rte_panic changes at init, this is now handled inside IPC
> 
> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IPC_Async_Request/
> [3] http://dpdk.org/dev/patchwork/bundle/aburakov/Num_Sockets/
> [4] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
> [5] http://dpdk.org/dev/patchwork/patch/34002/
> [6] http://dpdk.org/dev/patchwork/patch/33853/
> [7] http://dpdk.org/dev/patchwork/patch/24484/
> [8] https://github.com/anatolyburakov/dpdk
> 

Thanks for the huge work and continuously answering my barrage of questions.
Here is the update for dpaa/dpaa2. Incremental to [1]:

Note: Results based on github repo (16fbfef04a37bb9d714), rather than 
this series over master. Though, it shouldn't make that big a difference.

PA: Physical Addressing mode
VA: Virtual Addressing mode

1. DPAA2 PA and VA work OK over the github repo
  a. Almost equal performance for VA cases as compared to non-hotplug, 
whether with --legacy-mem or without it. ($ see below)
  b. 70~90% drop in performance for the PA case, depending on the page 
size used (# see below)

2. DPAA PA (there is no VA mode) works with a minor fix over v3, which 
Anatoly knows about and might incorporate in v4 (Patch 50/68)
  a. 70-90% performance drop. (# see below)

($)
There are some changes in the dpaa2 code base which I will share against 
the relevant patch in this series. Those can be incorporated into v4 to 
enable dpaa2 in VA mode.

(#)
Physical addressing cases for both, dpaa/dpaa2, depend heavily on the 
fact that physical addressing was the base and was available in sorted 
manner. This is reversed/negated with hotplugging support.
So, rework of both the drivers is required from this perspective. There 
are some suggestions floated by Anatoly and internally, but work still 
needs to be done.
It also impacts a lot of use-cases for virtualization (no-iommu).

[1] http://dpdk.org/ml/archives/dev/2018-March/094184.html

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
  2018-04-05 14:24     ` Shreyansh Jain
@ 2018-04-05 18:59     ` santosh
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                       ` (70 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: santosh @ 2018-04-05 18:59 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev

Hi Anatoly,

On Wednesday 04 April 2018 04:51 AM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
>
> Dependencies (to be applied in specified order):
> - IPC asynchronous request API patch [2]
> - Function to return number of sockets [3]
> - EAL IOVA fix [4]
>
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [5]
> - EAL NUMA node count changes [6]
>
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new set of API's are added for contiguous memory
> allocation for rte_memzone, and a few API additions in rte_memory due to
> switch to memseg_lists as opposed to memsegs. Every other API change is
> internal to EAL, and all of the memory allocation/freeing is handled
> through rte_malloc, with no externally visible API changes.
>
> Quick outline of all changes done as part of this patchset:
>
>  * Malloc heap adjusted to handle holes in address space
>  * Single memseg list replaced by multiple memseg lists
>  * VA space for hugepages is preallocated in advance
>  * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>  * Added contiguous memory allocation API's for rte_memzone
>  * Added convenience API calls to walk over memsegs
>  * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>    with VFIO [7]
>  * Callbacks for registering memory allocations
>  * Callbacks for allowing/disallowing allocations above specified limit
>  * Multiprocess support done via DPDK IPC introduced in 18.02
>
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
>
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
>
> For v3, the following limitations are present:
> - VFIO support is only smoke-tested (but is expected to work), VFIO support
>   with secondary processes is not tested; work is ongoing to validate VFIO
>   for all use cases
> - FSLMC bus VFIO code is not yet integrated, work is in progress
>
> For testing, it is recommended to use the GitHub repository [8], as it will
> have all of the dependencies already integrated.
>
> v3:
>     - Lots of compile fixes
>     - Fixes for multiprocess synchronization
>     - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
>     - Fixes for mempool size calculation
>     - Added convenience memseg walk() API's
>     - Added alloc validation callback
>
> v2: - fixed deadlock at init
>     - reverted rte_panic changes at init, this is now handled inside IPC

Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-05 14:20         ` Hemant Agrawal
@ 2018-04-06 12:01           ` Hemant Agrawal
  2018-04-06 12:55             ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Hemant Agrawal @ 2018-04-06 12:01 UTC (permalink / raw)
  To: Hemant Agrawal, Thomas Monjalon, Shreyansh Jain
  Cc: Anatoly Burakov, dev, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, olivier.matz,

Hi Thomas

> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Thursday, April 05, 2018 7:43 PM
> > To: Shreyansh Jain <shreyansh.jain@nxp.com>
> > Cc: Anatoly Burakov <anatoly.burakov@intel.com>; dev@dpdk.org;
> > keith.wiles@intel.com; jianfeng.tan@intel.com;
> > andras.kovacs@ericsson.com; laszlo.vadkeri@ericsson.com;
> > benjamin.walker@intel.com; bruce.richardson@intel.com;
> > konstantin.ananyev@intel.com; kuralamudhan.ramakrishnan@intel.com;
> > louise.m.daly@intel.com; nelio.laranjeiro@6wind.com;
> > yskoh@mellanox.com; pepperjo@japf.ch; jerin.jacob@caviumnetworks.com;
> > Hemant Agrawal <hemant.agrawal@nxp.com>; olivier.matz@6wind.com;
> > gowrishankar.m@linux.vnet.ibm.com
> > Subject: Re: [dpdk-dev] [PATCH v3 00/68] Memory Hotplug for DPDK
> > Importance: High
> >
> > 05/04/2018 16:24, Shreyansh Jain:
> > > Physical addressing cases for both, dpaa/dpaa2, depend heavily on
> > > the fact that physical addressing was the base and was available in
> > > sorted manner. This is reversed/negated with hotplugging support.
> > > So, rework of both the drivers is required from this perspective.
> > > There are some suggestions floated by Anatoly and internally, but
> > > work still needs to be done.
> > > It also impacts a lot of use-cases for virtualization (no-iommu).
> >
> > So what is your recommendation?
> > Can you rework PA case in dpaa/dpaa2 drivers within 18.05 timeframe?
> >
> We will like 2-3 more days on this before we can ack/nack this patch.
> We are working on priority on this.  PA case rework is not a trivial change.

The patch is good to go. However, we will be making changes in dpaa/dpaa2 drivers to fix the PA issues shortly (within 18.05 timeframe)

Anatoly needs to take care of the following:
1. Comment by Shreyansh on " Re: [dpdk-dev] [PATCH v3 50/68] eal: replace memzone array with fbarray"
2. I could not apply the patches cleanly on current master.

Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
 
> Regards,
> Hemant
> 

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v3 00/68] Memory Hotplug for DPDK
  2018-04-06 12:01           ` Hemant Agrawal
@ 2018-04-06 12:55             ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-06 12:55 UTC (permalink / raw)
  To: Hemant Agrawal, Thomas Monjalon, Shreyansh Jain
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, olivier.matz,
	gowrishankar.m@linux.vnet.ibm.com

On 06-Apr-18 1:01 PM, Hemant Agrawal wrote:
> Hi Thomas
> 
>>> -----Original Message-----
>>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>>> Sent: Thursday, April 05, 2018 7:43 PM
>>> To: Shreyansh Jain <shreyansh.jain@nxp.com>
>>> Cc: Anatoly Burakov <anatoly.burakov@intel.com>; dev@dpdk.org;
>>> keith.wiles@intel.com; jianfeng.tan@intel.com;
>>> andras.kovacs@ericsson.com; laszlo.vadkeri@ericsson.com;
>>> benjamin.walker@intel.com; bruce.richardson@intel.com;
>>> konstantin.ananyev@intel.com; kuralamudhan.ramakrishnan@intel.com;
>>> louise.m.daly@intel.com; nelio.laranjeiro@6wind.com;
>>> yskoh@mellanox.com; pepperjo@japf.ch; jerin.jacob@caviumnetworks.com;
>>> Hemant Agrawal <hemant.agrawal@nxp.com>; olivier.matz@6wind.com;
>>> gowrishankar.m@linux.vnet.ibm.com
>>> Subject: Re: [dpdk-dev] [PATCH v3 00/68] Memory Hotplug for DPDK
>>> Importance: High
>>>
>>> 05/04/2018 16:24, Shreyansh Jain:
>>>> Physical addressing cases for both, dpaa/dpaa2, depend heavily on
>>>> the fact that physical addressing was the base and was available in
>>>> sorted manner. This is reversed/negated with hotplugging support.
>>>> So, rework of both the drivers is required from this perspective.
>>>> There are some suggestions floated by Anatoly and internally, but
>>>> work still needs to be done.
>>>> It also impacts a lot of use-cases for virtualization (no-iommu).
>>>
>>> So what is your recommendation?
>>> Can you rework PA case in dpaa/dpaa2 drivers within 18.05 timeframe?
>>>
>> We will like 2-3 more days on this before we can ack/nack this patch.
>> We are working on priority on this.  PA case rework is not a trivial change.
> 
> The patch is good to go. However, we will be making changes in dpaa/dpaa2 drivers to fix the PA issues shortly (within 18.05 timeframe)

That's great to hear!

> 
> Anatoly needs to take care of following:
> 1. Comment by Shreyansh on " Re: [dpdk-dev] [PATCH v3 50/68] eal: replace memzone array with fbarray"

Yes, that is already fixed in both github and upcoming v4.

> 2. I could not apply the patches cleanly on current master.

The patchset has dependencies, listed in the cover letter. I'll rebase 
on latest master before sending v4 just in case.

> 
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>   
>> Regards,
>> Hemant
>>
> 
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH] bus/fslmc: support for hotplugging of memory
  2018-04-05 14:14     ` [PATCH] bus/fslmc: support for hotplugging of memory Shreyansh Jain
@ 2018-04-08 17:14       ` Burakov, Anatoly
  2018-04-09  7:49         ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-08 17:14 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

On 05-Apr-18 3:14 PM, Shreyansh Jain wrote:
> Restructure VFIO DMA code for handling hotplug memory events
> (callbacks) and --legacy case.
> 
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
> 
> ###
> This is based on the 16fbfef04a3 github repository. This is assuming
> that changes already exists as done in patch 26/68.
> Though, this can be a standalone, replacing 26/88. Though, the Makefile
> changes don't exist in this.
> Also, this just a first draft. I will push any review changes after this
> incrementally over v4.
> ###

Hi Shreyansh,

I think we can keep the 26/68 as it still works within the context of 
the patchset. I would like to add these changes closer to the end, where 
we enable support for callbacks in VFIO (this could/should come as the 
next patch).

That said, I took some liberties when integrating this patch, hopefully 
for the better. I know you mentioned it's a draft, so you can post any 
comments for the inevitable v4 :)

> 
>   drivers/bus/fslmc/fslmc_bus.c    |  15 ++++
>   drivers/bus/fslmc/fslmc_vfio.c   | 161 +++++++++++++++++++++++++++++++++++----
>   drivers/bus/fslmc/fslmc_vfio.h   |   1 +
>   drivers/net/dpaa2/dpaa2_ethdev.c |   1 -
>   4 files changed, 163 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
> index 5ee0beb85..50884ff3a 100644
> --- a/drivers/bus/fslmc/fslmc_bus.c
> +++ b/drivers/bus/fslmc/fslmc_bus.c
> @@ -266,6 +266,21 @@ rte_fslmc_probe(void)
>   		return 0;
>   	}
>   
> +	if (rte_log_get_global_level() >= RTE_LOG_DEBUG)
> +		rte_dump_physmem_layout(stdout);

Presumably, this is not needed - just debug leftovers?

> +
> +	/* Map existing segments as well as, in case of hotpluggable memory,
> +	 * install callback handler.
> +	 */
> +	ret = rte_fslmc_vfio_dmamap();
> +	if (ret) {
> +		FSLMC_BUS_LOG(ERR, "Unable to DMA map existing VAs: (%d)",
> +			      ret);
> +		/* Not continuing ahead */
> +		FSLMC_BUS_LOG(ERR, "FSLMC VFIO Mapping failed");
> +		return 0;
> +	}
> +

What happens if there are no devices on the bus that can be used by
DPDK? As far as I can tell, it would return an error, which may or may
not be desirable (failing to map anything because there aren't any fslmc
devices is not really an error?).

For "regular" VFIO, the container is an empty shell unless you add
groups into it - does fslmc VFIO support work differently, with the
container already working/initialized by the time we reach this point?

Anyway, I'll leave this as is.
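
By the way, to make the end result a bit more concrete, the integrated
version boils down to roughly the following (a sketch only - "dmamap_seg"
is a placeholder name, and the real code keeps your logging and error
handling):

    static int
    dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
            const struct rte_memseg *ms, void *arg)
    {
        int *n_segs = arg;

        /* DMA map one existing segment's VA/IOVA/len into the container */
        if (fslmc_map_dma(ms->addr_64, ms->iova, ms->len))
            return -1;
        (*n_segs)++;
        return 0;
    }

    /* in rte_fslmc_vfio_dmamap(), under the memory read lock: map what
     * already exists, then rely on the mem event callback for anything
     * allocated later.
     */
    int n_segs = 0;

    if (rte_memseg_walk(dmamap_seg, &n_segs) < 0)
        return -1;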

>   	ret = fslmc_vfio_process_group();
>   	if (ret) {
>   		FSLMC_BUS_LOG(ERR, "Unable to setup devices %d", ret);
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 31831e3ce..60725fb70 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -30,6 +30,7 @@
>   #include <rte_kvargs.h>
>   #include <rte_dev.h>
>   #include <rte_bus.h>
> +#include <rte_eal_memconfig.h>

<...>

> +}
> +
>   static int
> -fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
> -		const struct rte_memseg *ms, void *arg)
> +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
> +fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len)
> +#else
> +fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
> +#endif

I think I'll leave this as just "rte_iova_t iovaddr __rte_unused" :)

>   {
> -	int *n_segs = arg;
>   	struct fslmc_vfio_group *group;
>   	struct vfio_iommu_type1_dma_map dma_map = {
>   		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
> @@ -205,10 +263,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
>   	};
>   	int ret;
>   
> -	dma_map.size = ms->len;
> -	dma_map.vaddr = ms->addr_64;
> +	dma_map.size = len;
> +	dma_map.vaddr = vaddr;

<...>

>   
>   	if (is_dma_done)
>   		return 0;
>   

I suspect this check was needed because you've done VFIO mapping on
device probe as opposed to bus probe, so the VFIO mapping function could
have been called multiple times. Is that still the case, or is this
check no longer needed? I took the liberty of removing it.

> -	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
> +	/* Lock before parsing and registering callback to memory subsystem */
> +	rte_rwlock_read_lock(mem_lock);
> +

<...>

>   	return 0;
> diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
> index e8fb3445f..e77e4c4ac 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.h
> +++ b/drivers/bus/fslmc/fslmc_vfio.h
> @@ -9,6 +9,7 @@
>   #define _FSLMC_VFIO_H_
>   
>   #include <rte_vfio.h>
> +#include <rte_memory.h>
>   
>   #include "eal_vfio.h"
>   

I suspect this change is not needed, so I took the liberty of removing it.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v4 00/70] Memory Hotplug for DPDK
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
  2018-04-05 14:24     ` Shreyansh Jain
  2018-04-05 18:59     ` santosh
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                         ` (70 more replies)
  2018-04-08 20:17     ` [PATCH v4 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                       ` (69 subsequent siblings)
  72 siblings, 71 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].

Dependencies (to be applied in specified order):
- EAL IOVA fix [2]

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [3]

The vast majority of changes are in the EAL and malloc, the external API
disruption is minimal: a new set of API's are added for contiguous memory
allocation for rte_memzone, and a few API additions in rte_memory due to
switch to memseg_lists as opposed to memsegs. Every other API change is
internal to EAL, and all of the memory allocation/freeing is handled
through rte_malloc, with no externally visible API changes.

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Added alloc/free for pages happening as needed on rte_malloc/rte_free
 * Added contiguous memory allocation API's for rte_memzone
 * Added convenience API calls to walk over memsegs
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [4]
 * Callbacks for registering memory allocations
 * Callbacks for allowing/disallowing allocations above specified limit
 * Multiprocess support done via DPDK IPC introduced in 18.02

The biggest difference is a "memseg" now represents a single page (as opposed to
being a big contiguous block of pages). As a consequence, both memzones and
malloc elements are no longer guaranteed to be physically contiguous, unless
the user asks for it at reserve time. To preserve whatever functionality that
was dependent on previous behavior, a legacy memory option is also provided,
however it is expected (or perhaps vainly hoped) to be temporary solution.

Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.

For v4, the following limitations are present:
- VFIO support for multiple processes is not well-tested; work is ongoing
  to validate VFIO for all use cases
- There are known problems with PPC64 VFIO code
- For DPAA and FSLMC platforms, performance will be heavily degraded for
  IOVA as PA cases; separate patches are expected to address the issue

For testing, it is recommended to use the GitHub repository [5], as it will
have all of the dependencies already integrated.

Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>

v4:
    - Fixed bug in memzone lookup
    - Added draft fslmc VFIO code
    - Rebased on latest master + dependent patchset
    - Documented limitations for *_walk() functions

v3:
    - Lots of compile fixes
    - Fixes for multiprocess synchronization
    - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
    - Fixes for mempool size calculation
    - Added convenience memseg walk() API's
    - Added alloc validation callback

v2:
    - Fixed deadlock at init
    - Reverted rte_panic changes at init; this is now handled inside IPC

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
[3] http://dpdk.org/dev/patchwork/patch/34002/
[4] http://dpdk.org/dev/patchwork/patch/24484/
[5] https://github.com/anatolyburakov/dpdk

Anatoly Burakov (70):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  eal: move all locking to heap
  eal: make malloc heap a doubly-linked list
  eal: add function to dump malloc heap contents
  test: add command to dump malloc heap contents
  eal: make malloc_elem_join_adjacent_free public
  eal: make malloc free list remove public
  eal: make malloc free return resulting malloc element
  eal: replace panics with error messages in malloc
  eal: add backend support for contiguous allocation
  eal: enable reserving physically contiguous memzones
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/bnxt: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory
  mempool: add support for the new allocation methods
  eal: add function to walk all memsegs
  bus/fslmc: use memseg walk instead of iteration
  bus/pci: use memseg walk instead of iteration
  net/mlx5: use memseg walk instead of iteration
  eal: use memseg walk instead of iteration
  mempool: use memseg walk instead of iteration
  test: use memseg walk instead of iteration
  vfio/type1: use memseg walk instead of iteration
  vfio/spapr: use memseg walk instead of iteration
  eal: add contig walk function
  virtio: use memseg contig walk instead of iteration
  eal: add iova2virt function
  bus/dpaa: use iova2virt instead of memseg iteration
  bus/fslmc: use iova2virt instead of memseg iteration
  crypto/dpaa_sec: use iova2virt instead of memseg iteration
  eal: add virt2memseg function
  bus/fslmc: use virt2memseg instead of iteration
  crypto/dpaa_sec: use virt2memseg instead of iteration
  net/mlx4: use virt2memseg instead of iteration
  net/mlx5: use virt2memseg instead of iteration
  eal: use memzone walk instead of iteration
  vfio: allow to map other memory regions
  eal: add "legacy memory" option
  eal: add rte_fbarray
  eal: replace memseg with memseg lists
  eal: replace memzone array with fbarray
  eal: add support for mapping hugepages at runtime
  eal: add support for unmapping pages at runtime
  eal: add "single file segments" command-line option
  eal: add API to check if memory is contiguous
  eal: prepare memseg lists for multiprocess sync
  eal: read hugepage counts from node-specific sysfs path
  eal: make use of memory hotplug for init
  eal: share hugepage info primary and secondary
  eal: add secondary process init with memory hotplug
  eal: enable memory hotplug support in rte_malloc
  eal: add support for multiprocess memory hotplug
  eal: add support for callbacks on memory hotplug
  eal: enable callbacks on malloc/free and mp sync
  vfio: enable support for mem event callbacks
  bus/fslmc: move vfio DMA map into bus probe
  bus/fslmc: enable support for mem event callbacks for vfio
  eal: enable non-legacy memory mode
  eal: add memory validator callback
  eal: enable validation before new page allocation
  eal: prevent preallocated pages from being freed

 config/common_base                                |   15 +-
 config/defconfig_i686-native-linuxapp-gcc         |    3 +
 config/defconfig_i686-native-linuxapp-icc         |    3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |    3 +
 config/rte_config.h                               |    7 +-
 doc/guides/rel_notes/deprecation.rst              |    9 -
 drivers/bus/dpaa/rte_dpaa_bus.h                   |   12 +-
 drivers/bus/fslmc/fslmc_bus.c                     |   11 +
 drivers/bus/fslmc/fslmc_vfio.c                    |  194 +++-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   27 +-
 drivers/bus/pci/Makefile                          |    3 +
 drivers/bus/pci/linux/pci.c                       |   28 +-
 drivers/bus/pci/meson.build                       |    3 +
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   30 +-
 drivers/crypto/qat/qat_qp.c                       |   23 +-
 drivers/event/dpaa2/Makefile                      |    3 +
 drivers/mempool/dpaa/Makefile                     |    3 +
 drivers/mempool/dpaa/meson.build                  |    3 +
 drivers/mempool/dpaa2/Makefile                    |    3 +
 drivers/mempool/dpaa2/meson.build                 |    3 +
 drivers/net/avf/avf_ethdev.c                      |    4 +-
 drivers/net/bnx2x/bnx2x.c                         |    2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/bnxt/bnxt_ethdev.c                    |   17 +-
 drivers/net/bnxt/bnxt_ring.c                      |    9 +-
 drivers/net/bnxt/bnxt_vnic.c                      |    8 +-
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/dpaa/Makefile                         |    3 +
 drivers/net/dpaa2/Makefile                        |    3 +
 drivers/net/dpaa2/dpaa2_ethdev.c                  |    1 -
 drivers/net/dpaa2/meson.build                     |    3 +
 drivers/net/ena/Makefile                          |    3 +
 drivers/net/ena/base/ena_plat_dpdk.h              |    9 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/enic_main.c                      |    9 +-
 drivers/net/i40e/i40e_ethdev.c                    |    4 +-
 drivers/net/i40e/i40e_rxtx.c                      |    4 +-
 drivers/net/mlx4/mlx4_mr.c                        |   17 +-
 drivers/net/mlx5/Makefile                         |    3 +
 drivers/net/mlx5/mlx5.c                           |   25 +-
 drivers/net/mlx5/mlx5_mr.c                        |   18 +-
 drivers/net/qede/base/bcm_osal.c                  |    7 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   83 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    5 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   83 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |   65 +-
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   48 +
 lib/librte_eal/bsdapp/eal/eal_memory.c            |  222 +++-
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 ++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  359 +++++++
 lib/librte_eal/common/eal_common_memory.c         |  823 ++++++++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  235 +++--
 lib/librte_eal/common/eal_common_options.c        |   13 +-
 lib/librte_eal/common/eal_filesystem.h            |   30 +
 lib/librte_eal/common/eal_hugepages.h             |   11 +-
 lib/librte_eal/common/eal_internal_cfg.h          |   12 +-
 lib/librte_eal/common/eal_memalloc.h              |   80 ++
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   28 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  353 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |   10 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |  258 ++++-
 lib/librte_eal/common/include/rte_memzone.h       |   12 +-
 lib/librte_eal/common/include/rte_vfio.h          |   39 +
 lib/librte_eal/common/malloc_elem.c               |  433 ++++++--
 lib/librte_eal/common/malloc_elem.h               |   43 +-
 lib/librte_eal/common/malloc_heap.c               |  704 ++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  744 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   85 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |   62 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  218 +++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1124 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 1119 ++++++++++++--------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  501 +++++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   12 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   30 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/Makefile                       |    3 +
 lib/librte_mempool/meson.build                    |    3 +
 lib/librte_mempool/rte_mempool.c                  |  150 ++-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   30 +-
 test/test/test_memory.c                           |   27 +-
 test/test/test_memzone.c                          |   62 +-
 95 files changed, 8425 insertions(+), 1279 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v4 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (2 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 02/70] eal: move all locking to heap Anatoly Burakov
                       ` (68 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: replace uint64_t with size_t for size variables
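
    For illustration only (not part of the patch): a typical caller that
    wants a page-aligned reservation, can live with a smaller area, and
    does not want the pages to stay mapped would do something like

        size_t size = mem_sz;
        void *addr = eal_get_virtual_area(NULL, &size, page_sz,
                EAL_VIRTUAL_AREA_ALLOW_SHRINK | EAL_VIRTUAL_AREA_UNMAP, 0);
        if (addr == NULL)
                return -1; /* rte_errno holds the reason */

    which mirrors the map_all_hugepages() conversion below.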

 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..5b8ced4 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				(size_t)baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..3fed436 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index b412fc1..24e6b50 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1343,7 +1275,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1354,11 +1286,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1368,6 +1295,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		size_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1376,35 +1305,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1469,7 +1389,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1478,8 +1397,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 02/70] eal: move all locking to heap
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (3 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
                       ` (67 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating or freeing OS memory, which would
involve growing/shrinking the heap.
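
For illustration (not part of this commit message), the resulting call
flow is roughly:

    rte_free(addr)
        -> malloc_heap_free(elem)       /* takes heap->lock */
            -> malloc_elem_free(elem)   /* no longer does any locking */

with malloc_elem_resize() wrapped by malloc_heap_resize() in the same way.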

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 03/70] eal: make malloc heap a doubly-linked list
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (4 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 02/70] eal: move all locking to heap Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
                       ` (66 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to a
doubly linked list and preparing the infrastructure to support it.

Since our heap is now aware of where its first and last elements are,
there is no longer any need to have a dummy element at the end of
each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Make first/last element pointers volatile
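
    For illustration only (not part of the patch): with prev/next links
    and heap->first/heap->last in place, a heap can now be traversed in
    either direction, e.g.

        struct malloc_elem *elem;

        for (elem = heap->last; elem != NULL; elem = elem->prev)
                ; /* inspect each element, free or busy */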

 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..d43fa90 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *volatile first;
+	struct malloc_elem *volatile last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere inbetween start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 04/70] eal: add function to dump malloc heap contents
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (5 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 05/70] test: add command " Anatoly Burakov
                       ` (65 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: mark function as experimental
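
    Example use (this is what the test command added in the next patch
    hooks up):

        rte_malloc_dump_heaps(stdout);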

 lib/librte_eal/common/include/rte_malloc.h | 10 ++++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 17 +++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 83 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a9fb7e4 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -13,6 +13,7 @@
 
 #include <stdio.h>
 #include <stddef.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 
 #ifdef __cplusplus
@@ -278,6 +279,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to retrieve data for heap on given socket
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..f11a822 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,23 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int idx;
+
+	for (idx = 0; idx < rte_socket_count(); idx++) {
+		unsigned int socket = rte_socket_id_by_idx(idx);
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dd38783..d9fc458 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -222,6 +222,7 @@ EXPERIMENTAL {
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
+	rte_malloc_dump_heaps;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 05/70] test: add command to dump malloc heap contents
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (6 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
                       ` (64 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 06/70] eal: make malloc_elem_join_adjacent_free public
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (7 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 05/70] test: add command " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 07/70] eal: make malloc free list remove public Anatoly Burakov
                       ` (63 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to join free segments to determine
whether the resulting contiguous free space is bigger than a
page size, allowing us to free some memory back to the system.
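
A rough sketch of the intent (illustrative only, not code from this patch;
the helper below and the page-size check are assumptions):

#include <stddef.h>

#include "malloc_elem.h"

/* merge a free element with its neighbours and ask whether the merged
 * free space is large enough to hand a page back to the system */
static int
merged_space_covers_page(struct malloc_elem *elem, size_t page_sz)
{
	/* join with adjacent free elements via the now-public helper */
	elem = malloc_elem_join_adjacent_free(elem);

	return elem->size >= page_sz;
}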

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 07/70] eal: make malloc free list remove public
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (8 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
                       ` (62 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We will need to be able to remove entries from heaps' free lists
during certain events, such as rollbacks, or when freeing memory
back to the system (where a previously existing element disappears
and thus can no longer be on the free list).
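
As an illustrative sketch of the kind of caller this enables (assumed
helper, not code from this patch):

#include "malloc_elem.h"

/* before releasing the memory backing a free element, make sure the
 * element can no longer be found through the heap's free lists */
static void
detach_before_release(struct malloc_elem *elem)
{
	malloc_elem_free_list_remove(elem);

	/* ... the backing pages can now be unmapped safely ... */
}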

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 08/70] eal: make malloc free return resulting malloc element
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (9 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 07/70] eal: make malloc free list remove public Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
                       ` (61 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This will be needed because we need to know how big the new
empty space is, in order to check whether we can free some
pages as a result.
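
A hedged sketch of how a caller might use the new return value
(illustrative only; the page-size threshold is an assumption):

#include <stddef.h>

#include "malloc_elem.h"

/* free an element and check whether the resulting contiguous free
 * space is big enough to consider returning pages to the system */
static void
free_and_check(struct malloc_elem *elem, size_t page_sz)
{
	struct malloc_elem *freed = malloc_elem_free(elem);

	if (freed != NULL && freed->size >= page_sz) {
		/* candidate for giving memory back to the system */
	}
}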

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 09/70] eal: replace panics with error messages in malloc
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (10 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
                       ` (60 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We shouldn't ever panic in system libraries, let alone in
such core ones as EAL, so replace all panic messages with
error messages.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/rte_malloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index f11a822..2cda48e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -30,7 +30,7 @@ void rte_free(void *addr)
 {
 	if (addr == NULL) return;
 	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
-		rte_panic("Fatal error: Invalid memory\n");
+		RTE_LOG(ERR, EAL, "Error: Invalid memory\n");
 }
 
 /*
@@ -134,8 +134,10 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 		return rte_malloc(NULL, size, align);
 
 	struct malloc_elem *elem = malloc_elem_from_data(ptr);
-	if (elem == NULL)
-		rte_panic("Fatal error: memory corruption detected\n");
+	if (elem == NULL) {
+		RTE_LOG(ERR, EAL, "Error: memory corruption detected\n");
+		return NULL;
+	}
 
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 10/70] eal: add backend support for contiguous allocation
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (11 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
                       ` (59 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Also, add a function to check a malloc element for physical
contiguousness. For now, assume hugepage memory is always
contiguous, while non-hugepage memory is checked page by page.
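
A hedged sketch of an internal caller using the new parameter
(illustrative only; the wrapper and the alignment value are assumptions):

#include <stdbool.h>
#include <stddef.h>

#include "malloc_heap.h"

/* request a physically contiguous allocation from a given heap */
static void *
heap_alloc_contig(struct malloc_heap *heap, size_t len)
{
	return malloc_heap_alloc(heap, NULL, len,
			0,	/* no size flags */
			64,	/* cache-line alignment */
			0,	/* no boundary */
			true);	/* require physical contiguity */
}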

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved this patch earlier
    - Added physical contiguousness checking function

 lib/librte_eal/common/eal_common_memzone.c |  23 +++---
 lib/librte_eal/common/malloc_elem.c        | 125 ++++++++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h        |   6 +-
 lib/librte_eal/common/malloc_heap.c        |  11 +--
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   7 +-
 6 files changed, 133 insertions(+), 43 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..16a2e7a 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -188,7 +189,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
+			requested_len, flags, align, bound, contig);
 
 	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
 		/* try other heaps */
@@ -197,7 +198,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 				continue;
 
 			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
+					NULL, requested_len, flags, align,
+					bound, contig);
 			if (mz_addr != NULL)
 				break;
 		}
@@ -235,9 +237,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -248,7 +250,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -265,7 +267,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -277,7 +279,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -289,7 +291,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..87695b9 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -6,6 +6,7 @@
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>
+#include <unistd.h>
 #include <sys/queue.h>
 
 #include <rte_memory.h>
@@ -94,33 +95,112 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+		void *start, size_t size)
+{
+	rte_iova_t cur, expected;
+	void *start_page, *end_page, *cur_page;
+	size_t pagesz;
+
+	/* for hugepage memory or IOVA as VA, it's always contiguous */
+	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* otherwise, check if start and end are within the same page */
+	pagesz = getpagesize();
+
+	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
+	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
+
+	if (start_page == end_page)
+		return true;
+
+	/* if they are from different pages, check if they are contiguous */
+
+	/* if we can't access physical addresses, assume non-contiguous */
+	if (!rte_eal_using_phys_addrs())
+		return false;
+
+	/* skip first iteration */
+	cur = rte_mem_virt2iova(start_page);
+	expected = cur + pagesz;
+	cur_page = RTE_PTR_ADD(start_page, pagesz);
+
+	while (cur_page <= end_page) {
+		cur = rte_mem_virt2iova(cur_page);
+		if (cur != expected)
+			return false;
+		cur_page = RTE_PTR_ADD(cur_page, pagesz);
+		expected += pagesz;
+	}
+	return true;
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
+
+	/*
+	 * we're allocating from the end, so adjust the size of element by
+	 * alignment size.
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
+
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+	/* if the new start point is before the existing start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->ms,
+					(void *)new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *)new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +209,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +339,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..34bd268 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
+#include <stdbool.h>
+
 #include <rte_memory.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
@@ -123,7 +125,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +133,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..564b61a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -88,7 +88,7 @@ malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -97,7 +97,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
@@ -121,7 +122,7 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
 		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+		size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
@@ -130,9 +131,9 @@ malloc_heap_alloc(struct malloc_heap *heap,
 
 	rte_spinlock_lock(&heap->lock);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..c57b59a 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+		unsigned int flags, size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 2cda48e..436818a 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket, i;
@@ -60,7 +61,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -71,7 +72,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 		if (ret != NULL)
 			return ret;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 11/70] eal: enable reserving physically contiguous memzones
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (12 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                       ` (58 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a new flag to request that the reserved memzone be
IOVA-contiguous. This is useful for allocating hardware resources like
NIC rings/queues etc. For now, hugepage memory is always contiguous,
but we need to prepare the drivers for the switch.
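
A hedged example of how a driver might use the new flag (the zone name
and 4K alignment are illustrative):

#include <stddef.h>

#include <rte_memzone.h>

/* reserve an IOVA-contiguous zone suitable for a hardware ring */
static const struct rte_memzone *
reserve_hw_ring(const char *name, size_t len, int socket_id)
{
	return rte_memzone_reserve_aligned(name, len, socket_id,
			RTE_MEMZONE_IOVA_CONTIG, 4096);
}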

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Replaced a new API with a memzone flag
    
    v3:
    - Moved this patch earlier

 lib/librte_eal/common/eal_common_memzone.c  | 25 +++++++++++++++++--------
 lib/librte_eal/common/include/rte_memzone.h | 11 +++++++++++
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 16a2e7a..af68c00 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -99,12 +99,13 @@ find_heap_max_free_elem(int *s, unsigned align)
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned int flags, unsigned int align,
-		unsigned int bound, bool contig)
+		unsigned int bound)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
+	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -170,7 +171,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (!rte_eal_has_hugepages())
 		socket_id = SOCKET_ID_ANY;
 
+	contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0;
+	/* malloc only cares about size flags, remove contig flag from flags */
+	flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -238,8 +249,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 static const struct rte_memzone *
 rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
-		unsigned int flags, unsigned int align, unsigned int bound,
-		bool contig)
+		unsigned int flags, unsigned int align, unsigned int bound)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -250,7 +260,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound, contig);
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -267,7 +277,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound, false);
+					       align, bound);
 }
 
 /*
@@ -279,7 +289,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0, false);
+					       align, 0);
 }
 
 /*
@@ -291,8 +301,7 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0,
-					       false);
+					       flags, RTE_CACHE_LINE_SIZE, 0);
 }
 
 int
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..e2630fd 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -23,6 +23,7 @@
  */
 
 #include <stdio.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 #include <rte_common.h>
 
@@ -39,6 +40,7 @@ extern "C" {
 #define RTE_MEMZONE_512MB          0x00040000   /**< Use 512MB pages. */
 #define RTE_MEMZONE_4GB            0x00080000   /**< Use 4GB pages. */
 #define RTE_MEMZONE_SIZE_HINT_ONLY 0x00000004   /**< Use available page size */
+#define RTE_MEMZONE_IOVA_CONTIG    0x00100000   /**< Ask for IOVA-contiguous memzone. */
 
 /**
  * A structure describing a memzone, which is a contiguous portion of
@@ -102,6 +104,9 @@ struct rte_memzone {
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @return
  *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
  *   on error.
@@ -152,6 +157,9 @@ const struct rte_memzone *rte_memzone_reserve(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @return
@@ -207,6 +215,9 @@ const struct rte_memzone *rte_memzone_reserve_aligned(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @param bound
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 12/70] ethdev: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (13 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 13/70] crypto/qat: " Anatoly Burakov
                       ` (57 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c
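
All of the above allocate their rings through the shared helper patched
below; a hedged sketch of such a call site (the include path, ring name
and alignment are illustrative assumptions):

#include <stdint.h>
#include <stddef.h>

#include <rte_ethdev_driver.h>

/* allocate DMA memory for a queue's descriptor ring via the common
 * helper, which now requests IOVA-contiguous memory */
static const struct rte_memzone *
example_setup_ring(struct rte_eth_dev *dev, uint16_t queue_id, size_t size,
		int socket_id)
{
	return rte_eth_dma_zone_reserve(dev, "example_ring", queue_id,
			size, 4096, socket_id);
}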

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4: replaced use of new API with additional memzone flag
    
    v3: moved this patch earlier in the patchset

 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2c74f7e..d0cf0e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3403,7 +3403,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 13/70] crypto/qat: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (14 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 14/70] net/avf: " Anatoly Burakov
                       ` (56 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Also, remove the weird page alignment code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
---

Notes:
    v4:
    - Replace new API with new memzone flag
    
    v3:
    - Move the patch earlier in the patchset
    - Fix build system files to allow experimental API's
    - Removed non-sensical memzone flags code

 drivers/crypto/qat/qat_qp.c | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..478b7ba 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -54,8 +54,6 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 			int socket_id)
 {
 	const struct rte_memzone *mz;
-	unsigned memzone_flags = 0;
-	const struct rte_memseg *ms;
 
 	PMD_INIT_FUNC_TRACE();
 	mz = rte_memzone_lookup(queue_name);
@@ -78,25 +76,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 
 	PMD_DRV_LOG(DEBUG, "Allocate memzone for %s, size %u on socket %u",
 					queue_name, queue_size, socket_id);
-	ms = rte_eal_get_physmem_layout();
-	switch (ms[0].hugepage_sz) {
-	case(RTE_PGSIZE_2M):
-		memzone_flags = RTE_MEMZONE_2MB;
-	break;
-	case(RTE_PGSIZE_1G):
-		memzone_flags = RTE_MEMZONE_1GB;
-	break;
-	case(RTE_PGSIZE_16M):
-		memzone_flags = RTE_MEMZONE_16MB;
-	break;
-	case(RTE_PGSIZE_16G):
-		memzone_flags = RTE_MEMZONE_16GB;
-	break;
-	default:
-		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
-	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned(queue_name, queue_size,
+		socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 14/70] net/avf: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (15 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 13/70] crypto/qat: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 15/70] net/bnx2x: " Anatoly Burakov
                       ` (55 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/avf/avf_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4442c3c..68a59b4 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,8 +1365,8 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 15/70] net/bnx2x: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (16 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 14/70] net/avf: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 16/70] net/bnxt: " Anatoly Burakov
                       ` (54 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/bnx2x/bnx2x.c      | 4 ++--
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..81f5dae 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,7 +177,7 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
-					0, align);
+					RTE_MEMZONE_IOVA_CONTIG, align);
 	if (z == NULL) {
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..6be7277 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 16/70] net/bnxt: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (17 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 15/70] net/bnx2x: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 17/70] net/cxgbe: " Anatoly Burakov
                       ` (53 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Ajit Khaparde, Somnath Kotur, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Added this driver to the list of modified drivers
    - Add experimental API to build files
    
    v3:
    - Added this patch
    
    All of this driver's memzone reserve calls subsequently check
    physical addresses, so it looks like they are reserving DMA memory.
    Corrections welcome.

 drivers/net/bnxt/bnxt_ethdev.c | 17 ++++++++++-------
 drivers/net/bnxt/bnxt_ring.c   |  9 +++++----
 drivers/net/bnxt/bnxt_vnic.c   |  8 ++++----
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 0b21653..ad7d925 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3147,9 +3147,10 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 				sizeof(struct rx_port_stats) + 512);
 		if (!mz) {
 			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
@@ -3181,10 +3182,12 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 		total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
 				sizeof(struct tx_port_stats) + 512);
 		if (!mz) {
-			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+			mz = rte_memzone_reserve(mz_name,
+					total_alloc_len,
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 8fb8972..0e8a6a2 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -166,10 +166,11 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve_aligned(mz_name, total_alloc_len,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY,
-					 getpagesize());
+				SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG,
+				getpagesize());
 		if (mz == NULL)
 			return -ENOMEM;
 	}
diff --git a/drivers/net/bnxt/bnxt_vnic.c b/drivers/net/bnxt/bnxt_vnic.c
index d4aeb4c..9ccc67e 100644
--- a/drivers/net/bnxt/bnxt_vnic.c
+++ b/drivers/net/bnxt/bnxt_vnic.c
@@ -185,10 +185,10 @@ int bnxt_alloc_vnic_attributes(struct bnxt *bp)
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve(mz_name,
-					 entry_length * max_vnics,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY);
+				entry_length * max_vnics, SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG);
 		if (!mz)
 			return -ENOMEM;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 17/70] net/cxgbe: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (18 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 16/70] net/bnxt: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 18/70] net/ena: " Anatoly Burakov
                       ` (52 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the makefile

 drivers/net/cxgbe/sge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 83e26d0..85846fc 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1344,7 +1344,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned(z_name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, 4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 18/70] net/ena: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (19 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 17/70] net/cxgbe: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 19/70] net/enic: " Anatoly Burakov
                       ` (51 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API's in the Makefile

 drivers/net/ena/base/ena_plat_dpdk.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..9334519 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY,	\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +221,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 19/70] net/enic: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (20 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 18/70] net/ena: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 20/70] net/i40e: " Anatoly Burakov
                       ` (50 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: John Daley <johndale@cisco.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in Makefile
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in the build system

 drivers/net/enic/enic_main.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 69ad425..94e8e68 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -343,8 +343,8 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
-					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
+	rz = rte_memzone_reserve_aligned((const char *)name, size,
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
 			__func__, name);
@@ -888,9 +888,8 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		instance++);
 
 	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
-						   sizeof(uint32_t),
-						   SOCKET_ID_ANY, 0,
-						   ENIC_ALIGN);
+			sizeof(uint32_t), SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!wq->cqmsg_rz)
 		return -ENOMEM;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 20/70] net/i40e: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (21 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 19/70] net/enic: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 21/70] net/qede: " Anatoly Burakov
                       ` (49 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in the build system

 drivers/net/i40e/i40e_ethdev.c | 4 ++--
 drivers/net/i40e/i40e_rxtx.c   | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index d0bf4e3..e00f402 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4053,8 +4053,8 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
 
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..56a854c 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,8 +2189,8 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
-					 socket_id, 0, I40E_RING_BASE_ALIGN);
+	mz = rte_memzone_reserve_aligned(name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, I40E_RING_BASE_ALIGN);
 	return mz;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 21/70] net/qede: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (22 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 20/70] net/i40e: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 22/70] net/virtio: " Anatoly Burakov
                       ` (48 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harish Patil <harish.patil@cavium.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Moved the patch earlier in the patchset
    - Allowed experimental API in the build system

 drivers/net/qede/base/bcm_osal.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index 91017b8..f550412 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,8 +135,8 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
-					 socket_id, 0, RTE_CACHE_LINE_SIZE);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 22/70] net/virtio: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (23 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 21/70] net/qede: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 23/70] net/vmxnet3: " Anatoly Burakov
                       ` (47 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    
    v3:
    - Moved patch earlier in the patchset

 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 2ef213d..f03d790 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -391,8 +391,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		     size, vq->vq_ring_size);
 
 	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
-					 SOCKET_ID_ANY,
-					 0, VIRTIO_PCI_VRING_ALIGN);
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+			VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
 		if (rte_errno == EEXIST)
 			mz = rte_memzone_lookup(vq_name);
@@ -417,8 +417,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
 		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+				SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 23/70] net/vmxnet3: use contiguous allocation for DMA memory
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (24 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 22/70] net/virtio: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 24/70] mempool: add support for the new allocation methods Anatoly Burakov
                       ` (46 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Shrikrishna Khare, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Moved patch earlier in the patchset
    - Allowed experimental API in the build system

 drivers/net/vmxnet3/vmxnet3_ethdev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4260087..104664a 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -150,13 +150,14 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 		if (mz)
 			rte_memzone_free(mz);
 		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+				RTE_MEMZONE_IOVA_CONTIG, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 24/70] mempool: add support for the new allocation methods
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (25 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 23/70] net/vmxnet3: " Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 25/70] eal: add function to walk all memsegs Anatoly Burakov
                       ` (45 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

If a user has specified that the zone should have contiguous memory,
use the new _contig allocation APIs instead of the normal ones.
Otherwise, account for the fact that, unless we're in IOVA-as-VA
mode, we cannot guarantee that the pages will be physically
contiguous, so calculate the memzone size and alignment as if
we were getting the smallest page size available.

However, for the non-IOVA-contiguous case, the existing mempool size
calculation function doesn't give the expected results, because it
returns memzone sizes aligned to page size (e.g. a 1MB mempool may
use an entire 1GB page). Therefore, in cases where we weren't
specifically asked to reserve non-contiguous memory, first try
reserving the memzone as IOVA-contiguous, and if that fails, fall
back to reserving with page-aligned size and alignment.
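
As a rough restatement of the decision described above, a hypothetical helper
(not part of the patch; the real logic lives in rte_mempool_populate_default()
in the hunk below):

#include <stdbool.h>

#include <rte_eal.h>
#include <rte_mempool.h>

/* hypothetical helper restating the page-shift decision above: page
 * size/shift only matter when objects must be IOVA-contiguous, the whole
 * zone is not reserved as one contiguous block, and IOVAs are physical
 * addresses.
 */
static bool
mempool_needs_page_alignment(const struct rte_mempool *mp)
{
	bool no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
	bool force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;

	if (no_contig || force_contig || rte_eal_iova_mode() == RTE_IOVA_VA)
		return false;

	return true;
}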

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Fixed mempool size calculation
    - Fixed handling of contiguous memzones
    - Moved earlier in the patchset

 lib/librte_mempool/rte_mempool.c | 149 +++++++++++++++++++++++++++++++++------
 1 file changed, 127 insertions(+), 22 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..85fbdca 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2016 6WIND S.A.
  */
 
+#include <stdbool.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdint.h>
@@ -98,6 +99,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		if (ms->hugepage_sz < min_pagesz)
+			min_pagesz = ms->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -204,7 +226,6 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 	return sz->total_size;
 }
 
-
 /*
  * Calculate maximum amount of memory required to store given number of objects.
  */
@@ -367,16 +388,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	/* update mempool capabilities */
 	mp->flags |= mp_capa_flags;
 
-	/* Detect pool area has sufficient space for elements */
-	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
-		if (len < total_elt_sz * mp->size) {
-			RTE_LOG(ERR, MEMPOOL,
-				"pool area %" PRIx64 " not enough\n",
-				(uint64_t)len);
-			return -ENOSPC;
-		}
-	}
-
 	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
 	if (memhdr == NULL)
 		return -ENOMEM;
@@ -549,6 +560,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig, try_contig, no_pageshift;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,9 +575,68 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * the following section calculates page shift and page size values.
+	 *
+	 * these values impact the result of rte_mempool_xmem_size(), which
+	 * returns the amount of memory that should be allocated to store the
+	 * desired number of objects. when not zero, it allocates more memory
+	 * for the padding between objects, to ensure that an object does not
+	 * cross a page boundary. in other words, page size/shift are to be set
+	 * to zero if mempool elements won't care about page boundaries.
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter as well.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 *
+	 * However, since size calculation will produce page-aligned sizes, it
+	 * makes sense to first try and see if we can reserve the entire memzone
+	 * in one contiguous chunk as well (otherwise we might end up wasting a
+	 * 1G page on a 10MB memzone). If we fail to get enough contiguous
+	 * memory, then we'll go and reserve space page-by-page.
+	 */
+	no_pageshift = no_contig || force_contig ||
+			rte_eal_iova_mode() == RTE_IOVA_VA;
+	try_contig = !no_contig && !no_pageshift && rte_eal_has_hugepages();
+	if (force_contig)
+		mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+	if (no_pageshift) {
 		pg_sz = 0;
+		pg_shift = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else if (try_contig) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
+		/* we're trying to reserve contiguous memzone first, so try
+		 * align to cache line; if we fail to reserve a contiguous
+		 * memzone, we'll adjust alignment to equal pagesize later.
+		 */
 		align = RTE_CACHE_LINE_SIZE;
 	} else {
 		pg_sz = getpagesize();
@@ -575,8 +646,13 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
-						mp->flags);
+		unsigned int flags;
+		if (try_contig || no_pageshift)
+			size = rte_mempool_xmem_size(n, total_elt_sz, 0,
+				mp->flags);
+		else
+			size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
+				mp->flags);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -585,23 +661,52 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
+		flags = mz_flags;
+
+		/* if we're trying to reserve contiguous memory, add appropriate
+		 * memzone flag.
+		 */
+		if (try_contig)
+			flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+		mz = rte_memzone_reserve_aligned(mz_name, size, mp->socket_id,
+				flags, align);
+
+		/* if we were trying to allocate contiguous memory, adjust
+		 * memzone size and page size to fit smaller page sizes, and
+		 * try again.
+		 */
+		if (mz == NULL && try_contig) {
+			try_contig = false;
+			flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+			align = pg_sz;
+			size = rte_mempool_xmem_size(n, total_elt_sz,
+				pg_shift, mp->flags);
+
+			mz = rte_memzone_reserve_aligned(mz_name, size,
+				mp->socket_id, flags, align);
+		}
+		/* don't try reserving with 0 size if we were asked to reserve
+		 * IOVA-contiguous memory.
+		 */
+		if (!force_contig && mz == NULL) {
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
 			mz = rte_memzone_reserve_aligned(mz_name, 0,
-				mp->socket_id, mz_flags, align);
+					mp->socket_id, flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (no_pageshift || try_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 25/70] eal: add function to walk all memsegs
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (26 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 24/70] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:17     ` [PATCH v4 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
                       ` (44 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For code that might need to iterate over the list of allocated
segments, using this API will make it more resilient to internal
API changes and will prevent copying the same iteration code over
and over again.

Additionally, locking will be implemented down the line, so users
of this API will not need to worry about locking either.
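
A minimal usage sketch, assuming the callback and return-value conventions
documented in the header below; the counting helper itself is made up for
illustration, and callers also need to allow experimental APIs at build time,
as later patches in this series do:

#include <rte_memory.h>

/* count allocated memsegs through the walk API instead of open-coding
 * a loop over the memseg array
 */
static int
count_cb(const struct rte_memseg *ms __rte_unused, void *arg)
{
	int *cnt = arg;

	(*cnt)++;
	return 0; /* 0 continues the walk */
}

static int
count_memsegs(void)
{
	int cnt = 0;

	if (rte_memseg_walk(count_cb, &cnt) < 0)
		return -1; /* a callback reported an error */

	return cnt;
}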

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 21 +++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 25 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 47 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 5b8ced4..947db1f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -218,6 +218,27 @@ rte_mem_lock_page(const void *virt)
 	return mlock((void *)aligned, page_size);
 }
 
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		ret = func(ms, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..93eadaa 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -20,6 +20,7 @@ extern "C" {
 #endif
 
 #include <rte_common.h>
+#include <rte_compat.h>
 #include <rte_config.h>
 
 __extension__
@@ -130,6 +131,30 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Memseg walk function prototype.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+
+/**
+ * Walk list of all memsegs.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d9fc458..716b965 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 26/70] bus/fslmc: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (27 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 25/70] eal: add function to walk all memsegs Anatoly Burakov
@ 2018-04-08 20:17     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 27/70] bus/pci: " Anatoly Burakov
                       ` (43 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:17 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 78 ++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 4291871..0c048dc 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -189,17 +189,51 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 {
-	int ret;
+	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
 		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
 	};
+	int ret;
+
+	dma_map.size = ms->len;
+	dma_map.vaddr = ms->addr_64;
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	dma_map.iova = ms->iova;
+#else
+	dma_map.iova = dma_map.vaddr;
+#endif
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
+		return -1;
+	}
+
+	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
+			dma_map.vaddr);
+	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
+			&dma_map);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
+				errno);
+		return -1;
+	}
+	(*n_segs)++;
+	return 0;
+}
 
-	int i;
+int rte_fslmc_vfio_dmamap(void)
+{
 	const struct rte_memseg *memseg;
+	int i = 0;
 
 	if (is_dma_done)
 		return 0;
@@ -210,51 +244,21 @@ int rte_fslmc_vfio_dmamap(void)
 		return -ENODEV;
 	}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL && memseg[i].len == 0) {
-			DPAA2_BUS_DEBUG("Total %d segments found", i);
-			break;
-		}
-
-		dma_map.size = memseg[i].len;
-		dma_map.vaddr = memseg[i].addr_64;
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-		dma_map.iova = memseg[i].iova;
-#else
-		dma_map.iova = dma_map.vaddr;
-#endif
-
-		/* SET DMA MAP for IOMMU */
-		group = &vfio_group;
-
-		if (!group->container) {
-			DPAA2_BUS_ERR("Container is not connected");
-			return -1;
-		}
-
-		DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-				dma_map.vaddr);
-		DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-		ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			    &dma_map);
-		if (ret) {
-			DPAA2_BUS_ERR("Unable to map DMA address (errno = %d)",
-				      errno);
-			return ret;
-		}
-	}
+	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+		return -1;
 
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
 		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
 		return -1;
 	}
+	DPAA2_BUS_DEBUG("Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
 	 * the interrupt region to SMMU. This should be removed once the
 	 * support is added in the Kernel.
 	 */
-	vfio_map_irq_region(group);
+	vfio_map_irq_region(&vfio_group);
 
 	is_dma_done = 1;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 27/70] bus/pci: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (28 preceding siblings ...)
  2018-04-08 20:17     ` [PATCH v4 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 28/70] net/mlx5: " Anatoly Burakov
                       ` (42 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/pci/Makefile    |  3 +++
 drivers/bus/pci/linux/pci.c | 26 ++++++++++++++------------
 drivers/bus/pci/meson.build |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/pci/Makefile b/drivers/bus/pci/Makefile
index f3df1c4..804a198 100644
--- a/drivers/bus/pci/Makefile
+++ b/drivers/bus/pci/Makefile
@@ -49,6 +49,9 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/pci/$(SYSTEM)
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/$(SYSTEM)app/eal
 
+# memseg walk is not part of stable API yet
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_pci
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..6dda054 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -116,22 +116,24 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 	}
 }
 
-void *
-pci_find_max_end_va(void)
+static int
+find_max_end_va(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	void **max_va = arg;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	if (*max_va < end_va)
+		*max_va = end_va;
+	return 0;
+}
 
-		if (seg->addr > last->addr)
-			last = seg;
+void *
+pci_find_max_end_va(void)
+{
+	void *va = NULL;
 
-	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	rte_memseg_walk(find_max_end_va, &va);
+	return va;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/bus/pci/meson.build b/drivers/bus/pci/meson.build
index 12756a4..72939e5 100644
--- a/drivers/bus/pci/meson.build
+++ b/drivers/bus/pci/meson.build
@@ -14,3 +14,6 @@ else
 	sources += files('bsd/pci.c')
 	includes += include_directories('bsd')
 endif
+
+# memseg walk is not part of stable API yet
+allow_experimental_apis = true
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 28/70] net/mlx5: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (29 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 27/70] bus/pci: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 29/70] eal: " Anatoly Burakov
                       ` (41 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx5/Makefile |  3 +++
 drivers/net/mlx5/mlx5.c   | 24 +++++++++++++++---------
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda411..25c8e10 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -92,6 +92,9 @@ CFLAGS += -Wno-error=cast-qual
 EXPORT_MAP := rte_pmd_mlx5_version.map
 LIBABIVER := 1
 
+# memseg walk is not part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # DEBUG which is usually provided on the command-line may enable
 # CONFIG_RTE_LIBRTE_MLX5_DEBUG.
 ifeq ($(DEBUG),1)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 7d58d66..1724b65 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -477,6 +477,19 @@ static struct rte_pci_driver mlx5_driver;
  */
 static void *uar_base;
 
+static int
+find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+{
+	void **addr = arg;
+
+	if (*addr == NULL)
+		*addr = ms->addr;
+	else
+		*addr = RTE_MIN(*addr, ms->addr);
+
+	return 0;
+}
+
 /**
  * Reserve UAR address space for primary process.
  *
@@ -491,21 +504,14 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 	void *addr = (void *)0;
-	int i;
-	const struct rte_mem_config *mcfg;
 
 	if (uar_base) { /* UAR address space mapped. */
 		priv->uar_base = uar_base;
 		return 0;
 	}
 	/* find out lower bound of hugepage segments */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
-		if (addr)
-			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
-		else
-			addr = mcfg->memseg[i].addr;
-	}
+	rte_memseg_walk(find_lower_va_bound, &addr);
+
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
 	/* anonymous mmap, no real memory consumption. */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 29/70] eal: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (30 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 28/70] net/mlx5: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 30/70] mempool: " Anatoly Burakov
                       ` (40 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c           | 25 +++++++-----
 lib/librte_eal/common/eal_common_memory.c | 67 ++++++++++++++++---------------
 lib/librte_eal/common/malloc_heap.c       | 33 +++++++++------
 lib/librte_eal/linuxapp/eal/eal.c         | 22 +++++-----
 4 files changed, 81 insertions(+), 66 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..8e25d78 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -429,23 +429,26 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_socket(const struct rte_memseg *ms, void *arg)
+{
+	int *socket_id = arg;
+
+	if (ms->socket_id == *socket_id)
+		return 1;
+
+	return 0;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 947db1f..4f588c7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,54 +131,57 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+static int
+physmem_size(const struct rte_memseg *ms, void *arg)
+{
+	uint64_t *total_len = arg;
+
+	*total_len += ms->len;
+
+	return 0;
+}
 
 /* get the total size of memory */
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
 	uint64_t total_len = 0;
 
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
+	rte_memseg_walk(physmem_size, &total_len);
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
+	return total_len;
+}
 
-		total_len += mcfg->memseg[i].len;
-	}
+static int
+dump_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i = ms - mcfg->memseg;
+	FILE *f = arg;
 
-	return total_len;
+	if (i < 0 || i >= RTE_MAX_MEMSEG)
+		return -1;
+
+	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+			"virt:%p, socket_id:%"PRId32", "
+			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			"nrank:%"PRIx32"\n", i,
+			mcfg->memseg[i].iova,
+			mcfg->memseg[i].len,
+			mcfg->memseg[i].addr,
+			mcfg->memseg[i].socket_id,
+			mcfg->memseg[i].hugepage_sz,
+			mcfg->memseg[i].nchannel,
+			mcfg->memseg[i].nrank);
+
+	return 0;
 }
 
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
-	}
+	rte_memseg_walk(dump_memseg, f);
 }
 
 /* return the number of memory channels */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 564b61a..79914fc 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -67,17 +67,32 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static int
+malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_elem *start_elem;
+	struct rte_memseg *found_ms;
+	struct malloc_heap *heap;
+	size_t elem_size;
+	int ms_idx;
+
+	heap = &mcfg->malloc_heaps[ms->socket_id];
+
+	/* ms is const, so find it */
+	ms_idx = ms - mcfg->memseg;
+	found_ms = &mcfg->memseg[ms_idx];
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
+	start_elem = (struct malloc_elem *)found_ms->addr;
+	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+
+	malloc_elem_init(start_elem, heap, found_ms, elem_size);
 	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
+
+	return 0;
 }
 
 /*
@@ -244,17 +259,11 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
-	}
+	rte_memseg_walk(malloc_heap_add_memseg, NULL);
 
 	return 0;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..77f6cb7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -638,23 +638,23 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_mem(const struct rte_memseg *ms, void *arg)
+{
+	int *socket = arg;
+
+	return ms->socket_id == *socket;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 30/70] mempool: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (31 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 29/70] eal: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 31/70] test: " Anatoly Burakov
                       ` (39 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_mempool/Makefile      |  3 +++
 lib/librte_mempool/meson.build   |  3 +++
 lib/librte_mempool/rte_mempool.c | 24 ++++++++++++------------
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 24e735a..1f85d34 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -13,6 +13,9 @@ EXPORT_MAP := rte_mempool_version.map
 
 LIBABIVER := 3
 
+# memseg walk is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index 712720f..89506c5 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -5,3 +5,6 @@ version = 3
 sources = files('rte_mempool.c', 'rte_mempool_ops.c')
 headers = files('rte_mempool.h')
 deps += ['ring']
+
+# memseg walk is not yet part of stable API
+allow_experimental_apis = true
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 85fbdca..ea62b6b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -99,23 +99,23 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static int
+find_min_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	size_t *min = arg;
+
+	if (ms->hugepage_sz < *min)
+		*min = ms->hugepage_sz;
+
+	return 0;
+}
+
 static size_t
 get_min_page_size(void)
 {
-	const struct rte_mem_config *mcfg =
-			rte_eal_get_configuration()->mem_config;
-	int i;
 	size_t min_pagesz = SIZE_MAX;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-
-		if (ms->addr == NULL)
-			continue;
-
-		if (ms->hugepage_sz < min_pagesz)
-			min_pagesz = ms->hugepage_sz;
-	}
+	rte_memseg_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 31/70] test: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (32 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 30/70] mempool: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 32/70] vfio/type1: " Anatoly Burakov
                       ` (38 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 test/test/test_malloc.c  | 40 +++++++++++++++++++++++-------------
 test/test/test_memory.c  | 23 +++++++++++----------
 test/test/test_memzone.c | 53 ++++++++++++++++++++++++++++++++----------------
 3 files changed, 74 insertions(+), 42 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index ccc5fea..28c241f 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -705,16 +705,34 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
+static int
+check_socket_mem(const struct rte_memseg *ms, void *arg)
+{
+	int32_t *socket = arg;
+
+	return *socket == ms->socket_id;
+}
+
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
+	return rte_memseg_walk(check_socket_mem, &socket);
+}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
-			return 1;
+struct walk_param {
+	void *addr;
+	int32_t socket;
+};
+static int
+find_socket(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_param *param = arg;
+
+	if (param->addr >= ms->addr &&
+			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
+		param->socket = ms->socket_id;
+		return 1;
 	}
 	return 0;
 }
@@ -726,15 +744,9 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
+	struct walk_param param = {.addr = addr, .socket = 0};
+	if (rte_memseg_walk(find_socket, &param) > 0)
+		return param.socket;
 	return -1;
 }
 
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..c9b287c 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -23,12 +23,20 @@
  */
 
 static int
+check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+{
+	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
+	size_t i;
+
+	for (i = 0; i < ms->len; i++, mem++)
+		*mem;
+	return 0;
+}
+
+static int
 test_memory(void)
 {
 	uint64_t s;
-	unsigned i;
-	size_t j;
-	const struct rte_memseg *mem;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -45,14 +53,7 @@ test_memory(void)
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
-
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
-		}
-	}
+	rte_memseg_walk(check_mem, NULL);
 
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..cbf0cfa 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -104,28 +104,47 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void)
 	return 0;
 }
 
+struct walk_arg {
+	int hugepage_2MB_avail;
+	int hugepage_1GB_avail;
+	int hugepage_16MB_avail;
+	int hugepage_16GB_avail;
+};
+static int
+find_available_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_arg *wa = arg;
+
+	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+		wa->hugepage_2MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+		wa->hugepage_1GB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+		wa->hugepage_16MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+		wa->hugepage_16GB_avail = 1;
+
+	return 0;
+}
+
 static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
-	int hugepage_2MB_avail = 0;
-	int hugepage_1GB_avail = 0;
-	int hugepage_16MB_avail = 0;
-	int hugepage_16GB_avail = 0;
+	struct walk_arg wa;
+	int hugepage_2MB_avail, hugepage_1GB_avail;
+	int hugepage_16MB_avail, hugepage_16GB_avail;
 	const size_t size = 100;
-	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
-			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
-			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
-			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
-			hugepage_16GB_avail = 1;
-	}
+
+	memset(&wa, 0, sizeof(wa));
+
+	rte_memseg_walk(find_available_pagesz, &wa);
+
+	hugepage_2MB_avail = wa.hugepage_2MB_avail;
+	hugepage_1GB_avail = wa.hugepage_1GB_avail;
+	hugepage_16MB_avail = wa.hugepage_16MB_avail;
+	hugepage_16GB_avail = wa.hugepage_16GB_avail;
+
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
 	if (hugepage_2MB_avail)
 		printf("2MB Huge pages available\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 32/70] vfio/type1: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (33 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 31/70] test: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 33/70] vfio/spapr: " Anatoly Burakov
                       ` (37 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 45 ++++++++++++++++------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2421d51..2a34ae9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -665,39 +665,36 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-vfio_type1_dma_map(int vfio_container_fd)
+type1_map(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	int *vfio_container_fd = arg;
+	struct vfio_iommu_type1_dma_map dma_map;
+	int ret;
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
-		if (ms[i].addr == NULL)
-			break;
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
-		}
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
 	}
-
 	return 0;
 }
 
 static int
+vfio_type1_dma_map(int vfio_container_fd)
+{
+	return rte_memseg_walk(type1_map, &vfio_container_fd);
+}
+
+static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 33/70] vfio/spapr: use memseg walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (34 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 32/70] vfio/type1: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 34/70] eal: add contig walk function Anatoly Burakov
                       ` (36 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 108 +++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 45 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2a34ae9..fb41e82 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -694,16 +694,69 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+struct spapr_walk_param {
+	uint64_t window_size;
+	uint64_t hugepage_sz;
+};
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+spapr_window_size(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	struct spapr_walk_param *param = arg;
+	uint64_t max = ms->iova + ms->len;
+
+	if (max > param->window_size) {
+		param->hugepage_sz = ms->hugepage_sz;
+		param->window_size = max;
+	}
 
+	return 0;
+}
+
+static int
+spapr_map(const struct rte_memseg *ms, void *arg)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
 	};
+	int *vfio_container_fd = arg;
+	int ret;
+
+	reg.vaddr = (uintptr_t) ms->addr;
+	reg.size = ms->len;
+	ret = ioctl(*vfio_container_fd,
+		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+			 VFIO_DMA_MAP_FLAG_WRITE;
+
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct spapr_walk_param param;
+	int ret;
 	struct vfio_iommu_spapr_tce_info info = {
 		.argsz = sizeof(info),
 	};
@@ -714,6 +767,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		.argsz = sizeof(remove),
 	};
 
+	memset(&param, 0, sizeof(param));
+
 	/* query spapr iommu info */
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
@@ -732,17 +787,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
-	}
+	rte_memseg_walk(spapr_window_size, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
-	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -758,39 +807,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
-
-		if (ms[i].addr == NULL)
-			break;
-
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
-
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-	}
+	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+		return -1;
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 34/70] eal: add contig walk function
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (35 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 33/70] vfio/spapr: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
                       ` (35 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This function is meant to walk over the first segment of each
VA-contiguous group of memsegs.

For future users of this function, this is done so that there is
less dependency on internals of the mem API and less noise in
later change sets.
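
As an aside, a minimal usage sketch of the new walk is shown below; the
callback name, the counter argument and the printf output are purely
illustrative and not part of this patch:

#include <stdio.h>
#include <rte_memory.h>

/* illustrative callback: report each VA-contiguous area */
static int
dump_contig_area(const struct rte_memseg *ms, size_t len, void *arg)
{
        unsigned int *n = arg;

        printf("area %u: virt=%p len=%zu\n", (*n)++, ms->addr, len);
        return 0; /* 0 continues the walk, 1 stops it, -1 reports an error */
}

/* called at some point after rte_eal_init() */
static void
dump_contig_areas(void)
{
        unsigned int n = 0;

        if (rte_memseg_contig_walk(dump_contig_area, &n) < 0)
                printf("walk reported an error\n");
}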

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 27 ++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 65 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4f588c7..4b528b0 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -242,6 +242,43 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	return 0;
 }
 
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, j, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+		size_t total_len;
+		void *end_addr;
+
+		if (ms->addr == NULL)
+			continue;
+
+		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+
+		/* check how many more segments are contiguous to this one */
+		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
+			const struct rte_memseg *next = &mcfg->memseg[j];
+
+			if (next->addr != end_addr)
+				break;
+
+			end_addr = RTE_PTR_ADD(next->addr, next->len);
+			i++;
+		}
+		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+
+		ret = func(ms, total_len, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 93eadaa..45d067f 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -140,6 +140,18 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
 
 /**
+ * Memseg contig walk function prototype. This will trigger a callback on every
+ * VA-contiguous area starting at memseg ``ms``, so total valid VA space at each
+ * callback call will be [``ms->addr``, ``ms->addr + len``).
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
+		size_t len, void *arg);
+
+/**
  * Walk list of all memsegs.
  *
  * @param func
@@ -155,6 +167,21 @@ int __rte_experimental
 rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 
 /**
+ * Walk each VA-contiguous area.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 716b965..93033b5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 35/70] virtio: use memseg contig walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (36 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 34/70] eal: add contig walk function Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 36/70] eal: add iova2virt function Anatoly Burakov
                       ` (34 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/virtio/virtio_user/vhost_kernel.c | 83 +++++++++++----------------
 1 file changed, 35 insertions(+), 48 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 1711ead..93d7efe 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,32 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+struct walk_arg {
+	struct vhost_memory_kernel *vm;
+	uint32_t region_nr;
+};
+static int
+add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct walk_arg *wa = arg;
+	struct vhost_memory_region *mr;
+	void *start_addr;
+
+	if (wa->region_nr >= max_regions)
+		return -1;
+
+	mr = &wa->vm->regions[wa->region_nr++];
+	start_addr = ms->addr;
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return 0;
+}
+
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,63 +103,24 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
-	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
+	struct walk_arg wa;
 
 	vm = malloc(sizeof(struct vhost_memory_kernel) +
-		    max_regions *
-		    sizeof(struct vhost_memory_region));
+			max_regions *
+			sizeof(struct vhost_memory_region));
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
-
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
+	wa.region_nr = 0;
+	wa.vm = vm;
 
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
-			continue;
-
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
-		}
+	if (rte_memseg_contig_walk(add_memory_region, &wa) < 0) {
+		free(vm);
+		return NULL;
 	}
 
-	vm->nregions = k;
+	vm->nregions = wa.region_nr;
 	vm->padding = 0;
 	return vm;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 36/70] eal: add iova2virt function
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (37 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
                       ` (33 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This is a reverse lookup of PA to VA. Using this will make
other code less dependent on internals of the mem API.
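
A minimal sketch of the intended usage follows; the rte_malloc buffer is
illustrative, and the round trip assumes the address lies within
DPDK-owned memory:

#include <rte_malloc.h>
#include <rte_memory.h>

/* illustrative round trip from VA to IOVA and back */
static void *
iova_roundtrip(void)
{
        void *buf = rte_malloc(NULL, 4096, 0);
        rte_iova_t iova;

        if (buf == NULL)
                return NULL;
        iova = rte_mem_virt2iova(buf);

        /* reverse lookup; NULL means the IOVA is not in the DPDK memory map */
        return rte_mem_iova2virt(iova); /* equals buf on success */
}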

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 12 ++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4b528b0..ea3c5a7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,6 +131,36 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+struct virtiova {
+	rte_iova_t iova;
+	void *virt;
+};
+static int
+find_virt(const struct rte_memseg *ms, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova)
+{
+	struct virtiova vi;
+
+	memset(&vi, 0, sizeof(vi));
+
+	vi.iova = iova;
+	rte_memseg_walk(find_virt, &vi);
+
+	return vi.virt;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 45d067f..5c60b91 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -131,6 +131,18 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Get virtual memory address corresponding to iova address.
+ *
+ * @param iova
+ *   The iova address.
+ * @return
+ *   Virtual address corresponding to iova address (or NULL if address does not
+ *   exist within DPDK memory map).
+ */
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 93033b5..dccfc35 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_iova2virt;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 37/70] bus/dpaa: use iova2virt instead of memseg iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (38 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 36/70] eal: add iova2virt function Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 38/70] bus/fslmc: " Anatoly Burakov
                       ` (32 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fixed usage of experimental API's
    
    v3:
    - Added this patch

 drivers/bus/dpaa/rte_dpaa_bus.h  | 12 +-----------
 drivers/mempool/dpaa/Makefile    |  3 +++
 drivers/mempool/dpaa/meson.build |  3 +++
 drivers/net/dpaa/Makefile        |  3 +++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 718701b..89aeac2 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -98,17 +98,7 @@ struct dpaa_portal {
 /* TODO - this is costly, need to write a fast coversion routine */
 static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr != NULL; i++) {
-		if (paddr >= memseg[i].iova && paddr <
-			memseg[i].iova + memseg[i].len)
-			return (uint8_t *)(memseg[i].addr) +
-			       (paddr - memseg[i].iova);
-	}
-
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 /**
diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile
index 4c0d7aa..da8da1e 100644
--- a/drivers/mempool/dpaa/Makefile
+++ b/drivers/mempool/dpaa/Makefile
@@ -22,6 +22,9 @@ EXPORT_MAP := rte_mempool_dpaa_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c
diff --git a/drivers/mempool/dpaa/meson.build b/drivers/mempool/dpaa/meson.build
index 08423c2..9163b3d 100644
--- a/drivers/mempool/dpaa/meson.build
+++ b/drivers/mempool/dpaa/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['bus_dpaa']
 sources = files('dpaa_mempool.c')
+
+# depends on dpaa bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile
index 9c2a5ea..d7a0a50 100644
--- a/drivers/net/dpaa/Makefile
+++ b/drivers/net/dpaa/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa_version.map
 
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # Interfaces with DPDK
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_rxtx.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 38/70] bus/fslmc: use iova2virt instead of memseg iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (39 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 39/70] crypto/dpaa_sec: " Anatoly Burakov
                       ` (31 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, Nipun Gupta, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fixed usage of experimental API's
    
    v3:
    - Added this patch

 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 13 +------------
 drivers/event/dpaa2/Makefile            |  3 +++
 drivers/mempool/dpaa2/Makefile          |  3 +++
 drivers/mempool/dpaa2/meson.build       |  3 +++
 drivers/net/dpaa2/Makefile              |  3 +++
 drivers/net/dpaa2/meson.build           |  3 +++
 6 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 4a19d42..d38fc49 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -260,21 +260,10 @@ static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused));
 /* todo - this is costly, need to write a fast coversion routine */
 static void *dpaa2_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg;
-	int i;
-
 	if (dpaa2_virt_mode)
 		return (void *)(size_t)paddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64
-				+ (paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile
index b26862c..a5b68b4 100644
--- a/drivers/event/dpaa2/Makefile
+++ b/drivers/event/dpaa2/Makefile
@@ -28,6 +28,9 @@ EXPORT_MAP := rte_pmd_dpaa2_event_version.map
 
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # all source are stored in SRCS-y
 #
diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile
index f0edb32..5125ad1 100644
--- a/drivers/mempool/dpaa2/Makefile
+++ b/drivers/mempool/dpaa2/Makefile
@@ -21,6 +21,9 @@ EXPORT_MAP := rte_mempool_dpaa2_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c
diff --git a/drivers/mempool/dpaa2/meson.build b/drivers/mempool/dpaa2/meson.build
index dee3a88..8b8b518 100644
--- a/drivers/mempool/dpaa2/meson.build
+++ b/drivers/mempool/dpaa2/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['mbuf', 'bus_fslmc']
 sources = files('dpaa2_hw_mempool.c')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile
index 1b707ad..9b0b143 100644
--- a/drivers/net/dpaa2/Makefile
+++ b/drivers/net/dpaa2/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa2_version.map
 # library version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += base/dpaa2_hw_dpni.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_ethdev.c
diff --git a/drivers/net/dpaa2/meson.build b/drivers/net/dpaa2/meson.build
index ad1724d..8e96b5a 100644
--- a/drivers/net/dpaa2/meson.build
+++ b/drivers/net/dpaa2/meson.build
@@ -13,3 +13,6 @@ sources = files('base/dpaa2_hw_dpni.c',
 		'mc/dpni.c')
 
 includes += include_directories('base', 'mc')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 39/70] crypto/dpaa_sec: use iova2virt instead of memseg iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (40 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 38/70] bus/fslmc: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 40/70] eal: add virt2memseg function Anatoly Burakov
                       ` (30 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index c5191ce..b04510f 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -120,16 +120,7 @@ dpaa_mem_vtop_ctx(struct dpaa_sec_op_ctx *ctx, void *vaddr)
 static inline void *
 dpaa_mem_ptov(rte_iova_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64 +
-					(paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 40/70] eal: add virt2memseg function
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (41 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 39/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
                       ` (29 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This can be used as a virt2iova function that only looks up
memory that is owned by DPDK (as opposed to doing pagemap walks).
Using this will result in less dependency on internals of the mem API.
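
For illustration, a driver-side virt-to-IOVA helper built on top of this
function could look like the sketch below (the helper name is
hypothetical; it mirrors the conversions done later in this series):

#include <rte_common.h>
#include <rte_memory.h>

/* illustrative helper: translate a DPDK-owned virtual address to an IOVA */
static rte_iova_t
example_virt2iova(const void *vaddr)
{
        const struct rte_memseg *ms = rte_mem_virt2memseg(vaddr);

        if (ms == NULL)
                return RTE_BAD_IOVA; /* address is not owned by DPDK */
        return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
}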

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 11 +++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 49 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index ea3c5a7..fd78d2f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -161,6 +161,43 @@ rte_mem_iova2virt(rte_iova_t iova)
 	return vi.virt;
 }
 
+struct virtms {
+	const void *virt;
+	struct rte_memseg *ms;
+};
+static int
+find_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct virtms *vm = arg;
+
+	if (vm->virt >= ms->addr && vm->virt < RTE_PTR_ADD(ms->addr, ms->len)) {
+		struct rte_memseg *memseg, *found_ms;
+		int idx;
+
+		memseg = rte_eal_get_configuration()->mem_config->memseg;
+		idx = ms - memseg;
+		found_ms = &memseg[idx];
+
+		vm->ms = found_ms;
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *addr)
+{
+	struct virtms vm;
+
+	memset(&vm, 0, sizeof(vm));
+
+	vm.virt = addr;
+
+	rte_memseg_walk(find_memseg, &vm);
+
+	return vm.ms;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 5c60b91..b3d7e61 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -143,6 +143,17 @@ __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova);
 
 /**
+ * Get memseg to which a particular virtual address belongs.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg pointer on success, or NULL on error.
+ */
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *virt);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dccfc35..79433b7 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -224,6 +224,7 @@ EXPERIMENTAL {
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
+	rte_mem_virt2memseg;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 41/70] bus/fslmc: use virt2memseg instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (42 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 40/70] eal: add virt2memseg function Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 42/70] crypto/dpaa_sec: " Anatoly Burakov
                       ` (28 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index d38fc49..45fd41e 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -270,20 +270,14 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 {
 	const struct rte_memseg *memseg;
-	int i;
 
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr >= memseg[i].addr_64 &&
-		    vaddr < memseg[i].addr_64 + memseg[i].len)
-			return memseg[i].iova
-				+ (vaddr - memseg[i].addr_64);
-	}
-	return (size_t)(NULL);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	if (memseg)
+		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
+	return (size_t)NULL;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 42/70] crypto/dpaa_sec: use virt2memseg instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (43 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 43/70] net/mlx4: " Anatoly Burakov
                       ` (27 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b04510f..a14e669 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -93,20 +93,11 @@ dpaa_sec_alloc_ctx(dpaa_sec_session *ses)
 static inline rte_iova_t
 dpaa_mem_vtop(void *vaddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	uint64_t vaddr_64, paddr;
-	int i;
-
-	vaddr_64 = (size_t)vaddr;
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr_64 >= memseg[i].addr_64 &&
-		    vaddr_64 < memseg[i].addr_64 + memseg[i].len) {
-			paddr = memseg[i].iova +
-				(vaddr_64 - memseg[i].addr_64);
-
-			return (rte_iova_t)paddr;
-		}
-	}
+	const struct rte_memseg *ms;
+
+	ms = rte_mem_virt2memseg(vaddr);
+	if (ms)
+		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 43/70] net/mlx4: use virt2memseg instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (44 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 42/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 44/70] net/mlx5: " Anatoly Burakov
                       ` (26 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx4/mlx4_mr.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 9a1e4de..47dd542 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -126,7 +126,7 @@ mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start, uintptr_t *end)
 struct mlx4_mr *
 mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
 	unsigned int i;
@@ -142,16 +142,13 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (45 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 43/70] net/mlx4: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-09 10:26       ` gowrishankar muthukrishnan
  2018-04-08 20:18     ` [PATCH v4 45/70] eal: use memzone walk " Anatoly Burakov
                       ` (25 subsequent siblings)
  72 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/net/mlx5/mlx5_mr.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 2bf1f9c..d8c04dc 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -234,7 +234,7 @@ struct mlx5_mr *
 mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 {
 	struct priv *priv = dev->data->dev_private;
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
 	unsigned int i;
@@ -261,17 +261,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	/* Save original addresses for exact MR lookup. */
 	mr->start = start;
 	mr->end = end;
+
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DRV_LOG(DEBUG,
 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
 		" region",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 45/70] eal: use memzone walk instead of iteration
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (46 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 44/70] net/mlx5: " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 46/70] vfio: allow to map other memory regions Anatoly Burakov
                       ` (24 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Simplify the memzone dump code to use memzone walk, so that the
same memzone iteration code is not maintained twice.
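
The same walk is available to applications; a small sketch (the callback
and the accounting it does are illustrative only):

#include <stdio.h>
#include <rte_memzone.h>

/* illustrative callback: sum up the length of all reserved memzones */
static void
sum_memzone(const struct rte_memzone *mz, void *arg)
{
        size_t *total = arg;

        *total += mz->len;
        printf("memzone <%s>: len=0x%zx\n", mz->name, mz->len);
}

static size_t
total_memzone_len(void)
{
        size_t total = 0;

        rte_memzone_walk(sum_memzone, &total);
        return total;
}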

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/eal_common_memzone.c | 42 +++++++++++++++---------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index af68c00..d60bde7 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -360,31 +360,31 @@ rte_memzone_lookup(const char *name)
 	return memzone;
 }
 
+static void
+dump_memzone(const struct rte_memzone *mz, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	FILE *f = arg;
+	int mz_idx;
+
+	mz_idx = mz - mcfg->memzone;
+
+	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+				"socket_id:%"PRId32", flags:%"PRIx32"\n",
+			mz_idx,
+			mz->name,
+			mz->iova,
+			mz->len,
+			mz->addr,
+			mz->socket_id,
+			mz->flags);
+}
+
 /* Dump all reserved memory zones on console */
 void
 rte_memzone_dump(FILE *f)
 {
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	rte_rwlock_read_lock(&mcfg->mlock);
-	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
-		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
-	}
-	rte_rwlock_read_unlock(&mcfg->mlock);
+	rte_memzone_walk(dump_memzone, f);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 46/70] vfio: allow to map other memory regions
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (47 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 45/70] eal: use memzone walk " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 47/70] eal: add "legacy memory" option Anatoly Burakov
                       ` (23 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario arises in vhost applications (like SPDK)
where the guest sends its own memory table. To fill this gap, provide
an API to allow registering an arbitrary address in the VFIO container.
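
A minimal sketch of the intended usage for an externally allocated region
(the region's address, IOVA and length are placeholders supplied by the
caller, e.g. from a guest memory table):

#include <stdint.h>
#include <rte_vfio.h>

/* illustrative: map an external region for DMA, use it, then unmap it */
static int
map_external_region(void *ext_va, uint64_t ext_iova, uint64_t ext_len)
{
        uint64_t vaddr = (uint64_t)(uintptr_t)ext_va;

        if (rte_vfio_dma_map(vaddr, ext_iova, ext_len) < 0)
                return -1; /* VFIO not initialized, or the mapping failed */

        /* ... the device can now DMA to/from the region ... */

        return rte_vfio_dma_unmap(vaddr, ext_iova, ext_len);
}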

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---

Notes:
    v4:
    - Added PPC64, courtesy of Gowrishankar
    
    v3:
    - Moved to earlier in the patchset
    - Made API experimental
    - Do not print out error message if init isn't finished
    - SPAPR code provided by Gowrishankar

 lib/librte_eal/bsdapp/eal/eal.c          |  16 ++
 lib/librte_eal/common/include/rte_vfio.h |  39 ++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 347 ++++++++++++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  12 ++
 lib/librte_eal/rte_eal_version.map       |   2 +
 5 files changed, 341 insertions(+), 75 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 8e25d78..032a5ea 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -749,6 +749,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -784,3 +786,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index 249095e..bd4663c 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -127,6 +127,45 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int  __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index fb41e82..f6fe93e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -22,17 +22,35 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
 int
@@ -333,9 +351,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +372,8 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
 		}
 	}
 
@@ -668,23 +689,49 @@ static int
 type1_map(const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
+
+	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
+static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
 	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
 	int ret;
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
 
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
-		return -1;
+				return -1;
+		}
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
 	}
+
 	return 0;
 }
 
@@ -694,12 +741,78 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+static int
+vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		struct vfio_iommu_spapr_register_memory reg = {
+			.argsz = sizeof(reg),
+			.flags = 0
+		};
+		reg.vaddr = (uintptr_t) vaddr;
+		reg.size = len;
+
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY, &reg);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot unregister vaddr for IOMMU, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+{
+	int *vfio_container_fd = arg;
+
+	return vfio_spapr_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
 struct spapr_walk_param {
 	uint64_t window_size;
 	uint64_t hugepage_sz;
 };
 static int
-spapr_window_size(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
@@ -713,39 +826,43 @@ spapr_window_size(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-spapr_map(const struct rte_memseg *ms, void *arg)
-{
-	struct vfio_iommu_type1_dma_map dma_map;
-	struct vfio_iommu_spapr_register_memory reg = {
-		.argsz = sizeof(reg),
-		.flags = 0
+vfio_spapr_create_new_dma_window(int vfio_container_fd,
+		struct vfio_iommu_spapr_tce_create *create) {
+	struct vfio_iommu_spapr_tce_remove remove = {
+		.argsz = sizeof(remove),
+	};
+	struct vfio_iommu_spapr_tce_info info = {
+		.argsz = sizeof(info),
 	};
-	int *vfio_container_fd = arg;
 	int ret;
 
-	reg.vaddr = (uintptr_t) ms->addr;
-	reg.size = ms->len;
-	ret = ioctl(*vfio_container_fd,
-		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	/* query spapr iommu info */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+				"error %i (%s)\n", errno, strerror(errno));
 		return -1;
 	}
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-			 VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	/* remove default DMA of 32 bit window */
+	remove.start_addr = info.dma32_window_start;
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
 
+	/* create new DMA window */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, create);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
+
+	if (create->start_addr != 0) {
+		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
 		return -1;
 	}
 
@@ -753,61 +870,82 @@ spapr_map(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
 {
 	struct spapr_walk_param param;
-	int ret;
-	struct vfio_iommu_spapr_tce_info info = {
-		.argsz = sizeof(info),
-	};
 	struct vfio_iommu_spapr_tce_create create = {
 		.argsz = sizeof(create),
 	};
-	struct vfio_iommu_spapr_tce_remove remove = {
-		.argsz = sizeof(remove),
-	};
 
+	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	/* query spapr iommu info */
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
-				"error %i (%s)\n", errno, strerror(errno));
+	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		return -1;
 	}
 
-	/* remove default DMA of 32 bit window */
-	remove.start_addr = info.dma32_window_start;
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	/* sPAPR requires window size to be a power of 2 */
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
+	create.levels = 1;
+
+	if (do_map) {
+		/* re-create window and remap the entire memory */
+		if (iova > create.window_size) {
+			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
+					&create) < 0) {
+				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+				return -1;
+			}
+			if (rte_memseg_walk(vfio_spapr_map_walk,
+					&vfio_container_fd) < 0) {
+				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
+				return -1;
+			}
+		}
+		/* now that we've remapped all of the memory that was present
+		 * before, map the segment that we were requested to map.
+		 */
+		if (vfio_spapr_dma_do_map(vfio_container_fd,
+				vaddr, iova, len, 1) < 0) {
+			RTE_LOG(ERR, EAL, "Could not map segment\n");
+			return -1;
+		}
+	} else {
+
+		/* for unmap, check if iova within DMA window */
+		if (iova > create.window_size) {
+			RTE_LOG(ERR, EAL, "iova beyond DMA window for unmap");
+			return -1;
+		}
+
+		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct vfio_iommu_spapr_tce_create create = {
+		.argsz = sizeof(create),
+	};
+	struct spapr_walk_param param;
+
+	memset(&param, 0, sizeof(param));
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	rte_memseg_walk(spapr_window_size, &param);
+	rte_memseg_walk(vfio_spapr_window_size_walk, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(param.window_size);
 	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
-	}
-
-	if (create.start_addr != 0) {
-		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
-		return -1;
-	}
-
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+	if (rte_memseg_walk(vfio_spapr_map_walk, &vfio_container_fd) < 0)
 		return -1;
 
 	return 0;
@@ -820,6 +958,49 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region mapping not supported by IOMMU %s\n",
+			t->name);
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 1);
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	return vfio_dma_mem_map(vaddr, iova, len, 0);
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -852,4 +1033,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..549f442 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -26,6 +27,7 @@
 #ifndef VFIO_SPAPR_TCE_v2_IOMMU
 #define RTE_VFIO_SPAPR 7
 #define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17)
+#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
 #define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
 
@@ -110,6 +112,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +122,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address, length and
+ * operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ **/
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 79433b7..76209f9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -263,5 +263,7 @@ EXPERIMENTAL {
 	rte_service_start_with_defaults;
 	rte_socket_count;
 	rte_socket_id_by_idx;
+	rte_vfio_dma_map;
+	rte_vfio_dma_unmap;
 
 } DPDK_18.02;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 47/70] eal: add "legacy memory" option
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (48 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 46/70] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 48/70] eal: add rte_fbarray Anatoly Burakov
                       ` (22 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy
memory init sequence will be added later. For FreeBSD, non-legacy
memory init will never be enabled, while for Linux, it is
disabled in this patch to avoid breaking bisect, but will be
enabled once non-legacy mode is fully operational.
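
As a purely illustrative example (the application binary and the core and
channel arguments are placeholders, not part of this patch), the switch is
passed like any other EAL option:

    ./app -l 0-3 -n 4 --legacy-mem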

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Move to earlier in the patchset
    - Make Linuxapp always load in legacy mode

 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  3 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 032a5ea..f44b904 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -534,6 +534,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 8a51ade..fb5ea03 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1184,6 +1185,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index a0082d1..fda087b 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * IOVA-contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..d301d0b 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 77f6cb7..b34e57a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
@@ -767,6 +768,8 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
+	/* for now, always set legacy mem */
+	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 24e6b50..17c559f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -922,8 +922,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1266,8 +1266,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1403,6 +1403,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 48/70] eal: add rte_fbarray
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (49 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 47/70] eal: add "legacy memory" option Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 49/70] eal: replace memseg with memseg lists Anatoly Burakov
                       ` (21 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

rte_fbarray is a simple indexed array stored in shared memory
by mapping a file into memory. The rationale for its existence is
the following: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, the page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that involves reallocating memory (which is a big no-no in
multiprocess). What we can do instead is set the maximum capacity to
something really, really large, and decide at allocation time how
big the array is actually going to be. We map the entire file into
memory, which makes it possible to use fbarray as shared memory,
provided the structure itself is allocated in shared memory. Per-fbarray
locking is also used to avoid index data races (but not data races on
the contents - synchronizing those is up to the user application).

In addition, since we will frequently need to scan this array for
free space, and iterating over the array linearly can become slow,
rte_fbarray provides facilities to index the array's usage. The
following use cases are covered:
 - find next free/used slot (useful either for adding new elements
   to fbarray, or walking the list)
 - find starting index for next N free/used slots (useful for when
   we want to allocate a chunk of VA-contiguous memory composed of
   several pages)
 - find how many contiguous free/used slots there are, starting
   from a specified index (useful for when we want to figure out
   how many pages we have until the next hole in allocated memory,
   to speed up some bulk operations where we would otherwise have
   to walk the array and add pages one by one)

This is accomplished by storing a usage mask in memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
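
To make the intended workflow concrete, here is a minimal usage sketch.
It is purely illustrative - the element type, array name and sizes are
made up and error handling is omitted; only the rte_fbarray calls are
from this patch:

    struct elem { int val; };
    struct rte_fbarray arr; /* place in shared memory for multiprocess use */
    int idx;

    /* back a 1024-element array with a file and map it */
    if (rte_fbarray_init(&arr, "example", 1024, sizeof(struct elem)) < 0)
        return -1;

    /* find a free slot, fill it, then mark it as used */
    idx = rte_fbarray_find_next_free(&arr, 0);
    if (idx >= 0) {
        struct elem *e = rte_fbarray_get(&arr, idx);
        e->val = 42;
        rte_fbarray_set_used(&arr, idx);
    }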

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fixed index alignment bug
    - Fixed compile issues
    
    v3:
    - MAP_POPULATE not supported on FreeBSD, removed it
    - Bugfix for index size when it is unaligned
    - Replace uint64_t with size_t for mapping sizes
    - Make API experimental
    
    The initial version of this had resizing capability, but it was
    removed because, in a multiprocess scenario, each fbarray would
    have its own view of mapped memory, which might not correspond
    with the others due to some other process performing a resize
    that the current process didn't know about.

    It was therefore decided that, to avoid the cost of synchronizing
    on each and every operation (to make sure the array wasn't
    resized), the resizing feature should be dropped.

 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 353 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  16 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..f65875d
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) (((idx) << MASK_SHIFT) + (mod))
+
+/*
+ * This is a mask that is always stored at the end of array, to provide fast
+ * way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		/* the file path is not known here, so only report errno */
+		RTE_LOG(ERR, EAL, "%s(): cannot truncate fbarray file: %s\n",
+				__func__, strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit in within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk;
+	uint64_t msk_bit;
+	int msk_idx;
+	bool already_used;
+	int ret = 0;
+
+	/* validate arguments before dereferencing the array pointer */
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(RTE_ALIGN_CEIL(len, MASK_ALIGN));
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as two values we need (element
+	 * size and array length) are constant for the duration of life of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as two values we need (element
+	 * size and total capacity) are constant for the duration of life of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it, things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. repeated call from a previously
+	 * failed destroy()), but that is on the user; we can't (easily) know
+	 * whether this is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		rte_errno = errno;
+		return -1;
+	}
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		goto out;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) {
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..c45ac0b
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,353 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_compat.h>
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * data pointed to by ``arr`` pointer must itself be allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with data
+ * pointed to by ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill data pointed to by ``arr`` pointer and remove the underlying file
+ * backing the data, so it is expected that by the time this function is called,
+ * all other processes have detached from this ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to element to find index to.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to check as used.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many more free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many more used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 76209f9..0f542b1 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,22 @@ EXPERIMENTAL {
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
+	rte_fbarray_attach;
+	rte_fbarray_destroy;
+	rte_fbarray_detach;
+	rte_fbarray_dump_metadata;
+	rte_fbarray_find_idx;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_get;
+	rte_fbarray_init;
+	rte_fbarray_is_used;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 49/70] eal: replace memseg with memseg lists
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (50 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 48/70] eal: add rte_fbarray Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 50/70] eal: replace memzone array with fbarray Anatoly Burakov
                       ` (20 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Bruce Richardson, Neil Horman, John McNamara,
	Marko Kovacevic, Hemant Agrawal, Shreyansh Jain, Akhil Goyal,
	Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh,
	Maxime Coquelin, Tiwei Bie, Olivier Matz, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	pepperjo, jerin.jacob, gowrishankar.m

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). That is, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is
smaller), split over multiple lists (which are limited to either
RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is smaller). There is also a global
limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly used for
32-bit targets to limit the amount of preallocated memory, but it
can also be used to place an upper limit on the total amount of VA
memory that can be allocated by a DPDK application.

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
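
To make these limits concrete, here is a worked example using the default
values this patch adds to config/common_base (the numbers change if the
configuration is adjusted):

  2M pages: per type, min(32768 pages = 64G, 131072 MB = 128G) -> 64G,
            per list, min(8192 pages = 16G, 32768 MB = 32G) -> 16G,
            i.e. four lists per type
  1G pages: per type, min(32768 pages, 131072 MB = 128G) -> 128G,
            per list, min(8192 pages, 32768 MB = 32G) -> 32G,
            i.e. four lists per type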

Pages in the list are also indexed by address. That is, in order
to figure out where a page belongs, one can simply look at the base
address of a memseg list. Similarly, figuring out the IOVA address
of a memzone is a matter of finding the right memseg list, getting
the offset and dividing it by the page size to get the appropriate
memseg.
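
A simplified sketch of that lookup (not the exact code added by this
patch; it assumes 'addr' is already known to fall within the memseg list
'msl', and uses the base_va, page_sz and memseg_arr fields introduced
here):

    uintptr_t off = (uintptr_t)addr - (uintptr_t)msl->base_va;
    int ms_idx = (int)(off / msl->page_sz);
    struct rte_memseg *ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);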

This commit also removes the rte_eal_get_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets, due to limited VA space, DPDK will no longer
spread memory across different sockets like before. Instead, it will
(by default) allocate all of the memory on the socket where the
master lcore is. To override this behavior, --socket-mem must be
used, as shown in the example below.
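
For instance (hypothetical application and per-socket amounts), reserving
512M on each of two sockets would look like:

    ./app -l 0-3 -n 4 --socket-mem=512,512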

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to the earlier switch to _walk()
functions, most of the changes are simple fixes; however, some
of the _walk() calls were switched to memseg list walks, where
it made sense to do so.

Additionally, we are switching locks from flock() to fcntl().
Down the line, we will be introducing a single-file segments option,
and flock() cannot be used to lock parts of a file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start a legacy mem primary process
alongside an already running non-legacy mem primary process. A sketch
of such a per-region fcntl() lock is shown below.
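
Purely as an illustration of the concept (not code from this patch, and
'fd', 'page_idx' and 'page_sz' are hypothetical names), a per-region
write lock with fcntl() looks roughly like this:

    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = page_idx * page_sz,
        .l_len = page_sz,
    };
    if (fcntl(fd, F_SETLK, &fl) == -1)
        return -1; /* that region is locked by another process */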

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fixed bug in FreeBSD segment allocation
    - Optimized iova2virt lookup for legacy mem case
    
    v3:
    - New and improved legacy mode, without (too much) crazy hacks
    - 32-bit support
    - FreeBSD support
    - Compile fixes for all platforms

 config/common_base                                |  15 +-
 config/defconfig_i686-native-linuxapp-gcc         |   3 +
 config/defconfig_i686-native-linuxapp-icc         |   3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |   3 +
 config/rte_config.h                               |   7 +-
 doc/guides/rel_notes/deprecation.rst              |   9 -
 drivers/bus/fslmc/fslmc_vfio.c                    |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   2 +-
 drivers/bus/pci/linux/pci.c                       |   8 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   2 +-
 drivers/net/mlx4/mlx4_mr.c                        |   4 +-
 drivers/net/mlx5/mlx5.c                           |   3 +-
 drivers/net/mlx5/mlx5_mr.c                        |   4 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   4 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  12 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |  17 +-
 lib/librte_eal/bsdapp/eal/eal_memory.c            | 207 ++++-
 lib/librte_eal/common/eal_common_memory.c         | 602 +++++++++++---
 lib/librte_eal/common/eal_common_memzone.c        |  48 +-
 lib/librte_eal/common/eal_hugepages.h             |   1 -
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  56 +-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |  12 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  62 +-
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  15 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  25 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 913 +++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |   9 +-
 lib/librte_eal/rte_eal_version.map                |   3 +-
 lib/librte_mempool/rte_mempool.c                  |   9 +-
 test/test/test_malloc.c                           |  30 +-
 test/test/test_memory.c                           |  10 +-
 test/test/test_memzone.c                          |  12 +-
 37 files changed, 1583 insertions(+), 590 deletions(-)

diff --git a/config/common_base b/config/common_base
index c09c7cf..f557e6b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=64
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_MB_PER_LIST=32768
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or
+# RTE_MAX_MEM_MB_PER_TYPE megabytes of memory (split over multiple lists of
+# RTE_MAX_MEM_MB_PER_LIST), whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072
+# global maximum usable amount of VA, in megabytes
+CONFIG_RTE_MAX_MEM_MB=524288
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/config/defconfig_i686-native-linuxapp-gcc b/config/defconfig_i686-native-linuxapp-gcc
index a42ba4f..1178fe3 100644
--- a/config/defconfig_i686-native-linuxapp-gcc
+++ b/config/defconfig_i686-native-linuxapp-gcc
@@ -46,3 +46,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_i686-native-linuxapp-icc b/config/defconfig_i686-native-linuxapp-icc
index 144ba0a..f096e22 100644
--- a/config/defconfig_i686-native-linuxapp-icc
+++ b/config/defconfig_i686-native-linuxapp-icc
@@ -51,3 +51,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_x86_x32-native-linuxapp-gcc b/config/defconfig_x86_x32-native-linuxapp-gcc
index b6206a5..57d000d 100644
--- a/config/defconfig_x86_x32-native-linuxapp-gcc
+++ b/config/defconfig_x86_x32-native-linuxapp-gcc
@@ -26,3 +26,6 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/rte_config.h b/config/rte_config.h
index db6ceb6..f293d9e 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -21,7 +21,12 @@
 /****** library defines ********/
 
 /* EAL defines */
-#define RTE_MAX_MEMSEG 512
+#define RTE_MAX_MEMSEG_LISTS 128
+#define RTE_MAX_MEMSEG_PER_LIST 8192
+#define RTE_MAX_MEM_MB_PER_LIST 32768
+#define RTE_MAX_MEMSEG_PER_TYPE 32768
+#define RTE_MAX_MEM_MB_PER_TYPE 65536
+#define RTE_MAX_MEM_MB 524288
 #define RTE_MAX_MEMZONE 2560
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ec70b5f..c9f2703 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -38,15 +38,6 @@ Deprecation Notices
   success and failure, respectively.  This will change to 1 and 0 for true and
   false, respectively, to make use of the function more intuitive.
 
-* eal: due to internal data layout reorganization, there will be changes to
-  several structures and functions as a result of coming changes to support
-  memory hotplug in v18.05.
-  ``rte_eal_get_physmem_layout`` will be deprecated and removed in subsequent
-  releases.
-  ``rte_mem_config`` contents will change due to switch to memseg lists.
-  ``rte_memzone`` member ``memseg_id`` will no longer serve any useful purpose
-  and will be removed.
-
 * eal: a new set of mbuf mempool ops name APIs for user, platform and best
   mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
   ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 0c048dc..8b15312 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -190,7 +190,8 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 }
 
 static int
-fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
+fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
@@ -232,18 +233,11 @@ fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 
 int rte_fslmc_vfio_dmamap(void)
 {
-	const struct rte_memseg *memseg;
 	int i = 0;
 
 	if (is_dma_done)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		DPAA2_BUS_ERR("Cannot get physical layout");
-		return -ENODEV;
-	}
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 45fd41e..72aae43 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -274,7 +274,7 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr, NULL);
 	if (memseg)
 		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
 	return (size_t)NULL;
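
dpaa2_mem_vtop() above shows the pattern for the reworked
rte_mem_virt2memseg(): the second argument is the memseg list to search, and
NULL asks the EAL to locate the list first. A generic sketch of the same
translation (virt_to_iova is a made-up helper name):

#include <rte_common.h>
#include <rte_memory.h>

/* illustrative: translate a virtual address into an IOVA via its memseg */
static rte_iova_t
virt_to_iova(const void *addr)
{
	const struct rte_memseg *ms;

	/* NULL memseg list: let the EAL find the right list for us */
	ms = rte_mem_virt2memseg(addr, NULL);
	if (ms == NULL)
		return RTE_BAD_IOVA;

	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
}
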
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 6dda054..4630a80 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -117,9 +117,10 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 }
 
 static int
-find_max_end_va(const struct rte_memseg *ms, void *arg)
+find_max_end_va(const struct rte_memseg_list *msl, void *arg)
 {
-	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	size_t sz = msl->memseg_arr.len * msl->page_sz;
+	void *end_va = RTE_PTR_ADD(msl->base_va, sz);
 	void **max_va = arg;
 
 	if (*max_va < end_va)
@@ -132,10 +133,11 @@ pci_find_max_end_va(void)
 {
 	void *va = NULL;
 
-	rte_memseg_walk(find_max_end_va, &va);
+	rte_memseg_list_walk(find_max_end_va, &va);
 	return va;
 }
 
+
 /* parse one line of the "resource" sysfs file (note that the 'line'
  * string is modified)
  */
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index a14e669..b685220 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -95,7 +95,7 @@ dpaa_mem_vtop(void *vaddr)
 {
 	const struct rte_memseg *ms;
 
-	ms = rte_mem_virt2memseg(vaddr);
+	ms = rte_mem_virt2memseg(vaddr, NULL);
 	if (ms)
 		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 47dd542..2ba609e 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -142,10 +142,10 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1724b65..e228356 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -478,7 +478,8 @@ static struct rte_pci_driver mlx5_driver;
 static void *uar_base;
 
 static int
-find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	void **addr = arg;
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index d8c04dc..6638185 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -263,10 +263,10 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	mr->end = end;
 
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 93d7efe..b244409 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -75,7 +75,8 @@ struct walk_arg {
 	uint32_t region_nr;
 };
 static int
-add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+add_memory_region(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
 {
 	struct walk_arg *wa = arg;
 	struct vhost_memory_region *mr;
@@ -95,7 +96,6 @@ add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
 	return 0;
 }
 
-
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
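
Contiguous-walk callbacks such as add_memory_region() above keep the extra
'len' parameter, which now covers the whole VA-contiguous run of pages
starting at 'ms'. A minimal sketch with the new signature (print_contig is a
made-up name):

#include <stdio.h>
#include <rte_common.h>
#include <rte_memory.h>

/* illustrative: report each VA-contiguous area; 'len' spans the whole run,
 * not just the first page */
static int
print_contig(const struct rte_memseg_list *msl __rte_unused,
		const struct rte_memseg *ms, size_t len, void *arg __rte_unused)
{
	printf("contiguous area: %p, %zu bytes\n", ms->addr, len);
	return 0;
}

/* usage: rte_memseg_contig_walk(print_contig, NULL); */
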
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f44b904..d009cf0 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -64,8 +64,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -430,11 +430,11 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_socket(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
 	int *socket_id = arg;
 
-	if (ms->socket_id == *socket_id)
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
 		return 1;
 
 	return 0;
@@ -447,10 +447,11 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
+
 static int
 sync_func(__attribute__((unused)) void *arg)
 {
@@ -561,7 +562,6 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
 	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
 			eal_hugepage_info_init() < 0) {
 		rte_eal_init_alert("Cannot get hugepage information.");
 		rte_errno = EACCES;
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..ba44da0 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -47,12 +47,18 @@ eal_hugepage_info_init(void)
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
 
+	internal_config.num_hugepage_sizes = 1;
+
+	/* nothing more to be done for secondary */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers\n");
 		return -1;
 	}
 
@@ -61,7 +67,7 @@ eal_hugepage_info_init(void)
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size\n");
 		return -1;
 	}
 
@@ -81,22 +87,21 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	internal_config.num_hugepage_sizes = 1;
 	hpi->hugedir = CONTIGMEM_DEV;
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
 
 	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
-					sizeof(struct hugepage_info));
+			sizeof(internal_config.hugepage_info));
 	if (tmp_hpi == NULL ) {
 		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
 		return -1;
 	}
 
-	memcpy(tmp_hpi, hpi, sizeof(struct hugepage_info));
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
-	if ( munmap(tmp_hpi, sizeof(struct hugepage_info)) < 0) {
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
 	}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index bdfb882..cf1aba6 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -6,6 +6,8 @@
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <inttypes.h>
+#include <errno.h>
+#include <string.h>
 #include <fcntl.h>
 
 #include <rte_eal.h>
@@ -41,37 +43,135 @@ rte_eal_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	uint64_t total_mem = 0;
 	void *addr;
-	unsigned i, j, seg_idx = 0;
+	unsigned int i, j, seg_idx = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
 	/* for debug purposes, hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
-		addr = malloc(internal_config.memory);
-		mcfg->memseg[0].iova = (rte_iova_t)(uintptr_t)addr;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		struct rte_memseg *ms;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
+		addr = mmap(NULL, internal_config.memory,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is 1 page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->len = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, page_sz);
+		}
 		return 0;
 	}
 
 	/* map all hugepages and sort them */
 	for (i = 0; i < internal_config.num_hugepage_sizes; i ++){
 		struct hugepage_info *hpi;
+		uint64_t page_sz, mem_needed;
+		unsigned int n_pages, max_pages;
 
 		hpi = &internal_config.hugepage_info[i];
-		for (j = 0; j < hpi->num_pages[0]; j++) {
+		page_sz = hpi->hugepage_sz;
+		max_pages = hpi->num_pages[0];
+		mem_needed = RTE_ALIGN_CEIL(internal_config.memory - total_mem,
+				page_sz);
+
+		n_pages = RTE_MIN(mem_needed / page_sz, max_pages);
+
+		for (j = 0; j < n_pages; j++) {
+			struct rte_memseg_list *msl;
+			struct rte_fbarray *arr;
 			struct rte_memseg *seg;
+			int msl_idx, ms_idx;
 			rte_iova_t physaddr;
 			int error;
 			size_t sysctl_size = sizeof(physaddr);
 			char physaddr_str[64];
 
-			addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-				    MAP_SHARED, hpi->lock_descriptor,
-				    j * EAL_PAGE_SIZE);
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				bool empty;
+				msl = &mcfg->memsegs[msl_idx];
+				arr = &msl->memseg_arr;
+
+				if (msl->page_sz != page_sz)
+					continue;
+
+				empty = arr->count == 0;
+
+				/* we need 1, plus hole if not empty */
+				ms_idx = rte_fbarray_find_next_n_free(arr,
+						0, 1 + (empty ? 1 : 0));
+
+				/* memseg list is full? */
+				if (ms_idx < 0)
+					continue;
+
+				/* leave some space between memsegs, they are
+				 * not IOVA contiguous, so they shouldn't be VA
+				 * contiguous either.
+				 */
+				if (!empty)
+					ms_idx++;
+
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+					RTE_STR(CONFIG_RTE_MAX_MEM_MB_PER_TYPE));
+				return -1;
+			}
+			arr = &msl->memseg_arr;
+			seg = rte_fbarray_get(arr, ms_idx);
+
+			addr = RTE_PTR_ADD(msl->base_va,
+					(size_t)msl->page_sz * ms_idx);
+
+			/* address is already mapped in memseg list, so using
+			 * MAP_FIXED here is safe.
+			 */
+			addr = mmap(addr, page_sz, PROT_READ|PROT_WRITE,
+					MAP_SHARED | MAP_FIXED,
+					hpi->lock_descriptor,
+					j * EAL_PAGE_SIZE);
 			if (addr == MAP_FAILED) {
 				RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 						j, hpi->hugedir);
@@ -88,33 +188,60 @@ rte_eal_hugepage_init(void)
 				return -1;
 			}
 
-			seg = &mcfg->memseg[seg_idx++];
 			seg->addr = addr;
 			seg->iova = physaddr;
-			seg->hugepage_sz = hpi->hugepage_sz;
-			seg->len = hpi->hugepage_sz;
+			seg->hugepage_sz = page_sz;
+			seg->len = page_sz;
 			seg->nchannel = mcfg->nchannel;
 			seg->nrank = mcfg->nrank;
 			seg->socket_id = 0;
 
+			rte_fbarray_set_used(arr, ms_idx);
+
 			RTE_LOG(INFO, EAL, "Mapped memory segment %u @ %p: physaddr:0x%"
 					PRIx64", len %zu\n",
-					seg_idx, addr, physaddr, hpi->hugepage_sz);
-			if (total_mem >= internal_config.memory ||
-					seg_idx >= RTE_MAX_MEMSEG)
-				break;
+					seg_idx, addr, physaddr, page_sz);
+
+			total_mem += seg->len;
 		}
+		if (total_mem >= internal_config.memory)
+			break;
+	}
+	if (total_mem < internal_config.memory) {
+		RTE_LOG(ERR, EAL, "Couldn't reserve requested memory, requested: %" PRIu64 "M available: %" PRIu64 "M\n",
+				internal_config.memory >> 20, total_mem >> 20);
+		return -1;
 	}
 	return 0;
 }
 
+struct attach_walk_args {
+	int fd_hugepage;
+	int seg_idx;
+};
+static int
+attach_segment(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
+{
+	struct attach_walk_args *wa = arg;
+	void *addr;
+
+	addr = mmap(ms->addr, ms->len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, wa->fd_hugepage,
+			wa->seg_idx * EAL_PAGE_SIZE);
+	if (addr == MAP_FAILED || addr != ms->addr)
+		return -1;
+	wa->seg_idx++;
+
+	return 0;
+}
+
 int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
 	int fd_hugepage_info, fd_hugepage = -1;
-	unsigned i = 0;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int i;
 
 	/* Obtain a file descriptor for hugepage_info */
 	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
@@ -124,41 +251,43 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(struct hugepage_info), PROT_READ, MAP_PRIVATE,
-			fd_hugepage_info, 0);
+	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
+			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
 	if (hpi == MAP_FAILED) {
 		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
 		goto error;
 	}
 
-	/* Obtain a file descriptor for contiguous memory */
-	fd_hugepage = open(hpi->hugedir, O_RDWR);
-	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", hpi->hugedir);
-		goto error;
-	}
+	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
+		const struct hugepage_info *cur_hpi = &hpi[i];
+		struct attach_walk_args wa;
 
-	/* Map the contiguous memory into each memory segment */
-	for (i = 0; i < hpi->num_pages[0]; i++) {
+		memset(&wa, 0, sizeof(wa));
 
-		void *addr;
-		struct rte_memseg *seg = &mcfg->memseg[i];
+		/* Obtain a file descriptor for contiguous memory */
+		fd_hugepage = open(cur_hpi->hugedir, O_RDWR);
+		if (fd_hugepage < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s\n",
+					cur_hpi->hugedir);
+			goto error;
+		}
+		wa.fd_hugepage = fd_hugepage;
+		wa.seg_idx = 0;
 
-		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			    MAP_SHARED|MAP_FIXED, fd_hugepage,
-			    i * EAL_PAGE_SIZE);
-		if (addr == MAP_FAILED || addr != seg->addr) {
+		/* Map the contiguous memory into each memory segment */
+		if (rte_memseg_walk(attach_segment, &wa) < 0) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
-				i, hpi->hugedir);
+				wa.seg_idx, cur_hpi->hugedir);
 			goto error;
 		}
 
+		close(fd_hugepage);
+		fd_hugepage = -1;
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(struct hugepage_info));
+	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
 	close(fd_hugepage_info);
-	close(fd_hugepage);
 	return 0;
 
 error:
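
The no-hugetlbfs branch above illustrates the general recipe for filling an
fbarray-backed memseg list: look up a slot, fill in the rte_memseg (one page
per segment), and mark the slot used. A condensed sketch of that recipe,
assuming a list whose fbarray and VA space have already been set up
(fill_list is a made-up helper; IOVAs are left unresolved here):

#include <rte_common.h>
#include <rte_eal_memconfig.h>
#include <rte_fbarray.h>
#include <rte_memory.h>

/* illustrative: populate n_pages one-page memsegs backed by 'base' */
static int
fill_list(struct rte_memseg_list *msl, void *base, int n_pages)
{
	int i;

	for (i = 0; i < n_pages; i++) {
		struct rte_memseg *ms = rte_fbarray_get(&msl->memseg_arr, i);

		if (ms == NULL)
			return -1;

		ms->addr = RTE_PTR_ADD(base, (size_t)msl->page_sz * i);
		ms->iova = RTE_BAD_IOVA; /* unknown in this sketch */
		ms->len = msl->page_sz;
		ms->hugepage_sz = msl->page_sz;
		ms->socket_id = msl->socket_id;

		rte_fbarray_set_used(&msl->memseg_arr, i);
	}
	return 0;
}
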
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fd78d2f..e7be91d 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,393 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz, uint64_t max_mem)
+{
+	uint64_t area_sz, max_pages;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_MB_PER_LIST */
+	max_pages = RTE_MAX_MEMSEG_PER_LIST;
+	max_mem = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_LIST << 20, max_mem);
+
+	area_sz = RTE_MIN(page_sz * max_pages, max_mem);
+
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return RTE_ALIGN(area_sz, page_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		uint64_t max_mem, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	uint64_t mem_amount;
+	int max_segs;
+
+	mem_amount = get_mem_amount(page_sz, max_mem);
+	max_segs = mem_amount / page_sz;
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init_32(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int active_sockets, hpi_idx, msl_idx = 0;
+	unsigned int socket_id, i;
+	struct rte_memseg_list *msl;
+	uint64_t extra_mem_per_socket, total_extra_mem, total_requested_mem;
+	uint64_t max_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	/* this is a giant hack, but desperate times call for desperate
+	 * measures. in legacy 32-bit mode, we cannot preallocate VA space,
+	 * because having upwards of 2 gigabytes of VA space already mapped will
+	 * interfere with our ability to map and sort hugepages.
+	 *
+	 * therefore, in legacy 32-bit mode, we will be initializing memseg
+	 * lists much later - in eal_memory.c, right after we unmap all the
+	 * unneeded pages. this will not affect secondary processes, as those
+	 * should be able to mmap the space without (too many) problems.
+	 */
+	if (internal_config.legacy_mem)
+		return 0;
+
+	/* 32-bit mode is a very special case. we cannot know in advance where
+	 * the user will want to allocate their memory, so we have to do some
+	 * heuristics.
+	 */
+	active_sockets = 0;
+	total_requested_mem = 0;
+	if (internal_config.force_sockets)
+		for (i = 0; i < rte_socket_count(); i++) {
+			uint64_t mem;
+
+			socket_id = rte_socket_id_by_idx(i);
+			mem = internal_config.socket_mem[socket_id];
+
+			if (mem == 0)
+				continue;
+
+			active_sockets++;
+			total_requested_mem += mem;
+		}
+	else
+		total_requested_mem = internal_config.memory;
+
+	max_mem = (uint64_t) RTE_MAX_MEM_MB_PER_TYPE << 20;
+	if (total_requested_mem > max_mem) {
+		RTE_LOG(ERR, EAL, "Invalid parameters: 32-bit process can at most use %uM of memory\n",
+				(unsigned int)(max_mem >> 20));
+		return -1;
+	}
+	total_extra_mem = max_mem - total_requested_mem;
+	extra_mem_per_socket = active_sockets == 0 ? total_extra_mem :
+			total_extra_mem / active_sockets;
+
+	/* the allocation logic is a little bit convoluted, but here's how it
+	 * works, in a nutshell:
+	 *  - if user hasn't specified on which sockets to allocate memory via
+	 *    --socket-mem, we allocate all of our memory on master core socket.
+	 *  - if user has specified sockets to allocate memory on, there may be
+	 *    some "unused" memory left (e.g. if user has specified --socket-mem
+	 *    such that not all memory adds up to 2 gigabytes), so add it to all
+	 *    sockets that are in use equally.
+	 *
+	 * page sizes are sorted by size in descending order, so we can safely
+	 * assume that we dispense with bigger page sizes first.
+	 */
+
+	/* create memseg lists */
+	for (i = 0; i < rte_socket_count(); i++) {
+		int hp_sizes = (int) internal_config.num_hugepage_sizes;
+		uint64_t max_socket_mem, cur_socket_mem;
+		unsigned int master_lcore_socket;
+		struct rte_config *cfg = rte_eal_get_configuration();
+		bool skip;
+
+		socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+		if (socket_id > 0)
+			break;
+#endif
+
+		/* if we didn't specifically request memory on this socket */
+		skip = active_sockets != 0 &&
+				internal_config.socket_mem[socket_id] == 0;
+		/* ...or if we didn't specifically request memory on *any*
+		 * socket, and this is not master lcore
+		 */
+		master_lcore_socket = rte_lcore_to_socket_id(cfg->master_lcore);
+		skip |= active_sockets == 0 && socket_id != master_lcore_socket;
+
+		if (skip) {
+			RTE_LOG(DEBUG, EAL, "Will not preallocate memory on socket %u\n",
+					socket_id);
+			continue;
+		}
+
+		/* max amount of memory on this socket */
+		max_socket_mem = (active_sockets != 0 ?
+					internal_config.socket_mem[socket_id] :
+					internal_config.memory) +
+					extra_mem_per_socket;
+		cur_socket_mem = 0;
+
+		for (hpi_idx = 0; hpi_idx < hp_sizes; hpi_idx++) {
+			uint64_t max_pagesz_mem, cur_pagesz_mem = 0;
+			uint64_t hugepage_sz;
+			struct hugepage_info *hpi;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			hpi = &internal_config.hugepage_info[hpi_idx];
+			hugepage_sz = hpi->hugepage_sz;
+
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+			max_pagesz_mem = max_socket_mem - cur_socket_mem;
+
+			/* make it multiple of page size */
+			max_pagesz_mem = RTE_ALIGN_FLOOR(max_pagesz_mem,
+					hugepage_sz);
+
+			RTE_LOG(DEBUG, EAL, "Attempting to preallocate %" PRIu64 "M on socket %i\n",
+					max_pagesz_mem >> 20, socket_id);
+
+			type_msl_idx = 0;
+			while (cur_pagesz_mem < max_pagesz_mem &&
+					total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						max_pagesz_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				cur_pagesz_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			cur_socket_mem += cur_pagesz_mem;
+		}
+	}
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init(void)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+	uint64_t max_mem, total_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	max_mem = (uint64_t)RTE_MAX_MEM_MB << 20;
+	total_mem = 0;
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (i = 0; i < (int) rte_socket_count(); i++) {
+			uint64_t max_type_mem, total_type_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+			if (socket_id > 0)
+				break;
+#endif
+
+			max_type_mem = RTE_MIN(max_mem - total_mem,
+				(uint64_t)RTE_MAX_MEM_MB_PER_TYPE << 20);
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_type_mem < max_type_mem &&
+					total_segs < max_segs) {
+				uint64_t cur_max_mem;
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				cur_max_mem = max_type_mem - total_type_mem;
+				if (alloc_memseg_list(msl, hugepage_sz,
+						cur_max_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_type_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			total_mem += total_type_mem;
+		}
+	}
+	return 0;
+}
+
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	const struct rte_fbarray *arr;
+	void *start, *end;
+	int ms_idx;
+
+	/* a memseg list was specified, check if it's the right one */
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, (size_t)msl->page_sz * msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start,
+				(size_t)msl->page_sz * msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	return virt2memseg_list(addr);
 }
 
 struct virtiova {
@@ -136,7 +517,8 @@ struct virtiova {
 	void *virt;
 };
 static int
-find_virt(const struct rte_memseg *ms, void *arg)
+find_virt(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct virtiova *vi = arg;
 	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
@@ -147,6 +529,19 @@ find_virt(const struct rte_memseg *ms, void *arg)
 	}
 	return 0;
 }
+static int
+find_virt_legacy(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
 
 __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova)
@@ -156,54 +551,30 @@ rte_mem_iova2virt(rte_iova_t iova)
 	memset(&vi, 0, sizeof(vi));
 
 	vi.iova = iova;
-	rte_memseg_walk(find_virt, &vi);
+	/* for legacy mem, we can get away with scanning VA-contiguous segments,
+	 * as we know they are PA-contiguous as well
+	 */
+	if (internal_config.legacy_mem)
+		rte_memseg_contig_walk(find_virt_legacy, &vi);
+	else
+		rte_memseg_walk(find_virt, &vi);
 
 	return vi.virt;
 }
 
-struct virtms {
-	const void *virt;
-	struct rte_memseg *ms;
-};
-static int
-find_memseg(const struct rte_memseg *ms, void *arg)
-{
-	struct virtms *vm = arg;
-
-	if (arg >= ms->addr && arg < RTE_PTR_ADD(ms->addr, ms->len)) {
-		struct rte_memseg *memseg, *found_ms;
-		int idx;
-
-		memseg = rte_eal_get_configuration()->mem_config->memseg;
-		idx = ms - memseg;
-		found_ms = &memseg[idx];
-
-		vm->ms = found_ms;
-		return 1;
-	}
-	return 0;
-}
-
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *addr)
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
-	struct virtms vm;
-
-	memset(&vm, 0, sizeof(vm));
-
-	vm.virt = addr;
-
-	rte_memseg_walk(find_memseg, &vm);
-
-	return vm.ms;
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 static int
-physmem_size(const struct rte_memseg *ms, void *arg)
+physmem_size(const struct rte_memseg_list *msl, void *arg)
 {
 	uint64_t *total_len = arg;
 
-	*total_len += ms->len;
+	*total_len += msl->memseg_arr.count * msl->page_sz;
 
 	return 0;
 }
@@ -214,32 +585,39 @@ rte_eal_get_physmem_size(void)
 {
 	uint64_t total_len = 0;
 
-	rte_memseg_walk(physmem_size, &total_len);
+	rte_memseg_list_walk(physmem_size, &total_len);
 
 	return total_len;
 }
 
 static int
-dump_memseg(const struct rte_memseg *ms, void *arg)
+dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i = ms - mcfg->memseg;
+	int msl_idx, ms_idx;
 	FILE *f = arg;
 
-	if (i < 0 || i >= RTE_MAX_MEMSEG)
+	msl_idx = msl - mcfg->memsegs;
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
 		return -1;
 
-	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+	ms_idx = rte_fbarray_find_idx(&msl->memseg_arr, ms);
+	if (ms_idx < 0)
+		return -1;
+
+	fprintf(f, "Segment %i-%i: IOVA:0x%"PRIx64", len:%zu, "
 			"virt:%p, socket_id:%"PRId32", "
 			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-			"nrank:%"PRIx32"\n", i,
-			mcfg->memseg[i].iova,
-			mcfg->memseg[i].len,
-			mcfg->memseg[i].addr,
-			mcfg->memseg[i].socket_id,
-			mcfg->memseg[i].hugepage_sz,
-			mcfg->memseg[i].nchannel,
-			mcfg->memseg[i].nrank);
+			"nrank:%"PRIx32"\n",
+			msl_idx, ms_idx,
+			ms->iova,
+			ms->len,
+			ms->addr,
+			ms->socket_id,
+			ms->hugepage_sz,
+			ms->nchannel,
+			ms->nrank);
 
 	return 0;
 }
@@ -289,55 +667,89 @@ rte_mem_lock_page(const void *virt)
 }
 
 int __rte_experimental
-rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		ret = func(ms, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			int n_segs;
+			size_t len;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			/* find how many more segments there are, starting with
+			 * this one.
+			 */
+			n_segs = rte_fbarray_find_contig_used(arr, ms_idx);
+			len = n_segs * msl->page_sz;
+
+			ret = func(msl, ms, len, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx + n_segs);
+		}
 	}
 	return 0;
 }
 
 int __rte_experimental
-rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, j, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-		size_t total_len;
-		void *end_addr;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
 
-		/* check how many more segments are contiguous to this one */
-		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
-			const struct rte_memseg *next = &mcfg->memseg[j];
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
 
-			if (next->addr != end_addr)
-				break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
-			end_addr = RTE_PTR_ADD(next->addr, next->len);
-			i++;
-		}
-		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+		if (msl->base_va == NULL)
+			continue;
 
-		ret = func(ms, total_len, arg);
+		ret = func(msl, arg);
 		if (ret < 0)
 			return -1;
 		if (ret > 0)
@@ -350,9 +762,25 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 int
 rte_eal_memory_init(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	if (!mcfg)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+#ifndef RTE_ARCH_64
+			memseg_primary_init_32() :
+#else
+			memseg_primary_init() :
+#endif
+			memseg_secondary_init();
+
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
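
Since each list is a VA-contiguous window of equally sized pages, the
virt2memseg() lookup above boils down to a bounds check plus a division. The
same arithmetic as a standalone sketch (addr_to_idx is a made-up helper):

#include <rte_common.h>
#include <rte_eal_memconfig.h>

/* illustrative: index of the memseg covering 'addr' in its list, or -1 */
static int
addr_to_idx(const struct rte_memseg_list *msl, const void *addr)
{
	void *start = msl->base_va;
	void *end = RTE_PTR_ADD(start,
			(size_t)msl->page_sz * msl->memseg_arr.len);

	if (addr < start || addr >= end)
		return -1;

	return RTE_PTR_DIFF(addr, start) / msl->page_sz;
}
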
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index d60bde7..1f5f753 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -239,10 +239,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->page_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -364,20 +363,50 @@ static void
 dump_memzone(const struct rte_memzone *mz, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *cur_addr, *mz_end;
+	struct rte_memseg *ms;
+	int mz_idx, ms_idx;
+	size_t page_sz;
 	FILE *f = arg;
-	int mz_idx;
 
 	mz_idx = mz - mcfg->memzone;
 
-	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
 			mz_idx,
 			mz->name,
-			mz->iova,
 			mz->len,
 			mz->addr,
 			mz->socket_id,
 			mz->flags);
+
+	/* go through each page occupied by this memzone */
+	msl = rte_mem_virt2memseg_list(mz->addr);
+	if (!msl) {
+		RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+		return;
+	}
+	page_sz = (size_t)mz->hugepage_sz;
+	cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, page_sz);
+	mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+	fprintf(f, "physical segments used:\n");
+	ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) / page_sz;
+	ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+	do {
+		fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+				"len: 0x%zx "
+				"pagesz: 0x%zx\n",
+			cur_addr, ms->iova, ms->len, page_sz);
+
+		/* advance VA to next page */
+		cur_addr = RTE_PTR_ADD(cur_addr, page_sz);
+
+		/* memzones occupy contiguous segments */
+		++ms;
+	} while (cur_addr < mz_end);
 }
 
 /* Dump all reserved memory zones on console */
@@ -394,7 +423,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -403,12 +431,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
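
With the change above, rte_memzone_dump() also prints the physical segments
backing each zone, which is useful now that a zone may span many single-page
memsegs. A small usage sketch (the zone name and size are arbitrary
examples):

#include <stdio.h>
#include <rte_memory.h>
#include <rte_memzone.h>

/* illustrative: reserve a zone, then dump all zones including the per-page
 * physical segment listing added above */
static void
memzone_dump_example(void)
{
	const struct rte_memzone *mz;

	mz = rte_memzone_reserve("example_mz", 1 << 20, SOCKET_ID_ANY, 0);
	if (mz == NULL)
		return;

	rte_memzone_dump(stdout);
}
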
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..ad1b0b6 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -22,7 +22,6 @@ struct hugepage_file {
 	size_t size;        /**< the page size */
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
-	int memseg_id;      /**< the memory segment to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index fda087b..5cf7102 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..b745e18 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * memseg list is a special case as we need to store a bunch of other data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
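
Given this layout, a list's reserved VA span and its currently populated
memory are both simple products of the page size with the fbarray's length
and count, which is what the reworked rte_eal_get_physmem_size() accounting
above relies on. A hedged sketch (list_usage is a made-up helper):

#include <stdint.h>
#include <rte_eal_memconfig.h>

/* illustrative: VA reserved for a list vs. memory actually allocated in it */
static void
list_usage(const struct rte_memseg_list *msl,
		uint64_t *reserved, uint64_t *populated)
{
	*reserved = msl->page_sz * msl->memseg_arr.len;
	*populated = msl->page_sz * msl->memseg_arr.count;
}
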
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b3d7e61..55383c4 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -23,6 +23,9 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -151,7 +154,18 @@ rte_mem_iova2virt(rte_iova_t iova);
  *   Memseg pointer on success, or NULL on error.
  */
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *virt);
+rte_mem_virt2memseg(const void *virt, const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg list to which this virtual address belongs.
+ */
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Memseg walk function prototype.
@@ -160,7 +174,8 @@ rte_mem_virt2memseg(const void *virt);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, void *arg);
 
 /**
  * Memseg contig walk function prototype. This will trigger a callback on every
@@ -171,8 +186,19 @@ typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
-		size_t len, void *arg);
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg);
+
+/**
+ * Memseg list walk function prototype. This will trigger a callback on every
+ * allocated memseg list.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
+		void *arg);
 
 /**
  * Walk list of all memsegs.
@@ -205,21 +231,19 @@ int __rte_experimental
 rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 
 /**
- * Get the layout of the available physical memory.
- *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * Walk each allocated memseg list.
  *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
  */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 
 /**
  * Dump the physical memory layout to a file.
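
A sketch of the third walk flavour declared above, which iterates lists
rather than individual segments; it mirrors the check_socket() callbacks
elsewhere in this patch (socket_has_memory is a made-up name):

#include <rte_eal_memconfig.h>
#include <rte_memory.h>

/* illustrative: stop as soon as a non-empty list on the given NUMA socket
 * is found */
static int
socket_has_memory(const struct rte_memseg_list *msl, void *arg)
{
	int *socket_id = arg;

	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
		return 1; /* stop the walk */
	return 0;
}

/* usage:
 *	int s = 0;
 *	if (rte_memseg_list_walk(socket_has_memory, &s) == 0)
 *		... no memory on socket 0 ...
 */
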
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index e2630fd..0eeb94f 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -68,7 +68,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 87695b9..685aac4 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -27,11 +27,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -100,7 +100,7 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
 		void *start, size_t size)
 {
 	rte_iova_t cur, expected;
@@ -191,7 +191,7 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
 			 * couldn't fit all data into one physically contiguous
 			 * block, try again with lower addresses.
 			 */
-			if (!elem_check_phys_contig(elem->ms,
+			if (!elem_check_phys_contig(elem->msl,
 					(void *)new_data_start,
 					new_data_size)) {
 				elem_size -= align;
@@ -225,7 +225,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 34bd268..620dd44 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -7,7 +7,7 @@
 
 #include <stdbool.h>
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -26,7 +26,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -113,7 +113,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 79914fc..0ef2c45 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,36 +63,49 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
+{
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
+
+	malloc_elem_free_list_insert(elem);
+
+	heap->total_size += len;
+
+	return elem;
+}
+
 static int
-malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
+malloc_add_seg(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg __rte_unused)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_elem *start_elem;
-	struct rte_memseg *found_ms;
+	struct rte_memseg_list *found_msl;
 	struct malloc_heap *heap;
-	size_t elem_size;
-	int ms_idx;
-
-	heap = &mcfg->malloc_heaps[ms->socket_id];
+	int msl_idx;
 
-	/* ms is const, so find it */
-	ms_idx = ms - mcfg->memseg;
-	found_ms = &mcfg->memseg[ms_idx];
+	heap = &mcfg->malloc_heaps[msl->socket_id];
 
-	start_elem = (struct malloc_elem *)found_ms->addr;
-	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	/* msl is const, so find it */
+	msl_idx = msl - mcfg->memsegs;
+	found_msl = &mcfg->memsegs[msl_idx];
 
-	malloc_elem_init(start_elem, heap, found_ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
+		return -1;
 
-	heap->total_size += elem_size;
+	malloc_heap_add_memory(heap, found_msl, ms->addr, len);
 
+	RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20,
+			msl->socket_id);
 	return 0;
 }
 
@@ -114,7 +128,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound,
 					contig)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->page_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -263,7 +278,6 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
-	rte_memseg_walk(malloc_heap_add_memseg, NULL);
-
-	return 0;
+	/* add all IOVA-contiguous areas to the heap */
+	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 436818a..c6d3e57 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -242,17 +242,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t) addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
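
From an application's point of view rte_malloc_virt2iova() behaves as
before; only the internal lookup changed to go through the element's memseg
list. A minimal usage sketch (names and sizes are arbitrary):

#include <stdio.h>
#include <rte_malloc.h>
#include <rte_memory.h>

/* illustrative: allocate from the DPDK heap and query the buffer's IOVA */
static void
virt2iova_example(void)
{
	void *p = rte_malloc(NULL, 4096, 0);
	rte_iova_t iova;

	if (p == NULL)
		return;

	iova = rte_malloc_virt2iova(p);
	if (iova == RTE_BAD_IOVA)
		printf("%p has no IOVA (e.g. --no-huge without IOVA-as-VA)\n", p);

	rte_free(p);
}
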
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b34e57a..ffcbd71 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -640,11 +640,14 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
-	int *socket = arg;
+	int *socket_id = arg;
 
-	return ms->socket_id == *socket;
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
+		return 1;
+
+	return 0;
 }
 
 static void
@@ -654,7 +657,7 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..afebd42 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -160,6 +161,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -189,6 +202,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -205,11 +220,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = get_file_size(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 17c559f..d38fb68 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -253,13 +253,12 @@ void numa_error(char *where)
  */
 static unsigned
 map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
-		  uint64_t *essential_memory __rte_unused, int orig)
+		  uint64_t *essential_memory __rte_unused)
 {
 	int fd;
 	unsigned i;
 	void *virtaddr;
-	void *vma_addr = NULL;
-	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -274,7 +273,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		have_numa = false;
 	}
 
-	if (orig && have_numa) {
+	if (have_numa) {
 		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
 		if (get_mempolicy(&oldpolicy, oldmask->maskp,
 				  oldmask->size + 1, 0, 0) < 0) {
@@ -290,6 +289,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 #endif
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
+		struct hugepage_file *hf = &hugepg_tbl[i];
 		uint64_t hugepage_sz = hpi->hugepage_sz;
 
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
@@ -324,66 +324,14 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 #endif
 
-		if (orig) {
-			hugepg_tbl[i].file_id = i;
-			hugepg_tbl[i].size = hugepage_sz;
-			eal_get_hugefile_path(hugepg_tbl[i].filepath,
-					sizeof(hugepg_tbl[i].filepath), hpi->hugedir,
-					hugepg_tbl[i].file_id);
-			hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
-		}
-#ifndef RTE_ARCH_64
-		/* for 32-bit systems, don't remap 1G and 16G pages, just reuse
-		 * original map address as final map address.
-		 */
-		else if ((hugepage_sz == RTE_PGSIZE_1G)
-			|| (hugepage_sz == RTE_PGSIZE_16G)) {
-			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
-			hugepg_tbl[i].orig_va = NULL;
-			continue;
-		}
-#endif
-		else if (vma_len == 0) {
-			unsigned j, num_pages;
-
-			/* reserve a virtual area for next contiguous
-			 * physical block: count the number of
-			 * contiguous physical pages. */
-			for (j = i+1; j < hpi->num_pages[0] ; j++) {
-#ifdef RTE_ARCH_PPC_64
-				/* The physical addresses are sorted in
-				 * descending order on PPC64 */
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr - hugepage_sz)
-					break;
-#else
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr + hugepage_sz)
-					break;
-#endif
-			}
-			num_pages = j - i;
-			vma_len = num_pages * hugepage_sz;
-
-			/* get the biggest virtual memory area up to
-			 * vma_len. If it fails, vma_addr is NULL, so
-			 * let the kernel provide the address. */
-			vma_addr = eal_get_virtual_area(NULL, &vma_len,
-					hpi->hugepage_sz,
-					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
-					EAL_VIRTUAL_AREA_UNMAP,
-#ifdef RTE_ARCH_PPC_64
-					MAP_HUGETLB
-#else
-					0
-#endif
-					);
-			if (vma_addr == NULL)
-				vma_len = hugepage_sz;
-		}
+		hf->file_id = i;
+		hf->size = hugepage_sz;
+		eal_get_hugefile_path(hf->filepath, sizeof(hf->filepath),
+				hpi->hugedir, hf->file_id);
+		hf->filepath[sizeof(hf->filepath) - 1] = '\0';
 
 		/* try to create hugepage file */
-		fd = open(hugepg_tbl[i].filepath, O_CREAT | O_RDWR, 0600);
+		fd = open(hf->filepath, O_CREAT | O_RDWR, 0600);
 		if (fd < 0) {
 			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
 					strerror(errno));
@@ -391,8 +339,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		/* map the segment, and populate page tables,
-		 * the kernel fills this segment with zeros */
-		virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
+		 * the kernel fills this segment with zeros. we don't care where
+		 * this gets mapped - we already have contiguous memory areas
+		 * ready for us to map into.
+		 */
+		virtaddr = mmap(NULL, hugepage_sz, PROT_READ | PROT_WRITE,
 				MAP_SHARED | MAP_POPULATE, fd, 0);
 		if (virtaddr == MAP_FAILED) {
 			RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
@@ -401,44 +352,38 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			goto out;
 		}
 
-		if (orig) {
-			hugepg_tbl[i].orig_va = virtaddr;
-		}
-		else {
-			/* rewrite physical addresses in IOVA as VA mode */
-			if (rte_eal_iova_mode() == RTE_IOVA_VA)
-				hugepg_tbl[i].physaddr = (uintptr_t)virtaddr;
-			hugepg_tbl[i].final_va = virtaddr;
-		}
+		hf->orig_va = virtaddr;
 
-		if (orig) {
-			/* In linux, hugetlb limitations, like cgroup, are
-			 * enforced at fault time instead of mmap(), even
-			 * with the option of MAP_POPULATE. Kernel will send
-			 * a SIGBUS signal. To avoid to be killed, save stack
-			 * environment here, if SIGBUS happens, we can jump
-			 * back here.
-			 */
-			if (huge_wrap_sigsetjmp()) {
-				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
-					"hugepages of size %u MB\n",
-					(unsigned)(hugepage_sz / 0x100000));
-				munmap(virtaddr, hugepage_sz);
-				close(fd);
-				unlink(hugepg_tbl[i].filepath);
+		/* In Linux, hugetlb limitations such as cgroup limits
+		 * are enforced at fault time instead of at mmap() time,
+		 * even with MAP_POPULATE. The kernel will send a SIGBUS
+		 * signal. To avoid being killed, save the stack
+		 * environment here; if SIGBUS happens, we can jump
+		 * back here.
+		 */
+		if (huge_wrap_sigsetjmp()) {
+			RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
+				"hugepages of size %u MB\n",
+				(unsigned int)(hugepage_sz / 0x100000));
+			munmap(virtaddr, hugepage_sz);
+			close(fd);
+			unlink(hugepg_tbl[i].filepath);
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-				if (maxnode)
-					essential_memory[node_id] =
-						essential_prev;
+			if (maxnode)
+				essential_memory[node_id] =
+					essential_prev;
 #endif
-				goto out;
-			}
-			*(int *)virtaddr = 0;
+			goto out;
 		}
+		*(int *)virtaddr = 0;
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -446,9 +391,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		close(fd);
-
-		vma_addr = (char *)vma_addr + hugepage_sz;
-		vma_len -= hugepage_sz;
 	}
 
 out:
@@ -470,20 +412,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	return i;
 }
 
-/* Unmap all hugepages from original mapping */
-static int
-unmap_all_hugepages_orig(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
-{
-        unsigned i;
-        for (i = 0; i < hpi->num_pages[0]; i++) {
-                if (hugepg_tbl[i].orig_va) {
-                        munmap(hugepg_tbl[i].orig_va, hpi->hugepage_sz);
-                        hugepg_tbl[i].orig_va = NULL;
-                }
-        }
-        return 0;
-}
-
 /*
  * Parse /proc/self/numa_maps to get the NUMA socket ID for each huge
  * page.
@@ -623,7 +551,7 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, int dest_size,
 	int src_pos, dst_pos = 0;
 
 	for (src_pos = 0; src_pos < src_size; src_pos++) {
-		if (src[src_pos].final_va != NULL) {
+		if (src[src_pos].orig_va != NULL) {
 			/* error on overflow attempt */
 			if (dst_pos == dest_size)
 				return -1;
@@ -694,9 +622,10 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 						unmap_len = hp->size;
 
 						/* get start addr and len of the remaining segment */
-						munmap(hp->final_va, (size_t) unmap_len);
+						munmap(hp->orig_va,
+							(size_t)unmap_len);
 
-						hp->final_va = NULL;
+						hp->orig_va = NULL;
 						if (unlink(hp->filepath) == -1) {
 							RTE_LOG(ERR, EAL, "%s(): Removing %s failed: %s\n",
 									__func__, hp->filepath, strerror(errno));
@@ -715,6 +644,413 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 	return 0;
 }
 
+static int
+remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int cur_page, seg_len;
+	unsigned int msl_idx;
+	int ms_idx;
+	uint64_t page_sz;
+	size_t memseg_len;
+	int socket_id;
+
+	page_sz = hugepages[seg_start].size;
+	socket_id = hugepages[seg_start].socket_id;
+	seg_len = seg_end - seg_start;
+
+	RTE_LOG(DEBUG, EAL, "Attempting to map %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20ULL, socket_id);
+
+	/* find free space in memseg lists */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool empty;
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		if (msl->page_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket_id)
+			continue;
+
+		/* leave space for a hole if array is not empty */
+		empty = arr->count == 0;
+		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
+				seg_len + (empty ? 0 : 1));
+
+		/* memseg list is full? */
+		if (ms_idx < 0)
+			continue;
+
+		/* leave some space between memsegs, they are not IOVA
+		 * contiguous, so they shouldn't be VA contiguous either.
+		 */
+		if (!empty)
+			ms_idx++;
+		break;
+	}
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+				RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+				RTE_STR(CONFIG_RTE_MAX_MEM_PER_TYPE));
+		return -1;
+	}
+
+#ifdef RTE_ARCH_PPC64
+	/* for PPC64 we go through the list backwards */
+	for (cur_page = seg_end - 1; cur_page >= seg_start;
+			cur_page--, ms_idx++) {
+#else
+	for (cur_page = seg_start; cur_page < seg_end; cur_page++, ms_idx++) {
+#endif
+		struct hugepage_file *hfile = &hugepages[cur_page];
+		struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+		struct flock lck;
+		void *addr;
+		int fd;
+
+		fd = open(hfile->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			return -1;
+		}
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = page_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "Could not lock '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+		memseg_len = (size_t)page_sz;
+		addr = RTE_PTR_ADD(msl->base_va, ms_idx * memseg_len);
+
+		/* we know this address is already mmapped by memseg list, so
+		 * using MAP_FIXED here is safe
+		 */
+		addr = mmap(addr, page_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Couldn't remap '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		/* we have a new address, so unmap previous one */
+#ifndef RTE_ARCH_64
+		/* in 32-bit legacy mode, we have already unmapped the page */
+		if (!internal_config.legacy_mem)
+			munmap(hfile->orig_va, page_sz);
+#else
+		munmap(hfile->orig_va, page_sz);
+#endif
+
+		hfile->orig_va = NULL;
+		hfile->final_va = addr;
+
+		/* rewrite physical addresses in IOVA as VA mode */
+		if (rte_eal_iova_mode() == RTE_IOVA_VA)
+			hfile->physaddr = (uintptr_t)addr;
+
+		/* set up memseg data */
+		ms->addr = addr;
+		ms->hugepage_sz = page_sz;
+		ms->len = memseg_len;
+		ms->iova = hfile->physaddr;
+		ms->socket_id = hfile->socket_id;
+		ms->nchannel = rte_memory_get_nchannel();
+		ms->nrank = rte_memory_get_nrank();
+
+		rte_fbarray_set_used(arr, ms_idx);
+
+		close(fd);
+	}
+	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20, socket_id);
+	return 0;
+}
+
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int n_segs, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, n_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+/*
+ * Our VA space is not preallocated yet, so preallocate it here. We need to know
+ * how many segments there are in order to map all pages into one address space,
+ * and leave appropriate holes between segments so that rte_malloc does not
+ * concatenate them into one big segment.
+ *
+ * we also need to unmap original pages to free up address space.
+ */
+static int __rte_unused
+prealloc_segments(struct hugepage_file *hugepages, int n_pages)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int cur_page, seg_start_page, end_seg, new_memseg;
+	unsigned int hpi_idx, socket, i;
+	int n_contig_segs, n_segs;
+	int msl_idx;
+
+	/* before we preallocate segments, we need to free up our VA space.
+	 * we're not removing files, and we already have information about
+	 * PA-contiguousness, so it is safe to unmap everything.
+	 */
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *hpi = &hugepages[cur_page];
+		munmap(hpi->orig_va, hpi->size);
+		hpi->orig_va = NULL;
+	}
+
+	/* we cannot know how many page sizes and sockets we have discovered, so
+	 * loop over all of them
+	 */
+	for (hpi_idx = 0; hpi_idx < internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		uint64_t page_sz =
+			internal_config.hugepage_info[hpi_idx].hugepage_sz;
+
+		for (i = 0; i < rte_socket_count(); i++) {
+			struct rte_memseg_list *msl;
+
+			socket = rte_socket_id_by_idx(i);
+			n_contig_segs = 0;
+			n_segs = 0;
+			seg_start_page = -1;
+
+			for (cur_page = 0; cur_page < n_pages; cur_page++) {
+				struct hugepage_file *prev, *cur;
+				int prev_seg_start_page = -1;
+
+				cur = &hugepages[cur_page];
+				prev = cur_page == 0 ? NULL :
+						&hugepages[cur_page - 1];
+
+				new_memseg = 0;
+				end_seg = 0;
+
+				if (cur->size == 0)
+					end_seg = 1;
+				else if (cur->socket_id != (int) socket)
+					end_seg = 1;
+				else if (cur->size != page_sz)
+					end_seg = 1;
+				else if (cur_page == 0)
+					new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+				/* On the PPC64 architecture, mmap always maps
+				 * from higher to lower addresses. Here,
+				 * physical addresses are in descending order.
+				 */
+				else if ((prev->physaddr - cur->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#else
+				else if ((cur->physaddr - prev->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#endif
+				if (new_memseg) {
+					/* if we're already inside a segment,
+					 * new segment means end of current one
+					 */
+					if (seg_start_page != -1) {
+						end_seg = 1;
+						prev_seg_start_page =
+								seg_start_page;
+					}
+					seg_start_page = cur_page;
+				}
+
+				if (end_seg) {
+					if (prev_seg_start_page != -1) {
+						/* we've found a new segment */
+						n_contig_segs++;
+						n_segs += cur_page -
+							prev_seg_start_page;
+					} else if (seg_start_page != -1) {
+						/* we didn't find new segment,
+						 * but did end current one
+						 */
+						n_contig_segs++;
+						n_segs += cur_page -
+								seg_start_page;
+						seg_start_page = -1;
+						continue;
+					} else {
+						/* we're skipping this page */
+						continue;
+					}
+				}
+				/* segment continues */
+			}
+			/* check if we missed last segment */
+			if (seg_start_page != -1) {
+				n_contig_segs++;
+				n_segs += cur_page - seg_start_page;
+			}
+
+			/* if no segments were found, do not preallocate */
+			if (n_segs == 0)
+				continue;
+
+			/* we now have total number of pages that we will
+			 * allocate for this segment list. add separator pages
+			 * to the total count, and preallocate VA space.
+			 */
+			n_segs += n_contig_segs - 1;
+
+			/* now, preallocate VA space for these segments */
+
+			/* first, find suitable memseg list for this */
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				msl = &mcfg->memsegs[msl_idx];
+
+				if (msl->base_va != NULL)
+					continue;
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Not enough space in memseg lists, please increase %s\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+				return -1;
+			}
+
+			/* now, allocate fbarray itself */
+			if (alloc_memseg_list(msl, page_sz, n_segs, socket,
+						msl_idx) < 0)
+				return -1;
+
+			/* finally, allocate VA space */
+			if (alloc_va_space(msl) < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+/*
+ * We cannot reallocate memseg lists on the fly because PPC64 stores pages
+ * backwards; we therefore have to process the entire memseg before
+ * remapping it into memseg list VA space.
+ */
+static int
+remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
+{
+	int cur_page, seg_start_page, new_memseg, ret;
+
+	seg_start_page = 0;
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *prev, *cur;
+
+		new_memseg = 0;
+
+		cur = &hugepages[cur_page];
+		prev = cur_page == 0 ? NULL : &hugepages[cur_page - 1];
+
+		/* if size is zero, no more pages left */
+		if (cur->size == 0)
+			break;
+
+		if (cur_page == 0)
+			new_memseg = 1;
+		else if (cur->socket_id != prev->socket_id)
+			new_memseg = 1;
+		else if (cur->size != prev->size)
+			new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+		/* On the PPC64 architecture, mmap always maps from higher
+		 * to lower addresses. Here, physical addresses are in
+		 * descending order.
+		 */
+		else if ((prev->physaddr - cur->physaddr) != cur->size)
+			new_memseg = 1;
+#else
+		else if ((cur->physaddr - prev->physaddr) != cur->size)
+			new_memseg = 1;
+#endif
+
+		if (new_memseg) {
+			/* if this isn't the first time, remap segment */
+			if (cur_page != 0) {
+				ret = remap_segment(hugepages, seg_start_page,
+						cur_page);
+				if (ret != 0)
+					return -1;
+			}
+			/* remember where we started */
+			seg_start_page = cur_page;
+		}
+		/* continuation of previous memseg */
+	}
+	/* we were stopped, but we didn't remap the last segment, do it now */
+	if (cur_page != 0) {
+		ret = remap_segment(hugepages, seg_start_page,
+				cur_page);
+		if (ret != 0)
+			return -1;
+	}
+	return 0;
+}
+
 static inline uint64_t
 get_socket_mem_size(int socket)
 {
@@ -753,8 +1089,10 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 	/* if specific memory amounts per socket weren't requested */
 	if (internal_config.force_sockets == 0) {
+		size_t total_size;
+#ifdef RTE_ARCH_64
 		int cpu_per_socket[RTE_MAX_NUMA_NODES];
-		size_t default_size, total_size;
+		size_t default_size;
 		unsigned lcore_id;
 
 		/* Compute number of cores per socket */
@@ -772,7 +1110,7 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 			/* Set memory amount per socket */
 			default_size = (internal_config.memory * cpu_per_socket[socket])
-			                / rte_lcore_count();
+					/ rte_lcore_count();
 
 			/* Limit to maximum available memory on socket */
 			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
@@ -789,12 +1127,33 @@ calc_num_pages_per_socket(uint64_t * memory,
 		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
 			/* take whatever is available */
 			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
-			                       total_size);
+					       total_size);
 
 			/* Update sizes */
 			memory[socket] += default_size;
 			total_size -= default_size;
 		}
+#else
+		/* in 32-bit mode, allocate all of the memory only on master
+		 * lcore socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0;
+				socket++) {
+			struct rte_config *cfg = rte_eal_get_configuration();
+			unsigned int master_lcore_socket;
+
+			master_lcore_socket =
+				rte_lcore_to_socket_id(cfg->master_lcore);
+
+			if (master_lcore_socket != socket)
+				continue;
+
+			/* Update sizes */
+			memory[socket] = total_size;
+			break;
+		}
+#endif
 	}
 
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
@@ -842,7 +1201,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 			}
 		}
 		/* if we didn't satisfy all memory requirements per socket */
-		if (memory[socket] > 0) {
+		if (memory[socket] > 0 &&
+				internal_config.socket_mem[socket] != 0) {
 			/* to prevent icc errors */
 			requested = (unsigned) (internal_config.socket_mem[socket] /
 					0x100000);
@@ -928,11 +1288,13 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
-	int i, j, new_memseg;
+	int i, j;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -945,6 +1307,25 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		struct rte_memseg_list *msl;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				     sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -952,14 +1333,27 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
-		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is one page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, (size_t)page_sz);
+		}
 		return 0;
 	}
 
@@ -992,7 +1386,6 @@ eal_legacy_hugepage_init(void)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		memory[i] = internal_config.socket_mem[i];
 
-
 	/* map all hugepages and sort them */
 	for (i = 0; i < (int)internal_config.num_hugepage_sizes; i ++){
 		unsigned pages_old, pages_new;
@@ -1010,8 +1403,7 @@ eal_legacy_hugepage_init(void)
 
 		/* map all hugepages available */
 		pages_old = hpi->num_pages[0];
-		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi,
-					      memory, 1);
+		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, memory);
 		if (pages_new < pages_old) {
 			RTE_LOG(DEBUG, EAL,
 				"%d not %d hugepages of size %u MB allocated\n",
@@ -1054,18 +1446,6 @@ eal_legacy_hugepage_init(void)
 		qsort(&tmp_hp[hp_offset], hpi->num_pages[0],
 		      sizeof(struct hugepage_file), cmp_physaddr);
 
-		/* remap all hugepages */
-		if (map_all_hugepages(&tmp_hp[hp_offset], hpi, NULL, 0) !=
-		    hpi->num_pages[0]) {
-			RTE_LOG(ERR, EAL, "Failed to remap %u MB pages\n",
-					(unsigned)(hpi->hugepage_sz / 0x100000));
-			goto fail;
-		}
-
-		/* unmap original mappings */
-		if (unmap_all_hugepages_orig(&tmp_hp[hp_offset], hpi) < 0)
-			goto fail;
-
 		/* we have processed a num of hugepages of this size, so inc offset */
 		hp_offset += hpi->num_pages[0];
 	}
@@ -1148,7 +1528,7 @@ eal_legacy_hugepage_init(void)
 
 	/*
 	 * copy stuff from malloc'd hugepage* to the actual shared memory.
-	 * this procedure only copies those hugepages that have final_va
+	 * this procedure only copies those hugepages that have orig_va
 	 * not NULL. has overflow protection.
 	 */
 	if (copy_hugepages_to_shared_mem(hugepage, nr_hugefiles,
@@ -1157,6 +1537,23 @@ eal_legacy_hugepage_init(void)
 		goto fail;
 	}
 
+#ifndef RTE_ARCH_64
+	/* for legacy 32-bit mode, we did not preallocate VA space, so do it */
+	if (internal_config.legacy_mem &&
+			prealloc_segments(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Could not preallocate VA space for hugepages\n");
+		goto fail;
+	}
+#endif
+
+	/* remap all pages we do need into memseg list VA space, so that those
+	 * pages become first-class citizens in DPDK memory subsystem
+	 */
+	if (remap_needed_hugepages(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Couldn't remap hugepage files into memseg lists\n");
+		goto fail;
+	}
+
 	/* free the hugepage backing files */
 	if (internal_config.hugepage_unlink &&
 		unlink_hugepage_files(tmp_hp, internal_config.num_hugepage_sizes) < 0) {
@@ -1168,75 +1565,30 @@ eal_legacy_hugepage_init(void)
 	free(tmp_hp);
 	tmp_hp = NULL;
 
-	/* first memseg index shall be 0 after incrementing it below */
-	j = -1;
-	for (i = 0; i < nr_hugefiles; i++) {
-		new_memseg = 0;
-
-		/* if this is a new section, create a new memseg */
-		if (i == 0)
-			new_memseg = 1;
-		else if (hugepage[i].socket_id != hugepage[i-1].socket_id)
-			new_memseg = 1;
-		else if (hugepage[i].size != hugepage[i-1].size)
-			new_memseg = 1;
-
-#ifdef RTE_ARCH_PPC_64
-		/* On PPC64 architecture, the mmap always start from higher
-		 * virtual address to lower address. Here, both the physical
-		 * address and virtual address are in descending order */
-		else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i-1].final_va -
-		    (unsigned long)hugepage[i].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#else
-		else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i].final_va -
-		    (unsigned long)hugepage[i-1].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#endif
+	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
 
-		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
-				break;
+	/* we're not going to allocate more pages, so release VA space for
+	 * unused memseg lists
+	 */
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		size_t mem_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
-		}
-		/* continuation of previous memseg */
-		else {
-#ifdef RTE_ARCH_PPC_64
-		/* Use the phy and virt address of the last page as segment
-		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-#endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
-		}
-		hugepage[i].memseg_id = j;
-	}
+		/* skip inactive lists */
+		if (msl->base_va == NULL)
+			continue;
+		/* skip lists where there is at least one page allocated */
+		if (msl->memseg_arr.count > 0)
+			continue;
+		/* this is an unused list, deallocate it */
+		mem_sz = (size_t)msl->page_sz * msl->memseg_arr.len;
+		munmap(msl->base_va, mem_sz);
+		msl->base_va = NULL;
 
-	if (i < nr_hugefiles) {
-		RTE_LOG(ERR, EAL, "Can only reserve %d pages "
-			"from %d requested\n"
-			"Current %s=%d is not enough\n"
-			"Please either increase it or request less amount "
-			"of memory.\n",
-			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
-		goto fail;
+		/* destroy backing fbarray */
+		rte_fbarray_destroy(&msl->memseg_arr);
 	}
 
-	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
-
 	return 0;
 
 fail:
@@ -1269,11 +1621,10 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i = 0;
+	unsigned int cur_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1292,50 +1643,6 @@ eal_legacy_hugepage_attach(void)
 		goto error;
 	}
 
-	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
-		size_t mmap_sz;
-		int mmap_flags = 0;
-
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
-
-		/* get identical addresses as the primary process.
-		 */
-#ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
-#endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
-			}
-			goto error;
-		}
-	}
-
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
@@ -1346,46 +1653,49 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				offset+=mapping_size;
-			}
+	/* map all segments into memory to make sure we get the addrs. The
+	 * segments themselves are already in the memseg list (which is shared
+	 * and has its VA space preallocated), so we just need to map
+	 * everything at the correct addresses.
+	 */
+	for (i = 0; i < num_hp; i++) {
+		struct hugepage_file *hf = &hp[i];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+		struct flock lck;
+
+		/* if size is zero, no more pages left */
+		if (map_sz == 0)
+			break;
+
+		fd = open(hf->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
+
+		map_addr = mmap(map_addr, map_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_FIXED, fd, 0);
+		if (map_addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Could not map %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
+		}
+
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = map_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed: %s\n",
+				__func__, strerror(errno));
+			close(fd);
+			goto error;
+		}
+
+		close(fd);
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1393,8 +1703,15 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unmap whatever we managed to map before the failure */
+	cur_seg = 0;
+	for (; cur_seg < i; cur_seg++) {
+		struct hugepage_file *hf = &hp[cur_seg];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+
+		munmap(map_addr, map_sz);
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index f6fe93e..2c27063 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -686,7 +686,8 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-type1_map(const struct rte_memseg *ms, void *arg)
+type1_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -799,7 +800,8 @@ vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 }
 
 static int
-vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_map_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -812,7 +814,8 @@ struct spapr_walk_param {
 	uint64_t hugepage_sz;
 };
 static int
-vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 0f542b1..23b339e 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -241,7 +240,9 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_memseg_contig_walk;
+	rte_memseg_list_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ea62b6b..36fce9d 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -100,12 +100,12 @@ static unsigned optimize_object_size(unsigned obj_size)
 }
 
 static int
-find_min_pagesz(const struct rte_memseg *ms, void *arg)
+find_min_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	size_t *min = arg;
 
-	if (ms->hugepage_sz < *min)
-		*min = ms->hugepage_sz;
+	if (msl->page_sz < *min)
+		*min = msl->page_sz;
 
 	return 0;
 }
@@ -115,11 +115,12 @@ get_min_page_size(void)
 {
 	size_t min_pagesz = SIZE_MAX;
 
-	rte_memseg_walk(find_min_pagesz, &min_pagesz);
+	rte_memseg_list_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
 
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 28c241f..4b5abb4 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -706,36 +707,20 @@ test_malloc_bad_params(void)
 }
 
 static int
-check_socket_mem(const struct rte_memseg *ms, void *arg)
+check_socket_mem(const struct rte_memseg_list *msl, void *arg)
 {
 	int32_t *socket = arg;
 
-	return *socket == ms->socket_id;
+	return *socket == msl->socket_id;
 }
 
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	return rte_memseg_walk(check_socket_mem, &socket);
+	return rte_memseg_list_walk(check_socket_mem, &socket);
 }
 
-struct walk_param {
-	void *addr;
-	int32_t socket;
-};
-static int
-find_socket(const struct rte_memseg *ms, void *arg)
-{
-	struct walk_param *param = arg;
-
-	if (param->addr >= ms->addr &&
-			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
-		param->socket = ms->socket_id;
-		return 1;
-	}
-	return 0;
-}
 
 /*
  * Find what socket a memory address is on. Only works for addresses within
@@ -744,10 +729,9 @@ find_socket(const struct rte_memseg *ms, void *arg)
 static int32_t
 addr_to_socket(void * addr)
 {
-	struct walk_param param = {.addr = addr, .socket = 0};
-	if (rte_memseg_walk(find_socket, &param) > 0)
-		return param.socket;
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
+
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index c9b287c..b96bca7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -23,12 +26,13 @@
  */
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+check_mem(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg __rte_unused)
 {
 	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
-	size_t i;
+	size_t i, max = ms->len;
 
-	for (i = 0; i < ms->len; i++, mem++)
+	for (i = 0; i < max; i++, mem++)
 		*mem;
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index cbf0cfa..0046f04 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -111,17 +111,17 @@ struct walk_arg {
 	int hugepage_16GB_avail;
 };
 static int
-find_available_pagesz(const struct rte_memseg *ms, void *arg)
+find_available_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	struct walk_arg *wa = arg;
 
-	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+	if (msl->page_sz == RTE_PGSIZE_2M)
 		wa->hugepage_2MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+	if (msl->page_sz == RTE_PGSIZE_1G)
 		wa->hugepage_1GB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+	if (msl->page_sz == RTE_PGSIZE_16M)
 		wa->hugepage_16MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+	if (msl->page_sz == RTE_PGSIZE_16G)
 		wa->hugepage_16GB_avail = 1;
 
 	return 0;
@@ -138,7 +138,7 @@ test_memzone_reserve_flags(void)
 
 	memset(&wa, 0, sizeof(wa));
 
-	rte_memseg_walk(find_available_pagesz, &wa);
+	rte_memseg_list_walk(find_available_pagesz, &wa);
 
 	hugepage_2MB_avail = wa.hugepage_2MB_avail;
 	hugepage_1GB_avail = wa.hugepage_1GB_avail;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 50/70] eal: replace memzone array with fbarray
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (51 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 49/70] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
                       ` (19 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The fbarray infrastructure is already there, so we might as well use
it for the memzone list; some operations, such as memzone lookup and
iteration, get faster as a result.
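
For illustration, the lookup and walk pattern the memzone code switches
to boils down to iterating the fbarray's used slots. A minimal sketch
(not part of the patch, using the rte_fbarray API introduced earlier in
this series):

    #include <stdio.h>
    #include <rte_fbarray.h>
    #include <rte_memzone.h>

    /* dump all allocated memzones backed by an fbarray */
    static void
    dump_memzones(struct rte_fbarray *arr)
    {
            int i = rte_fbarray_find_next_used(arr, 0);

            while (i >= 0) {
                    const struct rte_memzone *mz = rte_fbarray_get(arr, i);

                    printf("%s: addr %p, len %zu\n", mz->name, mz->addr, mz->len);
                    i = rte_fbarray_find_next_used(arr, i + 1);
            }
    }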

Since we have to allocate an fbarray for memzones, we must do so
before initializing the memory subsystem: in secondary processes,
memory init will (later) allocate more fbarrays than the primary
process has, and attaching to the memzone fbarray would fail if it
were created after the fact.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fix bug in memzone lookup iteration code
    
    v4:
    - Fix memzone lookup skipping over memzones
    - Fix error message on failing to find space for memzone
    
    v3:
    - Moved earlier in patchset
    - Fixed compile issues
    - Removed rte_panic() calls

 drivers/net/ena/Makefile                          |   3 +
 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  14 ++-
 lib/librte_eal/common/eal_common_memzone.c        | 113 ++++++++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 ++-
 test/test/test_memzone.c                          |   9 +-
 8 files changed, 103 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
index f9bfe05..43339f3 100644
--- a/drivers/net/ena/Makefile
+++ b/drivers/net/ena/Makefile
@@ -43,6 +43,9 @@ INCLUDES :=-I$(SRCDIR) -I$(SRCDIR)/base/ena_defs -I$(SRCDIR)/base
 EXPORT_MAP := rte_pmd_ena_version.map
 LIBABIVER := 1
 
+# rte_fbarray is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 VPATH += $(SRCDIR)/base
 #
 # all source are stored in SRCS-y
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d009cf0..54330e1 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -599,14 +599,24 @@ rte_eal_init(int argc, char **argv)
 		}
 	}
 
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
 		return -1;
 	}
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1f5f753..12ddd42 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,30 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		mz = rte_fbarray_get(arr, i);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
-
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,15 +91,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i;
+	int socket, i, mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -224,17 +214,22 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	mz_idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (mz_idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, mz_idx);
+		mz = rte_fbarray_get(arr, mz_idx);
+	}
 
 	if (mz == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
-				"in config!\n", __func__);
+		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone\n", __func__);
 		malloc_elem_free(elem);
 		rte_errno = ENOSPC;
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -307,34 +302,38 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
-	void *addr;
+	void *addr = NULL;
 	unsigned idx;
 
 	if (mz == NULL)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
+		ret = -EINVAL;
+	} else if (found_mz->addr == NULL) {
+		RTE_LOG(ERR, EAL, "Memzone is not allocated\n");
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		memset(found_mz, 0, sizeof(*found_mz));
+		rte_fbarray_set_free(arr, idx);
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	rte_free(addr);
+	if (addr != NULL)
+		rte_free(addr);
 
 	return ret;
 }
@@ -370,7 +369,7 @@ dump_memzone(const struct rte_memzone *mz, void *arg)
 	size_t page_sz;
 	FILE *f = arg;
 
-	mz_idx = mz - mcfg->memzone;
+	mz_idx = rte_fbarray_find_idx(&mcfg->memzones, mz);
 
 	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
@@ -427,19 +426,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -447,14 +450,18 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b745e18..88cde8c 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 0ef2c45..d798675 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -278,6 +278,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary process does not need to initialize anything */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
 	/* add all IOVA-contiguous areas to the heap */
 	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ffcbd71..9832551 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -858,6 +858,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -868,8 +877,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 0046f04..efcf732 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -909,7 +909,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -933,7 +933,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -1012,7 +1012,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1033,7 +1033,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 51/70] eal: add support for mapping hugepages at runtime
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (52 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 50/70] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 52/70] eal: add support for unmapping pages " Anatoly Burakov
                       ` (18 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Nothing uses this code yet. The bulk of it is copied from the old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing contiguous VA for all of the requested pages.

Not supported on FreeBSD.
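
As a usage illustration only (a sketch based on the stub signatures
added in this patch; socket 0 is a placeholder and this is an
EAL-internal interface, not a public API):

    #include <stdbool.h>
    #include <rte_memory.h>
    #include "eal_memalloc.h"

    static int
    grab_contig_2m_pages(struct rte_memseg *ms[], int n)
    {
            /* request exactly 'n' 2MB pages on socket 0; the returned
             * segments are VA-contiguous on success
             */
            if (eal_memalloc_alloc_seg_bulk(ms, n, RTE_PGSIZE_2M, 0, true) < 0)
                    return -1;
            return 0;
    }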

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't
have to keep the original fd's around. Plus, using fcntl() gives
us the ability to lock parts of a file, which is useful for the
single-file segments support that is coming down the line.
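
For illustration, a minimal sketch of the kind of byte-range lock this
relies on (lock_region is an illustrative name, not the patch's helper;
the real one is the lock() function in eal_memalloc.c below):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int
lock_region(int fd, off_t offset, off_t len, short type)
{
	struct flock fl;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = type;	/* F_RDLCK while mapped, F_WRLCK before freeing */
	fl.l_whence = SEEK_SET;
	fl.l_start = offset;	/* lock only this page's slice of the file */
	fl.l_len = len;

	/* F_SETLK fails with EAGAIN/EACCES if another process holds a
	 * conflicting lock - which answers "is this page still in use?" */
	return fcntl(fd, F_SETLK, &fl);
}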

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Compile fixes for various platforms
    - Split single file segments stuff into separate commit
    
    v3:
    - Split single file segments into separate patch
    - Added missing FreeBSD implementation
    - Removed rte_panic when unable to free page
    
    v3:
    - Added single file segments functionality in this
      commit, instead of later commits

 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  31 +++
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 429 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 491 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..8c30670
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n_segs, size_t __rte_unused page_sz,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..f628514
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+/*
+ * Allocate segment of specified page size.
+ */
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket);
+
+/*
+ * Allocate `n_segs` segments.
+ *
+ * Note: `ms` can be NULL.
+ *
+ * Note: it is possible to request best-effort allocation by setting `exact` to
+ * `false`, in which case the allocator will return however many pages it
+ * managed to allocate successfully.
+ */
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact);
+
+#endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..45ea0ad
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,429 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+check_numa(void)
+{
+	bool ret = true;
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		ret = false;
+	}
+	return ret;
+}
+
+static void
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+	if (get_mempolicy(oldpolicy, oldmask->maskp,
+			  oldmask->size + 1, 0, 0) < 0) {
+		RTE_LOG(ERR, EAL,
+			"Failed to get current mempolicy: %s. "
+			"Assuming MPOL_DEFAULT.\n", strerror(errno));
+		*oldpolicy = MPOL_DEFAULT;
+	}
+	RTE_LOG(DEBUG, EAL,
+		"Setting policy MPOL_PREFERRED for socket %d\n",
+		socket_id);
+	numa_set_preferred(socket_id);
+}
+
+static void
+resotre_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static int
+get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+	eal_get_hugefile_path(path, buflen, hi->hugedir,
+			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+				strerror(errno));
+		return -1;
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck;
+	int ret;
+
+	memset(&lck, 0, sizeof(lck));
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	int cur_socket_id = 0;
+#endif
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+	size_t alloc_sz;
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	alloc_sz = hi->hugepage_sz;
+
+	map_offset = 0;
+	if (ftruncate(fd, alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+			__func__, strerror(errno));
+		goto resized;
+	}
+	/* we've allocated a page - take out a read lock. we're using fcntl()
+	 * locks rather than flock() here because doing that gives us one huge
+	 * advantage - fcntl() locks are per-process, not per-file descriptor,
+	 * which means that we don't have to keep the original fd's around to
+	 * keep a lock on the file.
+	 *
+	 * this is useful, because when it comes to unmapping pages, we will
+	 * have to take out a write lock (to figure out if another process still
+	 * has this page mapped), and to do it with flock() we'll have to use
+	 * original fd, as lock is associated with that particular fd. with
+	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
+	 * on that.
+	 */
+	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+	/* this should not fail */
+	if (ret != 1) {
+		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+			__func__,
+			strerror(errno));
+		goto resized;
+	}
+
+	/*
+	 * map the segment, and populate page tables, the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In linux, hugetlb limitations, like cgroup, are
+	 * enforced at fault time instead of mmap(), even
+	 * with the option of MAP_POPULATE. Kernel will send
+	 * a SIGBUS signal. To avoid being killed, save the stack
+	 * environment here; if SIGBUS happens, we can jump
+	 * back here.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(alloc_sz >> 20));
+		goto mapped;
+	}
+	*(int *)addr = *(int *)addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = alloc_sz;
+	ms->len = alloc_sz;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, alloc_sz);
+resized:
+	close(fd);
+	unlink(path);
+	return -1;
+}
+
+struct alloc_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg **ms;
+	size_t page_sz;
+	unsigned int segs_allocated;
+	unsigned int n_segs;
+	int socket;
+	bool exact;
+};
+static int
+alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct alloc_walk_param *wa = arg;
+	struct rte_memseg_list *cur_msl;
+	size_t page_sz;
+	int cur_idx;
+	unsigned int msl_idx, need, i;
+
+	if (msl->page_sz != wa->page_sz)
+		return 0;
+	if (msl->socket_id != wa->socket)
+		return 0;
+
+	page_sz = (size_t)msl->page_sz;
+
+	msl_idx = msl - mcfg->memsegs;
+	cur_msl = &mcfg->memsegs[msl_idx];
+
+	need = wa->n_segs;
+
+	/* try finding space in memseg list */
+	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
+	if (cur_idx < 0)
+		return 0;
+
+	for (i = 0; i < need; i++, cur_idx++) {
+		struct rte_memseg *cur;
+		void *map_addr;
+
+		cur = rte_fbarray_get(&cur_msl->memseg_arr, cur_idx);
+		map_addr = RTE_PTR_ADD(cur_msl->base_va,
+				cur_idx * page_sz);
+
+		if (alloc_seg(cur, map_addr, wa->socket, wa->hi,
+				msl_idx, cur_idx)) {
+			RTE_LOG(DEBUG, EAL, "attempted to allocate %i segments, but only %i were allocated\n",
+				need, i);
+
+			/* if exact number wasn't requested, stop */
+			if (!wa->exact)
+				goto out;
+			return -1;
+		}
+		if (wa->ms)
+			wa->ms[i] = cur;
+
+		rte_fbarray_set_used(&cur_msl->memseg_arr, cur_idx);
+	}
+out:
+	wa->segs_allocated = i;
+	return 1;
+
+}
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact)
+{
+	int i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa = false;
+	int oldpolicy;
+	struct bitmask *oldmask;
+#endif
+	struct alloc_walk_param wa;
+	struct hugepage_info *hi = NULL;
+
+	memset(&wa, 0, sizeof(wa));
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (page_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		return -1;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (check_numa()) {
+		oldmask = numa_allocate_nodemask();
+		prepare_numa(&oldpolicy, oldmask, socket);
+		have_numa = true;
+	}
+#endif
+
+	wa.exact = exact;
+	wa.hi = hi;
+	wa.ms = ms;
+	wa.n_segs = n_segs;
+	wa.page_sz = page_sz;
+	wa.socket = socket;
+	wa.segs_allocated = 0;
+
+	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	if (ret == 0) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+		ret = -1;
+	} else if (ret > 0) {
+		ret = (int)wa.segs_allocated;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		resotre_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_seg_bulk(&ms, 1, page_sz, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 52/70] eal: add support for unmapping pages at runtime
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (53 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 53/70] eal: add "single file segments" command-line option Anatoly Burakov
                       ` (17 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This isn't used anywhere yet, but the support is now there. Also,
this adds cleanup to the allocation procedures, so that if we fail to
allocate everything we asked for, whatever we did allocate is freed back.
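
A hedged usage sketch of the resulting alloc/free pair (socket 0 and the
2M page size are placeholders; this is an EAL-internal API, shown here
only to illustrate the cleanup semantics):

#include <stdbool.h>

#include <rte_memory.h>

#include "eal_memalloc.h"

static int
alloc_and_release_example(void)
{
	struct rte_memseg *segs[4];
	int n;

	/* with exact == true, a failed request leaves nothing behind,
	 * because partial allocations are now rolled back internally */
	n = eal_memalloc_alloc_seg_bulk(segs, 4, RTE_PGSIZE_2M, 0, true);
	if (n < 0)
		return -1;

	/* ... use the segments ... */

	return eal_memalloc_free_seg_bulk(segs, n);
}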

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  15 +++
 lib/librte_eal/common/eal_memalloc.h       |  14 +++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 149 ++++++++++++++++++++++++++++-
 3 files changed, 177 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index 8c30670..e7bcd2b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,18 @@ eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int n_segs __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index f628514..6017345 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,18 @@ int
 eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 		int socket, bool exact);
 
+/*
+ * Deallocate segment
+ */
+int
+eal_memalloc_free_seg(struct rte_memseg *ms);
+
+/*
+ * Deallocate `n_segs` segments. Returns 0 on successful deallocation of all
+ * segments, -1 on error. Any segments that could be deallocated will be
+ * deallocated even in case of error.
+ */
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 45ea0ad..118b12d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -289,6 +289,48 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	return -1;
 }
 
+static int
+free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	char path[PATH_MAX];
+	int fd, ret;
+
+	/* erase page data */
+	memset(ms->addr, 0, ms->len);
+
+	if (mmap(ms->addr, ms->len, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	/* if we're able to take out a write lock, we're the last one
+	 * holding onto this page.
+	 */
+
+	ret = lock(fd, 0, ms->len, F_WRLCK);
+	if (ret >= 0) {
+		/* no one else is using this page */
+		if (ret == 1)
+			unlink(path);
+		ret = lock(fd, 0, ms->len, F_UNLCK);
+		if (ret != 1)
+			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+				__func__, path);
+	}
+	close(fd);
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 struct alloc_walk_param {
 	struct hugepage_info *hi;
 	struct rte_memseg **ms;
@@ -305,7 +347,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	struct alloc_walk_param *wa = arg;
 	struct rte_memseg_list *cur_msl;
 	size_t page_sz;
-	int cur_idx;
+	int cur_idx, start_idx, j;
 	unsigned int msl_idx, need, i;
 
 	if (msl->page_sz != wa->page_sz)
@@ -324,6 +366,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
 	if (cur_idx < 0)
 		return 0;
+	start_idx = cur_idx;
 
 	for (i = 0; i < need; i++, cur_idx++) {
 		struct rte_memseg *cur;
@@ -341,6 +384,25 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 			/* if exact number wasn't requested, stop */
 			if (!wa->exact)
 				goto out;
+
+			/* clean up */
+			for (j = start_idx; j < cur_idx; j++) {
+				struct rte_memseg *tmp;
+				struct rte_fbarray *arr =
+						&cur_msl->memseg_arr;
+
+				tmp = rte_fbarray_get(arr, j);
+				if (free_seg(tmp, wa->hi,
+						msl_idx, j)) {
+					RTE_LOG(ERR, EAL, "Cannot free page\n");
+					continue;
+				}
+
+				rte_fbarray_set_free(arr, j);
+			}
+			/* clear the list */
+			if (wa->ms)
+				memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs);
 			return -1;
 		}
 		if (wa->ms)
@@ -351,7 +413,39 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 out:
 	wa->segs_allocated = i;
 	return 1;
+}
+
+struct free_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg *ms;
+};
+static int
+free_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *found_msl;
+	struct free_walk_param *wa = arg;
+	uintptr_t start_addr, end_addr;
+	int msl_idx, seg_idx;
 
+	start_addr = (uintptr_t) msl->base_va;
+	end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz;
+
+	if ((uintptr_t)wa->ms->addr < start_addr ||
+			(uintptr_t)wa->ms->addr >= end_addr)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+	seg_idx = RTE_PTR_DIFF(wa->ms->addr, start_addr) / msl->page_sz;
+
+	/* msl is const */
+	found_msl = &mcfg->memsegs[msl_idx];
+
+	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
+		return -1;
+
+	return 1;
 }
 
 int
@@ -427,3 +521,56 @@ eal_memalloc_alloc_seg(size_t page_sz, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
+{
+	int seg, ret = 0;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (seg = 0; seg < n_segs; seg++) {
+		struct rte_memseg *cur = ms[seg];
+		struct hugepage_info *hi = NULL;
+		struct free_walk_param wa;
+		int i, walk_res;
+
+		memset(&wa, 0, sizeof(wa));
+
+		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
+				i++) {
+			hi = &internal_config.hugepage_info[i];
+			if (cur->hugepage_sz == hi->hugepage_sz) {
+				break;
+			}
+		}
+		if (i == (int)RTE_DIM(internal_config.hugepage_info)) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			ret = -1;
+			continue;
+		}
+
+		wa.ms = cur;
+		wa.hi = hi;
+
+		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		if (walk_res == 1)
+			continue;
+		if (walk_res == 0)
+			RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		ret = -1;
+	}
+	return ret;
+}
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms)
+{
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	return eal_memalloc_free_seg_bulk(&ms, 1);
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 53/70] eal: add "single file segments" command-line option
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (54 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 52/70] eal: add support for unmapping pages " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
                       ` (16 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Currently, DPDK stores all pages as separate files in hugetlbfs.
This option will allow storing all pages in one file (one file
per memseg list).

We do this by using fallocate() calls on hugetlbfs; however, this is
only supported on fairly recent (4.3+) Linux kernels, so an ftruncate()
fallback is provided to grow (but not shrink) hugepage files.
The naming scheme is deterministic, so both primary and secondary
processes will be able to easily map the needed files and offsets.
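
A simplified sketch of that grow/punch logic, assuming _GNU_SOURCE for
fallocate(); it omits the locking and the fallocate-support caching that
the real resize_hugefile() below performs, and resize_single_file is an
illustrative name:

#define _GNU_SOURCE	/* fallocate() and FALLOC_FL_* are Linux-only */
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

static int
resize_single_file(int fd, off_t offset, off_t page_sz, bool grow)
{
	if (grow) {
		/* preferred path: back exactly this page's range of the file */
		if (fallocate(fd, 0, offset, page_sz) == 0)
			return 0;

		/* pre-4.3 kernels: we can only grow the file, never shrink
		 * it, so freeing pages is disabled on this fallback path */
		struct stat st;
		if (fstat(fd, &st) == 0 && st.st_size >= offset + page_sz)
			return 0;
		return ftruncate(fd, offset + page_sz);
	}
	/* return the page to hugetlbfs without changing the file size */
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			offset, page_sz);
}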

For multi-file segments, we can close fd's right away. For
single-file segments, we can reuse the same fd and reduce the
number of fd's needed to map/use hugepages. However, we need to
store the fd's somewhere, so we add a tailq.
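
A rough sketch of the fd bookkeeping idea; the real patch keeps the fd's
in a tailq and builds paths via eal_get_hugefile_path(), so the flat
array and the "rtemap_%u" name below are simplifications for
illustration only (RTE_MAX_MEMSEG_LISTS comes from the DPDK build
configuration):

#include <fcntl.h>
#include <limits.h>
#include <stdio.h>

/* one cached fd per memseg list (assume these start out as -1) */
static int list_fds[RTE_MAX_MEMSEG_LISTS];

static int
get_list_fd(const char *hugedir, unsigned int list_idx)
{
	char path[PATH_MAX];

	if (list_fds[list_idx] >= 0)
		return list_fds[list_idx];	/* reuse the fd we already have */

	/* deterministic name, so any process can recompute it from list_idx */
	snprintf(path, sizeof(path), "%s/rtemap_%u", hugedir, list_idx);
	list_fds[list_idx] = open(path, O_CREAT | O_RDWR, 0600);
	return list_fds[list_idx];
}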

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Split this change into a separate patch
    - Provide more explanation as to how it works

 lib/librte_eal/common/eal_common_options.c |   4 +
 lib/librte_eal/common/eal_internal_cfg.h   |   4 +
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/eal.c          |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 337 ++++++++++++++++++++++++-----
 5 files changed, 297 insertions(+), 51 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index fb5ea03..5b5da5f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1188,6 +1189,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_LEGACY_MEM_NUM:
 		conf->legacy_mem = 1;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5cf7102..9d33cf4 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true to enable legacy memory behavior (no dynamic allocation,
 	 * IOVA-contiguous segments).
 	 */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node); non-legacy mode only.
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index d301d0b..211ae06 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_LEGACY_MEM    "legacy-mem"
 	OPT_LEGACY_MEM_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9832551..2c12811 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 118b12d..545ac49 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,31 @@
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+/*
+ * not all kernel versions support fallocate() on hugetlbfs, so fall back to
+ * ftruncate() and disallow deallocation if fallocate() is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's since we need each fd
+ * only once. However, in single file segments mode, we can get away with using
+ * a single fd for an entire memseg list, but we need to keep it around. Each
+ * fd is different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Double linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -129,18 +154,100 @@ resotre_numa(int *oldpolicy, struct bitmask *oldmask)
 }
 #endif
 
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if the file occupies any space on disk (checking st_size
+ * is not enough, since holes punched by fallocate() don't change the file size)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
 	int fd;
-	eal_get_hugefile_path(path, buflen, hi->hugedir,
-			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
-	fd = open(path, O_CREAT | O_RDWR, 0600);
-	if (fd < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
-				strerror(errno));
-		return -1;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry, for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
 	}
 	return fd;
 }
@@ -173,6 +280,94 @@ static int lock(int fd, uint64_t offset, uint64_t len, int type)
 }
 
 static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = get_file_size(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
 alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		struct hugepage_info *hi, unsigned int list_idx,
 		unsigned int seg_idx)
@@ -191,34 +386,40 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		return -1;
 
 	alloc_sz = hi->hugepage_sz;
-
-	map_offset = 0;
-	if (ftruncate(fd, alloc_sz) < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
-			__func__, strerror(errno));
-		goto resized;
-	}
-	/* we've allocated a page - take out a read lock. we're using fcntl()
-	 * locks rather than flock() here because doing that gives us one huge
-	 * advantage - fcntl() locks are per-process, not per-file descriptor,
-	 * which means that we don't have to keep the original fd's around to
-	 * keep a lock on the file.
-	 *
-	 * this is useful, because when it comes to unmapping pages, we will
-	 * have to take out a write lock (to figure out if another process still
-	 * has this page mapped), and to do it with flock() we'll have to use
-	 * original fd, as lock is associated with that particular fd. with
-	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
-	 * on that.
-	 */
-	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
-
-	/* this should not fail */
-	if (ret != 1) {
-		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
-			__func__,
-			strerror(errno));
-		goto resized;
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * alloc_sz;
+		ret = resize_hugefile(fd, map_offset, alloc_sz, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, alloc_sz) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with flock()
+		 * we'll have to use original fd, as lock is associated with
+		 * that particular fd. with fcntl(), this is not necessary - we
+		 * can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
 	}
 
 	/*
@@ -227,7 +428,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	 */
 	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
 			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
-	close(fd);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
 
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
@@ -284,8 +487,21 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 mapped:
 	munmap(addr, alloc_sz);
 resized:
-	close(fd);
-	unlink(path);
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, alloc_sz, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
 	return -1;
 }
 
@@ -293,6 +509,7 @@ static int
 free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
+	uint64_t map_offset;
 	char path[PATH_MAX];
 	int fd, ret;
 
@@ -310,21 +527,39 @@ free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 	if (fd < 0)
 		return -1;
 
-	/* if we're able to take out a write lock, we're the last one
-	 * holding onto this page.
-	 */
-
-	ret = lock(fd, 0, ms->len, F_WRLCK);
-	if (ret >= 0) {
-		/* no one else is using this page */
-		if (ret == 1)
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->len;
+		if (resize_hugefile(fd, map_offset, ms->len, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
 			unlink(path);
-		ret = lock(fd, 0, ms->len, F_UNLCK);
-		if (ret != 1)
-			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
-				__func__, path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->len, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->len, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
 	}
-	close(fd);
 
 	memset(ms, 0, sizeof(*ms));
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 54/70] eal: add API to check if memory is contiguous
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (55 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 53/70] eal: add "single file segments" command-line option Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
                       ` (15 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For now, memory is always contiguous because legacy mem mode is
enabled unconditionally, but this function will be helpful down
the line when we implement support for allocating physically
non-contiguous memory. We can no longer guarantee physically
contiguous memory unless we're in legacy or IOVA_AS_VA mode, but
we can certainly try and see if we succeed.

In addition, this would be useful for e.g. PMDs that may allocate
chunks smaller than the page size which must not cross a page
boundary; in such cases we will be able to accommodate that request.
This function will also support non-hugepage memory.
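
To illustrate the sub-page case: a chunk smaller than the page size is
IOVA-contiguous as long as it does not straddle a page boundary. A
minimal sketch of that check (illustrative only; the real logic lives in
eal_memalloc_is_contig() below), assuming a power-of-two page size:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool
fits_in_one_page(const void *start, size_t len, size_t pgsz)
{
	/* pgsz is assumed to be a power of two, as page sizes are */
	uintptr_t first = (uintptr_t)start & ~((uintptr_t)pgsz - 1);
	uintptr_t last = ((uintptr_t)start + len - 1) & ~((uintptr_t)pgsz - 1);

	return first == last;
}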

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Moved this earlier in the patchset
    - Add support for non-hugepage memory
    - Fix handling of IOVA as VA mode
    
    v3:
    - Add support for non-hugepage memory
    - Support non-page-sized segments

 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 90 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        | 10 ++++
 lib/librte_eal/common/malloc_elem.c         | 40 +------------
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 6 files changed, 106 insertions(+), 37 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..607ec3f
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	void *end, *aligned_start, *aligned_end;
+	size_t pgsz = (size_t)msl->page_sz;
+	const struct rte_memseg *ms;
+
+	/* for IOVA_VA, it's always contiguous */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	end = RTE_PTR_ADD(start, len);
+
+	/* for nohuge, we check pagemap, otherwise check memseg */
+	if (!rte_eal_has_hugepages()) {
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		cur = rte_mem_virt2iova(aligned_start);
+		expected = cur + pgsz;
+		aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+
+		while (aligned_start < aligned_end) {
+			cur = rte_mem_virt2iova(aligned_start);
+			if (cur != expected)
+				return false;
+			aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+			expected += pgsz;
+		}
+	} else {
+		int start_seg, end_seg, cur_seg;
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		start_seg = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+				pgsz;
+		end_seg = RTE_PTR_DIFF(aligned_end, msl->base_va) /
+				pgsz;
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		ms = rte_fbarray_get(&msl->memseg_arr, start_seg);
+		cur = ms->iova;
+		expected = cur + pgsz;
+
+		/* if we can't access IOVA addresses, assume non-contiguous */
+		if (cur == RTE_BAD_IOVA)
+			return false;
+
+		for (cur_seg = start_seg + 1; cur_seg < end_seg;
+				cur_seg++, expected += pgsz) {
+			ms = rte_fbarray_get(&msl->memseg_arr, cur_seg);
+
+			if (ms->iova != expected)
+				return false;
+		}
+	}
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 6017345..2413c6c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /*
  * Allocate segment of specified page size.
@@ -42,4 +43,13 @@ eal_memalloc_free_seg(struct rte_memseg *ms);
 int
 eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
 
+
+/*
+ * Check if memory pointed to by `start` and of `length` that resides in
+ * memseg list `msl` is IOVA-contiguous.
+ */
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 685aac4..9db416f 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -18,6 +18,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -100,45 +101,10 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl,
 		void *start, size_t size)
 {
-	rte_iova_t cur, expected;
-	void *start_page, *end_page, *cur_page;
-	size_t pagesz;
-
-	/* for hugepage memory or IOVA as VA, it's always contiguous */
-	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
-		return true;
-
-	/* otherwise, check if start and end are within the same page */
-	pagesz = getpagesize();
-
-	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
-	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
-
-	if (start_page == end_page)
-		return true;
-
-	/* if they are from different pages, check if they are contiguous */
-
-	/* if we can't access physical addresses, assume non-contiguous */
-	if (!rte_eal_using_phys_addrs())
-		return false;
-
-	/* skip first iteration */
-	cur = rte_mem_virt2iova(start_page);
-	expected = cur + pagesz;
-	cur_page = RTE_PTR_ADD(start_page, pagesz);
-
-	while (cur_page <= end_page) {
-		cur = rte_mem_virt2iova(cur_page);
-		if (cur != expected)
-			return false;
-		cur_page = RTE_PTR_ADD(cur_page, pagesz);
-		expected += pagesz;
-	}
-	return true;
+	return eal_memalloc_is_contig(msl, start, size);
 }
 
 /*
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 55/70] eal: prepare memseg lists for multiprocess sync
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (56 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                       ` (14 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

In preparation for implementing multiprocess support, we are adding
a version number to memseg lists. We will not need any locks, because
memory hotplug will have a global lock (so any time memory map and
thus version number might change, we will already be holding a lock).

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation will opt for the latter option: primary process
shared mappings will be authoritative, and each secondary process
will use its own internal view of mapped memory, and will attempt
to synchronize on these mappings using versioning.

Under this model, only the primary process will decide which pages
get mapped, and secondary processes will only copy the primary's
page maps and get notified of the changes via the IPC mechanism
(coming in later commits).
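
A hedged sketch of how the version number is intended to be used; the
replay callback stands in for the sync_status()/sync_chunk() walk added
below, and maybe_sync_list is an illustrative name, not patch code:

#include <rte_eal_memconfig.h>

static int
maybe_sync_list(struct rte_memseg_list *primary_msl,
		struct rte_memseg_list *local_msl,
		int (*replay)(struct rte_memseg_list *p,
				struct rte_memseg_list *l))
{
	if (local_msl->version == primary_msl->version)
		return 0;	/* the local map is already up to date */

	if (replay(primary_msl, local_msl) < 0)
		return -1;	/* keep the old version so we retry later */

	/* adopt the primary's version only once the maps really match */
	local_msl->version = primary_msl->version;
	return 0;
}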

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Make version volatile
    
    v3:
    - Removed per-list locks as we're using global hotplug lock

 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 250 ++++++++++++++++++++++
 4 files changed, 262 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index e7bcd2b..461732f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -39,3 +39,10 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return -1;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 2413c6c..4a7b45c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -52,4 +52,8 @@ bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 88cde8c..a781793 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,7 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	volatile uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 545ac49..ce242b1 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -64,6 +64,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -647,6 +650,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	}
 out:
 	wa->segs_allocated = i;
+	if (i > 0)
+		cur_msl->version++;
 	return 1;
 }
 
@@ -676,7 +681,10 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	/* msl is const */
 	found_msl = &mcfg->memsegs[msl_idx];
 
+	found_msl->version++;
+
 	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+
 	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
 		return -1;
 
@@ -809,3 +817,245 @@ eal_memalloc_free_seg(struct rte_memseg *ms)
 
 	return eal_memalloc_free_seg_bulk(&ms, 1);
 }
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_seg(l_ms, p_ms->addr,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_seg(l_ms, hi, msl_idx, seg_idx);
+			rte_fbarray_set_free(l_arr, seg_idx);
+			if (ret < 0)
+				return -1;
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case - no differences (or local chunk is bigger, which
+		 * will be fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+static int
+sync_walk(const struct rte_memseg_list *msl, void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int i;
+	int msl_idx;
+	bool new_msl = false;
+
+	msl_idx = msl - mcfg->memsegs;
+	primary_msl = &mcfg->memsegs[msl_idx];
+	local_msl = &local_memsegs[msl_idx];
+
+	/* check if secondary has this memseg list set up */
+	if (local_msl->base_va == NULL) {
+		char name[PATH_MAX];
+		int ret;
+		new_msl = true;
+
+		/* create distinct fbarrays for each secondary */
+		snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+			primary_msl->memseg_arr.name, getpid());
+
+		ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+			primary_msl->memseg_arr.len,
+			primary_msl->memseg_arr.elt_sz);
+		if (ret < 0) {
+			RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+			return -1;
+		}
+
+		local_msl->base_va = primary_msl->base_va;
+	}
+
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		uint64_t cur_sz =
+			internal_config.hugepage_info[i].hugepage_sz;
+		uint64_t msl_sz = primary_msl->page_sz;
+		if (msl_sz == cur_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	/* if versions don't match or if we have just allocated a new
+	 * memseg list, synchronize everything
+	 */
+	if ((new_msl || local_msl->version != primary_msl->version) &&
+			sync_existing(primary_msl, local_msl, hi, msl_idx))
+		return -1;
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	if (rte_memseg_list_walk(sync_walk, NULL))
+		return -1;
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 56/70] eal: read hugepage counts from node-specific sysfs path
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (57 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 57/70] eal: make use of memory hotplug for init Anatoly Burakov
                       ` (13 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node
for hugepage counts. Note that the per-NUMA node paths do not
provide information about reserved pages, so the counts may not be
perfectly accurate. However, this spares us the whole
mapping/remapping business that was previously needed before we
could tell which page is on which socket, because we no longer
require our memory to be physically contiguous.

Legacy memory init will not use this.
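
For illustration only (not part of the patch), the per-node sysfs
layout the new code reads looks like the sketch below; it uses plain
stdio instead of the EAL's internal sysfs helpers, and only the
kernel paths themselves are taken from the patch:

#include <limits.h>
#include <stdio.h>

/* standalone sketch, not EAL code: read the free hugepage count for one
 * page size on one NUMA node, using the same sysfs layout as the patch
 */
static unsigned long
free_hugepages_on_node(unsigned int node, const char *size_subdir)
{
	char path[PATH_MAX];
	unsigned long nr = 0;
	FILE *f;

	snprintf(path, sizeof(path),
		"/sys/devices/system/node/node%u/hugepages/%s/free_hugepages",
		node, size_subdir);
	f = fopen(path, "r");
	if (f == NULL)
		return 0; /* node or page size not present */
	if (fscanf(f, "%lu", &nr) != 1)
		nr = 0;
	fclose(f);
	return nr;
}

int
main(void)
{
	printf("node0 2M free: %lu\n",
		free_hugepages_on_node(0, "hugepages-2048kB"));
	return 0;
}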

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 80 +++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index afebd42..2e0819f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -31,6 +31,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -71,6 +72,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -269,7 +309,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -323,9 +363,28 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if the first attempt fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_socket_count(); i++) {
+				int socket = rte_socket_id_by_idx(i);
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, socket);
+				hpi->num_pages[socket] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * we failed to sort memory from the get go, so fall
+		 * back to old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -349,10 +408,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 57/70] eal: make use of memory hotplug for init
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (58 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
                       ` (12 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Add a new (non-legacy) memory init path for EAL, using the new
memory hotplug facilities.

If no -m or --socket-mem switches were specified, the new init will
not allocate anything up front, whereas if those switches were
passed, the appropriate number of pages is requested, just like for
legacy init.

Allocated pages are not guaranteed to be physically contiguous (they
may still be so by accident) unless RTE_IOVA_VA mode is used.
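
For illustration only (not part of the patch), a minimal sketch of
how an application could opt in to up-front allocation by passing
--socket-mem through the EAL arguments; the argv below is
hypothetical, normally the arguments come from the command line:

#include <rte_eal.h>

int
main(void)
{
	/* hypothetical EAL arguments for illustration: request 512 MB on
	 * socket 0 up front; without --socket-mem/-m the non-legacy init
	 * allocates nothing and pages are mapped later, on demand
	 */
	char arg0[] = "app";
	char arg1[] = "--socket-mem=512";
	char *argv[] = { arg0, arg1, NULL };

	if (rte_eal_init(2, argv) < 0)
		return -1;

	/* ... application code; memory is now allocated on demand ... */

	return 0;
}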

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 62 ++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index d38fb68..c51d598 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -1600,6 +1601,61 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int)internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+					num_pages, hpi->hugepage_sz,
+					socket_id, true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1722,9 +1778,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 58/70] eal: share hugepage info primary and secondary
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (59 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 57/70] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
                       ` (11 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Since we are going to need to map hugepages in both primary and
secondary processes, we need to know where we should look for
hugetlbfs mountpoints. So, share those with secondary processes,
and map them on init.
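
The sharing mechanism itself is just a small file-backed shared
mapping. The standalone sketch below shows the create/open pattern
in isolation (simplified, with a hypothetical file path and none of
the locking or hugepage specifics of the real code):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* standalone sketch, not EAL code: one process creates a small file-backed
 * shared mapping and fills it, another opens the same file and reads it;
 * the file path is hypothetical and all locking is omitted
 */
static void *
map_shared(const char *path, size_t len, int flags)
{
	void *va;
	int fd = open(path, flags, 0666);

	if (fd < 0)
		return NULL;
	if (ftruncate(fd, len) < 0) {
		close(fd);
		return NULL;
	}
	va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	return va == MAP_FAILED ? NULL : va;
}

int
main(int argc, char **argv)
{
	const char *path = "/tmp/hugepage_info_demo";
	char buf[64];
	void *va;

	(void)argv;
	if (argc > 1) {	/* "primary": create and fill */
		va = map_shared(path, sizeof(buf), O_RDWR | O_CREAT);
		if (va == NULL)
			return 1;
		strcpy(va, "hugepage info goes here");
	} else {	/* "secondary": open and read */
		va = map_shared(path, sizeof(buf), O_RDWR);
		if (va == NULL)
			return 1;
		memcpy(buf, va, sizeof(buf));
		printf("%s\n", buf);
	}
	munmap(va, sizeof(buf));
	return 0;
}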

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c                 |  19 ++--
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  56 +++++++++--
 lib/librte_eal/bsdapp/eal/eal_memory.c          |  21 +---
 lib/librte_eal/common/eal_common_options.c      |   5 +-
 lib/librte_eal/common/eal_filesystem.h          |  17 ++++
 lib/librte_eal/common/eal_hugepages.h           |  10 +-
 lib/librte_eal/common/eal_internal_cfg.h        |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c               |  18 ++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 121 ++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c        |  15 +--
 10 files changed, 217 insertions(+), 67 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 54330e1..727adc5 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -289,7 +289,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -561,12 +561,17 @@ rte_eal_init(int argc, char **argv)
 	/* autodetect the iova mapping mode (default is iova_pa) */
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+			eal_hugepage_info_init() :
+			eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index ba44da0..38d143c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -19,10 +19,10 @@
  * Used in this file to store the hugepage file map on disk
  */
 static void *
-create_shared_memory(const char *filename, const size_t mem_size)
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
 {
 	void *retval;
-	int fd = open(filename, O_CREAT | O_RDWR, 0666);
+	int fd = open(filename, flags, 0666);
 	if (fd < 0)
 		return NULL;
 	if (ftruncate(fd, mem_size) < 0) {
@@ -34,6 +34,18 @@ create_shared_memory(const char *filename, const size_t mem_size)
 	return retval;
 }
 
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /*
  * No hugepage support on freebsd, but we dummy it, using contigmem driver
  */
@@ -46,13 +58,10 @@ eal_hugepage_info_init(void)
 	/* re-use the linux "internal config" structure for our memory data */
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
+	unsigned int i;
 
 	internal_config.num_hugepage_sizes = 1;
 
-	/* nothing more to be done for secondary */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
@@ -87,7 +96,7 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	hpi->hugedir = CONTIGMEM_DEV;
+	snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", CONTIGMEM_DEV);
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
@@ -101,6 +110,14 @@ eal_hugepage_info_init(void)
 
 	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
 	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
@@ -108,3 +125,28 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* copy stuff from shared info into internal config */
+int
+eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	internal_config.num_hugepage_sizes = 1;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index cf1aba6..dab64d2 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -240,23 +240,10 @@ int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
-	int fd_hugepage_info, fd_hugepage = -1;
+	int fd_hugepage = -1;
 	unsigned int i;
 
-	/* Obtain a file descriptor for hugepage_info */
-	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
-	if (fd_hugepage_info < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
-		return -1;
-	}
-
-	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
-			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
-	if (hpi == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
-		goto error;
-	}
+	hpi = &internal_config.hugepage_info[0];
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		const struct hugepage_info *cur_hpi = &hpi[i];
@@ -286,13 +273,9 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
-	close(fd_hugepage_info);
 	return 0;
 
 error:
-	if (fd_hugepage_info >= 0)
-		close(fd_hugepage_info);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 5b5da5f..04a4476 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -179,8 +179,11 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		internal_cfg->socket_mem[i] = 0;
 	/* zero out hugedir descriptors */
-	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++)
+	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++) {
+		memset(&internal_cfg->hugepage_info[i], 0,
+				sizeof(internal_cfg->hugepage_info[0]));
 		internal_cfg->hugepage_info[i].lock_descriptor = -1;
+	}
 	internal_cfg->base_virtaddr = 0;
 
 	internal_cfg->syslog_facility = LOG_DAEMON;
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 1c6048b..ad059ef 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -85,6 +85,23 @@ eal_hugepage_info_path(void)
 	return buffer;
 }
 
+/** Path of hugepage info file. */
+#define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file"
+
+static inline const char *
+eal_hugepage_file_path(void)
+{
+	static char buffer[PATH_MAX]; /* static so auto-zeroed */
+	const char *directory = default_config_dir;
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_FILE_FMT, directory,
+			internal_config.hugefile_prefix);
+	return buffer;
+}
+
 /** String format for hugepage map files. */
 #define HUGEFILE_FMT "%s/%smap_%d"
 #define TEMP_HUGEFILE_FMT "%s/%smap_temp_%d"
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index ad1b0b6..4582f19 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -26,9 +26,15 @@ struct hugepage_file {
 };
 
 /**
- * Read the information from linux on what hugepages are available
- * for the EAL to use
+ * Read the information on what hugepages are available for the EAL to use,
+ * clearing out any unused ones.
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read whatever information primary process has shared about hugepages into
+ * secondary process.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 9d33cf4..c4cbf3a 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -21,7 +21,7 @@
  */
 struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
-	const char *hugedir;    /**< dir where hugetlbfs is mounted */
+	char hugedir[PATH_MAX];    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
 	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2c12811..e7c6dcf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -807,13 +807,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 2e0819f..fb4b667 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -14,6 +14,7 @@
 #include <stdarg.h>
 #include <unistd.h>
 #include <errno.h>
+#include <sys/mman.h>
 #include <sys/queue.h>
 #include <sys/stat.h>
 
@@ -33,6 +34,39 @@
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
 static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
+/*
+ * Uses mmap to create a shared memory area for storage of data
+ * Used in this file to store the hugepage file map on disk
+ */
+static void *
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
+{
+	void *retval;
+	int fd = open(filename, flags, 0666);
+	if (fd < 0)
+		return NULL;
+	if (ftruncate(fd, mem_size) < 0) {
+		close(fd);
+		return NULL;
+	}
+	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
+			MAP_SHARED, fd, 0);
+	close(fd);
+	return retval;
+}
+
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
 static uint32_t
@@ -299,15 +333,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(void)
+{	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -323,6 +351,7 @@ eal_hugepage_info_init(void)
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
+		const char *hugedir;
 
 		if (strncmp(dirent->d_name, dirent_start_text,
 			    dirent_start_len) != 0)
@@ -334,10 +363,10 @@ eal_hugepage_info_init(void)
 		hpi = &internal_config.hugepage_info[num_sizes];
 		hpi->hugepage_sz =
 			rte_str_to_size(&dirent->d_name[dirent_start_len]);
-		hpi->hugedir = get_hugepage_dir(hpi->hugepage_sz);
+		hugedir = get_hugepage_dir(hpi->hugepage_sz);
 
 		/* first, check if we have a mountpoint */
-		if (hpi->hugedir == NULL) {
+		if (hugedir == NULL) {
 			uint32_t num_pages;
 
 			num_pages = get_num_hugepages(dirent->d_name);
@@ -349,6 +378,7 @@ eal_hugepage_info_init(void)
 					num_pages, hpi->hugepage_sz);
 			continue;
 		}
+		snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", hugedir);
 
 		/* try to obtain a writelock */
 		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
@@ -411,13 +441,11 @@ eal_hugepage_info_init(void)
 	for (i = 0; i < num_sizes; i++) {
 		/* pages may no longer all be on socket 0, so check all */
 		unsigned int j, num_pages = 0;
+		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
 
-		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-			struct hugepage_info *hpi =
-					&internal_config.hugepage_info[i];
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++)
 			num_pages += hpi->num_pages[j];
-		}
-		if (internal_config.hugepage_info[i].hugedir != NULL &&
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0 &&
 				num_pages > 0)
 			return 0;
 	}
@@ -425,3 +453,64 @@ eal_hugepage_info_init(void)
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	struct hugepage_info *hpi, *tmp_hpi;
+	unsigned int i;
+
+	if (hugepage_info_init() < 0)
+		return -1;
+
+	hpi = &internal_config.hugepage_info[0];
+
+	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
+			sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
+		return -1;
+	}
+
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
+
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
+
+int eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index c51d598..efa1202 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1060,7 +1060,7 @@ get_socket_mem_size(int socket)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++){
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL)
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0)
 			size += hpi->hugepage_sz * hpi->num_pages[socket];
 	}
 
@@ -1160,7 +1160,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
-			hp_used[i].hugedir = hp_info[i].hugedir;
+			snprintf(hp_used[i].hugedir, sizeof(hp_used[i].hugedir),
+					"%s", hp_info[i].hugedir);
 			hp_used[i].num_pages[socket] = RTE_MIN(
 					memory[socket] / hp_info[i].hugepage_sz,
 					hp_info[i].num_pages[socket]);
@@ -1235,7 +1236,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -1508,7 +1509,7 @@ eal_legacy_hugepage_init(void)
 	}
 
 	/* create shared memory */
-	hugepage = create_shared_memory(eal_hugepage_info_path(),
+	hugepage = create_shared_memory(eal_hugepage_file_path(),
 			nr_hugefiles * sizeof(struct hugepage_file));
 
 	if (hugepage == NULL) {
@@ -1693,16 +1694,16 @@ eal_legacy_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
+	fd_hugepage = open(eal_hugepage_file_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 59/70] eal: add secondary process init with memory hotplug
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (60 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
                       ` (10 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Secondary initialization will just sync the memory map with the
primary process.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Improved handling of EAL hugepage info

 lib/librte_eal/common/eal_common_memory.c |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index e7be91d..df72988 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index efa1202..7ec7129 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1776,6 +1776,18 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0)
+			RTE_LOG(ERR, EAL, "It is recommended to disable ASLR in the kernel and retry running both primary and secondary processes\n");
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1787,9 +1799,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 60/70] eal: enable memory hotplug support in rte_malloc
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (61 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
                       ` (9 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This set of changes enables rte_malloc to allocate and free memory
as needed. Currently, it is disabled because legacy mem mode is
enabled unconditionally.

The way it works is as follows: first, malloc checks if there is
enough memory already allocated to satisfy the user's request. If
there isn't, we try to allocate more memory. The reverse happens
with free - we free an element, check its size (including free
element merging due to adjacency) and see if it's bigger than the
hugepage size and whether its start and end span a hugepage or more.
If so, we remove the area from the malloc heap (adjusting element
lengths where appropriate) and deallocate the pages.

For legacy mode, runtime alloc/free of pages is disabled.

It is worth noting that memseg lists are sorted by page size, and
that we try our best to satisfy the user's request. That is, if the
user requests an element from 2MB page memory, we check whether we
can satisfy that request from existing memory; if not, we try to
allocate more 2MB pages. If that fails and the user also specified
the "size is hint" flag, we then check other page sizes and try to
allocate from there. If that fails too, then, depending on flags, we
may try allocating from other sockets. In other words, we try our
best to give the user what they asked for, but going to other
sockets is a last resort - first we try to allocate more memory on
the same socket.
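
As a usage sketch (not part of the patch, and assuming a machine
with hugepages available on socket 0), this is what the new
behaviour looks like from the application's point of view:

#include <rte_eal.h>
#include <rte_malloc.h>

int
main(int argc, char **argv)
{
	void *p;

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	/* 16 MB on socket 0: in non-legacy mode this may grow the socket 0
	 * heap by mapping new hugepages behind the scenes
	 */
	p = rte_malloc_socket("demo", 16 << 20, 0, 0);
	if (p == NULL)
		return -1;

	/* ... use the memory ... */

	/* freeing an element that spans whole hugepages may unmap those
	 * pages and return them to the system
	 */
	rte_free(p);

	return 0;
}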

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v3:
    - Compile fixes

 lib/librte_eal/common/eal_common_memzone.c |  26 +--
 lib/librte_eal/common/malloc_elem.c        |  92 ++++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 347 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 439 insertions(+), 64 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 12ddd42..bce3321 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -93,7 +93,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_mem_config *mcfg;
 	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i, mz_idx;
+	int mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
@@ -183,29 +183,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound, contig);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align,
-					bound, contig);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
-
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound, contig);
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 9db416f..ee79dcd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -447,6 +447,98 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	struct malloc_elem *hide_start, *hide_end, *prev, *next;
+	size_t len_before, len_after;
+
+	hide_start = start;
+	hide_end = RTE_PTR_ADD(start, len);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	/* we cannot do anything with non-adjacent elements */
+	if (next && next_elem_is_adjacent(elem)) {
+		len_after = RTE_PTR_DIFF(next, hide_end);
+		if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split after */
+			split_elem(elem, hide_end);
+
+			malloc_elem_free_list_insert(hide_end);
+		} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+			/* shrink current element */
+			elem->size -= len_after;
+			memset(hide_end, 0, sizeof(*hide_end));
+
+			/* copy next element's data to our pad */
+			memcpy(hide_end, next, sizeof(*hide_end));
+
+			/* pad next element */
+			next->state = ELEM_PAD;
+			next->pad = len_after;
+			next->size -= len_after;
+
+			/* next element busy, would've been merged otherwise */
+			hide_end->pad = len_after;
+			hide_end->size += len_after;
+
+			/* adjust pointers to point to our new pad */
+			if (next->next)
+				next->next->prev = hide_end;
+			elem->next = hide_end;
+		} else if (len_after > 0) {
+			RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
+			return;
+		}
+	}
+
+	/* we cannot do anything with non-adjacent elements */
+	if (prev && prev_elem_is_adjacent(elem)) {
+		len_before = RTE_PTR_DIFF(hide_start, elem);
+		if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split before */
+			split_elem(elem, hide_start);
+
+			prev = elem;
+			elem = hide_start;
+
+			malloc_elem_free_list_insert(prev);
+		} else if (len_before > 0) {
+			/*
+			 * unlike with elements after current, here we don't
+			 * need to pad elements, but rather just increase the
+			 * size of previous element, copy the old header and set
+			 * up trailer.
+			 */
+			void *trailer = RTE_PTR_ADD(prev,
+					prev->size - MALLOC_ELEM_TRAILER_LEN);
+
+			memcpy(hide_start, elem, sizeof(*elem));
+			hide_start->size = len;
+
+			prev->size += len_before;
+			set_trailer(prev);
+
+			/* update pointers */
+			prev->next = hide_start;
+			if (next)
+				next->prev = hide_start;
+
+			/* erase old trailer */
+			memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+			/* erase old header */
+			memset(elem, 0, sizeof(*elem));
+
+			elem = hide_start;
+		}
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 620dd44..8f4aef8 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -154,6 +154,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index d798675..5f8c643 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -149,48 +151,371 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound, bool contig)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound, contig);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	size_t map_len;
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	int n_segs, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_segs = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
+			socket, true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages < 0)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* check if we wanted contiguous memory but didn't get it */
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+free_pages:
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->page_sz;
+	uint64_t pg_sz_b = mslb->page_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->page_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound, contig))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound, contig))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			RTE_LOG(ERR, EAL, "Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	int socket, i, cur_socket;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_socket_count(); i++) {
+		cur_socket = rte_socket_id_by_idx(i);
+		if (cur_socket == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, cur_socket, flags,
+				align, bound, contig);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len, page_sz;
+	struct rte_memseg_list *msl;
+	int n_segs, seg_idx, max_seg_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
+	page_sz = (size_t)msl->page_sz;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	/* mark element as free */
+	elem->state = ELEM_FREE;
 
-	rte_spinlock_unlock(&(heap->lock));
+	elem = malloc_elem_free(elem);
+
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < page_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, page_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < page_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	heap->total_size -= aligned_len;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index c57b59a..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -26,8 +26,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned int flags, size_t align, size_t bound, bool contig);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c6d3e57..b51a6d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -40,10 +40,6 @@ void *
 rte_malloc_socket(const char *type, size_t size, unsigned int align,
 		int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -51,33 +47,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align,
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0, false);
 }
 
 /*
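
For clarity, here is what the reworked allocation path looks like from the
caller's side. This is a usage sketch only (the "example" tag, buffer size
and helper name are illustrative); it relies on the behaviour shown above,
where a request for a specific socket does not fall back to other sockets,
while SOCKET_ID_ANY lets malloc_heap_alloc() try (and expand) other heaps.

#include <rte_lcore.h>
#include <rte_malloc.h>

/* illustrative helper: prefer the local NUMA node, then fall back to any */
static void *
alloc_local_or_any(size_t len)
{
	void *buf;

	/* specific socket: no cross-socket fallback happens */
	buf = rte_malloc_socket("example", len, 0, (int)rte_socket_id());
	if (buf == NULL)
		/* SOCKET_ID_ANY: the allocator may pick (and grow) any heap */
		buf = rte_malloc_socket("example", len, 0, SOCKET_ID_ANY);

	return buf;
}
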
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 61/70] eal: add support for multiprocess memory hotplug
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (62 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 62/70] eal: add support for callbacks on " Anatoly Burakov
                       ` (8 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

The basic workflow is the following. The primary process always does
the initial mapping and unmapping, and secondary processes always
follow the primary's page map. Only one allocation request can be
active at any one time.

When the primary allocates memory, it ensures that all other
processes have allocated the same set of hugepages successfully;
otherwise, any allocations made are rolled back and the affected
heap space is freed back. The heap is locked throughout the process,
and there is also a global memory hotplug lock, so no race
conditions can happen.

When the primary frees memory, it frees it in the heap, deallocates
the affected pages, and notifies other processes of the deallocation.
Since the heap no longer references that memory chunk, the area
becomes invisible to other processes even if they happen to fail to
unmap that specific set of pages, so it is completely safe to ignore
the results of sync requests.

When a secondary allocates memory, it does not do so by itself.
Instead, it sends a request to the primary process to try and
allocate pages of the specified size on the specified socket, such
that the pending heap allocation request could complete. The primary
process then sends all secondaries (including the requestor) a
separate notification of the allocated pages, and expects all
secondary processes to report success before considering the pages
"allocated".

Only after the primary process ensures that all memory has been
successfully allocated in all secondary processes will it respond
positively to the initial request and let the secondary proceed with
the allocation. Since the heap now has memory that can satisfy the
allocation request, and it was locked all this time (so no other
allocations could take place), the secondary process will be able to
allocate memory from the heap.

When a secondary frees memory, it hides the pages to be deallocated
from the heap. It then sends a deallocation request to the primary
process, which deallocates the pages itself and then sends a separate
sync request to all other processes (including the requestor) to
unmap the same pages. This way, even if the secondary fails to notify
other processes of this deallocation, that memory becomes invisible
to other processes and will not be allocated from again.
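
Roughly, the secondary-side free path added to malloc_heap_free() below
reduces to the following sketch (condensed from this patch; the wrapper
function is hypothetical, while the request fields and request_to_primary()
come from the new malloc_mp.h):

#include <string.h>

#include "malloc_mp.h"

/* hypothetical wrapper, condensed from the secondary branch of
 * malloc_heap_free(): the region is already hidden from the shared heap,
 * so the result of the request is deliberately ignored.
 */
static void
notify_primary_of_free(void *aligned_start, size_t aligned_len)
{
	struct malloc_mp_req req;

	memset(&req, 0, sizeof(req));
	req.t = REQ_TYPE_FREE;
	req.free_req.addr = aligned_start;
	req.free_req.len = aligned_len;

	request_to_primary(&req);
}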

So, to summarize: address space will only become part of the heap if
the primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the worst
that can happen is that a page will "leak" and will be available to
neither DPDK nor the system, as some process will still hold onto it.
It's not an actual leak, as we can account for the page - it's just
that none of the processes will be able to use this page for anything
useful until the primary allocates from it again.

Because the underlying DPDK IPC implementation is single-threaded,
some asynchronous magic had to be done, as we need to complete
several requests before we can definitively allow the secondary
process to use the allocated memory (namely, it has to be present in
all other secondary processes before it can be used). Additionally,
only one allocation request may be submitted at a time.
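
In practice this means the primary never blocks inside an IPC handler: it
chains asynchronous requests and retries while a previous (possibly
timed-out) request is still registered. A minimal sketch of that pattern,
using the IPC API from the dependency patches (the callback body and the
function names here are placeholders; the 5-second timeout mirrors
MP_TIMEOUT_S in this patch):

#include <time.h>

#include <rte_common.h>
#include <rte_eal.h>
#include <rte_errno.h>

/* placeholder reply handler: the real code inspects every reply here */
static int
on_sync_replies(const struct rte_mp_msg *request,
		const struct rte_mp_reply *reply)
{
	RTE_SET_USED(request);
	RTE_SET_USED(reply);
	return 0;
}

static int
send_sync_async(struct rte_mp_msg *msg)
{
	struct timespec ts = { .tv_sec = 5, .tv_nsec = 0 };
	int ret;

	/* a stray timed-out request may still be pending under this name,
	 * so retry while the IPC layer reports EEXIST
	 */
	do {
		ret = rte_mp_request_async(msg, &ts, on_sync_replies);
	} while (ret != 0 && rte_errno == EEXIST);

	return ret;
}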

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that, a shared
rwlock is used: it is taken as a read lock on init (so that several
secondaries can initialize concurrently) and as a write lock when
making allocation requests (so that either secondary init has to
wait, or the allocation request has to wait until all processes have
initialized).

Any other function that wishes to iterate over memory or prevent
allocations should use the memory hotplug lock.
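
A condensed sketch of that locking scheme, using the memory_hotplug_lock
added to rte_mem_config in this patch (the two functions are placeholders
standing in for the init path and the allocation/free request path; the
real code holds the locks across more work than shown here):

#include <rte_eal.h>
#include <rte_eal_memconfig.h>
#include <rte_rwlock.h>

/* init takes the hotplug lock for reading, so several secondaries
 * may initialize concurrently
 */
static void
init_path_sketch(void)
{
	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;

	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
	/* ... attach to memseg lists and shared heaps ... */
	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
}

/* allocation/free requests take it for writing, so they are serialized
 * against init and against each other
 */
static void
hotplug_request_sketch(void)
{
	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;

	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
	/* ... expand or shrink the heap and sync with other processes ... */
	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
}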

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/eal_common_memory.c         |  67 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 744 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        |  32 +-
 9 files changed, 1126 insertions(+), 64 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index df72988..2db3d8b 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -673,6 +673,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -697,15 +700,20 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 			len = n_segs * msl->page_sz;
 
 			ret = func(msl, ms, len, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr,
 					ms_idx + n_segs);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -714,6 +722,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -728,14 +739,19 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 		while (ms_idx >= 0) {
 			ms = rte_fbarray_get(arr, ms_idx);
 			ret = func(msl, ms, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -744,6 +760,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
@@ -751,12 +770,18 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 			continue;
 
 		ret = func(msl, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		if (ret < 0) {
+			ret = -1;
+			goto out;
+		}
+		if (ret > 0) {
+			ret = 1;
+			goto out;
+		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 /* init memory subsystem */
@@ -770,6 +795,9 @@ rte_eal_memory_init(void)
 	if (!mcfg)
 		return -1;
 
+	/* lock mem hotplug here, to prevent races while we init */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 #ifndef RTE_ARCH_64
 			memseg_primary_init_32() :
@@ -779,16 +807,19 @@ rte_eal_memory_init(void)
 			memseg_secondary_init();
 
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0)
-		return -1;
+		goto fail;
 
 	return 0;
+fail:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return -1;
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index a781793..aff0688 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -59,6 +59,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< indicates whether memory hotplug request is in progress. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5f8c643..be39250 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -171,68 +171,118 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_segs)
 {
-	size_t map_len;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int n_segs, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	size_t alloc_sz;
+	int allocd_pages;
 	void *ret, *map_addr;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_segs = map_len / pg_sz;
-
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_segs);
-
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages < 0)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
+	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
-	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
-	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
 
 	/* try once more, as now we have allocated new memory */
 	ret = find_suitable_element(heap, elt_size, flags, align, bound,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t alloc_sz;
+	int n_segs;
+
+	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_segs = alloc_sz / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+	if (ms == NULL)
+		return -1;
+
+	/* zero the array only after checking that the allocation succeeded */
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_segs);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += alloc_sz;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
-		socket, map_len >> 20ULL);
+		socket, alloc_sz >> 20ULL);
 
 	free(ms);
 
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
-free_pages:
-	eal_memalloc_free_seg_bulk(ms, n_segs);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -240,6 +290,59 @@ try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	memset(&req, 0, sizeof(req));
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -257,11 +360,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -393,7 +495,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -446,14 +549,41 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_segs, seg_idx, max_seg_idx;
+	struct rte_memseg_list *msl;
+	size_t page_sz;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	page_sz = (size_t)msl->page_sz;
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
-	int n_segs, seg_idx, max_seg_idx, ret;
+	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -494,25 +624,56 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so: even if notifications about
+	 * unmapped pages don't make it to other processes, the heap is shared
+	 * across all processes and will no longer contain this memory anyway,
+	 * and nothing can allocate it back unless the primary process is able
+	 * to deliver an allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_segs = aligned_len / page_sz;
-	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
-	max_seg_idx = seg_idx + n_segs;
+	heap->total_size -= aligned_len;
 
-	for (; seg_idx < max_seg_idx; seg_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
-		eal_memalloc_free_seg(ms);
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		memset(&req, 0, sizeof(req));
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we request primary to deallocate pages, but we don't do it
+		 * in this thread. instead, we notify primary that we would like
+		 * to deallocate pages, and this process will receive another
+		 * request (in parallel) that will do it for us on another
+		 * thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
-	heap->total_size -= aligned_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -600,8 +761,16 @@ rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
-	if (mcfg == NULL)
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
 		return -1;
+	}
+
+	/* unlock mem hotplug here. it's safe for primary as no requests can
+	 * even come before primary itself is fully initialized, and secondaries
+	 * do not need to initialize the heap.
+	 */
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 
 	/* secondary process does not need to initialize anything */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..72b1f4c
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,744 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to notify of changes in memory map. this is
+ * essentially a regular sync request, but we cannot send sync requests while
+ * another one is in progress, and we might have to - therefore, we do this as
+ * a separate callback.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+/* forward declarations */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to timeout earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* this ID is, like, totally guaranteed to be absolutely unique. pinky swear. */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* secondary will respond to sync requests thusly */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply;
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	memset(&reply, 0, sizeof(reply));
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t alloc_sz;
+	int n_segs;
+	void *map_addr;
+
+	alloc_sz = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_segs = alloc_sz / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	/* zero the array only after checking that the allocation succeeded */
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_segs);
+
+	if (elem == NULL)
+		goto fail;
+
+	map_addr = ms[0]->addr;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_segs;
+	req->alloc_state.map_addr = map_addr;
+	req->alloc_state.map_len = alloc_sz;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		entry = NULL; /* don't free the existing list entry below */
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg;
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		memset(&sr_msg, 0, sizeof(sr_msg));
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts,
+					handle_sync_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg;
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		memset(&rb_msg, 0, sizeof(rb_msg));
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts,
+					handle_rollback_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, "Response to sync request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	memset(&msg, 0, sizeof(msg));
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer  __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request memory map sync; this is only called when the primary
+ * process initiates the allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg;
+	struct rte_mp_reply reply;
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&reply, 0, sizeof(reply));
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be stray timeout still waiting */
+	do {
+		ret = rte_mp_request_sync(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a bunch of asynchronous requests to
+ * primary process. this will initiate a request and wait until responses come.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts;
+	struct timeval now;
+	int ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&ts, 0, sizeof(ts));
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..6810b4f
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_segs);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif /* MALLOC_MP_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index ce242b1..8202a1b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -211,6 +211,32 @@ is_zero_length(int fd)
 	return st.st_blocks == 0;
 }
 
+/* we cannot use rte_memseg_list_walk() here because we will already be
+ * holding a write lock whenever we enter any function in this file; however,
+ * copying the same iteration code everywhere is not ideal either, so use a
+ * lockless copy of the memseg list walk here.
+ */
+static int
+memseg_list_walk_thread_unsafe(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		ret = func(msl, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
@@ -739,7 +765,7 @@ eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 	wa.socket = socket;
 	wa.segs_allocated = 0;
 
-	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	ret = memseg_list_walk_thread_unsafe(alloc_seg_walk, &wa);
 	if (ret == 0) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
@@ -798,7 +824,7 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		wa.ms = cur;
 		wa.hi = hi;
 
-		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		walk_res = memseg_list_walk_thread_unsafe(free_seg_walk, &wa);
 		if (walk_res == 1)
 			continue;
 		if (walk_res == 0)
@@ -1055,7 +1081,7 @@ eal_memalloc_sync_with_primary(void)
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		return 0;
 
-	if (rte_memseg_list_walk(sync_walk, NULL))
+	if (memseg_list_walk_thread_unsafe(sync_walk, NULL))
 		return -1;
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 62/70] eal: add support for callbacks on memory hotplug
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (63 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
                       ` (7 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Each process will have its own callbacks. Callbacks will indicate
whether it was an allocation or a deallocation that happened, and
will also provide the start VA address and length of the affected
block.

Since memory hotplug isn't supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either case.

Callbacks are called whenever something happens to the memory map of
the current process; at those times the memory hotplug subsystem is
write-locked, so calling functions that take that lock (such as the
memseg walk functions) from within a callback leads to deadlock.
Document the limitation.
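
From the application's point of view, the new API boils down to
registering a named callback that receives the event type, start address
and length. A minimal usage sketch (the callback body and the
"example-app" name are illustrative only):

#include <stdio.h>

#include <rte_memory.h>

/* illustrative callback: just log what happened and where */
static void
mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len)
{
	printf("mem event: %s at %p, len %zu\n",
		event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
		addr, len);
}

static int
setup_mem_event_callback(void)
{
	/* fails with rte_errno == ENOTSUP in legacy mem mode (and on FreeBSD) */
	return rte_mem_event_callback_register("example-app", mem_event_cb);
}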

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Document limitation about potential deadlocks.
    
    Should we provide thread-unsafe versions of these
    functions as well?
    
    v3:
    - Made API experimental
    - Compile fixes

 lib/librte_eal/common/eal_common_memalloc.c | 133 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  11 +++
 lib/librte_eal/common/include/rte_memory.h  |  71 +++++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 245 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 607ec3f..2d2d46f 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Double linked list of actions. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list mem_event_callback_list =
+	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
+
+static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_mem_event_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &mem_event_callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -88,3 +118,106 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 	}
 	return true;
 }
+
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_event_callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_event_callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&mem_event_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_event_callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback '%s'\n",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&mem_event_rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 2db3d8b..b9e6c03 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -623,6 +623,34 @@ dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	return 0;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_register(name, clb);
+}
+
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4a7b45c..4d27403 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -56,4 +56,15 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name);
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 55383c4..398ca55 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -136,6 +136,9 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 /**
  * Get virtual memory address corresponding to iova address.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param iova
  *   The iova address.
  * @return
@@ -203,6 +206,9 @@ typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
 /**
  * Walk list of all memsegs.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -218,6 +224,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 /**
  * Walk each VA-contiguous area.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -233,6 +242,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 /**
  * Walk each allocated memseg list.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -248,6 +260,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 /**
  * Dump the physical memory layout to a file.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param f
  *   A pointer to a file for output
  */
@@ -256,6 +271,9 @@ void rte_dump_physmem_layout(FILE *f);
 /**
  * Get the total amount of available physical memory.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @return
  *    The total amount of available physical memory in bytes.
  */
@@ -290,6 +308,59 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 23b339e..d1ac9ea 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_event_callback_register;
+	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 63/70] eal: enable callbacks on malloc/free and mp sync
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (64 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 62/70] eal: add support for callbacks on " Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
                       ` (6 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Callbacks will be triggered just after allocation and just
before deallocation, to ensure that memory address space
referenced in the callback is always valid by the time the
callback is called.

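A minimal usage sketch (illustration only, not part of this patch:
my_mem_event_cb and the "my_mem_cb" name are made up, and error
handling is reduced to a printf):

    #include <stdio.h>
    #include <rte_memory.h>
    #include <rte_errno.h>

    static void
    my_mem_event_cb(enum rte_mem_event event, const void *addr, size_t len)
    {
            /* runs with the memory hotplug lock held for writing, so
             * locking functions such as rte_memseg_walk() must not be
             * called from here
             */
            printf("%s: addr %p, len %zu\n",
                   event == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
                   addr, len);
    }

    /* somewhere after rte_eal_init(); fails with rte_errno == ENOTSUP
     * when running in legacy memory mode
     */
    if (rte_mem_event_callback_register("my_mem_cb", my_mem_event_cb) != 0)
            printf("callback not registered: %d\n", rte_errno);
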
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 15 +++++++++++++--
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index be39250..18c7b69 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -241,6 +241,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t alloc_sz;
 	int n_segs;
+	bool callback_triggered = false;
 
 	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -262,12 +263,22 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC, map_addr, alloc_sz);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
 	heap->total_size += alloc_sz;
@@ -280,6 +291,10 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				map_addr, alloc_sz);
+
 	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
 	request_sync();
@@ -642,6 +657,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= aligned_len;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -666,6 +685,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8202a1b..87c1bdb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -877,6 +877,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -902,6 +917,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
+				start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2c27063..5084a6b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -895,6 +895,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	create.levels = 1;
 
 	if (do_map) {
+		void *addr;
 		/* re-create window and remap the entire memory */
 		if (iova > create.window_size) {
 			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
@@ -910,9 +911,19 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 		}
 		/* now that we've remapped all of the memory that was present
 		 * before, map the segment that we were requested to map.
+		 *
+		 * however, if we were called by the callback, the memory we
+		 * were called with was already in the memseg list, so previous
+		 * mapping should've mapped that segment already.
+		 *
+		 * virt2memseg_list is a relatively cheap check, so use that. if
+		 * memory is within any memseg list, it's a memseg, so it's
+		 * already mapped.
 		 */
-		if (vfio_spapr_dma_do_map(vfio_container_fd,
-				vaddr, iova, len, 1) < 0) {
+		addr = (void *)(uintptr_t)vaddr;
+		if (rte_mem_virt2memseg_list(addr) == NULL &&
+				vfio_spapr_dma_do_map(vfio_container_fd,
+					vaddr, iova, len, 1) < 0) {
 			RTE_LOG(ERR, EAL, "Could not map segment\n");
 			return -1;
 		}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 64/70] vfio: enable support for mem event callbacks
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (65 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
                       ` (5 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Enable callbacks on first device attach, disable callbacks
on last device detach.

PPC64 IOMMU does a memseg walk, which would deadlock when
attempted from inside a callback (callbacks run with the memory
hotplug lock already held for writing), so provide a local,
thread-unsafe copy of the memseg walk.

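For reference, the pattern used elsewhere in this series for callbacks
that only need the range they were notified about (a sketch only, not
part of this patch; my_iommu_cb, my_dma_map and my_dma_unmap are
hypothetical names) resolves that range to memsegs directly instead of
walking the whole list; the sPAPR path additionally has to re-walk every
segment to recreate its DMA window, hence the local, lock-free walk
added below:

    static void
    my_iommu_cb(enum rte_mem_event event, const void *addr, size_t len)
    {
            const struct rte_memseg_list *msl = rte_mem_virt2memseg_list(addr);
            const struct rte_memseg *ms = rte_mem_virt2memseg(addr, msl);
            size_t done = 0;

            /* rte_memseg_walk() would deadlock here because the hotplug
             * lock is already held for writing, so cover only the
             * notified range. my_dma_map()/my_dma_unmap() stand in for
             * whatever driver-specific mapping helpers are used.
             */
            while (done < len) {
                    if (event == RTE_MEM_EVENT_ALLOC)
                            my_dma_map(ms->addr_64, ms->iova, ms->len);
                    else
                            my_dma_unmap(ms->addr_64, ms->iova, ms->len);
                    done += ms->len;
                    ms++;
            }
    }
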
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Fix PPC64 memseg walk in callback
    - Check if registering callbacks succeeded
    
    v3:
    - Moved callbacks to attach/detach as opposed to init
    

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 133 +++++++++++++++++++++++++++++++--
 1 file changed, 125 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 5084a6b..ae47a5f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -7,6 +7,7 @@
 #include <unistd.h>
 #include <sys/ioctl.h>
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
@@ -18,6 +19,8 @@
 
 #ifdef VFIO_PRESENT
 
+#define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
+
 /* per-process VFIO config */
 static struct vfio_config vfio_cfg;
 
@@ -53,6 +56,42 @@ static const struct vfio_iommu_type iommu_types[] = {
 	},
 };
 
+/* for sPAPR IOMMU, we will need to walk memseg list, but we cannot use
+ * rte_memseg_walk() because by the time we enter callback we will be holding a
+ * write lock, so regular rte_memseg_walk() will deadlock. copying the same
+ * iteration code everywhere is not ideal either, so use a lockless copy of
+ * memseg walk here.
+ */
+static int
+memseg_walk_thread_unsafe(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ms_idx, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
+
 int
 vfio_get_group_fd(int iommu_group_no)
 {
@@ -214,6 +253,38 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	/* for IOVA as VA mode, no need to care for IOVA addresses */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		uint64_t vfio_va = (uint64_t)(uintptr_t)addr;
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(vfio_va, vfio_va, len);
+		else
+			rte_vfio_dma_unmap(vfio_va, vfio_va, len);
+		return;
+	}
+
+	/* memsegs are contiguous in memory */
+	ms = rte_mem_virt2memseg(addr, msl);
+	while (cur_len < len) {
+		if (type == RTE_MEM_EVENT_ALLOC)
+			rte_vfio_dma_map(ms->addr_64, ms->iova, ms->len);
+		else
+			rte_vfio_dma_unmap(ms->addr_64, ms->iova, ms->len);
+
+		cur_len += ms->len;
+		++ms;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -276,6 +347,8 @@ int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		int *vfio_dev_fd, struct vfio_device_info *device_info)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -363,6 +436,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+			/* lock memory hotplug before mapping and release it
+			 * after registering callback, to prevent races
+			 */
+			rte_rwlock_read_lock(mem_lock);
 			ret = t->dma_map_func(vfio_cfg.vfio_container_fd);
 			if (ret) {
 				RTE_LOG(ERR, EAL,
@@ -370,10 +447,26 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 					dev_addr, errno, strerror(errno));
 				close(vfio_group_fd);
 				rte_vfio_clear_group(vfio_group_fd);
+				rte_rwlock_read_unlock(mem_lock);
 				return -1;
 			}
 
 			vfio_cfg.vfio_iommu_type = t;
+
+			/* register callback for mem events */
+			ret = rte_mem_event_callback_register(
+					VFIO_MEM_EVENT_CLB_NAME,
+					vfio_mem_event_callback);
+			/* unlock memory hotplug */
+			rte_rwlock_read_unlock(mem_lock);
+			if (ret && rte_errno == ENOTSUP) {
+				RTE_LOG(DEBUG, EAL, "Memory event callbacks not supported\n");
+			} else if (ret) {
+				RTE_LOG(ERR, EAL, "Could not install memory event callback for VFIO\n");
+				return -1;
+			} else {
+				RTE_LOG(DEBUG, EAL, "Installed memory event callback for VFIO\n");
+			}
 		}
 	}
 
@@ -411,6 +504,8 @@ int
 rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		    int vfio_dev_fd)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -418,13 +513,20 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	int iommu_group_no;
 	int ret;
 
+	/* we don't want any DMA mapping messages to come while we're detaching
+	 * VFIO device, because this might be the last device and we might need
+	 * to unregister the callback.
+	 */
+	rte_rwlock_read_lock(mem_lock);
+
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
 	if (ret <= 0) {
 		RTE_LOG(WARNING, EAL, "  %s not managed by VFIO driver\n",
 			dev_addr);
 		/* This is an error at this point. */
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* get the actual group fd */
@@ -432,7 +534,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (vfio_group_fd <= 0) {
 		RTE_LOG(INFO, EAL, "vfio_get_group_fd failed for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* At this point we got an active group. Closing it will make the
@@ -444,7 +547,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (close(vfio_dev_fd) < 0) {
 		RTE_LOG(INFO, EAL, "Error when closing vfio_dev_fd for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* An VFIO group can have several devices attached. Just when there is
@@ -456,17 +560,30 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		if (close(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when closing vfio_group_fd for %s\n",
 				dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 
 		if (rte_vfio_clear_group(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when clearing group for %s\n",
 					   dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 	}
 
-	return 0;
+	/* if there are no active device groups, unregister the callback to
+	 * avoid spurious attempts to map/unmap memory from VFIO.
+	 */
+	if (vfio_cfg.vfio_active_groups == 0)
+		rte_mem_event_callback_unregister(VFIO_MEM_EVENT_CLB_NAME);
+
+	/* success */
+	ret = 0;
+
+out:
+	rte_rwlock_read_unlock(mem_lock);
+	return ret;
 }
 
 int
@@ -884,7 +1001,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+	if (memseg_walk_thread_unsafe(vfio_spapr_window_size_walk, &param) < 0) {
 		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		return -1;
 	}
@@ -903,7 +1020,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
 				return -1;
 			}
-			if (rte_memseg_walk(vfio_spapr_map_walk,
+			if (memseg_walk_thread_unsafe(vfio_spapr_map_walk,
 					&vfio_container_fd) < 0) {
 				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
 				return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 65/70] bus/fslmc: move vfio DMA map into bus probe
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (66 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
                       ` (4 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

fslmc bus needs to map all allocated memory for VFIO before
device probe. This bus doesn't support hotplug, so at the time
of this call, all possible devices that could be present are
already present. This will also be the place where we install the
VFIO callback, although this change will come in the next patch.

Since rte_fslmc_vfio_dmamap() is now only called at bus probe,
there is no longer any need to check whether DMA mappings have
already been done.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/fslmc_bus.c    | 11 +++++++++++
 drivers/bus/fslmc/fslmc_vfio.c   |  6 ------
 drivers/net/dpaa2/dpaa2_ethdev.c |  1 -
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index d6806df..d0b3261 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -286,6 +286,17 @@ rte_fslmc_probe(void)
 		return 0;
 	}
 
+	/* Map existing segments as well as, in case of hotpluggable memory,
+	 * install callback handler.
+	 */
+	ret = rte_fslmc_vfio_dmamap();
+	if (ret) {
+		DPAA2_BUS_ERR("Unable to DMA map existing VAs: (%d)", ret);
+		/* Not continuing ahead */
+		DPAA2_BUS_ERR("FSLMC VFIO Mapping failed");
+		return 0;
+	}
+
 	ret = fslmc_vfio_process_group();
 	if (ret) {
 		DPAA2_BUS_ERR("Unable to setup devices %d", ret);
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 8b15312..db3eb61 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -51,7 +51,6 @@ static int container_device_fd;
 static char *g_container;
 static uint32_t *msi_intr_vaddr;
 void *(*rte_mcp_ptr_list);
-static int is_dma_done;
 
 static struct rte_dpaa2_object_list dpaa2_obj_list =
 	TAILQ_HEAD_INITIALIZER(dpaa2_obj_list);
@@ -235,9 +234,6 @@ int rte_fslmc_vfio_dmamap(void)
 {
 	int i = 0;
 
-	if (is_dma_done)
-		return 0;
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
@@ -254,8 +250,6 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
-	is_dma_done = 1;
-
 	return 0;
 }
 
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 281483d..5b8f30a 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -1845,7 +1845,6 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->rx_pkt_burst = dpaa2_dev_prefetch_rx;
 	eth_dev->tx_pkt_burst = dpaa2_dev_tx;
-	rte_fslmc_vfio_dmamap();
 
 	DPAA2_PMD_INFO("%s: netdev created", eth_dev->data->name);
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (67 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-09 10:01       ` Shreyansh Jain
  2018-04-08 20:18     ` [PATCH v4 67/70] eal: enable non-legacy memory mode Anatoly Burakov
                       ` (3 subsequent siblings)
  72 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

VFIO needs to map and unmap segments for DMA whenever they
become available or unavailable, so register a callback for
memory events, and provide map/unmap functions.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 150 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 136 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index db3eb61..dfdd2bc 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -30,6 +30,7 @@
 #include <rte_kvargs.h>
 #include <rte_dev.h>
 #include <rte_bus.h>
+#include <rte_eal_memconfig.h>
 
 #include "rte_fslmc.h"
 #include "fslmc_vfio.h"
@@ -188,11 +189,57 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
+static int fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+static int fslmc_unmap_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+
+static void
+fslmc_memevent_cb(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0, map_len = 0;
+	uint64_t virt_addr;
+	rte_iova_t iova_addr;
+	int ret;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+
+		ms = rte_mem_virt2memseg(va, msl);
+		iova_addr = ms->iova;
+		virt_addr = ms->addr_64;
+		map_len = ms->len;
+
+		DPAA2_BUS_DEBUG("Calling with type=%d, va=%p, virt_addr=0x%" PRIx64 ", iova=0x%" PRIx64 ", map_len=%zu\n",
+				type, va, virt_addr, iova_addr, map_len);
+
+		if (type == RTE_MEM_EVENT_ALLOC)
+			ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
+		else
+			ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
+
+		if (ret != 0) {
+			DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. Map=%d, addr=%p, len=%zu, err:(%d)",
+					type, va, map_len, ret);
+			return;
+		}
+
+		cur_len += map_len;
+	}
+
+	if (type == RTE_MEM_EVENT_ALLOC)
+		DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu\n",
+				addr, len);
+	else
+		DPAA2_BUS_DEBUG("Total Unmapped: addr=%p, len=%zu\n",
+				addr, len);
+}
+
 static int
-fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
-		const struct rte_memseg *ms, void *arg)
+fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
 {
-	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
@@ -200,10 +247,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 	};
 	int ret;
 
-	dma_map.size = ms->len;
-	dma_map.vaddr = ms->addr_64;
+	dma_map.size = len;
+	dma_map.vaddr = vaddr;
+
 #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-	dma_map.iova = ms->iova;
+	dma_map.iova = iovaddr;
 #else
 	dma_map.iova = dma_map.vaddr;
 #endif
@@ -216,30 +264,99 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 		return -1;
 	}
 
-	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-			dma_map.vaddr);
-	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			&dma_map);
+	DPAA2_BUS_DEBUG("--> Map address: %llX, size: 0x%llX\n",
+			dma_map.vaddr, dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 	if (ret) {
 		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
 				errno);
 		return -1;
 	}
-	(*n_segs)++;
+
 	return 0;
 }
 
+static int
+fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
+{
+	struct fslmc_vfio_group *group;
+	struct vfio_iommu_type1_dma_unmap dma_unmap = {
+		.argsz = sizeof(struct vfio_iommu_type1_dma_unmap),
+		.flags = 0,
+	};
+	int ret;
+
+	dma_unmap.size = len;
+	dma_unmap.iova = vaddr;
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
+		return -1;
+	}
+
+	DPAA2_BUS_DEBUG("--> Unmap address: %llX, size: 0x%llX\n",
+			dma_unmap.iova, dma_unmap.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_UNMAP_DMA API(errno = %d)",
+				errno);
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
+		 const struct rte_memseg *ms, void *arg)
+{
+	int *n_segs = arg;
+	int ret;
+
+	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
+	if (ret)
+		DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)\n",
+				ms->addr, ms->len);
+	else
+		(*n_segs)++;
+
+	return ret;
+}
+
 int rte_fslmc_vfio_dmamap(void)
 {
-	int i = 0;
+	int i = 0, ret;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
+
+	/* Lock before parsing and registering callback to memory subsystem */
+	rte_rwlock_read_lock(mem_lock);
 
-	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
+		rte_rwlock_read_unlock(mem_lock);
 		return -1;
+	}
+
+	ret = rte_mem_event_callback_register("fslmc_memevent_clb",
+					      fslmc_memevent_cb);
+	if (ret && rte_errno == ENOTSUP)
+		DPAA2_BUS_DEBUG("Memory event callbacks not supported");
+	else if (ret)
+		DPAA2_BUS_DEBUG("Unable to install memory handler");
+	else
+		DPAA2_BUS_DEBUG("Installed memory callback handler");
 
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
+		/* TODO: Is this verification required any more? What would
+		 * happen to non-legacy case where nothing was preallocated
+		 * thus causing i==0?
+		 */
 		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
+		rte_rwlock_read_unlock(mem_lock);
 		return -1;
 	}
 	DPAA2_BUS_DEBUG("Total %d segments found.", i);
@@ -250,6 +367,11 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
+	/* Existing segments have been mapped and memory callback for hotplug
+	 * has been installed.
+	 */
+	rte_rwlock_read_unlock(mem_lock);
+
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 67/70] eal: enable non-legacy memory mode
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (68 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 68/70] eal: add memory validator callback Anatoly Burakov
                       ` (2 subsequent siblings)
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Now that every other piece of the puzzle is in place, enable non-legacy
init mode.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e7c6dcf..99c2242 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -772,8 +772,6 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
-	/* for now, always set legacy mem */
-	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 68/70] eal: add memory validator callback
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (69 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 67/70] eal: enable non-legacy memory mode Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 69/70] eal: enable validation before new page allocation Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This API will enable the application to register for notifications
on page allocations that are about to happen, giving the application
a chance to allow or deny the allocation when the resulting total
memory utilization would be above the specified limit on the
specified socket.

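A minimal usage sketch (illustration only, not part of this patch;
my_alloc_validator and the "my_limit" name are made up):

    #include <stdio.h>
    #include <rte_memory.h>
    #include <rte_errno.h>

    /* called only when an allocation would push socket usage above the limit */
    static int
    my_alloc_validator(int socket_id, size_t cur_limit, size_t new_len)
    {
            printf("socket %d: %zu bytes would exceed limit of %zu bytes\n",
                   socket_id, new_len, cur_limit);
            return -1; /* -1 denies the allocation, 0 would allow it */
    }

    /* after rte_eal_init(): watch socket 0, with a 1 GB limit */
    if (rte_mem_alloc_validator_register("my_limit", my_alloc_validator,
                    0, 1ULL << 30) != 0)
            printf("validator not registered: %d\n", rte_errno);
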
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v4:
    - Document limitation on using some functions
    
    v3:
    - Added this feature

 lib/librte_eal/common/eal_common_memalloc.c | 138 +++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_common_memory.c   |  26 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 ++
 lib/librte_eal/common/include/rte_memory.h  |  63 +++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 2d2d46f..49fd53c 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -22,14 +22,26 @@ struct mem_event_callback_entry {
 	rte_mem_event_callback_t clb;
 };
 
+struct mem_alloc_validator_entry {
+	TAILQ_ENTRY(mem_alloc_validator_entry) next;
+	char name[RTE_MEM_ALLOC_VALIDATOR_NAME_LEN];
+	rte_mem_alloc_validator_t clb;
+	int socket_id;
+	size_t limit;
+};
+
 /** Double linked list of actions. */
 TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+TAILQ_HEAD(mem_alloc_validator_entry_list, mem_alloc_validator_entry);
 
 static struct mem_event_callback_entry_list mem_event_callback_list =
 	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
-
 static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
 
+static struct mem_alloc_validator_entry_list mem_alloc_validator_list =
+	TAILQ_HEAD_INITIALIZER(mem_alloc_validator_list);
+static rte_rwlock_t mem_alloc_validator_rwlock = RTE_RWLOCK_INITIALIZER;
+
 static struct mem_event_callback_entry *
 find_mem_event_callback(const char *name)
 {
@@ -42,6 +54,18 @@ find_mem_event_callback(const char *name)
 	return r;
 }
 
+static struct mem_alloc_validator_entry *
+find_mem_alloc_validator(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *r;
+
+	TAILQ_FOREACH(r, &mem_alloc_validator_list, next) {
+		if (!strcmp(r->name, name) && r->socket_id == socket_id)
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -221,3 +245,115 @@ eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 
 	rte_rwlock_read_unlock(&mem_event_rwlock);
 }
+
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	entry->socket_id = socket_id;
+	entry->limit = limit;
+	snprintf(entry->name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_alloc_validator_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i with limit %zu registered\n",
+		name, socket_id, limit);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+
+	if (name == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_alloc_validator_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i unregistered\n",
+		name, socket_id);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret = 0;
+
+	rte_rwlock_read_lock(&mem_alloc_validator_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_alloc_validator_list, next) {
+		if (entry->socket_id != socket_id || entry->limit > new_len)
+			continue;
+		RTE_LOG(DEBUG, EAL, "Calling mem alloc validator '%s' on socket %i\n",
+			entry->name, entry->socket_id);
+		if (entry->clb(socket_id, entry->limit, new_len) < 0)
+			ret = -1;
+	}
+
+	rte_rwlock_read_unlock(&mem_alloc_validator_rwlock);
+
+	return ret;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index b9e6c03..b9e2bc3 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -651,6 +651,32 @@ rte_mem_event_callback_unregister(const char *name)
 	return eal_memalloc_mem_event_callback_unregister(name);
 }
 
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_register(name, clb, socket_id,
+			limit);
+}
+
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_unregister(name, socket_id);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4d27403..6bec52c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -67,4 +67,14 @@ void
 eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 		size_t len);
 
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id);
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 398ca55..b085a8b 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -361,6 +361,69 @@ rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
 int __rte_experimental
 rte_mem_event_callback_unregister(const char *name);
 
+
+#define RTE_MEM_ALLOC_VALIDATOR_NAME_LEN 64
+/**< maximum length of alloc validator name */
+/**
+ * Function typedef used to register memory allocation validation callbacks.
+ *
+ * Returning 0 will allow allocation attempt to continue. Returning -1 will
+ * prevent allocation from succeeding.
+ */
+typedef int (*rte_mem_alloc_validator_t)(int socket_id,
+		size_t cur_limit, size_t new_len);
+
+/**
+ * @brief Register validator callback for memory allocations.
+ *
+ * Callbacks registered by this function will be called right before memory
+ * allocator is about to trigger allocation of more pages from the system if
+ * said allocation will bring total memory usage above specified limit on
+ * specified socket. User will be able to cancel pending allocation if callback
+ * returns -1.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @param limit
+ *   Limit above which to trigger callbacks.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+/**
+ * @brief Unregister validator callback for memory allocations.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d1ac9ea..2b5b1dc 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_alloc_validator_register;
+	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;
 	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 69/70] eal: enable validation before new page allocation
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (70 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 68/70] eal: add memory validator callback Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  2018-04-08 20:18     ` [PATCH v4 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/malloc_heap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 18c7b69..f8daf84 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -196,6 +196,15 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	int allocd_pages;
 	void *ret, *map_addr;
 
+	alloc_sz = (size_t)pg_sz * n_segs;
+
+	/* first, check if we're allowed to allocate this memory */
+	if (eal_memalloc_mem_alloc_validate(socket,
+			heap->total_size + alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "User has disallowed allocation\n");
+		return NULL;
+	}
+
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
@@ -205,7 +214,6 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
-	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
 	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v4 70/70] eal: prevent preallocated pages from being freed
  2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
                       ` (71 preceding siblings ...)
  2018-04-08 20:18     ` [PATCH v4 69/70] eal: enable validation before new page allocation Anatoly Burakov
@ 2018-04-08 20:18     ` Anatoly Burakov
  72 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-08 20:18 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

It is reasonable to expect the DPDK process not to deallocate any
pages that were preallocated by the "-m" or "--socket-mem" flags - yet,
currently, the DPDK memory subsystem will do exactly that once it finds
that the pages are unused.

Fix this by marking such pages as unfreeable and preventing malloc from
ever trying to free them.

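For illustration (a sketch, not part of this patch; count_unfreeable is
a made-up name), an application can check the new flag during a regular
memseg walk to see how many pages are pinned this way:

    #include <rte_memory.h>

    /* counts pages that the allocator will never release back to the OS */
    static int
    count_unfreeable(const struct rte_memseg_list *msl __rte_unused,
                    const struct rte_memseg *ms, void *arg)
    {
            unsigned int *n = arg;

            if (ms->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE)
                    (*n)++;
            return 0;
    }

    /* usage: unsigned int n = 0; rte_memseg_walk(count_unfreeable, &n); */
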
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_memory.h |  3 +++
 lib/librte_eal/common/malloc_heap.c        | 23 +++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c |  7 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 18 +++++++++++++++---
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b085a8b..a18fe27 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -83,6 +83,8 @@ typedef uint64_t rte_iova_t;
 /**
  * Physical memory segment descriptor.
  */
+#define RTE_MEMSEG_FLAG_DO_NOT_FREE (1 << 0)
+/**< Prevent this segment from being freed back to the OS. */
 struct rte_memseg {
 	RTE_STD_C11
 	union {
@@ -99,6 +101,7 @@ struct rte_memseg {
 	int32_t socket_id;          /**< NUMA socket ID. */
 	uint32_t nchannel;          /**< Number of channels. */
 	uint32_t nrank;             /**< Number of ranks. */
+	uint32_t flags;             /**< Memseg-specific flags */
 } __rte_packed;
 
 /**
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f8daf84..41c14a8 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -606,6 +606,7 @@ malloc_heap_free(struct malloc_elem *elem)
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
+	unsigned int i, n_segs;
 	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
@@ -647,6 +648,28 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	/* we can free something. however, some of these pages may be marked as
+	 * unfreeable, so also check that as well
+	 */
+	n_segs = aligned_len / page_sz;
+	for (i = 0; i < n_segs; i++) {
+		const struct rte_memseg *tmp =
+				rte_mem_virt2memseg(aligned_start, msl);
+
+		if (tmp->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			/* this is an unfreeable segment, so move start */
+			aligned_start = RTE_PTR_ADD(tmp->addr, tmp->len);
+		}
+	}
+
+	/* recalculate length and number of segments */
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_segs = aligned_len / page_sz;
+
+	/* check if we can still free some pages */
+	if (n_segs == 0)
+		goto free_unlock;
+
 	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
 
 	/*
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 87c1bdb..0da2e3c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -806,6 +806,13 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		struct free_walk_param wa;
 		int i, walk_res;
 
+		/* if this page is marked as unfreeable, fail */
+		if (cur->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			RTE_LOG(DEBUG, EAL, "Page is not allowed to be freed\n");
+			ret = -1;
+			continue;
+		}
+
 		memset(&wa, 0, sizeof(wa));
 
 		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 7ec7129..2bd9c30 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1637,21 +1637,33 @@ eal_hugepage_init(void)
 			hp_sz_idx++) {
 		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
 				socket_id++) {
+			struct rte_memseg **pages;
 			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
 			unsigned int num_pages = hpi->num_pages[socket_id];
-			int num_pages_alloc;
+			int num_pages_alloc, i;
 
 			if (num_pages == 0)
 				continue;
 
+			pages = malloc(sizeof(*pages) * num_pages);
+
 			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
 				num_pages, hpi->hugepage_sz >> 20, socket_id);
 
-			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(pages,
 					num_pages, hpi->hugepage_sz,
 					socket_id, true);
-			if (num_pages_alloc < 0)
+			if (num_pages_alloc < 0) {
+				free(pages);
 				return -1;
+			}
+
+			/* mark preallocated pages as unfreeable */
+			for (i = 0; i < num_pages_alloc; i++) {
+				struct rte_memseg *ms = pages[i];
+				ms->flags |= RTE_MEMSEG_FLAG_DO_NOT_FREE;
+			}
+			free(pages);
 		}
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH] bus/fslmc: support for hotplugging of memory
  2018-04-08 17:14       ` Burakov, Anatoly
@ 2018-04-09  7:49         ` Shreyansh Jain
  2018-04-09 15:49           ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-09  7:49 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

Hello Anatoly,

On Sunday 08 April 2018 10:44 PM, Burakov, Anatoly wrote:
> On 05-Apr-18 3:14 PM, Shreyansh Jain wrote:
>> Restructure VFIO DMA code for handling hotplug memory events
>> (callbacks) and --legacy case.
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>
>> ###
>> This is based on the 16fbfef04a3 github repository. This is assuming
>> that changes already exists as done in patch 26/68.
>> Though, this can be a standalone, replacing 26/88. Though, the Makefile
>> changes don't exist in this.
>> Also, this just a first draft. I will push any review changes after this
>> incrementally over v4.
>> ###
> 
> Hi Shreyansh,
> 
> I think we can keep the 26/68 as it still works within the context of 
> the patchset. I would like to add these changes closer to the end, where 
> we enable support for callbacks in VFIO (this could/should come as the 
> next patch).

But then it would also mean that dpaa2 would be broken within the memory 
hotplug patches?
I think it would be broken once the memsegs cease to be contiguous
physical sets.

> 
> That said, i took some liberties when integrating this patch, hopefully 
> for the better. I know you mentioned it's a draft, so you can post any 
> comments for the inevitable v4 :)
> 
>>
>>   drivers/bus/fslmc/fslmc_bus.c    |  15 ++++
>>   drivers/bus/fslmc/fslmc_vfio.c   | 161 
>> +++++++++++++++++++++++++++++++++++----
>>   drivers/bus/fslmc/fslmc_vfio.h   |   1 +
>>   drivers/net/dpaa2/dpaa2_ethdev.c |   1 -
>>   4 files changed, 163 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/bus/fslmc/fslmc_bus.c 
>> b/drivers/bus/fslmc/fslmc_bus.c
>> index 5ee0beb85..50884ff3a 100644
>> --- a/drivers/bus/fslmc/fslmc_bus.c
>> +++ b/drivers/bus/fslmc/fslmc_bus.c
>> @@ -266,6 +266,21 @@ rte_fslmc_probe(void)
>>           return 0;
>>       }
>> +    if (rte_log_get_global_level() >= RTE_LOG_DEBUG)
>> +        rte_dump_physmem_layout(stdout);
> 
> Presumably, this is not needed - just debug leftovers?

Yes, and can be removed.

> 
>> +
>> +    /* Map existing segments as well as, in case of hotpluggable memory,
>> +     * install callback handler.
>> +     */
>> +    ret = rte_fslmc_vfio_dmamap();
>> +    if (ret) {
>> +        FSLMC_BUS_LOG(ERR, "Unable to DMA map existing VAs: (%d)",
>> +                  ret);
>> +        /* Not continuing ahead */
>> +        FSLMC_BUS_LOG(ERR, "FSLMC VFIO Mapping failed");
>> +        return 0;
>> +    }
>> +
> 
> What happens if there are no devices on the bus that can be used by 
> DPDK? As far as i can tell, it would return error, which may or may not 
> be desirable (failing to map anything because there aren't any fslmc 
> devices is not an error?).

## If no devices found on the bus:
Call wouldn't have reached here. There is a jump out in rte_fslmc_probe 
in case no devices were scanned on the bus.

--->8--- rte_fslmc_probe() ---
...
         if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
                 return 0;
...
--->8---

## Assuming you are pointing to 'return 0':
So, the rte_eal_scan/probe doesn't expect to be interrupted just because 
one of the buses had issues (whether no devices found, not a real 
issue). It would continue ahead with scan/probe.

But, I think error should be returned so that rte_eal_probe can dump to 
logs the error and continue ahead normally. I will fix this.

> 
> For "regular" VFIO, the container is an empty shell unless you add 
> groups into it - does fslmc VFIO support work differently, and container 
> is already working/initialized by the time we reach this point?

FSLMC VFIO Container is not much different from generic container. But, 
at this point in code, the container is already initialized (group is 
connected to it, and relevant device fds extracted).
Only thing pending beyond this point is to look into the container and 
initialize various fslmc specific devices contained *within* the group. 
And, dma mapping of regions which is done using the newly introduced code.

> 
> Anyway, i'll leave this as is.

OK.

> 
>>       ret = fslmc_vfio_process_group();
>>       if (ret) {
>>           FSLMC_BUS_LOG(ERR, "Unable to setup devices %d", ret);
>> diff --git a/drivers/bus/fslmc/fslmc_vfio.c 
>> b/drivers/bus/fslmc/fslmc_vfio.c
>> index 31831e3ce..60725fb70 100644
>> --- a/drivers/bus/fslmc/fslmc_vfio.c
>> +++ b/drivers/bus/fslmc/fslmc_vfio.c
>> @@ -30,6 +30,7 @@
>>   #include <rte_kvargs.h>
>>   #include <rte_dev.h>
>>   #include <rte_bus.h>
>> +#include <rte_eal_memconfig.h>
> 
> <...>
> 
>> +}
>> +
>>   static int
>> -fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
>> -        const struct rte_memseg *ms, void *arg)
>> +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
>> +fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len)
>> +#else
>> +fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t 
>> len)
>> +#endif
> 
> I think i'll leave this as just "rte_iova_t iovaaddr __rte_unused" :)

Hmm, the attribute is harmless even when the parameter is used, so it
can stay without the enclosing #ifdefs.
But I don't know whether any compiler attaches side effects to it - for
example, clang reporting it as an error, etc.
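
That said, as far as I can tell __rte_unused just expands to
__attribute__((__unused__)), which only tells the compiler not to warn
when the parameter is unused, so using the parameter anyway should be
accepted by both gcc and clang. A minimal standalone illustration (not
DPDK code, just mimicking the definition):

/* 'unused' attribute only suppresses the unused-parameter warning */
#define __rte_unused __attribute__((__unused__))

static int add_len(int base, int extra __rte_unused)
{
	return base + extra;	/* 'extra' is marked unused but still used */
}

int main(void)
{
	return add_len(1, 2) == 3 ? 0 : 1;
}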

> 
>>   {
>> -    int *n_segs = arg;
>>       struct fslmc_vfio_group *group;
>>       struct vfio_iommu_type1_dma_map dma_map = {
>>           .argsz = sizeof(struct vfio_iommu_type1_dma_map),
>> @@ -205,10 +263,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl 
>> __rte_unused,
>>       };
>>       int ret;
>> -    dma_map.size = ms->len;
>> -    dma_map.vaddr = ms->addr_64;
>> +    dma_map.size = len;
>> +    dma_map.vaddr = vaddr;
> 
> <...>
> 
>>       if (is_dma_done)
>>           return 0;
> 
> I suspect this check was needed because you've done VFIO mapping on 
> device probe as opposed to bus probe, so VFIO mapping function could've 
> been called multiple times. Is that still the case, or is this check no 
> longer needed? I took the liberty to remove it.

Yes, that is correct. Now that VFIO mapping at device probe is disabled,
it is safe to remove this check.

> 
>> -    if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
>> +    /* Lock before parsing and registering callback to memory 
>> subsystem */
>> +    rte_rwlock_read_lock(mem_lock);
>> +
> 
> <...>
> 
>>       return 0;
>> diff --git a/drivers/bus/fslmc/fslmc_vfio.h 
>> b/drivers/bus/fslmc/fslmc_vfio.h
>> index e8fb3445f..e77e4c4ac 100644
>> --- a/drivers/bus/fslmc/fslmc_vfio.h
>> +++ b/drivers/bus/fslmc/fslmc_vfio.h
>> @@ -9,6 +9,7 @@
>>   #define _FSLMC_VFIO_H_
>>   #include <rte_vfio.h>
>> +#include <rte_memory.h>
>>   #include "eal_vfio.h"
> 
> I suspect this change is not needed? I took the liberty to remove it.
> 

I remember that when I was using your initial patch, I added this because
some structures were unresolved. It is possible that is no longer the case
(and that I, too, have moved a lot of code beyond the first internal
draft). If it compiles OK, there is no need for this.

-
Shreyansh

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-08 20:18     ` [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
@ 2018-04-09 10:01       ` Shreyansh Jain
  2018-04-09 10:55         ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-09 10:01 UTC (permalink / raw)
  To: Anatoly Burakov; +Cc: dev

Hi Anatoly,

On Monday 09 April 2018 01:48 AM, Anatoly Burakov wrote:
> VFIO needs to map and unmap segments for DMA whenever they
> become available or unavailable, so register a callback for
> memory events, and provide map/unmap functions.
> 
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   drivers/bus/fslmc/fslmc_vfio.c | 150 +++++++++++++++++++++++++++++++++++++----
>   1 file changed, 136 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index db3eb61..dfdd2bc 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -30,6 +30,7 @@
>   #include <rte_kvargs.h>
>   #include <rte_dev.h>
>   #include <rte_bus.h>
> +#include <rte_eal_memconfig.h>
>   
>   #include "rte_fslmc.h"
>   #include "fslmc_vfio.h"
> @@ -188,11 +189,57 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
>   	return -errno;
>   }
>   
> +static int fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
> +static int fslmc_unmap_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
> +
> +static void
> +fslmc_memevent_cb(enum rte_mem_event type, const void *addr, size_t len)
> +{
> +	struct rte_memseg_list *msl;
> +	struct rte_memseg *ms;
> +	size_t cur_len = 0, map_len = 0;
> +	uint64_t virt_addr;
> +	rte_iova_t iova_addr;
> +	int ret;
> +
> +	msl = rte_mem_virt2memseg_list(addr);
> +
> +	while (cur_len < len) {
> +		const void *va = RTE_PTR_ADD(addr, cur_len);
> +
> +		ms = rte_mem_virt2memseg(va, msl);
> +		iova_addr = ms->iova;
> +		virt_addr = ms->addr_64;
> +		map_len = ms->len;
> +
> +		DPAA2_BUS_DEBUG("Calling with type=%d, va=%p, virt_addr=0x%" PRIx64 ", iova=0x%" PRIx64 ", map_len=%zu\n",

I would like to correct this message (80char + rewording) - What do you 
suggest? Should I send a new patch to you or just convey what should be 
changed?

> +				type, va, virt_addr, iova_addr, map_len);
> +
> +		if (type == RTE_MEM_EVENT_ALLOC)
> +			ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
> +		else
> +			ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
> +
> +		if (ret != 0) {
> +			DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. Map=%d, addr=%p, len=%zu, err:(%d)",
> +					type, va, map_len, ret);

Same as above. 80 Char issue.

> +			return;
> +		}
> +
> +		cur_len += map_len;
> +	}
> +
> +	if (type == RTE_MEM_EVENT_ALLOC)
> +		DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu\n",
> +				addr, len);
> +	else
> +		DPAA2_BUS_DEBUG("Total Unmapped: addr=%p, len=%zu\n",
> +				addr, len);
> +}
> +
>   static int
> -fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
> -		const struct rte_memseg *ms, void *arg)
> +fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
>   {
> -	int *n_segs = arg;
>   	struct fslmc_vfio_group *group;
>   	struct vfio_iommu_type1_dma_map dma_map = {
>   		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
> @@ -200,10 +247,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
>   	};
>   	int ret;
>   
> -	dma_map.size = ms->len;
> -	dma_map.vaddr = ms->addr_64;
> +	dma_map.size = len;
> +	dma_map.vaddr = vaddr;
> +
>   #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
> -	dma_map.iova = ms->iova;
> +	dma_map.iova = iovaddr;
>   #else
>   	dma_map.iova = dma_map.vaddr;
>   #endif
> @@ -216,30 +264,99 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
>   		return -1;
>   	}
>   
> -	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
> -			dma_map.vaddr);
> -	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
> -	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
> -			&dma_map);
> +	DPAA2_BUS_DEBUG("--> Map address: %llX, size: 0x%llX\n",
> +			dma_map.vaddr, dma_map.size);
> +	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA, &dma_map);
>   	if (ret) {
>   		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
>   				errno);
>   		return -1;
>   	}
> -	(*n_segs)++;
> +
>   	return 0;
>   }
>   
> +static int
> +fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
> +{
> +	struct fslmc_vfio_group *group;
> +	struct vfio_iommu_type1_dma_unmap dma_unmap = {
> +		.argsz = sizeof(struct vfio_iommu_type1_dma_unmap),
> +		.flags = 0,
> +	};
> +	int ret;
> +
> +	dma_unmap.size = len;
> +	dma_unmap.iova = vaddr;
> +
> +	/* SET DMA MAP for IOMMU */
> +	group = &vfio_group;
> +
> +	if (!group->container) {
> +		DPAA2_BUS_ERR("Container is not connected ");
> +		return -1;
> +	}
> +
> +	DPAA2_BUS_DEBUG("--> Unmap address: %llX, size: 0x%llX\n",
> +			dma_unmap.iova, dma_unmap.size);
> +	ret = ioctl(group->container->fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
> +	if (ret) {
> +		DPAA2_BUS_ERR("VFIO_IOMMU_UNMAP_DMA API(errno = %d)",
> +				errno);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
> +		 const struct rte_memseg *ms, void *arg)
> +{
> +	int *n_segs = arg;
> +	int ret;
> +
> +	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
> +	if (ret)
> +		DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)\n",
> +				ms->addr, ms->len);
> +	else
> +		(*n_segs)++;
> +
> +	return ret;
> +}
> +
>   int rte_fslmc_vfio_dmamap(void)
>   {
> -	int i = 0;
> +	int i = 0, ret;
> +	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
> +	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
> +
> +	/* Lock before parsing and registering callback to memory subsystem */
> +	rte_rwlock_read_lock(mem_lock);
>   
> -	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
> +	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
> +		rte_rwlock_read_unlock(mem_lock);
>   		return -1;
> +	}
> +
> +	ret = rte_mem_event_callback_register("fslmc_memevent_clb",
> +					      fslmc_memevent_cb);
> +	if (ret && rte_errno == ENOTSUP)
> +		DPAA2_BUS_DEBUG("Memory event callbacks not supported");
> +	else if (ret)
> +		DPAA2_BUS_DEBUG("Unable to install memory handler");
> +	else
> +		DPAA2_BUS_DEBUG("Installed memory callback handler");
>   
>   	/* Verifying that at least single segment is available */
>   	if (i <= 0) {
> +		/* TODO: Is this verification required any more? What would
> +		 * happen to non-legacy case where nothing was preallocated
> +		 * thus causing i==0?
> +		 */

And this as well - if callbacks are not going to appear in the legacy
case, this needs to be removed.
Let me know how you want to take these changes.

>   		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
> +		rte_rwlock_read_unlock(mem_lock);
>   		return -1;
>   	}
>   	DPAA2_BUS_DEBUG("Total %d segments found.", i);
> @@ -250,6 +367,11 @@ int rte_fslmc_vfio_dmamap(void)
>   	 */
>   	vfio_map_irq_region(&vfio_group);
>   
> +	/* Existing segments have been mapped and memory callback for hotplug
> +	 * has been installed.
> +	 */
> +	rte_rwlock_read_unlock(mem_lock);
> +
>   	return 0;
>   }
>   
> 

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v4 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-08 20:18     ` [PATCH v4 44/70] net/mlx5: " Anatoly Burakov
@ 2018-04-09 10:26       ` gowrishankar muthukrishnan
  0 siblings, 0 replies; 471+ messages in thread
From: gowrishankar muthukrishnan @ 2018-04-09 10:26 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain

On Monday 09 April 2018 01:48 AM, Anatoly Burakov wrote:
> Reduce dependency on internal details of EAL memory subsystem, and
> simplify code.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
>   drivers/net/mlx5/mlx5_mr.c | 18 ++++++++----------
>   1 file changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index 2bf1f9c..d8c04dc 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -234,7 +234,7 @@ struct mlx5_mr *
>   mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>   {
>   	struct priv *priv = dev->data->dev_private;
> -	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
> +	const struct rte_memseg *ms;
>   	uintptr_t start;
>   	uintptr_t end;
>   	unsigned int i;

Unused variable 'i' to be removed.

Thanks,
Gowrishankar

> @@ -261,17 +261,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>   	/* Save original addresses for exact MR lookup. */
>   	mr->start = start;
>   	mr->end = end;
> +
>   	/* Round start and end to page boundary if found in memory segments. */
> -	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
> -		uintptr_t addr = (uintptr_t)ms[i].addr;
> -		size_t len = ms[i].len;
> -		unsigned int align = ms[i].hugepage_sz;
> +	ms = rte_mem_virt2memseg((void *)start);
> +	if (ms != NULL)
> +		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
> +	ms = rte_mem_virt2memseg((void *)end);
> +	if (ms != NULL)
> +		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
>
> -		if ((start > addr) && (start < addr + len))
> -			start = RTE_ALIGN_FLOOR(start, align);
> -		if ((end > addr) && (end < addr + len))
> -			end = RTE_ALIGN_CEIL(end, align);
> -	}
>   	DRV_LOG(DEBUG,
>   		"port %u mempool %p using start=%p end=%p size=%zu for memory"
>   		" region",

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-09 10:01       ` Shreyansh Jain
@ 2018-04-09 10:55         ` Burakov, Anatoly
  2018-04-09 12:09           ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-09 10:55 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

On 09-Apr-18 11:01 AM, Shreyansh Jain wrote:
> Hi Anatoly,
> 
> On Monday 09 April 2018 01:48 AM, Anatoly Burakov wrote:
>> VFIO needs to map and unmap segments for DMA whenever they
>> become available or unavailable, so register a callback for
>> memory events, and provide map/unmap functions.
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---

<...>

>> +        DPAA2_BUS_DEBUG("Calling with type=%d, va=%p, virt_addr=0x%" 
>> PRIx64 ", iova=0x%" PRIx64 ", map_len=%zu\n",
> 
> I would like to correct this message (80char + rewording) - What do you 
> suggest? Should I send a new patch to you or just convey what should be 
> changed?
> 

As far as I know, leaving strings on a single line is good for grepping.
However, perhaps having PRIx64 etc. in there breaks that anyway.

>> +                type, va, virt_addr, iova_addr, map_len);
>> +
>> +        if (type == RTE_MEM_EVENT_ALLOC)
>> +            ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
>> +        else
>> +            ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
>> +
>> +        if (ret != 0) {
>> +            DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. Map=%d, 
>> addr=%p, len=%zu, err:(%d)",
>> +                    type, va, map_len, ret);
> 
> Same as above. 80 Char issue.

Same reasoning - leaving strings unbroken allows for easier grepping of
the codebase, but I'm not sure what our policy is on having formatted
strings unbroken.

> 
>> +            return;
>> +        }
>> +
>> +        cur_len += map_len;
>> +    }
>> +
>> +    if (type == RTE_MEM_EVENT_ALLOC)
>> +        DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu\n",
>> +                addr, len);
>> +    else

<...>

>> +    ret = rte_mem_event_callback_register("fslmc_memevent_clb",
>> +                          fslmc_memevent_cb);
>> +    if (ret && rte_errno == ENOTSUP)
>> +        DPAA2_BUS_DEBUG("Memory event callbacks not supported");
>> +    else if (ret)
>> +        DPAA2_BUS_DEBUG("Unable to install memory handler");
>> +    else
>> +        DPAA2_BUS_DEBUG("Installed memory callback handler");
>>       /* Verifying that at least single segment is available */
>>       if (i <= 0) {
>> +        /* TODO: Is this verification required any more? What would
>> +         * happen to non-legacy case where nothing was preallocated
>> +         * thus causing i==0?
>> +         */
> 
> And this as well - if call backs are not going to appear in case of 
> legacy, this needs to be removed.

It's not only in legacy mode that callbacks won't appear - they will
also not appear on FreeBSD. We check for this above by looking at the
rte_errno value (if callbacks are not supported, it is set to ENOTSUP,
and having callbacks unsupported is not an error).

> let me know how do you want to take these changes.
> 

Now that I think of it, this error condition is wrong. This is called in
both legacy and non-legacy mode. This is bus probe, no? In non-legacy
mode, it is entirely possible to start without any memory whatsoever. It
just so happens that the rte_service API allocates some on init, and hence
you always have at least one segment - but that may not be the case
forever. So, in non-legacy mode, not having memsegs is not an error, it is
expected behavior, so maybe we should remove this error check altogether.
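
I.e. keep the walk, but treat i == 0 as informational only - roughly
something like this (sketch):

--->8---
	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
		rte_rwlock_read_unlock(mem_lock);
		return -1;
	}
	/* i == 0 is not an error in non-legacy mode: segments will be
	 * mapped from the memory event callback as they are allocated.
	 */
	DPAA2_BUS_DEBUG("Total %d segments found.", i);
--->8---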

>>           DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
>> +        rte_rwlock_read_unlock(mem_lock);
>>           return -1;
>>       }
>>       DPAA2_BUS_DEBUG("Total %d segments found.", i);
>> @@ -250,6 +367,11 @@ int rte_fslmc_vfio_dmamap(void)
>>        */
>>       vfio_map_irq_region(&vfio_group);
>> +    /* Existing segments have been mapped and memory callback for 
>> hotplug
>> +     * has been installed.
>> +     */
>> +    rte_rwlock_read_unlock(mem_lock);
>> +
>>       return 0;
>>   }
>>
> 
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-09 10:55         ` Burakov, Anatoly
@ 2018-04-09 12:09           ` Shreyansh Jain
  2018-04-09 12:35             ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-09 12:09 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

On Monday 09 April 2018 04:25 PM, Burakov, Anatoly wrote:
> On 09-Apr-18 11:01 AM, Shreyansh Jain wrote:
>> Hi Anatoly,
>>
>> On Monday 09 April 2018 01:48 AM, Anatoly Burakov wrote:
>>> VFIO needs to map and unmap segments for DMA whenever they
>>> become available or unavailable, so register a callback for
>>> memory events, and provide map/unmap functions.
>>>
>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> ---
> 
> <...>
> 
>>> +        DPAA2_BUS_DEBUG("Calling with type=%d, va=%p, virt_addr=0x%" 
>>> PRIx64 ", iova=0x%" PRIx64 ", map_len=%zu\n",
>>
>> I would like to correct this message (80char + rewording) - What do 
>> you suggest? Should I send a new patch to you or just convey what 
>> should be changed?
>>
> 
> As far as i know, leaving strings on single line is good for grepping. 
> However, perhaps having PRIx64 etc in there breaks it anyway.

Yes - that, and the debug message was not helpful.
This is what I had in mind (DPAA2_BUS_DEBUG doesn't require an extra \n):

DPAA2_BUS_DEBUG("Request for %s, va=%p, virt_addr=0x%" PRIx64 ","
		"iova=0x%" PRIx64 ", map_len=%zu",
		type == RTE_MEM_EVENT_ALLOC? "alloc" : "dealloc",
		va, virt_addr, iova_addr, map_len);

> 
>>> +                type, va, virt_addr, iova_addr, map_len);
>>> +
>>> +        if (type == RTE_MEM_EVENT_ALLOC)
>>> +            ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
>>> +        else
>>> +            ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
>>> +
>>> +        if (ret != 0) {
>>> +            DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. Map=%d, 
>>> addr=%p, len=%zu, err:(%d)",
>>> +                    type, va, map_len, ret);
>>
>> Same as above. 80 Char issue.
> 
> Same reasoning - leaving strings unbroken allows for easier grepping the 
> codebase, but i'm not sure what's our policy on having formatted strings 
> unbroken.

My policy is no different, but the various variables being dumped can't
help with grepping anyway - so keeping the variables on separate lines to
stay within 80 chars is OK. "DMA Mapping/Unmapping failed." is enough for
greps.

> 
>>
>>> +            return;
>>> +        }
>>> +
>>> +        cur_len += map_len;
>>> +    }
>>> +
>>> +    if (type == RTE_MEM_EVENT_ALLOC)
>>> +        DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu\n",
>>> +                addr, len);
>>> +    else
> 
> <...>
> 
>>> +    ret = rte_mem_event_callback_register("fslmc_memevent_clb",
>>> +                          fslmc_memevent_cb);
>>> +    if (ret && rte_errno == ENOTSUP)
>>> +        DPAA2_BUS_DEBUG("Memory event callbacks not supported");
>>> +    else if (ret)
>>> +        DPAA2_BUS_DEBUG("Unable to install memory handler");
>>> +    else
>>> +        DPAA2_BUS_DEBUG("Installed memory callback handler");
>>>       /* Verifying that at least single segment is available */
>>>       if (i <= 0) {
>>> +        /* TODO: Is this verification required any more? What would
>>> +         * happen to non-legacy case where nothing was preallocated
>>> +         * thus causing i==0?
>>> +         */
>>
>> And this as well - if call backs are not going to appear in case of 
>> legacy, this needs to be removed.
> 
> Callbacks aren't only not going to appear in legacy mode - they will 
> also not appear on FreeBSD. We check this above, with checking rte_errno 
> value (if callbacks are not supported, it's set to ENOTSUP, and having 
> callbacks unsupported is not an error).
> 
>> let me know how do you want to take these changes.
>>
> 
> Now that i think of it, this error condition is wrong. This is called in 
> both legacy and non-legacy mode. This is bus probe, no? For non-legacy 
> mode, it is entirely possible to start without any memory whatsoever. It 
> just so happens that rte_service API allocates some on init, and hence 
> you always have at least one segment - that may not be the case forever. 
> So, non-legacy mode, not having memsegs is not an error, it is expected 
> behavior, so maybe we should remove this error check altogether.

Agree - that count was only required in the earlier case. It can be removed.

> 
>>>           DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
>>> +        rte_rwlock_read_unlock(mem_lock);
>>>           return -1;
>>>       }
>>>       DPAA2_BUS_DEBUG("Total %d segments found.", i);
>>> @@ -250,6 +367,11 @@ int rte_fslmc_vfio_dmamap(void)
>>>        */
>>>       vfio_map_irq_region(&vfio_group);
>>> +    /* Existing segments have been mapped and memory callback for 
>>> hotplug
>>> +     * has been installed.
>>> +     */
>>> +    rte_rwlock_read_unlock(mem_lock);
>>> +
>>>       return 0;
>>>   }
>>>
>>
>>
> 
> 

I think there are enough changes, even if they are trivial. Maybe I can
rework this patch and send it to you. If that is inconvenient, just
extract whatever you want from it.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-09 12:09           ` Shreyansh Jain
@ 2018-04-09 12:35             ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-09 12:35 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

On 09-Apr-18 1:09 PM, Shreyansh Jain wrote:
> On Monday 09 April 2018 04:25 PM, Burakov, Anatoly wrote:
>> On 09-Apr-18 11:01 AM, Shreyansh Jain wrote:
>>> Hi Anatoly,
>>>
>>> On Monday 09 April 2018 01:48 AM, Anatoly Burakov wrote:
>>>> VFIO needs to map and unmap segments for DMA whenever they
>>>> become available or unavailable, so register a callback for
>>>> memory events, and provide map/unmap functions.
>>>>
>>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>> ---
>>
>> <...>
>>
>>>> +        DPAA2_BUS_DEBUG("Calling with type=%d, va=%p, 
>>>> virt_addr=0x%" PRIx64 ", iova=0x%" PRIx64 ", map_len=%zu\n",
>>>
>>> I would like to correct this message (80char + rewording) - What do 
>>> you suggest? Should I send a new patch to you or just convey what 
>>> should be changed?
>>>
>>
>> As far as i know, leaving strings on single line is good for grepping. 
>> However, perhaps having PRIx64 etc in there breaks it anyway.
> 
> Yes, that and the debug message was not helpful.
> This is what I had in mind. (DPAA2_BUS_DEBUG doesn't require an extra \n)
> 
> DPAA2_BUS_DEBUG("Request for %s, va=%p, virt_addr=0x%" PRIx64 ","
>          "iova=0x%" PRIx64 ", map_len=%zu",
>          type == RTE_MEM_EVENT_ALLOC? "alloc" : "dealloc",
>          va, virt_addr, iova_addr, map_len);
> 
>>
>>>> +                type, va, virt_addr, iova_addr, map_len);
>>>> +
>>>> +        if (type == RTE_MEM_EVENT_ALLOC)
>>>> +            ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
>>>> +        else
>>>> +            ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
>>>> +
>>>> +        if (ret != 0) {
>>>> +            DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. Map=%d, 
>>>> addr=%p, len=%zu, err:(%d)",
>>>> +                    type, va, map_len, ret);
>>>
>>> Same as above. 80 Char issue.
>>
>> Same reasoning - leaving strings unbroken allows for easier grepping 
>> the codebase, but i'm not sure what's our policy on having formatted 
>> strings unbroken.
> 
> My policy is not different, but the various variables being dumped 
> cannot anyway help in grepping - So, keeping the variables on separate 
> lines for 80chars is ok. "DMA Mapping/Unmapping failed." is enough for 
> greps.
> 
>>
>>>
>>>> +            return;
>>>> +        }
>>>> +
>>>> +        cur_len += map_len;
>>>> +    }
>>>> +
>>>> +    if (type == RTE_MEM_EVENT_ALLOC)
>>>> +        DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu\n",
>>>> +                addr, len);
>>>> +    else
>>
>> <...>
>>
>>>> +    ret = rte_mem_event_callback_register("fslmc_memevent_clb",
>>>> +                          fslmc_memevent_cb);
>>>> +    if (ret && rte_errno == ENOTSUP)
>>>> +        DPAA2_BUS_DEBUG("Memory event callbacks not supported");
>>>> +    else if (ret)
>>>> +        DPAA2_BUS_DEBUG("Unable to install memory handler");
>>>> +    else
>>>> +        DPAA2_BUS_DEBUG("Installed memory callback handler");
>>>>       /* Verifying that at least single segment is available */
>>>>       if (i <= 0) {
>>>> +        /* TODO: Is this verification required any more? What would
>>>> +         * happen to non-legacy case where nothing was preallocated
>>>> +         * thus causing i==0?
>>>> +         */
>>>
>>> And this as well - if call backs are not going to appear in case of 
>>> legacy, this needs to be removed.
>>
>> Callbacks aren't only not going to appear in legacy mode - they will 
>> also not appear on FreeBSD. We check this above, with checking 
>> rte_errno value (if callbacks are not supported, it's set to ENOTSUP, 
>> and having callbacks unsupported is not an error).
>>
>>> let me know how do you want to take these changes.
>>>
>>
>> Now that i think of it, this error condition is wrong. This is called 
>> in both legacy and non-legacy mode. This is bus probe, no? For 
>> non-legacy mode, it is entirely possible to start without any memory 
>> whatsoever. It just so happens that rte_service API allocates some on 
>> init, and hence you always have at least one segment - that may not be 
>> the case forever. So, non-legacy mode, not having memsegs is not an 
>> error, it is expected behavior, so maybe we should remove this error 
>> check altogether.
> 
> Agree - that count was only required in the earlier case. It can be 
> removed.
> 
>>
>>>>           DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
>>>> +        rte_rwlock_read_unlock(mem_lock);
>>>>           return -1;
>>>>       }
>>>>       DPAA2_BUS_DEBUG("Total %d segments found.", i);
>>>> @@ -250,6 +367,11 @@ int rte_fslmc_vfio_dmamap(void)
>>>>        */
>>>>       vfio_map_irq_region(&vfio_group);
>>>> +    /* Existing segments have been mapped and memory callback for 
>>>> hotplug
>>>> +     * has been installed.
>>>> +     */
>>>> +    rte_rwlock_read_unlock(mem_lock);
>>>> +
>>>>       return 0;
>>>>   }
>>>>
>>>
>>>
>>
>>
> 
> I think there are enough changes, even if trivial. Maybe I can rework 
> this patch and send you. If that is inconvenient, just extract from that 
> whatever you want.
> 

There aren't a lot of changes, so I'll respin it myself, addressing the
comments above. Thanks!

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH] bus/fslmc: support for hotplugging of memory
  2018-04-09  7:49         ` Shreyansh Jain
@ 2018-04-09 15:49           ` Burakov, Anatoly
  2018-04-09 16:06             ` Shreyansh Jain
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-09 15:49 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

On 09-Apr-18 8:49 AM, Shreyansh Jain wrote:
> Hello Anatoly,
> 
> On Sunday 08 April 2018 10:44 PM, Burakov, Anatoly wrote:
>> On 05-Apr-18 3:14 PM, Shreyansh Jain wrote:
>>> Restructure VFIO DMA code for handling hotplug memory events
>>> (callbacks) and --legacy case.
>>>
>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>>> ---
>>>
>>> ###
>>> This is based on the 16fbfef04a3 github repository. This is assuming
>>> that changes already exists as done in patch 26/68.
>>> Though, this can be a standalone, replacing 26/88. Though, the Makefile
>>> changes don't exist in this.
>>> Also, this just a first draft. I will push any review changes after this
>>> incrementally over v4.
>>> ###
>>
>> Hi Shreyansh,
>>
>> I think we can keep the 26/68 as it still works within the context of 
>> the patchset. I would like to add these changes closer to the end, 
>> where we enable support for callbacks in VFIO (this could/should come 
>> as the next patch).
> 
> But then it would also mean that dpaa2 would be broken within the memory 
> hotplug patches?
> I think it would be broken once the memseg ceases to be continuous 
> physical sets.
> 

Hi Shreyansh,

Why would it be broken? Even when memseg change comes into effect, 
legacy mem is not enabled until much later in the patchset, when all of 
the callback/multiprocess business is in place. For all intents and 
purposes, this stays valid for legacy mode, hence not broken.

We later enable callbacks etc. on VFIO, but technically they still 
aren't enabled until 65 (67 in v4), when it becomes possible to actually 
run DPDK in non-legacy mode.

So, for the duration of this patchset, dpaa2 is not broken, as far as i 
can tell. Keeping in mind that only legacy mode will be available until 
patch 65/67, what exactly is being broken here?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH] bus/fslmc: support for hotplugging of memory
  2018-04-09 15:49           ` Burakov, Anatoly
@ 2018-04-09 16:06             ` Shreyansh Jain
  0 siblings, 0 replies; 471+ messages in thread
From: Shreyansh Jain @ 2018-04-09 16:06 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: dev

> -----Original Message-----
> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
> Sent: Monday, April 9, 2018 9:20 PM
> To: Shreyansh Jain <shreyansh.jain@nxp.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] bus/fslmc: support for hotplugging of
> memory
> 
> On 09-Apr-18 8:49 AM, Shreyansh Jain wrote:
> > Hello Anatoly,
> >
> > On Sunday 08 April 2018 10:44 PM, Burakov, Anatoly wrote:
> >> On 05-Apr-18 3:14 PM, Shreyansh Jain wrote:
> >>> Restructure VFIO DMA code for handling hotplug memory events
> >>> (callbacks) and --legacy case.
> >>>
> >>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> >>> ---
> >>>
> >>> ###
> >>> This is based on the 16fbfef04a3 github repository. This is assuming
> >>> that changes already exists as done in patch 26/68.
> >>> Though, this can be a standalone, replacing 26/88. Though, the
> Makefile
> >>> changes don't exist in this.
> >>> Also, this just a first draft. I will push any review changes after
> this
> >>> incrementally over v4.
> >>> ###
> >>
> >> Hi Shreyansh,
> >>
> >> I think we can keep the 26/68 as it still works within the context of
> >> the patchset. I would like to add these changes closer to the end,
> >> where we enable support for callbacks in VFIO (this could/should come
> >> as the next patch).
> >
> > But then it would also mean that dpaa2 would be broken within the
> memory
> > hotplug patches?
> > I think it would be broken once the memseg ceases to be continuous
> > physical sets.
> >
> 
> Hi Shreyansh,
> 
> Why would it be broken? Even when memseg change comes into effect,
> legacy mem is not enabled until much later in the patchset, when all of
> the callback/multiprocess business is in place. For all intents and
> purposes, this stays valid for legacy mode, hence not broken.

OK, then I am mistaken. I was under the impression that as soon as memseg lists were introduced - when memsegs stop being contiguous blocks - DPAA2 would have stopped working. I just didn't put in the effort to check this.

> 
> We later enable callbacks etc. on VFIO, but technically they still
> aren't enabled until 65 (67 in v4), when it becomes possible to actually
> run DPDK in non-legacy mode.

Got it. Thanks.

> 
> So, for the duration of this patchset, dpaa2 is not broken, as far as i
> can tell. Keeping in mind that only legacy mode will be available until
> patch 65/67, what exactly is being broken here?
> 

Nothing, just a misunderstanding.

> --
> Thanks,
> Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v5 00/70] Memory Hotplug for DPDK
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:35         ` gowrishankar muthukrishnan
                           ` (71 more replies)
  2018-04-09 18:00       ` [PATCH v5 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                         ` (69 subsequent siblings)
  70 siblings, 72 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].

Dependencies (to be applied in specified order):
- EAL IOVA fix [2]

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [3]

The vast majority of changes are in the EAL and malloc; the external API
disruption is minimal: a new flag is added to the memzone API for contiguous
memory allocation, a few API additions are made in rte_memory due to the
switch to memseg lists as opposed to memsegs, and a few new convenience
API's are added. Every other API change is internal to EAL, and all of the
memory allocation/freeing is handled through rte_malloc, with no externally
visible API changes.

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Added alloc/free for pages happening as needed on rte_malloc/rte_free
 * Added contiguous memory allocation API's for rte_memzone
 * Added convenience API calls to walk over memsegs
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [4]
 * Callbacks for registering memory allocations
 * Callbacks for allowing/disallowing allocations above specified limit
 * Multiprocess support done via DPDK IPC introduced in 18.02

The biggest difference is a "memseg" now represents a single page (as opposed to
being a big contiguous block of pages). As a consequence, both memzones and
malloc elements are no longer guaranteed to be physically contiguous, unless
the user asks for it at reserve time. To preserve whatever functionality that
was dependent on previous behavior, a legacy memory option is also provided,
however it is expected (or perhaps vainly hoped) to be a temporary solution.

Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.
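
As a simplified illustration (not the actual EAL data structures): each
list covers a preallocated chunk of VA space and holds pages of a single
size, so locating the memseg backing a given address becomes index
arithmetic instead of a walk over one big list:

/* Illustrative only -- not the real memseg list layout. */
struct example_memseg_list {
	void *base_va;            /* preallocated VA space for this list */
	size_t page_sz;           /* all pages in this list have this size */
	struct rte_memseg *segs;  /* one entry per page */
	int n_segs;
};

static struct rte_memseg *
example_virt2memseg(const struct example_memseg_list *msl, const void *va)
{
	size_t off = (uintptr_t)va - (uintptr_t)msl->base_va;

	if (va < msl->base_va || off >= (size_t)msl->n_segs * msl->page_sz)
		return NULL; /* address is not backed by this list */
	/* constant-time lookup instead of iterating over all segments */
	return &msl->segs[off / msl->page_sz];
}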

For v5, the following limitations are present:
- VFIO support for multiple processes is not well-tested; work is ongoing
  to validate VFIO for all use cases
- There are known problems with PPC64 VFIO code
- For DPAA and FSLMC platforms, performance will be heavily degraded for
  IOVA as PA cases; separate patches are expected to address the issue

For testing, it is recommended to use the GitHub repository [5], as it will
have all of the dependencies already integrated.

Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>

v5:
    - Fixed missing DMA window creation on PPC64 for VFIO
    - fslmc VFIO fixes
    - Added new user DMA map code to keep track of user DMA maps
      when hotplug is in use (also used on PPC64 on remap)
    - A few checkpatch and commit message fixes here and there

v4:
    - Fixed bug in memzone lookup
    - Added draft fslmc VFIO code
    - Rebased on latest master + dependent patchset
    - Documented limitations for *_walk() functions

v3:
    - Lots of compile fixes
    - Fixes for multiprocess synchronization
    - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
    - Fixes for mempool size calculation
    - Added convenience memseg walk() API's
    - Added alloc validation callback

v2: - fixed deadlock at init
    - reverted rte_panic changes at init, this is now handled inside IPC

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
[3] http://dpdk.org/dev/patchwork/patch/34002/
[4] http://dpdk.org/dev/patchwork/patch/24484/
[5] https://github.com/anatolyburakov/dpdk

Anatoly Burakov (70):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  eal: move all locking to heap
  eal: make malloc heap a doubly-linked list
  eal: add function to dump malloc heap contents
  test: add command to dump malloc heap contents
  eal: make malloc_elem_join_adjacent_free public
  eal: make malloc free list remove public
  eal: make malloc free return resulting malloc element
  eal: replace panics with error messages in malloc
  eal: add backend support for contiguous allocation
  eal: enable reserving physically contiguous memzones
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/bnxt: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory
  mempool: add support for the new allocation methods
  eal: add function to walk all memsegs
  bus/fslmc: use memseg walk instead of iteration
  bus/pci: use memseg walk instead of iteration
  net/mlx5: use memseg walk instead of iteration
  eal: use memseg walk instead of iteration
  mempool: use memseg walk instead of iteration
  test: use memseg walk instead of iteration
  vfio/type1: use memseg walk instead of iteration
  vfio/spapr: use memseg walk instead of iteration
  eal: add contig walk function
  virtio: use memseg contig walk instead of iteration
  eal: add iova2virt function
  bus/dpaa: use iova2virt instead of memseg iteration
  bus/fslmc: use iova2virt instead of memseg iteration
  crypto/dpaa_sec: use iova2virt instead of memseg iteration
  eal: add virt2memseg function
  bus/fslmc: use virt2memseg instead of iteration
  crypto/dpaa_sec: use virt2memseg instead of iteration
  net/mlx4: use virt2memseg instead of iteration
  net/mlx5: use virt2memseg instead of iteration
  eal: use memzone walk instead of iteration
  vfio: allow to map other memory regions
  eal: add "legacy memory" option
  eal: add rte_fbarray
  eal: replace memseg with memseg lists
  eal: replace memzone array with fbarray
  eal: add support for mapping hugepages at runtime
  eal: add support for unmapping pages at runtime
  eal: add "single file segments" command-line option
  eal: add API to check if memory is contiguous
  eal: prepare memseg lists for multiprocess sync
  eal: read hugepage counts from node-specific sysfs path
  eal: make use of memory hotplug for init
  eal: share hugepage info primary and secondary
  eal: add secondary process init with memory hotplug
  eal: enable memory hotplug support in rte_malloc
  eal: add support for multiprocess memory hotplug
  eal: add support for callbacks on memory hotplug
  eal: enable callbacks on malloc/free and mp sync
  vfio: enable support for mem event callbacks
  bus/fslmc: move vfio DMA map into bus probe
  bus/fslmc: enable support for mem event callbacks for vfio
  eal: enable non-legacy memory mode
  eal: add memory validator callback
  eal: enable validation before new page allocation
  eal: prevent preallocated pages from being freed

 config/common_base                                |   15 +-
 config/defconfig_i686-native-linuxapp-gcc         |    3 +
 config/defconfig_i686-native-linuxapp-icc         |    3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |    3 +
 config/rte_config.h                               |    7 +-
 doc/guides/rel_notes/deprecation.rst              |    9 -
 drivers/bus/dpaa/rte_dpaa_bus.h                   |   12 +-
 drivers/bus/fslmc/fslmc_bus.c                     |   11 +
 drivers/bus/fslmc/fslmc_vfio.c                    |  195 +++-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   27 +-
 drivers/bus/pci/Makefile                          |    3 +
 drivers/bus/pci/linux/pci.c                       |   28 +-
 drivers/bus/pci/meson.build                       |    3 +
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   30 +-
 drivers/crypto/qat/qat_qp.c                       |   23 +-
 drivers/event/dpaa2/Makefile                      |    3 +
 drivers/mempool/dpaa/Makefile                     |    3 +
 drivers/mempool/dpaa/meson.build                  |    3 +
 drivers/mempool/dpaa2/Makefile                    |    3 +
 drivers/mempool/dpaa2/meson.build                 |    3 +
 drivers/net/avf/avf_ethdev.c                      |    4 +-
 drivers/net/bnx2x/bnx2x.c                         |    2 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/bnxt/bnxt_ethdev.c                    |   17 +-
 drivers/net/bnxt/bnxt_ring.c                      |    9 +-
 drivers/net/bnxt/bnxt_vnic.c                      |    8 +-
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/dpaa/Makefile                         |    3 +
 drivers/net/dpaa2/Makefile                        |    3 +
 drivers/net/dpaa2/dpaa2_ethdev.c                  |    1 -
 drivers/net/dpaa2/meson.build                     |    3 +
 drivers/net/ena/Makefile                          |    3 +
 drivers/net/ena/base/ena_plat_dpdk.h              |    9 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/enic_main.c                      |    9 +-
 drivers/net/i40e/i40e_ethdev.c                    |    4 +-
 drivers/net/i40e/i40e_rxtx.c                      |    4 +-
 drivers/net/mlx4/mlx4_mr.c                        |   18 +-
 drivers/net/mlx5/Makefile                         |    3 +
 drivers/net/mlx5/mlx5.c                           |   25 +-
 drivers/net/mlx5/mlx5_mr.c                        |   19 +-
 drivers/net/qede/base/bcm_osal.c                  |    7 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   83 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    5 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   83 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |   65 +-
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   48 +
 lib/librte_eal/bsdapp/eal/eal_memory.c            |  224 +++-
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 ++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  359 +++++++
 lib/librte_eal/common/eal_common_memory.c         |  824 ++++++++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  235 +++--
 lib/librte_eal/common/eal_common_options.c        |   13 +-
 lib/librte_eal/common/eal_filesystem.h            |   30 +
 lib/librte_eal/common/eal_hugepages.h             |   11 +-
 lib/librte_eal/common/eal_internal_cfg.h          |   12 +-
 lib/librte_eal/common/eal_memalloc.h              |   80 ++
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   28 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  353 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |   10 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |  258 ++++-
 lib/librte_eal/common/include/rte_memzone.h       |   12 +-
 lib/librte_eal/common/include/rte_vfio.h          |   41 +
 lib/librte_eal/common/malloc_elem.c               |  433 ++++++--
 lib/librte_eal/common/malloc_elem.h               |   43 +-
 lib/librte_eal/common/malloc_heap.c               |  704 ++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  744 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   85 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |   62 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  218 +++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1123 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 1119 ++++++++++++--------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  870 ++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   12 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   30 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/Makefile                       |    3 +
 lib/librte_mempool/meson.build                    |    3 +
 lib/librte_mempool/rte_mempool.c                  |  149 ++-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   30 +-
 test/test/test_memory.c                           |   27 +-
 test/test/test_memzone.c                          |   62 +-
 95 files changed, 8794 insertions(+), 1285 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v5 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 02/70] eal: move all locking to heap Anatoly Burakov
                         ` (68 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3: replace uint64_t with size_t for size variables

 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..5b8ced4 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				(size_t)baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..3fed436 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index b412fc1..24e6b50 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1343,7 +1275,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1354,11 +1286,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1368,6 +1295,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		size_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1376,35 +1305,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1469,7 +1389,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1478,8 +1397,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
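
For context, a minimal usage sketch of eal_get_virtual_area() as declared in the hunk above, assuming the EAL-internal header with the prototype is included and <rte_memory.h> provides the RTE_PGSIZE_* constants; the sizes and error handling are illustrative, not taken from the patch:

    /* reserve a hugepage-aligned VA region, shrinking the request if the
     * full size cannot be obtained, and unmap it immediately so pages can
     * later be mapped into that range
     */
    static int
    reserve_va_sketch(void)
    {
            size_t mem_sz = RTE_PGSIZE_1G;   /* request 1 GiB of VA space */
            size_t page_sz = RTE_PGSIZE_2M;  /* align to 2 MiB hugepages */
            void *va;

            va = eal_get_virtual_area(NULL, &mem_sz, page_sz,
                            EAL_VIRTUAL_AREA_ALLOW_SHRINK |
                            EAL_VIRTUAL_AREA_UNMAP, 0);
            if (va == NULL)
                    return -1; /* rte_errno is set by the helper */

            /* mem_sz now holds the size that was actually reserved */
            return 0;
    }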

* [PATCH v5 02/70] eal: move all locking to heap
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
                         ` (67 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating or freeing OS memory, which
would involve growing or shrinking the heap.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 03/70] eal: make malloc heap a doubly-linked list
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (2 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 02/70] eal: move all locking to heap Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
                         ` (66 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to
a doubly linked list and preparing the infrastructure to support it.

Since the heap is now aware of where its first and last elements
are, there is no longer any need for a dummy element at the end of
each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Make first/last element pointers volatile

 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..d43fa90 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *volatile first;
+	struct malloc_elem *volatile last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere inbetween start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
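
To illustrate what the new first/next links enable, here is a hypothetical helper (not part of the patch) that walks one heap and totals its free space; it relies only on the fields introduced in the hunk above and assumes the caller holds heap->lock:

    static size_t
    heap_free_bytes(const struct malloc_heap *heap)
    {
            const struct malloc_elem *elem;
            size_t total = 0;

            /* the list is complete: busy, free and pad elements are all
             * linked, so filter on state while walking
             */
            for (elem = heap->first; elem != NULL; elem = elem->next) {
                    if (elem->state == ELEM_FREE)
                            total += elem->size;
            }
            return total;
    }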

* [PATCH v5 04/70] eal: add function to dump malloc heap contents
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (3 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 05/70] test: add command " Anatoly Burakov
                         ` (65 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3: mark function as experimental

 lib/librte_eal/common/include/rte_malloc.h | 10 ++++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 17 +++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 83 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a9fb7e4 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -13,6 +13,7 @@
 
 #include <stdio.h>
 #include <stddef.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 
 #ifdef __cplusplus
@@ -278,6 +279,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to retrieve data for heap on given socket
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..f11a822 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,23 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int idx;
+
+	for (idx = 0; idx < rte_socket_count(); idx++) {
+		unsigned int socket = rte_socket_id_by_idx(idx);
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dd38783..d9fc458 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -222,6 +222,7 @@ EXPERIMENTAL {
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
+	rte_malloc_dump_heaps;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
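
A minimal application-side usage sketch for the new (experimental) API; the helper name and the choice of stdout are illustrative:

    #include <stdio.h>
    #include <rte_malloc.h>

    /* dump heap layout when an allocation fails, to help diagnose
     * fragmentation (assumes EAL is already initialized)
     */
    static void *
    alloc_or_dump(size_t len)
    {
            void *p = rte_malloc(NULL, len, 0);

            if (p == NULL)
                    rte_malloc_dump_heaps(stdout);
            return p;
    }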

* [PATCH v5 05/70] test: add command to dump malloc heap contents
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (4 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
                         ` (64 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 06/70] eal: make malloc_elem_join_adjacent_free public
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (5 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 05/70] test: add command " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 07/70] eal: make malloc free list remove public Anatoly Burakov
                         ` (63 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to join free segments to determine
whether the resulting contiguous free space is bigger than a
page size, allowing us to free some memory back to the system.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 07/70] eal: make malloc free list remove public
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (6 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
                         ` (62 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We will need to be able to remove entries from a heap's free lists
during certain events, such as rollbacks, or when freeing memory
back to the system (where a previously existing element disappears
and thus can no longer be on the free list).

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 08/70] eal: make malloc free return resulting malloc element
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (7 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 07/70] eal: make malloc free list remove public Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
                         ` (61 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This will be needed because we need to know how big the new empty
space is, in order to check whether we can free some pages as a
result.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: clarified commit message

 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 09/70] eal: replace panics with error messages in malloc
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (8 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
                         ` (60 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We should never panic in system libraries, let alone in one as
central as the EAL, so replace all panic messages with error
messages.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: clarified commit message

 lib/librte_eal/common/rte_malloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index f11a822..2cda48e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -30,7 +30,7 @@ void rte_free(void *addr)
 {
 	if (addr == NULL) return;
 	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
-		rte_panic("Fatal error: Invalid memory\n");
+		RTE_LOG(ERR, EAL, "Error: Invalid memory\n");
 }
 
 /*
@@ -134,8 +134,10 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 		return rte_malloc(NULL, size, align);
 
 	struct malloc_elem *elem = malloc_elem_from_data(ptr);
-	if (elem == NULL)
-		rte_panic("Fatal error: memory corruption detected\n");
+	if (elem == NULL) {
+		RTE_LOG(ERR, EAL, "Error: memory corruption detected\n");
+		return NULL;
+	}
 
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread
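
With the panic gone, rte_realloc() can now return NULL when corruption is detected (in addition to plain allocation failure), so callers are expected to check the return value. A hedged sketch of the calling pattern, with illustrative names:

    static int
    grow_buffer(void **buf, size_t new_len)
    {
            void *p = rte_realloc(*buf, new_len, 0);

            if (p == NULL)
                    return -1; /* allocation failed or corruption detected */

            *buf = p; /* only overwrite the caller's pointer on success */
            return 0;
    }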

* [PATCH v5 10/70] eal: add backend support for contiguous allocation
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (9 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
                         ` (59 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Also, add a function to check a malloc element for physical
contiguousness. For now, assume hugepage memory is always
contiguous, while non-hugepage memory will be checked.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Moved this patch earlier
    - Added physical contiguousness checking function

 lib/librte_eal/common/eal_common_memzone.c |  23 +++---
 lib/librte_eal/common/malloc_elem.c        | 125 ++++++++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h        |   6 +-
 lib/librte_eal/common/malloc_heap.c        |  11 +--
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   7 +-
 6 files changed, 133 insertions(+), 43 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..16a2e7a 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -188,7 +189,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
+			requested_len, flags, align, bound, contig);
 
 	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
 		/* try other heaps */
@@ -197,7 +198,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 				continue;
 
 			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
+					NULL, requested_len, flags, align,
+					bound, contig);
 			if (mz_addr != NULL)
 				break;
 		}
@@ -235,9 +237,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -248,7 +250,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -265,7 +267,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -277,7 +279,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -289,7 +291,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..87695b9 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -6,6 +6,7 @@
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>
+#include <unistd.h>
 #include <sys/queue.h>
 
 #include <rte_memory.h>
@@ -94,33 +95,112 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+		void *start, size_t size)
+{
+	rte_iova_t cur, expected;
+	void *start_page, *end_page, *cur_page;
+	size_t pagesz;
+
+	/* for hugepage memory or IOVA as VA, it's always contiguous */
+	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* otherwise, check if start and end are within the same page */
+	pagesz = getpagesize();
+
+	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
+	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
+
+	if (start_page == end_page)
+		return true;
+
+	/* if they are from different pages, check if they are contiguous */
+
+	/* if we can't access physical addresses, assume non-contiguous */
+	if (!rte_eal_using_phys_addrs())
+		return false;
+
+	/* skip first iteration */
+	cur = rte_mem_virt2iova(start_page);
+	expected = cur + pagesz;
+	cur_page = RTE_PTR_ADD(start_page, pagesz);
+
+	while (cur_page <= end_page) {
+		cur = rte_mem_virt2iova(cur_page);
+		if (cur != expected)
+			return false;
+		cur_page = RTE_PTR_ADD(cur_page, pagesz);
+		expected += pagesz;
+	}
+	return true;
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
+
+	/*
+	 * we're allocating from the end, so adjust the size of element by
+	 * alignment size.
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
+
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+		/* if the new start point is before the exist start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->ms,
+					(void *)new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *)new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +209,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +339,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..34bd268 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
+#include <stdbool.h>
+
 #include <rte_memory.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
@@ -123,7 +125,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +133,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..564b61a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -88,7 +88,7 @@ malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -97,7 +97,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
@@ -121,7 +122,7 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
 		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+		size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
@@ -130,9 +131,9 @@ malloc_heap_alloc(struct malloc_heap *heap,
 
 	rte_spinlock_lock(&heap->lock);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..c57b59a 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+		unsigned int flags, size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 2cda48e..436818a 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket, i;
@@ -60,7 +61,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -71,7 +72,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 		if (ret != NULL)
 			return ret;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 11/70] eal: enable reserving physically contiguous memzones
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (10 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                         ` (58 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a new flag to request that the reserved memzone be
IOVA-contiguous. This is useful for allocating hardware resources like
NIC rings/queues etc. For now, hugepage memory is always contiguous,
but we need to prepare the drivers for the switch.
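
For illustration only (not part of this patch), a driver that needs
DMA-safe memory would request it roughly like this; the zone name,
ring_size and socket_id are hypothetical:

const struct rte_memzone *mz;

mz = rte_memzone_reserve_aligned("example_hw_ring", ring_size,
		socket_id, RTE_MEMZONE_IOVA_CONTIG,
		RTE_CACHE_LINE_SIZE);
if (mz == NULL) {
	/* no IOVA-contiguous memory available, rte_errno is set */
	return -ENOMEM;
}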

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Replaced a new API with a memzone flag
    
    v3:
    - Moved this patch earlier

 lib/librte_eal/common/eal_common_memzone.c  | 25 +++++++++++++++++--------
 lib/librte_eal/common/include/rte_memzone.h | 11 +++++++++++
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 16a2e7a..af68c00 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -99,12 +99,13 @@ find_heap_max_free_elem(int *s, unsigned align)
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned int flags, unsigned int align,
-		unsigned int bound, bool contig)
+		unsigned int bound)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
+	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -170,7 +171,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (!rte_eal_has_hugepages())
 		socket_id = SOCKET_ID_ANY;
 
+	contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0;
+	/* malloc only cares about size flags, remove contig flag from flags */
+	flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -238,8 +249,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 static const struct rte_memzone *
 rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
-		unsigned int flags, unsigned int align, unsigned int bound,
-		bool contig)
+		unsigned int flags, unsigned int align, unsigned int bound)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -250,7 +260,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound, contig);
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -267,7 +277,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound, false);
+					       align, bound);
 }
 
 /*
@@ -279,7 +289,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0, false);
+					       align, 0);
 }
 
 /*
@@ -291,8 +301,7 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0,
-					       false);
+					       flags, RTE_CACHE_LINE_SIZE, 0);
 }
 
 int
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..e2630fd 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -23,6 +23,7 @@
  */
 
 #include <stdio.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 #include <rte_common.h>
 
@@ -39,6 +40,7 @@ extern "C" {
 #define RTE_MEMZONE_512MB          0x00040000   /**< Use 512MB pages. */
 #define RTE_MEMZONE_4GB            0x00080000   /**< Use 4GB pages. */
 #define RTE_MEMZONE_SIZE_HINT_ONLY 0x00000004   /**< Use available page size */
+#define RTE_MEMZONE_IOVA_CONTIG    0x00100000   /**< Ask for IOVA-contiguous memzone. */
 
 /**
  * A structure describing a memzone, which is a contiguous portion of
@@ -102,6 +104,9 @@ struct rte_memzone {
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @return
  *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
  *   on error.
@@ -152,6 +157,9 @@ const struct rte_memzone *rte_memzone_reserve(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @return
@@ -207,6 +215,9 @@ const struct rte_memzone *rte_memzone_reserve_aligned(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @param bound
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 12/70] ethdev: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (11 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 13/70] crypto/qat: " Anatoly Burakov
                         ` (57 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4: replaced use of new API with additional memzone flag
    
    v3: moved this patch earlier in the patchset

 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2c74f7e..d0cf0e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3403,7 +3403,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 13/70] crypto/qat: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (12 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 14/70] net/avf: " Anatoly Burakov
                         ` (56 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Also, remove the weird page alignment code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Replace new API with new memzone flag
    - Removed experimental API from build files
    
    v3:
    - Move the patch earlier in the patchset
    - Fix build system files to allow experimental API's
    - Removed non-sensical memzone flags code

 drivers/crypto/qat/qat_qp.c | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..478b7ba 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -54,8 +54,6 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 			int socket_id)
 {
 	const struct rte_memzone *mz;
-	unsigned memzone_flags = 0;
-	const struct rte_memseg *ms;
 
 	PMD_INIT_FUNC_TRACE();
 	mz = rte_memzone_lookup(queue_name);
@@ -78,25 +76,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 
 	PMD_DRV_LOG(DEBUG, "Allocate memzone for %s, size %u on socket %u",
 					queue_name, queue_size, socket_id);
-	ms = rte_eal_get_physmem_layout();
-	switch (ms[0].hugepage_sz) {
-	case(RTE_PGSIZE_2M):
-		memzone_flags = RTE_MEMZONE_2MB;
-	break;
-	case(RTE_PGSIZE_1G):
-		memzone_flags = RTE_MEMZONE_1GB;
-	break;
-	case(RTE_PGSIZE_16M):
-		memzone_flags = RTE_MEMZONE_16MB;
-	break;
-	case(RTE_PGSIZE_16G):
-		memzone_flags = RTE_MEMZONE_16GB;
-	break;
-	default:
-		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
-	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned(queue_name, queue_size,
+		socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 14/70] net/avf: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (13 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 13/70] crypto/qat: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 15/70] net/bnx2x: " Anatoly Burakov
                         ` (55 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/avf/avf_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4442c3c..68a59b4 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,8 +1365,8 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 15/70] net/bnx2x: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (14 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 14/70] net/avf: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-11  9:12         ` Thomas Monjalon
  2018-04-09 18:00       ` [PATCH v5 16/70] net/bnxt: " Anatoly Burakov
                         ` (54 subsequent siblings)
  70 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/bnx2x/bnx2x.c      | 4 ++--
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..81f5dae 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,7 +177,7 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
-					0, align);
+					RTE_MEMZONE_IOVA_CONTIG, align);
 	if (z == NULL) {
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..6be7277 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 16/70] net/bnxt: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (15 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 15/70] net/bnx2x: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 17/70] net/cxgbe: " Anatoly Burakov
                         ` (53 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Ajit Khaparde, Somnath Kotur, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Added this driver to the list of modified drivers
    - Add experimental API to build files
    
    All of the memzone reserve calls in this driver subsequently check
    physical addresses, so they appear to be reserving DMA memory.
    Corrections welcome.

 drivers/net/bnxt/bnxt_ethdev.c | 17 ++++++++++-------
 drivers/net/bnxt/bnxt_ring.c   |  9 +++++----
 drivers/net/bnxt/bnxt_vnic.c   |  8 ++++----
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 0b21653..ad7d925 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3147,9 +3147,10 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 				sizeof(struct rx_port_stats) + 512);
 		if (!mz) {
 			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
@@ -3181,10 +3182,12 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 		total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
 				sizeof(struct tx_port_stats) + 512);
 		if (!mz) {
-			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+			mz = rte_memzone_reserve(mz_name,
+					total_alloc_len,
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 8fb8972..0e8a6a2 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -166,10 +166,11 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve_aligned(mz_name, total_alloc_len,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY,
-					 getpagesize());
+				SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG,
+				getpagesize());
 		if (mz == NULL)
 			return -ENOMEM;
 	}
diff --git a/drivers/net/bnxt/bnxt_vnic.c b/drivers/net/bnxt/bnxt_vnic.c
index d4aeb4c..9ccc67e 100644
--- a/drivers/net/bnxt/bnxt_vnic.c
+++ b/drivers/net/bnxt/bnxt_vnic.c
@@ -185,10 +185,10 @@ int bnxt_alloc_vnic_attributes(struct bnxt *bp)
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve(mz_name,
-					 entry_length * max_vnics,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY);
+				entry_length * max_vnics, SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG);
 		if (!mz)
 			return -ENOMEM;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 17/70] net/cxgbe: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (16 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 16/70] net/bnxt: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 18/70] net/ena: " Anatoly Burakov
                         ` (52 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/cxgbe/sge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 83e26d0..85846fc 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1344,7 +1344,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned(z_name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, 4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 18/70] net/ena: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (17 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 17/70] net/cxgbe: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 19/70] net/enic: " Anatoly Burakov
                         ` (51 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/ena/base/ena_plat_dpdk.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..9334519 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY,	\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +221,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 19/70] net/enic: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (18 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 18/70] net/ena: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 20/70] net/i40e: " Anatoly Burakov
                         ` (50 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: John Daley <johndale@cisco.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/enic/enic_main.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 69ad425..94e8e68 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -343,8 +343,8 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
-					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
+	rz = rte_memzone_reserve_aligned((const char *)name, size,
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
 			__func__, name);
@@ -888,9 +888,8 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		instance++);
 
 	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
-						   sizeof(uint32_t),
-						   SOCKET_ID_ANY, 0,
-						   ENIC_ALIGN);
+			sizeof(uint32_t), SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!wq->cqmsg_rz)
 		return -ENOMEM;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 20/70] net/i40e: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (19 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 19/70] net/enic: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 21/70] net/qede: " Anatoly Burakov
                         ` (49 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/i40e/i40e_ethdev.c | 4 ++--
 drivers/net/i40e/i40e_rxtx.c   | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index d0bf4e3..e00f402 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4053,8 +4053,8 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
 
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..56a854c 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,8 +2189,8 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
-					 socket_id, 0, I40E_RING_BASE_ALIGN);
+	mz = rte_memzone_reserve_aligned(name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, I40E_RING_BASE_ALIGN);
 	return mz;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 21/70] net/qede: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (20 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 20/70] net/i40e: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 22/70] net/virtio: " Anatoly Burakov
                         ` (48 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harish Patil <harish.patil@cavium.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved the patch earlier in the patchset

 drivers/net/qede/base/bcm_osal.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index 91017b8..f550412 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,8 +135,8 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
-					 socket_id, 0, RTE_CACHE_LINE_SIZE);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 22/70] net/virtio: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (21 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 21/70] net/qede: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 23/70] net/vmxnet3: " Anatoly Burakov
                         ` (47 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    
    v3:
    - Moved patch earlier in the patchset

 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 2ef213d..f03d790 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -391,8 +391,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		     size, vq->vq_ring_size);
 
 	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
-					 SOCKET_ID_ANY,
-					 0, VIRTIO_PCI_VRING_ALIGN);
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+			VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
 		if (rte_errno == EEXIST)
 			mz = rte_memzone_lookup(vq_name);
@@ -417,8 +417,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
 		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+				SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 23/70] net/vmxnet3: use contiguous allocation for DMA memory
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (22 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 22/70] net/virtio: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 24/70] mempool: add support for the new allocation methods Anatoly Burakov
                         ` (46 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Shrikrishna Khare, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Use new memzone flag instead of new API
    - Remove experimental API from build files
    
    v3:
    - Add experimental API to build files
    - Moved patch earlier in the patchset

 drivers/net/vmxnet3/vmxnet3_ethdev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4260087..104664a 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -150,13 +150,14 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 		if (mz)
 			rte_memzone_free(mz);
 		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+				RTE_MEMZONE_IOVA_CONTIG, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 24/70] mempool: add support for the new allocation methods
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (23 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 23/70] net/vmxnet3: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 25/70] eal: add function to walk all memsegs Anatoly Burakov
                         ` (45 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

If a user has specified that the zone should have contiguous memory,
request an IOVA-contiguous memzone (via the new memzone flag).
Otherwise, account for the fact that unless we're in IOVA_AS_VA
mode, we cannot guarantee that the pages would be physically
contiguous, so we calculate the memzone size and alignments as if
we were getting the smallest page size available.

However, for the non-IOVA-contiguous case, the existing mempool size
calculation function doesn't give the expected results, because it
returns memzone sizes aligned to page size (e.g. a 1MB mempool may
use an entire 1GB page). Therefore, in cases where we weren't
specifically asked to reserve non-contiguous memory, first try
reserving the memzone as IOVA-contiguous, and if that fails, retry
the reservation with page-aligned size/alignment.
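
Condensed sketch of the reservation logic added below (variable names
are the ones used in this patch; the final "biggest zone" retry is
omitted for brevity):

/* first, try a single IOVA-contiguous reservation */
size = rte_mempool_xmem_size(n, total_elt_sz, 0, mp->flags);
mz = rte_memzone_reserve_aligned(mz_name, size, mp->socket_id,
		mz_flags | RTE_MEMZONE_IOVA_CONTIG,
		RTE_CACHE_LINE_SIZE);
if (mz == NULL) {
	/* fall back to page-by-page population */
	size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
			mp->flags);
	mz = rte_memzone_reserve_aligned(mz_name, size,
			mp->socket_id, mz_flags, pg_sz);
}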

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Removed accidental whitespace changes from patch
    
    v3:
    - Fixed mempool size calculation
    - Fixed handling of contiguous memzones
    - Moved earlier in the patchset

 lib/librte_mempool/rte_mempool.c | 148 +++++++++++++++++++++++++++++++++------
 1 file changed, 127 insertions(+), 21 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..4660cc2 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2016 6WIND S.A.
  */
 
+#include <stdbool.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdint.h>
@@ -98,6 +99,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		if (ms->hugepage_sz < min_pagesz)
+			min_pagesz = ms->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -367,16 +389,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	/* update mempool capabilities */
 	mp->flags |= mp_capa_flags;
 
-	/* Detect pool area has sufficient space for elements */
-	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
-		if (len < total_elt_sz * mp->size) {
-			RTE_LOG(ERR, MEMPOOL,
-				"pool area %" PRIx64 " not enough\n",
-				(uint64_t)len);
-			return -ENOSPC;
-		}
-	}
-
 	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
 	if (memhdr == NULL)
 		return -ENOMEM;
@@ -549,6 +561,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig, try_contig, no_pageshift;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,9 +576,68 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * the following section calculates page shift and page size values.
+	 *
+	 * these values impact the result of rte_mempool_xmem_size(), which
+	 * returns the amount of memory that should be allocated to store the
+	 * desired number of objects. when not zero, it allocates more memory
+	 * for the padding between objects, to ensure that an object does not
+	 * cross a page boundary. in other words, page size/shift are to be set
+	 * to zero if mempool elements won't care about page boundaries.
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter as well.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 *
+	 * However, since size calculation will produce page-aligned sizes, it
+	 * makes sense to first try and see if we can reserve the entire memzone
+	 * in one contiguous chunk as well (otherwise we might end up wasting a
+	 * 1G page on a 10MB memzone). If we fail to get enough contiguous
+	 * memory, then we'll go and reserve space page-by-page.
+	 */
+	no_pageshift = no_contig || force_contig ||
+			rte_eal_iova_mode() == RTE_IOVA_VA;
+	try_contig = !no_contig && !no_pageshift && rte_eal_has_hugepages();
+	if (force_contig)
+		mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+	if (no_pageshift) {
 		pg_sz = 0;
+		pg_shift = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else if (try_contig) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
+		/* we're trying to reserve contiguous memzone first, so try
+		 * align to cache line; if we fail to reserve a contiguous
+		 * memzone, we'll adjust alignment to equal pagesize later.
+		 */
 		align = RTE_CACHE_LINE_SIZE;
 	} else {
 		pg_sz = getpagesize();
@@ -575,8 +647,13 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
-						mp->flags);
+		unsigned int flags;
+		if (try_contig || no_pageshift)
+			size = rte_mempool_xmem_size(n, total_elt_sz, 0,
+				mp->flags);
+		else
+			size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
+				mp->flags);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -585,23 +662,52 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
+		flags = mz_flags;
+
+		/* if we're trying to reserve contiguous memory, add appropriate
+		 * memzone flag.
+		 */
+		if (try_contig)
+			flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+		mz = rte_memzone_reserve_aligned(mz_name, size, mp->socket_id,
+				flags, align);
+
+		/* if we were trying to allocate contiguous memory, adjust
+		 * memzone size and page size to fit smaller page sizes, and
+		 * try again.
+		 */
+		if (mz == NULL && try_contig) {
+			try_contig = false;
+			flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+			align = pg_sz;
+			size = rte_mempool_xmem_size(n, total_elt_sz,
+				pg_shift, mp->flags);
+
+			mz = rte_memzone_reserve_aligned(mz_name, size,
+				mp->socket_id, flags, align);
+		}
+		/* don't try reserving with 0 size if we were asked to reserve
+		 * IOVA-contiguous memory.
+		 */
+		if (!force_contig && mz == NULL) {
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
 			mz = rte_memzone_reserve_aligned(mz_name, 0,
-				mp->socket_id, mz_flags, align);
+					mp->socket_id, flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (no_pageshift || try_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 25/70] eal: add function to walk all memsegs
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (24 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 24/70] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
                         ` (44 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For code that might need to iterate over list of allocated
segments, using this API will make it more resilient to
internal API changes and will prevent copying the same
iteration code over and over again.

Additionally, down the line there will be locking implemented,
so users of this API will not need to care about locking
either.
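
For example (illustrative only, not part of this patch), counting the
allocated segments and their total length becomes a short callback;
'struct walk_arg' and its fields are hypothetical:

struct walk_arg {
	int seg_count;
	uint64_t total_len;
};

static int
count_seg(const struct rte_memseg *ms, void *arg)
{
	struct walk_arg *wa = arg;

	wa->seg_count++;
	wa->total_len += ms->len;
	return 0; /* 0 continues the walk, 1 would stop it */
}

/* later, e.g. in device init code: */
struct walk_arg wa = { 0, 0 };
int ret = rte_memseg_walk(count_seg, &wa);
/* ret == 0 means the entire segment list was walked */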

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 21 +++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 25 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 47 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 5b8ced4..947db1f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -218,6 +218,27 @@ rte_mem_lock_page(const void *virt)
 	return mlock((void *)aligned, page_size);
 }
 
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		ret = func(ms, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..93eadaa 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -20,6 +20,7 @@ extern "C" {
 #endif
 
 #include <rte_common.h>
+#include <rte_compat.h>
 #include <rte_config.h>
 
 __extension__
@@ -130,6 +131,30 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Memseg walk function prototype.
+ *
+ * Returning 0 will continue the walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report an error
+ */
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+
+/**
+ * Walk list of all memsegs.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d9fc458..716b965 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 26/70] bus/fslmc: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (25 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 25/70] eal: add function to walk all memsegs Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 27/70] bus/pci: " Anatoly Burakov
                         ` (43 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 78 ++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 4291871..0c048dc 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -189,17 +189,51 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 {
-	int ret;
+	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
 		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
 	};
+	int ret;
+
+	dma_map.size = ms->len;
+	dma_map.vaddr = ms->addr_64;
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	dma_map.iova = ms->iova;
+#else
+	dma_map.iova = dma_map.vaddr;
+#endif
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
+		return -1;
+	}
+
+	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
+			dma_map.vaddr);
+	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
+			&dma_map);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
+				errno);
+		return -1;
+	}
+	(*n_segs)++;
+	return 0;
+}
 
-	int i;
+int rte_fslmc_vfio_dmamap(void)
+{
 	const struct rte_memseg *memseg;
+	int i = 0;
 
 	if (is_dma_done)
 		return 0;
@@ -210,51 +244,21 @@ int rte_fslmc_vfio_dmamap(void)
 		return -ENODEV;
 	}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL && memseg[i].len == 0) {
-			DPAA2_BUS_DEBUG("Total %d segments found", i);
-			break;
-		}
-
-		dma_map.size = memseg[i].len;
-		dma_map.vaddr = memseg[i].addr_64;
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-		dma_map.iova = memseg[i].iova;
-#else
-		dma_map.iova = dma_map.vaddr;
-#endif
-
-		/* SET DMA MAP for IOMMU */
-		group = &vfio_group;
-
-		if (!group->container) {
-			DPAA2_BUS_ERR("Container is not connected");
-			return -1;
-		}
-
-		DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-				dma_map.vaddr);
-		DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-		ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			    &dma_map);
-		if (ret) {
-			DPAA2_BUS_ERR("Unable to map DMA address (errno = %d)",
-				      errno);
-			return ret;
-		}
-	}
+	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+		return -1;
 
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
 		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
 		return -1;
 	}
+	DPAA2_BUS_DEBUG("Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
 	 * the interrupt region to SMMU. This should be removed once the
 	 * support is added in the Kernel.
 	 */
-	vfio_map_irq_region(group);
+	vfio_map_irq_region(&vfio_group);
 
 	is_dma_done = 1;
 
-- 
2.7.4


* [PATCH v5 27/70] bus/pci: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (26 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 28/70] net/mlx5: " Anatoly Burakov
                         ` (42 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/pci/Makefile    |  3 +++
 drivers/bus/pci/linux/pci.c | 26 ++++++++++++++------------
 drivers/bus/pci/meson.build |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/pci/Makefile b/drivers/bus/pci/Makefile
index f3df1c4..804a198 100644
--- a/drivers/bus/pci/Makefile
+++ b/drivers/bus/pci/Makefile
@@ -49,6 +49,9 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/pci/$(SYSTEM)
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/$(SYSTEM)app/eal
 
+# memseg walk is not part of stable API yet
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_pci
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..6dda054 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -116,22 +116,24 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 	}
 }
 
-void *
-pci_find_max_end_va(void)
+static int
+find_max_end_va(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	void **max_va = arg;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	if (*max_va < end_va)
+		*max_va = end_va;
+	return 0;
+}
 
-		if (seg->addr > last->addr)
-			last = seg;
+void *
+pci_find_max_end_va(void)
+{
+	void *va = NULL;
 
-	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	rte_memseg_walk(find_max_end_va, &va);
+	return va;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/bus/pci/meson.build b/drivers/bus/pci/meson.build
index 12756a4..72939e5 100644
--- a/drivers/bus/pci/meson.build
+++ b/drivers/bus/pci/meson.build
@@ -14,3 +14,6 @@ else
 	sources += files('bsd/pci.c')
 	includes += include_directories('bsd')
 endif
+
+# memseg walk is not part of stable API yet
+allow_experimental_apis = true
-- 
2.7.4


* [PATCH v5 28/70] net/mlx5: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (27 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 27/70] bus/pci: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 29/70] eal: " Anatoly Burakov
                         ` (41 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/net/mlx5/Makefile |  3 +++
 drivers/net/mlx5/mlx5.c   | 24 +++++++++++++++---------
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index afda411..25c8e10 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -92,6 +92,9 @@ CFLAGS += -Wno-error=cast-qual
 EXPORT_MAP := rte_pmd_mlx5_version.map
 LIBABIVER := 1
 
+# memseg walk is not part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # DEBUG which is usually provided on the command-line may enable
 # CONFIG_RTE_LIBRTE_MLX5_DEBUG.
 ifeq ($(DEBUG),1)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 7d58d66..1724b65 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -477,6 +477,19 @@ static struct rte_pci_driver mlx5_driver;
  */
 static void *uar_base;
 
+static int
+find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+{
+	void **addr = arg;
+
+	if (*addr == NULL)
+		*addr = ms->addr;
+	else
+		*addr = RTE_MIN(*addr, ms->addr);
+
+	return 0;
+}
+
 /**
  * Reserve UAR address space for primary process.
  *
@@ -491,21 +504,14 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 	void *addr = (void *)0;
-	int i;
-	const struct rte_mem_config *mcfg;
 
 	if (uar_base) { /* UAR address space mapped. */
 		priv->uar_base = uar_base;
 		return 0;
 	}
 	/* find out lower bound of hugepage segments */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
-		if (addr)
-			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
-		else
-			addr = mcfg->memseg[i].addr;
-	}
+	rte_memseg_walk(find_lower_va_bound, &addr);
+
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
 	/* anonymous mmap, no real memory consumption. */
-- 
2.7.4


* [PATCH v5 29/70] eal: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (28 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 28/70] net/mlx5: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 30/70] mempool: " Anatoly Burakov
                         ` (40 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
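
A stripped-down sketch of the early-stop idiom used below (names are
illustrative; a callback returning 1 stops the walk and makes
rte_memseg_walk() itself return 1):

#include <rte_memory.h>

static int
seg_on_socket(const struct rte_memseg *ms, void *arg)
{
	const int *socket_id = arg;

	return ms->socket_id == *socket_id;	/* 1 stops the walk early */
}

/* returns 1 if any segment lives on the given socket, 0 otherwise */
static int
has_mem_on_socket(int socket_id)
{
	return rte_memseg_walk(seg_on_socket, &socket_id) == 1;
}
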
 lib/librte_eal/bsdapp/eal/eal.c           | 25 +++++++-----
 lib/librte_eal/common/eal_common_memory.c | 67 ++++++++++++++++---------------
 lib/librte_eal/common/malloc_heap.c       | 33 +++++++++------
 lib/librte_eal/linuxapp/eal/eal.c         | 22 +++++-----
 4 files changed, 81 insertions(+), 66 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..8e25d78 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -429,23 +429,26 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_socket(const struct rte_memseg *ms, void *arg)
+{
+	int *socket_id = arg;
+
+	if (ms->socket_id == *socket_id)
+		return 1;
+
+	return 0;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 947db1f..4f588c7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,54 +131,57 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+static int
+physmem_size(const struct rte_memseg *ms, void *arg)
+{
+	uint64_t *total_len = arg;
+
+	*total_len += ms->len;
+
+	return 0;
+}
 
 /* get the total size of memory */
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
 	uint64_t total_len = 0;
 
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
+	rte_memseg_walk(physmem_size, &total_len);
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
+	return total_len;
+}
 
-		total_len += mcfg->memseg[i].len;
-	}
+static int
+dump_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i = ms - mcfg->memseg;
+	FILE *f = arg;
 
-	return total_len;
+	if (i < 0 || i >= RTE_MAX_MEMSEG)
+		return -1;
+
+	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+			"virt:%p, socket_id:%"PRId32", "
+			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			"nrank:%"PRIx32"\n", i,
+			mcfg->memseg[i].iova,
+			mcfg->memseg[i].len,
+			mcfg->memseg[i].addr,
+			mcfg->memseg[i].socket_id,
+			mcfg->memseg[i].hugepage_sz,
+			mcfg->memseg[i].nchannel,
+			mcfg->memseg[i].nrank);
+
+	return 0;
 }
 
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
-	}
+	rte_memseg_walk(dump_memseg, f);
 }
 
 /* return the number of memory channels */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 564b61a..79914fc 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -67,17 +67,32 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static int
+malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_elem *start_elem;
+	struct rte_memseg *found_ms;
+	struct malloc_heap *heap;
+	size_t elem_size;
+	int ms_idx;
+
+	heap = &mcfg->malloc_heaps[ms->socket_id];
+
+	/* ms is const, so find it */
+	ms_idx = ms - mcfg->memseg;
+	found_ms = &mcfg->memseg[ms_idx];
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
+	start_elem = (struct malloc_elem *)found_ms->addr;
+	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+
+	malloc_elem_init(start_elem, heap, found_ms, elem_size);
 	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
+
+	return 0;
 }
 
 /*
@@ -244,17 +259,11 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
-	}
+	rte_memseg_walk(malloc_heap_add_memseg, NULL);
 
 	return 0;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..77f6cb7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -638,23 +638,23 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_mem(const struct rte_memseg *ms, void *arg)
+{
+	int *socket = arg;
+
+	return ms->socket_id == *socket;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
-- 
2.7.4


* [PATCH v5 30/70] mempool: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (29 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 29/70] eal: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 31/70] test: " Anatoly Burakov
                         ` (39 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_mempool/Makefile      |  3 +++
 lib/librte_mempool/meson.build   |  3 +++
 lib/librte_mempool/rte_mempool.c | 24 ++++++++++++------------
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 24e735a..1f85d34 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -13,6 +13,9 @@ EXPORT_MAP := rte_mempool_version.map
 
 LIBABIVER := 3
 
+# memseg walk is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index 712720f..89506c5 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -5,3 +5,6 @@ version = 3
 sources = files('rte_mempool.c', 'rte_mempool_ops.c')
 headers = files('rte_mempool.h')
 deps += ['ring']
+
+# memseg walk is not yet part of stable API
+allow_experimental_apis = true
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4660cc2..9731d4c 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -99,23 +99,23 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static int
+find_min_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	size_t *min = arg;
+
+	if (ms->hugepage_sz < *min)
+		*min = ms->hugepage_sz;
+
+	return 0;
+}
+
 static size_t
 get_min_page_size(void)
 {
-	const struct rte_mem_config *mcfg =
-			rte_eal_get_configuration()->mem_config;
-	int i;
 	size_t min_pagesz = SIZE_MAX;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-
-		if (ms->addr == NULL)
-			continue;
-
-		if (ms->hugepage_sz < min_pagesz)
-			min_pagesz = ms->hugepage_sz;
-	}
+	rte_memseg_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
-- 
2.7.4


* [PATCH v5 31/70] test: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (30 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 30/70] mempool: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 32/70] vfio/type1: " Anatoly Burakov
                         ` (38 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 test/test/test_malloc.c  | 40 +++++++++++++++++++++++-------------
 test/test/test_memory.c  | 23 +++++++++++----------
 test/test/test_memzone.c | 53 ++++++++++++++++++++++++++++++++----------------
 3 files changed, 74 insertions(+), 42 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index ccc5fea..28c241f 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -705,16 +705,34 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
+static int
+check_socket_mem(const struct rte_memseg *ms, void *arg)
+{
+	int32_t *socket = arg;
+
+	return *socket == ms->socket_id;
+}
+
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
+	return rte_memseg_walk(check_socket_mem, &socket);
+}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
-			return 1;
+struct walk_param {
+	void *addr;
+	int32_t socket;
+};
+static int
+find_socket(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_param *param = arg;
+
+	if (param->addr >= ms->addr &&
+			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
+		param->socket = ms->socket_id;
+		return 1;
 	}
 	return 0;
 }
@@ -726,15 +744,9 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
+	struct walk_param param = {.addr = addr, .socket = 0};
+	if (rte_memseg_walk(find_socket, &param) > 0)
+		return param.socket;
 	return -1;
 }
 
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..c9b287c 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -23,12 +23,20 @@
  */
 
 static int
+check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+{
+	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
+	size_t i;
+
+	for (i = 0; i < ms->len; i++, mem++)
+		*mem;
+	return 0;
+}
+
+static int
 test_memory(void)
 {
 	uint64_t s;
-	unsigned i;
-	size_t j;
-	const struct rte_memseg *mem;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -45,14 +53,7 @@ test_memory(void)
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
-
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
-		}
-	}
+	rte_memseg_walk(check_mem, NULL);
 
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..cbf0cfa 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -104,28 +104,47 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void)
 	return 0;
 }
 
+struct walk_arg {
+	int hugepage_2MB_avail;
+	int hugepage_1GB_avail;
+	int hugepage_16MB_avail;
+	int hugepage_16GB_avail;
+};
+static int
+find_available_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_arg *wa = arg;
+
+	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+		wa->hugepage_2MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+		wa->hugepage_1GB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+		wa->hugepage_16MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+		wa->hugepage_16GB_avail = 1;
+
+	return 0;
+}
+
 static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
-	int hugepage_2MB_avail = 0;
-	int hugepage_1GB_avail = 0;
-	int hugepage_16MB_avail = 0;
-	int hugepage_16GB_avail = 0;
+	struct walk_arg wa;
+	int hugepage_2MB_avail, hugepage_1GB_avail;
+	int hugepage_16MB_avail, hugepage_16GB_avail;
 	const size_t size = 100;
-	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
-			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
-			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
-			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
-			hugepage_16GB_avail = 1;
-	}
+
+	memset(&wa, 0, sizeof(wa));
+
+	rte_memseg_walk(find_available_pagesz, &wa);
+
+	hugepage_2MB_avail = wa.hugepage_2MB_avail;
+	hugepage_1GB_avail = wa.hugepage_1GB_avail;
+	hugepage_16MB_avail = wa.hugepage_16MB_avail;
+	hugepage_16GB_avail = wa.hugepage_16GB_avail;
+
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
 	if (hugepage_2MB_avail)
 		printf("2MB Huge pages available\n");
-- 
2.7.4


* [PATCH v5 32/70] vfio/type1: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (31 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 31/70] test: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 33/70] vfio/spapr: " Anatoly Burakov
                         ` (37 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 45 ++++++++++++++++------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2421d51..2a34ae9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -665,39 +665,36 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-vfio_type1_dma_map(int vfio_container_fd)
+type1_map(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	int *vfio_container_fd = arg;
+	struct vfio_iommu_type1_dma_map dma_map;
+	int ret;
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
-		if (ms[i].addr == NULL)
-			break;
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
-		}
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
 	}
-
 	return 0;
 }
 
 static int
+vfio_type1_dma_map(int vfio_container_fd)
+{
+	return rte_memseg_walk(type1_map, &vfio_container_fd);
+}
+
+static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-- 
2.7.4


* [PATCH v5 33/70] vfio/spapr: use memseg walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (32 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 32/70] vfio/type1: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 34/70] eal: add contig walk function Anatoly Burakov
                         ` (36 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Add missing window creation
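
    A small worked example of the window sizing done in this patch (the
    numbers are made up for illustration; rte_align64pow2() comes from
    rte_common.h):

#include <stdint.h>

#include <rte_common.h>

/* e.g. highest segment ends at IOVA 5 GB and pages are 1 GB:
 * sPAPR wants a power-of-two window, so 5 GB rounds up to 8 GB,
 * and page_shift is derived from the hugepage size (30 for 1 GB).
 */
static void
spapr_window_example(void)
{
	uint64_t max_iova_end = 5ULL << 30;	/* max(ms->iova + ms->len) */
	uint64_t hugepage_sz = 1ULL << 30;

	uint64_t window_size = rte_align64pow2(max_iova_end);	/* 8 GB */
	unsigned int page_shift = __builtin_ctzll(hugepage_sz);	/* 30 */

	(void)window_size;
	(void)page_shift;
}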

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 113 ++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 45 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2a34ae9..e18e413 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -694,16 +694,69 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+struct spapr_walk_param {
+	uint64_t window_size;
+	uint64_t hugepage_sz;
+};
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+spapr_window_size(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	struct spapr_walk_param *param = arg;
+	uint64_t max = ms->iova + ms->len;
+
+	if (max > param->window_size) {
+		param->hugepage_sz = ms->hugepage_sz;
+		param->window_size = max;
+	}
 
+	return 0;
+}
+
+static int
+spapr_map(const struct rte_memseg *ms, void *arg)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
 	};
+	int *vfio_container_fd = arg;
+	int ret;
+
+	reg.vaddr = (uintptr_t) ms->addr;
+	reg.size = ms->len;
+	ret = ioctl(*vfio_container_fd,
+		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+			 VFIO_DMA_MAP_FLAG_WRITE;
+
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct spapr_walk_param param;
+	int ret;
 	struct vfio_iommu_spapr_tce_info info = {
 		.argsz = sizeof(info),
 	};
@@ -714,6 +767,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		.argsz = sizeof(remove),
 	};
 
+	memset(&param, 0, sizeof(param));
+
 	/* query spapr iommu info */
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
@@ -732,17 +787,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
-	}
+	rte_memseg_walk(spapr_window_size, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
-	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -757,41 +806,15 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		return -1;
 	}
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
-
-		if (ms[i].addr == NULL)
-			break;
-
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
-
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
+	if (vfio_spapr_create_new_dma_window(vfio_container_fd, &create) < 0) {
+		RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+		return -1;
 	}
 
+	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
+	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+		return -1;
+
 	return 0;
 }
 
-- 
2.7.4


* [PATCH v5 34/70] eal: add contig walk function
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (33 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 33/70] vfio/spapr: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
                         ` (35 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This function is meant to walk over the first segment of each
VA-contiguous group of memsegs.

For future users of this function, this is done so that there is
less dependency on internals of the mem API and less noise in
later change sets.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
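
A minimal sketch of how a caller might use the contig walk added here, e.g.
to find the largest VA-contiguous chunk (helper names are illustrative, not
part of the patch):

#include <stddef.h>

#include <rte_memory.h>

/* callback gets the first memseg of each VA-contiguous group plus the
 * total length of that group
 */
static int
track_largest(const struct rte_memseg *ms, size_t len, void *arg)
{
	size_t *largest = arg;

	(void)ms;
	if (len > *largest)
		*largest = len;
	return 0;
}

static size_t
largest_contig_va_chunk(void)
{
	size_t largest = 0;

	rte_memseg_contig_walk(track_largest, &largest);
	return largest;
}
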
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 27 ++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 65 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4f588c7..4b528b0 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -242,6 +242,43 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	return 0;
 }
 
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, j, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+		size_t total_len;
+		void *end_addr;
+
+		if (ms->addr == NULL)
+			continue;
+
+		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+
+		/* check how many more segments are contiguous to this one */
+		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
+			const struct rte_memseg *next = &mcfg->memseg[j];
+
+			if (next->addr != end_addr)
+				break;
+
+			end_addr = RTE_PTR_ADD(next->addr, next->len);
+			i++;
+		}
+		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+
+		ret = func(ms, total_len, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 93eadaa..45d067f 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -140,6 +140,18 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
 
 /**
+ * Memseg contig walk function prototype. This will trigger a callback on every
+ * VA-contiguous area starting at memseg ``ms``, so total valid VA space at each
+ * callback call will be [``ms->addr``, ``ms->addr + len``).
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
+		size_t len, void *arg);
+
+/**
  * Walk list of all memsegs.
  *
  * @param func
@@ -155,6 +167,21 @@ int __rte_experimental
 rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 
 /**
+ * Walk each VA-contiguous area.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 716b965..93033b5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
-- 
2.7.4


* [PATCH v5 35/70] virtio: use memseg contig walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (34 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 34/70] eal: add contig walk function Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 36/70] eal: add iova2virt function Anatoly Burakov
                         ` (34 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/net/virtio/virtio_user/vhost_kernel.c | 83 +++++++++++----------------
 1 file changed, 35 insertions(+), 48 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 1711ead..93d7efe 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,32 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+struct walk_arg {
+	struct vhost_memory_kernel *vm;
+	uint32_t region_nr;
+};
+static int
+add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct walk_arg *wa = arg;
+	struct vhost_memory_region *mr;
+	void *start_addr;
+
+	if (wa->region_nr >= max_regions)
+		return -1;
+
+	mr = &wa->vm->regions[wa->region_nr++];
+	start_addr = ms->addr;
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return 0;
+}
+
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,63 +103,24 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
-	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
+	struct walk_arg wa;
 
 	vm = malloc(sizeof(struct vhost_memory_kernel) +
-		    max_regions *
-		    sizeof(struct vhost_memory_region));
+			max_regions *
+			sizeof(struct vhost_memory_region));
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
-
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
+	wa.region_nr = 0;
+	wa.vm = vm;
 
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
-			continue;
-
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
-		}
+	if (rte_memseg_contig_walk(add_memory_region, &wa) < 0) {
+		free(vm);
+		return NULL;
 	}
 
-	vm->nregions = k;
+	vm->nregions = wa.region_nr;
 	vm->padding = 0;
 	return vm;
 }
-- 
2.7.4


* [PATCH v5 36/70] eal: add iova2virt function
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (35 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
                         ` (33 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This is a reverse lookup of PA to VA. Using this will make
other code less dependent on the internals of the mem API.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
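
A minimal round-trip sketch of the new lookup (illustrative only; for any
address inside DPDK-managed memory, VA -> IOVA -> VA should land back on the
same pointer):

#include <rte_malloc.h>
#include <rte_memory.h>

static int
iova2virt_roundtrip(void)
{
	void *p = rte_malloc(NULL, 64, 0);
	rte_iova_t iova;
	void *back;
	int ok;

	if (p == NULL)
		return -1;

	iova = rte_mem_virt2iova(p);	/* forward lookup */
	back = rte_mem_iova2virt(iova);	/* reverse lookup added here */
	ok = (back == p);

	rte_free(p);
	return ok ? 0 : -1;
}
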
 lib/librte_eal/common/eal_common_memory.c  | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 12 ++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4b528b0..ea3c5a7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,6 +131,36 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+struct virtiova {
+	rte_iova_t iova;
+	void *virt;
+};
+static int
+find_virt(const struct rte_memseg *ms, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova)
+{
+	struct virtiova vi;
+
+	memset(&vi, 0, sizeof(vi));
+
+	vi.iova = iova;
+	rte_memseg_walk(find_virt, &vi);
+
+	return vi.virt;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 45d067f..5c60b91 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -131,6 +131,18 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Get virtual memory address corresponding to iova address.
+ *
+ * @param iova
+ *   The iova address.
+ * @return
+ *   Virtual address corresponding to iova address (or NULL if address does not
+ *   exist within DPDK memory map).
+ */
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 93033b5..dccfc35 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_iova2virt;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4


* [PATCH v5 37/70] bus/dpaa: use iova2virt instead of memseg iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (36 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 36/70] eal: add iova2virt function Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 38/70] bus/fslmc: " Anatoly Burakov
                         ` (32 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Fixed usage of experimental API's
    
    v3:
    - Added this patch

 drivers/bus/dpaa/rte_dpaa_bus.h  | 12 +-----------
 drivers/mempool/dpaa/Makefile    |  3 +++
 drivers/mempool/dpaa/meson.build |  3 +++
 drivers/net/dpaa/Makefile        |  3 +++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 718701b..89aeac2 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -98,17 +98,7 @@ struct dpaa_portal {
 /* TODO - this is costly, need to write a fast coversion routine */
 static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr != NULL; i++) {
-		if (paddr >= memseg[i].iova && paddr <
-			memseg[i].iova + memseg[i].len)
-			return (uint8_t *)(memseg[i].addr) +
-			       (paddr - memseg[i].iova);
-	}
-
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 /**
diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile
index 4c0d7aa..da8da1e 100644
--- a/drivers/mempool/dpaa/Makefile
+++ b/drivers/mempool/dpaa/Makefile
@@ -22,6 +22,9 @@ EXPORT_MAP := rte_mempool_dpaa_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c
diff --git a/drivers/mempool/dpaa/meson.build b/drivers/mempool/dpaa/meson.build
index 08423c2..9163b3d 100644
--- a/drivers/mempool/dpaa/meson.build
+++ b/drivers/mempool/dpaa/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['bus_dpaa']
 sources = files('dpaa_mempool.c')
+
+# depends on dpaa bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile
index 9c2a5ea..d7a0a50 100644
--- a/drivers/net/dpaa/Makefile
+++ b/drivers/net/dpaa/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa_version.map
 
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # Interfaces with DPDK
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_rxtx.c
-- 
2.7.4


* [PATCH v5 38/70] bus/fslmc: use iova2virt instead of memseg iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (37 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 39/70] crypto/dpaa_sec: " Anatoly Burakov
                         ` (31 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, Nipun Gupta, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Fixed usage of experimental API's
    
    v3:
    - Added this patch

 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 13 +------------
 drivers/event/dpaa2/Makefile            |  3 +++
 drivers/mempool/dpaa2/Makefile          |  3 +++
 drivers/mempool/dpaa2/meson.build       |  3 +++
 drivers/net/dpaa2/Makefile              |  3 +++
 drivers/net/dpaa2/meson.build           |  3 +++
 6 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 4a19d42..d38fc49 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -260,21 +260,10 @@ static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused));
 /* todo - this is costly, need to write a fast coversion routine */
 static void *dpaa2_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg;
-	int i;
-
 	if (dpaa2_virt_mode)
 		return (void *)(size_t)paddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64
-				+ (paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile
index b26862c..a5b68b4 100644
--- a/drivers/event/dpaa2/Makefile
+++ b/drivers/event/dpaa2/Makefile
@@ -28,6 +28,9 @@ EXPORT_MAP := rte_pmd_dpaa2_event_version.map
 
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # all source are stored in SRCS-y
 #
diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile
index f0edb32..5125ad1 100644
--- a/drivers/mempool/dpaa2/Makefile
+++ b/drivers/mempool/dpaa2/Makefile
@@ -21,6 +21,9 @@ EXPORT_MAP := rte_mempool_dpaa2_version.map
 # Lbrary version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c
diff --git a/drivers/mempool/dpaa2/meson.build b/drivers/mempool/dpaa2/meson.build
index dee3a88..8b8b518 100644
--- a/drivers/mempool/dpaa2/meson.build
+++ b/drivers/mempool/dpaa2/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['mbuf', 'bus_fslmc']
 sources = files('dpaa2_hw_mempool.c')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile
index 1b707ad..9b0b143 100644
--- a/drivers/net/dpaa2/Makefile
+++ b/drivers/net/dpaa2/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa2_version.map
 # library version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += base/dpaa2_hw_dpni.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_ethdev.c
diff --git a/drivers/net/dpaa2/meson.build b/drivers/net/dpaa2/meson.build
index ad1724d..8e96b5a 100644
--- a/drivers/net/dpaa2/meson.build
+++ b/drivers/net/dpaa2/meson.build
@@ -13,3 +13,6 @@ sources = files('base/dpaa2_hw_dpni.c',
 		'mc/dpni.c')
 
 includes += include_directories('base', 'mc')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
-- 
2.7.4


* [PATCH v5 39/70] crypto/dpaa_sec: use iova2virt instead of memseg iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (38 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 38/70] bus/fslmc: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 40/70] eal: add virt2memseg function Anatoly Burakov
                         ` (30 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index c5191ce..b04510f 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -120,16 +120,7 @@ dpaa_mem_vtop_ctx(struct dpaa_sec_op_ctx *ctx, void *vaddr)
 static inline void *
 dpaa_mem_ptov(rte_iova_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64 +
-					(paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static void
-- 
2.7.4


* [PATCH v5 40/70] eal: add virt2memseg function
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (39 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 39/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
                         ` (29 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This can be used as a virt2iova lookup that only covers memory owned
by DPDK (as opposed to doing pagemap walks). Using it reduces the
dependency on internals of the memory API.
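
As a rough usage sketch (the helper name below is hypothetical; it
mirrors the driver conversions later in this series), a virt-to-IOVA
translation built on the new call could look like:

  #include <rte_common.h>
  #include <rte_memory.h>

  /* hypothetical helper: translate a DPDK-owned virtual address to IOVA */
  static rte_iova_t
  my_virt2iova(const void *addr)
  {
  	const struct rte_memseg *ms = rte_mem_virt2memseg(addr);

  	if (ms == NULL)
  		return RTE_BAD_IOVA;
  	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
  }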

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 11 +++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 49 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index ea3c5a7..fd78d2f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -161,6 +161,43 @@ rte_mem_iova2virt(rte_iova_t iova)
 	return vi.virt;
 }
 
+struct virtms {
+	const void *virt;
+	struct rte_memseg *ms;
+};
+static int
+find_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct virtms *vm = arg;
+
+	if (vm->virt >= ms->addr && vm->virt < RTE_PTR_ADD(ms->addr, ms->len)) {
+		struct rte_memseg *memseg, *found_ms;
+		int idx;
+
+		memseg = rte_eal_get_configuration()->mem_config->memseg;
+		idx = ms - memseg;
+		found_ms = &memseg[idx];
+
+		vm->ms = found_ms;
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *addr)
+{
+	struct virtms vm;
+
+	memset(&vm, 0, sizeof(vm));
+
+	vm.virt = addr;
+
+	rte_memseg_walk(find_memseg, &vm);
+
+	return vm.ms;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 5c60b91..b3d7e61 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -143,6 +143,17 @@ __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova);
 
 /**
+ * Get memseg to which a particular virtual address belongs.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg pointer on success, or NULL on error.
+ */
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *virt);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dccfc35..79433b7 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -224,6 +224,7 @@ EXPERIMENTAL {
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
+	rte_mem_virt2memseg;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 41/70] bus/fslmc: use virt2memseg instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (40 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 40/70] eal: add virt2memseg function Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 42/70] crypto/dpaa_sec: " Anatoly Burakov
                         ` (28 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index d38fc49..45fd41e 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -270,20 +270,14 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 {
 	const struct rte_memseg *memseg;
-	int i;
 
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr >= memseg[i].addr_64 &&
-		    vaddr < memseg[i].addr_64 + memseg[i].len)
-			return memseg[i].iova
-				+ (vaddr - memseg[i].addr_64);
-	}
-	return (size_t)(NULL);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	if (memseg)
+		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
+	return (size_t)NULL;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 42/70] crypto/dpaa_sec: use virt2memseg instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (41 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 43/70] net/mlx4: " Anatoly Burakov
                         ` (27 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b04510f..a14e669 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -93,20 +93,11 @@ dpaa_sec_alloc_ctx(dpaa_sec_session *ses)
 static inline rte_iova_t
 dpaa_mem_vtop(void *vaddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	uint64_t vaddr_64, paddr;
-	int i;
-
-	vaddr_64 = (size_t)vaddr;
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr_64 >= memseg[i].addr_64 &&
-		    vaddr_64 < memseg[i].addr_64 + memseg[i].len) {
-			paddr = memseg[i].iova +
-				(vaddr_64 - memseg[i].addr_64);
-
-			return (rte_iova_t)paddr;
-		}
-	}
+	const struct rte_memseg *ms;
+
+	ms = rte_mem_virt2memseg(vaddr);
+	if (ms)
+		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 43/70] net/mlx4: use virt2memseg instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (42 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 42/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 44/70] net/mlx5: " Anatoly Burakov
                         ` (26 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Remove unused variable

 drivers/net/mlx4/mlx4_mr.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 9a1e4de..7ca1560 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -126,10 +126,9 @@ mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start, uintptr_t *end)
 struct mlx4_mr *
 mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
-	unsigned int i;
 	struct mlx4_mr *mr;
 
 	if (mlx4_check_mempool(mp, &start, &end) != 0) {
@@ -142,16 +141,13 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (43 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 43/70] net/mlx4: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 45/70] eal: use memzone walk " Anatoly Burakov
                         ` (25 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Remove unused variable

 drivers/net/mlx5/mlx5_mr.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 2bf1f9c..ef9b5ba 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -234,10 +234,9 @@ struct mlx5_mr *
 mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 {
 	struct priv *priv = dev->data->dev_private;
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
-	unsigned int i;
 	struct mlx5_mr *mr;
 
 	mr = rte_zmalloc_socket(__func__, sizeof(*mr), 0, mp->socket_id);
@@ -261,17 +260,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	/* Save original addresses for exact MR lookup. */
 	mr->start = start;
 	mr->end = end;
+
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DRV_LOG(DEBUG,
 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
 		" region",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 45/70] eal: use memzone walk instead of iteration
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (44 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 44/70] net/mlx5: " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 46/70] vfio: allow to map other memory regions Anatoly Burakov
                         ` (24 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Simplify the memzone dump code to use memzone walk, so that the same
memzone iteration code is not maintained twice.
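
The same walk pattern is available to applications; a minimal sketch
(hypothetical helper names) that sums the length of all reserved zones:

  #include <rte_memzone.h>

  /* hypothetical callback: accumulate the length of each reserved zone */
  static void
  sum_zone_len(const struct rte_memzone *mz, void *arg)
  {
  	size_t *total = arg;

  	*total += mz->len;
  }

  static size_t
  total_memzone_len(void)
  {
  	size_t total = 0;

  	rte_memzone_walk(sum_zone_len, &total);
  	return total;
  }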

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/eal_common_memzone.c | 42 +++++++++++++++---------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index af68c00..d60bde7 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -360,31 +360,31 @@ rte_memzone_lookup(const char *name)
 	return memzone;
 }
 
+static void
+dump_memzone(const struct rte_memzone *mz, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	FILE *f = arg;
+	int mz_idx;
+
+	mz_idx = mz - mcfg->memzone;
+
+	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+				"socket_id:%"PRId32", flags:%"PRIx32"\n",
+			mz_idx,
+			mz->name,
+			mz->iova,
+			mz->len,
+			mz->addr,
+			mz->socket_id,
+			mz->flags);
+}
+
 /* Dump all reserved memory zones on console */
 void
 rte_memzone_dump(FILE *f)
 {
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	rte_rwlock_read_lock(&mcfg->mlock);
-	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
-		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
-	}
-	rte_rwlock_read_unlock(&mcfg->mlock);
+	rte_memzone_walk(dump_memzone, f);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 46/70] vfio: allow to map other memory regions
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (45 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 45/70] eal: use memzone walk " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 47/70] eal: add "legacy memory" option Anatoly Burakov
                         ` (23 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario might be used in vhost applications (like
SPDK) where the guest sends its own memory table. To fill this gap,
provide an API to allow registering arbitrary addresses in the VFIO
container.
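
A minimal usage sketch (hypothetical helper, assuming an externally
allocated and pinned buffer, and at least one VFIO device already
attached):

  #include <stdint.h>
  #include <rte_vfio.h>

  /* register an external buffer with VFIO, using its VA as the IOVA */
  static int
  register_external_buf(void *buf, uint64_t len)
  {
  	uint64_t va = (uint64_t)(uintptr_t)buf;

  	if (rte_vfio_dma_map(va, va, len) < 0)
  		return -1;

  	/* ... perform DMA to/from buf via the attached device ... */

  	return rte_vfio_dma_unmap(va, va, len);
  }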

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Added PPC64, courtesy of Gowrishankar
    - Moved to earlier in the patchset
    - Made API experimental
    - Do not print out error message if init isn't finished

 lib/librte_eal/bsdapp/eal/eal.c          |  16 +
 lib/librte_eal/common/include/rte_vfio.h |  41 ++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 707 +++++++++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  12 +
 lib/librte_eal/rte_eal_version.map       |   2 +
 5 files changed, 702 insertions(+), 76 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 8e25d78..032a5ea 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -749,6 +749,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -784,3 +786,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index 249095e..d26ab01 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -127,6 +127,47 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @note requires at least one device to be attached at the time of mapping.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int  __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index e18e413..c1f0f87 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -2,11 +2,13 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <inttypes.h>
 #include <string.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
@@ -22,19 +24,227 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
+static int vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len,
+		int do_map);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
+/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped. we can
+ * recreate the mappings for DPDK segments, but we cannot do so for memory that
+ * was registered by the user themselves, so we need to store the user mappings
+ * somewhere, to recreate them later.
+ */
+#define VFIO_MAX_USER_MEM_MAPS 256
+struct user_mem_map {
+	uint64_t addr;
+	uint64_t iova;
+	uint64_t len;
+};
+static struct {
+	rte_spinlock_t lock;
+	int n_maps;
+	struct user_mem_map maps[VFIO_MAX_USER_MEM_MAPS];
+} user_mem_maps = {
+	.lock = RTE_SPINLOCK_INITIALIZER
+};
+
+static int
+is_null_map(const struct user_mem_map *map)
+{
+	return map->addr == 0 && map->iova == 0 && map->len == 0;
+}
+
+/* we may need to merge user mem maps together in case of user mapping/unmapping
+ * chunks of memory, so we'll need a comparator function to sort segments.
+ */
+static int
+user_mem_map_cmp(const void *a, const void *b)
+{
+	const struct user_mem_map *umm_a = a;
+	const struct user_mem_map *umm_b = b;
+
+	/* move null entries to end */
+	if (is_null_map(umm_a))
+		return 1;
+	if (is_null_map(umm_b))
+		return -1;
+
+	/* sort by iova first */
+	if (umm_a->iova < umm_b->iova)
+		return -1;
+	if (umm_a->iova > umm_b->iova)
+		return 1;
+
+	if (umm_a->addr < umm_b->addr)
+		return -1;
+	if (umm_a->addr > umm_b->addr)
+		return 1;
+
+	if (umm_a->len < umm_b->len)
+		return -1;
+	if (umm_a->len > umm_b->len)
+		return 1;
+
+	return 0;
+}
+
+/* adjust user map entry. this may result in shortening of existing map, or in
+ * splitting existing map in two pieces.
+ */
+static void
+adjust_map(struct user_mem_map *src, struct user_mem_map *end,
+		uint64_t remove_va_start, uint64_t remove_len)
+{
+	/* if va start is same as start address, we're simply moving start */
+	if (remove_va_start == src->addr) {
+		src->addr += remove_len;
+		src->iova += remove_len;
+		src->len -= remove_len;
+	} else if (remove_va_start + remove_len == src->addr + src->len) {
+		/* we're shrinking mapping from the end */
+		src->len -= remove_len;
+	} else {
+		/* we're blowing a hole in the middle */
+		struct user_mem_map tmp;
+		uint64_t total_len = src->len;
+
+		/* adjust source segment length */
+		src->len = remove_va_start - src->addr;
+
+		/* create temporary segment in the middle */
+		tmp.addr = src->addr + src->len;
+		tmp.iova = src->iova + src->len;
+		tmp.len = remove_len;
+
+		/* populate end segment - this one we will be keeping */
+		end->addr = tmp.addr + tmp.len;
+		end->iova = tmp.iova + tmp.len;
+		end->len = total_len - src->len - tmp.len;
+	}
+}
+
+/* try merging two maps into one, return 1 if succeeded */
+static int
+merge_map(struct user_mem_map *left, struct user_mem_map *right)
+{
+	if (left->addr + left->len != right->addr)
+		return 0;
+	if (left->iova + left->len != right->iova)
+		return 0;
+
+	left->len += right->len;
+
+	memset(right, 0, sizeof(*right));
+
+	return 1;
+}
+
+static struct user_mem_map *
+find_user_mem_map(uint64_t addr, uint64_t iova, uint64_t len)
+{
+	uint64_t va_end = addr + len;
+	uint64_t iova_end = iova + len;
+	int i;
+
+	for (i = 0; i < user_mem_maps.n_maps; i++) {
+		struct user_mem_map *map = &user_mem_maps.maps[i];
+		uint64_t map_va_end = map->addr + map->len;
+		uint64_t map_iova_end = map->iova + map->len;
+
+		/* check start VA */
+		if (addr < map->addr || addr >= map_va_end)
+			continue;
+		/* check if IOVA end is within boundaries */
+		if (va_end <= map->addr || va_end >= map_va_end)
+			continue;
+
+		/* check start PA */
+		if (iova < map->iova || iova >= map_iova_end)
+			continue;
+		/* check if IOVA end is within boundaries */
+		if (iova_end <= map->iova || iova_end >= map_iova_end)
+			continue;
+
+		/* we've found our map */
+		return map;
+	}
+	return NULL;
+}
+
+/* this will sort all user maps, and merge/compact any adjacent maps */
+static void
+compact_user_maps(void)
+{
+	int i, n_merged, cur_idx;
+
+	qsort(user_mem_maps.maps, user_mem_maps.n_maps,
+			sizeof(user_mem_maps.maps[0]), user_mem_map_cmp);
+
+	/* we'll go over the list backwards when merging */
+	n_merged = 0;
+	for (i = user_mem_maps.n_maps - 2; i >= 0; i--) {
+		struct user_mem_map *l, *r;
+
+		l = &user_mem_maps.maps[i];
+		r = &user_mem_maps.maps[i + 1];
+
+		if (is_null_map(l) || is_null_map(r))
+			continue;
+
+		if (merge_map(l, r))
+			n_merged++;
+	}
+
+	/* the entries are still sorted, but now they have holes in them, so
+	 * walk through the list and remove the holes
+	 */
+	if (n_merged > 0) {
+		cur_idx = 0;
+		for (i = 0; i < user_mem_maps.n_maps; i++) {
+			if (!is_null_map(&user_mem_maps.maps[i])) {
+				struct user_mem_map *src, *dst;
+
+				src = &user_mem_maps.maps[i];
+				dst = &user_mem_maps.maps[cur_idx++];
+
+				if (src != dst) {
+					memcpy(dst, src, sizeof(*src));
+					memset(src, 0, sizeof(*src));
+				}
+			}
+		}
+		user_mem_maps.n_maps = cur_idx;
+	}
+}
+
 int
 vfio_get_group_fd(int iommu_group_no)
 {
@@ -263,7 +473,7 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 	};
 	int vfio_group_fd;
 	int iommu_group_no;
-	int ret;
+	int i, ret;
 
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
@@ -333,9 +543,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +564,38 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
+
+			/* re-map all user-mapped segments */
+			rte_spinlock_lock(&user_mem_maps.lock);
+
+			/* this IOMMU type may not support DMA mapping, but
+			 * if we have mappings in the list - that means we have
+			 * previously mapped something successfully, so we can
+			 * be sure that DMA mapping is supported.
+			 */
+			for (i = 0; i < user_mem_maps.n_maps; i++) {
+				struct user_mem_map *map;
+				map = &user_mem_maps.maps[i];
+
+				ret = t->dma_user_map_func(
+						vfio_cfg.vfio_container_fd,
+						map->addr, map->iova, map->len,
+						1);
+				if (ret) {
+					RTE_LOG(ERR, EAL, "Couldn't map user memory for DMA: "
+							"va: 0x%" PRIx64 " "
+							"iova: 0x%" PRIx64 " "
+							"len: 0x%" PRIu64 "\n",
+							map->addr, map->iova,
+							map->len);
+					rte_spinlock_unlock(
+							&user_mem_maps.lock);
+					return -1;
+				}
+			}
+			rte_spinlock_unlock(&user_mem_maps.lock);
 		}
 	}
 
@@ -668,23 +911,49 @@ static int
 type1_map(const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
+
+	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
+static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
 	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
 	int ret;
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
 
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
-		return -1;
+				return -1;
+		}
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
 	}
+
 	return 0;
 }
 
@@ -694,12 +963,78 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+static int
+vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+				return -1;
+		}
+
+	} else {
+		struct vfio_iommu_spapr_register_memory reg = {
+			.argsz = sizeof(reg),
+			.flags = 0
+		};
+		reg.vaddr = (uintptr_t) vaddr;
+		reg.size = len;
+
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY, &reg);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot unregister vaddr for IOMMU, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+{
+	int *vfio_container_fd = arg;
+
+	return vfio_spapr_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
 struct spapr_walk_param {
 	uint64_t window_size;
 	uint64_t hugepage_sz;
 };
 static int
-spapr_window_size(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
@@ -713,39 +1048,43 @@ spapr_window_size(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-spapr_map(const struct rte_memseg *ms, void *arg)
-{
-	struct vfio_iommu_type1_dma_map dma_map;
-	struct vfio_iommu_spapr_register_memory reg = {
-		.argsz = sizeof(reg),
-		.flags = 0
+vfio_spapr_create_new_dma_window(int vfio_container_fd,
+		struct vfio_iommu_spapr_tce_create *create) {
+	struct vfio_iommu_spapr_tce_remove remove = {
+		.argsz = sizeof(remove),
+	};
+	struct vfio_iommu_spapr_tce_info info = {
+		.argsz = sizeof(info),
 	};
-	int *vfio_container_fd = arg;
 	int ret;
 
-	reg.vaddr = (uintptr_t) ms->addr;
-	reg.size = ms->len;
-	ret = ioctl(*vfio_container_fd,
-		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	/* query spapr iommu info */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+				"error %i (%s)\n", errno, strerror(errno));
 		return -1;
 	}
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-			 VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	/* remove default DMA of 32 bit window */
+	remove.start_addr = info.dma32_window_start;
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
 
+	/* create new DMA window */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, create);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
+
+	if (create->start_addr != 0) {
+		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
 		return -1;
 	}
 
@@ -753,58 +1092,108 @@ spapr_map(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
 {
 	struct spapr_walk_param param;
-	int ret;
-	struct vfio_iommu_spapr_tce_info info = {
-		.argsz = sizeof(info),
-	};
 	struct vfio_iommu_spapr_tce_create create = {
 		.argsz = sizeof(create),
 	};
-	struct vfio_iommu_spapr_tce_remove remove = {
-		.argsz = sizeof(remove),
-	};
+	int i, ret = 0;
+
+	rte_spinlock_lock(&user_mem_maps.lock);
 
+	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	/* query spapr iommu info */
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+		RTE_LOG(ERR, EAL, "Could not get window size\n");
+		ret = -1;
+		goto out;
 	}
 
-	/* remove default DMA of 32 bit window */
-	remove.start_addr = info.dma32_window_start;
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	/* also check user maps */
+	for (i = 0; i < user_mem_maps.n_maps; i++) {
+		uint64_t max = user_mem_maps.maps[i].iova +
+				user_mem_maps.maps[i].len;
+		create.window_size = RTE_MAX(create.window_size, max);
 	}
 
-	/* create DMA window from 0 to max(phys_addr + len) */
-	rte_memseg_walk(spapr_window_size, &param);
-
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(param.window_size);
 	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
-	}
+	if (do_map) {
+		/* re-create window and remap the entire memory */
+		if (iova > create.window_size) {
+			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
+					&create) < 0) {
+				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+				ret = -1;
+				goto out;
+			}
+			if (rte_memseg_walk(vfio_spapr_map_walk,
+					&vfio_container_fd) < 0) {
+				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
+				ret = -1;
+				goto out;
+			}
+			/* remap all user maps */
+			for (i = 0; i < user_mem_maps.n_maps; i++) {
+				struct user_mem_map *map =
+						&user_mem_maps.maps[i];
+				if (vfio_spapr_dma_do_map(vfio_container_fd,
+						map->addr, map->iova, map->len,
+						1)) {
+					RTE_LOG(ERR, EAL, "Could not recreate user DMA maps\n");
+					ret = -1;
+					goto out;
+				}
+			}
+		}
 
-	if (create.start_addr != 0) {
-		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
-		return -1;
+		/* now that we've remapped all of the memory that was present
+		 * before, map the segment that we were requested to map.
+		 */
+		if (vfio_spapr_dma_do_map(vfio_container_fd,
+				vaddr, iova, len, 1) < 0) {
+			RTE_LOG(ERR, EAL, "Could not map segment\n");
+			ret = -1;
+			goto out;
+		}
+	} else {
+		/* for unmap, check if iova within DMA window */
+		if (iova > create.window_size) {
+			RTE_LOG(ERR, EAL, "iova beyond DMA window for unmap");
+			ret = -1;
+			goto out;
+		}
+
+		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct vfio_iommu_spapr_tce_create create = {
+		.argsz = sizeof(create),
+	};
+	struct spapr_walk_param param;
+
+	memset(&param, 0, sizeof(param));
+
+	/* create DMA window from 0 to max(phys_addr + len) */
+	rte_memseg_walk(vfio_spapr_window_size_walk, &param);
+
+	/* sPAPR requires window size to be a power of 2 */
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
+	create.levels = 1;
 
 	if (vfio_spapr_create_new_dma_window(vfio_container_fd, &create) < 0) {
 		RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
@@ -812,7 +1201,7 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+	if (rte_memseg_walk(vfio_spapr_map_walk, &vfio_container_fd) < 0)
 		return -1;
 
 	return 0;
@@ -825,6 +1214,156 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region maping not supported by IOMMU %s\n",
+			t->name);
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	struct user_mem_map *new_map;
+	int ret = 0;
+
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&user_mem_maps.lock);
+	if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
+		RTE_LOG(ERR, EAL, "No more space for user mem maps\n");
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto out;
+	}
+	/* map the entry */
+	if (vfio_dma_mem_map(vaddr, iova, len, 1)) {
+		/* technically, this will fail if there are currently no devices
+		 * plugged in, even if a device were added later, this mapping
+		 * might have succeeded. however, since we cannot verify if this
+		 * is a valid mapping without having a device attached, consider
+		 * this to be unsupported, because we can't just store any old
+		 * mapping and pollute list of active mappings willy-nilly.
+		 */
+		RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
+		ret = -1;
+		goto out;
+	}
+	/* create new user mem map entry */
+	new_map = &user_mem_maps.maps[user_mem_maps.n_maps++];
+	new_map->addr = vaddr;
+	new_map->iova = iova;
+	new_map->len = len;
+
+	compact_user_maps();
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	struct user_mem_map *map, *new_map = NULL;
+	int ret = 0;
+
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&user_mem_maps.lock);
+
+	/* find our mapping */
+	map = find_user_mem_map(vaddr, iova, len);
+	if (!map) {
+		RTE_LOG(ERR, EAL, "Couldn't find previously mapped region\n");
+		rte_errno = EINVAL;
+		ret = -1;
+		goto out;
+	}
+	if (map->addr != vaddr || map->iova != iova || map->len != len) {
+		/* we're partially unmapping a previously mapped region, so we
+		 * need to split entry into two.
+		 */
+		if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
+			RTE_LOG(ERR, EAL, "Not enough space to store partial mapping\n");
+			rte_errno = ENOMEM;
+			ret = -1;
+			goto out;
+		}
+		new_map = &user_mem_maps.maps[user_mem_maps.n_maps++];
+	}
+
+	/* unmap the entry */
+	if (vfio_dma_mem_map(vaddr, iova, len, 0)) {
+		/* there may not be any devices plugged in, so unmapping will
+		 * fail with ENODEV/ENOTSUP rte_errno values, but that doesn't
+		 * stop us from removing the mapping, as the assumption is we
+		 * won't be needing this memory any more and thus will want to
+		 * prevent it from being remapped again on hotplug. so, only
+		 * fail if we indeed failed to unmap (e.g. if the mapping was
+		 * within our mapped range but had invalid alignment).
+		 */
+		if (rte_errno != ENODEV && rte_errno != ENOTSUP) {
+			RTE_LOG(ERR, EAL, "Couldn't unmap region for DMA\n");
+			ret = -1;
+			goto out;
+		} else {
+			RTE_LOG(DEBUG, EAL, "DMA unmapping failed, but removing mappings anyway\n");
+		}
+	}
+	/* remove map from the list of active mappings */
+	if (new_map != NULL) {
+		adjust_map(map, new_map, vaddr, len);
+
+		/* if we've created a new map by splitting, sort everything */
+		if (!is_null_map(new_map)) {
+			compact_user_maps();
+		} else {
+			/* we've created a new mapping, but it was unused */
+			user_mem_maps.n_maps--;
+		}
+	} else {
+		memset(map, 0, sizeof(*map));
+		compact_user_maps();
+		user_mem_maps.n_maps--;
+	}
+
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -857,4 +1396,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..549f442 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -26,6 +27,7 @@
 #ifndef VFIO_SPAPR_TCE_v2_IOMMU
 #define RTE_VFIO_SPAPR 7
 #define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17)
+#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
 #define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
 
@@ -110,6 +112,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +122,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address, length and
+ * operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ **/
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 79433b7..76209f9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -263,5 +263,7 @@ EXPERIMENTAL {
 	rte_service_start_with_defaults;
 	rte_socket_count;
 	rte_socket_id_by_idx;
+	rte_vfio_dma_map;
+	rte_vfio_dma_unmap;
 
 } DPDK_18.02;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 47/70] eal: add "legacy memory" option
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (46 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 46/70] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 48/70] eal: add rte_fbarray Anatoly Burakov
                         ` (22 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy memory
init sequence will be added later. For FreeBSD, non-legacy memory init
will never be enabled, while for Linux it is disabled in this patch to
avoid breaking bisect, and will be enabled once non-legacy mode is
fully operational.
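
A usage sketch of opting back into the old model (hypothetical EAL
arguments; equivalent to running "./app -l 0-1 -n 4 --legacy-mem"):

  #include <rte_common.h>
  #include <rte_eal.h>

  /* initialize EAL with the legacy, statically allocated memory model */
  static int
  init_eal_legacy_mode(void)
  {
  	char *argv[] = { "app", "-l", "0-1", "-n", "4", "--legacy-mem" };

  	return rte_eal_init(RTE_DIM(argv), argv);
  }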

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Move to earlier in the patchset
    - Make Linuxapp always load in legacy mode

 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  4 ++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  3 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 032a5ea..f44b904 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -534,6 +534,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 8a51ade..fb5ea03 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1184,6 +1185,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index a0082d1..fda087b 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * IOVA-contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..d301d0b 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 77f6cb7..b34e57a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
@@ -767,6 +768,8 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
+	/* for now, always set legacy mem */
+	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 24e6b50..17c559f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -922,8 +922,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1266,8 +1266,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1403,6 +1403,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 48/70] eal: add rte_fbarray
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (47 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 47/70] eal: add "legacy memory" option Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 49/70] eal: replace memseg with memseg lists Anatoly Burakov
                         ` (21 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

rte_fbarray is a simple indexed array stored in shared memory
via mapping files into memory. Rationale for its existence is the
following: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that involves reallocating memory (which is a big no-no in
multiprocess). What we can do instead is have a maximum capacity as
something really, really large, and decide at allocation time how
big the array is going to be. We map the entire file into memory,
which makes it possible to use fbarray as shared memory, provided
the structure itself is allocated in shared memory. Per-fbarray
locking is also used to avoid index data races (but not contents
data races - that is up to the user application to synchronize).

In addition, in understanding that we will frequently need to scan
this array for free space and iterating over array linearly can
become slow, rte_fbarray provides facilities to index array's
usage. The following use cases are covered:
 - find next free/used slot (useful either for adding new elements
   to fbarray, or walking the list)
 - find starting index for next N free/used slots (useful for when
   we want to allocate chunk of VA-contiguous memory composed of
   several pages)
 - find how many contiguous free/used slots there are, starting
   from specified index (useful for when we want to figure out
   how many pages we have until next hole in allocated memory, to
   speed up some bulk operations where we would otherwise have to
   walk the array and add pages one by one)

This is accomplished by storing a usage mask in-memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
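
A usage sketch, assuming the init and lookup signatures introduced by
this patch (function names and parameter order taken from the header
added below; the payload struct is hypothetical):

  #include <stdint.h>
  #include <string.h>
  #include <rte_fbarray.h>

  struct rec { uint64_t va; uint64_t len; };  /* hypothetical payload */

  static int
  fbarray_example(void)
  {
  	struct rte_fbarray arr;
  	int idx;

  	/* file-backed, shareable array of up to 1024 fixed-size records */
  	if (rte_fbarray_init(&arr, "example", 1024, sizeof(struct rec)) < 0)
  		return -1;

  	/* grab the first free slot, clear it, and mark it used */
  	idx = rte_fbarray_find_next_free(&arr, 0);
  	if (idx < 0)
  		return -1;
  	memset(rte_fbarray_get(&arr, idx), 0, sizeof(struct rec));
  	return rte_fbarray_set_used(&arr, idx);
  }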

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Fixed index alignment bug
    - Fixed compile issues
    - MAP_POPULATE not supported on FreeBSD, removed it
    - Replace uint64_t with size_t for mapping sizes
    - Make API experimental
    
    Initial version of this had resizing capability; however, it was
    removed because in a multiprocess scenario, each fbarray would
    have its own view of mapped memory, which might not correspond
    with the others' if some other process performed a resize that
    the current process didn't know about.
    
    It was therefore decided that, to avoid the cost of synchronizing
    on each and every operation (to make sure the array wasn't
    resized), the resizing feature should be dropped.

 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 353 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  16 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..f65875d
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) (((idx) << MASK_SHIFT) + (mod))
+
+/*
+ * This is a mask that is always stored at the end of array, to provide fast
+ * way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		RTE_LOG(ERR, EAL, "%s(): cannot truncate file: %s\n",
+				__func__, strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	uint64_t msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	int msk_idx = MASK_LEN_TO_IDX(idx);
+	bool already_used;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	ret = 0;
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(RTE_ALIGN_CEIL(len, MASK_ALIGN));
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as the two values we need
+	 * (element size and array length) are constant for the lifetime of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as the two values we need
+	 * (element size and total capacity) are constant for the lifetime of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it; things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. a repeated call from a previously
+	 * failed destroy()), but that is on the user - we can't (easily) know
+	 * whether this is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		goto out;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) {
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..c45ac0b
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,353 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_compat.h>
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * data pointed to by ``arr`` pointer must itself be allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with data
+ * pointed to by ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill data pointed to by ``arr`` pointer and remove the underlying file
+ * backing the data, so it is expected that by the time this function is called,
+ * all other processes have detached from this ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to element to find index to.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to check as used.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many more free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many more used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 76209f9..0f542b1 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,22 @@ EXPERIMENTAL {
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
+	rte_fbarray_attach;
+	rte_fbarray_destroy;
+	rte_fbarray_detach;
+	rte_fbarray_dump_metadata;
+	rte_fbarray_find_idx;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_get;
+	rte_fbarray_init;
+	rte_fbarray_is_used;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 49/70] eal: replace memseg with memseg lists
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (48 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 48/70] eal: add rte_fbarray Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 50/70] eal: replace memzone array with fbarray Anatoly Burakov
                         ` (20 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Bruce Richardson, Neil Horman, John McNamara,
	Marko Kovacevic, Hemant Agrawal, Shreyansh Jain, Akhil Goyal,
	Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh,
	Maxime Coquelin, Tiwei Bie, Olivier Matz, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	pepperjo, jerin.jacob, gowrishankar.m

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all memory
in advance (unless we're in 32-bit legacy mode, in which case we do
not preallocate memory). That is, we do an anonymous mmap() of the
entire maximum size of memory per hugepage size, per socket (limited
to either RTE_MAX_MEMSEG_PER_TYPE pages or RTE_MAX_MEM_MB_PER_TYPE
megabytes worth of memory, whichever is smaller), split over multiple
lists (each limited to either RTE_MAX_MEMSEG_PER_LIST memsegs or
RTE_MAX_MEM_MB_PER_LIST megabytes, whichever is smaller). There is
also a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is
mainly used for 32-bit targets to limit the amount of preallocated
VA memory, but can also be used to place an upper limit on the total
amount of VA memory that a DPDK application can allocate.

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
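
For illustration only (not from the patch), a small standalone
program working out what the new common_base defaults translate to
for the two most common page sizes; the macro names below simply
mirror the config options added above:

#include <stdio.h>

#define MAX_MEMSEG_PER_LIST 8192
#define MAX_MEM_MB_PER_LIST 32768   /* 32G per list */
#define MAX_MEMSEG_PER_TYPE 32768
#define MAX_MEM_MB_PER_TYPE 131072  /* 128G per type */

static void
print_limits(unsigned long page_sz_mb)
{
	unsigned long per_list = MAX_MEMSEG_PER_LIST;
	unsigned long per_type = MAX_MEMSEG_PER_TYPE;

	/* both list and type are capped by page count and by megabytes -
	 * whichever limit is hit first wins
	 */
	if (per_list * page_sz_mb > MAX_MEM_MB_PER_LIST)
		per_list = MAX_MEM_MB_PER_LIST / page_sz_mb;
	if (per_type * page_sz_mb > MAX_MEM_MB_PER_TYPE)
		per_type = MAX_MEM_MB_PER_TYPE / page_sz_mb;

	printf("%luMB pages: %lu segs/list (%luMB), %lu segs/type (%luMB), %lu lists\n",
			page_sz_mb, per_list, per_list * page_sz_mb,
			per_type, per_type * page_sz_mb,
			(per_type + per_list - 1) / per_list);
}

int
main(void)
{
	print_limits(2);    /* 2MB pages: 16G per list, 64G per type, 4 lists */
	print_limits(1024); /* 1GB pages: 32G per list, 128G per type, 4 lists */
	return 0;
}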

Pages in the list are also indexed by address. That is, in order to
figure out where a page belongs, one can simply look at the base
address of its memseg list. Similarly, figuring out the IOVA address
of a memzone is a matter of finding the right memseg list, getting
the offset and dividing it by the page size to get the appropriate
memseg.
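
A rough sketch of that lookup (illustrative only - not the actual
rte_mem_virt2memseg() implementation - but using the structures and
helpers introduced in this series):

#include <rte_common.h>
#include <rte_eal_memconfig.h>
#include <rte_fbarray.h>
#include <rte_memory.h>

static struct rte_memseg *
virt2memseg_sketch(struct rte_mem_config *mcfg, const void *va)
{
	int i;

	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
		struct rte_memseg_list *msl = &mcfg->memsegs[i];
		size_t list_sz;
		void *end;
		int ms_idx;

		if (msl->base_va == NULL)
			continue; /* unused list slot */

		list_sz = (size_t)msl->memseg_arr.len * msl->page_sz;
		end = RTE_PTR_ADD(msl->base_va, list_sz);
		if (va < msl->base_va || va >= (const void *)end)
			continue; /* address is not within this list */

		/* offset within the list, divided by page size, is the index */
		ms_idx = RTE_PTR_DIFF(va, msl->base_va) / msl->page_sz;
		return rte_fbarray_get(&msl->memseg_arr, ms_idx);
	}
	return NULL;
}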

This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets, due to limited VA space, DPDK will no longer
spread memory across different sockets like before. Instead, it will
(by default) allocate all of the memory on the socket where the
master lcore is. To override this behavior, --socket-mem must be
used (see the example below).
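
For example (an illustrative command line, not taken from this
patch), explicitly reserving 512M on each of two sockets would look
something like:

    ./some_dpdk_app -l 0-3 --socket-mem=512,512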

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.

Additionally, we are switching locks from flock() to fcntl(). Down
the line, we will be introducing a single-file segments option, and
we cannot use flock() locks to lock parts of a file. Therefore, we
will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start a legacy-mem primary process
alongside an already running non-legacy-mem primary process.

[1] http://dpdk.org/dev/patchwork/patch/34002/
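
The key difference, in a nutshell (an illustrative sketch, not the
EAL locking code; offsets and lengths are placeholders): flock()
always locks the whole file, whereas fcntl(F_SETLK) can lock an
arbitrary byte range, e.g. the slice of a single file that
corresponds to one page.

#include <fcntl.h>
#include <unistd.h>

/* lock [start, start + len) of fd; type is F_RDLCK or F_WRLCK */
static int
lock_range(int fd, off_t start, off_t len, int type)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = start, /* only this byte range is locked */
		.l_len = len,
	};

	/* non-blocking; returns -1 with errno set if the range is held */
	return fcntl(fd, F_SETLK, &fl);
}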

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Checkpatch fixes
    
    v4:
    - Fixed bug in FreeBSD segment allocation
    - Optimized iova2virt lookup for legacy mem case
    
    v3:
    - New and improved legacy mode, without (too much) crazy hacks
    - 32-bit support
    - FreeBSD support
    - Compile fixes for all platforms

 config/common_base                                |  15 +-
 config/defconfig_i686-native-linuxapp-gcc         |   3 +
 config/defconfig_i686-native-linuxapp-icc         |   3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |   3 +
 config/rte_config.h                               |   7 +-
 doc/guides/rel_notes/deprecation.rst              |   9 -
 drivers/bus/fslmc/fslmc_vfio.c                    |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   2 +-
 drivers/bus/pci/linux/pci.c                       |   8 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   2 +-
 drivers/net/mlx4/mlx4_mr.c                        |   4 +-
 drivers/net/mlx5/mlx5.c                           |   3 +-
 drivers/net/mlx5/mlx5_mr.c                        |   4 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   4 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  12 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |  17 +-
 lib/librte_eal/bsdapp/eal/eal_memory.c            | 209 ++++-
 lib/librte_eal/common/eal_common_memory.c         | 603 +++++++++++---
 lib/librte_eal/common/eal_common_memzone.c        |  48 +-
 lib/librte_eal/common/eal_hugepages.h             |   1 -
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  56 +-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |  12 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  62 +-
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  15 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  25 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 913 +++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |   9 +-
 lib/librte_eal/rte_eal_version.map                |   3 +-
 lib/librte_mempool/rte_mempool.c                  |   9 +-
 test/test/test_malloc.c                           |  30 +-
 test/test/test_memory.c                           |  10 +-
 test/test/test_memzone.c                          |  12 +-
 37 files changed, 1586 insertions(+), 590 deletions(-)

diff --git a/config/common_base b/config/common_base
index c09c7cf..f557e6b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=64
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_MB_PER_LIST=32768
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or
+# RTE_MAX_MEM_MB_PER_TYPE megabytes of memory (split over multiple lists of
+# RTE_MAX_MEM_MB_PER_LIST), whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072
+# global maximum usable amount of VA, in megabytes
+CONFIG_RTE_MAX_MEM_MB=524288
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/config/defconfig_i686-native-linuxapp-gcc b/config/defconfig_i686-native-linuxapp-gcc
index a42ba4f..1178fe3 100644
--- a/config/defconfig_i686-native-linuxapp-gcc
+++ b/config/defconfig_i686-native-linuxapp-gcc
@@ -46,3 +46,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_i686-native-linuxapp-icc b/config/defconfig_i686-native-linuxapp-icc
index 144ba0a..f096e22 100644
--- a/config/defconfig_i686-native-linuxapp-icc
+++ b/config/defconfig_i686-native-linuxapp-icc
@@ -51,3 +51,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_x86_x32-native-linuxapp-gcc b/config/defconfig_x86_x32-native-linuxapp-gcc
index b6206a5..57d000d 100644
--- a/config/defconfig_x86_x32-native-linuxapp-gcc
+++ b/config/defconfig_x86_x32-native-linuxapp-gcc
@@ -26,3 +26,6 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/rte_config.h b/config/rte_config.h
index db6ceb6..f293d9e 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -21,7 +21,12 @@
 /****** library defines ********/
 
 /* EAL defines */
-#define RTE_MAX_MEMSEG 512
+#define RTE_MAX_MEMSEG_LISTS 128
+#define RTE_MAX_MEMSEG_PER_LIST 8192
+#define RTE_MAX_MEM_MB_PER_LIST 32768
+#define RTE_MAX_MEMSEG_PER_TYPE 32768
+#define RTE_MAX_MEM_MB_PER_TYPE 65536
+#define RTE_MAX_MEM_MB 524288
 #define RTE_MAX_MEMZONE 2560
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ec70b5f..c9f2703 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -38,15 +38,6 @@ Deprecation Notices
   success and failure, respectively.  This will change to 1 and 0 for true and
   false, respectively, to make use of the function more intuitive.
 
-* eal: due to internal data layout reorganization, there will be changes to
-  several structures and functions as a result of coming changes to support
-  memory hotplug in v18.05.
-  ``rte_eal_get_physmem_layout`` will be deprecated and removed in subsequent
-  releases.
-  ``rte_mem_config`` contents will change due to switch to memseg lists.
-  ``rte_memzone`` member ``memseg_id`` will no longer serve any useful purpose
-  and will be removed.
-
 * eal: a new set of mbuf mempool ops name APIs for user, platform and best
   mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
   ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 0c048dc..8b15312 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -190,7 +190,8 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 }
 
 static int
-fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
+fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
@@ -232,18 +233,11 @@ fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 
 int rte_fslmc_vfio_dmamap(void)
 {
-	const struct rte_memseg *memseg;
 	int i = 0;
 
 	if (is_dma_done)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		DPAA2_BUS_ERR("Cannot get physical layout");
-		return -ENODEV;
-	}
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 45fd41e..72aae43 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -274,7 +274,7 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr, NULL);
 	if (memseg)
 		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
 	return (size_t)NULL;
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 6dda054..4630a80 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -117,9 +117,10 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 }
 
 static int
-find_max_end_va(const struct rte_memseg *ms, void *arg)
+find_max_end_va(const struct rte_memseg_list *msl, void *arg)
 {
-	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	size_t sz = msl->memseg_arr.len * msl->page_sz;
+	void *end_va = RTE_PTR_ADD(msl->base_va, sz);
 	void **max_va = arg;
 
 	if (*max_va < end_va)
@@ -132,10 +133,11 @@ pci_find_max_end_va(void)
 {
 	void *va = NULL;
 
-	rte_memseg_walk(find_max_end_va, &va);
+	rte_memseg_list_walk(find_max_end_va, &va);
 	return va;
 }
 
+
 /* parse one line of the "resource" sysfs file (note that the 'line'
  * string is modified)
  */
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index a14e669..b685220 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -95,7 +95,7 @@ dpaa_mem_vtop(void *vaddr)
 {
 	const struct rte_memseg *ms;
 
-	ms = rte_mem_virt2memseg(vaddr);
+	ms = rte_mem_virt2memseg(vaddr, NULL);
 	if (ms)
 		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 7ca1560..7f0e52c 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -141,10 +141,10 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1724b65..e228356 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -478,7 +478,8 @@ static struct rte_pci_driver mlx5_driver;
 static void *uar_base;
 
 static int
-find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	void **addr = arg;
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index ef9b5ba..dd499b7 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -262,10 +262,10 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	mr->end = end;
 
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 93d7efe..b244409 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -75,7 +75,8 @@ struct walk_arg {
 	uint32_t region_nr;
 };
 static int
-add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+add_memory_region(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
 {
 	struct walk_arg *wa = arg;
 	struct vhost_memory_region *mr;
@@ -95,7 +96,6 @@ add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
 	return 0;
 }
 
-
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f44b904..d009cf0 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -64,8 +64,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -430,11 +430,11 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_socket(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
 	int *socket_id = arg;
 
-	if (ms->socket_id == *socket_id)
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
 		return 1;
 
 	return 0;
@@ -447,10 +447,11 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
+
 static int
 sync_func(__attribute__((unused)) void *arg)
 {
@@ -561,7 +562,6 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
 	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
 			eal_hugepage_info_init() < 0) {
 		rte_eal_init_alert("Cannot get hugepage information.");
 		rte_errno = EACCES;
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..ba44da0 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -47,12 +47,18 @@ eal_hugepage_info_init(void)
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
 
+	internal_config.num_hugepage_sizes = 1;
+
+	/* nothing more to be done for secondary */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers\n");
 		return -1;
 	}
 
@@ -61,7 +67,7 @@ eal_hugepage_info_init(void)
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size\n");
 		return -1;
 	}
 
@@ -81,22 +87,21 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	internal_config.num_hugepage_sizes = 1;
 	hpi->hugedir = CONTIGMEM_DEV;
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
 
 	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
-					sizeof(struct hugepage_info));
+			sizeof(internal_config.hugepage_info));
 	if (tmp_hpi == NULL ) {
 		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
 		return -1;
 	}
 
-	memcpy(tmp_hpi, hpi, sizeof(struct hugepage_info));
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
-	if ( munmap(tmp_hpi, sizeof(struct hugepage_info)) < 0) {
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
 	}
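
(The shared file now carries the whole hugepage_info array rather than a
single entry. A sketch of the matching read side, mirroring what
rte_eal_hugepage_attach() does below; error handling is trimmed and the
includes are assumed from the surrounding EAL code.)

    int fd = open(eal_hugepage_info_path(), O_RDONLY);
    struct hugepage_info *hpi_arr;

    hpi_arr = mmap(NULL, sizeof(internal_config.hugepage_info),
                    PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    /* hpi_arr[i] is valid for i < internal_config.num_hugepage_sizes */
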
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index bdfb882..8a8c44e 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -6,6 +6,8 @@
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <inttypes.h>
+#include <errno.h>
+#include <string.h>
 #include <fcntl.h>
 
 #include <rte_eal.h>
@@ -41,37 +43,135 @@ rte_eal_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	uint64_t total_mem = 0;
 	void *addr;
-	unsigned i, j, seg_idx = 0;
+	unsigned int i, j, seg_idx = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
 	/* for debug purposes, hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
-		addr = malloc(internal_config.memory);
-		mcfg->memseg[0].iova = (rte_iova_t)(uintptr_t)addr;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		struct rte_memseg *ms;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
+		addr = mmap(NULL, internal_config.memory,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is 1 page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->len = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, page_sz);
+		}
 		return 0;
 	}
 
 	/* map all hugepages and sort them */
 	for (i = 0; i < internal_config.num_hugepage_sizes; i ++){
 		struct hugepage_info *hpi;
+		uint64_t page_sz, mem_needed;
+		unsigned int n_pages, max_pages;
 
 		hpi = &internal_config.hugepage_info[i];
-		for (j = 0; j < hpi->num_pages[0]; j++) {
+		page_sz = hpi->hugepage_sz;
+		max_pages = hpi->num_pages[0];
+		mem_needed = RTE_ALIGN_CEIL(internal_config.memory - total_mem,
+				page_sz);
+
+		n_pages = RTE_MIN(mem_needed / page_sz, max_pages);
+
+		for (j = 0; j < n_pages; j++) {
+			struct rte_memseg_list *msl;
+			struct rte_fbarray *arr;
 			struct rte_memseg *seg;
+			int msl_idx, ms_idx;
 			rte_iova_t physaddr;
 			int error;
 			size_t sysctl_size = sizeof(physaddr);
 			char physaddr_str[64];
 
-			addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-				    MAP_SHARED, hpi->lock_descriptor,
-				    j * EAL_PAGE_SIZE);
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				bool empty;
+				msl = &mcfg->memsegs[msl_idx];
+				arr = &msl->memseg_arr;
+
+				if (msl->page_sz != page_sz)
+					continue;
+
+				empty = arr->count == 0;
+
+				/* we need 1, plus hole if not empty */
+				ms_idx = rte_fbarray_find_next_n_free(arr,
+						0, 1 + (empty ? 1 : 0));
+
+				/* memseg list is full? */
+				if (ms_idx < 0)
+					continue;
+
+				/* leave some space between memsegs, they are
+				 * not IOVA contiguous, so they shouldn't be VA
+				 * contiguous either.
+				 */
+				if (!empty)
+					ms_idx++;
+
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+					RTE_STR(CONFIG_RTE_MAX_MEM_PER_TYPE));
+				return -1;
+			}
+			arr = &msl->memseg_arr;
+			seg = rte_fbarray_get(arr, ms_idx);
+
+			addr = RTE_PTR_ADD(msl->base_va,
+					(size_t)msl->page_sz * ms_idx);
+
+			/* address is already mapped in memseg list, so using
+			 * MAP_FIXED here is safe.
+			 */
+			addr = mmap(addr, page_sz, PROT_READ|PROT_WRITE,
+					MAP_SHARED | MAP_FIXED,
+					hpi->lock_descriptor,
+					j * EAL_PAGE_SIZE);
 			if (addr == MAP_FAILED) {
 				RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 						j, hpi->hugedir);
@@ -88,33 +188,62 @@ rte_eal_hugepage_init(void)
 				return -1;
 			}
 
-			seg = &mcfg->memseg[seg_idx++];
 			seg->addr = addr;
 			seg->iova = physaddr;
-			seg->hugepage_sz = hpi->hugepage_sz;
-			seg->len = hpi->hugepage_sz;
+			seg->hugepage_sz = page_sz;
+			seg->len = page_sz;
 			seg->nchannel = mcfg->nchannel;
 			seg->nrank = mcfg->nrank;
 			seg->socket_id = 0;
 
+			rte_fbarray_set_used(arr, ms_idx);
+
 			RTE_LOG(INFO, EAL, "Mapped memory segment %u @ %p: physaddr:0x%"
 					PRIx64", len %zu\n",
-					seg_idx, addr, physaddr, hpi->hugepage_sz);
-			if (total_mem >= internal_config.memory ||
-					seg_idx >= RTE_MAX_MEMSEG)
-				break;
+					seg_idx, addr, physaddr, page_sz);
+
+			total_mem += seg->len;
 		}
+		if (total_mem >= internal_config.memory)
+			break;
+	}
+	if (total_mem < internal_config.memory) {
+		RTE_LOG(ERR, EAL, "Couldn't reserve requested memory, "
+				"requested: %" PRIu64 "M "
+				"available: %" PRIu64 "M\n",
+				internal_config.memory >> 20, total_mem >> 20);
+		return -1;
 	}
 	return 0;
 }
 
+struct attach_walk_args {
+	int fd_hugepage;
+	int seg_idx;
+};
+static int
+attach_segment(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
+{
+	struct attach_walk_args *wa = arg;
+	void *addr;
+
+	addr = mmap(ms->addr, ms->len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, wa->fd_hugepage,
+			wa->seg_idx * EAL_PAGE_SIZE);
+	if (addr == MAP_FAILED || addr != ms->addr)
+		return -1;
+	wa->seg_idx++;
+
+	return 0;
+}
+
 int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
 	int fd_hugepage_info, fd_hugepage = -1;
-	unsigned i = 0;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int i;
 
 	/* Obtain a file descriptor for hugepage_info */
 	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
@@ -124,41 +253,43 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(struct hugepage_info), PROT_READ, MAP_PRIVATE,
-			fd_hugepage_info, 0);
+	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
+			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
 	if (hpi == MAP_FAILED) {
 		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
 		goto error;
 	}
 
-	/* Obtain a file descriptor for contiguous memory */
-	fd_hugepage = open(hpi->hugedir, O_RDWR);
-	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", hpi->hugedir);
-		goto error;
-	}
+	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
+		const struct hugepage_info *cur_hpi = &hpi[i];
+		struct attach_walk_args wa;
 
-	/* Map the contiguous memory into each memory segment */
-	for (i = 0; i < hpi->num_pages[0]; i++) {
+		memset(&wa, 0, sizeof(wa));
 
-		void *addr;
-		struct rte_memseg *seg = &mcfg->memseg[i];
+		/* Obtain a file descriptor for contiguous memory */
+		fd_hugepage = open(cur_hpi->hugedir, O_RDWR);
+		if (fd_hugepage < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s\n",
+					cur_hpi->hugedir);
+			goto error;
+		}
+		wa.fd_hugepage = fd_hugepage;
+		wa.seg_idx = 0;
 
-		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			    MAP_SHARED|MAP_FIXED, fd_hugepage,
-			    i * EAL_PAGE_SIZE);
-		if (addr == MAP_FAILED || addr != seg->addr) {
+		/* Map the contiguous memory into each memory segment */
+		if (rte_memseg_walk(attach_segment, &wa) < 0) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
-				i, hpi->hugedir);
+				wa.seg_idx, cur_hpi->hugedir);
 			goto error;
 		}
 
+		close(fd_hugepage);
+		fd_hugepage = -1;
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(struct hugepage_info));
+	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
 	close(fd_hugepage_info);
-	close(fd_hugepage);
 	return 0;
 
 error:
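
(The placement idiom used in rte_eal_hugepage_init() above, extracted as a
sketch; place_one_seg() is hypothetical, the rte_fbarray calls are the ones
this series relies on.)

    static int
    place_one_seg(struct rte_fbarray *arr, struct rte_memseg **ms_out)
    {
            bool empty = arr->count == 0;
            /* need one slot, plus a hole if the list already has entries */
            int idx = rte_fbarray_find_next_n_free(arr, 0,
                            1 + (empty ? 1 : 0));

            if (idx < 0)
                    return -1; /* list is full */
            if (!empty)
                    idx++; /* keep a hole, segments are not contiguous */
            *ms_out = rte_fbarray_get(arr, idx);
            rte_fbarray_set_used(arr, idx);
            return idx; /* caller fills in addr/iova/len on *ms_out */
    }
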
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fd78d2f..d519f15 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,394 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz, uint64_t max_mem)
+{
+	uint64_t area_sz, max_pages;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_MB_PER_LIST */
+	max_pages = RTE_MAX_MEMSEG_PER_LIST;
+	max_mem = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_LIST << 20, max_mem);
+
+	area_sz = RTE_MIN(page_sz * max_pages, max_mem);
+
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return RTE_ALIGN(area_sz, page_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		uint64_t max_mem, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	uint64_t mem_amount;
+	int max_segs;
+
+	mem_amount = get_mem_amount(page_sz, max_mem);
+	max_segs = mem_amount / page_sz;
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init_32(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int active_sockets, hpi_idx, msl_idx = 0;
+	unsigned int socket_id, i;
+	struct rte_memseg_list *msl;
+	uint64_t extra_mem_per_socket, total_extra_mem, total_requested_mem;
+	uint64_t max_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	/* this is a giant hack, but desperate times call for desperate
+	 * measures. in legacy 32-bit mode, we cannot preallocate VA space,
+	 * because having upwards of 2 gigabytes of VA space already mapped will
+	 * interfere with our ability to map and sort hugepages.
+	 *
+	 * therefore, in legacy 32-bit mode, we will be initializing memseg
+	 * lists much later - in eal_memory.c, right after we unmap all the
+	 * unneeded pages. this will not affect secondary processes, as those
+	 * should be able to mmap the space without (too many) problems.
+	 */
+	if (internal_config.legacy_mem)
+		return 0;
+
+	/* 32-bit mode is a very special case. we cannot know in advance where
+	 * the user will want to allocate their memory, so we have to do some
+	 * heuristics.
+	 */
+	active_sockets = 0;
+	total_requested_mem = 0;
+	if (internal_config.force_sockets)
+		for (i = 0; i < rte_socket_count(); i++) {
+			uint64_t mem;
+
+			socket_id = rte_socket_id_by_idx(i);
+			mem = internal_config.socket_mem[socket_id];
+
+			if (mem == 0)
+				continue;
+
+			active_sockets++;
+			total_requested_mem += mem;
+		}
+	else
+		total_requested_mem = internal_config.memory;
+
+	max_mem = (uint64_t) RTE_MAX_MEM_MB_PER_TYPE << 20;
+	if (total_requested_mem > max_mem) {
+		RTE_LOG(ERR, EAL, "Invalid parameters: 32-bit process can at most use %uM of memory\n",
+				(unsigned int)(max_mem >> 20));
+		return -1;
+	}
+	total_extra_mem = max_mem - total_requested_mem;
+	extra_mem_per_socket = active_sockets == 0 ? total_extra_mem :
+			total_extra_mem / active_sockets;
+
+	/* the allocation logic is a little bit convoluted, but here's how it
+	 * works, in a nutshell:
+	 *  - if user hasn't specified on which sockets to allocate memory via
+	 *    --socket-mem, we allocate all of our memory on master core socket.
+	 *  - if user has specified sockets to allocate memory on, there may be
+	 *    some "unused" memory left (e.g. if user has specified --socket-mem
+	 *    such that not all memory adds up to 2 gigabytes), so add it to all
+	 *    sockets that are in use equally.
+	 *
+	 * page sizes are sorted by size in descending order, so we can safely
+	 * assume that we dispense with bigger page sizes first.
+	 */
+
+	/* create memseg lists */
+	for (i = 0; i < rte_socket_count(); i++) {
+		int hp_sizes = (int) internal_config.num_hugepage_sizes;
+		uint64_t max_socket_mem, cur_socket_mem;
+		unsigned int master_lcore_socket;
+		struct rte_config *cfg = rte_eal_get_configuration();
+		bool skip;
+
+		socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+		if (socket_id > 0)
+			break;
+#endif
+
+		/* if we didn't specifically request memory on this socket */
+		skip = active_sockets != 0 &&
+				internal_config.socket_mem[socket_id] == 0;
+		/* ...or if we didn't specifically request memory on *any*
+		 * socket, and this is not master lcore
+		 */
+		master_lcore_socket = rte_lcore_to_socket_id(cfg->master_lcore);
+		skip |= active_sockets == 0 && socket_id != master_lcore_socket;
+
+		if (skip) {
+			RTE_LOG(DEBUG, EAL, "Will not preallocate memory on socket %u\n",
+					socket_id);
+			continue;
+		}
+
+		/* max amount of memory on this socket */
+		max_socket_mem = (active_sockets != 0 ?
+					internal_config.socket_mem[socket_id] :
+					internal_config.memory) +
+					extra_mem_per_socket;
+		cur_socket_mem = 0;
+
+		for (hpi_idx = 0; hpi_idx < hp_sizes; hpi_idx++) {
+			uint64_t max_pagesz_mem, cur_pagesz_mem = 0;
+			uint64_t hugepage_sz;
+			struct hugepage_info *hpi;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			hpi = &internal_config.hugepage_info[hpi_idx];
+			hugepage_sz = hpi->hugepage_sz;
+
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+			max_pagesz_mem = max_socket_mem - cur_socket_mem;
+
+			/* make it multiple of page size */
+			max_pagesz_mem = RTE_ALIGN_FLOOR(max_pagesz_mem,
+					hugepage_sz);
+
+			RTE_LOG(DEBUG, EAL, "Attempting to preallocate "
+					"%" PRIu64 "M on socket %i\n",
+					max_pagesz_mem >> 20, socket_id);
+
+			type_msl_idx = 0;
+			while (cur_pagesz_mem < max_pagesz_mem &&
+					total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						max_pagesz_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				cur_pagesz_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			cur_socket_mem += cur_pagesz_mem;
+		}
+	}
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init(void)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+	uint64_t max_mem, total_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	max_mem = (uint64_t)RTE_MAX_MEM_MB << 20;
+	total_mem = 0;
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (i = 0; i < (int) rte_socket_count(); i++) {
+			uint64_t max_type_mem, total_type_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+			if (socket_id > 0)
+				break;
+#endif
+
+			max_type_mem = RTE_MIN(max_mem - total_mem,
+				(uint64_t)RTE_MAX_MEM_MB_PER_TYPE << 20);
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_type_mem < max_type_mem &&
+					total_segs < max_segs) {
+				uint64_t cur_max_mem;
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				cur_max_mem = max_type_mem - total_type_mem;
+				if (alloc_memseg_list(msl, hugepage_sz,
+						cur_max_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_type_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			total_mem += total_type_mem;
+		}
+	}
+	return 0;
+}
+
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	const struct rte_fbarray *arr;
+	void *start, *end;
+	int ms_idx;
+
+	/* a memseg list was specified, check if it's the right one */
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, (size_t)msl->page_sz * msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start,
+				(size_t)msl->page_sz * msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	return virt2memseg_list(addr);
 }
 
 struct virtiova {
@@ -136,7 +518,8 @@ struct virtiova {
 	void *virt;
 };
 static int
-find_virt(const struct rte_memseg *ms, void *arg)
+find_virt(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct virtiova *vi = arg;
 	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
@@ -147,6 +530,19 @@ find_virt(const struct rte_memseg *ms, void *arg)
 	}
 	return 0;
 }
+static int
+find_virt_legacy(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
 
 __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova)
@@ -156,54 +552,30 @@ rte_mem_iova2virt(rte_iova_t iova)
 	memset(&vi, 0, sizeof(vi));
 
 	vi.iova = iova;
-	rte_memseg_walk(find_virt, &vi);
+	/* for legacy mem, we can get away with scanning VA-contiguous segments,
+	 * as we know they are PA-contiguous as well
+	 */
+	if (internal_config.legacy_mem)
+		rte_memseg_contig_walk(find_virt_legacy, &vi);
+	else
+		rte_memseg_walk(find_virt, &vi);
 
 	return vi.virt;
 }
 
-struct virtms {
-	const void *virt;
-	struct rte_memseg *ms;
-};
-static int
-find_memseg(const struct rte_memseg *ms, void *arg)
-{
-	struct virtms *vm = arg;
-
-	if (arg >= ms->addr && arg < RTE_PTR_ADD(ms->addr, ms->len)) {
-		struct rte_memseg *memseg, *found_ms;
-		int idx;
-
-		memseg = rte_eal_get_configuration()->mem_config->memseg;
-		idx = ms - memseg;
-		found_ms = &memseg[idx];
-
-		vm->ms = found_ms;
-		return 1;
-	}
-	return 0;
-}
-
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *addr)
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
-	struct virtms vm;
-
-	memset(&vm, 0, sizeof(vm));
-
-	vm.virt = addr;
-
-	rte_memseg_walk(find_memseg, &vm);
-
-	return vm.ms;
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 static int
-physmem_size(const struct rte_memseg *ms, void *arg)
+physmem_size(const struct rte_memseg_list *msl, void *arg)
 {
 	uint64_t *total_len = arg;
 
-	*total_len += ms->len;
+	*total_len += msl->memseg_arr.count * msl->page_sz;
 
 	return 0;
 }
@@ -214,32 +586,39 @@ rte_eal_get_physmem_size(void)
 {
 	uint64_t total_len = 0;
 
-	rte_memseg_walk(physmem_size, &total_len);
+	rte_memseg_list_walk(physmem_size, &total_len);
 
 	return total_len;
 }
 
 static int
-dump_memseg(const struct rte_memseg *ms, void *arg)
+dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i = ms - mcfg->memseg;
+	int msl_idx, ms_idx;
 	FILE *f = arg;
 
-	if (i < 0 || i >= RTE_MAX_MEMSEG)
+	msl_idx = msl - mcfg->memsegs;
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
 		return -1;
 
-	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+	ms_idx = rte_fbarray_find_idx(&msl->memseg_arr, ms);
+	if (ms_idx < 0)
+		return -1;
+
+	fprintf(f, "Segment %i-%i: IOVA:0x%"PRIx64", len:%zu, "
 			"virt:%p, socket_id:%"PRId32", "
 			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-			"nrank:%"PRIx32"\n", i,
-			mcfg->memseg[i].iova,
-			mcfg->memseg[i].len,
-			mcfg->memseg[i].addr,
-			mcfg->memseg[i].socket_id,
-			mcfg->memseg[i].hugepage_sz,
-			mcfg->memseg[i].nchannel,
-			mcfg->memseg[i].nrank);
+			"nrank:%"PRIx32"\n",
+			msl_idx, ms_idx,
+			ms->iova,
+			ms->len,
+			ms->addr,
+			ms->socket_id,
+			ms->hugepage_sz,
+			ms->nchannel,
+			ms->nrank);
 
 	return 0;
 }
@@ -289,55 +668,89 @@ rte_mem_lock_page(const void *virt)
 }
 
 int __rte_experimental
-rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		ret = func(ms, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			int n_segs;
+			size_t len;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			/* find how many more segments there are, starting with
+			 * this one.
+			 */
+			n_segs = rte_fbarray_find_contig_used(arr, ms_idx);
+			len = n_segs * msl->page_sz;
+
+			ret = func(msl, ms, len, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx + n_segs);
+		}
 	}
 	return 0;
 }
 
 int __rte_experimental
-rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, j, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-		size_t total_len;
-		void *end_addr;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
 
-		/* check how many more segments are contiguous to this one */
-		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
-			const struct rte_memseg *next = &mcfg->memseg[j];
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
 
-			if (next->addr != end_addr)
-				break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
-			end_addr = RTE_PTR_ADD(next->addr, next->len);
-			i++;
-		}
-		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+		if (msl->base_va == NULL)
+			continue;
 
-		ret = func(ms, total_len, arg);
+		ret = func(msl, arg);
 		if (ret < 0)
 			return -1;
 		if (ret > 0)
@@ -350,9 +763,25 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 int
 rte_eal_memory_init(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	if (!mcfg)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+#ifndef RTE_ARCH_64
+			memseg_primary_init_32() :
+#else
+			memseg_primary_init() :
+#endif
+			memseg_secondary_init();
+
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
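
(A hypothetical caller-side sketch of the reworked lookup API added in this
file; 'buf' is assumed to point into DPDK-managed memory, and passing NULL
as the list would make rte_mem_virt2memseg() find it first.)

    struct rte_memseg_list *msl = rte_mem_virt2memseg_list(buf);
    struct rte_memseg *ms = rte_mem_virt2memseg(buf, msl);

    if (ms != NULL)
            printf("page: va=%p iova=0x%" PRIx64 " len=%zu\n",
                            ms->addr, ms->iova, ms->len);
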
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index d60bde7..1f5f753 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -239,10 +239,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->page_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -364,20 +363,50 @@ static void
 dump_memzone(const struct rte_memzone *mz, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *cur_addr, *mz_end;
+	struct rte_memseg *ms;
+	int mz_idx, ms_idx;
+	size_t page_sz;
 	FILE *f = arg;
-	int mz_idx;
 
 	mz_idx = mz - mcfg->memzone;
 
-	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
 			mz_idx,
 			mz->name,
-			mz->iova,
 			mz->len,
 			mz->addr,
 			mz->socket_id,
 			mz->flags);
+
+	/* go through each page occupied by this memzone */
+	msl = rte_mem_virt2memseg_list(mz->addr);
+	if (!msl) {
+		RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+		return;
+	}
+	page_sz = (size_t)mz->hugepage_sz;
+	cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, page_sz);
+	mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+	fprintf(f, "physical segments used:\n");
+	ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) / page_sz;
+	ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+	do {
+		fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+				"len: 0x%zx "
+				"pagesz: 0x%zx\n",
+			cur_addr, ms->iova, ms->len, page_sz);
+
+		/* advance VA to next page */
+		cur_addr = RTE_PTR_ADD(cur_addr, page_sz);
+
+		/* memzones occupy contiguous segments */
+		++ms;
+	} while (cur_addr < mz_end);
 }
 
 /* Dump all reserved memory zones on console */
@@ -394,7 +423,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -403,12 +431,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
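
(The page-index arithmetic used in dump_memzone() above, as a stand-alone
sketch; the helper name is illustrative.)

    static inline int
    addr_to_ms_idx(const struct rte_memseg_list *msl, const void *addr)
    {
            /* a VA inside a list maps to a slot by offset / page size */
            return RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
    }
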
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..ad1b0b6 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -22,7 +22,6 @@ struct hugepage_file {
 	size_t size;        /**< the page size */
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
-	int memseg_id;      /**< the memory segment to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index fda087b..5cf7102 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..b745e18 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * memseg list is a special case as we need to store a bunch of other data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
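
(For reference, the VA span covered by one of the new memseg lists follows
directly from these fields; a small sketch, helper name illustrative.)

    static inline size_t
    memseg_list_va_len(const struct rte_memseg_list *msl)
    {
            /* fbarray length is the number of page-sized slots reserved */
            return (size_t)msl->page_sz * msl->memseg_arr.len;
    }
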
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b3d7e61..55383c4 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -23,6 +23,9 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -151,7 +154,18 @@ rte_mem_iova2virt(rte_iova_t iova);
  *   Memseg pointer on success, or NULL on error.
  */
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *virt);
+rte_mem_virt2memseg(const void *virt, const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg list to which this virtual address belongs.
+ */
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Memseg walk function prototype.
@@ -160,7 +174,8 @@ rte_mem_virt2memseg(const void *virt);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, void *arg);
 
 /**
  * Memseg contig walk function prototype. This will trigger a callback on every
@@ -171,8 +186,19 @@ typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
-		size_t len, void *arg);
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg);
+
+/**
+ * Memseg list walk function prototype. This will trigger a callback on every
+ * allocated memseg list.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
+		void *arg);
 
 /**
  * Walk list of all memsegs.
@@ -205,21 +231,19 @@ int __rte_experimental
 rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 
 /**
- * Get the layout of the available physical memory.
- *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * Walk each allocated memseg list.
  *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
  */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 
 /**
  * Dump the physical memory layout to a file.
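
(A hypothetical example of the new three-argument walk callback;
sum_per_socket() is not part of this patch, and 'sums' is assumed to be an
array of RTE_MAX_NUMA_NODES entries provided by the caller.)

    static int
    sum_per_socket(const struct rte_memseg_list *msl __rte_unused,
                    const struct rte_memseg *ms, void *arg)
    {
            uint64_t *sums = arg;

            sums[ms->socket_id] += ms->len;
            return 0;
    }

    /* caller: rte_memseg_walk(sum_per_socket, sums); */
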
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index e2630fd..0eeb94f 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -68,7 +68,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 87695b9..685aac4 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -27,11 +27,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -100,7 +100,7 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
 		void *start, size_t size)
 {
 	rte_iova_t cur, expected;
@@ -191,7 +191,7 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
 			 * couldn't fit all data into one physically contiguous
 			 * block, try again with lower addresses.
 			 */
-			if (!elem_check_phys_contig(elem->ms,
+			if (!elem_check_phys_contig(elem->msl,
 					(void *)new_data_start,
 					new_data_size)) {
 				elem_size -= align;
@@ -225,7 +225,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 34bd268..620dd44 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -7,7 +7,7 @@
 
 #include <stdbool.h>
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -26,7 +26,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -113,7 +113,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 79914fc..0ef2c45 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,36 +63,49 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
+{
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
+
+	malloc_elem_free_list_insert(elem);
+
+	heap->total_size += len;
+
+	return elem;
+}
+
 static int
-malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
+malloc_add_seg(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg __rte_unused)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_elem *start_elem;
-	struct rte_memseg *found_ms;
+	struct rte_memseg_list *found_msl;
 	struct malloc_heap *heap;
-	size_t elem_size;
-	int ms_idx;
-
-	heap = &mcfg->malloc_heaps[ms->socket_id];
+	int msl_idx;
 
-	/* ms is const, so find it */
-	ms_idx = ms - mcfg->memseg;
-	found_ms = &mcfg->memseg[ms_idx];
+	heap = &mcfg->malloc_heaps[msl->socket_id];
 
-	start_elem = (struct malloc_elem *)found_ms->addr;
-	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	/* msl is const, so find it */
+	msl_idx = msl - mcfg->memsegs;
+	found_msl = &mcfg->memsegs[msl_idx];
 
-	malloc_elem_init(start_elem, heap, found_ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
+		return -1;
 
-	heap->total_size += elem_size;
+	malloc_heap_add_memory(heap, found_msl, ms->addr, len);
 
+	RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20,
+			msl->socket_id);
 	return 0;
 }
 
@@ -114,7 +128,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound,
 					contig)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->page_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -263,7 +278,6 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
-	rte_memseg_walk(malloc_heap_add_memseg, NULL);
-
-	return 0;
+	/* add all IOVA-contiguous areas to the heap */
+	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
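
(Sketch of how the contiguous walk drives heap population above: the
callback sees each VA-contiguous run as (list, first memseg, run length).
print_run() is illustrative only.)

    static int
    print_run(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
                    size_t len, void *arg __rte_unused)
    {
            printf("socket %i: contiguous run of %zu bytes at %p\n",
                            msl->socket_id, len, ms->addr);
            return 0;
    }

    /* caller: rte_memseg_contig_walk(print_run, NULL); */
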
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 436818a..c6d3e57 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -242,17 +242,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t) addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
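
(Caller-side sketch of the reworked translation; 'buf' is assumed to come
from rte_malloc().)

    rte_iova_t iova = rte_malloc_virt2iova(buf);

    if (iova == RTE_BAD_IOVA) {
            /* not rte_malloc()-managed memory, or page has no IOVA */
    }
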
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b34e57a..ffcbd71 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -640,11 +640,14 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
-	int *socket = arg;
+	int *socket_id = arg;
 
-	return ms->socket_id == *socket;
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
+		return 1;
+
+	return 0;
 }
 
 static void
@@ -654,7 +657,7 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..afebd42 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -160,6 +161,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -189,6 +202,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -205,11 +220,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = get_file_size(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
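
(The flock() to fcntl() switch above, condensed into a sketch;
get_file_size() is the helper added earlier in this file.)

    struct flock lck = {0};

    lck.l_type = F_RDLCK;
    lck.l_whence = SEEK_SET;
    lck.l_start = 0;
    lck.l_len = get_file_size(fd);

    if (fcntl(fd, F_SETLK, &lck) != -1) {
            /* lock acquired: file is not in use, safe to unlink it */
            lck.l_type = F_UNLCK;
            fcntl(fd, F_SETLK, &lck);
    }
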
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 17c559f..d38fb68 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -253,13 +253,12 @@ void numa_error(char *where)
  */
 static unsigned
 map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
-		  uint64_t *essential_memory __rte_unused, int orig)
+		  uint64_t *essential_memory __rte_unused)
 {
 	int fd;
 	unsigned i;
 	void *virtaddr;
-	void *vma_addr = NULL;
-	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -274,7 +273,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		have_numa = false;
 	}
 
-	if (orig && have_numa) {
+	if (have_numa) {
 		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
 		if (get_mempolicy(&oldpolicy, oldmask->maskp,
 				  oldmask->size + 1, 0, 0) < 0) {
@@ -290,6 +289,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 #endif
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
+		struct hugepage_file *hf = &hugepg_tbl[i];
 		uint64_t hugepage_sz = hpi->hugepage_sz;
 
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
@@ -324,66 +324,14 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 #endif
 
-		if (orig) {
-			hugepg_tbl[i].file_id = i;
-			hugepg_tbl[i].size = hugepage_sz;
-			eal_get_hugefile_path(hugepg_tbl[i].filepath,
-					sizeof(hugepg_tbl[i].filepath), hpi->hugedir,
-					hugepg_tbl[i].file_id);
-			hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
-		}
-#ifndef RTE_ARCH_64
-		/* for 32-bit systems, don't remap 1G and 16G pages, just reuse
-		 * original map address as final map address.
-		 */
-		else if ((hugepage_sz == RTE_PGSIZE_1G)
-			|| (hugepage_sz == RTE_PGSIZE_16G)) {
-			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
-			hugepg_tbl[i].orig_va = NULL;
-			continue;
-		}
-#endif
-		else if (vma_len == 0) {
-			unsigned j, num_pages;
-
-			/* reserve a virtual area for next contiguous
-			 * physical block: count the number of
-			 * contiguous physical pages. */
-			for (j = i+1; j < hpi->num_pages[0] ; j++) {
-#ifdef RTE_ARCH_PPC_64
-				/* The physical addresses are sorted in
-				 * descending order on PPC64 */
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr - hugepage_sz)
-					break;
-#else
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr + hugepage_sz)
-					break;
-#endif
-			}
-			num_pages = j - i;
-			vma_len = num_pages * hugepage_sz;
-
-			/* get the biggest virtual memory area up to
-			 * vma_len. If it fails, vma_addr is NULL, so
-			 * let the kernel provide the address. */
-			vma_addr = eal_get_virtual_area(NULL, &vma_len,
-					hpi->hugepage_sz,
-					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
-					EAL_VIRTUAL_AREA_UNMAP,
-#ifdef RTE_ARCH_PPC_64
-					MAP_HUGETLB
-#else
-					0
-#endif
-					);
-			if (vma_addr == NULL)
-				vma_len = hugepage_sz;
-		}
+		hf->file_id = i;
+		hf->size = hugepage_sz;
+		eal_get_hugefile_path(hf->filepath, sizeof(hf->filepath),
+				hpi->hugedir, hf->file_id);
+		hf->filepath[sizeof(hf->filepath) - 1] = '\0';
 
 		/* try to create hugepage file */
-		fd = open(hugepg_tbl[i].filepath, O_CREAT | O_RDWR, 0600);
+		fd = open(hf->filepath, O_CREAT | O_RDWR, 0600);
 		if (fd < 0) {
 			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
 					strerror(errno));
@@ -391,8 +339,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		/* map the segment, and populate page tables,
-		 * the kernel fills this segment with zeros */
-		virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
+		 * the kernel fills this segment with zeros. we don't care where
+		 * this gets mapped - we already have contiguous memory areas
+		 * ready for us to map into.
+		 */
+		virtaddr = mmap(NULL, hugepage_sz, PROT_READ | PROT_WRITE,
 				MAP_SHARED | MAP_POPULATE, fd, 0);
 		if (virtaddr == MAP_FAILED) {
 			RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
@@ -401,44 +352,38 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			goto out;
 		}
 
-		if (orig) {
-			hugepg_tbl[i].orig_va = virtaddr;
-		}
-		else {
-			/* rewrite physical addresses in IOVA as VA mode */
-			if (rte_eal_iova_mode() == RTE_IOVA_VA)
-				hugepg_tbl[i].physaddr = (uintptr_t)virtaddr;
-			hugepg_tbl[i].final_va = virtaddr;
-		}
+		hf->orig_va = virtaddr;
 
-		if (orig) {
-			/* In linux, hugetlb limitations, like cgroup, are
-			 * enforced at fault time instead of mmap(), even
-			 * with the option of MAP_POPULATE. Kernel will send
-			 * a SIGBUS signal. To avoid to be killed, save stack
-			 * environment here, if SIGBUS happens, we can jump
-			 * back here.
-			 */
-			if (huge_wrap_sigsetjmp()) {
-				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
-					"hugepages of size %u MB\n",
-					(unsigned)(hugepage_sz / 0x100000));
-				munmap(virtaddr, hugepage_sz);
-				close(fd);
-				unlink(hugepg_tbl[i].filepath);
+		/* In linux, hugetlb limitations, like cgroup, are
+		 * enforced at fault time instead of mmap(), even
+		 * with the option of MAP_POPULATE. Kernel will send
+		 * a SIGBUS signal. To avoid being killed, save stack
+		 * environment here, if SIGBUS happens, we can jump
+		 * back here.
+		 */
+		if (huge_wrap_sigsetjmp()) {
+			RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
+				"hugepages of size %u MB\n",
+				(unsigned int)(hugepage_sz / 0x100000));
+			munmap(virtaddr, hugepage_sz);
+			close(fd);
+			unlink(hugepg_tbl[i].filepath);
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-				if (maxnode)
-					essential_memory[node_id] =
-						essential_prev;
+			if (maxnode)
+				essential_memory[node_id] =
+					essential_prev;
 #endif
-				goto out;
-			}
-			*(int *)virtaddr = 0;
+			goto out;
 		}
+		*(int *)virtaddr = 0;
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -446,9 +391,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		close(fd);
-
-		vma_addr = (char *)vma_addr + hugepage_sz;
-		vma_len -= hugepage_sz;
 	}
 
 out:
@@ -470,20 +412,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	return i;
 }
 
-/* Unmap all hugepages from original mapping */
-static int
-unmap_all_hugepages_orig(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
-{
-        unsigned i;
-        for (i = 0; i < hpi->num_pages[0]; i++) {
-                if (hugepg_tbl[i].orig_va) {
-                        munmap(hugepg_tbl[i].orig_va, hpi->hugepage_sz);
-                        hugepg_tbl[i].orig_va = NULL;
-                }
-        }
-        return 0;
-}
-
 /*
  * Parse /proc/self/numa_maps to get the NUMA socket ID for each huge
  * page.
@@ -623,7 +551,7 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, int dest_size,
 	int src_pos, dst_pos = 0;
 
 	for (src_pos = 0; src_pos < src_size; src_pos++) {
-		if (src[src_pos].final_va != NULL) {
+		if (src[src_pos].orig_va != NULL) {
 			/* error on overflow attempt */
 			if (dst_pos == dest_size)
 				return -1;
@@ -694,9 +622,10 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 						unmap_len = hp->size;
 
 						/* get start addr and len of the remaining segment */
-						munmap(hp->final_va, (size_t) unmap_len);
+						munmap(hp->orig_va,
+							(size_t)unmap_len);
 
-						hp->final_va = NULL;
+						hp->orig_va = NULL;
 						if (unlink(hp->filepath) == -1) {
 							RTE_LOG(ERR, EAL, "%s(): Removing %s failed: %s\n",
 									__func__, hp->filepath, strerror(errno));
@@ -715,6 +644,413 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 	return 0;
 }
 
+static int
+remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int cur_page, seg_len;
+	unsigned int msl_idx;
+	int ms_idx;
+	uint64_t page_sz;
+	size_t memseg_len;
+	int socket_id;
+
+	page_sz = hugepages[seg_start].size;
+	socket_id = hugepages[seg_start].socket_id;
+	seg_len = seg_end - seg_start;
+
+	RTE_LOG(DEBUG, EAL, "Attempting to map %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20ULL, socket_id);
+
+	/* find free space in memseg lists */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool empty;
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		if (msl->page_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket_id)
+			continue;
+
+		/* leave space for a hole if array is not empty */
+		empty = arr->count == 0;
+		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
+				seg_len + (empty ? 0 : 1));
+
+		/* memseg list is full? */
+		if (ms_idx < 0)
+			continue;
+
+		/* leave some space between memsegs, they are not IOVA
+		 * contiguous, so they shouldn't be VA contiguous either.
+		 */
+		if (!empty)
+			ms_idx++;
+		break;
+	}
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+				RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+				RTE_STR(CONFIG_RTE_MAX_MEM_PER_TYPE));
+		return -1;
+	}
+
+#ifdef RTE_ARCH_PPC_64
+	/* for PPC64 we go through the list backwards */
+	for (cur_page = seg_end - 1; cur_page >= seg_start;
+			cur_page--, ms_idx++) {
+#else
+	for (cur_page = seg_start; cur_page < seg_end; cur_page++, ms_idx++) {
+#endif
+		struct hugepage_file *hfile = &hugepages[cur_page];
+		struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+		struct flock lck;
+		void *addr;
+		int fd;
+
+		fd = open(hfile->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			return -1;
+		}
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = page_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "Could not lock '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+		memseg_len = (size_t)page_sz;
+		addr = RTE_PTR_ADD(msl->base_va, ms_idx * memseg_len);
+
+		/* we know this address is already mmapped by memseg list, so
+		 * using MAP_FIXED here is safe
+		 */
+		addr = mmap(addr, page_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Couldn't remap '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		/* we have a new address, so unmap previous one */
+#ifndef RTE_ARCH_64
+		/* in 32-bit legacy mode, we have already unmapped the page */
+		if (!internal_config.legacy_mem)
+			munmap(hfile->orig_va, page_sz);
+#else
+		munmap(hfile->orig_va, page_sz);
+#endif
+
+		hfile->orig_va = NULL;
+		hfile->final_va = addr;
+
+		/* rewrite physical addresses in IOVA as VA mode */
+		if (rte_eal_iova_mode() == RTE_IOVA_VA)
+			hfile->physaddr = (uintptr_t)addr;
+
+		/* set up memseg data */
+		ms->addr = addr;
+		ms->hugepage_sz = page_sz;
+		ms->len = memseg_len;
+		ms->iova = hfile->physaddr;
+		ms->socket_id = hfile->socket_id;
+		ms->nchannel = rte_memory_get_nchannel();
+		ms->nrank = rte_memory_get_nrank();
+
+		rte_fbarray_set_used(arr, ms_idx);
+
+		close(fd);
+	}
+	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20, socket_id);
+	return 0;
+}
+
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int n_segs, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, n_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+/*
+ * Our VA space is not preallocated yet, so preallocate it here. We need to know
+ * how many segments there are in order to map all pages into one address space,
+ * and leave appropriate holes between segments so that rte_malloc does not
+ * concatenate them into one big segment.
+ *
+ * we also need to unmap original pages to free up address space.
+ */
+static int __rte_unused
+prealloc_segments(struct hugepage_file *hugepages, int n_pages)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int cur_page, seg_start_page, end_seg, new_memseg;
+	unsigned int hpi_idx, socket, i;
+	int n_contig_segs, n_segs;
+	int msl_idx;
+
+	/* before we preallocate segments, we need to free up our VA space.
+	 * we're not removing files, and we already have information about
+	 * PA-contiguousness, so it is safe to unmap everything.
+	 */
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *hpi = &hugepages[cur_page];
+		munmap(hpi->orig_va, hpi->size);
+		hpi->orig_va = NULL;
+	}
+
+	/* we cannot know how many page sizes and sockets we have discovered, so
+	 * loop over all of them
+	 */
+	for (hpi_idx = 0; hpi_idx < internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		uint64_t page_sz =
+			internal_config.hugepage_info[hpi_idx].hugepage_sz;
+
+		for (i = 0; i < rte_socket_count(); i++) {
+			struct rte_memseg_list *msl;
+
+			socket = rte_socket_id_by_idx(i);
+			n_contig_segs = 0;
+			n_segs = 0;
+			seg_start_page = -1;
+
+			for (cur_page = 0; cur_page < n_pages; cur_page++) {
+				struct hugepage_file *prev, *cur;
+				int prev_seg_start_page = -1;
+
+				cur = &hugepages[cur_page];
+				prev = cur_page == 0 ? NULL :
+						&hugepages[cur_page - 1];
+
+				new_memseg = 0;
+				end_seg = 0;
+
+				if (cur->size == 0)
+					end_seg = 1;
+				else if (cur->socket_id != (int) socket)
+					end_seg = 1;
+				else if (cur->size != page_sz)
+					end_seg = 1;
+				else if (cur_page == 0)
+					new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+				/* On PPC64 architecture, the mmap always starts
+				 * from higher address to lower address. Here,
+				 * physical addresses are in descending order.
+				 */
+				else if ((prev->physaddr - cur->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#else
+				else if ((cur->physaddr - prev->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#endif
+				if (new_memseg) {
+					/* if we're already inside a segment,
+					 * new segment means end of current one
+					 */
+					if (seg_start_page != -1) {
+						end_seg = 1;
+						prev_seg_start_page =
+								seg_start_page;
+					}
+					seg_start_page = cur_page;
+				}
+
+				if (end_seg) {
+					if (prev_seg_start_page != -1) {
+						/* we've found a new segment */
+						n_contig_segs++;
+						n_segs += cur_page -
+							prev_seg_start_page;
+					} else if (seg_start_page != -1) {
+						/* we didn't find new segment,
+						 * but did end current one
+						 */
+						n_contig_segs++;
+						n_segs += cur_page -
+								seg_start_page;
+						seg_start_page = -1;
+						continue;
+					} else {
+						/* we're skipping this page */
+						continue;
+					}
+				}
+				/* segment continues */
+			}
+			/* check if we missed last segment */
+			if (seg_start_page != -1) {
+				n_contig_segs++;
+				n_segs += cur_page - seg_start_page;
+			}
+
+			/* if no segments were found, do not preallocate */
+			if (n_segs == 0)
+				continue;
+
+			/* we now have total number of pages that we will
+			 * allocate for this segment list. add separator pages
+			 * to the total count, and preallocate VA space.
+			 */
+			n_segs += n_contig_segs - 1;
+
+			/* now, preallocate VA space for these segments */
+
+			/* first, find suitable memseg list for this */
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				msl = &mcfg->memsegs[msl_idx];
+
+				if (msl->base_va != NULL)
+					continue;
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Not enough space in memseg lists, please increase %s\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+				return -1;
+			}
+
+			/* now, allocate fbarray itself */
+			if (alloc_memseg_list(msl, page_sz, n_segs, socket,
+						msl_idx) < 0)
+				return -1;
+
+			/* finally, allocate VA space */
+			if (alloc_va_space(msl) < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+/*
+ * We cannot reallocate memseg lists on the fly because PPC64 stores pages
+ * backwards, therefore we have to process the entire memseg first before
+ * remapping it into memseg list VA space.
+ */
+static int
+remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
+{
+	int cur_page, seg_start_page, new_memseg, ret;
+
+	seg_start_page = 0;
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *prev, *cur;
+
+		new_memseg = 0;
+
+		cur = &hugepages[cur_page];
+		prev = cur_page == 0 ? NULL : &hugepages[cur_page - 1];
+
+		/* if size is zero, no more pages left */
+		if (cur->size == 0)
+			break;
+
+		if (cur_page == 0)
+			new_memseg = 1;
+		else if (cur->socket_id != prev->socket_id)
+			new_memseg = 1;
+		else if (cur->size != prev->size)
+			new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+		/* On PPC64 architecture, the mmap always starts from higher
+		 * address to lower address. Here, physical addresses are in
+		 * descending order.
+		 */
+		else if ((prev->physaddr - cur->physaddr) != cur->size)
+			new_memseg = 1;
+#else
+		else if ((cur->physaddr - prev->physaddr) != cur->size)
+			new_memseg = 1;
+#endif
+
+		if (new_memseg) {
+			/* if this isn't the first time, remap segment */
+			if (cur_page != 0) {
+				ret = remap_segment(hugepages, seg_start_page,
+						cur_page);
+				if (ret != 0)
+					return -1;
+			}
+			/* remember where we started */
+			seg_start_page = cur_page;
+		}
+		/* continuation of previous memseg */
+	}
+	/* we were stopped, but we didn't remap the last segment, do it now */
+	if (cur_page != 0) {
+		ret = remap_segment(hugepages, seg_start_page,
+				cur_page);
+		if (ret != 0)
+			return -1;
+	}
+	return 0;
+}
+
 static inline uint64_t
 get_socket_mem_size(int socket)
 {
@@ -753,8 +1089,10 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 	/* if specific memory amounts per socket weren't requested */
 	if (internal_config.force_sockets == 0) {
+		size_t total_size;
+#ifdef RTE_ARCH_64
 		int cpu_per_socket[RTE_MAX_NUMA_NODES];
-		size_t default_size, total_size;
+		size_t default_size;
 		unsigned lcore_id;
 
 		/* Compute number of cores per socket */
@@ -772,7 +1110,7 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 			/* Set memory amount per socket */
 			default_size = (internal_config.memory * cpu_per_socket[socket])
-			                / rte_lcore_count();
+					/ rte_lcore_count();
 
 			/* Limit to maximum available memory on socket */
 			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
@@ -789,12 +1127,33 @@ calc_num_pages_per_socket(uint64_t * memory,
 		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
 			/* take whatever is available */
 			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
-			                       total_size);
+					       total_size);
 
 			/* Update sizes */
 			memory[socket] += default_size;
 			total_size -= default_size;
 		}
+#else
+		/* in 32-bit mode, allocate all of the memory only on master
+		 * lcore socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0;
+				socket++) {
+			struct rte_config *cfg = rte_eal_get_configuration();
+			unsigned int master_lcore_socket;
+
+			master_lcore_socket =
+				rte_lcore_to_socket_id(cfg->master_lcore);
+
+			if (master_lcore_socket != socket)
+				continue;
+
+			/* Update sizes */
+			memory[socket] = total_size;
+			break;
+		}
+#endif
 	}
 
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
@@ -842,7 +1201,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 			}
 		}
 		/* if we didn't satisfy all memory requirements per socket */
-		if (memory[socket] > 0) {
+		if (memory[socket] > 0 &&
+				internal_config.socket_mem[socket] != 0) {
 			/* to prevent icc errors */
 			requested = (unsigned) (internal_config.socket_mem[socket] /
 					0x100000);
@@ -928,11 +1288,13 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
-	int i, j, new_memseg;
+	int i, j;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -945,6 +1307,25 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		struct rte_memseg_list *msl;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				     sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -952,14 +1333,27 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
-		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is one page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, (size_t)page_sz);
+		}
 		return 0;
 	}
 
@@ -992,7 +1386,6 @@ eal_legacy_hugepage_init(void)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		memory[i] = internal_config.socket_mem[i];
 
-
 	/* map all hugepages and sort them */
 	for (i = 0; i < (int)internal_config.num_hugepage_sizes; i ++){
 		unsigned pages_old, pages_new;
@@ -1010,8 +1403,7 @@ eal_legacy_hugepage_init(void)
 
 		/* map all hugepages available */
 		pages_old = hpi->num_pages[0];
-		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi,
-					      memory, 1);
+		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, memory);
 		if (pages_new < pages_old) {
 			RTE_LOG(DEBUG, EAL,
 				"%d not %d hugepages of size %u MB allocated\n",
@@ -1054,18 +1446,6 @@ eal_legacy_hugepage_init(void)
 		qsort(&tmp_hp[hp_offset], hpi->num_pages[0],
 		      sizeof(struct hugepage_file), cmp_physaddr);
 
-		/* remap all hugepages */
-		if (map_all_hugepages(&tmp_hp[hp_offset], hpi, NULL, 0) !=
-		    hpi->num_pages[0]) {
-			RTE_LOG(ERR, EAL, "Failed to remap %u MB pages\n",
-					(unsigned)(hpi->hugepage_sz / 0x100000));
-			goto fail;
-		}
-
-		/* unmap original mappings */
-		if (unmap_all_hugepages_orig(&tmp_hp[hp_offset], hpi) < 0)
-			goto fail;
-
 		/* we have processed a num of hugepages of this size, so inc offset */
 		hp_offset += hpi->num_pages[0];
 	}
@@ -1148,7 +1528,7 @@ eal_legacy_hugepage_init(void)
 
 	/*
 	 * copy stuff from malloc'd hugepage* to the actual shared memory.
-	 * this procedure only copies those hugepages that have final_va
+	 * this procedure only copies those hugepages that have orig_va
 	 * not NULL. has overflow protection.
 	 */
 	if (copy_hugepages_to_shared_mem(hugepage, nr_hugefiles,
@@ -1157,6 +1537,23 @@ eal_legacy_hugepage_init(void)
 		goto fail;
 	}
 
+#ifndef RTE_ARCH_64
+	/* for legacy 32-bit mode, we did not preallocate VA space, so do it */
+	if (internal_config.legacy_mem &&
+			prealloc_segments(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Could not preallocate VA space for hugepages\n");
+		goto fail;
+	}
+#endif
+
+	/* remap all pages we do need into memseg list VA space, so that those
+	 * pages become first-class citizens in DPDK memory subsystem
+	 */
+	if (remap_needed_hugepages(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Couldn't remap hugepage files into memseg lists\n");
+		goto fail;
+	}
+
 	/* free the hugepage backing files */
 	if (internal_config.hugepage_unlink &&
 		unlink_hugepage_files(tmp_hp, internal_config.num_hugepage_sizes) < 0) {
@@ -1168,75 +1565,30 @@ eal_legacy_hugepage_init(void)
 	free(tmp_hp);
 	tmp_hp = NULL;
 
-	/* first memseg index shall be 0 after incrementing it below */
-	j = -1;
-	for (i = 0; i < nr_hugefiles; i++) {
-		new_memseg = 0;
-
-		/* if this is a new section, create a new memseg */
-		if (i == 0)
-			new_memseg = 1;
-		else if (hugepage[i].socket_id != hugepage[i-1].socket_id)
-			new_memseg = 1;
-		else if (hugepage[i].size != hugepage[i-1].size)
-			new_memseg = 1;
-
-#ifdef RTE_ARCH_PPC_64
-		/* On PPC64 architecture, the mmap always start from higher
-		 * virtual address to lower address. Here, both the physical
-		 * address and virtual address are in descending order */
-		else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i-1].final_va -
-		    (unsigned long)hugepage[i].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#else
-		else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i].final_va -
-		    (unsigned long)hugepage[i-1].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#endif
+	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
 
-		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
-				break;
+	/* we're not going to allocate more pages, so release VA space for
+	 * unused memseg lists
+	 */
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		size_t mem_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
-		}
-		/* continuation of previous memseg */
-		else {
-#ifdef RTE_ARCH_PPC_64
-		/* Use the phy and virt address of the last page as segment
-		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-#endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
-		}
-		hugepage[i].memseg_id = j;
-	}
+		/* skip inactive lists */
+		if (msl->base_va == NULL)
+			continue;
+		/* skip lists where there is at least one page allocated */
+		if (msl->memseg_arr.count > 0)
+			continue;
+		/* this is an unused list, deallocate it */
+		mem_sz = (size_t)msl->page_sz * msl->memseg_arr.len;
+		munmap(msl->base_va, mem_sz);
+		msl->base_va = NULL;
 
-	if (i < nr_hugefiles) {
-		RTE_LOG(ERR, EAL, "Can only reserve %d pages "
-			"from %d requested\n"
-			"Current %s=%d is not enough\n"
-			"Please either increase it or request less amount "
-			"of memory.\n",
-			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
-		goto fail;
+		/* destroy backing fbarray */
+		rte_fbarray_destroy(&msl->memseg_arr);
 	}
 
-	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
-
 	return 0;
 
 fail:
@@ -1269,11 +1621,10 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i = 0;
+	unsigned int cur_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1292,50 +1643,6 @@ eal_legacy_hugepage_attach(void)
 		goto error;
 	}
 
-	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
-		size_t mmap_sz;
-		int mmap_flags = 0;
-
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
-
-		/* get identical addresses as the primary process.
-		 */
-#ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
-#endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
-			}
-			goto error;
-		}
-	}
-
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
@@ -1346,46 +1653,49 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				offset+=mapping_size;
-			}
+	/* map all segments into memory to make sure we get the addrs. the
+	 * segments themselves are already in memseg list (which is shared and
+	 * has its VA space already preallocated), so we just need to map
+	 * everything into correct addresses.
+	 */
+	for (i = 0; i < num_hp; i++) {
+		struct hugepage_file *hf = &hp[i];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+		struct flock lck;
+
+		/* if size is zero, no more pages left */
+		if (map_sz == 0)
+			break;
+
+		fd = open(hf->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
+
+		map_addr = mmap(map_addr, map_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_FIXED, fd, 0);
+		if (map_addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Could not map %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
+		}
+
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = map_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed: %s\n",
+				__func__, strerror(errno));
+			close(fd);
+			goto error;
+		}
+
+		close(fd);
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1393,8 +1703,15 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unmap the segments we managed to map before the failure */
+	cur_seg = 0;
+	for (cur_seg = 0; cur_seg < i; cur_seg++) {
+		struct hugepage_file *hf = &hp[cur_seg];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+
+		munmap(map_addr, map_sz);
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c1f0f87..5101c04 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -908,7 +908,8 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-type1_map(const struct rte_memseg *ms, void *arg)
+type1_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -1021,7 +1022,8 @@ vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 }
 
 static int
-vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_map_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -1034,7 +1036,8 @@ struct spapr_walk_param {
 	uint64_t hugepage_sz;
 };
 static int
-vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 0f542b1..23b339e 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -241,7 +240,9 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_memseg_contig_walk;
+	rte_memseg_list_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 9731d4c..103c015 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -100,12 +100,12 @@ static unsigned optimize_object_size(unsigned obj_size)
 }
 
 static int
-find_min_pagesz(const struct rte_memseg *ms, void *arg)
+find_min_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	size_t *min = arg;
 
-	if (ms->hugepage_sz < *min)
-		*min = ms->hugepage_sz;
+	if (msl->page_sz < *min)
+		*min = msl->page_sz;
 
 	return 0;
 }
@@ -115,11 +115,12 @@ get_min_page_size(void)
 {
 	size_t min_pagesz = SIZE_MAX;
 
-	rte_memseg_walk(find_min_pagesz, &min_pagesz);
+	rte_memseg_list_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
 
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 28c241f..4b5abb4 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -706,36 +707,20 @@ test_malloc_bad_params(void)
 }
 
 static int
-check_socket_mem(const struct rte_memseg *ms, void *arg)
+check_socket_mem(const struct rte_memseg_list *msl, void *arg)
 {
 	int32_t *socket = arg;
 
-	return *socket == ms->socket_id;
+	return *socket == msl->socket_id;
 }
 
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	return rte_memseg_walk(check_socket_mem, &socket);
+	return rte_memseg_list_walk(check_socket_mem, &socket);
 }
 
-struct walk_param {
-	void *addr;
-	int32_t socket;
-};
-static int
-find_socket(const struct rte_memseg *ms, void *arg)
-{
-	struct walk_param *param = arg;
-
-	if (param->addr >= ms->addr &&
-			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
-		param->socket = ms->socket_id;
-		return 1;
-	}
-	return 0;
-}
 
 /*
  * Find what socket a memory address is on. Only works for addresses within
@@ -744,10 +729,9 @@ find_socket(const struct rte_memseg *ms, void *arg)
 static int32_t
 addr_to_socket(void * addr)
 {
-	struct walk_param param = {.addr = addr, .socket = 0};
-	if (rte_memseg_walk(find_socket, &param) > 0)
-		return param.socket;
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
+
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index c9b287c..b96bca7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -23,12 +26,13 @@
  */
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+check_mem(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg __rte_unused)
 {
 	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
-	size_t i;
+	size_t i, max = ms->len;
 
-	for (i = 0; i < ms->len; i++, mem++)
+	for (i = 0; i < max; i++, mem++)
 		*mem;
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index cbf0cfa..0046f04 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -111,17 +111,17 @@ struct walk_arg {
 	int hugepage_16GB_avail;
 };
 static int
-find_available_pagesz(const struct rte_memseg *ms, void *arg)
+find_available_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	struct walk_arg *wa = arg;
 
-	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+	if (msl->page_sz == RTE_PGSIZE_2M)
 		wa->hugepage_2MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+	if (msl->page_sz == RTE_PGSIZE_1G)
 		wa->hugepage_1GB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+	if (msl->page_sz == RTE_PGSIZE_16M)
 		wa->hugepage_16MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+	if (msl->page_sz == RTE_PGSIZE_16G)
 		wa->hugepage_16GB_avail = 1;
 
 	return 0;
@@ -138,7 +138,7 @@ test_memzone_reserve_flags(void)
 
 	memset(&wa, 0, sizeof(wa));
 
-	rte_memseg_walk(find_available_pagesz, &wa);
+	rte_memseg_list_walk(find_available_pagesz, &wa);
 
 	hugepage_2MB_avail = wa.hugepage_2MB_avail;
 	hugepage_1GB_avail = wa.hugepage_1GB_avail;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 50/70] eal: replace memzone array with fbarray
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (49 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 49/70] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
                         ` (19 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The fbarray infrastructure is already there, so we might as well
use it; some operations will be sped up as a result.

Since we have to allocate an fbarray for memzones, we have to do
it before we initialize the memory subsystem: memory init in
secondary processes will (later) allocate more fbarrays than the
primary process has, making it impossible to attach to the
memzone fbarray if we did it after the fact.
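
For reference, a minimal sketch of the walk pattern this change enables
(illustrative only, not part of the patch; the helper name is made up,
but the rte_fbarray calls are the ones used in the diff below, and the
mem config locking is omitted for brevity):

#include <rte_eal.h>
#include <rte_eal_memconfig.h>
#include <rte_fbarray.h>
#include <rte_memzone.h>

/* Sketch: visit every allocated memzone via the fbarray, the same
 * pattern rte_memzone_walk() follows after this patch.
 */
static void
walk_memzones_sketch(void (*cb)(const struct rte_memzone *mz))
{
	struct rte_mem_config *mcfg =
		rte_eal_get_configuration()->mem_config;
	struct rte_fbarray *arr = &mcfg->memzones;
	int i = rte_fbarray_find_next_used(arr, 0);

	while (i >= 0) {
		cb(rte_fbarray_get(arr, i));
		i = rte_fbarray_find_next_used(arr, i + 1);
	}
}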

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Fix memzone lookup skipping over memzones
    - Fix error message on failing to find space for memzone
    
    v3:
    - Moved earlier in patchset
    - Fixed compile issues
    - Removed rte_panic() calls

 drivers/net/ena/Makefile                          |   3 +
 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  14 ++-
 lib/librte_eal/common/eal_common_memzone.c        | 113 ++++++++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 ++-
 test/test/test_memzone.c                          |   9 +-
 8 files changed, 103 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
index f9bfe05..43339f3 100644
--- a/drivers/net/ena/Makefile
+++ b/drivers/net/ena/Makefile
@@ -43,6 +43,9 @@ INCLUDES :=-I$(SRCDIR) -I$(SRCDIR)/base/ena_defs -I$(SRCDIR)/base
 EXPORT_MAP := rte_pmd_ena_version.map
 LIBABIVER := 1
 
+# rte_fbarray is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 VPATH += $(SRCDIR)/base
 #
 # all source are stored in SRCS-y
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d009cf0..54330e1 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -599,14 +599,24 @@ rte_eal_init(int argc, char **argv)
 		}
 	}
 
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
 		return -1;
 	}
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1f5f753..12ddd42 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,30 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		mz = rte_fbarray_get(arr, i);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
-
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,15 +91,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i;
+	int socket, i, mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -224,17 +214,22 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	mz_idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (mz_idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, mz_idx);
+		mz = rte_fbarray_get(arr, mz_idx);
+	}
 
 	if (mz == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
-				"in config!\n", __func__);
+		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone\n", __func__);
 		malloc_elem_free(elem);
 		rte_errno = ENOSPC;
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -307,34 +302,38 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
-	void *addr;
+	void *addr = NULL;
 	unsigned idx;
 
 	if (mz == NULL)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
+		ret = -EINVAL;
+	} else if (found_mz->addr == NULL) {
+		RTE_LOG(ERR, EAL, "Memzone is not allocated\n");
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		memset(found_mz, 0, sizeof(*found_mz));
+		rte_fbarray_set_free(arr, idx);
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	rte_free(addr);
+	if (addr != NULL)
+		rte_free(addr);
 
 	return ret;
 }
@@ -370,7 +369,7 @@ dump_memzone(const struct rte_memzone *mz, void *arg)
 	size_t page_sz;
 	FILE *f = arg;
 
-	mz_idx = mz - mcfg->memzone;
+	mz_idx = rte_fbarray_find_idx(&mcfg->memzones, mz);
 
 	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
@@ -427,19 +426,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -447,14 +450,18 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b745e18..88cde8c 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 0ef2c45..d798675 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -278,6 +278,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary process does not need to initialize anything */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
 	/* add all IOVA-contiguous areas to the heap */
 	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ffcbd71..9832551 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -858,6 +858,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -868,8 +877,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 0046f04..efcf732 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -909,7 +909,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -933,7 +933,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -1012,7 +1012,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1033,7 +1033,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 51/70] eal: add support for mapping hugepages at runtime
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (50 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 50/70] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 52/70] eal: add support for unmapping pages " Anatoly Burakov
                         ` (18 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Nothing uses this code yet. The bulk of it is copied from old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing that we'll get contiguous VA for all of the pages
that we requested.
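
As a rough usage sketch (hypothetical caller; the prototype is the one
declared in the EAL-internal eal_memalloc.h header added below),
requesting several VA-contiguous pages on a given socket would look
roughly like this:

#include <stdbool.h>
#include <rte_memory.h>
#include "eal_memalloc.h"

/* Sketch: request four VA-contiguous 2M pages on socket 0. With
 * exact = false the allocator may return fewer pages than asked for.
 */
static int
alloc_pages_sketch(struct rte_memseg *ms[4])
{
	return eal_memalloc_alloc_seg_bulk(ms, 4, RTE_PGSIZE_2M, 0, false);
}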

Not supported on FreeBSD.

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't
have to keep the original fd's around. Plus, using fcntl() gives
us the ability to lock parts of a file, which will be useful for
the single-file segments coming later in this patchset.
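
A rough sketch of that design choice (illustrative only; the helper
name is made up, it simply mirrors the lock() helper added below): a
byte-range lock taken with fcntl() is owned by the process and tied to
the file and range rather than to one particular fd, so it can later be
examined, upgraded or released through any fd opened on that file.

#include <fcntl.h>

/* Sketch: take a shared (read) lock on [offset, offset + len) of fd.
 * fcntl() record locks are identified by (process, file, byte range),
 * not by the fd, so a different fd on the same file can later be used
 * to convert the lock to F_WRLCK or release it with F_UNLCK.
 */
static int
lock_range_sketch(int fd, off_t offset, off_t len)
{
	struct flock lck = {
		.l_type = F_RDLCK,
		.l_whence = SEEK_SET,
		.l_start = offset,
		.l_len = len,
	};

	return fcntl(fd, F_SETLK, &lck);
}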

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Compile fixes for various platforms
    - Split single file segments stuff into separate commit
    - Added missing FreeBSD implementation
    - Removed rte_panic when unable to free page

 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  31 +++
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 429 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 491 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..8c30670
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n_segs, size_t __rte_unused page_sz,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..f628514
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+/*
+ * Allocate segment of specified page size.
+ */
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket);
+
+/*
+ * Allocate `n_segs` segments.
+ *
+ * Note: `ms` can be NULL.
+ *
+ * Note: it is possible to request best-effort allocation by setting `exact` to
+ * `false`, in which case allocator will return however many pages it managed to
+ * allocate successfully.
+ */
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact);
+
+#endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..45ea0ad
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,429 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error. Any non-volatile,
+ * non-static local variable in the stack frame calling sigsetjmp might be
+ * clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+check_numa(void)
+{
+	bool ret = true;
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		ret = false;
+	}
+	return ret;
+}
+
+static void
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+	if (get_mempolicy(oldpolicy, oldmask->maskp,
+			  oldmask->size + 1, 0, 0) < 0) {
+		RTE_LOG(ERR, EAL,
+			"Failed to get current mempolicy: %s. "
+			"Assuming MPOL_DEFAULT.\n", strerror(errno));
+		*oldpolicy = MPOL_DEFAULT;
+	}
+	RTE_LOG(DEBUG, EAL,
+		"Setting policy MPOL_PREFERRED for socket %d\n",
+		socket_id);
+	numa_set_preferred(socket_id);
+}
+
+static void
+resotre_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static int
+get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+	eal_get_hugefile_path(path, buflen, hi->hugedir,
+			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+				strerror(errno));
+		return -1;
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck;
+	int ret;
+
+	memset(&lck, 0, sizeof(lck));
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	int cur_socket_id = 0;
+#endif
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+	size_t alloc_sz;
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	alloc_sz = hi->hugepage_sz;
+
+	map_offset = 0;
+	if (ftruncate(fd, alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+			__func__, strerror(errno));
+		goto resized;
+	}
+	/* we've allocated a page - take out a read lock. we're using fcntl()
+	 * locks rather than flock() here because doing that gives us one huge
+	 * advantage - fcntl() locks are per-process, not per-file descriptor,
+	 * which means that we don't have to keep the original fd's around to
+	 * keep a lock on the file.
+	 *
+	 * this is useful, because when it comes to unmapping pages, we will
+	 * have to take out a write lock (to figure out if another process still
+	 * has this page mapped), and to do it with flock() we'll have to use
+	 * original fd, as lock is associated with that particular fd. with
+	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
+	 * on that.
+	 */
+	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+	/* this should not fail */
+	if (ret != 1) {
+		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+			__func__,
+			strerror(errno));
+		goto resized;
+	}
+
+	/*
+	 * map the segment and populate page tables; the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In Linux, hugetlb limitations (e.g. cgroups) are
+	 * enforced at fault time rather than at mmap(), even
+	 * with MAP_POPULATE. The kernel will send a SIGBUS
+	 * signal in that case. To avoid being killed, save the
+	 * stack environment here; if SIGBUS happens, we can
+	 * jump back to this point.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(alloc_sz >> 20));
+		goto mapped;
+	}
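+	/* touch the page: any SIGBUS due to hugetlb accounting will be
+	 * raised here, inside the sigsetjmp() scope set up above
+	 */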
+	*(int *)addr = *(int *)addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = alloc_sz;
+	ms->len = alloc_sz;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, alloc_sz);
+resized:
+	close(fd);
+	unlink(path);
+	return -1;
+}
+
+struct alloc_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg **ms;
+	size_t page_sz;
+	unsigned int segs_allocated;
+	unsigned int n_segs;
+	int socket;
+	bool exact;
+};
+static int
+alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct alloc_walk_param *wa = arg;
+	struct rte_memseg_list *cur_msl;
+	size_t page_sz;
+	int cur_idx;
+	unsigned int msl_idx, need, i;
+
+	if (msl->page_sz != wa->page_sz)
+		return 0;
+	if (msl->socket_id != wa->socket)
+		return 0;
+
+	page_sz = (size_t)msl->page_sz;
+
+	msl_idx = msl - mcfg->memsegs;
+	cur_msl = &mcfg->memsegs[msl_idx];
+
+	need = wa->n_segs;
+
+	/* try finding space in memseg list */
+	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
+	if (cur_idx < 0)
+		return 0;
+
+	for (i = 0; i < need; i++, cur_idx++) {
+		struct rte_memseg *cur;
+		void *map_addr;
+
+		cur = rte_fbarray_get(&cur_msl->memseg_arr, cur_idx);
+		map_addr = RTE_PTR_ADD(cur_msl->base_va,
+				cur_idx * page_sz);
+
+		if (alloc_seg(cur, map_addr, wa->socket, wa->hi,
+				msl_idx, cur_idx)) {
+			RTE_LOG(DEBUG, EAL, "attempted to allocate %i segments, but only %i were allocated\n",
+				need, i);
+
+			/* if exact number wasn't requested, stop */
+			if (!wa->exact)
+				goto out;
+			return -1;
+		}
+		if (wa->ms)
+			wa->ms[i] = cur;
+
+		rte_fbarray_set_used(&cur_msl->memseg_arr, cur_idx);
+	}
+out:
+	wa->segs_allocated = i;
+	return 1;
+
+}
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact)
+{
+	int i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa = false;
+	int oldpolicy;
+	struct bitmask *oldmask;
+#endif
+	struct alloc_walk_param wa;
+	struct hugepage_info *hi = NULL;
+
+	memset(&wa, 0, sizeof(wa));
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (page_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		return -1;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (check_numa()) {
+		oldmask = numa_allocate_nodemask();
+		prepare_numa(&oldpolicy, oldmask, socket);
+		have_numa = true;
+	}
+#endif
+
+	wa.exact = exact;
+	wa.hi = hi;
+	wa.ms = ms;
+	wa.n_segs = n_segs;
+	wa.page_sz = page_sz;
+	wa.socket = socket;
+	wa.segs_allocated = 0;
+
+	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	if (ret == 0) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+		ret = -1;
+	} else if (ret > 0) {
+		ret = (int)wa.segs_allocated;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		resotre_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_seg_bulk(&ms, 1, page_sz, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 52/70] eal: add support for unmapping pages at runtime
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (51 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 53/70] eal: add "single file segments" command-line option Anatoly Burakov
                         ` (17 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This isn't used anywhere yet, but the support is now there. Also,
adding cleanup to allocation procedures, so that if we fail to
allocate everything we asked for, we can free all of it back.
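
For reference, the intended usage from inside EAL looks roughly like
this (a minimal sketch; the wrapper function, the 2M page size and
socket 0 are arbitrary examples, not anything mandated by this patch):

	#include <rte_memory.h>

	#include "eal_memalloc.h"

	static int
	alloc_and_release_one_page(void)
	{
		/* ask for one 2M hugepage on socket 0 */
		struct rte_memseg *ms =
				eal_memalloc_alloc_seg(RTE_PGSIZE_2M, 0);
		if (ms == NULL)
			return -1;

		/* ... use ms->addr / ms->iova ... */

		/* hand the page back to the system */
		return eal_memalloc_free_seg(ms);
	}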

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  15 +++
 lib/librte_eal/common/eal_memalloc.h       |  14 +++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 148 ++++++++++++++++++++++++++++-
 3 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index 8c30670..e7bcd2b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,18 @@ eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int n_segs __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index f628514..6017345 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,18 @@ int
 eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 		int socket, bool exact);
 
+/*
+ * Deallocate segment
+ */
+int
+eal_memalloc_free_seg(struct rte_memseg *ms);
+
+/*
+ * Deallocate `n_segs` segments. Returns 0 on successful deallocation of all
+ * segments, -1 on error. Even when an error is returned, any segments that
+ * could be deallocated will have been deallocated.
+ */
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 45ea0ad..11ef742 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -289,6 +289,48 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	return -1;
 }
 
+static int
+free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	char path[PATH_MAX];
+	int fd, ret;
+
+	/* erase page data */
+	memset(ms->addr, 0, ms->len);
+
+	if (mmap(ms->addr, ms->len, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	/* if we're able to take out a write lock, we're the last one
+	 * holding onto this page.
+	 */
+
+	ret = lock(fd, 0, ms->len, F_WRLCK);
+	if (ret >= 0) {
+		/* no one else is using this page */
+		if (ret == 1)
+			unlink(path);
+		ret = lock(fd, 0, ms->len, F_UNLCK);
+		if (ret != 1)
+			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+				__func__, path);
+	}
+	close(fd);
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 struct alloc_walk_param {
 	struct hugepage_info *hi;
 	struct rte_memseg **ms;
@@ -305,7 +347,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	struct alloc_walk_param *wa = arg;
 	struct rte_memseg_list *cur_msl;
 	size_t page_sz;
-	int cur_idx;
+	int cur_idx, start_idx, j;
 	unsigned int msl_idx, need, i;
 
 	if (msl->page_sz != wa->page_sz)
@@ -324,6 +366,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
 	if (cur_idx < 0)
 		return 0;
+	start_idx = cur_idx;
 
 	for (i = 0; i < need; i++, cur_idx++) {
 		struct rte_memseg *cur;
@@ -341,6 +384,25 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 			/* if exact number wasn't requested, stop */
 			if (!wa->exact)
 				goto out;
+
+			/* clean up */
+			for (j = start_idx; j < cur_idx; j++) {
+				struct rte_memseg *tmp;
+				struct rte_fbarray *arr =
+						&cur_msl->memseg_arr;
+
+				tmp = rte_fbarray_get(arr, j);
+				if (free_seg(tmp, wa->hi, msl_idx,
+						start_idx + j)) {
+					RTE_LOG(ERR, EAL, "Cannot free page\n");
+					continue;
+				}
+
+				rte_fbarray_set_free(arr, j);
+			}
+			/* clear the list */
+			if (wa->ms)
+				memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs);
 			return -1;
 		}
 		if (wa->ms)
@@ -351,7 +413,39 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 out:
 	wa->segs_allocated = i;
 	return 1;
+}
+
+struct free_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg *ms;
+};
+static int
+free_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *found_msl;
+	struct free_walk_param *wa = arg;
+	uintptr_t start_addr, end_addr;
+	int msl_idx, seg_idx;
 
+	start_addr = (uintptr_t) msl->base_va;
+	end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz;
+
+	if ((uintptr_t)wa->ms->addr < start_addr ||
+			(uintptr_t)wa->ms->addr >= end_addr)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+	seg_idx = RTE_PTR_DIFF(wa->ms->addr, start_addr) / msl->page_sz;
+
+	/* msl is const */
+	found_msl = &mcfg->memsegs[msl_idx];
+
+	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
+		return -1;
+
+	return 1;
 }
 
 int
@@ -427,3 +521,55 @@ eal_memalloc_alloc_seg(size_t page_sz, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
+{
+	int seg, ret = 0;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (seg = 0; seg < n_segs; seg++) {
+		struct rte_memseg *cur = ms[seg];
+		struct hugepage_info *hi = NULL;
+		struct free_walk_param wa;
+		int i, walk_res;
+
+		memset(&wa, 0, sizeof(wa));
+
+		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
+				i++) {
+			hi = &internal_config.hugepage_info[i];
+			if (cur->hugepage_sz == hi->hugepage_sz)
+				break;
+		}
+		if (i == (int)RTE_DIM(internal_config.hugepage_info)) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			ret = -1;
+			continue;
+		}
+
+		wa.ms = cur;
+		wa.hi = hi;
+
+		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		if (walk_res == 1)
+			continue;
+		if (walk_res == 0)
+			RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		ret = -1;
+	}
+	return ret;
+}
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms)
+{
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	return eal_memalloc_free_seg_bulk(&ms, 1);
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 53/70] eal: add "single file segments" command-line option
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (52 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 52/70] eal: add support for unmapping pages " Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
                         ` (16 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Currently, DPDK stores all pages as separate files in hugetlbfs.
This option will allow storing all pages in one file (one file
per memseg list).

We do this by using fallocate() calls on hugetlbfs; however, this
is only supported by fairly recent (4.3+) Linux kernels, so an
ftruncate() fallback is provided to grow (but not shrink) hugepage
files. The naming scheme is deterministic, so both primary and
secondary processes will be able to easily map the needed files
and offsets.

For multi-file segments, we can close fd's right away. For
single-file segments, we can reuse the same fd and reduce the
number of fd's needed to map/use hugepages. However, we need to
store the fd's somewhere, so we add a tailq.
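
Roughly, growing or shrinking a segment within the per-list file
boils down to the following sketch (simplified: the real code below
also handles the ftruncate() fallback and fcntl() locking; 'fd',
'seg_idx', 'page_sz' and 'grow' are assumed to come from the
surrounding allocation code, and the FALLOC_* flags come from
<linux/falloc.h>):

	uint64_t offset = (uint64_t)seg_idx * page_sz;

	if (grow) {
		/* make sure the page at 'offset' is backed by the file */
		if (fallocate(fd, 0, offset, page_sz) < 0)
			return -1;
	} else {
		/* give the page back to the kernel without shrinking
		 * the file size
		 */
		if (fallocate(fd, FALLOC_FL_PUNCH_HOLE |
				FALLOC_FL_KEEP_SIZE, offset, page_sz) < 0)
			return -1;
	}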

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Split this change into a separate patch
    - Provide more explanation as to how it works

 lib/librte_eal/common/eal_common_options.c |   4 +
 lib/librte_eal/common/eal_internal_cfg.h   |   4 +
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/eal.c          |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 337 ++++++++++++++++++++++++-----
 5 files changed, 297 insertions(+), 51 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index fb5ea03..5b5da5f 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1188,6 +1189,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_LEGACY_MEM_NUM:
 		conf->legacy_mem = 1;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5cf7102..9d33cf4 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true to enable legacy memory behavior (no dynamic allocation,
 	 * IOVA-contiguous segments).
 	 */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node); non-legacy mode only.
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index d301d0b..211ae06 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_LEGACY_MEM    "legacy-mem"
 	OPT_LEGACY_MEM_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9832551..2c12811 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 11ef742..46b71e3 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,31 @@
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+/*
+ * not all kernel versions support fallocate() on hugetlbfs, so fall back to
+ * ftruncate() and disallow deallocation if fallocate() is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's since we need each fd
+ * only once. However, in single file segments mode, we can get away with using
+ * a single fd per memseg list, but we need to store such fd's somewhere. Each
+ * fd is different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Double linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -129,18 +154,100 @@ resotre_numa(int *oldpolicy, struct bitmask *oldmask)
 }
 #endif
 
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if the file occupies any space on disk (st_size won't
+ * reflect hole-punching done via fallocate(), so check st_blocks instead)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
 	int fd;
-	eal_get_hugefile_path(path, buflen, hi->hugedir,
-			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
-	fd = open(path, O_CREAT | O_RDWR, 0600);
-	if (fd < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
-				strerror(errno));
-		return -1;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
 	}
 	return fd;
 }
@@ -173,6 +280,94 @@ static int lock(int fd, uint64_t offset, uint64_t len, int type)
 }
 
 static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = get_file_size(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
 alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		struct hugepage_info *hi, unsigned int list_idx,
 		unsigned int seg_idx)
@@ -191,34 +386,40 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		return -1;
 
 	alloc_sz = hi->hugepage_sz;
-
-	map_offset = 0;
-	if (ftruncate(fd, alloc_sz) < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
-			__func__, strerror(errno));
-		goto resized;
-	}
-	/* we've allocated a page - take out a read lock. we're using fcntl()
-	 * locks rather than flock() here because doing that gives us one huge
-	 * advantage - fcntl() locks are per-process, not per-file descriptor,
-	 * which means that we don't have to keep the original fd's around to
-	 * keep a lock on the file.
-	 *
-	 * this is useful, because when it comes to unmapping pages, we will
-	 * have to take out a write lock (to figure out if another process still
-	 * has this page mapped), and to do it with flock() we'll have to use
-	 * original fd, as lock is associated with that particular fd. with
-	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
-	 * on that.
-	 */
-	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
-
-	/* this should not fail */
-	if (ret != 1) {
-		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
-			__func__,
-			strerror(errno));
-		goto resized;
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * alloc_sz;
+		ret = resize_hugefile(fd, map_offset, alloc_sz, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, alloc_sz) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with flock()
+		 * we'll have to use original fd, as lock is associated with
+		 * that particular fd. with fcntl(), this is not necessary - we
+		 * can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
 	}
 
 	/*
@@ -227,7 +428,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	 */
 	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
 			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
-	close(fd);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
 
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
@@ -284,8 +487,21 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 mapped:
 	munmap(addr, alloc_sz);
 resized:
-	close(fd);
-	unlink(path);
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, alloc_sz, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
 	return -1;
 }
 
@@ -293,6 +509,7 @@ static int
 free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
+	uint64_t map_offset;
 	char path[PATH_MAX];
 	int fd, ret;
 
@@ -310,21 +527,39 @@ free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 	if (fd < 0)
 		return -1;
 
-	/* if we're able to take out a write lock, we're the last one
-	 * holding onto this page.
-	 */
-
-	ret = lock(fd, 0, ms->len, F_WRLCK);
-	if (ret >= 0) {
-		/* no one else is using this page */
-		if (ret == 1)
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->len;
+		if (resize_hugefile(fd, map_offset, ms->len, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
 			unlink(path);
-		ret = lock(fd, 0, ms->len, F_UNLCK);
-		if (ret != 1)
-			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
-				__func__, path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->len, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->len, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
 	}
-	close(fd);
 
 	memset(ms, 0, sizeof(*ms));
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 54/70] eal: add API to check if memory is contiguous
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (53 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 53/70] eal: add "single file segments" command-line option Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
                         ` (15 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For now, memory is always contiguous because legacy mem mode is
enabled unconditionally, but this function will be helpful down
the line when we implement support for allocating physically
non-contiguous memory. We can no longer guarantee physically
contiguous memory unless we're in legacy or IOVA_AS_VA mode, but
we can certainly try and see if we succeed.

In addition, this would be useful for e.g. PMDs which may allocate
chunks that are smaller than the page size but must not cross the
page boundary, in which case we will be able to accommodate that
request. This function will also support non-hugepage memory.
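
A rough usage sketch (here 'msl' is assumed to be the memseg list
that backs 'addr' - for instance, as seen from a rte_memseg_list_walk()
callback - and 'len' is the length of the buffer being checked):

	if (eal_memalloc_is_contig(msl, addr, len)) {
		/* safe to treat [addr, addr + len) as one IOVA region */
	} else {
		/* region spans pages that are not IOVA-contiguous */
	}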

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Moved this earlier in the patchset
    - Add support for non-hugepage memory
    - Fix handling of IOVA as VA mode

 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 90 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        | 10 ++++
 lib/librte_eal/common/malloc_elem.c         | 40 +------------
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 6 files changed, 106 insertions(+), 37 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..607ec3f
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	void *end, *aligned_start, *aligned_end;
+	size_t pgsz = (size_t)msl->page_sz;
+	const struct rte_memseg *ms;
+
+	/* for IOVA_VA, it's always contiguous */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	end = RTE_PTR_ADD(start, len);
+
+	/* for nohuge, we check pagemap, otherwise check memseg */
+	if (!rte_eal_has_hugepages()) {
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		cur = rte_mem_virt2iova(aligned_start);
+		expected = cur + pgsz;
+		aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+
+		while (aligned_start < aligned_end) {
+			cur = rte_mem_virt2iova(aligned_start);
+			if (cur != expected)
+				return false;
+			aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+			expected += pgsz;
+		}
+	} else {
+		int start_seg, end_seg, cur_seg;
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		start_seg = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+				pgsz;
+		end_seg = RTE_PTR_DIFF(aligned_end, msl->base_va) /
+				pgsz;
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		ms = rte_fbarray_get(&msl->memseg_arr, start_seg);
+		cur = ms->iova;
+		expected = cur + pgsz;
+
+		/* if we can't access IOVA addresses, assume non-contiguous */
+		if (cur == RTE_BAD_IOVA)
+			return false;
+
+		for (cur_seg = start_seg + 1; cur_seg < end_seg;
+				cur_seg++, expected += pgsz) {
+			ms = rte_fbarray_get(&msl->memseg_arr, cur_seg);
+
+			if (ms->iova != expected)
+				return false;
+		}
+	}
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 6017345..2413c6c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /*
  * Allocate segment of specified page size.
@@ -42,4 +43,13 @@ eal_memalloc_free_seg(struct rte_memseg *ms);
 int
 eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
 
+
+/*
+ * Check if memory pointed to by `start` and of `length` that resides in
+ * memseg list `msl` is IOVA-contiguous.
+ */
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 685aac4..9db416f 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -18,6 +18,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -100,45 +101,10 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl,
 		void *start, size_t size)
 {
-	rte_iova_t cur, expected;
-	void *start_page, *end_page, *cur_page;
-	size_t pagesz;
-
-	/* for hugepage memory or IOVA as VA, it's always contiguous */
-	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
-		return true;
-
-	/* otherwise, check if start and end are within the same page */
-	pagesz = getpagesize();
-
-	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
-	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
-
-	if (start_page == end_page)
-		return true;
-
-	/* if they are from different pages, check if they are contiguous */
-
-	/* if we can't access physical addresses, assume non-contiguous */
-	if (!rte_eal_using_phys_addrs())
-		return false;
-
-	/* skip first iteration */
-	cur = rte_mem_virt2iova(start_page);
-	expected = cur + pagesz;
-	cur_page = RTE_PTR_ADD(start_page, pagesz);
-
-	while (cur_page <= end_page) {
-		cur = rte_mem_virt2iova(cur_page);
-		if (cur != expected)
-			return false;
-		cur_page = RTE_PTR_ADD(cur_page, pagesz);
-		expected += pagesz;
-	}
-	return true;
+	return eal_memalloc_is_contig(msl, start, size);
 }
 
 /*
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 55/70] eal: prepare memseg lists for multiprocess sync
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (54 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:00       ` [PATCH v5 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                         ` (14 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

In preparation for implementing multiprocess support, we are adding
a version number to memseg lists. We will not need any locks, because
memory hotplug will have a global lock (so any time the memory map, and
thus the version number, might change, we will already be holding a lock).

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation will opt for the latter option: primary process
shared mappings will be authoritative, and each secondary process
will use its own internal view of mapped memory, and will attempt
to synchronize on these mappings using versioning.

Under this model, only primary process will decide which pages get
mapped, and secondary processes will only copy primary's page
maps and get notified of the changes via IPC mechanism (coming
in later commits).
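
Per memseg list, the secondary's decision then reduces to roughly
the following (a simplified sketch of the sync_walk() callback added
below; setup_local_copy() is a hypothetical stand-in for the local
fbarray creation done there):

	bool new_msl = false;

	if (local_msl->base_va == NULL) {
		/* first time we see this list - create a local fbarray
		 * and adopt the primary's base VA
		 */
		setup_local_copy(local_msl, primary_msl);
		new_msl = true;
	}
	if (new_msl || local_msl->version != primary_msl->version) {
		/* replay the primary's page map locally, then adopt
		 * the primary's version number
		 */
		sync_existing(primary_msl, local_msl, hi, msl_idx);
	}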

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Make version volatile
    - Removed per-list locks as we're using global hotplug lock

 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 250 ++++++++++++++++++++++
 4 files changed, 262 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index e7bcd2b..461732f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -39,3 +39,10 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return -1;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 2413c6c..4a7b45c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -52,4 +52,8 @@ bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 88cde8c..a781793 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,7 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	volatile uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 46b71e3..4876d07 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -64,6 +64,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -647,6 +650,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	}
 out:
 	wa->segs_allocated = i;
+	if (i > 0)
+		cur_msl->version++;
 	return 1;
 }
 
@@ -676,7 +681,10 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	/* msl is const */
 	found_msl = &mcfg->memsegs[msl_idx];
 
+	found_msl->version++;
+
 	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+
 	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
 		return -1;
 
@@ -808,3 +816,245 @@ eal_memalloc_free_seg(struct rte_memseg *ms)
 
 	return eal_memalloc_free_seg_bulk(&ms, 1);
 }
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_seg(l_ms, p_ms->addr,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_seg(l_ms, hi, msl_idx, seg_idx);
+			rte_fbarray_set_free(l_arr, seg_idx);
+			if (ret < 0)
+				return -1;
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case scenario - no differences (or bigger, which will be
+		 * fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+static int
+sync_walk(const struct rte_memseg_list *msl, void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int i;
+	int msl_idx;
+	bool new_msl = false;
+
+	msl_idx = msl - mcfg->memsegs;
+	primary_msl = &mcfg->memsegs[msl_idx];
+	local_msl = &local_memsegs[msl_idx];
+
+	/* check if secondary has this memseg list set up */
+	if (local_msl->base_va == NULL) {
+		char name[PATH_MAX];
+		int ret;
+		new_msl = true;
+
+		/* create distinct fbarrays for each secondary */
+		snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+			primary_msl->memseg_arr.name, getpid());
+
+		ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+			primary_msl->memseg_arr.len,
+			primary_msl->memseg_arr.elt_sz);
+		if (ret < 0) {
+			RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+			return -1;
+		}
+
+		local_msl->base_va = primary_msl->base_va;
+	}
+
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		uint64_t cur_sz =
+			internal_config.hugepage_info[i].hugepage_sz;
+		uint64_t msl_sz = primary_msl->page_sz;
+		if (msl_sz == cur_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	/* if versions don't match or if we have just allocated a new
+	 * memseg list, synchronize everything
+	 */
+	if ((new_msl || local_msl->version != primary_msl->version) &&
+			sync_existing(primary_msl, local_msl, hi, msl_idx))
+		return -1;
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	if (rte_memseg_list_walk(sync_walk, NULL))
+		return -1;
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 56/70] eal: read hugepage counts from node-specific sysfs path
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (55 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-04-09 18:00       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 57/70] eal: make use of memory hotplug for init Anatoly Burakov
                         ` (13 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:00 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node
for hugepage counts. Note that the per-NUMA node paths do not
provide information regarding reserved pages, so we might not
get the best info from these paths, but this saves us from the
whole mapping/remapping business before we're actually able to
tell which page is on which socket, because we no longer require
our memory to be physically contiguous.

Legacy memory init will not use this.
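
Concretely, the per-node free page count for each page size is read
from sysfs paths of the following form (a sketch; 'socket' and
'subdir' are assumed to come from the surrounding code, and the
node0/2048kB values in the comment are just examples):

	char path[PATH_MAX];
	unsigned long free_pages;

	/* e.g. /sys/devices/system/node/node0/hugepages/
	 *      hugepages-2048kB/free_hugepages
	 */
	snprintf(path, sizeof(path),
		"/sys/devices/system/node/node%u/hugepages/%s/free_hugepages",
		socket, subdir);
	if (eal_parse_sysfs_value(path, &free_pages) < 0)
		free_pages = 0;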

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 80 +++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index afebd42..2e0819f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -31,6 +31,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -71,6 +72,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -269,7 +309,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -323,9 +363,28 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if the first attempt fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_socket_count(); i++) {
+				int socket = rte_socket_id_by_idx(i);
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, socket);
+				hpi->num_pages[socket] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * if we failed to sort memory per socket from the get-go,
+		 * fall back to the old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -349,10 +408,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4


* [PATCH v5 57/70] eal: make use of memory hotplug for init
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (56 preceding siblings ...)
  2018-04-09 18:00       ` [PATCH v5 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
                         ` (12 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Add a new (non-legacy) memory init path for EAL, using the new
memory hotplug facilities.

If no -m or --socket-mem switches are specified, the new init will
not allocate anything up front; if those switches are passed, the
appropriate number of pages will be requested, just like for
legacy init.

Allocated pages are not guaranteed to be physically contiguous
(although they may still be so by accident) unless RTE_IOVA_VA
mode is used.
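
As a rough illustration of the request arithmetic described above
(hypothetical values, not EAL code), a per-socket request is simply
broken down into pages of the chosen size:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint64_t socket_mem = 1024ULL << 20; /* 1G via --socket-mem */
            uint64_t hugepage_sz = 2ULL << 20;   /* 2M hugepages */
            unsigned int num_pages = socket_mem / hugepage_sz;

            printf("%u pages of %" PRIu64 " bytes on this socket\n",
                            num_pages, hugepage_sz);
            return 0;
    }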

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 62 ++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index d38fb68..c51d598 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -1600,6 +1601,61 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize used_hp hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int)internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+					num_pages, hpi->hugepage_sz,
+					socket_id, true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1722,9 +1778,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
-- 
2.7.4


* [PATCH v5 58/70] eal: share hugepage info primary and secondary
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (57 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 57/70] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
                         ` (11 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Since we are going to need to map hugepages in both primary and
secondary processes, we need to know where we should look for
hugetlbfs mountpoints. So, share those with secondary processes,
and map them on init.
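
For reference, the sharing pattern boils down to mapping one file in
both processes. A generic POSIX sketch with placeholder names (the
real helpers are in the patch below):

    #include <fcntl.h>
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* map a fixed-size file shared between processes; the primary
     * passes O_RDWR | O_CREAT and fills it in, the secondary passes
     * O_RDWR and copies the contents out */
    static void *
    map_info_file(const char *path, size_t len, int flags)
    {
            void *addr;
            int fd = open(path, flags, 0600);

            if (fd < 0)
                    return NULL;
            if (ftruncate(fd, len) < 0) {
                    close(fd);
                    return NULL;
            }
            addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
            close(fd);
            return addr == MAP_FAILED ? NULL : addr;
    }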

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/bsdapp/eal/eal.c                 |  19 ++--
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  56 +++++++++--
 lib/librte_eal/bsdapp/eal/eal_memory.c          |  21 +---
 lib/librte_eal/common/eal_common_options.c      |   5 +-
 lib/librte_eal/common/eal_filesystem.h          |  17 ++++
 lib/librte_eal/common/eal_hugepages.h           |  10 +-
 lib/librte_eal/common/eal_internal_cfg.h        |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c               |  18 ++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 121 ++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c        |  15 +--
 10 files changed, 217 insertions(+), 67 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 54330e1..727adc5 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -289,7 +289,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -561,12 +561,17 @@ rte_eal_init(int argc, char **argv)
 	/* autodetect the iova mapping mode (default is iova_pa) */
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+			eal_hugepage_info_init() :
+			eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index ba44da0..38d143c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -19,10 +19,10 @@
  * Used in this file to store the hugepage file map on disk
  */
 static void *
-create_shared_memory(const char *filename, const size_t mem_size)
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
 {
 	void *retval;
-	int fd = open(filename, O_CREAT | O_RDWR, 0666);
+	int fd = open(filename, flags, 0666);
 	if (fd < 0)
 		return NULL;
 	if (ftruncate(fd, mem_size) < 0) {
@@ -34,6 +34,18 @@ create_shared_memory(const char *filename, const size_t mem_size)
 	return retval;
 }
 
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /*
  * No hugepage support on freebsd, but we dummy it, using contigmem driver
  */
@@ -46,13 +58,10 @@ eal_hugepage_info_init(void)
 	/* re-use the linux "internal config" structure for our memory data */
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
+	unsigned int i;
 
 	internal_config.num_hugepage_sizes = 1;
 
-	/* nothing more to be done for secondary */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
@@ -87,7 +96,7 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	hpi->hugedir = CONTIGMEM_DEV;
+	snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", CONTIGMEM_DEV);
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
@@ -101,6 +110,14 @@ eal_hugepage_info_init(void)
 
 	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
 	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
@@ -108,3 +125,28 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* copy stuff from shared info into internal config */
+int
+eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	internal_config.num_hugepage_sizes = 1;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 8a8c44e..6566360 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -242,23 +242,10 @@ int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
-	int fd_hugepage_info, fd_hugepage = -1;
+	int fd_hugepage = -1;
 	unsigned int i;
 
-	/* Obtain a file descriptor for hugepage_info */
-	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
-	if (fd_hugepage_info < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
-		return -1;
-	}
-
-	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
-			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
-	if (hpi == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
-		goto error;
-	}
+	hpi = &internal_config.hugepage_info[0];
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		const struct hugepage_info *cur_hpi = &hpi[i];
@@ -288,13 +275,9 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
-	close(fd_hugepage_info);
 	return 0;
 
 error:
-	if (fd_hugepage_info >= 0)
-		close(fd_hugepage_info);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 5b5da5f..04a4476 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -179,8 +179,11 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		internal_cfg->socket_mem[i] = 0;
 	/* zero out hugedir descriptors */
-	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++)
+	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++) {
+		memset(&internal_cfg->hugepage_info[i], 0,
+				sizeof(internal_cfg->hugepage_info[0]));
 		internal_cfg->hugepage_info[i].lock_descriptor = -1;
+	}
 	internal_cfg->base_virtaddr = 0;
 
 	internal_cfg->syslog_facility = LOG_DAEMON;
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 1c6048b..ad059ef 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -85,6 +85,23 @@ eal_hugepage_info_path(void)
 	return buffer;
 }
 
+/** Path of hugepage info file. */
+#define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file"
+
+static inline const char *
+eal_hugepage_file_path(void)
+{
+	static char buffer[PATH_MAX]; /* static so auto-zeroed */
+	const char *directory = default_config_dir;
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_FILE_FMT, directory,
+			internal_config.hugefile_prefix);
+	return buffer;
+}
+
 /** String format for hugepage map files. */
 #define HUGEFILE_FMT "%s/%smap_%d"
 #define TEMP_HUGEFILE_FMT "%s/%smap_temp_%d"
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index ad1b0b6..4582f19 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -26,9 +26,15 @@ struct hugepage_file {
 };
 
 /**
- * Read the information from linux on what hugepages are available
- * for the EAL to use
+ * Read the information on what hugepages are available for the EAL to use,
+ * clearing out any unused ones.
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read whatever information primary process has shared about hugepages into
+ * secondary process.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 9d33cf4..c4cbf3a 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -21,7 +21,7 @@
  */
 struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
-	const char *hugedir;    /**< dir where hugetlbfs is mounted */
+	char hugedir[PATH_MAX];    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
 	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2c12811..e7c6dcf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -807,13 +807,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 2e0819f..fb4b667 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -14,6 +14,7 @@
 #include <stdarg.h>
 #include <unistd.h>
 #include <errno.h>
+#include <sys/mman.h>
 #include <sys/queue.h>
 #include <sys/stat.h>
 
@@ -33,6 +34,39 @@
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
 static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
+/*
+ * Uses mmap to create a shared memory area for storage of data
+ * Used in this file to store the hugepage file map on disk
+ */
+static void *
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
+{
+	void *retval;
+	int fd = open(filename, flags, 0666);
+	if (fd < 0)
+		return NULL;
+	if (ftruncate(fd, mem_size) < 0) {
+		close(fd);
+		return NULL;
+	}
+	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
+			MAP_SHARED, fd, 0);
+	close(fd);
+	return retval;
+}
+
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
 static uint32_t
@@ -299,15 +333,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(void)
+{	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -323,6 +351,7 @@ eal_hugepage_info_init(void)
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
+		const char *hugedir;
 
 		if (strncmp(dirent->d_name, dirent_start_text,
 			    dirent_start_len) != 0)
@@ -334,10 +363,10 @@ eal_hugepage_info_init(void)
 		hpi = &internal_config.hugepage_info[num_sizes];
 		hpi->hugepage_sz =
 			rte_str_to_size(&dirent->d_name[dirent_start_len]);
-		hpi->hugedir = get_hugepage_dir(hpi->hugepage_sz);
+		hugedir = get_hugepage_dir(hpi->hugepage_sz);
 
 		/* first, check if we have a mountpoint */
-		if (hpi->hugedir == NULL) {
+		if (hugedir == NULL) {
 			uint32_t num_pages;
 
 			num_pages = get_num_hugepages(dirent->d_name);
@@ -349,6 +378,7 @@ eal_hugepage_info_init(void)
 					num_pages, hpi->hugepage_sz);
 			continue;
 		}
+		snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", hugedir);
 
 		/* try to obtain a writelock */
 		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
@@ -411,13 +441,11 @@ eal_hugepage_info_init(void)
 	for (i = 0; i < num_sizes; i++) {
 		/* pages may no longer all be on socket 0, so check all */
 		unsigned int j, num_pages = 0;
+		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
 
-		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-			struct hugepage_info *hpi =
-					&internal_config.hugepage_info[i];
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++)
 			num_pages += hpi->num_pages[j];
-		}
-		if (internal_config.hugepage_info[i].hugedir != NULL &&
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0 &&
 				num_pages > 0)
 			return 0;
 	}
@@ -425,3 +453,64 @@ eal_hugepage_info_init(void)
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	struct hugepage_info *hpi, *tmp_hpi;
+	unsigned int i;
+
+	if (hugepage_info_init() < 0)
+		return -1;
+
+	hpi = &internal_config.hugepage_info[0];
+
+	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
+			sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
+		return -1;
+	}
+
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
+
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
+
+int eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index c51d598..efa1202 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1060,7 +1060,7 @@ get_socket_mem_size(int socket)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++){
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL)
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0)
 			size += hpi->hugepage_sz * hpi->num_pages[socket];
 	}
 
@@ -1160,7 +1160,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
-			hp_used[i].hugedir = hp_info[i].hugedir;
+			snprintf(hp_used[i].hugedir, sizeof(hp_used[i].hugedir),
+					"%s", hp_info[i].hugedir);
 			hp_used[i].num_pages[socket] = RTE_MIN(
 					memory[socket] / hp_info[i].hugepage_sz,
 					hp_info[i].num_pages[socket]);
@@ -1235,7 +1236,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -1508,7 +1509,7 @@ eal_legacy_hugepage_init(void)
 	}
 
 	/* create shared memory */
-	hugepage = create_shared_memory(eal_hugepage_info_path(),
+	hugepage = create_shared_memory(eal_hugepage_file_path(),
 			nr_hugefiles * sizeof(struct hugepage_file));
 
 	if (hugepage == NULL) {
@@ -1693,16 +1694,16 @@ eal_legacy_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
+	fd_hugepage = open(eal_hugepage_file_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
-- 
2.7.4


* [PATCH v5 59/70] eal: add secondary process init with memory hotplug
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (58 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
                         ` (10 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Secondary process initialization will just sync the memory map
with the primary process.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Improved handling of EAL hugepage info

 lib/librte_eal/common/eal_common_memory.c |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index d519f15..fe5fdfc 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index efa1202..7ec7129 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1776,6 +1776,18 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0)
+			RTE_LOG(ERR, EAL, "It is recommended to disable ASLR in the kernel and retry running both primary and secondary processes\n");
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1787,9 +1799,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4


* [PATCH v5 60/70] eal: enable memory hotplug support in rte_malloc
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (59 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
                         ` (9 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This set of changes enables rte_malloc to allocate and free memory
as needed. Currently, it is disabled because legacy mem mode is
enabled unconditionally.

The way it works is: on allocation, malloc first checks whether
there is enough memory already allocated to satisfy the user's
request; if there isn't, we try to allocate more. The reverse
happens on free - we free an element, check its size (after
merging adjacent free elements), and see whether it is bigger
than the hugepage size and whether its start and end span one or
more whole hugepages. If so, we remove the area from the malloc
heap (adjusting element lengths where appropriate) and deallocate
the underlying pages.
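
A sketch of that "does the freed element cover whole pages?" check,
using the EAL pointer-alignment helpers (a standalone illustration
under assumed names, not the actual heap code):

    #include <stdbool.h>
    #include <stddef.h>
    #include <rte_common.h>

    /* true if at least one whole page lies inside [start, start + len),
     * i.e. something can actually be handed back to the system */
    static bool
    can_return_pages(void *start, size_t len, size_t page_sz)
    {
            void *aligned_start, *aligned_end, *end;

            if (len < page_sz)
                    return false;
            aligned_start = RTE_PTR_ALIGN_CEIL(start, page_sz);
            end = RTE_PTR_ADD(start, len);
            aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
            return RTE_PTR_DIFF(aligned_end, aligned_start) >= page_sz;
    }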

For legacy mode, runtime alloc/free of pages is disabled.

It is worth noting that memseg lists are sorted by page size, and
that we try our best to satisfy the user's request. That is, if
the user requests an element from 2MB page memory, we first check
whether the request can be satisfied from existing memory; if not,
we try to allocate more 2MB pages. If that fails and the user also
specified a "size is hint" flag, we then check other page sizes
and try to allocate from there. If that fails too, then, depending
on flags, we may try allocating from other sockets. In other
words, we try our best to give the user what they asked for, but
going to other sockets is the last resort - first we try to
allocate more memory on the same socket.
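
From the caller's point of view, the "size is hint" behaviour
described above corresponds to the existing memzone flags. An
illustrative usage (the memzone name is arbitrary):

    #include <rte_memzone.h>

    /* prefer 2M pages, but allow falling back to other page sizes
     * (and, with SOCKET_ID_ANY, to other sockets) if they run out */
    static const struct rte_memzone *
    reserve_with_hint(void)
    {
            return rte_memzone_reserve("example_mz", 4 << 20,
                            SOCKET_ID_ANY,
                            RTE_MEMZONE_2MB | RTE_MEMZONE_SIZE_HINT_ONLY);
    }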

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v3:
    - Compile fixes

 lib/librte_eal/common/eal_common_memzone.c |  26 +--
 lib/librte_eal/common/malloc_elem.c        |  92 ++++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 347 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 439 insertions(+), 64 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 12ddd42..bce3321 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -93,7 +93,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_mem_config *mcfg;
 	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i, mz_idx;
+	int mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
@@ -183,29 +183,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound, contig);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align,
-					bound, contig);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
-
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound, contig);
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 9db416f..ee79dcd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -447,6 +447,98 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	struct malloc_elem *hide_start, *hide_end, *prev, *next;
+	size_t len_before, len_after;
+
+	hide_start = start;
+	hide_end = RTE_PTR_ADD(start, len);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	/* we cannot do anything with non-adjacent elements */
+	if (next && next_elem_is_adjacent(elem)) {
+		len_after = RTE_PTR_DIFF(next, hide_end);
+		if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split after */
+			split_elem(elem, hide_end);
+
+			malloc_elem_free_list_insert(hide_end);
+		} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+			/* shrink current element */
+			elem->size -= len_after;
+			memset(hide_end, 0, sizeof(*hide_end));
+
+			/* copy next element's data to our pad */
+			memcpy(hide_end, next, sizeof(*hide_end));
+
+			/* pad next element */
+			next->state = ELEM_PAD;
+			next->pad = len_after;
+			next->size -= len_after;
+
+			/* next element busy, would've been merged otherwise */
+			hide_end->pad = len_after;
+			hide_end->size += len_after;
+
+			/* adjust pointers to point to our new pad */
+			if (next->next)
+				next->next->prev = hide_end;
+			elem->next = hide_end;
+		} else if (len_after > 0) {
+			RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
+			return;
+		}
+	}
+
+	/* we cannot do anything with non-adjacent elements */
+	if (prev && prev_elem_is_adjacent(elem)) {
+		len_before = RTE_PTR_DIFF(hide_start, elem);
+		if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split before */
+			split_elem(elem, hide_start);
+
+			prev = elem;
+			elem = hide_start;
+
+			malloc_elem_free_list_insert(prev);
+		} else if (len_before > 0) {
+			/*
+			 * unlike with elements after current, here we don't
+			 * need to pad elements, but rather just increase the
+			 * size of previous element, copy the old header and set
+			 * up trailer.
+			 */
+			void *trailer = RTE_PTR_ADD(prev,
+					prev->size - MALLOC_ELEM_TRAILER_LEN);
+
+			memcpy(hide_start, elem, sizeof(*elem));
+			hide_start->size = len;
+
+			prev->size += len_before;
+			set_trailer(prev);
+
+			/* update pointers */
+			prev->next = hide_start;
+			if (next)
+				next->prev = hide_start;
+
+			/* erase old trailer */
+			memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+			/* erase old header */
+			memset(elem, 0, sizeof(*elem));
+
+			elem = hide_start;
+		}
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 620dd44..8f4aef8 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -154,6 +154,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index d798675..5f8c643 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -149,48 +151,371 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound, bool contig)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound, contig);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	size_t map_len;
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	int n_segs, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_segs = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
+			socket, true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages < 0)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* check if we wanted contiguous memory but didn't get it */
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+free_pages:
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->page_sz;
+	uint64_t pg_sz_b = mslb->page_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->page_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound, contig))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound, contig))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			RTE_LOG(ERR, EAL, "Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	int socket, i, cur_socket;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_socket_count(); i++) {
+		cur_socket = rte_socket_id_by_idx(i);
+		if (cur_socket == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, cur_socket, flags,
+				align, bound, contig);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len, page_sz;
+	struct rte_memseg_list *msl;
+	int n_segs, seg_idx, max_seg_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
+	page_sz = (size_t)msl->page_sz;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	/* mark element as free */
+	elem->state = ELEM_FREE;
 
-	rte_spinlock_unlock(&(heap->lock));
+	elem = malloc_elem_free(elem);
+
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < page_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, page_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < page_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	heap->total_size -= aligned_len;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index c57b59a..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -26,8 +26,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned int flags, size_t align, size_t bound, bool contig);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c6d3e57..b51a6d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -40,10 +40,6 @@ void *
 rte_malloc_socket(const char *type, size_t size, unsigned int align,
 		int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -51,33 +47,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align,
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0, false);
 }
 
 /*
-- 
2.7.4


* [PATCH v5 61/70] eal: add support for multiprocess memory hotplug
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (60 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 62/70] eal: add support for callbacks on " Anatoly Burakov
                         ` (8 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

Basic workflow is the following. The primary process always does
the initial mapping and unmapping, and secondary processes always
follow the primary's page map. Only one allocation request can be
active at any one time.

When the primary allocates memory, it ensures that all other
processes have allocated the same set of hugepages successfully;
otherwise any allocations made are rolled back, and the affected
memory is removed from the heap and freed. The heap is locked
throughout the process, and there is also a global memory hotplug
lock, so no race conditions can occur.

When the primary frees memory, it removes the area from the heap,
deallocates the affected pages, and notifies other processes of
the deallocation. Since the heap no longer covers that memory
chunk, the area becomes invisible to other processes even if they
fail to unmap that specific set of pages, so it is completely safe
to ignore the results of sync requests.

When a secondary process allocates memory, it does not do so by
itself. Instead, it sends a request to the primary process to try
to allocate pages of the specified size on the specified socket,
such that the pending heap allocation request can complete. The
primary process then sends all secondaries (including the
requestor) a separate notification of the allocated pages, and
expects all secondary processes to report success before
considering the pages as "allocated".

Only after the primary process has ensured that all memory was
successfully allocated in all secondary processes will it respond
positively to the initial request and let the secondary proceed
with the allocation. Since the heap now has memory that can
satisfy the allocation request, and it was locked all this time
(so no other allocations could take place), the secondary process
will be able to allocate memory from the heap.

When a secondary process frees memory, it first hides the pages
to be deallocated from the heap. Then it sends a deallocation
request to the primary process, which deallocates the pages
itself and then sends a separate sync request to all other
processes (including the requestor) to unmap the same pages. This
way, even if a secondary fails to notify other processes of this
deallocation, that memory becomes invisible to the other
processes and will not be allocated from again.

So, to summarize: address space will only become part of the heap
if the primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the
worst thing that can happen is that a page will "leak" and will
not be available to either DPDK or the system, as some process
will still hold onto it. It is not an actual leak, as we can
account for the page - it's just that none of the processes will
be able to use this page for anything useful, until it is
allocated from again by the primary.

Because the underlying DPDK IPC implementation is single-threaded,
some asynchronous magic had to be done, as we need to complete
several requests before we can definitively allow a secondary
process to use allocated memory (namely, the memory has to be
present in all other secondary processes before it can be used).
Additionally, only one allocation request may be submitted at a
time.

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that, a
shared rwlock is used: it is taken for reading on init (so that
several secondaries can initialize concurrently), and for writing
when making allocation requests (so that either secondary init
has to wait, or the allocation request has to wait until all
processes have initialized).

Any other function that wishes to iterate over memory or prevent
allocations should use the memory hotplug lock.
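
A simplified sketch of that locking scheme, using the EAL rwlock API
(the real lock lives in the shared memory config; the names here are
illustrative only):

    #include <rte_rwlock.h>

    static rte_rwlock_t mem_hotplug_lock = RTE_RWLOCK_INITIALIZER;

    static void
    secondary_init(void)
    {
            /* multiple secondaries may initialize concurrently */
            rte_rwlock_read_lock(&mem_hotplug_lock);
            /* ... sync memory map with primary ... */
            rte_rwlock_read_unlock(&mem_hotplug_lock);
    }

    static void
    allocation_request(void)
    {
            /* excludes any in-progress secondary init */
            rte_rwlock_write_lock(&mem_hotplug_lock);
            /* ... perform the allocation request ... */
            rte_rwlock_write_unlock(&mem_hotplug_lock);
    }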

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/eal_common_memory.c         |  67 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 744 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        |  32 +-
 9 files changed, 1126 insertions(+), 64 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fe5fdfc..22365c1 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -674,6 +674,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -698,15 +701,20 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 			len = n_segs * msl->page_sz;
 
 			ret = func(msl, ms, len, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr,
 					ms_idx + n_segs);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -715,6 +723,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -729,14 +740,19 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 		while (ms_idx >= 0) {
 			ms = rte_fbarray_get(arr, ms_idx);
 			ret = func(msl, ms, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -745,6 +761,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
@@ -752,12 +771,18 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 			continue;
 
 		ret = func(msl, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		if (ret < 0) {
+			ret = -1;
+			goto out;
+		}
+		if (ret > 0) {
+			ret = 1;
+			goto out;
+		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 /* init memory subsystem */
@@ -771,6 +796,9 @@ rte_eal_memory_init(void)
 	if (!mcfg)
 		return -1;
 
+	/* lock mem hotplug here, to prevent races while we init */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 #ifndef RTE_ARCH_64
 			memseg_primary_init_32() :
@@ -780,16 +808,19 @@ rte_eal_memory_init(void)
 			memseg_secondary_init();
 
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0)
-		return -1;
+		goto fail;
 
 	return 0;
+fail:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return -1;
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index a781793..aff0688 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -59,6 +59,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< used to synchronize memory hotplug requests and memory map access. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5f8c643..be39250 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -171,68 +171,118 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_segs)
 {
-	size_t map_len;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int n_segs, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	size_t alloc_sz;
+	int allocd_pages;
 	void *ret, *map_addr;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_segs = map_len / pg_sz;
-
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_segs);
-
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages < 0)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
+	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
-	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
-	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
 
 	/* try once more, as now we have allocated new memory */
 	ret = find_suitable_element(heap, elt_size, flags, align, bound,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t alloc_sz;
+	int n_segs;
+
+	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_segs = alloc_sz / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	if (ms == NULL)
+		return -1;
+
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_segs);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += alloc_sz;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
-		socket, map_len >> 20ULL);
+		socket, alloc_sz >> 20ULL);
 
 	free(ms);
 
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
-free_pages:
-	eal_memalloc_free_seg_bulk(ms, n_segs);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -240,6 +290,59 @@ try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	memset(&req, 0, sizeof(req));
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -257,11 +360,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -393,7 +495,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -446,14 +549,41 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_segs, seg_idx, max_seg_idx;
+	struct rte_memseg_list *msl;
+	size_t page_sz;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	page_sz = (size_t)msl->page_sz;
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
-	int n_segs, seg_idx, max_seg_idx, ret;
+	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -494,25 +624,56 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so: even if notifications about
+	 * unmapped pages don't make it to other processes, the heap is shared
+	 * across all processes, so it will become empty of this memory anyway,
+	 * and nothing can allocate it back unless the primary process is able
+	 * to deliver the allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_segs = aligned_len / page_sz;
-	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
-	max_seg_idx = seg_idx + n_segs;
+	heap->total_size -= aligned_len;
 
-	for (; seg_idx < max_seg_idx; seg_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
-		eal_memalloc_free_seg(ms);
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		memset(&req, 0, sizeof(req));
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we request primary to deallocate pages, but we don't do it
+		 * in this thread. instead, we notify primary that we would like
+		 * to deallocate pages, and this process will receive another
+		 * request (in parallel) that will do it for us on another
+		 * thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
-	heap->total_size -= aligned_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -600,8 +761,16 @@ rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
-	if (mcfg == NULL)
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
 		return -1;
+	}
+
+	/* unlock mem hotplug here. it's safe for primary as no requests can
+	 * even come before primary itself is fully initialized, and secondaries
+	 * do not need to initialize the heap.
+	 */
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 
 	/* secondary process does not need to initialize anything */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..72b1f4c
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,744 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to notify of changes in memory map. this is
+ * essentially a regular sync request, but we cannot send sync requests while
+ * another one is in progress, and we might have to - therefore, we do this as
+ * a separate callback.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+/* forward declarations */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to timeout earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* this ID is, like, totally guaranteed to be absolutely unique. pinky swear. */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* secondary will respond to sync requests thusly */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply;
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	memset(&reply, 0, sizeof(reply));
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t alloc_sz;
+	int n_segs;
+	void *map_addr;
+
+	alloc_sz = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_segs = alloc_sz / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_segs);
+
+	if (elem == NULL)
+		goto fail;
+
+	map_addr = ms[0]->addr;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_segs;
+	req->alloc_state.map_addr = map_addr;
+	req->alloc_state.map_len = alloc_sz;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg;
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		memset(&sr_msg, 0, sizeof(sr_msg));
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts,
+					handle_sync_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg;
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		memset(&rb_msg, 0, sizeof(rb_msg));
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts,
+					handle_rollback_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, " to sync request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	memset(&msg, 0, sizeof(msg));
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer  __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request memory map sync, this is only called whenever primary
+ * process initiates the allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg;
+	struct rte_mp_reply reply;
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&reply, 0, sizeof(reply));
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be stray timeout still waiting */
+	do {
+		ret = rte_mp_request_sync(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a bunch of asynchronous requests to
+ * primary process. this will initiate a request and wait until responses come.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts;
+	struct timeval now;
+	int ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&ts, 0, sizeof(ts));
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..6810b4f
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_segs);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif // MALLOC_MP_H
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 4876d07..75f2b0c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -211,6 +211,32 @@ is_zero_length(int fd)
 	return st.st_blocks == 0;
 }
 
+/* we cannot use rte_memseg_list_walk() here because we will be holding a
+ * write lock whenever we enter any function in this file, and copying the
+ * same iteration code everywhere is not ideal either. so, use a lockless
+ * copy of the memseg list walk here.
+ */
+static int
+memseg_list_walk_thread_unsafe(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		ret = func(msl, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
@@ -739,7 +765,7 @@ eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 	wa.socket = socket;
 	wa.segs_allocated = 0;
 
-	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	ret = memseg_list_walk_thread_unsafe(alloc_seg_walk, &wa);
 	if (ret == 0) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
@@ -797,7 +823,7 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		wa.ms = cur;
 		wa.hi = hi;
 
-		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		walk_res = memseg_list_walk_thread_unsafe(free_seg_walk, &wa);
 		if (walk_res == 1)
 			continue;
 		if (walk_res == 0)
@@ -1054,7 +1080,7 @@ eal_memalloc_sync_with_primary(void)
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		return 0;
 
-	if (rte_memseg_list_walk(sync_walk, NULL))
+	if (memseg_list_walk_thread_unsafe(sync_walk, NULL))
 		return -1;
 	return 0;
 }
-- 
2.7.4


* [PATCH v5 62/70] eal: add support for callbacks on memory hotplug
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (61 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
                         ` (7 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Each process will have its own callbacks. Callbacks will indicate
whether it's an allocation or a deallocation that has happened, and
will also provide the start VA address and length of the allocated
block.

Since memory hotplug isn't supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either case.

Callbacks are called whenever something happens to the memory map of
the current process, and at those times the memory hotplug subsystem
is write-locked, so calling functions that take that lock from within
a callback will deadlock. Document the limitation.
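
A minimal usage sketch of the new API (the callback body and names are
illustrative only, and since these functions are __rte_experimental the
application has to be built with experimental API support):

#include <rte_memory.h>
#include <rte_errno.h>
#include <rte_log.h>

/* illustrative callback: just log what happened to the memory map */
static void
app_mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len)
{
	/* the hotplug lock is held here, so do not call rte_memseg_walk()
	 * or other locking memory API's from within this callback
	 */
	RTE_LOG(DEBUG, USER1, "mem event: %s at %p, len %zu\n",
		event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
		addr, len);
}

static int
app_register_mem_event_cb(void)
{
	/* callback name is arbitrary but must be unique in this process */
	if (rte_mem_event_callback_register("app-mem-event",
			app_mem_event_cb) < 0) {
		/* e.g. rte_errno == ENOTSUP in legacy mem mode or on FreeBSD */
		return -rte_errno;
	}
	return 0;
}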

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Document limitation about potential deadlocks.
    
    Should we provide thread-unsafe versions of these
    functions as well?
    
    v3:
    - Made API experimental
    - Compile fixes

 lib/librte_eal/common/eal_common_memalloc.c | 133 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  11 +++
 lib/librte_eal/common/include/rte_memory.h  |  71 +++++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 245 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 607ec3f..2d2d46f 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Doubly linked list of mem event callbacks. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list mem_event_callback_list =
+	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
+
+static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_mem_event_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &mem_event_callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -88,3 +118,106 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 	}
 	return true;
 }
+
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_event_callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_event_callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&mem_event_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_event_callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback %s",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&mem_event_rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 22365c1..1f15ff7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -624,6 +624,34 @@ dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	return 0;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_register(name, clb);
+}
+
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4a7b45c..4d27403 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -56,4 +56,15 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name);
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 55383c4..398ca55 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -136,6 +136,9 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 /**
  * Get virtual memory address corresponding to iova address.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param iova
  *   The iova address.
  * @return
@@ -203,6 +206,9 @@ typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
 /**
  * Walk list of all memsegs.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -218,6 +224,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 /**
  * Walk each VA-contiguous area.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -233,6 +242,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 /**
  * Walk each allocated memseg list.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -248,6 +260,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 /**
  * Dump the physical memory layout to a file.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param f
  *   A pointer to a file for output
  */
@@ -256,6 +271,9 @@ void rte_dump_physmem_layout(FILE *f);
 /**
  * Get the total amount of available physical memory.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @return
  *    The total amount of available physical memory in bytes.
  */
@@ -290,6 +308,59 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 23b339e..d1ac9ea 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_event_callback_register;
+	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
-- 
2.7.4


* [PATCH v5 63/70] eal: enable callbacks on malloc/free and mp sync
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (62 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 62/70] eal: add support for callbacks on " Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
                         ` (6 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Callbacks will be triggered just after allocation and just
before deallocation, to ensure that the memory address space
referenced in the callback is always valid by the time the
callback is called.
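
For example, a driver that keeps a DMA map of all DPDK memory can rely
on this ordering (sketch only; dma_map()/dma_unmap() are hypothetical
stand-ins for e.g. a VFIO DMA map/unmap helper):

#include <rte_memory.h>

/* hypothetical driver helpers for mapping/unmapping memory for DMA */
void dma_map(const void *addr, size_t len);
void dma_unmap(const void *addr, size_t len);

static void
dma_mem_event_cb(enum rte_mem_event event_type, const void *addr, size_t len)
{
	if (event_type == RTE_MEM_EVENT_ALLOC)
		dma_map(addr, len);	/* pages are already mapped here */
	else /* RTE_MEM_EVENT_FREE */
		dma_unmap(addr, len);	/* pages still mapped, about to be freed */
}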

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 15 +++++++++++++--
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index be39250..18c7b69 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -241,6 +241,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t alloc_sz;
 	int n_segs;
+	bool callback_triggered = false;
 
 	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -262,12 +263,22 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC, map_addr, alloc_sz);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
 	heap->total_size += alloc_sz;
@@ -280,6 +291,10 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				map_addr, alloc_sz);
+
 	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
 	request_sync();
@@ -642,6 +657,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= aligned_len;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -666,6 +685,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 75f2b0c..93f80bb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -876,6 +876,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -901,6 +916,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
+				start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 5101c04..2eea3b8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -1128,6 +1128,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	create.levels = 1;
 
 	if (do_map) {
+		void *addr;
 		/* re-create window and remap the entire memory */
 		if (iova > create.window_size) {
 			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
@@ -1158,9 +1159,19 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 
 		/* now that we've remapped all of the memory that was present
 		 * before, map the segment that we were requested to map.
+		 *
+		 * however, if we were called by the callback, the memory we
+		 * were called with was already in the memseg list, so previous
+		 * mapping should've mapped that segment already.
+		 *
+		 * virt2memseg_list is a relatively cheap check, so use that. if
+		 * memory is within any memseg list, it's a memseg, so it's
+		 * already mapped.
 		 */
-		if (vfio_spapr_dma_do_map(vfio_container_fd,
-				vaddr, iova, len, 1) < 0) {
+		addr = (void *)(uintptr_t)vaddr;
+		if (rte_mem_virt2memseg_list(addr) == NULL &&
+				vfio_spapr_dma_do_map(vfio_container_fd,
+					vaddr, iova, len, 1) < 0) {
 			RTE_LOG(ERR, EAL, "Could not map segment\n");
 			ret = -1;
 			goto out;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 64/70] vfio: enable support for mem event callbacks
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (63 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
                         ` (5 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Enable callbacks on first device attach, disable callbacks
on last device detach.

The PPC64 IOMMU does a memseg walk, which will deadlock if
attempted from inside a callback, so provide a local,
thread-unsafe copy of the memseg walk.

The PPC64 IOMMU may also remap the entire memory map for DMA
while adding new elements to it, so change the user map list
lock to a recursive lock. That way, we can safely enter
rte_vfio_dma_map(), lock the user map list, enter the DMA
mapping function and lock the list again (for reading
previously existing maps).
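
For illustration only, a minimal sketch of the re-entrant locking
pattern this enables (the function names here are hypothetical; the
lock API is the existing rte_spinlock_recursive one):

    #include <rte_spinlock.h>

    static rte_spinlock_recursive_t maps_lock =
            RTE_SPINLOCK_RECURSIVE_INITIALIZER;

    /* hypothetical helper that may be reached with the lock held */
    static void
    remap_existing_user_maps(void)
    {
        /* re-acquiring from the same thread is safe */
        rte_spinlock_recursive_lock(&maps_lock);
        /* ... walk previously recorded user maps ... */
        rte_spinlock_recursive_unlock(&maps_lock);
    }

    /* hypothetical outer path, e.g. a user DMA map request */
    static void
    user_dma_map_request(void)
    {
        rte_spinlock_recursive_lock(&maps_lock);
        /* with a plain spinlock, this nested call would deadlock */
        remap_existing_user_maps();
        rte_spinlock_recursive_unlock(&maps_lock);
    }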

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Checkpatch fixes
    
    v4:
    - Fix PPC64 memseg walk in callback
    - Check if registering callbacks succeeded
    
    v3:
    - Moved callbacks to attach/detach as opposed to init

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 157 +++++++++++++++++++++++++++++----
 1 file changed, 138 insertions(+), 19 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2eea3b8..589d7d4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -20,6 +20,8 @@
 
 #ifdef VFIO_PRESENT
 
+#define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
+
 /* per-process VFIO config */
 static struct vfio_config vfio_cfg;
 
@@ -69,13 +71,49 @@ struct user_mem_map {
 	uint64_t len;
 };
 static struct {
-	rte_spinlock_t lock;
+	rte_spinlock_recursive_t lock;
 	int n_maps;
 	struct user_mem_map maps[VFIO_MAX_USER_MEM_MAPS];
 } user_mem_maps = {
-	.lock = RTE_SPINLOCK_INITIALIZER
+	.lock = RTE_SPINLOCK_RECURSIVE_INITIALIZER
 };
 
+/* for sPAPR IOMMU, we will need to walk the memseg list, but we cannot use
+ * rte_memseg_walk() because by the time we enter the callback we will be
+ * holding a write lock, so regular rte_memseg_walk() will deadlock. Copying
+ * the same iteration code everywhere is not ideal either, so use a lockless
+ * copy of the memseg walk here.
+ */
+static int
+memseg_walk_thread_unsafe(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ms_idx, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
+
 static int
 is_null_map(const struct user_mem_map *map)
 {
@@ -406,6 +444,38 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	/* for IOVA as VA mode, no need to care for IOVA addresses */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		uint64_t vfio_va = (uint64_t)(uintptr_t)addr;
+		if (type == RTE_MEM_EVENT_ALLOC)
+			vfio_dma_mem_map(vfio_va, vfio_va, len, 1);
+		else
+			vfio_dma_mem_map(vfio_va, vfio_va, len, 0);
+		return;
+	}
+
+	/* memsegs are contiguous in memory */
+	ms = rte_mem_virt2memseg(addr, msl);
+	while (cur_len < len) {
+		if (type == RTE_MEM_EVENT_ALLOC)
+			vfio_dma_mem_map(ms->addr_64, ms->iova, ms->len, 1);
+		else
+			vfio_dma_mem_map(ms->addr_64, ms->iova, ms->len, 0);
+
+		cur_len += ms->len;
+		++ms;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -468,6 +538,8 @@ int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		int *vfio_dev_fd, struct vfio_device_info *device_info)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -555,6 +627,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+			/* lock memory hotplug before mapping and release it
+			 * after registering callback, to prevent races
+			 */
+			rte_rwlock_read_lock(mem_lock);
 			ret = t->dma_map_func(vfio_cfg.vfio_container_fd);
 			if (ret) {
 				RTE_LOG(ERR, EAL,
@@ -562,13 +638,14 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 					dev_addr, errno, strerror(errno));
 				close(vfio_group_fd);
 				rte_vfio_clear_group(vfio_group_fd);
+				rte_rwlock_read_unlock(mem_lock);
 				return -1;
 			}
 
 			vfio_cfg.vfio_iommu_type = t;
 
 			/* re-map all user-mapped segments */
-			rte_spinlock_lock(&user_mem_maps.lock);
+			rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 			/* this IOMMU type may not support DMA mapping, but
 			 * if we have mappings in the list - that means we have
@@ -590,12 +667,29 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 							"len: 0x%" PRIu64 "\n",
 							map->addr, map->iova,
 							map->len);
-					rte_spinlock_unlock(
+					rte_spinlock_recursive_unlock(
 							&user_mem_maps.lock);
+					rte_rwlock_read_unlock(mem_lock);
 					return -1;
 				}
 			}
-			rte_spinlock_unlock(&user_mem_maps.lock);
+			rte_spinlock_recursive_unlock(&user_mem_maps.lock);
+
+			/* register callback for mem events */
+			ret = rte_mem_event_callback_register(
+					VFIO_MEM_EVENT_CLB_NAME,
+					vfio_mem_event_callback);
+			/* unlock memory hotplug */
+			rte_rwlock_read_unlock(mem_lock);
+
+			if (ret && rte_errno != ENOTSUP) {
+				RTE_LOG(ERR, EAL, "Could not install memory event callback for VFIO\n");
+				return -1;
+			}
+			if (ret)
+				RTE_LOG(DEBUG, EAL, "Memory event callbacks not supported\n");
+			else
+				RTE_LOG(DEBUG, EAL, "Installed memory event callback for VFIO\n");
 		}
 	}
 
@@ -633,6 +727,8 @@ int
 rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		    int vfio_dev_fd)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -640,13 +736,20 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	int iommu_group_no;
 	int ret;
 
+	/* we don't want any DMA mapping messages to come while we're detaching
+	 * VFIO device, because this might be the last device and we might need
+	 * to unregister the callback.
+	 */
+	rte_rwlock_read_lock(mem_lock);
+
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
 	if (ret <= 0) {
 		RTE_LOG(WARNING, EAL, "  %s not managed by VFIO driver\n",
 			dev_addr);
 		/* This is an error at this point. */
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* get the actual group fd */
@@ -654,7 +757,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (vfio_group_fd <= 0) {
 		RTE_LOG(INFO, EAL, "vfio_get_group_fd failed for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* At this point we got an active group. Closing it will make the
@@ -666,7 +770,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (close(vfio_dev_fd) < 0) {
 		RTE_LOG(INFO, EAL, "Error when closing vfio_dev_fd for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* An VFIO group can have several devices attached. Just when there is
@@ -678,17 +783,30 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		if (close(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when closing vfio_group_fd for %s\n",
 				dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 
 		if (rte_vfio_clear_group(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when clearing group for %s\n",
 					   dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 	}
 
-	return 0;
+	/* if there are no active device groups, unregister the callback to
+	 * avoid spurious attempts to map/unmap memory from VFIO.
+	 */
+	if (vfio_cfg.vfio_active_groups == 0)
+		rte_mem_event_callback_unregister(VFIO_MEM_EVENT_CLB_NAME);
+
+	/* success */
+	ret = 0;
+
+out:
+	rte_rwlock_read_unlock(mem_lock);
+	return ret;
 }
 
 int
@@ -1104,12 +1222,13 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	};
 	int i, ret = 0;
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+	if (memseg_walk_thread_unsafe(vfio_spapr_window_size_walk,
+				&param) < 0) {
 		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		ret = -1;
 		goto out;
@@ -1137,7 +1256,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 				ret = -1;
 				goto out;
 			}
-			if (rte_memseg_walk(vfio_spapr_map_walk,
+			if (memseg_walk_thread_unsafe(vfio_spapr_map_walk,
 					&vfio_container_fd) < 0) {
 				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
 				ret = -1;
@@ -1187,7 +1306,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
@@ -1272,7 +1391,7 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
 		return -1;
 	}
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 	if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
 		RTE_LOG(ERR, EAL, "No more space for user mem maps\n");
 		rte_errno = ENOMEM;
@@ -1300,7 +1419,7 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
 
 	compact_user_maps();
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
@@ -1315,7 +1434,7 @@ rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
 		return -1;
 	}
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 	/* find our mapping */
 	map = find_user_mem_map(vaddr, iova, len);
@@ -1374,7 +1493,7 @@ rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
 	}
 
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 65/70] bus/fslmc: move vfio DMA map into bus probe
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (64 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
                         ` (4 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

The fslmc bus needs to map all allocated memory for VFIO before
device probe. This bus doesn't support hotplug, so at the time
of this call, all possible devices that could be present are
present. This will also be the place where we install the VFIO
memory event callback, although that change will come in the
next patch.

Since rte_fslmc_vfio_dmamap() is now only called at bus probe,
there is no longer any need to check whether DMA mappings have
already been done.
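
For illustration only, a minimal sketch of the walk API that
rte_fslmc_vfio_dmamap() relies on to cover the already-allocated
segments at probe time (the counting callback is hypothetical):

    #include <rte_common.h>
    #include <rte_memory.h>

    /* hypothetical callback: count segments that would need a DMA map */
    static int
    count_seg(const struct rte_memseg_list *msl __rte_unused,
            const struct rte_memseg *ms __rte_unused, void *arg)
    {
        int *n_segs = arg;

        (*n_segs)++;
        return 0; /* returning zero keeps the walk going */
    }

    static int
    count_existing_segs(void)
    {
        int n = 0;

        if (rte_memseg_walk(count_seg, &n) < 0)
            return -1;
        return n;
    }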

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 drivers/bus/fslmc/fslmc_bus.c    | 11 +++++++++++
 drivers/bus/fslmc/fslmc_vfio.c   |  6 ------
 drivers/net/dpaa2/dpaa2_ethdev.c |  1 -
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index d6806df..d0b3261 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -286,6 +286,17 @@ rte_fslmc_probe(void)
 		return 0;
 	}
 
+	/* Map existing segments and, in case of hotpluggable memory,
+	 * install the callback handler.
+	 */
+	ret = rte_fslmc_vfio_dmamap();
+	if (ret) {
+		DPAA2_BUS_ERR("Unable to DMA map existing VAs: (%d)", ret);
+		/* Not continuing ahead */
+		DPAA2_BUS_ERR("FSLMC VFIO Mapping failed");
+		return 0;
+	}
+
 	ret = fslmc_vfio_process_group();
 	if (ret) {
 		DPAA2_BUS_ERR("Unable to setup devices %d", ret);
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 8b15312..db3eb61 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -51,7 +51,6 @@ static int container_device_fd;
 static char *g_container;
 static uint32_t *msi_intr_vaddr;
 void *(*rte_mcp_ptr_list);
-static int is_dma_done;
 
 static struct rte_dpaa2_object_list dpaa2_obj_list =
 	TAILQ_HEAD_INITIALIZER(dpaa2_obj_list);
@@ -235,9 +234,6 @@ int rte_fslmc_vfio_dmamap(void)
 {
 	int i = 0;
 
-	if (is_dma_done)
-		return 0;
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
@@ -254,8 +250,6 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
-	is_dma_done = 1;
-
 	return 0;
 }
 
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 281483d..5b8f30a 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -1845,7 +1845,6 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->rx_pkt_burst = dpaa2_dev_prefetch_rx;
 	eth_dev->tx_pkt_burst = dpaa2_dev_tx;
-	rte_fslmc_vfio_dmamap();
 
 	DPAA2_PMD_INFO("%s: netdev created", eth_dev->data->name);
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (65 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 67/70] eal: enable non-legacy memory mode Anatoly Burakov
                         ` (3 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

VFIO needs to map and unmap segments for DMA whenever they
become available or unavailable, so register a callback for
memory events, and provide map/unmap functions.

Remove the unneeded check for the number of segments, as having
no segments mapped at probe time is now a valid scenario in
non-legacy mode.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v5:
    - Fixed error messages having unnecessary newlines
    - Removed unnecessary check for number of memsegs
    
    v4:
    - Added this patch

 drivers/bus/fslmc/fslmc_vfio.c | 153 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 135 insertions(+), 18 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index db3eb61..625fa7c 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -30,6 +30,7 @@
 #include <rte_kvargs.h>
 #include <rte_dev.h>
 #include <rte_bus.h>
+#include <rte_eal_memconfig.h>
 
 #include "rte_fslmc.h"
 #include "fslmc_vfio.h"
@@ -188,11 +189,62 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
+static int fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+static int fslmc_unmap_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+
+static void
+fslmc_memevent_cb(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0, map_len = 0;
+	uint64_t virt_addr;
+	rte_iova_t iova_addr;
+	int ret;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+
+		ms = rte_mem_virt2memseg(va, msl);
+		iova_addr = ms->iova;
+		virt_addr = ms->addr_64;
+		map_len = ms->len;
+
+		DPAA2_BUS_DEBUG("Request for %s, va=%p, "
+				"virt_addr=0x%" PRIx64 ", "
+				"iova=0x%" PRIx64 ", map_len=%zu",
+				type == RTE_MEM_EVENT_ALLOC ?
+					"alloc" : "dealloc",
+				va, virt_addr, iova_addr, map_len);
+
+		if (type == RTE_MEM_EVENT_ALLOC)
+			ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
+		else
+			ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
+
+		if (ret != 0) {
+			DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. "
+					"Map=%d, addr=%p, len=%zu, err:(%d)",
+					type, va, map_len, ret);
+			return;
+		}
+
+		cur_len += map_len;
+	}
+
+	if (type == RTE_MEM_EVENT_ALLOC)
+		DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu",
+				addr, len);
+	else
+		DPAA2_BUS_DEBUG("Total Unmapped: addr=%p, len=%zu",
+				addr, len);
+}
+
 static int
-fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
-		const struct rte_memseg *ms, void *arg)
+fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
 {
-	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
@@ -200,10 +252,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 	};
 	int ret;
 
-	dma_map.size = ms->len;
-	dma_map.vaddr = ms->addr_64;
+	dma_map.size = len;
+	dma_map.vaddr = vaddr;
+
 #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-	dma_map.iova = ms->iova;
+	dma_map.iova = iovaddr;
 #else
 	dma_map.iova = dma_map.vaddr;
 #endif
@@ -216,32 +269,91 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 		return -1;
 	}
 
-	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-			dma_map.vaddr);
-	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			&dma_map);
+	DPAA2_BUS_DEBUG("--> Map address: %llX, size: 0x%llX",
+			dma_map.vaddr, dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 	if (ret) {
 		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
 				errno);
 		return -1;
 	}
-	(*n_segs)++;
+
 	return 0;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
 {
-	int i = 0;
+	struct fslmc_vfio_group *group;
+	struct vfio_iommu_type1_dma_unmap dma_unmap = {
+		.argsz = sizeof(struct vfio_iommu_type1_dma_unmap),
+		.flags = 0,
+	};
+	int ret;
+
+	dma_unmap.size = len;
+	dma_unmap.iova = vaddr;
 
-	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
 		return -1;
+	}
 
-	/* Verifying that at least single segment is available */
-	if (i <= 0) {
-		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
+	DPAA2_BUS_DEBUG("--> Unmap address: %llX, size: 0x%llX",
+			dma_unmap.iova, dma_unmap.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_UNMAP_DMA API(errno = %d)",
+				errno);
 		return -1;
 	}
+
+	return 0;
+}
+
+static int
+fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
+		 const struct rte_memseg *ms, void *arg)
+{
+	int *n_segs = arg;
+	int ret;
+
+	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
+	if (ret)
+		DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)",
+				ms->addr, ms->len);
+	else
+		(*n_segs)++;
+
+	return ret;
+}
+
+int rte_fslmc_vfio_dmamap(void)
+{
+	int i = 0, ret;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
+
+	/* Lock before parsing and registering callback to memory subsystem */
+	rte_rwlock_read_lock(mem_lock);
+
+	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
+		rte_rwlock_read_unlock(mem_lock);
+		return -1;
+	}
+
+	ret = rte_mem_event_callback_register("fslmc_memevent_clb",
+					      fslmc_memevent_cb);
+	if (ret && rte_errno == ENOTSUP)
+		DPAA2_BUS_DEBUG("Memory event callbacks not supported");
+	else if (ret)
+		DPAA2_BUS_DEBUG("Unable to install memory handler");
+	else
+		DPAA2_BUS_DEBUG("Installed memory callback handler");
+
 	DPAA2_BUS_DEBUG("Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
@@ -250,6 +362,11 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
+	/* Existing segments have been mapped and memory callback for hotplug
+	 * has been installed.
+	 */
+	rte_rwlock_read_unlock(mem_lock);
+
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 67/70] eal: enable non-legacy memory mode
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (66 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 68/70] eal: add memory validator callback Anatoly Burakov
                         ` (2 subsequent siblings)
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Now that every other piece of the puzzle is in place, enable non-legacy
init mode.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e7c6dcf..99c2242 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -772,8 +772,6 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
-	/* for now, always set legacy mem */
-	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 68/70] eal: add memory validator callback
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (67 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 67/70] eal: enable non-legacy memory mode Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 69/70] eal: enable validation before new page allocation Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This API will enable an application to register for notifications
on page allocations that are about to happen, giving the application
a chance to allow or deny the allocation when the resulting total
memory utilization would exceed the specified limit on the specified
socket.
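
For illustration only, a minimal sketch of capping allocations on one
socket with this API; the names and the 1 GB limit are hypothetical,
while the register call and callback signature are the ones added by
this patch:

    #include <rte_log.h>
    #include <rte_memory.h>

    #define APP_SOCKET0_LIMIT (1ULL << 30) /* hypothetical 1 GB cap */

    /* only called when an allocation would bring usage above the limit;
     * return -1 to deny it, 0 to let it proceed anyway
     */
    static int
    app_alloc_validator(int socket_id, size_t cur_limit, size_t new_len)
    {
        RTE_LOG(NOTICE, USER1,
                "socket %d would grow to %zu bytes (limit %zu), denying\n",
                socket_id, new_len, cur_limit);
        return -1;
    }

    /* call once after rte_eal_init(); fails with ENOTSUP in legacy mode */
    static int
    app_install_validator(void)
    {
        return rte_mem_alloc_validator_register("app_limit",
                app_alloc_validator, 0, APP_SOCKET0_LIMIT);
    }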

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---

Notes:
    v4:
    - Document limitation on using some functions
    
    v3:
    - Added this feature

 lib/librte_eal/common/eal_common_memalloc.c | 138 +++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_common_memory.c   |  26 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 ++
 lib/librte_eal/common/include/rte_memory.h  |  63 +++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 2d2d46f..49fd53c 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -22,14 +22,26 @@ struct mem_event_callback_entry {
 	rte_mem_event_callback_t clb;
 };
 
+struct mem_alloc_validator_entry {
+	TAILQ_ENTRY(mem_alloc_validator_entry) next;
+	char name[RTE_MEM_ALLOC_VALIDATOR_NAME_LEN];
+	rte_mem_alloc_validator_t clb;
+	int socket_id;
+	size_t limit;
+};
+
 /** Double linked list of actions. */
 TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+TAILQ_HEAD(mem_alloc_validator_entry_list, mem_alloc_validator_entry);
 
 static struct mem_event_callback_entry_list mem_event_callback_list =
 	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
-
 static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
 
+static struct mem_alloc_validator_entry_list mem_alloc_validator_list =
+	TAILQ_HEAD_INITIALIZER(mem_alloc_validator_list);
+static rte_rwlock_t mem_alloc_validator_rwlock = RTE_RWLOCK_INITIALIZER;
+
 static struct mem_event_callback_entry *
 find_mem_event_callback(const char *name)
 {
@@ -42,6 +54,18 @@ find_mem_event_callback(const char *name)
 	return r;
 }
 
+static struct mem_alloc_validator_entry *
+find_mem_alloc_validator(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *r;
+
+	TAILQ_FOREACH(r, &mem_alloc_validator_list, next) {
+		if (!strcmp(r->name, name) && r->socket_id == socket_id)
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -221,3 +245,115 @@ eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 
 	rte_rwlock_read_unlock(&mem_event_rwlock);
 }
+
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	entry->socket_id = socket_id;
+	entry->limit = limit;
+	snprintf(entry->name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_alloc_validator_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i with limit %zu registered\n",
+		name, socket_id, limit);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+
+	if (name == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_alloc_validator_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i unregistered\n",
+		name, socket_id);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret = 0;
+
+	rte_rwlock_read_lock(&mem_alloc_validator_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_alloc_validator_list, next) {
+		if (entry->socket_id != socket_id || entry->limit > new_len)
+			continue;
+		RTE_LOG(DEBUG, EAL, "Calling mem alloc validator '%s' on socket %i\n",
+			entry->name, entry->socket_id);
+		if (entry->clb(socket_id, entry->limit, new_len) < 0)
+			ret = -1;
+	}
+
+	rte_rwlock_read_unlock(&mem_alloc_validator_rwlock);
+
+	return ret;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 1f15ff7..24a9ed5 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -652,6 +652,32 @@ rte_mem_event_callback_unregister(const char *name)
 	return eal_memalloc_mem_event_callback_unregister(name);
 }
 
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_register(name, clb, socket_id,
+			limit);
+}
+
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_unregister(name, socket_id);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 4d27403..6bec52c 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -67,4 +67,14 @@ void
 eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 		size_t len);
 
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id);
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len);
+
 #endif // EAL_MEMALLOC_H
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 398ca55..b085a8b 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -361,6 +361,69 @@ rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
 int __rte_experimental
 rte_mem_event_callback_unregister(const char *name);
 
+
+#define RTE_MEM_ALLOC_VALIDATOR_NAME_LEN 64
+/**< maximum length of alloc validator name */
+/**
+ * Function typedef used to register memory allocation validation callbacks.
+ *
+ * Returning 0 will allow allocation attempt to continue. Returning -1 will
+ * prevent allocation from succeeding.
+ */
+typedef int (*rte_mem_alloc_validator_t)(int socket_id,
+		size_t cur_limit, size_t new_len);
+
+/**
+ * @brief Register validator callback for memory allocations.
+ *
+ * Callbacks registered by this function will be called right before memory
+ * allocator is about to trigger allocation of more pages from the system if
+ * said allocation will bring total memory usage above specified limit on
+ * specified socket. User will be able to cancel pending allocation if callback
+ * returns -1.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @param limit
+ *   Limit above which to trigger callbacks.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+/**
+ * @brief Unregister validator callback for memory allocations.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d1ac9ea..2b5b1dc 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_alloc_validator_register;
+	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;
 	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 69/70] eal: enable validation before new page allocation
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (68 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 68/70] eal: add memory validator callback Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  2018-04-09 18:01       ` [PATCH v5 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/malloc_heap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 18c7b69..f8daf84 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -196,6 +196,15 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	int allocd_pages;
 	void *ret, *map_addr;
 
+	alloc_sz = (size_t)pg_sz * n_segs;
+
+	/* first, check if we're allowed to allocate this memory */
+	if (eal_memalloc_mem_alloc_validate(socket,
+			heap->total_size + alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "User has disallowed allocation\n");
+		return NULL;
+	}
+
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
@@ -205,7 +214,6 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
-	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
 	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v5 70/70] eal: prevent preallocated pages from being freed
  2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
                         ` (69 preceding siblings ...)
  2018-04-09 18:01       ` [PATCH v5 69/70] eal: enable validation before new page allocation Anatoly Burakov
@ 2018-04-09 18:01       ` Anatoly Burakov
  70 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-09 18:01 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

It is reasonable to expect a DPDK process not to deallocate any
pages that were preallocated by the "-m" or "--socket-mem" flags -
yet, currently, the DPDK memory subsystem will do exactly that once
it finds that the pages are unused.

Fix this by marking such pages as unfreeable, and preventing malloc
from ever trying to free them.
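
For illustration only, a minimal, hypothetical sketch of inspecting the
new flag from application code (the walk callback is made up; the flag
and the walk API are the ones used in this series):

    #include <rte_common.h>
    #include <rte_memory.h>

    /* hypothetical callback: count pages that malloc may never free */
    static int
    count_unfreeable(const struct rte_memseg_list *msl __rte_unused,
            const struct rte_memseg *ms, void *arg)
    {
        unsigned int *n_unfreeable = arg;

        if (ms->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE)
            (*n_unfreeable)++;
        return 0;
    }

    static unsigned int
    count_preallocated_pages(void)
    {
        unsigned int n = 0;

        rte_memseg_walk(count_unfreeable, &n);
        return n;
    }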

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 lib/librte_eal/common/include/rte_memory.h |  3 +++
 lib/librte_eal/common/malloc_heap.c        | 23 +++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c |  7 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 18 +++++++++++++++---
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b085a8b..a18fe27 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -83,6 +83,8 @@ typedef uint64_t rte_iova_t;
 /**
  * Physical memory segment descriptor.
  */
+#define RTE_MEMSEG_FLAG_DO_NOT_FREE (1 << 0)
+/**< Prevent this segment from being freed back to the OS. */
 struct rte_memseg {
 	RTE_STD_C11
 	union {
@@ -99,6 +101,7 @@ struct rte_memseg {
 	int32_t socket_id;          /**< NUMA socket ID. */
 	uint32_t nchannel;          /**< Number of channels. */
 	uint32_t nrank;             /**< Number of ranks. */
+	uint32_t flags;             /**< Memseg-specific flags */
 } __rte_packed;
 
 /**
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f8daf84..41c14a8 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -606,6 +606,7 @@ malloc_heap_free(struct malloc_elem *elem)
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
+	unsigned int i, n_segs;
 	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
@@ -647,6 +648,28 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	/* we can free something. however, some of these pages may be marked as
+	 * unfreeable, so also check that as well
+	 */
+	n_segs = aligned_len / page_sz;
+	for (i = 0; i < n_segs; i++) {
+		const struct rte_memseg *tmp =
+				rte_mem_virt2memseg(aligned_start, msl);
+
+		if (tmp->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			/* this is an unfreeable segment, so move start */
+			aligned_start = RTE_PTR_ADD(tmp->addr, tmp->len);
+		}
+	}
+
+	/* recalculate length and number of segments */
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_segs = aligned_len / page_sz;
+
+	/* check if we can still free some pages */
+	if (n_segs == 0)
+		goto free_unlock;
+
 	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
 
 	/*
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 93f80bb..7bbbf30 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -806,6 +806,13 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		struct free_walk_param wa;
 		int i, walk_res;
 
+		/* if this page is marked as unfreeable, fail */
+		if (cur->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			RTE_LOG(DEBUG, EAL, "Page is not allowed to be freed\n");
+			ret = -1;
+			continue;
+		}
+
 		memset(&wa, 0, sizeof(wa));
 
 		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 7ec7129..2bd9c30 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1637,21 +1637,33 @@ eal_hugepage_init(void)
 			hp_sz_idx++) {
 		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
 				socket_id++) {
+			struct rte_memseg **pages;
 			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
 			unsigned int num_pages = hpi->num_pages[socket_id];
-			int num_pages_alloc;
+			int num_pages_alloc, i;
 
 			if (num_pages == 0)
 				continue;
 
+			pages = malloc(sizeof(*pages) * num_pages);
+
 			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
 				num_pages, hpi->hugepage_sz >> 20, socket_id);
 
-			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(pages,
 					num_pages, hpi->hugepage_sz,
 					socket_id, true);
-			if (num_pages_alloc < 0)
+			if (num_pages_alloc < 0) {
+				free(pages);
 				return -1;
+			}
+
+			/* mark preallocated pages as unfreeable */
+			for (i = 0; i < num_pages_alloc; i++) {
+				struct rte_memseg *ms = pages[i];
+				ms->flags |= RTE_MEMSEG_FLAG_DO_NOT_FREE;
+			}
+			free(pages);
 		}
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v5 00/70] Memory Hotplug for DPDK
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
@ 2018-04-09 18:35         ` gowrishankar muthukrishnan
  2018-04-11 12:29         ` [PATCH v6 " Anatoly Burakov
                           ` (70 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: gowrishankar muthukrishnan @ 2018-04-09 18:35 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain

On Monday 09 April 2018 11:30 PM, Anatoly Burakov wrote:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].
>
> Dependencies (to be applied in specified order):
> - EAL IOVA fix [2]
>
> Deprecation notices relevant to this patchset:
> - General outline of memory hotplug changes [3]
>
> The vast majority of changes are in the EAL and malloc, the external API
> disruption is minimal: a new flag is added to memzone API for contiguous
> memory allocation, a few API additions in rte_memory due to switch
> to memseg_lists as opposed to memsegs, and a few new convenience API's.
> Every other API change is internal to EAL, and all of the memory
> allocation/freeing is handled through rte_malloc, with no externally
> visible API changes.
>
> Quick outline of all changes done as part of this patchset:
>
>   * Malloc heap adjusted to handle holes in address space
>   * Single memseg list replaced by multiple memseg lists
>   * VA space for hugepages is preallocated in advance
>   * Added alloc/free for pages happening as needed on rte_malloc/rte_free
>   * Added contiguous memory allocation API's for rte_memzone
>   * Added convenience API calls to walk over memsegs
>   * Integrated Pawel Wodkowski's patch for registering/unregistering memory
>     with VFIO [4]
>   * Callbacks for registering memory allocations
>   * Callbacks for allowing/disallowing allocations above specified limit
>   * Multiprocess support done via DPDK IPC introduced in 18.02
>
> The biggest difference is a "memseg" now represents a single page (as opposed to
> being a big contiguous block of pages). As a consequence, both memzones and
> malloc elements are no longer guaranteed to be physically contiguous, unless
> the user asks for it at reserve time. To preserve whatever functionality that
> was dependent on previous behavior, a legacy memory option is also provided,
> however it is expected (or perhaps vainly hoped) to be temporary solution.
>
> Why multiple memseg lists instead of one? Since memseg is a single page now,
> the list of memsegs will get quite big, and we need to locate pages somehow
> when we allocate and free them. We could of course just walk the list and
> allocate one contiguous chunk of VA space for memsegs, but this
> implementation uses separate lists instead in order to speed up many
> operations with memseg lists.
>
> For v5, the following limitations are present:
> - VFIO support for multiple processes is not well-tested; work is ongoing
>    to validate VFIO for all use cases
> - There are known problems with PPC64 VFIO code
As below.

> - For DPAA and FSLMC platforms, performance will be heavily degraded for
>    IOVA as PA cases; separate patches are expected to address the issue
>
> For testing, it is recommended to use the GitHub repository [5], as it will
> have all of the dependencies already integrated.
>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>

Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

VFIO-related validation is still being done on powerpc, so I'll post
our arch-specific changes as I test more. This would not block this
patch set from getting merged, as the changes we would observe are
mostly on top of the sPAPR IOMMU (which is specific to powerpc) and
do not affect other architectures.

Thanks,
Gowrishankar
>
> v5:
>      - Fixed missing DMA window creation on PPC64 for VFIO
>      - fslmc VFIO fixes
>      - Added new user DMA map code to keep track of user DMA maps
>        when hotplug is in use (also used on PPC64 on remap)
>      - A few checkpatch and commit message fixes here and there
>
> v4:
>      - Fixed bug in memzone lookup
>      - Added draft fslmc VFIO code
>      - Rebased on latest master + dependent patchset
>      - Documented limitations for *_walk() functions
>
> v3:
>      - Lots of compile fixes
>      - Fixes for multiprocess synchronization
>      - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
>      - Fixes for mempool size calculation
>      - Added convenience memseg walk() API's
>      - Added alloc validation callback
>
> v2: - fixed deadlock at init
>      - reverted rte_panic changes at init, this is now handled inside IPC
>
> [1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
> [2] http://dpdk.org/dev/patchwork/bundle/aburakov/IOVA_mode_fixes/
> [3] http://dpdk.org/dev/patchwork/patch/34002/
> [4] http://dpdk.org/dev/patchwork/patch/24484/
> [5] https://github.com/anatolyburakov/dpdk
>
> Anatoly Burakov (70):
>    eal: move get_virtual_area out of linuxapp eal_memory.c
>    eal: move all locking to heap
>    eal: make malloc heap a doubly-linked list
>    eal: add function to dump malloc heap contents
>    test: add command to dump malloc heap contents
>    eal: make malloc_elem_join_adjacent_free public
>    eal: make malloc free list remove public
>    eal: make malloc free return resulting malloc element
>    eal: replace panics with error messages in malloc
>    eal: add backend support for contiguous allocation
>    eal: enable reserving physically contiguous memzones
>    ethdev: use contiguous allocation for DMA memory
>    crypto/qat: use contiguous allocation for DMA memory
>    net/avf: use contiguous allocation for DMA memory
>    net/bnx2x: use contiguous allocation for DMA memory
>    net/bnxt: use contiguous allocation for DMA memory
>    net/cxgbe: use contiguous allocation for DMA memory
>    net/ena: use contiguous allocation for DMA memory
>    net/enic: use contiguous allocation for DMA memory
>    net/i40e: use contiguous allocation for DMA memory
>    net/qede: use contiguous allocation for DMA memory
>    net/virtio: use contiguous allocation for DMA memory
>    net/vmxnet3: use contiguous allocation for DMA memory
>    mempool: add support for the new allocation methods
>    eal: add function to walk all memsegs
>    bus/fslmc: use memseg walk instead of iteration
>    bus/pci: use memseg walk instead of iteration
>    net/mlx5: use memseg walk instead of iteration
>    eal: use memseg walk instead of iteration
>    mempool: use memseg walk instead of iteration
>    test: use memseg walk instead of iteration
>    vfio/type1: use memseg walk instead of iteration
>    vfio/spapr: use memseg walk instead of iteration
>    eal: add contig walk function
>    virtio: use memseg contig walk instead of iteration
>    eal: add iova2virt function
>    bus/dpaa: use iova2virt instead of memseg iteration
>    bus/fslmc: use iova2virt instead of memseg iteration
>    crypto/dpaa_sec: use iova2virt instead of memseg iteration
>    eal: add virt2memseg function
>    bus/fslmc: use virt2memseg instead of iteration
>    crypto/dpaa_sec: use virt2memseg instead of iteration
>    net/mlx4: use virt2memseg instead of iteration
>    net/mlx5: use virt2memseg instead of iteration
>    eal: use memzone walk instead of iteration
>    vfio: allow to map other memory regions
>    eal: add "legacy memory" option
>    eal: add rte_fbarray
>    eal: replace memseg with memseg lists
>    eal: replace memzone array with fbarray
>    eal: add support for mapping hugepages at runtime
>    eal: add support for unmapping pages at runtime
>    eal: add "single file segments" command-line option
>    eal: add API to check if memory is contiguous
>    eal: prepare memseg lists for multiprocess sync
>    eal: read hugepage counts from node-specific sysfs path
>    eal: make use of memory hotplug for init
>    eal: share hugepage info primary and secondary
>    eal: add secondary process init with memory hotplug
>    eal: enable memory hotplug support in rte_malloc
>    eal: add support for multiprocess memory hotplug
>    eal: add support for callbacks on memory hotplug
>    eal: enable callbacks on malloc/free and mp sync
>    vfio: enable support for mem event callbacks
>    bus/fslmc: move vfio DMA map into bus probe
>    bus/fslmc: enable support for mem event callbacks for vfio
>    eal: enable non-legacy memory mode
>    eal: add memory validator callback
>    eal: enable validation before new page allocation
>    eal: prevent preallocated pages from being freed
>
>   config/common_base                                |   15 +-
>   config/defconfig_i686-native-linuxapp-gcc         |    3 +
>   config/defconfig_i686-native-linuxapp-icc         |    3 +
>   config/defconfig_x86_x32-native-linuxapp-gcc      |    3 +
>   config/rte_config.h                               |    7 +-
>   doc/guides/rel_notes/deprecation.rst              |    9 -
>   drivers/bus/dpaa/rte_dpaa_bus.h                   |   12 +-
>   drivers/bus/fslmc/fslmc_bus.c                     |   11 +
>   drivers/bus/fslmc/fslmc_vfio.c                    |  195 +++-
>   drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   27 +-
>   drivers/bus/pci/Makefile                          |    3 +
>   drivers/bus/pci/linux/pci.c                       |   28 +-
>   drivers/bus/pci/meson.build                       |    3 +
>   drivers/crypto/dpaa_sec/dpaa_sec.c                |   30 +-
>   drivers/crypto/qat/qat_qp.c                       |   23 +-
>   drivers/event/dpaa2/Makefile                      |    3 +
>   drivers/mempool/dpaa/Makefile                     |    3 +
>   drivers/mempool/dpaa/meson.build                  |    3 +
>   drivers/mempool/dpaa2/Makefile                    |    3 +
>   drivers/mempool/dpaa2/meson.build                 |    3 +
>   drivers/net/avf/avf_ethdev.c                      |    4 +-
>   drivers/net/bnx2x/bnx2x.c                         |    2 +-
>   drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
>   drivers/net/bnxt/bnxt_ethdev.c                    |   17 +-
>   drivers/net/bnxt/bnxt_ring.c                      |    9 +-
>   drivers/net/bnxt/bnxt_vnic.c                      |    8 +-
>   drivers/net/cxgbe/sge.c                           |    3 +-
>   drivers/net/dpaa/Makefile                         |    3 +
>   drivers/net/dpaa2/Makefile                        |    3 +
>   drivers/net/dpaa2/dpaa2_ethdev.c                  |    1 -
>   drivers/net/dpaa2/meson.build                     |    3 +
>   drivers/net/ena/Makefile                          |    3 +
>   drivers/net/ena/base/ena_plat_dpdk.h              |    9 +-
>   drivers/net/ena/ena_ethdev.c                      |   10 +-
>   drivers/net/enic/enic_main.c                      |    9 +-
>   drivers/net/i40e/i40e_ethdev.c                    |    4 +-
>   drivers/net/i40e/i40e_rxtx.c                      |    4 +-
>   drivers/net/mlx4/mlx4_mr.c                        |   18 +-
>   drivers/net/mlx5/Makefile                         |    3 +
>   drivers/net/mlx5/mlx5.c                           |   25 +-
>   drivers/net/mlx5/mlx5_mr.c                        |   19 +-
>   drivers/net/qede/base/bcm_osal.c                  |    7 +-
>   drivers/net/virtio/virtio_ethdev.c                |    8 +-
>   drivers/net/virtio/virtio_user/vhost_kernel.c     |   83 +-
>   drivers/net/vmxnet3/vmxnet3_ethdev.c              |    5 +-
>   lib/librte_eal/bsdapp/eal/Makefile                |    4 +
>   lib/librte_eal/bsdapp/eal/eal.c                   |   83 +-
>   lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |   65 +-
>   lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   48 +
>   lib/librte_eal/bsdapp/eal/eal_memory.c            |  224 +++-
>   lib/librte_eal/bsdapp/eal/meson.build             |    1 +
>   lib/librte_eal/common/Makefile                    |    2 +-
>   lib/librte_eal/common/eal_common_fbarray.c        |  859 ++++++++++++++++
>   lib/librte_eal/common/eal_common_memalloc.c       |  359 +++++++
>   lib/librte_eal/common/eal_common_memory.c         |  824 ++++++++++++++-
>   lib/librte_eal/common/eal_common_memzone.c        |  235 +++--
>   lib/librte_eal/common/eal_common_options.c        |   13 +-
>   lib/librte_eal/common/eal_filesystem.h            |   30 +
>   lib/librte_eal/common/eal_hugepages.h             |   11 +-
>   lib/librte_eal/common/eal_internal_cfg.h          |   12 +-
>   lib/librte_eal/common/eal_memalloc.h              |   80 ++
>   lib/librte_eal/common/eal_options.h               |    4 +
>   lib/librte_eal/common/eal_private.h               |   33 +
>   lib/librte_eal/common/include/rte_eal_memconfig.h |   28 +-
>   lib/librte_eal/common/include/rte_fbarray.h       |  353 +++++++
>   lib/librte_eal/common/include/rte_malloc.h        |   10 +
>   lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
>   lib/librte_eal/common/include/rte_memory.h        |  258 ++++-
>   lib/librte_eal/common/include/rte_memzone.h       |   12 +-
>   lib/librte_eal/common/include/rte_vfio.h          |   41 +
>   lib/librte_eal/common/malloc_elem.c               |  433 ++++++--
>   lib/librte_eal/common/malloc_elem.h               |   43 +-
>   lib/librte_eal/common/malloc_heap.c               |  704 ++++++++++++-
>   lib/librte_eal/common/malloc_heap.h               |   15 +-
>   lib/librte_eal/common/malloc_mp.c                 |  744 ++++++++++++++
>   lib/librte_eal/common/malloc_mp.h                 |   86 ++
>   lib/librte_eal/common/meson.build                 |    4 +
>   lib/librte_eal/common/rte_malloc.c                |   85 +-
>   lib/librte_eal/linuxapp/eal/Makefile              |    5 +
>   lib/librte_eal/linuxapp/eal/eal.c                 |   62 +-
>   lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  218 +++-
>   lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1123 +++++++++++++++++++++
>   lib/librte_eal/linuxapp/eal/eal_memory.c          | 1119 ++++++++++++--------
>   lib/librte_eal/linuxapp/eal/eal_vfio.c            |  870 ++++++++++++++--
>   lib/librte_eal/linuxapp/eal/eal_vfio.h            |   12 +
>   lib/librte_eal/linuxapp/eal/meson.build           |    1 +
>   lib/librte_eal/rte_eal_version.map                |   30 +-
>   lib/librte_ether/rte_ethdev.c                     |    3 +-
>   lib/librte_mempool/Makefile                       |    3 +
>   lib/librte_mempool/meson.build                    |    3 +
>   lib/librte_mempool/rte_mempool.c                  |  149 ++-
>   test/test/commands.c                              |    3 +
>   test/test/test_malloc.c                           |   30 +-
>   test/test/test_memory.c                           |   27 +-
>   test/test/test_memzone.c                          |   62 +-
>   95 files changed, 8794 insertions(+), 1285 deletions(-)
>   create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
>   create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
>   create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
>   create mode 100644 lib/librte_eal/common/eal_memalloc.h
>   create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
>   create mode 100644 lib/librte_eal/common/malloc_mp.c
>   create mode 100644 lib/librte_eal/common/malloc_mp.h
>   create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c
>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v5 15/70] net/bnx2x: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 15/70] net/bnx2x: " Anatoly Burakov
@ 2018-04-11  9:12         ` Thomas Monjalon
  2018-04-11  9:18           ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Thomas Monjalon @ 2018-04-11  9:12 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Compilation error with clang:

09/04/2018 20:00, Anatoly Burakov:
> --- a/drivers/net/bnx2x/bnx2x.c
> +++ b/drivers/net/bnx2x/bnx2x.c
> -	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
> +	z = rte_memzone_reserve_aligned_contig(mz_name, (uint64_t)size,

drivers/net/bnx2x/bnx2x.c:180:6: fatal error:
	implicit declaration of function 'rte_memzone_reserve_aligned_contig'
	is invalid in C99

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v5 15/70] net/bnx2x: use contiguous allocation for DMA memory
  2018-04-11  9:12         ` Thomas Monjalon
@ 2018-04-11  9:18           ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-11  9:18 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

On 11-Apr-18 10:12 AM, Thomas Monjalon wrote:
> Compilation error with clang:
> 
> 09/04/2018 20:00, Anatoly Burakov:
>> --- a/drivers/net/bnx2x/bnx2x.c
>> +++ b/drivers/net/bnx2x/bnx2x.c
>> -	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
>> +	z = rte_memzone_reserve_aligned_contig(mz_name, (uint64_t)size,
> 
> drivers/net/bnx2x/bnx2x.c:180:6: fatal error:
> 	implicit declaration of function 'rte_memzone_reserve_aligned_contig'
> 	is invalid in C99
> 
> 
> 
> 

Apologies, missed the rename. This API call doesn't exist any more.
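
For anyone rebasing on top of v6: the contiguous variant was folded into a
reservation flag, so the call should end up looking roughly like this (flag
name as introduced by the v6 memzone patch; the socket and alignment
arguments below are placeholders, not necessarily the exact bnx2x ones):

	z = rte_memzone_reserve_aligned(mz_name, (uint64_t)size,
			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG, align);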

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v6 00/70] Memory Hotplug for DPDK
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
  2018-04-09 18:35         ` gowrishankar muthukrishnan
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 18:07           ` Thomas Monjalon
  2018-04-11 12:29         ` [PATCH v6 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
                           ` (69 subsequent siblings)
  71 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].

Deprecation notices relevant to this patchset:
- General outline of memory hotplug changes [2]

The vast majority of changes are in the EAL and malloc; the external API
disruption is minimal: a new flag is added to the memzone API for contiguous
memory allocation, there are a few API additions in rte_memory due to the
switch to memseg lists (as opposed to memsegs), and a few new convenience APIs.
Every other API change is internal to EAL, and all of the memory
allocation/freeing is handled through rte_malloc, with no externally
visible API changes.
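
As a quick illustration of the new flag, requesting IOVA-contiguous memory
becomes a reserve-time option rather than a separate function. A minimal
sketch (the flag name is taken from the memzone patch in this series; the
helper itself is illustrative, not part of the patchset):

#include <rte_lcore.h>
#include <rte_memzone.h>

/* illustrative helper, not part of the patchset */
static const struct rte_memzone *
reserve_dma_area(const char *name, size_t len)
{
	/* RTE_MEMZONE_IOVA_CONTIG asks for physically contiguous backing;
	 * without it, the zone is only guaranteed to be VA-contiguous
	 */
	return rte_memzone_reserve(name, len, rte_socket_id(),
			RTE_MEMZONE_IOVA_CONTIG);
}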

Quick outline of all changes done as part of this patchset:

 * Malloc heap adjusted to handle holes in address space
 * Single memseg list replaced by multiple memseg lists
 * VA space for hugepages is preallocated in advance
 * Added alloc/free for pages happening as needed on rte_malloc/rte_free
 * Added contiguous memory allocation APIs for rte_memzone
 * Added convenience API calls to walk over memsegs
 * Integrated Pawel Wodkowski's patch for registering/unregistering memory
   with VFIO [3]
 * Callbacks for registering memory allocations
 * Callbacks for allowing/disallowing allocations above a specified limit
 * Multiprocess support done via DPDK IPC introduced in 18.02

The biggest difference is a "memseg" now represents a single page (as opposed to
being a big contiguous block of pages). As a consequence, both memzones and
malloc elements are no longer guaranteed to be physically contiguous, unless
the user asks for it at reserve time. To preserve whatever functionality was
dependent on the previous behavior, a legacy memory option is also provided;
however, it is expected (or perhaps vainly hoped) to be a temporary solution.

Why multiple memseg lists instead of one? Since memseg is a single page now,
the list of memsegs will get quite big, and we need to locate pages somehow
when we allocate and free them. We could of course just walk the list and
allocate one contiguous chunk of VA space for memsegs, but this
implementation uses separate lists instead in order to speed up many
operations with memseg lists.
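
To make the lookup argument concrete, here is a purely illustrative sketch
(hypothetical names and fields, not the actual rte_memseg_list definition):
each list covers one preallocated VA region with one page size, so finding
the descriptor for a given address is an index computation rather than a
walk over one huge array:

/* hypothetical structure, for illustration only */
struct example_seg_list {
	void *base_va;		/* start of preallocated VA region */
	size_t page_sz;		/* page size used by this list */
	size_t n_pages;		/* capacity of this list */
	struct rte_memseg *segs;	/* one descriptor per page */
};

static struct rte_memseg *
example_lookup(struct example_seg_list *lists, int n_lists, void *addr)
{
	int i;

	for (i = 0; i < n_lists; i++) {
		struct example_seg_list *l = &lists[i];
		uintptr_t off = (uintptr_t)addr - (uintptr_t)l->base_va;

		if (addr < l->base_va || off >= l->n_pages * l->page_sz)
			continue;
		/* descriptor index follows directly from the offset */
		return &l->segs[off / l->page_sz];
	}
	return NULL;
}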

For v6, the following limitations are present:
- VFIO support for multiple processes is not well-tested; work is ongoing
  to validate VFIO for all use cases
- There are known problems with the PPC64 VFIO code, expected to be addressed
  in separate patches
- For DPAA and FSLMC platforms, performance will be heavily degraded in
  IOVA-as-PA cases; separate patches are expected to address the issue

Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Santosh Shukla <Santosh.Shukla@caviumnetworks.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

v6:
    - Compile fix in bnx2x
    - Added PPC64 DMA window creation to appropriate patch
    - C++-style comment fixes
    - Commit message renames to be more specific about affected areas

v5:
    - Fixed missing DMA window creation on PPC64 for VFIO
    - fslmc VFIO fixes
    - Added new user DMA map code to keep track of user DMA maps
      when hotplug is in use (also used on PPC64 on remap)
    - A few checkpatch and commit message fixes here and there

v4:
    - Fixed bug in memzone lookup
    - Added draft fslmc VFIO code
    - Rebased on latest master + dependent patchset
    - Documented limitations for *_walk() functions

v3:
    - Lots of compile fixes
    - Fixes for multiprocess synchronization
    - Introduced support for sPAPR IOMMU, courtesy of Gowrishankar @ IBM
    - Fixes for mempool size calculation
    - Added convenience memseg walk() API's
    - Added alloc validation callback

v2: - fixed deadlock at init
    - reverted rte_panic changes at init, this is now handled inside IPC

[1] http://dpdk.org/dev/patchwork/bundle/aburakov/Memory_RFC/
[2] http://dpdk.org/dev/patchwork/patch/34002/
[3] http://dpdk.org/dev/patchwork/patch/24484/

Anatoly Burakov (70):
  eal: move get_virtual_area out of linuxapp eal_memory.c
  malloc: move all locking to heap
  malloc: make heap a doubly-linked list
  malloc: add function to dump heap contents
  test: add command to dump malloc heap contents
  malloc: make malloc_elem_join_adjacent_free public
  malloc: make elem_free_list_remove public
  malloc: make free return resulting element
  malloc: replace panics with error messages
  malloc: add support for contiguous allocation
  memzone: enable reserving IOVA-contiguous memzones
  ethdev: use contiguous allocation for DMA memory
  crypto/qat: use contiguous allocation for DMA memory
  net/avf: use contiguous allocation for DMA memory
  net/bnx2x: use contiguous allocation for DMA memory
  net/bnxt: use contiguous allocation for DMA memory
  net/cxgbe: use contiguous allocation for DMA memory
  net/ena: use contiguous allocation for DMA memory
  net/enic: use contiguous allocation for DMA memory
  net/i40e: use contiguous allocation for DMA memory
  net/qede: use contiguous allocation for DMA memory
  net/virtio: use contiguous allocation for DMA memory
  net/vmxnet3: use contiguous allocation for DMA memory
  mempool: add support for the new allocation methods
  eal: add function to walk all memsegs
  bus/fslmc: use memseg walk instead of iteration
  bus/pci: use memseg walk instead of iteration
  net/mlx5: use memseg walk instead of iteration
  eal: use memseg walk instead of iteration
  mempool: use memseg walk instead of iteration
  test: use memseg walk instead of iteration
  vfio/type1: use memseg walk instead of iteration
  vfio/spapr: use memseg walk instead of iteration
  eal: add contig walk function
  virtio: use memseg contig walk instead of iteration
  eal: add iova2virt function
  bus/dpaa: use iova2virt instead of memseg iteration
  bus/fslmc: use iova2virt instead of memseg iteration
  crypto/dpaa_sec: use iova2virt instead of memseg iteration
  eal: add virt2memseg function
  bus/fslmc: use virt2memseg instead of iteration
  crypto/dpaa_sec: use virt2memseg instead of iteration
  net/mlx4: use virt2memseg instead of iteration
  net/mlx5: use virt2memseg instead of iteration
  memzone: use walk instead of iteration for dumping
  vfio: allow to map other memory regions
  eal: add legacy memory option
  eal: add shared indexed file-backed array
  eal: replace memseg with memseg lists
  eal: replace memzone array with fbarray
  mem: add support for mapping hugepages at runtime
  mem: add support for unmapping pages at runtime
  eal: add single file segments command-line option
  mem: add internal API to check if memory is contiguous
  mem: prepare memseg lists for multiprocess sync
  eal: read hugepage counts from node-specific sysfs path
  eal: make use of memory hotplug for init
  eal: share hugepage info primary and secondary
  eal: add secondary process init with memory hotplug
  malloc: enable memory hotplug support
  malloc: add support for multiprocess memory hotplug
  malloc: add support for callbacks on memory events
  malloc: enable callbacks on alloc/free and mp sync
  vfio: enable support for mem event callbacks
  bus/fslmc: move vfio DMA map into bus probe
  bus/fslmc: enable support for mem event callbacks for vfio
  eal: enable non-legacy memory mode
  eal: add memory validator callback
  malloc: enable validation before new page allocation
  mem: prevent preallocated pages from being freed

 config/common_base                                |   15 +-
 config/defconfig_i686-native-linuxapp-gcc         |    3 +
 config/defconfig_i686-native-linuxapp-icc         |    3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |    3 +
 config/rte_config.h                               |    7 +-
 doc/guides/rel_notes/deprecation.rst              |    9 -
 drivers/bus/dpaa/rte_dpaa_bus.h                   |   12 +-
 drivers/bus/fslmc/fslmc_bus.c                     |   11 +
 drivers/bus/fslmc/fslmc_vfio.c                    |  195 +++-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   27 +-
 drivers/bus/pci/Makefile                          |    3 +
 drivers/bus/pci/linux/pci.c                       |   28 +-
 drivers/bus/pci/meson.build                       |    3 +
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   30 +-
 drivers/crypto/qat/qat_qp.c                       |   23 +-
 drivers/event/dpaa2/Makefile                      |    3 +
 drivers/mempool/dpaa/Makefile                     |    3 +
 drivers/mempool/dpaa/meson.build                  |    3 +
 drivers/mempool/dpaa2/Makefile                    |    3 +
 drivers/mempool/dpaa2/meson.build                 |    3 +
 drivers/net/avf/avf_ethdev.c                      |    4 +-
 drivers/net/bnx2x/bnx2x.c                         |    4 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                    |    3 +-
 drivers/net/bnxt/bnxt_ethdev.c                    |   17 +-
 drivers/net/bnxt/bnxt_ring.c                      |    9 +-
 drivers/net/bnxt/bnxt_vnic.c                      |    8 +-
 drivers/net/cxgbe/sge.c                           |    3 +-
 drivers/net/dpaa/Makefile                         |    3 +
 drivers/net/dpaa2/Makefile                        |    3 +
 drivers/net/dpaa2/dpaa2_ethdev.c                  |    1 -
 drivers/net/dpaa2/meson.build                     |    3 +
 drivers/net/ena/Makefile                          |    3 +
 drivers/net/ena/base/ena_plat_dpdk.h              |    9 +-
 drivers/net/ena/ena_ethdev.c                      |   10 +-
 drivers/net/enic/enic_main.c                      |    9 +-
 drivers/net/i40e/i40e_ethdev.c                    |    4 +-
 drivers/net/i40e/i40e_rxtx.c                      |    4 +-
 drivers/net/mlx4/mlx4_mr.c                        |   18 +-
 drivers/net/mlx5/Makefile                         |    3 +
 drivers/net/mlx5/mlx5.c                           |   25 +-
 drivers/net/mlx5/mlx5_mr.c                        |   19 +-
 drivers/net/qede/base/bcm_osal.c                  |    7 +-
 drivers/net/virtio/virtio_ethdev.c                |    8 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   83 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c              |    5 +-
 lib/librte_eal/bsdapp/eal/Makefile                |    4 +
 lib/librte_eal/bsdapp/eal/eal.c                   |   83 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |   65 +-
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   48 +
 lib/librte_eal/bsdapp/eal/eal_memory.c            |  224 +++-
 lib/librte_eal/bsdapp/eal/meson.build             |    1 +
 lib/librte_eal/common/Makefile                    |    2 +-
 lib/librte_eal/common/eal_common_fbarray.c        |  859 ++++++++++++++++
 lib/librte_eal/common/eal_common_memalloc.c       |  359 +++++++
 lib/librte_eal/common/eal_common_memory.c         |  824 ++++++++++++++-
 lib/librte_eal/common/eal_common_memzone.c        |  235 +++--
 lib/librte_eal/common/eal_common_options.c        |   15 +-
 lib/librte_eal/common/eal_filesystem.h            |   30 +
 lib/librte_eal/common/eal_hugepages.h             |   11 +-
 lib/librte_eal/common/eal_internal_cfg.h          |   12 +-
 lib/librte_eal/common/eal_memalloc.h              |   79 ++
 lib/librte_eal/common/eal_options.h               |    4 +
 lib/librte_eal/common/eal_private.h               |   33 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   28 +-
 lib/librte_eal/common/include/rte_fbarray.h       |  353 +++++++
 lib/librte_eal/common/include/rte_malloc.h        |   10 +
 lib/librte_eal/common/include/rte_malloc_heap.h   |    6 +
 lib/librte_eal/common/include/rte_memory.h        |  258 ++++-
 lib/librte_eal/common/include/rte_memzone.h       |   12 +-
 lib/librte_eal/common/include/rte_vfio.h          |   41 +
 lib/librte_eal/common/malloc_elem.c               |  433 ++++++--
 lib/librte_eal/common/malloc_elem.h               |   43 +-
 lib/librte_eal/common/malloc_heap.c               |  704 ++++++++++++-
 lib/librte_eal/common/malloc_heap.h               |   15 +-
 lib/librte_eal/common/malloc_mp.c                 |  744 ++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |   86 ++
 lib/librte_eal/common/meson.build                 |    4 +
 lib/librte_eal/common/rte_malloc.c                |   85 +-
 lib/librte_eal/linuxapp/eal/Makefile              |    5 +
 lib/librte_eal/linuxapp/eal/eal.c                 |   62 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  218 +++-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 1123 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 1120 ++++++++++++--------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |  870 ++++++++++++++--
 lib/librte_eal/linuxapp/eal/eal_vfio.h            |   12 +
 lib/librte_eal/linuxapp/eal/meson.build           |    1 +
 lib/librte_eal/rte_eal_version.map                |   30 +-
 lib/librte_ether/rte_ethdev.c                     |    3 +-
 lib/librte_mempool/Makefile                       |    3 +
 lib/librte_mempool/meson.build                    |    3 +
 lib/librte_mempool/rte_mempool.c                  |  149 ++-
 test/test/commands.c                              |    3 +
 test/test/test_malloc.c                           |   30 +-
 test/test/test_memory.c                           |   27 +-
 test/test/test_memzone.c                          |   62 +-
 95 files changed, 8797 insertions(+), 1286 deletions(-)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 471+ messages in thread

* [PATCH v6 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
  2018-04-09 18:35         ` gowrishankar muthukrishnan
  2018-04-11 12:29         ` [PATCH v6 " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 02/70] malloc: move all locking to heap Anatoly Burakov
                           ` (68 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.
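
For reference, a caller elsewhere in EAL would use it along these lines
(sketch only; "len" and "page_sz" stand for whatever sizes the caller
works with):

	size_t size = len;
	void *va;

	/* reserve VA space only: shrink on failure, unmap immediately so
	 * that real memory can be mapped into the hole later
	 */
	va = eal_get_virtual_area(NULL, &size, page_sz,
			EAL_VIRTUAL_AREA_ALLOW_SHRINK |
			EAL_VIRTUAL_AREA_UNMAP, 0);
	if (va == NULL)
		return -1; /* rte_errno holds the mmap error */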

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c | 101 ++++++++++++++++++++++
 lib/librte_eal/common/eal_private.h       |  33 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 137 ++++++------------------------
 3 files changed, 161 insertions(+), 110 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 852f3bb..5b8ced4 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -2,10 +2,12 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <errno.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <string.h>
 #include <unistd.h>
 #include <inttypes.h>
 #include <sys/mman.h>
@@ -14,12 +16,111 @@
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
 /*
+ * Try to mmap *size bytes in /dev/zero. If it is successful, return the
+ * pointer to the mmap'd area and keep *size unmodified. Else, retry
+ * with a smaller zone: decrease *size by hugepage_sz until it reaches
+ * 0. In this case, return NULL. Note: this function returns an address
+ * which is a multiple of hugepage size.
+ */
+
+static uint64_t baseaddr_offset;
+static uint64_t system_page_sz;
+
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags)
+{
+	bool addr_is_hint, allow_shrink, unmap, no_align;
+	uint64_t map_sz;
+	void *mapped_addr, *aligned_addr;
+
+	if (system_page_sz == 0)
+		system_page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_flags |= MAP_PRIVATE | MAP_ANONYMOUS;
+
+	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
+
+	addr_is_hint = (flags & EAL_VIRTUAL_AREA_ADDR_IS_HINT) > 0;
+	allow_shrink = (flags & EAL_VIRTUAL_AREA_ALLOW_SHRINK) > 0;
+	unmap = (flags & EAL_VIRTUAL_AREA_UNMAP) > 0;
+
+	if (requested_addr == NULL && internal_config.base_virtaddr != 0) {
+		requested_addr = (void *) (internal_config.base_virtaddr +
+				(size_t)baseaddr_offset);
+		requested_addr = RTE_PTR_ALIGN(requested_addr, page_sz);
+		addr_is_hint = true;
+	}
+
+	/* if requested address is not aligned by page size, or if requested
+	 * address is NULL, add page size to requested length as we may get an
+	 * address that's aligned by system page size, which can be smaller than
+	 * our requested page size. additionally, we shouldn't try to align if
+	 * system page size is the same as requested page size.
+	 */
+	no_align = (requested_addr != NULL &&
+		((uintptr_t)requested_addr & (page_sz - 1)) == 0) ||
+		page_sz == system_page_sz;
+
+	do {
+		map_sz = no_align ? *size : *size + page_sz;
+
+		mapped_addr = mmap(requested_addr, map_sz, PROT_READ,
+				mmap_flags, -1, 0);
+		if (mapped_addr == MAP_FAILED && allow_shrink)
+			*size -= page_sz;
+	} while (allow_shrink && mapped_addr == MAP_FAILED && *size > 0);
+
+	/* align resulting address - if map failed, we will ignore the value
+	 * anyway, so no need to add additional checks.
+	 */
+	aligned_addr = no_align ? mapped_addr :
+			RTE_PTR_ALIGN(mapped_addr, page_sz);
+
+	if (*size == 0) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area of any size: %s\n",
+			strerror(errno));
+		rte_errno = errno;
+		return NULL;
+	} else if (mapped_addr == MAP_FAILED) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
+			strerror(errno));
+		/* pass errno up the call chain */
+		rte_errno = errno;
+		return NULL;
+	} else if (requested_addr != NULL && !addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(ERR, EAL, "Cannot get a virtual area at requested address: %p (got %p)\n",
+			requested_addr, aligned_addr);
+		munmap(mapped_addr, map_sz);
+		rte_errno = EADDRNOTAVAIL;
+		return NULL;
+	} else if (requested_addr != NULL && addr_is_hint &&
+			aligned_addr != requested_addr) {
+		RTE_LOG(WARNING, EAL, "WARNING! Base virtual address hint (%p != %p) not respected!\n",
+			requested_addr, aligned_addr);
+		RTE_LOG(WARNING, EAL, "   This may cause issues with mapping memory into secondary processes\n");
+	}
+
+	if (unmap)
+		munmap(mapped_addr, map_sz);
+
+	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
+		aligned_addr, *size);
+
+	baseaddr_offset += *size;
+
+	return aligned_addr;
+}
+
+/*
  * Return a pointer to a read-only table of struct rte_physmem_desc
  * elements, containing the layout of all addressable physical
  * memory. The last element of the table contains a NULL address.
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 0b28770..3fed436 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -127,6 +127,39 @@ int rte_eal_alarm_init(void);
 int rte_eal_check_module(const char *module_name);
 
 /**
+ * Get virtual area of specified size from the OS.
+ *
+ * This function is private to the EAL.
+ *
+ * @param requested_addr
+ *   Address where to request address space.
+ * @param size
+ *   Size of requested area.
+ * @param page_sz
+ *   Page size on which to align requested virtual area.
+ * @param flags
+ *   EAL_VIRTUAL_AREA_* flags.
+ * @param mmap_flags
+ *   Extra flags passed directly to mmap().
+ *
+ * @return
+ *   Virtual area address if successful.
+ *   NULL if unsuccessful.
+ */
+
+#define EAL_VIRTUAL_AREA_ADDR_IS_HINT (1 << 0)
+/**< don't fail if cannot get exact requested address. */
+#define EAL_VIRTUAL_AREA_ALLOW_SHRINK (1 << 1)
+/**< try getting smaller sized (decrement by page size) virtual areas if cannot
+ * get area of requested size.
+ */
+#define EAL_VIRTUAL_AREA_UNMAP (1 << 2)
+/**< immediately unmap reserved virtual area. */
+void *
+eal_get_virtual_area(void *requested_addr, size_t *size,
+		size_t page_sz, int flags, int mmap_flags);
+
+/**
  * Get cpu core_id.
  *
  * This function is private to the EAL.
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index b412fc1..24e6b50 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -28,6 +28,7 @@
 #include <numaif.h>
 #endif
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_launch.h>
@@ -57,8 +58,6 @@
  * zone as well as a physical contiguous zone.
  */
 
-static uint64_t baseaddr_offset;
-
 static bool phys_addrs_available = true;
 
 #define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -221,82 +220,6 @@ aslr_enabled(void)
 	}
 }
 
-/*
- * Try to mmap *size bytes in /dev/zero. If it is successful, return the
- * pointer to the mmap'd area and keep *size unmodified. Else, retry
- * with a smaller zone: decrease *size by hugepage_sz until it reaches
- * 0. In this case, return NULL. Note: this function returns an address
- * which is a multiple of hugepage size.
- */
-static void *
-get_virtual_area(size_t *size, size_t hugepage_sz)
-{
-	void *addr;
-	void *addr_hint;
-	int fd;
-	long aligned_addr;
-
-	if (internal_config.base_virtaddr != 0) {
-		int page_size = sysconf(_SC_PAGE_SIZE);
-		addr_hint = (void *) (uintptr_t)
-			(internal_config.base_virtaddr + baseaddr_offset);
-		addr_hint = RTE_PTR_ALIGN_FLOOR(addr_hint, page_size);
-	} else {
-		addr_hint = NULL;
-	}
-
-	RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
-
-
-	fd = open("/dev/zero", O_RDONLY);
-	if (fd < 0){
-		RTE_LOG(ERR, EAL, "Cannot open /dev/zero\n");
-		return NULL;
-	}
-	do {
-		addr = mmap(addr_hint, (*size) + hugepage_sz, PROT_READ,
-#ifdef RTE_ARCH_PPC_64
-				MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				MAP_PRIVATE,
-#endif
-				fd, 0);
-		if (addr == MAP_FAILED) {
-			*size -= hugepage_sz;
-		} else if (addr_hint != NULL && addr != addr_hint) {
-			RTE_LOG(WARNING, EAL, "WARNING! Base virtual address "
-				"hint (%p != %p) not respected!\n",
-				addr_hint, addr);
-			RTE_LOG(WARNING, EAL, "   This may cause issues with "
-				"mapping memory into secondary processes\n");
-		}
-	} while (addr == MAP_FAILED && *size > 0);
-
-	if (addr == MAP_FAILED) {
-		close(fd);
-		RTE_LOG(ERR, EAL, "Cannot get a virtual area: %s\n",
-			strerror(errno));
-		return NULL;
-	}
-
-	munmap(addr, (*size) + hugepage_sz);
-	close(fd);
-
-	/* align addr to a huge page size boundary */
-	aligned_addr = (long)addr;
-	aligned_addr += (hugepage_sz - 1);
-	aligned_addr &= (~(hugepage_sz - 1));
-	addr = (void *)(aligned_addr);
-
-	RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
-		addr, *size);
-
-	/* increment offset */
-	baseaddr_offset += *size;
-
-	return addr;
-}
-
 static sigjmp_buf huge_jmpenv;
 
 static void huge_sigbus_handler(int signo __rte_unused)
@@ -445,7 +368,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			/* get the biggest virtual memory area up to
 			 * vma_len. If it fails, vma_addr is NULL, so
 			 * let the kernel provide the address. */
-			vma_addr = get_virtual_area(&vma_len, hpi->hugepage_sz);
+			vma_addr = eal_get_virtual_area(NULL, &vma_len,
+					hpi->hugepage_sz,
+					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
+					EAL_VIRTUAL_AREA_UNMAP,
+#ifdef RTE_ARCH_PPC_64
+					MAP_HUGETLB
+#else
+					0
+#endif
+					);
 			if (vma_addr == NULL)
 				vma_len = hugepage_sz;
 		}
@@ -1343,7 +1275,7 @@ rte_eal_hugepage_attach(void)
 	unsigned i, s = 0; /* s used to track the segment number */
 	unsigned max_seg = RTE_MAX_MEMSEG;
 	off_t size = 0;
-	int fd, fd_zero = -1, fd_hugepage = -1;
+	int fd, fd_hugepage = -1;
 
 	if (aslr_enabled() > 0) {
 		RTE_LOG(WARNING, EAL, "WARNING: Address Space Layout Randomization "
@@ -1354,11 +1286,6 @@ rte_eal_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_zero = open("/dev/zero", O_RDONLY);
-	if (fd_zero < 0) {
-		RTE_LOG(ERR, EAL, "Could not open /dev/zero\n");
-		goto error;
-	}
 	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
 		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
@@ -1368,6 +1295,8 @@ rte_eal_hugepage_attach(void)
 	/* map all segments into memory to make sure we get the addrs */
 	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
 		void *base_addr;
+		size_t mmap_sz;
+		int mmap_flags = 0;
 
 		/*
 		 * the first memory segment with len==0 is the one that
@@ -1376,35 +1305,26 @@ rte_eal_hugepage_attach(void)
 		if (mcfg->memseg[s].len == 0)
 			break;
 
-		/*
-		 * fdzero is mmapped to get a contiguous block of virtual
-		 * addresses of the appropriate memseg size.
-		 * use mmap to get identical addresses as the primary process.
+		/* get identical addresses as the primary process.
 		 */
-		base_addr = mmap(mcfg->memseg[s].addr, mcfg->memseg[s].len,
-				 PROT_READ,
 #ifdef RTE_ARCH_PPC_64
-				 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
-#else
-				 MAP_PRIVATE,
+		mmap_flags |= MAP_HUGETLB;
 #endif
-				 fd_zero, 0);
-		if (base_addr == MAP_FAILED ||
-		    base_addr != mcfg->memseg[s].addr) {
+		mmap_sz = mcfg->memseg[s].len;
+		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
+				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
+				mmap_flags);
+		if (base_addr == NULL) {
 			max_seg = s;
-			if (base_addr != MAP_FAILED) {
-				/* errno is stale, don't use */
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p], got [%p] - "
-					"please use '--base-virtaddr' option\n",
+			if (rte_errno == EADDRNOTAVAIL) {
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, base_addr);
-				munmap(base_addr, mcfg->memseg[s].len);
+					mcfg->memseg[s].addr);
 			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes "
-					"in /dev/zero at [%p]: '%s'\n",
+				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
 					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr, strerror(errno));
+					mcfg->memseg[s].addr,
+					rte_strerror(rte_errno));
 			}
 			if (aslr_enabled() > 0) {
 				RTE_LOG(ERR, EAL, "It is recommended to "
@@ -1469,7 +1389,6 @@ rte_eal_hugepage_attach(void)
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
-	close(fd_zero);
 	close(fd_hugepage);
 	return 0;
 
@@ -1478,8 +1397,6 @@ rte_eal_hugepage_attach(void)
 		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
-	if (fd_zero >= 0)
-		close(fd_zero);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 02/70] malloc: move all locking to heap
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (2 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 03/70] malloc: make heap a doubly-linked list Anatoly Burakov
                           ` (67 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to do everything from the heap, as any
alloc or free may trigger allocating or freeing OS memory, which would
involve growing or shrinking the heap.
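
The resulting pattern is a thin heap-level wrapper around each element
operation, e.g. (condensed from the diff below; the cookie/state checks
are omitted here):

int
malloc_heap_free(struct malloc_elem *elem)
{
	struct malloc_heap *heap = elem->heap;
	int ret;

	rte_spinlock_lock(&heap->lock);
	/* any future growing/shrinking of the heap happens here,
	 * under the same lock
	 */
	ret = malloc_elem_free(elem);
	rte_spinlock_unlock(&heap->lock);

	return ret;
}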

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_elem.c | 16 ++--------------
 lib/librte_eal/common/malloc_heap.c | 38 +++++++++++++++++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h |  6 ++++++
 lib/librte_eal/common/rte_malloc.c  |  4 ++--
 4 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 0cadc8a..ea041e2 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -243,10 +243,6 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
-		return -1;
-
-	rte_spinlock_lock(&(elem->heap->lock));
 	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
 	uint8_t *ptr = (uint8_t *)&elem[1];
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
@@ -274,8 +270,6 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, sz);
 
-	rte_spinlock_unlock(&(elem->heap->lock));
-
 	return 0;
 }
 
@@ -292,11 +286,10 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		return 0;
 
 	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	rte_spinlock_lock(&elem->heap->lock);
 	if (next ->state != ELEM_FREE)
-		goto err_return;
+		return -1;
 	if (elem->size + next->size < new_size)
-		goto err_return;
+		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
@@ -311,10 +304,5 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 		split_elem(elem, split_pt);
 		malloc_elem_free_list_insert(split_pt);
 	}
-	rte_spinlock_unlock(&elem->heap->lock);
 	return 0;
-
-err_return:
-	rte_spinlock_unlock(&elem->heap->lock);
-	return -1;
 }
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7aafc88..7d8d70a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,6 +145,44 @@ malloc_heap_alloc(struct malloc_heap *heap,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+int
+malloc_heap_free(struct malloc_elem *elem)
+{
+	struct malloc_heap *heap;
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	/* elem may be merged with previous element, so keep heap address */
+	heap = elem->heap;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	ret = malloc_elem_free(elem);
+
+	rte_spinlock_unlock(&(heap->lock));
+
+	return ret;
+}
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size)
+{
+	int ret;
+
+	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
+		return -1;
+
+	rte_spinlock_lock(&(elem->heap->lock));
+
+	ret = malloc_elem_resize(elem, size);
+
+	rte_spinlock_unlock(&(elem->heap->lock));
+
+	return ret;
+}
+
 /*
  * Function to retrieve data for heap on given socket
  */
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index e0defa7..ab0005c 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -28,6 +28,12 @@ malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
 		unsigned flags, size_t align, size_t bound);
 
 int
+malloc_heap_free(struct malloc_elem *elem);
+
+int
+malloc_heap_resize(struct malloc_elem *elem, size_t size);
+
+int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index e0e0d0b..970813e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -29,7 +29,7 @@
 void rte_free(void *addr)
 {
 	if (addr == NULL) return;
-	if (malloc_elem_free(malloc_elem_from_data(addr)) < 0)
+	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
 		rte_panic("Fatal error: Invalid memory\n");
 }
 
@@ -140,7 +140,7 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
 	if (RTE_PTR_ALIGN(ptr,align) == ptr &&
-			malloc_elem_resize(elem, size) == 0)
+			malloc_heap_resize(elem, size) == 0)
 		return ptr;
 
 	/* either alignment is off, or we have no room to expand,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 03/70] malloc: make heap a doubly-linked list
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (3 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 02/70] malloc: move all locking to heap Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 04/70] malloc: add function to dump heap contents Anatoly Burakov
                           ` (66 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to a
doubly linked list and preparing the infrastructure to support it.

Since the heap is now aware of where its first and last elements are,
there is no longer any need to have a dummy element at the end of
each heap, so get rid of that as well. Instead, let the insert/remove/
join/split operations handle end-of-list conditions automatically.
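
With first/last tracked in the heap, an address-ordered walk over all
elements is now trivial; an illustrative helper (not part of this patch)
would look like:

static unsigned int
count_free_elems(struct malloc_heap *heap)
{
	struct malloc_elem *elem;
	unsigned int n = 0;

	/* elements are linked in address order; NULL marks the end,
	 * no dummy terminator element is needed any more
	 */
	for (elem = heap->first; elem != NULL; elem = elem->next)
		if (elem->state == ELEM_FREE)
			n++;

	return n;
}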

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/include/rte_malloc_heap.h |   6 +
 lib/librte_eal/common/malloc_elem.c             | 200 +++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h             |  14 +-
 lib/librte_eal/common/malloc_heap.c             |   8 +-
 4 files changed, 179 insertions(+), 49 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc_heap.h b/lib/librte_eal/common/include/rte_malloc_heap.h
index ba99ed9..d43fa90 100644
--- a/lib/librte_eal/common/include/rte_malloc_heap.h
+++ b/lib/librte_eal/common/include/rte_malloc_heap.h
@@ -13,12 +13,18 @@
 /* Number of free lists per heap, grouped by size. */
 #define RTE_HEAP_NUM_FREELISTS  13
 
+/* dummy definition, for pointers */
+struct malloc_elem;
+
 /**
  * Structure to hold malloc heap
  */
 struct malloc_heap {
 	rte_spinlock_t lock;
 	LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
+	struct malloc_elem *volatile first;
+	struct malloc_elem *volatile last;
+
 	unsigned alloc_count;
 	size_t total_size;
 } __rte_cache_aligned;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index ea041e2..eb41200 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -31,6 +31,7 @@ malloc_elem_init(struct malloc_elem *elem,
 	elem->heap = heap;
 	elem->ms = ms;
 	elem->prev = NULL;
+	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
 	elem->size = size;
@@ -39,15 +40,56 @@ malloc_elem_init(struct malloc_elem *elem,
 	set_trailer(elem);
 }
 
-/*
- * Initialize a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem, struct malloc_elem *prev)
+malloc_elem_insert(struct malloc_elem *elem)
 {
-	malloc_elem_init(elem, prev->heap, prev->ms, 0);
-	elem->prev = prev;
-	elem->state = ELEM_BUSY; /* mark busy so its never merged */
+	struct malloc_elem *prev_elem, *next_elem;
+	struct malloc_heap *heap = elem->heap;
+
+	if (heap->first == NULL && heap->last == NULL) {
+		/* if empty heap */
+		heap->first = elem;
+		heap->last = elem;
+		prev_elem = NULL;
+		next_elem = NULL;
+	} else if (elem < heap->first) {
+		/* if lower than start */
+		prev_elem = NULL;
+		next_elem = heap->first;
+		heap->first = elem;
+	} else if (elem > heap->last) {
+		/* if higher than end */
+		prev_elem = heap->last;
+		next_elem = NULL;
+		heap->last = elem;
+	} else {
+		/* the new memory is somewhere inbetween start and end */
+		uint64_t dist_from_start, dist_from_end;
+
+		dist_from_end = RTE_PTR_DIFF(heap->last, elem);
+		dist_from_start = RTE_PTR_DIFF(elem, heap->first);
+
+		/* check which is closer, and find closest list entries */
+		if (dist_from_start < dist_from_end) {
+			prev_elem = heap->first;
+			while (prev_elem->next < elem)
+				prev_elem = prev_elem->next;
+			next_elem = prev_elem->next;
+		} else {
+			next_elem = heap->last;
+			while (next_elem->prev > elem)
+				next_elem = next_elem->prev;
+			prev_elem = next_elem->prev;
+		}
+	}
+
+	/* insert new element */
+	elem->prev = prev_elem;
+	elem->next = next_elem;
+	if (prev_elem)
+		prev_elem->next = elem;
+	if (next_elem)
+		next_elem->prev = elem;
 }
 
 /*
@@ -98,18 +140,58 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
 static void
 split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 {
-	struct malloc_elem *next_elem = RTE_PTR_ADD(elem, elem->size);
+	struct malloc_elem *next_elem = elem->next;
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
 	split_pt->prev = elem;
-	next_elem->prev = split_pt;
+	split_pt->next = next_elem;
+	if (next_elem)
+		next_elem->prev = split_pt;
+	else
+		elem->heap->last = split_pt;
+	elem->next = split_pt;
 	elem->size = old_elem_size;
 	set_trailer(elem);
 }
 
 /*
+ * our malloc heap is a doubly linked list, so doubly remove our element.
+ */
+static void __rte_unused
+remove_elem(struct malloc_elem *elem)
+{
+	struct malloc_elem *next, *prev;
+	next = elem->next;
+	prev = elem->prev;
+
+	if (next)
+		next->prev = prev;
+	else
+		elem->heap->last = prev;
+	if (prev)
+		prev->next = next;
+	else
+		elem->heap->first = next;
+
+	elem->prev = NULL;
+	elem->next = NULL;
+}
+
+static int
+next_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem->next == RTE_PTR_ADD(elem, elem->size);
+}
+
+static int
+prev_elem_is_adjacent(struct malloc_elem *elem)
+{
+	return elem == RTE_PTR_ADD(elem->prev, elem->prev->size);
+}
+
+/*
  * Given an element size, compute its freelist index.
  * We free an element into the freelist containing similarly-sized elements.
  * We try to allocate elements starting with the freelist containing
@@ -192,6 +274,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 
 		split_elem(elem, new_free_elem);
 		malloc_elem_free_list_insert(new_free_elem);
+
+		if (elem == elem->heap->last)
+			elem->heap->last = new_free_elem;
 	}
 
 	if (old_elem_size < MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
@@ -230,9 +315,62 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 static inline void
 join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 {
-	struct malloc_elem *next = RTE_PTR_ADD(elem2, elem2->size);
+	struct malloc_elem *next = elem2->next;
 	elem1->size += elem2->size;
-	next->prev = elem1;
+	if (next)
+		next->prev = elem1;
+	else
+		elem1->heap->last = elem1;
+	elem1->next = next;
+}
+
+static struct malloc_elem *
+elem_join_adjacent_free(struct malloc_elem *elem)
+{
+	/*
+	 * check if next element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->next != NULL && elem->next->state == ELEM_FREE &&
+			next_elem_is_adjacent(elem)) {
+		void *erase;
+
+		/* we will want to erase the trailer and header */
+		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->next);
+		join_elem(elem, elem->next);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+	}
+
+	/*
+	 * check if prev element exists, is adjacent and is free, if so join
+	 * with it, need to remove from free list.
+	 */
+	if (elem->prev != NULL && elem->prev->state == ELEM_FREE &&
+			prev_elem_is_adjacent(elem)) {
+		struct malloc_elem *new_elem;
+		void *erase;
+
+		/* we will want to erase trailer and header */
+		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
+
+		/* remove from free list, join to this one */
+		elem_free_list_remove(elem->prev);
+
+		new_elem = elem->prev;
+		join_elem(new_elem, elem);
+
+		/* erase header and trailer */
+		memset(erase, 0, MALLOC_ELEM_OVERHEAD);
+
+		elem = new_elem;
+	}
+
+	return elem;
 }
 
 /*
@@ -243,32 +381,20 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 int
 malloc_elem_free(struct malloc_elem *elem)
 {
-	size_t sz = elem->size - sizeof(*elem) - MALLOC_ELEM_TRAILER_LEN;
-	uint8_t *ptr = (uint8_t *)&elem[1];
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next->state == ELEM_FREE){
-		/* remove from free list, join to this one */
-		elem_free_list_remove(next);
-		join_elem(elem, next);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-	}
+	void *ptr;
+	size_t data_len;
+
+	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+
+	elem = elem_join_adjacent_free(elem);
 
-	/* check if previous element is free, if so join with it and return,
-	 * need to re-insert in free list, as that element's size is changing
-	 */
-	if (elem->prev != NULL && elem->prev->state == ELEM_FREE) {
-		elem_free_list_remove(elem->prev);
-		join_elem(elem->prev, elem);
-		sz += (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		ptr -= (sizeof(*elem) + MALLOC_ELEM_TRAILER_LEN);
-		elem = elem->prev;
-	}
 	malloc_elem_free_list_insert(elem);
 
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	memset(ptr, 0, sz);
+	memset(ptr, 0, data_len);
 
 	return 0;
 }
@@ -281,21 +407,23 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size)
 {
 	const size_t new_size = size + elem->pad + MALLOC_ELEM_OVERHEAD;
+
 	/* if we request a smaller size, then always return ok */
 	if (elem->size >= new_size)
 		return 0;
 
-	struct malloc_elem *next = RTE_PTR_ADD(elem, elem->size);
-	if (next ->state != ELEM_FREE)
+	/* check if there is a next element, it's free and adjacent */
+	if (!elem->next || elem->next->state != ELEM_FREE ||
+			!next_elem_is_adjacent(elem))
 		return -1;
-	if (elem->size + next->size < new_size)
+	if (elem->size + elem->next->size < new_size)
 		return -1;
 
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(next);
-	join_elem(elem, next);
+	elem_free_list_remove(elem->next);
+	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
 		/* now we have a big block together. Lets cut it down a bit, by splitting */
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index f4c1c7a..238e451 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -18,8 +18,12 @@ enum elem_state {
 
 struct malloc_elem {
 	struct malloc_heap *heap;
-	struct malloc_elem *volatile prev;      /* points to prev elem in memseg */
-	LIST_ENTRY(malloc_elem) free_list;      /* list of free elements in heap */
+	struct malloc_elem *volatile prev;
+	/**< points to prev elem in memseg */
+	struct malloc_elem *volatile next;
+	/**< points to next elem in memseg */
+	LIST_ENTRY(malloc_elem) free_list;
+	/**< list of free elements in heap */
 	const struct rte_memseg *ms;
 	volatile enum elem_state state;
 	uint32_t pad;
@@ -110,12 +114,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		const struct rte_memseg *ms,
 		size_t size);
 
-/*
- * initialise a dummy malloc_elem header for the end-of-memseg marker
- */
 void
-malloc_elem_mkend(struct malloc_elem *elem,
-		struct malloc_elem *prev_free);
+malloc_elem_insert(struct malloc_elem *elem);
 
 /*
  * return true if the current malloc_elem can hold a block of data
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 7d8d70a..9c95166 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -70,15 +70,11 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 static void
 malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
 {
-	/* allocate the memory block headers, one at end, one at start */
 	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	struct malloc_elem *end_elem = RTE_PTR_ADD(ms->addr,
-			ms->len - MALLOC_ELEM_OVERHEAD);
-	end_elem = RTE_PTR_ALIGN_FLOOR(end_elem, RTE_CACHE_LINE_SIZE);
-	const size_t elem_size = (uintptr_t)end_elem - (uintptr_t)start_elem;
+	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
 
 	malloc_elem_init(start_elem, heap, ms, elem_size);
-	malloc_elem_mkend(end_elem, start_elem);
+	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 04/70] malloc: add function to dump heap contents
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (4 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 03/70] malloc: make heap a doubly-linked list Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 05/70] test: add command to dump malloc " Anatoly Burakov
                           ` (65 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The malloc heap is now a doubly linked list, so it is possible to
iterate over each malloc element regardless of its state.
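
Typical usage is simply to hand the new call an output stream (usage
sketch, not part of this patch):

#include <stdio.h>
#include <rte_malloc.h>

/* dump every element (free, busy or pad) of every socket's heap */
rte_malloc_dump_heaps(stdout);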

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/include/rte_malloc.h | 10 ++++++++++
 lib/librte_eal/common/malloc_elem.c        | 24 ++++++++++++++++++++++++
 lib/librte_eal/common/malloc_elem.h        |  6 ++++++
 lib/librte_eal/common/malloc_heap.c        | 22 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_heap.h        |  3 +++
 lib/librte_eal/common/rte_malloc.c         | 17 +++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 7 files changed, 83 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index f02a8ba..a9fb7e4 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -13,6 +13,7 @@
 
 #include <stdio.h>
 #include <stddef.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 
 #ifdef __cplusplus
@@ -278,6 +279,15 @@ void
 rte_malloc_dump_stats(FILE *f, const char *type);
 
 /**
+ * Dump contents of all malloc heaps to a file.
+ *
+ * @param f
+ *   A pointer to a file for output
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f);
+
+/**
  * Set the maximum amount of allocated memory for this type.
  *
  * This is not yet implemented
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index eb41200..e02ed88 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2014 Intel Corporation
  */
+#include <inttypes.h>
 #include <stdint.h>
 #include <stddef.h>
 #include <stdio.h>
@@ -434,3 +435,26 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	}
 	return 0;
 }
+
+static inline const char *
+elem_state_to_str(enum elem_state state)
+{
+	switch (state) {
+	case ELEM_PAD:
+		return "PAD";
+	case ELEM_BUSY:
+		return "BUSY";
+	case ELEM_FREE:
+		return "FREE";
+	}
+	return "ERROR";
+}
+
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f)
+{
+	fprintf(f, "Malloc element at %p (%s)\n", elem,
+			elem_state_to_str(elem->state));
+	fprintf(f, "  len: 0x%zx pad: 0x%" PRIx32 "\n", elem->size, elem->pad);
+	fprintf(f, "  prev: %p next: %p\n", elem->prev, elem->next);
+}
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 238e451..40e8eb5 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -149,6 +149,12 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 /*
+ * dump contents of malloc elem to a file.
+ */
+void
+malloc_elem_dump(const struct malloc_elem *elem, FILE *f);
+
+/*
  * Given an element size, compute its freelist index.
  */
 size_t
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 9c95166..44538d7 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -217,6 +217,28 @@ malloc_heap_get_stats(struct malloc_heap *heap,
 	return 0;
 }
 
+/*
+ * Function to dump the contents of a given heap to a file
+ */
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f)
+{
+	struct malloc_elem *elem;
+
+	rte_spinlock_lock(&heap->lock);
+
+	fprintf(f, "Heap size: 0x%zx\n", heap->total_size);
+	fprintf(f, "Heap alloc count: %u\n", heap->alloc_count);
+
+	elem = heap->first;
+	while (elem) {
+		malloc_elem_dump(elem, f);
+		elem = elem->next;
+	}
+
+	rte_spinlock_unlock(&heap->lock);
+}
+
 int
 rte_eal_malloc_heap_init(void)
 {
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index ab0005c..bb28422 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -37,6 +37,9 @@ int
 malloc_heap_get_stats(struct malloc_heap *heap,
 		struct rte_malloc_socket_stats *socket_stats);
 
+void
+malloc_heap_dump(struct malloc_heap *heap, FILE *f);
+
 int
 rte_eal_malloc_heap_init(void);
 
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 970813e..f11a822 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -182,6 +182,23 @@ rte_malloc_get_socket_stats(int socket,
 }
 
 /*
+ * Function to dump contents of all heaps
+ */
+void __rte_experimental
+rte_malloc_dump_heaps(FILE *f)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int idx;
+
+	for (idx = 0; idx < rte_socket_count(); idx++) {
+		unsigned int socket = rte_socket_id_by_idx(idx);
+		fprintf(f, "Heap on socket %i:\n", socket);
+		malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
+	}
+
+}
+
+/*
  * Print stats on memory type. If type is NULL, info on all types is printed
  */
 void
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dd38783..d9fc458 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -222,6 +222,7 @@ EXPERIMENTAL {
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
+	rte_malloc_dump_heaps;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 05/70] test: add command to dump malloc heap contents
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (5 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 04/70] malloc: add function to dump heap contents Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 06/70] malloc: make malloc_elem_join_adjacent_free public Anatoly Burakov
                           ` (64 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m
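
With this change, heap contents can be dumped from the test
application's interactive prompt, e.g. (prompt shown for illustration):

  RTE>>dump_malloc_heaps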

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 test/test/commands.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/test/test/commands.c b/test/test/commands.c
index cf0b726..6bfdc02 100644
--- a/test/test/commands.c
+++ b/test/test/commands.c
@@ -137,6 +137,8 @@ static void cmd_dump_parsed(void *parsed_result,
 		rte_log_dump(stdout);
 	else if (!strcmp(res->dump, "dump_malloc_stats"))
 		rte_malloc_dump_stats(stdout, NULL);
+	else if (!strcmp(res->dump, "dump_malloc_heaps"))
+		rte_malloc_dump_heaps(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -147,6 +149,7 @@ cmdline_parse_token_string_t cmd_dump_dump =
 				 "dump_ring#"
 				 "dump_mempool#"
 				 "dump_malloc_stats#"
+				 "dump_malloc_heaps#"
 				 "dump_devargs#"
 				 "dump_log_types");
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 06/70] malloc: make malloc_elem_join_adjacent_free public
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (6 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 05/70] test: add command to dump malloc " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 07/70] malloc: make elem_free_list_remove public Anatoly Burakov
                           ` (63 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Down the line, we will need to join free segments to determine
whether the resulting contiguous free space is bigger than a
page size, allowing us to free some memory back to the system.
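
A rough sketch of that intended later use (not part of this patch;
elem_spans_free_page() and page_sz are hypothetical):

#include <stdbool.h>
#include <stddef.h>

#include "malloc_elem.h"

static bool
elem_spans_free_page(struct malloc_elem *elem, size_t page_sz)
{
	/* merge with any adjacent free neighbours first */
	struct malloc_elem *merged = malloc_elem_join_adjacent_free(elem);

	/* only a merged span covering at least one full page (plus
	 * element overhead) is a candidate for release to the system */
	return merged->size >= page_sz + MALLOC_ELEM_OVERHEAD;
}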

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_elem.c | 6 +++---
 lib/librte_eal/common/malloc_elem.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index e02ed88..2291ee1 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -325,8 +325,8 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	elem1->next = next;
 }
 
-static struct malloc_elem *
-elem_join_adjacent_free(struct malloc_elem *elem)
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 {
 	/*
 	 * check if next element exists, is adjacent and is free, if so join
@@ -388,7 +388,7 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, sizeof(*elem));
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
-	elem = elem_join_adjacent_free(elem);
+	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
 
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 40e8eb5..99921d2 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -141,6 +141,9 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
 int
 malloc_elem_free(struct malloc_elem *elem);
 
+struct malloc_elem *
+malloc_elem_join_adjacent_free(struct malloc_elem *elem);
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 07/70] malloc: make elem_free_list_remove public
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (7 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 06/70] malloc: make malloc_elem_join_adjacent_free public Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 08/70] malloc: make free return resulting element Anatoly Burakov
                           ` (62 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We will need to be able to remove entries from heaps' free lists
during certain events, such as rollbacks, or when freeing memory
back to the system (where a previously existing element disappears
and thus can no longer be in the free list).
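
For example (a sketch only, not part of this patch; the caller name
is hypothetical):

#include "malloc_elem.h"

static void
rollback_elem(struct malloc_elem *elem)
{
	/* the element is about to disappear, so it must not remain
	 * reachable through its heap's free list */
	malloc_elem_free_list_remove(elem);
}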

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_elem.c | 12 ++++++------
 lib/librte_eal/common/malloc_elem.h |  3 +++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 2291ee1..008f5a3 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -245,8 +245,8 @@ malloc_elem_free_list_insert(struct malloc_elem *elem)
 /*
  * Remove the specified element from its heap's free list.
  */
-static void
-elem_free_list_remove(struct malloc_elem *elem)
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem)
 {
 	LIST_REMOVE(elem, free_list);
 }
@@ -266,7 +266,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
 
-	elem_free_list_remove(elem);
+	malloc_elem_free_list_remove(elem);
 
 	if (trailer_size > MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
 		/* split it, too much free space after elem */
@@ -340,7 +340,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem->next, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->next);
+		malloc_elem_free_list_remove(elem->next);
 		join_elem(elem, elem->next);
 
 		/* erase header and trailer */
@@ -360,7 +360,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
 		erase = RTE_PTR_SUB(elem, MALLOC_ELEM_TRAILER_LEN);
 
 		/* remove from free list, join to this one */
-		elem_free_list_remove(elem->prev);
+		malloc_elem_free_list_remove(elem->prev);
 
 		new_elem = elem->prev;
 		join_elem(new_elem, elem);
@@ -423,7 +423,7 @@ malloc_elem_resize(struct malloc_elem *elem, size_t size)
 	/* we now know the element fits, so remove from free list,
 	 * join the two
 	 */
-	elem_free_list_remove(elem->next);
+	malloc_elem_free_list_remove(elem->next);
 	join_elem(elem, elem->next);
 
 	if (elem->size - new_size >= MIN_DATA_SIZE + MALLOC_ELEM_OVERHEAD) {
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 99921d2..46e2383 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -151,6 +151,9 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem);
 int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
+void
+malloc_elem_free_list_remove(struct malloc_elem *elem);
+
 /*
  * dump contents of malloc elem to a file.
  */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 08/70] malloc: make free return resulting element
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (8 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 07/70] malloc: make elem_free_list_remove public Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 09/70] malloc: replace panics with error messages Anatoly Burakov
                           ` (61 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This will be needed because we need to know how big the new
empty space is, so we can check whether some pages can be freed
as a result.
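
A sketch of how a caller might use the returned element (not part of
this patch; the wrapper is hypothetical):

#include <stddef.h>

#include "malloc_elem.h"

static size_t
free_and_report(struct malloc_elem *elem)
{
	/* malloc_elem_free() now returns the merged free element, so
	 * the caller can see how much contiguous space was freed */
	struct malloc_elem *freed = malloc_elem_free(elem);

	return freed != NULL ? freed->size : 0;
}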

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_elem.c | 4 ++--
 lib/librte_eal/common/malloc_elem.h | 2 +-
 lib/librte_eal/common/malloc_heap.c | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 008f5a3..c18f050 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -379,7 +379,7 @@ malloc_elem_join_adjacent_free(struct malloc_elem *elem)
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem)
 {
 	void *ptr;
@@ -397,7 +397,7 @@ malloc_elem_free(struct malloc_elem *elem)
 
 	memset(ptr, 0, data_len);
 
-	return 0;
+	return elem;
 }
 
 /*
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 46e2383..9c1614c 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -138,7 +138,7 @@ malloc_elem_alloc(struct malloc_elem *elem, size_t size,
  * blocks either immediately before or immediately after newly freed block
  * are also free, the blocks are merged together.
  */
-int
+struct malloc_elem *
 malloc_elem_free(struct malloc_elem *elem);
 
 struct malloc_elem *
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 44538d7..a2c2e4c 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -145,7 +145,7 @@ int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	int ret;
+	struct malloc_elem *ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -159,7 +159,7 @@ malloc_heap_free(struct malloc_elem *elem)
 
 	rte_spinlock_unlock(&(heap->lock));
 
-	return ret;
+	return ret != NULL ? 0 : -1;
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 09/70] malloc: replace panics with error messages
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (9 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 08/70] malloc: make free return resulting element Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 10/70] malloc: add support for contiguous allocation Anatoly Burakov
                           ` (60 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

We shouldn't ever panic in libraries, let alone in EAL, so
replace all panic messages with error messages.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/rte_malloc.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index f11a822..2cda48e 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -30,7 +30,7 @@ void rte_free(void *addr)
 {
 	if (addr == NULL) return;
 	if (malloc_heap_free(malloc_elem_from_data(addr)) < 0)
-		rte_panic("Fatal error: Invalid memory\n");
+		RTE_LOG(ERR, EAL, "Error: Invalid memory\n");
 }
 
 /*
@@ -134,8 +134,10 @@ rte_realloc(void *ptr, size_t size, unsigned align)
 		return rte_malloc(NULL, size, align);
 
 	struct malloc_elem *elem = malloc_elem_from_data(ptr);
-	if (elem == NULL)
-		rte_panic("Fatal error: memory corruption detected\n");
+	if (elem == NULL) {
+		RTE_LOG(ERR, EAL, "Error: memory corruption detected\n");
+		return NULL;
+	}
 
 	size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align);
 	/* check alignment matches first, and if ok, see if we can resize block */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 10/70] malloc: add support for contiguous allocation
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (10 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 09/70] malloc: replace panics with error messages Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 11/70] memzone: enable reserving IOVA-contiguous memzones Anatoly Burakov
                           ` (59 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Also, add a function to check a malloc element for physical
contiguousness. For now, assume hugepage memory is always
contiguous, while non-hugepage memory will be checked.
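
As an internal-API sketch (not part of this patch; the wrapper and
the size are illustrative), a physically contiguous allocation from
a given heap would now look like:

#include <stdbool.h>

#include <rte_common.h>

#include "malloc_heap.h"

static void *
alloc_contig_1mb(struct malloc_heap *heap)
{
	/* last argument requests physically contiguous memory */
	return malloc_heap_alloc(heap, NULL, 1 << 20, 0 /* size flags */,
			RTE_CACHE_LINE_SIZE, 0 /* no bound */, true);
}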

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  23 +++---
 lib/librte_eal/common/malloc_elem.c        | 125 ++++++++++++++++++++++++-----
 lib/librte_eal/common/malloc_elem.h        |   6 +-
 lib/librte_eal/common/malloc_heap.c        |  11 +--
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |   7 +-
 6 files changed, 133 insertions(+), 43 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1ab3ade..16a2e7a 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -98,7 +98,8 @@ find_heap_max_free_elem(int *s, unsigned align)
 
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
-		int socket_id, unsigned flags, unsigned align, unsigned bound)
+		int socket_id, unsigned int flags, unsigned int align,
+		unsigned int bound, bool contig)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
@@ -188,7 +189,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 	/* allocate memory on heap */
 	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound);
+			requested_len, flags, align, bound, contig);
 
 	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
 		/* try other heaps */
@@ -197,7 +198,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 				continue;
 
 			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align, bound);
+					NULL, requested_len, flags, align,
+					bound, contig);
 			if (mz_addr != NULL)
 				break;
 		}
@@ -235,9 +237,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 }
 
 static const struct rte_memzone *
-rte_memzone_reserve_thread_safe(const char *name, size_t len,
-				int socket_id, unsigned flags, unsigned align,
-				unsigned bound)
+rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
+		unsigned int flags, unsigned int align, unsigned int bound,
+		bool contig)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -248,7 +250,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound);
+		name, len, socket_id, flags, align, bound, contig);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -265,7 +267,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound);
+					       align, bound, false);
 }
 
 /*
@@ -277,7 +279,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0);
+					       align, 0, false);
 }
 
 /*
@@ -289,7 +291,8 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0);
+					       flags, RTE_CACHE_LINE_SIZE, 0,
+					       false);
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index c18f050..87695b9 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -6,6 +6,7 @@
 #include <stddef.h>
 #include <stdio.h>
 #include <string.h>
+#include <unistd.h>
 #include <sys/queue.h>
 
 #include <rte_memory.h>
@@ -94,33 +95,112 @@ malloc_elem_insert(struct malloc_elem *elem)
 }
 
 /*
+ * Attempt to find enough physically contiguous memory in this block to store
+ * our data. Assume that element has at least enough space to fit in the data,
+ * so we just check the page addresses.
+ */
+static bool
+elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+		void *start, size_t size)
+{
+	rte_iova_t cur, expected;
+	void *start_page, *end_page, *cur_page;
+	size_t pagesz;
+
+	/* for hugepage memory or IOVA as VA, it's always contiguous */
+	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* otherwise, check if start and end are within the same page */
+	pagesz = getpagesize();
+
+	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
+	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
+
+	if (start_page == end_page)
+		return true;
+
+	/* if they are from different pages, check if they are contiguous */
+
+	/* if we can't access physical addresses, assume non-contiguous */
+	if (!rte_eal_using_phys_addrs())
+		return false;
+
+	/* skip first iteration */
+	cur = rte_mem_virt2iova(start_page);
+	expected = cur + pagesz;
+	cur_page = RTE_PTR_ADD(start_page, pagesz);
+
+	while (cur_page <= end_page) {
+		cur = rte_mem_virt2iova(cur_page);
+		if (cur != expected)
+			return false;
+		cur_page = RTE_PTR_ADD(cur_page, pagesz);
+		expected += pagesz;
+	}
+	return true;
+}
+
+/*
  * calculate the starting point of where data of the requested size
  * and alignment would fit in the current element. If the data doesn't
  * fit, return NULL.
  */
 static void *
 elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	const size_t bmask = ~(bound - 1);
-	uintptr_t end_pt = (uintptr_t)elem +
-			elem->size - MALLOC_ELEM_TRAILER_LEN;
-	uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-	uintptr_t new_elem_start;
-
-	/* check boundary */
-	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
-		end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
-		new_data_start = RTE_ALIGN_FLOOR((end_pt - size), align);
-		end_pt = new_data_start + size;
-		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
-			return NULL;
-	}
+	size_t elem_size = elem->size;
+
+	/*
+	 * we're allocating from the end, so adjust the size of element by
+	 * alignment size.
+	 */
+	while (elem_size >= size) {
+		const size_t bmask = ~(bound - 1);
+		uintptr_t end_pt = (uintptr_t)elem +
+				elem_size - MALLOC_ELEM_TRAILER_LEN;
+		uintptr_t new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+				align);
+		uintptr_t new_elem_start;
+
+		/* check boundary */
+		if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
+			end_pt = RTE_ALIGN_FLOOR(end_pt, bound);
+			new_data_start = RTE_ALIGN_FLOOR((end_pt - size),
+					align);
+			end_pt = new_data_start + size;
+
+			if (((end_pt - 1) & bmask) != (new_data_start & bmask))
+				return NULL;
+		}
+
+		new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
 
-	new_elem_start = new_data_start - MALLOC_ELEM_HEADER_LEN;
+		/* if the new start point is before the exist start,
+		 * it won't fit
+		 */
+		if (new_elem_start < (uintptr_t)elem)
+			return NULL;
 
-	/* if the new start point is before the exist start, it won't fit */
-	return (new_elem_start < (uintptr_t)elem) ? NULL : (void *)new_elem_start;
+		if (contig) {
+			size_t new_data_size = end_pt - new_data_start;
+
+			/*
+			 * if physical contiguousness was requested and we
+			 * couldn't fit all data into one physically contiguous
+			 * block, try again with lower addresses.
+			 */
+			if (!elem_check_phys_contig(elem->ms,
+					(void *)new_data_start,
+					new_data_size)) {
+				elem_size -= align;
+				continue;
+			}
+		}
+		return (void *)new_elem_start;
+	}
+	return NULL;
 }
 
 /*
@@ -129,9 +209,9 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,	unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	return elem_start_pt(elem, size, align, bound) != NULL;
+	return elem_start_pt(elem, size, align, bound, contig) != NULL;
 }
 
 /*
@@ -259,9 +339,10 @@ malloc_elem_free_list_remove(struct malloc_elem *elem)
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size, unsigned align,
-		size_t bound)
+		size_t bound, bool contig)
 {
-	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound);
+	struct malloc_elem *new_elem = elem_start_pt(elem, size, align, bound,
+			contig);
 	const size_t old_elem_size = (uintptr_t)new_elem - (uintptr_t)elem;
 	const size_t trailer_size = elem->size - old_elem_size - size -
 		MALLOC_ELEM_OVERHEAD;
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 9c1614c..34bd268 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_ELEM_H_
 #define MALLOC_ELEM_H_
 
+#include <stdbool.h>
+
 #include <rte_memory.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
@@ -123,7 +125,7 @@ malloc_elem_insert(struct malloc_elem *elem);
  */
 int
 malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * reserve a block of data in an existing malloc_elem. If the malloc_elem
@@ -131,7 +133,7 @@ malloc_elem_can_hold(struct malloc_elem *elem, size_t size,
  */
 struct malloc_elem *
 malloc_elem_alloc(struct malloc_elem *elem, size_t size,
-		unsigned align, size_t bound);
+		unsigned int align, size_t bound, bool contig);
 
 /*
  * free a malloc_elem block by adding it to the free list. If the
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index a2c2e4c..564b61a 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -88,7 +88,7 @@ malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
  */
 static struct malloc_elem *
 find_suitable_element(struct malloc_heap *heap, size_t size,
-		unsigned flags, size_t align, size_t bound)
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	size_t idx;
 	struct malloc_elem *elem, *alt_elem = NULL;
@@ -97,7 +97,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 			idx < RTE_HEAP_NUM_FREELISTS; idx++) {
 		for (elem = LIST_FIRST(&heap->free_head[idx]);
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
-			if (malloc_elem_can_hold(elem, size, align, bound)) {
+			if (malloc_elem_can_hold(elem, size, align, bound,
+					contig)) {
 				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
 					return elem;
 				if (alt_elem == NULL)
@@ -121,7 +122,7 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 void *
 malloc_heap_alloc(struct malloc_heap *heap,
 		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound)
+		size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
@@ -130,9 +131,9 @@ malloc_heap_alloc(struct malloc_heap *heap,
 
 	rte_spinlock_lock(&heap->lock);
 
-	elem = find_suitable_element(heap, size, flags, align, bound);
+	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
-		elem = malloc_elem_alloc(elem, size, align, bound);
+		elem = malloc_elem_alloc(elem, size, align, bound, contig);
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index bb28422..c57b59a 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -5,6 +5,8 @@
 #ifndef MALLOC_HEAP_H_
 #define MALLOC_HEAP_H_
 
+#include <stdbool.h>
+
 #include <rte_malloc.h>
 #include <rte_malloc_heap.h>
 
@@ -25,7 +27,7 @@ malloc_get_numa_socket(void)
 
 void *
 malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned flags, size_t align, size_t bound);
+		unsigned int flags, size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 2cda48e..436818a 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -37,7 +37,8 @@ void rte_free(void *addr)
  * Allocate memory on specified heap.
  */
 void *
-rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
+rte_malloc_socket(const char *type, size_t size, unsigned int align,
+		int socket_arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int socket, i;
@@ -60,7 +61,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 		return NULL;
 
 	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
 		return ret;
 
@@ -71,7 +72,7 @@ rte_malloc_socket(const char *type, size_t size, unsigned align, int socket_arg)
 			continue;
 
 		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-					size, 0, align == 0 ? 1 : align, 0);
+				size, 0, align == 0 ? 1 : align, 0, false);
 		if (ret != NULL)
 			return ret;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 11/70] memzone: enable reserving IOVA-contiguous memzones
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (11 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 10/70] malloc: add support for contiguous allocation Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
                           ` (58 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a new flag to request that a reserved memzone be
IOVA-contiguous. This is useful for allocating hardware resources
like NIC rings/queues etc. For now, hugepage memory is always
contiguous, but we need to prepare the drivers for the switch.
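
A minimal usage sketch (zone name and size are illustrative):

#include <rte_memzone.h>
#include <rte_lcore.h>

static const struct rte_memzone *
reserve_hw_ring(void)
{
	/* ask for an IOVA-contiguous zone suitable for a hardware ring */
	return rte_memzone_reserve("example_hw_ring", 16384,
			rte_socket_id(), RTE_MEMZONE_IOVA_CONTIG);
}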

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memzone.c  | 25 +++++++++++++++++--------
 lib/librte_eal/common/include/rte_memzone.h | 11 +++++++++++
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 16a2e7a..af68c00 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -99,12 +99,13 @@ find_heap_max_free_elem(int *s, unsigned align)
 static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned int flags, unsigned int align,
-		unsigned int bound, bool contig)
+		unsigned int bound)
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
+	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -170,7 +171,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	if (!rte_eal_has_hugepages())
 		socket_id = SOCKET_ID_ANY;
 
+	contig = (flags & RTE_MEMZONE_IOVA_CONTIG) != 0;
+	/* malloc only cares about size flags, remove contig flag from flags */
+	flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+
 	if (len == 0) {
+		/* len == 0 is only allowed for non-contiguous zones */
+		if (contig) {
+			RTE_LOG(DEBUG, EAL, "Reserving zero-length contiguous memzones is not supported\n");
+			rte_errno = EINVAL;
+			return NULL;
+		}
 		if (bound != 0)
 			requested_len = bound;
 		else {
@@ -238,8 +249,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 
 static const struct rte_memzone *
 rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
-		unsigned int flags, unsigned int align, unsigned int bound,
-		bool contig)
+		unsigned int flags, unsigned int align, unsigned int bound)
 {
 	struct rte_mem_config *mcfg;
 	const struct rte_memzone *mz = NULL;
@@ -250,7 +260,7 @@ rte_memzone_reserve_thread_safe(const char *name, size_t len, int socket_id,
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	mz = memzone_reserve_aligned_thread_unsafe(
-		name, len, socket_id, flags, align, bound, contig);
+		name, len, socket_id, flags, align, bound);
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
@@ -267,7 +277,7 @@ rte_memzone_reserve_bounded(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align, unsigned bound)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, bound, false);
+					       align, bound);
 }
 
 /*
@@ -279,7 +289,7 @@ rte_memzone_reserve_aligned(const char *name, size_t len, int socket_id,
 			    unsigned flags, unsigned align)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id, flags,
-					       align, 0, false);
+					       align, 0);
 }
 
 /*
@@ -291,8 +301,7 @@ rte_memzone_reserve(const char *name, size_t len, int socket_id,
 		    unsigned flags)
 {
 	return rte_memzone_reserve_thread_safe(name, len, socket_id,
-					       flags, RTE_CACHE_LINE_SIZE, 0,
-					       false);
+					       flags, RTE_CACHE_LINE_SIZE, 0);
 }
 
 int
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 2bfb273..e2630fd 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -23,6 +23,7 @@
  */
 
 #include <stdio.h>
+#include <rte_compat.h>
 #include <rte_memory.h>
 #include <rte_common.h>
 
@@ -39,6 +40,7 @@ extern "C" {
 #define RTE_MEMZONE_512MB          0x00040000   /**< Use 512MB pages. */
 #define RTE_MEMZONE_4GB            0x00080000   /**< Use 4GB pages. */
 #define RTE_MEMZONE_SIZE_HINT_ONLY 0x00000004   /**< Use available page size */
+#define RTE_MEMZONE_IOVA_CONTIG    0x00100000   /**< Ask for IOVA-contiguous memzone. */
 
 /**
  * A structure describing a memzone, which is a contiguous portion of
@@ -102,6 +104,9 @@ struct rte_memzone {
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @return
  *   A pointer to a correctly-filled read-only memzone descriptor, or NULL
  *   on error.
@@ -152,6 +157,9 @@ const struct rte_memzone *rte_memzone_reserve(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @return
@@ -207,6 +215,9 @@ const struct rte_memzone *rte_memzone_reserve_aligned(const char *name,
  *                                  If this flag is not set, the function
  *                                  will return error on an unavailable size
  *                                  request.
+ *   - RTE_MEMZONE_IOVA_CONTIG - Ensure reserved memzone is IOVA-contiguous.
+ *                               This option should be used when allocating
+ *                               memory intended for hardware rings etc.
  * @param align
  *   Alignment for resulting memzone. Must be a power of 2.
  * @param bound
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 12/70] ethdev: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (12 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 11/70] memzone: enable reserving IOVA-contiguous memzones Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 13/70] crypto/qat: " Anatoly Burakov
                           ` (57 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_ether/rte_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2c74f7e..d0cf0e7 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3403,7 +3403,8 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 13/70] crypto/qat: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (13 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 14/70] net/avf: " Anatoly Burakov
                           ` (56 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: John Griffin, Fiona Trahe, Deepak Kumar Jain, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Also, remove the weird page alignment code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/crypto/qat/qat_qp.c | 23 ++---------------------
 1 file changed, 2 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/qat/qat_qp.c b/drivers/crypto/qat/qat_qp.c
index 87b9ce0..478b7ba 100644
--- a/drivers/crypto/qat/qat_qp.c
+++ b/drivers/crypto/qat/qat_qp.c
@@ -54,8 +54,6 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 			int socket_id)
 {
 	const struct rte_memzone *mz;
-	unsigned memzone_flags = 0;
-	const struct rte_memseg *ms;
 
 	PMD_INIT_FUNC_TRACE();
 	mz = rte_memzone_lookup(queue_name);
@@ -78,25 +76,8 @@ queue_dma_zone_reserve(const char *queue_name, uint32_t queue_size,
 
 	PMD_DRV_LOG(DEBUG, "Allocate memzone for %s, size %u on socket %u",
 					queue_name, queue_size, socket_id);
-	ms = rte_eal_get_physmem_layout();
-	switch (ms[0].hugepage_sz) {
-	case(RTE_PGSIZE_2M):
-		memzone_flags = RTE_MEMZONE_2MB;
-	break;
-	case(RTE_PGSIZE_1G):
-		memzone_flags = RTE_MEMZONE_1GB;
-	break;
-	case(RTE_PGSIZE_16M):
-		memzone_flags = RTE_MEMZONE_16MB;
-	break;
-	case(RTE_PGSIZE_16G):
-		memzone_flags = RTE_MEMZONE_16GB;
-	break;
-	default:
-		memzone_flags = RTE_MEMZONE_SIZE_HINT_ONLY;
-	}
-	return rte_memzone_reserve_aligned(queue_name, queue_size, socket_id,
-		memzone_flags, queue_size);
+	return rte_memzone_reserve_aligned(queue_name, queue_size,
+		socket_id, RTE_MEMZONE_IOVA_CONTIG, queue_size);
 }
 
 int qat_crypto_sym_qp_setup(struct rte_cryptodev *dev, uint16_t queue_pair_id,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 14/70] net/avf: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (14 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 13/70] crypto/qat: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 15/70] net/bnx2x: " Anatoly Burakov
                           ` (55 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Jingjing Wu, Wenzhuo Lu, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/avf/avf_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index 4442c3c..68a59b4 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -1365,8 +1365,8 @@ avf_allocate_dma_mem_d(__rte_unused struct avf_hw *hw,
 		return AVF_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "avf_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return AVF_ERR_NO_MEMORY;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 15/70] net/bnx2x: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (15 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 14/70] net/avf: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 16/70] net/bnxt: " Anatoly Burakov
                           ` (54 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Harish Patil, Rasesh Mody, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/bnx2x/bnx2x.c      | 4 ++--
 drivers/net/bnx2x/bnx2x_rxtx.c | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index fb02d0f..382b50f 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -177,9 +177,9 @@ bnx2x_dma_alloc(struct bnx2x_softc *sc, size_t size, struct bnx2x_dma *dma,
 			rte_get_timer_cycles());
 
 	/* Caller must take care that strlen(mz_name) < RTE_MEMZONE_NAMESIZE */
-	z = rte_memzone_reserve_aligned(mz_name, (uint64_t) (size),
+	z = rte_memzone_reserve_aligned(mz_name, (uint64_t)size,
 					SOCKET_ID_ANY,
-					0, align);
+					RTE_MEMZONE_IOVA_CONTIG, align);
 	if (z == NULL) {
 		PMD_DRV_LOG(ERR, "DMA alloc failed for %s", msg);
 		return -ENOMEM;
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index a0d4ac9..6be7277 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -26,7 +26,8 @@ ring_dma_zone_reserve(struct rte_eth_dev *dev, const char *ring_name,
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id, 0, BNX2X_PAGE_SIZE);
+	return rte_memzone_reserve_aligned(z_name, ring_size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, BNX2X_PAGE_SIZE);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 16/70] net/bnxt: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (16 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 15/70] net/bnx2x: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 17/70] net/cxgbe: " Anatoly Burakov
                           ` (53 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Ajit Khaparde, Somnath Kotur, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/bnxt/bnxt_ethdev.c | 17 ++++++++++-------
 drivers/net/bnxt/bnxt_ring.c   |  9 +++++----
 drivers/net/bnxt/bnxt_vnic.c   |  8 ++++----
 3 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 0b21653..ad7d925 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3147,9 +3147,10 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 				sizeof(struct rx_port_stats) + 512);
 		if (!mz) {
 			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
@@ -3181,10 +3182,12 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 		total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
 				sizeof(struct tx_port_stats) + 512);
 		if (!mz) {
-			mz = rte_memzone_reserve(mz_name, total_alloc_len,
-						 SOCKET_ID_ANY,
-						 RTE_MEMZONE_2MB |
-						 RTE_MEMZONE_SIZE_HINT_ONLY);
+			mz = rte_memzone_reserve(mz_name,
+					total_alloc_len,
+					SOCKET_ID_ANY,
+					RTE_MEMZONE_2MB |
+					RTE_MEMZONE_SIZE_HINT_ONLY |
+					RTE_MEMZONE_IOVA_CONTIG);
 			if (mz == NULL)
 				return -ENOMEM;
 		}
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 8fb8972..0e8a6a2 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -166,10 +166,11 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve_aligned(mz_name, total_alloc_len,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY,
-					 getpagesize());
+				SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG,
+				getpagesize());
 		if (mz == NULL)
 			return -ENOMEM;
 	}
diff --git a/drivers/net/bnxt/bnxt_vnic.c b/drivers/net/bnxt/bnxt_vnic.c
index d4aeb4c..9ccc67e 100644
--- a/drivers/net/bnxt/bnxt_vnic.c
+++ b/drivers/net/bnxt/bnxt_vnic.c
@@ -185,10 +185,10 @@ int bnxt_alloc_vnic_attributes(struct bnxt *bp)
 	mz = rte_memzone_lookup(mz_name);
 	if (!mz) {
 		mz = rte_memzone_reserve(mz_name,
-					 entry_length * max_vnics,
-					 SOCKET_ID_ANY,
-					 RTE_MEMZONE_2MB |
-					 RTE_MEMZONE_SIZE_HINT_ONLY);
+				entry_length * max_vnics, SOCKET_ID_ANY,
+				RTE_MEMZONE_2MB |
+				RTE_MEMZONE_SIZE_HINT_ONLY |
+				RTE_MEMZONE_IOVA_CONTIG);
 		if (!mz)
 			return -ENOMEM;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 17/70] net/cxgbe: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (17 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 16/70] net/bnxt: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 18/70] net/ena: " Anatoly Burakov
                           ` (52 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Rahul Lakkireddy, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/cxgbe/sge.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index 83e26d0..85846fc 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1344,7 +1344,8 @@ static void *alloc_ring(size_t nelem, size_t elem_size,
 	 * handle the maximum ring size is allocated in order to allow for
 	 * resizing in later calls to the queue setup function.
 	 */
-	tz = rte_memzone_reserve_aligned(z_name, len, socket_id, 0, 4096);
+	tz = rte_memzone_reserve_aligned(z_name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, 4096);
 	if (!tz)
 		return NULL;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 18/70] net/ena: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (18 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 17/70] net/cxgbe: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 19/70] net/enic: " Anatoly Burakov
                           ` (51 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/ena/base/ena_plat_dpdk.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ena/base/ena_plat_dpdk.h b/drivers/net/ena/base/ena_plat_dpdk.h
index 8cba319..9334519 100644
--- a/drivers/net/ena/base/ena_plat_dpdk.h
+++ b/drivers/net/ena/base/ena_plat_dpdk.h
@@ -188,7 +188,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(handle);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY, 0); \
+		mz = rte_memzone_reserve(z_name, size, SOCKET_ID_ANY,	\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -206,7 +207,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 		phys = mz->iova;					\
@@ -219,7 +221,8 @@ typedef uint64_t dma_addr_t;
 		ENA_TOUCH(dmadev); ENA_TOUCH(dev_node);			\
 		snprintf(z_name, sizeof(z_name),			\
 				"ena_alloc_%d", ena_alloc_cnt++);	\
-		mz = rte_memzone_reserve(z_name, size, node, 0); \
+		mz = rte_memzone_reserve(z_name, size, node,		\
+				RTE_MEMZONE_IOVA_CONTIG);		\
 		memset(mz->addr, 0, size);				\
 		virt = mz->addr;					\
 	} while (0)
-- 
2.7.4


* [PATCH v6 19/70] net/enic: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (19 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 18/70] net/ena: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 20/70] net/i40e: " Anatoly Burakov
                           ` (50 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: John Daley, Hyong Youb Kim, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: John Daley <johndale@cisco.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/enic/enic_main.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 69ad425..94e8e68 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -343,8 +343,8 @@ enic_alloc_consistent(void *priv, size_t size,
 	struct enic *enic = (struct enic *)priv;
 	struct enic_memzone_entry *mze;
 
-	rz = rte_memzone_reserve_aligned((const char *)name,
-					 size, SOCKET_ID_ANY, 0, ENIC_ALIGN);
+	rz = rte_memzone_reserve_aligned((const char *)name, size,
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!rz) {
 		pr_err("%s : Failed to allocate memory requested for %s\n",
 			__func__, name);
@@ -888,9 +888,8 @@ int enic_alloc_wq(struct enic *enic, uint16_t queue_idx,
 		instance++);
 
 	wq->cqmsg_rz = rte_memzone_reserve_aligned((const char *)name,
-						   sizeof(uint32_t),
-						   SOCKET_ID_ANY, 0,
-						   ENIC_ALIGN);
+			sizeof(uint32_t), SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, ENIC_ALIGN);
 	if (!wq->cqmsg_rz)
 		return -ENOMEM;
 
-- 
2.7.4


* [PATCH v6 20/70] net/i40e: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (20 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 19/70] net/enic: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 21/70] net/qede: " Anatoly Burakov
                           ` (49 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Beilei Xing, Qi Zhang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/i40e/i40e_ethdev.c | 4 ++--
 drivers/net/i40e/i40e_rxtx.c   | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index d0bf4e3..e00f402 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4053,8 +4053,8 @@ i40e_allocate_dma_mem_d(__attribute__((unused)) struct i40e_hw *hw,
 		return I40E_ERR_PARAM;
 
 	snprintf(z_name, sizeof(z_name), "i40e_dma_%"PRIu64, rte_rand());
-	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY, 0,
-					 alignment, RTE_PGSIZE_2M);
+	mz = rte_memzone_reserve_bounded(z_name, size, SOCKET_ID_ANY,
+			RTE_MEMZONE_IOVA_CONTIG, alignment, RTE_PGSIZE_2M);
 	if (!mz)
 		return I40E_ERR_NO_MEMORY;
 
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 1217e5a..56a854c 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -2189,8 +2189,8 @@ i40e_memzone_reserve(const char *name, uint32_t len, int socket_id)
 	if (mz)
 		return mz;
 
-	mz = rte_memzone_reserve_aligned(name, len,
-					 socket_id, 0, I40E_RING_BASE_ALIGN);
+	mz = rte_memzone_reserve_aligned(name, len, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, I40E_RING_BASE_ALIGN);
 	return mz;
 }
 
-- 
2.7.4


* [PATCH v6 21/70] net/qede: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (21 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 20/70] net/i40e: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 22/70] net/virtio: " Anatoly Burakov
                           ` (48 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Rasesh Mody, Harish Patil, Shahed Shaikh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harish Patil <harish.patil@cavium.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/qede/base/bcm_osal.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/qede/base/bcm_osal.c b/drivers/net/qede/base/bcm_osal.c
index 91017b8..f550412 100644
--- a/drivers/net/qede/base/bcm_osal.c
+++ b/drivers/net/qede/base/bcm_osal.c
@@ -135,8 +135,8 @@ void *osal_dma_alloc_coherent(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size,
-					 socket_id, 0, RTE_CACHE_LINE_SIZE);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, RTE_CACHE_LINE_SIZE);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
@@ -174,7 +174,8 @@ void *osal_dma_alloc_coherent_aligned(struct ecore_dev *p_dev,
 	if (core_id == (unsigned int)LCORE_ID_ANY)
 		core_id = rte_get_master_lcore();
 	socket_id = rte_lcore_to_socket_id(core_id);
-	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id, 0, align);
+	mz = rte_memzone_reserve_aligned(mz_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 	if (!mz) {
 		DP_ERR(p_dev, "Unable to allocate DMA memory "
 		       "of size %zu bytes - %s\n",
-- 
2.7.4


* [PATCH v6 22/70] net/virtio: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (22 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 21/70] net/qede: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 23/70] net/vmxnet3: " Anatoly Burakov
                           ` (47 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/virtio/virtio_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 2ef213d..f03d790 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -391,8 +391,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		     size, vq->vq_ring_size);
 
 	mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
-					 SOCKET_ID_ANY,
-					 0, VIRTIO_PCI_VRING_ALIGN);
+			SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+			VIRTIO_PCI_VRING_ALIGN);
 	if (mz == NULL) {
 		if (rte_errno == EEXIST)
 			mz = rte_memzone_lookup(vq_name);
@@ -417,8 +417,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
 		snprintf(vq_hdr_name, sizeof(vq_hdr_name), "port%d_vq%d_hdr",
 			 dev->data->port_id, vtpci_queue_idx);
 		hdr_mz = rte_memzone_reserve_aligned(vq_hdr_name, sz_hdr_mz,
-						     SOCKET_ID_ANY, 0,
-						     RTE_CACHE_LINE_SIZE);
+				SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG,
+				RTE_CACHE_LINE_SIZE);
 		if (hdr_mz == NULL) {
 			if (rte_errno == EEXIST)
 				hdr_mz = rte_memzone_lookup(vq_hdr_name);
-- 
2.7.4


* [PATCH v6 23/70] net/vmxnet3: use contiguous allocation for DMA memory
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (23 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 22/70] net/virtio: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
                           ` (46 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Yong Wang, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index 4260087..104664a 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -150,13 +150,14 @@ gpa_zone_reserve(struct rte_eth_dev *dev, uint32_t size,
 		if (mz)
 			rte_memzone_free(mz);
 		return rte_memzone_reserve_aligned(z_name, size, socket_id,
-						   0, align);
+				RTE_MEMZONE_IOVA_CONTIG, align);
 	}
 
 	if (mz)
 		return mz;
 
-	return rte_memzone_reserve_aligned(z_name, size, socket_id, 0, align);
+	return rte_memzone_reserve_aligned(z_name, size, socket_id,
+			RTE_MEMZONE_IOVA_CONTIG, align);
 }
 
 /*
-- 
2.7.4


* [PATCH v6 24/70] mempool: add support for the new allocation methods
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (24 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 23/70] net/vmxnet3: " Anatoly Burakov
@ 2018-04-11 12:29         ` Anatoly Burakov
  2018-04-11 14:35           ` Olivier Matz
                             ` (2 more replies)
  2018-04-11 12:30         ` [PATCH v6 25/70] eal: add function to walk all memsegs Anatoly Burakov
                           ` (45 subsequent siblings)
  71 siblings, 3 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:29 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

If a user has specified that the zone should have contiguous memory,
use the new _contig allocation APIs instead of the normal ones.
Otherwise, account for the fact that, unless we're in IOVA-as-VA
mode, we cannot guarantee that the pages will be physically
contiguous, so we calculate the memzone size and alignment as if
we were getting the smallest page size available.

However, in the non-IOVA-contiguous case, the existing mempool size
calculation function doesn't give the expected results, because it
returns memzone sizes aligned to page size (e.g. a 1MB mempool may
use an entire 1GB page). Therefore, unless we were specifically
asked to reserve non-contiguous memory, first try reserving the
memzone as IOVA-contiguous, and if that fails, retry with
page-aligned size and alignment.
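
A condensed sketch of the retry logic this adds (variable names as in
the diff below, error handling and size bookkeeping trimmed):

	/* first attempt: whole chunk as one IOVA-contiguous memzone */
	mz = rte_memzone_reserve_aligned(mz_name, size, mp->socket_id,
			mz_flags | RTE_MEMZONE_IOVA_CONTIG,
			RTE_CACHE_LINE_SIZE);
	if (mz == NULL) {
		/* fall back: recalculate size for page-by-page population
		 * and align to the smallest available page size */
		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
				mp->flags);
		mz = rte_memzone_reserve_aligned(mz_name, size,
				mp->socket_id, mz_flags, pg_sz);
	}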

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_mempool/rte_mempool.c | 148 +++++++++++++++++++++++++++++++++------
 1 file changed, 127 insertions(+), 21 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f7f4b..4660cc2 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -3,6 +3,7 @@
  * Copyright(c) 2016 6WIND S.A.
  */
 
+#include <stdbool.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdint.h>
@@ -98,6 +99,27 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static size_t
+get_min_page_size(void)
+{
+	const struct rte_mem_config *mcfg =
+			rte_eal_get_configuration()->mem_config;
+	int i;
+	size_t min_pagesz = SIZE_MAX;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		if (ms->hugepage_sz < min_pagesz)
+			min_pagesz = ms->hugepage_sz;
+	}
+
+	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
+}
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
@@ -367,16 +389,6 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
 	/* update mempool capabilities */
 	mp->flags |= mp_capa_flags;
 
-	/* Detect pool area has sufficient space for elements */
-	if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) {
-		if (len < total_elt_sz * mp->size) {
-			RTE_LOG(ERR, MEMPOOL,
-				"pool area %" PRIx64 " not enough\n",
-				(uint64_t)len);
-			return -ENOSPC;
-		}
-	}
-
 	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
 	if (memhdr == NULL)
 		return -ENOMEM;
@@ -549,6 +561,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	unsigned mz_id, n;
 	unsigned int mp_flags;
 	int ret;
+	bool force_contig, no_contig, try_contig, no_pageshift;
 
 	/* mempool must not be populated */
 	if (mp->nb_mem_chunks != 0)
@@ -563,9 +576,68 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	/* update mempool capabilities */
 	mp->flags |= mp_flags;
 
-	if (rte_eal_has_hugepages()) {
-		pg_shift = 0; /* not needed, zone is physically contiguous */
+	no_contig = mp->flags & MEMPOOL_F_NO_PHYS_CONTIG;
+	force_contig = mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG;
+
+	/*
+	 * the following section calculates page shift and page size values.
+	 *
+	 * these values impact the result of rte_mempool_xmem_size(), which
+	 * returns the amount of memory that should be allocated to store the
+	 * desired number of objects. when not zero, it allocates more memory
+	 * for the padding between objects, to ensure that an object does not
+	 * cross a page boundary. in other words, page size/shift are to be set
+	 * to zero if mempool elements won't care about page boundaries.
+	 * there are several considerations for page size and page shift here.
+	 *
+	 * if we don't need our mempools to have physically contiguous objects,
+	 * then just set page shift and page size to 0, because the user has
+	 * indicated that there's no need to care about anything.
+	 *
+	 * if we do need contiguous objects, there is also an option to reserve
+	 * the entire mempool memory as one contiguous block of memory, in
+	 * which case the page shift and alignment wouldn't matter as well.
+	 *
+	 * if we require contiguous objects, but not necessarily the entire
+	 * mempool reserved space to be contiguous, then there are two options.
+	 *
+	 * if our IO addresses are virtual, not actual physical (IOVA as VA
+	 * case), then no page shift needed - our memory allocation will give us
+	 * contiguous physical memory as far as the hardware is concerned, so
+	 * act as if we're getting contiguous memory.
+	 *
+	 * if our IO addresses are physical, we may get memory from bigger
+	 * pages, or we might get memory from smaller pages, and how much of it
+	 * we require depends on whether we want bigger or smaller pages.
+	 * However, requesting each and every memory size is too much work, so
+	 * what we'll do instead is walk through the page sizes available, pick
+	 * the smallest one and set up page shift to match that one. We will be
+	 * wasting some space this way, but it's much nicer than looping around
+	 * trying to reserve each and every page size.
+	 *
+	 * However, since size calculation will produce page-aligned sizes, it
+	 * makes sense to first try and see if we can reserve the entire memzone
+	 * in one contiguous chunk as well (otherwise we might end up wasting a
+	 * 1G page on a 10MB memzone). If we fail to get enough contiguous
+	 * memory, then we'll go and reserve space page-by-page.
+	 */
+	no_pageshift = no_contig || force_contig ||
+			rte_eal_iova_mode() == RTE_IOVA_VA;
+	try_contig = !no_contig && !no_pageshift && rte_eal_has_hugepages();
+	if (force_contig)
+		mz_flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+	if (no_pageshift) {
 		pg_sz = 0;
+		pg_shift = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else if (try_contig) {
+		pg_sz = get_min_page_size();
+		pg_shift = rte_bsf32(pg_sz);
+		/* we're trying to reserve contiguous memzone first, so try
+		 * align to cache line; if we fail to reserve a contiguous
+		 * memzone, we'll adjust alignment to equal pagesize later.
+		 */
 		align = RTE_CACHE_LINE_SIZE;
 	} else {
 		pg_sz = getpagesize();
@@ -575,8 +647,13 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
-						mp->flags);
+		unsigned int flags;
+		if (try_contig || no_pageshift)
+			size = rte_mempool_xmem_size(n, total_elt_sz, 0,
+				mp->flags);
+		else
+			size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift,
+				mp->flags);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -585,23 +662,52 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		mz = rte_memzone_reserve_aligned(mz_name, size,
-			mp->socket_id, mz_flags, align);
-		/* not enough memory, retry with the biggest zone we have */
-		if (mz == NULL)
+		flags = mz_flags;
+
+		/* if we're trying to reserve contiguous memory, add appropriate
+		 * memzone flag.
+		 */
+		if (try_contig)
+			flags |= RTE_MEMZONE_IOVA_CONTIG;
+
+		mz = rte_memzone_reserve_aligned(mz_name, size, mp->socket_id,
+				flags, align);
+
+		/* if we were trying to allocate contiguous memory, adjust
+		 * memzone size and page size to fit smaller page sizes, and
+		 * try again.
+		 */
+		if (mz == NULL && try_contig) {
+			try_contig = false;
+			flags &= ~RTE_MEMZONE_IOVA_CONTIG;
+			align = pg_sz;
+			size = rte_mempool_xmem_size(n, total_elt_sz,
+				pg_shift, mp->flags);
+
+			mz = rte_memzone_reserve_aligned(mz_name, size,
+				mp->socket_id, flags, align);
+		}
+		/* don't try reserving with 0 size if we were asked to reserve
+		 * IOVA-contiguous memory.
+		 */
+		if (!force_contig && mz == NULL) {
+			/* not enough memory, retry with the biggest zone we
+			 * have
+			 */
 			mz = rte_memzone_reserve_aligned(mz_name, 0,
-				mp->socket_id, mz_flags, align);
+					mp->socket_id, flags, align);
+		}
 		if (mz == NULL) {
 			ret = -rte_errno;
 			goto fail;
 		}
 
-		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		if (no_contig)
 			iova = RTE_BAD_IOVA;
 		else
 			iova = mz->iova;
 
-		if (rte_eal_has_hugepages())
+		if (no_pageshift || try_contig)
 			ret = rte_mempool_populate_iova(mp, mz->addr,
 				iova, mz->len,
 				rte_mempool_memchunk_mz_free,
-- 
2.7.4


* [PATCH v6 25/70] eal: add function to walk all memsegs
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (25 preceding siblings ...)
  2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
                           ` (44 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For code that needs to iterate over the list of allocated
segments, using this API makes it more resilient to internal
API changes and avoids copying the same iteration code over
and over again.

Additionally, locking will be implemented down the line, so
users of this API will not need to care about locking either.
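
For example, totalling up all allocated memory becomes a short
callback plus one call (a minimal usage sketch; the function names
here are illustrative):

#include <stdint.h>
#include <rte_memory.h>

/* callback: accumulate the length of every allocated memseg */
static int
sum_seg_len(const struct rte_memseg *ms, void *arg)
{
	uint64_t *total = arg;

	*total += ms->len;
	return 0; /* returning 0 continues the walk */
}

static uint64_t
total_mem(void)
{
	uint64_t total = 0;

	rte_memseg_walk(sum_seg_len, &total);
	return total;
}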

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 21 +++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 25 +++++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 47 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 5b8ced4..947db1f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -218,6 +218,27 @@ rte_mem_lock_page(const void *virt)
 	return mlock((void *)aligned, page_size);
 }
 
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+
+		if (ms->addr == NULL)
+			continue;
+
+		ret = func(ms, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 302f865..93eadaa 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -20,6 +20,7 @@ extern "C" {
 #endif
 
 #include <rte_common.h>
+#include <rte_compat.h>
 #include <rte_config.h>
 
 __extension__
@@ -130,6 +131,30 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Memseg walk function prototype.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+
+/**
+ * Walk list of all memsegs.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_walk(rte_memseg_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d9fc458..716b965 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
 	rte_mp_reply;
-- 
2.7.4


* [PATCH v6 26/70] bus/fslmc: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (26 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 25/70] eal: add function to walk all memsegs Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 27/70] bus/pci: " Anatoly Burakov
                           ` (43 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 78 ++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 4291871..0c048dc 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -189,17 +189,51 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 {
-	int ret;
+	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
 		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
 	};
+	int ret;
+
+	dma_map.size = ms->len;
+	dma_map.vaddr = ms->addr_64;
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	dma_map.iova = ms->iova;
+#else
+	dma_map.iova = dma_map.vaddr;
+#endif
+
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
+		return -1;
+	}
+
+	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
+			dma_map.vaddr);
+	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
+			&dma_map);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
+				errno);
+		return -1;
+	}
+	(*n_segs)++;
+	return 0;
+}
 
-	int i;
+int rte_fslmc_vfio_dmamap(void)
+{
 	const struct rte_memseg *memseg;
+	int i = 0;
 
 	if (is_dma_done)
 		return 0;
@@ -210,51 +244,21 @@ int rte_fslmc_vfio_dmamap(void)
 		return -ENODEV;
 	}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (memseg[i].addr == NULL && memseg[i].len == 0) {
-			DPAA2_BUS_DEBUG("Total %d segments found", i);
-			break;
-		}
-
-		dma_map.size = memseg[i].len;
-		dma_map.vaddr = memseg[i].addr_64;
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-		dma_map.iova = memseg[i].iova;
-#else
-		dma_map.iova = dma_map.vaddr;
-#endif
-
-		/* SET DMA MAP for IOMMU */
-		group = &vfio_group;
-
-		if (!group->container) {
-			DPAA2_BUS_ERR("Container is not connected");
-			return -1;
-		}
-
-		DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-				dma_map.vaddr);
-		DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-		ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			    &dma_map);
-		if (ret) {
-			DPAA2_BUS_ERR("Unable to map DMA address (errno = %d)",
-				      errno);
-			return ret;
-		}
-	}
+	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+		return -1;
 
 	/* Verifying that at least single segment is available */
 	if (i <= 0) {
 		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
 		return -1;
 	}
+	DPAA2_BUS_DEBUG("Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
 	 * the interrupt region to SMMU. This should be removed once the
 	 * support is added in the Kernel.
 	 */
-	vfio_map_irq_region(group);
+	vfio_map_irq_region(&vfio_group);
 
 	is_dma_done = 1;
 
-- 
2.7.4


* [PATCH v6 27/70] bus/pci: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (27 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 28/70] net/mlx5: " Anatoly Burakov
                           ` (42 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/pci/Makefile    |  3 +++
 drivers/bus/pci/linux/pci.c | 26 ++++++++++++++------------
 drivers/bus/pci/meson.build |  3 +++
 3 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/pci/Makefile b/drivers/bus/pci/Makefile
index f3df1c4..804a198 100644
--- a/drivers/bus/pci/Makefile
+++ b/drivers/bus/pci/Makefile
@@ -49,6 +49,9 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/pci/$(SYSTEM)
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/$(SYSTEM)app/eal
 
+# memseg walk is not part of stable API yet
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_pci
 
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index abde641..6dda054 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -116,22 +116,24 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 	}
 }
 
-void *
-pci_find_max_end_va(void)
+static int
+find_max_end_va(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *seg = rte_eal_get_physmem_layout();
-	const struct rte_memseg *last = seg;
-	unsigned i = 0;
+	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	void **max_va = arg;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++, seg++) {
-		if (seg->addr == NULL)
-			break;
+	if (*max_va < end_va)
+		*max_va = end_va;
+	return 0;
+}
 
-		if (seg->addr > last->addr)
-			last = seg;
+void *
+pci_find_max_end_va(void)
+{
+	void *va = NULL;
 
-	}
-	return RTE_PTR_ADD(last->addr, last->len);
+	rte_memseg_walk(find_max_end_va, &va);
+	return va;
 }
 
 /* parse one line of the "resource" sysfs file (note that the 'line'
diff --git a/drivers/bus/pci/meson.build b/drivers/bus/pci/meson.build
index 12756a4..72939e5 100644
--- a/drivers/bus/pci/meson.build
+++ b/drivers/bus/pci/meson.build
@@ -14,3 +14,6 @@ else
 	sources += files('bsd/pci.c')
 	includes += include_directories('bsd')
 endif
+
+# memseg walk is not part of stable API yet
+allow_experimental_apis = true
-- 
2.7.4


* [PATCH v6 28/70] net/mlx5: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (28 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 27/70] bus/pci: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 29/70] eal: " Anatoly Burakov
                           ` (41 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/mlx5/Makefile |  3 +++
 drivers/net/mlx5/mlx5.c   | 24 +++++++++++++++---------
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index e0eeea1..6c506a4 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -65,6 +65,9 @@ CFLAGS += -Wno-error=cast-qual
 EXPORT_MAP := rte_pmd_mlx5_version.map
 LIBABIVER := 1
 
+# memseg walk is not part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # DEBUG which is usually provided on the command-line may enable
 # CONFIG_RTE_LIBRTE_MLX5_DEBUG.
 ifeq ($(DEBUG),1)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 99b6223..00c2c86 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -477,6 +477,19 @@ static struct rte_pci_driver mlx5_driver;
  */
 static void *uar_base;
 
+static int
+find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+{
+	void **addr = arg;
+
+	if (*addr == NULL)
+		*addr = ms->addr;
+	else
+		*addr = RTE_MIN(*addr, ms->addr);
+
+	return 0;
+}
+
 /**
  * Reserve UAR address space for primary process.
  *
@@ -491,21 +504,14 @@ mlx5_uar_init_primary(struct rte_eth_dev *dev)
 {
 	struct priv *priv = dev->data->dev_private;
 	void *addr = (void *)0;
-	int i;
-	const struct rte_mem_config *mcfg;
 
 	if (uar_base) { /* UAR address space mapped. */
 		priv->uar_base = uar_base;
 		return 0;
 	}
 	/* find out lower bound of hugepage segments */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	for (i = 0; i < RTE_MAX_MEMSEG && mcfg->memseg[i].addr; i++) {
-		if (addr)
-			addr = RTE_MIN(addr, mcfg->memseg[i].addr);
-		else
-			addr = mcfg->memseg[i].addr;
-	}
+	rte_memseg_walk(find_lower_va_bound, &addr);
+
 	/* keep distance to hugepages to minimize potential conflicts. */
 	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
 	/* anonymous mmap, no real memory consumption. */
-- 
2.7.4


* [PATCH v6 29/70] eal: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (29 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 28/70] net/mlx5: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 30/70] mempool: " Anatoly Burakov
                           ` (40 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal.c           | 25 +++++++-----
 lib/librte_eal/common/eal_common_memory.c | 67 ++++++++++++++++---------------
 lib/librte_eal/common/malloc_heap.c       | 33 +++++++++------
 lib/librte_eal/linuxapp/eal/eal.c         | 22 +++++-----
 4 files changed, 81 insertions(+), 66 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5..8e25d78 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -429,23 +429,26 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_socket(const struct rte_memseg *ms, void *arg)
+{
+	int *socket_id = arg;
+
+	if (ms->socket_id == *socket_id)
+		return 1;
+
+	return 0;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 947db1f..4f588c7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,54 +131,57 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+static int
+physmem_size(const struct rte_memseg *ms, void *arg)
+{
+	uint64_t *total_len = arg;
+
+	*total_len += ms->len;
+
+	return 0;
+}
 
 /* get the total size of memory */
 uint64_t
 rte_eal_get_physmem_size(void)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
 	uint64_t total_len = 0;
 
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
+	rte_memseg_walk(physmem_size, &total_len);
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
+	return total_len;
+}
 
-		total_len += mcfg->memseg[i].len;
-	}
+static int
+dump_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i = ms - mcfg->memseg;
+	FILE *f = arg;
 
-	return total_len;
+	if (i < 0 || i >= RTE_MAX_MEMSEG)
+		return -1;
+
+	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+			"virt:%p, socket_id:%"PRId32", "
+			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
+			"nrank:%"PRIx32"\n", i,
+			mcfg->memseg[i].iova,
+			mcfg->memseg[i].len,
+			mcfg->memseg[i].addr,
+			mcfg->memseg[i].socket_id,
+			mcfg->memseg[i].hugepage_sz,
+			mcfg->memseg[i].nchannel,
+			mcfg->memseg[i].nrank);
+
+	return 0;
 }
 
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
 {
-	const struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (mcfg->memseg[i].addr == NULL)
-			break;
-
-		fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
-		       "virt:%p, socket_id:%"PRId32", "
-		       "hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-		       "nrank:%"PRIx32"\n", i,
-		       mcfg->memseg[i].iova,
-		       mcfg->memseg[i].len,
-		       mcfg->memseg[i].addr,
-		       mcfg->memseg[i].socket_id,
-		       mcfg->memseg[i].hugepage_sz,
-		       mcfg->memseg[i].nchannel,
-		       mcfg->memseg[i].nrank);
-	}
+	rte_memseg_walk(dump_memseg, f);
 }
 
 /* return the number of memory channels */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 564b61a..79914fc 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -67,17 +67,32 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
  * to prevent overflow. The rest of the zone is added to free list as a single
  * large free block
  */
-static void
-malloc_heap_add_memseg(struct malloc_heap *heap, struct rte_memseg *ms)
+static int
+malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
 {
-	struct malloc_elem *start_elem = (struct malloc_elem *)ms->addr;
-	const size_t elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_elem *start_elem;
+	struct rte_memseg *found_ms;
+	struct malloc_heap *heap;
+	size_t elem_size;
+	int ms_idx;
+
+	heap = &mcfg->malloc_heaps[ms->socket_id];
+
+	/* ms is const, so find it */
+	ms_idx = ms - mcfg->memseg;
+	found_ms = &mcfg->memseg[ms_idx];
 
-	malloc_elem_init(start_elem, heap, ms, elem_size);
+	start_elem = (struct malloc_elem *)found_ms->addr;
+	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+
+	malloc_elem_init(start_elem, heap, found_ms, elem_size);
 	malloc_elem_insert(start_elem);
 	malloc_elem_free_list_insert(start_elem);
 
 	heap->total_size += elem_size;
+
+	return 0;
 }
 
 /*
@@ -244,17 +259,11 @@ int
 rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	unsigned ms_cnt;
-	struct rte_memseg *ms;
 
 	if (mcfg == NULL)
 		return -1;
 
-	for (ms = &mcfg->memseg[0], ms_cnt = 0;
-			(ms_cnt < RTE_MAX_MEMSEG) && (ms->len > 0);
-			ms_cnt++, ms++) {
-		malloc_heap_add_memseg(&mcfg->malloc_heaps[ms->socket_id], ms);
-	}
+	rte_memseg_walk(malloc_heap_add_memseg, NULL);
 
 	return 0;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ecd07b..77f6cb7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -638,23 +638,23 @@ eal_parse_args(int argc, char **argv)
 	return ret;
 }
 
+static int
+check_mem(const struct rte_memseg *ms, void *arg)
+{
+	int *socket = arg;
+
+	return ms->socket_id == *socket;
+}
+
 static void
 eal_check_mem_on_local_socket(void)
 {
-	const struct rte_memseg *ms;
-	int i, socket_id;
+	int socket_id;
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	ms = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++)
-		if (ms[i].socket_id == socket_id &&
-				ms[i].len > 0)
-			return;
-
-	RTE_LOG(WARNING, EAL, "WARNING: Master core has no "
-			"memory on local socket!\n");
+	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
 static int
-- 
2.7.4


* [PATCH v6 30/70] mempool: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (30 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 29/70] eal: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 31/70] test: " Anatoly Burakov
                           ` (39 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_mempool/Makefile      |  3 +++
 lib/librte_mempool/meson.build   |  3 +++
 lib/librte_mempool/rte_mempool.c | 24 ++++++++++++------------
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 24e735a..1f85d34 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -13,6 +13,9 @@ EXPORT_MAP := rte_mempool_version.map
 
 LIBABIVER := 3
 
+# memseg walk is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index 712720f..89506c5 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -5,3 +5,6 @@ version = 3
 sources = files('rte_mempool.c', 'rte_mempool_ops.c')
 headers = files('rte_mempool.h')
 deps += ['ring']
+
+# memseg walk is not yet part of stable API
+allow_experimental_apis = true
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4660cc2..9731d4c 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -99,23 +99,23 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+static int
+find_min_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	size_t *min = arg;
+
+	if (ms->hugepage_sz < *min)
+		*min = ms->hugepage_sz;
+
+	return 0;
+}
+
 static size_t
 get_min_page_size(void)
 {
-	const struct rte_mem_config *mcfg =
-			rte_eal_get_configuration()->mem_config;
-	int i;
 	size_t min_pagesz = SIZE_MAX;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-
-		if (ms->addr == NULL)
-			continue;
-
-		if (ms->hugepage_sz < min_pagesz)
-			min_pagesz = ms->hugepage_sz;
-	}
+	rte_memseg_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
-- 
2.7.4


* [PATCH v6 31/70] test: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (31 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 30/70] mempool: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 32/70] vfio/type1: " Anatoly Burakov
                           ` (38 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 test/test/test_malloc.c  | 40 +++++++++++++++++++++++-------------
 test/test/test_memory.c  | 23 +++++++++++----------
 test/test/test_memzone.c | 53 ++++++++++++++++++++++++++++++++----------------
 3 files changed, 74 insertions(+), 42 deletions(-)

diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index ccc5fea..28c241f 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -705,16 +705,34 @@ test_malloc_bad_params(void)
 	return -1;
 }
 
+static int
+check_socket_mem(const struct rte_memseg *ms, void *arg)
+{
+	int32_t *socket = arg;
+
+	return *socket == ms->socket_id;
+}
+
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
+	return rte_memseg_walk(check_socket_mem, &socket);
+}
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (socket == ms[i].socket_id)
-			return 1;
+struct walk_param {
+	void *addr;
+	int32_t socket;
+};
+static int
+find_socket(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_param *param = arg;
+
+	if (param->addr >= ms->addr &&
+			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
+		param->socket = ms->socket_id;
+		return 1;
 	}
 	return 0;
 }
@@ -726,15 +744,9 @@ is_mem_on_socket(int32_t socket)
 static int32_t
 addr_to_socket(void * addr)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	unsigned i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if ((ms[i].addr <= addr) &&
-				((uintptr_t)addr <
-				((uintptr_t)ms[i].addr + (uintptr_t)ms[i].len)))
-			return ms[i].socket_id;
-	}
+	struct walk_param param = {.addr = addr, .socket = 0};
+	if (rte_memseg_walk(find_socket, &param) > 0)
+		return param.socket;
 	return -1;
 }
 
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index 972321f..c9b287c 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -23,12 +23,20 @@
  */
 
 static int
+check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+{
+	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
+	size_t i;
+
+	for (i = 0; i < ms->len; i++, mem++)
+		*mem;
+	return 0;
+}
+
+static int
 test_memory(void)
 {
 	uint64_t s;
-	unsigned i;
-	size_t j;
-	const struct rte_memseg *mem;
 
 	/*
 	 * dump the mapped memory: the python-expect script checks
@@ -45,14 +53,7 @@ test_memory(void)
 	}
 
 	/* try to read memory (should not segfault) */
-	mem = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG && mem[i].addr != NULL ; i++) {
-
-		/* check memory */
-		for (j = 0; j<mem[i].len; j++) {
-			*((volatile uint8_t *) mem[i].addr + j);
-		}
-	}
+	rte_memseg_walk(check_mem, NULL);
 
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 8ece1ac..cbf0cfa 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -104,28 +104,47 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void)
 	return 0;
 }
 
+struct walk_arg {
+	int hugepage_2MB_avail;
+	int hugepage_1GB_avail;
+	int hugepage_16MB_avail;
+	int hugepage_16GB_avail;
+};
+static int
+find_available_pagesz(const struct rte_memseg *ms, void *arg)
+{
+	struct walk_arg *wa = arg;
+
+	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+		wa->hugepage_2MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+		wa->hugepage_1GB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+		wa->hugepage_16MB_avail = 1;
+	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+		wa->hugepage_16GB_avail = 1;
+
+	return 0;
+}
+
 static int
 test_memzone_reserve_flags(void)
 {
 	const struct rte_memzone *mz;
-	const struct rte_memseg *ms;
-	int hugepage_2MB_avail = 0;
-	int hugepage_1GB_avail = 0;
-	int hugepage_16MB_avail = 0;
-	int hugepage_16GB_avail = 0;
+	struct walk_arg wa;
+	int hugepage_2MB_avail, hugepage_1GB_avail;
+	int hugepage_16MB_avail, hugepage_16GB_avail;
 	const size_t size = 100;
-	int i = 0;
-	ms = rte_eal_get_physmem_layout();
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].hugepage_sz == RTE_PGSIZE_2M)
-			hugepage_2MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
-			hugepage_1GB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
-			hugepage_16MB_avail = 1;
-		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
-			hugepage_16GB_avail = 1;
-	}
+
+	memset(&wa, 0, sizeof(wa));
+
+	rte_memseg_walk(find_available_pagesz, &wa);
+
+	hugepage_2MB_avail = wa.hugepage_2MB_avail;
+	hugepage_1GB_avail = wa.hugepage_1GB_avail;
+	hugepage_16MB_avail = wa.hugepage_16MB_avail;
+	hugepage_16GB_avail = wa.hugepage_16GB_avail;
+
 	/* Display the availability of 2MB ,1GB, 16MB, 16GB pages */
 	if (hugepage_2MB_avail)
 		printf("2MB Huge pages available\n");
-- 
2.7.4


* [PATCH v6 32/70] vfio/type1: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (32 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 31/70] test: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 33/70] vfio/spapr: " Anatoly Burakov
                           ` (37 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.
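
For readers following the conversion, the per-segment callback pattern used
throughout this series looks roughly like the sketch below (count_cb() and
the counting logic are hypothetical; rte_memseg_walk() is the iterator
introduced earlier in the series):

#include <rte_memory.h>

/* invoked once per memseg; returning 0 continues the walk */
static int
count_cb(const struct rte_memseg *ms __rte_unused, void *arg)
{
	int *n = arg;

	(*n)++;
	return 0;
}

static int
count_memsegs(void)
{
	int n = 0;

	rte_memseg_walk(count_cb, &n);
	return n;
}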

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 45 ++++++++++++++++------------------
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2421d51..2a34ae9 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -665,39 +665,36 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-vfio_type1_dma_map(int vfio_container_fd)
+type1_map(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	int *vfio_container_fd = arg;
+	struct vfio_iommu_type1_dma_map dma_map;
+	int ret;
 
-	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
-		if (ms[i].addr == NULL)
-			break;
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-					  "error %i (%s)\n", errno,
-					  strerror(errno));
-			return -1;
-		}
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
 	}
-
 	return 0;
 }
 
 static int
+vfio_type1_dma_map(int vfio_container_fd)
+{
+	return rte_memseg_walk(type1_map, &vfio_container_fd);
+}
+
+static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 33/70] vfio/spapr: use memseg walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (33 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 32/70] vfio/type1: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 34/70] eal: add contig walk function Anatoly Burakov
                           ` (36 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 108 +++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 45 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2a34ae9..fb41e82 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -694,16 +694,69 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+struct spapr_walk_param {
+	uint64_t window_size;
+	uint64_t hugepage_sz;
+};
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+spapr_window_size(const struct rte_memseg *ms, void *arg)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	int i, ret;
+	struct spapr_walk_param *param = arg;
+	uint64_t max = ms->iova + ms->len;
+
+	if (max > param->window_size) {
+		param->hugepage_sz = ms->hugepage_sz;
+		param->window_size = max;
+	}
 
+	return 0;
+}
+
+static int
+spapr_map(const struct rte_memseg *ms, void *arg)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
 	struct vfio_iommu_spapr_register_memory reg = {
 		.argsz = sizeof(reg),
 		.flags = 0
 	};
+	int *vfio_container_fd = arg;
+	int ret;
+
+	reg.vaddr = (uintptr_t) ms->addr;
+	reg.size = ms->len;
+	ret = ioctl(*vfio_container_fd,
+		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	memset(&dma_map, 0, sizeof(dma_map));
+	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+	dma_map.vaddr = ms->addr_64;
+	dma_map.size = ms->len;
+	dma_map.iova = ms->iova;
+	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+			 VFIO_DMA_MAP_FLAG_WRITE;
+
+	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+		return -1;
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct spapr_walk_param param;
+	int ret;
 	struct vfio_iommu_spapr_tce_info info = {
 		.argsz = sizeof(info),
 	};
@@ -714,6 +767,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 		.argsz = sizeof(remove),
 	};
 
+	memset(&param, 0, sizeof(param));
+
 	/* query spapr iommu info */
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
@@ -732,17 +787,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* create DMA window from 0 to max(phys_addr + len) */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		if (ms[i].addr == NULL)
-			break;
-
-		create.window_size = RTE_MAX(create.window_size,
-				ms[i].iova + ms[i].len);
-	}
+	rte_memseg_walk(spapr_window_size, &param);
 
 	/* sPAPR requires window size to be a power of 2 */
-	create.window_size = rte_align64pow2(create.window_size);
-	create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
 	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
@@ -758,39 +807,8 @@ vfio_spapr_dma_map(int vfio_container_fd)
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		struct vfio_iommu_type1_dma_map dma_map;
-
-		if (ms[i].addr == NULL)
-			break;
-
-		reg.vaddr = (uintptr_t) ms[i].addr;
-		reg.size = ms[i].len;
-		ret = ioctl(vfio_container_fd,
-			VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-		memset(&dma_map, 0, sizeof(dma_map));
-		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-		dma_map.vaddr = ms[i].addr_64;
-		dma_map.size = ms[i].len;
-		dma_map.iova = ms[i].iova;
-		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-				 VFIO_DMA_MAP_FLAG_WRITE;
-
-		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-		if (ret) {
-			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
-				"error %i (%s)\n", errno, strerror(errno));
-			return -1;
-		}
-
-	}
+	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+		return -1;
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 34/70] eal: add contig walk function
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (34 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 33/70] vfio/spapr: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
                           ` (35 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This function is meant to walk over the first segment of each
VA-contiguous group of memsegs.

For future users of this function, this is done so that there is
less dependency on internals of the mem API and less noise in
later change sets.
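
To illustrate the intended usage, here is a minimal sketch of a caller
that sums up all VA-contiguous areas; the callback and helper names are
hypothetical, only the rte_memseg_contig_walk() prototype comes from this
patch:

#include <stdio.h>
#include <rte_memory.h>

/* called once per VA-contiguous area; 'len' covers the whole area */
static int
print_contig(const struct rte_memseg *ms, size_t len, void *arg)
{
	size_t *total = arg;

	printf("VA-contiguous area at %p, %zu bytes\n", ms->addr, len);
	*total += len;
	return 0; /* 0 continues the walk, 1 stops it, -1 reports an error */
}

static size_t
total_contig_mem(void) /* call after rte_eal_init() */
{
	size_t total = 0;

	rte_memseg_contig_walk(print_contig, &total);
	return total;
}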

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 27 ++++++++++++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 65 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4f588c7..4b528b0 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -242,6 +242,43 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	return 0;
 }
 
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, j, ret;
+
+	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+		const struct rte_memseg *ms = &mcfg->memseg[i];
+		size_t total_len;
+		void *end_addr;
+
+		if (ms->addr == NULL)
+			continue;
+
+		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+
+		/* check how many more segments are contiguous to this one */
+		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
+			const struct rte_memseg *next = &mcfg->memseg[j];
+
+			if (next->addr != end_addr)
+				break;
+
+			end_addr = RTE_PTR_ADD(next->addr, next->len);
+			i++;
+		}
+		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+
+		ret = func(ms, total_len, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 93eadaa..45d067f 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -140,6 +140,18 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
 
 /**
+ * Memseg contig walk function prototype. This will trigger a callback on every
+ * VA-contiguous area starting at memseg ``ms``, so total valid VA space at each
+ * callback call will be [``ms->addr``, ``ms->addr + len``).
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
+		size_t len, void *arg);
+
+/**
  * Walk list of all memsegs.
  *
  * @param func
@@ -155,6 +167,21 @@ int __rte_experimental
 rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 
 /**
+ * Walk each VA-contiguous area.
+ *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
+ * @return
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
+ */
+int __rte_experimental
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
+
+/**
  * Get the layout of the available physical memory.
  *
  * It can be useful for an application to have the full physical
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 716b965..93033b5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 35/70] virtio: use memseg contig walk instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (35 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 34/70] eal: add contig walk function Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 36/70] eal: add iova2virt function Anatoly Burakov
                           ` (34 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Maxime Coquelin, Tiwei Bie, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/virtio/virtio_user/vhost_kernel.c | 83 +++++++++++----------------
 1 file changed, 35 insertions(+), 48 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 1711ead..93d7efe 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,6 +70,32 @@ static uint64_t vhost_req_user_to_kernel[] = {
 	[VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
 };
 
+struct walk_arg {
+	struct vhost_memory_kernel *vm;
+	uint32_t region_nr;
+};
+static int
+add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct walk_arg *wa = arg;
+	struct vhost_memory_region *mr;
+	void *start_addr;
+
+	if (wa->region_nr >= max_regions)
+		return -1;
+
+	mr = &wa->vm->regions[wa->region_nr++];
+	start_addr = ms->addr;
+
+	mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
+	mr->memory_size = len;
+	mr->mmap_offset = 0;
+
+	return 0;
+}
+
+
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
@@ -77,63 +103,24 @@ static uint64_t vhost_req_user_to_kernel[] = {
 static struct vhost_memory_kernel *
 prepare_vhost_memory_kernel(void)
 {
-	uint32_t i, j, k = 0;
-	struct rte_memseg *seg;
-	struct vhost_memory_region *mr;
 	struct vhost_memory_kernel *vm;
+	struct walk_arg wa;
 
 	vm = malloc(sizeof(struct vhost_memory_kernel) +
-		    max_regions *
-		    sizeof(struct vhost_memory_region));
+			max_regions *
+			sizeof(struct vhost_memory_region));
 	if (!vm)
 		return NULL;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; ++i) {
-		seg = &rte_eal_get_configuration()->mem_config->memseg[i];
-		if (!seg->addr)
-			break;
-
-		int new_region = 1;
-
-		for (j = 0; j < k; ++j) {
-			mr = &vm->regions[j];
+	wa.region_nr = 0;
+	wa.vm = vm;
 
-			if (mr->userspace_addr + mr->memory_size ==
-			    (uint64_t)(uintptr_t)seg->addr) {
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-
-			if ((uint64_t)(uintptr_t)seg->addr + seg->len ==
-			    mr->userspace_addr) {
-				mr->guest_phys_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->userspace_addr =
-					(uint64_t)(uintptr_t)seg->addr;
-				mr->memory_size += seg->len;
-				new_region = 0;
-				break;
-			}
-		}
-
-		if (new_region == 0)
-			continue;
-
-		mr = &vm->regions[k++];
-		/* use vaddr here! */
-		mr->guest_phys_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->userspace_addr = (uint64_t)(uintptr_t)seg->addr;
-		mr->memory_size = seg->len;
-		mr->mmap_offset = 0;
-
-		if (k >= max_regions) {
-			free(vm);
-			return NULL;
-		}
+	if (rte_memseg_contig_walk(add_memory_region, &wa) < 0) {
+		free(vm);
+		return NULL;
 	}
 
-	vm->nregions = k;
+	vm->nregions = wa.region_nr;
 	vm->padding = 0;
 	return vm;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 36/70] eal: add iova2virt function
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (36 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
                           ` (33 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This is a reverse lookup of PA to VA. Using this will make
other code less dependent on internals of the mem API.
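
As an illustration, a driver-side conversion helper can now collapse to a
single call (my_mem_ptov() is a hypothetical name; only rte_mem_iova2virt()
is added by this patch):

#include <rte_memory.h>

static inline void *
my_mem_ptov(rte_iova_t paddr)
{
	/* returns NULL if 'paddr' is not backed by DPDK-owned memory */
	return rte_mem_iova2virt(paddr);
}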

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 12 ++++++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 43 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 4b528b0..ea3c5a7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -131,6 +131,36 @@ rte_eal_get_physmem_layout(void)
 	return rte_eal_get_configuration()->mem_config->memseg;
 }
 
+struct virtiova {
+	rte_iova_t iova;
+	void *virt;
+};
+static int
+find_virt(const struct rte_memseg *ms, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova)
+{
+	struct virtiova vi;
+
+	memset(&vi, 0, sizeof(vi));
+
+	vi.iova = iova;
+	rte_memseg_walk(find_virt, &vi);
+
+	return vi.virt;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 45d067f..5c60b91 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -131,6 +131,18 @@ phys_addr_t rte_mem_virt2phy(const void *virt);
 rte_iova_t rte_mem_virt2iova(const void *virt);
 
 /**
+ * Get virtual memory address corresponding to iova address.
+ *
+ * @param iova
+ *   The iova address.
+ * @return
+ *   Virtual address corresponding to iova address (or NULL if address does not
+ *   exist within DPDK memory map).
+ */
+__rte_experimental void *
+rte_mem_iova2virt(rte_iova_t iova);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 93033b5..dccfc35 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
 	rte_eal_mbuf_user_pool_ops;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_iova2virt;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 37/70] bus/dpaa: use iova2virt instead of memseg iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (37 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 36/70] eal: add iova2virt function Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 38/70] bus/fslmc: " Anatoly Burakov
                           ` (32 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/dpaa/rte_dpaa_bus.h  | 12 +-----------
 drivers/mempool/dpaa/Makefile    |  3 +++
 drivers/mempool/dpaa/meson.build |  3 +++
 drivers/net/dpaa/Makefile        |  3 +++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h
index 718701b..89aeac2 100644
--- a/drivers/bus/dpaa/rte_dpaa_bus.h
+++ b/drivers/bus/dpaa/rte_dpaa_bus.h
@@ -98,17 +98,7 @@ struct dpaa_portal {
 /* TODO - this is costly, need to write a fast conversion routine */
 static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr != NULL; i++) {
-		if (paddr >= memseg[i].iova && paddr <
-			memseg[i].iova + memseg[i].len)
-			return (uint8_t *)(memseg[i].addr) +
-			       (paddr - memseg[i].iova);
-	}
-
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 /**
diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile
index 4c0d7aa..da8da1e 100644
--- a/drivers/mempool/dpaa/Makefile
+++ b/drivers/mempool/dpaa/Makefile
@@ -22,6 +22,9 @@ EXPORT_MAP := rte_mempool_dpaa_version.map
 # Library version
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c
diff --git a/drivers/mempool/dpaa/meson.build b/drivers/mempool/dpaa/meson.build
index 08423c2..9163b3d 100644
--- a/drivers/mempool/dpaa/meson.build
+++ b/drivers/mempool/dpaa/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['bus_dpaa']
 sources = files('dpaa_mempool.c')
+
+# depends on dpaa bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile
index 9c2a5ea..d7a0a50 100644
--- a/drivers/net/dpaa/Makefile
+++ b/drivers/net/dpaa/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa_version.map
 
 LIBABIVER := 1
 
+# depends on dpaa bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # Interfaces with DPDK
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA_PMD) += dpaa_rxtx.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 38/70] bus/fslmc: use iova2virt instead of memseg iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (38 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 39/70] crypto/dpaa_sec: " Anatoly Burakov
                           ` (31 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, Nipun Gupta, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 13 +------------
 drivers/event/dpaa2/Makefile            |  3 +++
 drivers/mempool/dpaa2/Makefile          |  3 +++
 drivers/mempool/dpaa2/meson.build       |  3 +++
 drivers/net/dpaa2/Makefile              |  3 +++
 drivers/net/dpaa2/meson.build           |  3 +++
 6 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 4a19d42..d38fc49 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -260,21 +260,10 @@ static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused));
 /* todo - this is costly, need to write a fast conversion routine */
 static void *dpaa2_mem_ptov(phys_addr_t paddr)
 {
-	const struct rte_memseg *memseg;
-	int i;
-
 	if (dpaa2_virt_mode)
 		return (void *)(size_t)paddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64
-				+ (paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile
index b26862c..a5b68b4 100644
--- a/drivers/event/dpaa2/Makefile
+++ b/drivers/event/dpaa2/Makefile
@@ -28,6 +28,9 @@ EXPORT_MAP := rte_pmd_dpaa2_event_version.map
 
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 #
 # all source are stored in SRCS-y
 #
diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile
index f0edb32..5125ad1 100644
--- a/drivers/mempool/dpaa2/Makefile
+++ b/drivers/mempool/dpaa2/Makefile
@@ -21,6 +21,9 @@ EXPORT_MAP := rte_mempool_dpaa2_version.map
 # Library version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c
diff --git a/drivers/mempool/dpaa2/meson.build b/drivers/mempool/dpaa2/meson.build
index dee3a88..8b8b518 100644
--- a/drivers/mempool/dpaa2/meson.build
+++ b/drivers/mempool/dpaa2/meson.build
@@ -7,3 +7,6 @@ endif
 
 deps += ['mbuf', 'bus_fslmc']
 sources = files('dpaa2_hw_mempool.c')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile
index 1b707ad..9b0b143 100644
--- a/drivers/net/dpaa2/Makefile
+++ b/drivers/net/dpaa2/Makefile
@@ -27,6 +27,9 @@ EXPORT_MAP := rte_pmd_dpaa2_version.map
 # library version
 LIBABIVER := 1
 
+# depends on fslmc bus which uses experimental API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += base/dpaa2_hw_dpni.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_PMD) += dpaa2_ethdev.c
diff --git a/drivers/net/dpaa2/meson.build b/drivers/net/dpaa2/meson.build
index ad1724d..8e96b5a 100644
--- a/drivers/net/dpaa2/meson.build
+++ b/drivers/net/dpaa2/meson.build
@@ -13,3 +13,6 @@ sources = files('base/dpaa2_hw_dpni.c',
 		'mc/dpni.c')
 
 includes += include_directories('base', 'mc')
+
+# depends on fslmc bus which uses experimental API
+allow_experimental_apis = true
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 39/70] crypto/dpaa_sec: use iova2virt instead of memseg iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (39 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 38/70] bus/fslmc: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 40/70] eal: add virt2memseg function Anatoly Burakov
                           ` (30 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index c5191ce..b04510f 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -120,16 +120,7 @@ dpaa_mem_vtop_ctx(struct dpaa_sec_op_ctx *ctx, void *vaddr)
 static inline void *
 dpaa_mem_ptov(rte_iova_t paddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	int i;
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (paddr >= memseg[i].iova &&
-		    paddr < memseg[i].iova + memseg[i].len)
-			return (void *)(size_t)(memseg[i].addr_64 +
-					(paddr - memseg[i].iova));
-	}
-	return NULL;
+	return rte_mem_iova2virt(paddr);
 }
 
 static void
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 40/70] eal: add virt2memseg function
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (40 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 39/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
                           ` (29 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This can be used as a virt2iova function that only looks up
memory that is owned by DPDK (as opposed to doing pagemap walks).
Using this will result in less dependency on internals of the mem API.
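
For example, a hypothetical virt2iova helper built on top of the new lookup
could look like this (my_virt2iova() is illustrative only; RTE_BAD_IOVA and
RTE_PTR_DIFF() are existing EAL helpers):

#include <rte_common.h>
#include <rte_memory.h>

static inline rte_iova_t
my_virt2iova(const void *vaddr)
{
	const struct rte_memseg *ms = rte_mem_virt2memseg(vaddr);

	if (ms == NULL)
		return RTE_BAD_IOVA; /* address not owned by DPDK */
	return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
}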

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c  | 37 ++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_memory.h | 11 +++++++++
 lib/librte_eal/rte_eal_version.map         |  1 +
 3 files changed, 49 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index ea3c5a7..fd78d2f 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -161,6 +161,43 @@ rte_mem_iova2virt(rte_iova_t iova)
 	return vi.virt;
 }
 
+struct virtms {
+	const void *virt;
+	struct rte_memseg *ms;
+};
+static int
+find_memseg(const struct rte_memseg *ms, void *arg)
+{
+	struct virtms *vm = arg;
+
+	if (arg >= ms->addr && arg < RTE_PTR_ADD(ms->addr, ms->len)) {
+		struct rte_memseg *memseg, *found_ms;
+		int idx;
+
+		memseg = rte_eal_get_configuration()->mem_config->memseg;
+		idx = ms - memseg;
+		found_ms = &memseg[idx];
+
+		vm->ms = found_ms;
+		return 1;
+	}
+	return 0;
+}
+
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *addr)
+{
+	struct virtms vm;
+
+	memset(&vm, 0, sizeof(vm));
+
+	vm.virt = addr;
+
+	rte_memseg_walk(find_memseg, &vm);
+
+	return vm.ms;
+}
+
 static int
 physmem_size(const struct rte_memseg *ms, void *arg)
 {
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 5c60b91..b3d7e61 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -143,6 +143,17 @@ __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova);
 
 /**
+ * Get memseg to which a particular virtual address belongs.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg pointer on success, or NULL on error.
+ */
+__rte_experimental struct rte_memseg *
+rte_mem_virt2memseg(const void *virt);
+
+/**
  * Memseg walk function prototype.
  *
  * Returning 0 will continue walk
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index dccfc35..79433b7 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -224,6 +224,7 @@ EXPERIMENTAL {
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
+	rte_mem_virt2memseg;
 	rte_memseg_contig_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 41/70] bus/fslmc: use virt2memseg instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (41 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 40/70] eal: add virt2memseg function Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 42/70] crypto/dpaa_sec: " Anatoly Burakov
                           ` (28 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index d38fc49..45fd41e 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -270,20 +270,14 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused));
 static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 {
 	const struct rte_memseg *memseg;
-	int i;
 
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_eal_get_physmem_layout();
-
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr >= memseg[i].addr_64 &&
-		    vaddr < memseg[i].addr_64 + memseg[i].len)
-			return memseg[i].iova
-				+ (vaddr - memseg[i].addr_64);
-	}
-	return (size_t)(NULL);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	if (memseg)
+		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
+	return (size_t)NULL;
 }
 
 /**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 42/70] crypto/dpaa_sec: use virt2memseg instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (42 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 43/70] net/mlx4: " Anatoly Burakov
                           ` (27 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Akhil Goyal, Hemant Agrawal, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b04510f..a14e669 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -93,20 +93,11 @@ dpaa_sec_alloc_ctx(dpaa_sec_session *ses)
 static inline rte_iova_t
 dpaa_mem_vtop(void *vaddr)
 {
-	const struct rte_memseg *memseg = rte_eal_get_physmem_layout();
-	uint64_t vaddr_64, paddr;
-	int i;
-
-	vaddr_64 = (size_t)vaddr;
-	for (i = 0; i < RTE_MAX_MEMSEG && memseg[i].addr_64 != 0; i++) {
-		if (vaddr_64 >= memseg[i].addr_64 &&
-		    vaddr_64 < memseg[i].addr_64 + memseg[i].len) {
-			paddr = memseg[i].iova +
-				(vaddr_64 - memseg[i].addr_64);
-
-			return (rte_iova_t)paddr;
-		}
-	}
+	const struct rte_memseg *ms;
+
+	ms = rte_mem_virt2memseg(vaddr);
+	if (ms)
+		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 43/70] net/mlx4: use virt2memseg instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (43 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 42/70] crypto/dpaa_sec: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 44/70] net/mlx5: " Anatoly Burakov
                           ` (26 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/mlx4/mlx4_mr.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 410d7a7..b7e910d 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -126,10 +126,9 @@ mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start, uintptr_t *end)
 struct mlx4_mr *
 mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 {
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
-	unsigned int i;
 	struct mlx4_mr *mr;
 
 	if (mlx4_check_mempool(mp, &start, &end) != 0) {
@@ -142,16 +141,13 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (44 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 43/70] net/mlx4: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-17  2:48           ` Yongseok Koh
  2018-04-11 12:30         ` [PATCH v6 45/70] memzone: use walk instead of iteration for dumping Anatoly Burakov
                           ` (25 subsequent siblings)
  71 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal, olivier.matz, shreyansh.jain, gowrishankar.m

Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/mlx5/mlx5_mr.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 58afeb7..c96e134 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -234,10 +234,9 @@ struct mlx5_mr *
 mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 {
 	struct priv *priv = dev->data->dev_private;
-	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+	const struct rte_memseg *ms;
 	uintptr_t start;
 	uintptr_t end;
-	unsigned int i;
 	struct mlx5_mr *mr;
 
 	mr = rte_zmalloc_socket(__func__, sizeof(*mr), 0, mp->socket_id);
@@ -261,17 +260,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	/* Save original addresses for exact MR lookup. */
 	mr->start = start;
 	mr->end = end;
+
 	/* Round start and end to page boundary if found in memory segments. */
-	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
-		uintptr_t addr = (uintptr_t)ms[i].addr;
-		size_t len = ms[i].len;
-		unsigned int align = ms[i].hugepage_sz;
+	ms = rte_mem_virt2memseg((void *)start);
+	if (ms != NULL)
+		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
+	ms = rte_mem_virt2memseg((void *)end);
+	if (ms != NULL)
+		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
-		if ((start > addr) && (start < addr + len))
-			start = RTE_ALIGN_FLOOR(start, align);
-		if ((end > addr) && (end < addr + len))
-			end = RTE_ALIGN_CEIL(end, align);
-	}
 	DRV_LOG(DEBUG,
 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
 		" region",
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 45/70] memzone: use walk instead of iteration for dumping
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (45 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 44/70] net/mlx5: " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 46/70] vfio: allow to map other memory regions Anatoly Burakov
                           ` (24 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Simplify the memzone dump code to use memzone walk, so that the
same memzone iteration code is not maintained twice.
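
For reference, rte_memzone_walk() is the existing callback-based iterator
used here; a minimal (hypothetical) caller counting reserved zones might
look like:

#include <rte_common.h>
#include <rte_memzone.h>

static void
count_memzone(const struct rte_memzone *mz __rte_unused, void *arg)
{
	(*(unsigned int *)arg)++;
}

static unsigned int
memzone_count(void)
{
	unsigned int n = 0;

	rte_memzone_walk(count_memzone, &n);
	return n;
}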

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memzone.c | 42 +++++++++++++++---------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index af68c00..d60bde7 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -360,31 +360,31 @@ rte_memzone_lookup(const char *name)
 	return memzone;
 }
 
+static void
+dump_memzone(const struct rte_memzone *mz, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	FILE *f = arg;
+	int mz_idx;
+
+	mz_idx = mz - mcfg->memzone;
+
+	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+				"socket_id:%"PRId32", flags:%"PRIx32"\n",
+			mz_idx,
+			mz->name,
+			mz->iova,
+			mz->len,
+			mz->addr,
+			mz->socket_id,
+			mz->flags);
+}
+
 /* Dump all reserved memory zones on console */
 void
 rte_memzone_dump(FILE *f)
 {
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	rte_rwlock_read_lock(&mcfg->mlock);
-	/* dump all zones */
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			break;
-		fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx"
-		       ", virt:%p, socket_id:%"PRId32", flags:%"PRIx32"\n", i,
-		       mcfg->memzone[i].name,
-		       mcfg->memzone[i].iova,
-		       mcfg->memzone[i].len,
-		       mcfg->memzone[i].addr,
-		       mcfg->memzone[i].socket_id,
-		       mcfg->memzone[i].flags);
-	}
-	rte_rwlock_read_unlock(&mcfg->mlock);
+	rte_memzone_walk(dump_memzone, f);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 46/70] vfio: allow to map other memory regions
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (46 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 45/70] memzone: use walk instead of iteration for dumping Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 47/70] eal: add legacy memory option Anatoly Burakov
                           ` (23 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m, Pawel Wodkowski

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario might arise in vhost applications (like
SPDK) where the guest sends its own memory table. To fill this gap,
provide an API to register an arbitrary address range in the VFIO
container.
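
A minimal sketch of the intended flow, assuming the application already owns
a pinned buffer and knows its IOVA (ext_va, ext_iova and ext_len are
placeholders; per the API note, at least one device must be attached to the
VFIO container before mapping):

#include <rte_vfio.h>

static int
map_external_buf(uint64_t ext_va, uint64_t ext_iova, uint64_t ext_len)
{
	/* make the externally allocated range visible to the IOMMU */
	if (rte_vfio_dma_map(ext_va, ext_iova, ext_len) < 0)
		return -1;

	/* ... run DMA against ext_iova ... */

	/* remove the mapping once the buffer is no longer used for DMA */
	return rte_vfio_dma_unmap(ext_va, ext_iova, ext_len);
}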

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal.c          |  16 +
 lib/librte_eal/common/include/rte_vfio.h |  41 ++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 708 +++++++++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |  12 +
 lib/librte_eal/rte_eal_version.map       |   2 +
 5 files changed, 705 insertions(+), 74 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 8e25d78..032a5ea 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -749,6 +749,8 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+int rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
 		      __rte_unused const char *dev_addr,
@@ -784,3 +786,17 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
 	return 0;
 }
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, __rte_unused uint64_t iova,
+		  __rte_unused uint64_t len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    __rte_unused uint64_t len)
+{
+	return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h b/lib/librte_eal/common/include/rte_vfio.h
index 249095e..d26ab01 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -127,6 +127,47 @@ int rte_vfio_noiommu_is_enabled(void);
 int
 rte_vfio_clear_group(int vfio_group_fd);
 
+/**
+ * Map memory region for use with VFIO.
+ *
+ * @note requires at least one device to be attached at the time of mapping.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be mapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be mapped.
+ *
+ * @param len
+ *   Length of memory segment being mapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int  __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len);
+
+
+/**
+ * Unmap memory region from VFIO.
+ *
+ * @param vaddr
+ *   Starting virtual address of memory to be unmapped.
+ *
+ * @param iova
+ *   Starting IOVA address of memory to be unmapped.
+ *
+ * @param len
+ *   Length of memory segment being unmapped.
+ *
+ * @return
+ *   0 if success.
+ *   -1 on error.
+ */
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index fb41e82..c1f0f87 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -2,11 +2,13 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <inttypes.h>
 #include <string.h>
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
@@ -22,19 +24,227 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
+static int vfio_spapr_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
+static int vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len,
+		int do_map);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
 	/* x86 IOMMU, otherwise known as type 1 */
-	{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+	{
+		.type_id = RTE_VFIO_TYPE1,
+		.name = "Type 1",
+		.dma_map_func = &vfio_type1_dma_map,
+		.dma_user_map_func = &vfio_type1_dma_mem_map
+	},
 	/* ppc64 IOMMU, otherwise known as spapr */
-	{ RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+	{
+		.type_id = RTE_VFIO_SPAPR,
+		.name = "sPAPR",
+		.dma_map_func = &vfio_spapr_dma_map,
+		.dma_user_map_func = &vfio_spapr_dma_mem_map
+	},
 	/* IOMMU-less mode */
-	{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+	{
+		.type_id = RTE_VFIO_NOIOMMU,
+		.name = "No-IOMMU",
+		.dma_map_func = &vfio_noiommu_dma_map,
+		.dma_user_map_func = &vfio_noiommu_dma_mem_map
+	},
 };
 
+/* hot plug/unplug of VFIO groups may cause all DMA maps to be dropped. we can
+ * recreate the mappings for DPDK segments, but we cannot do so for memory that
+ * was registered by the user themselves, so we need to store the user mappings
+ * somewhere, to recreate them later.
+ */
+#define VFIO_MAX_USER_MEM_MAPS 256
+struct user_mem_map {
+	uint64_t addr;
+	uint64_t iova;
+	uint64_t len;
+};
+static struct {
+	rte_spinlock_t lock;
+	int n_maps;
+	struct user_mem_map maps[VFIO_MAX_USER_MEM_MAPS];
+} user_mem_maps = {
+	.lock = RTE_SPINLOCK_INITIALIZER
+};
+
+static int
+is_null_map(const struct user_mem_map *map)
+{
+	return map->addr == 0 && map->iova == 0 && map->len == 0;
+}
+
+/* we may need to merge user mem maps together in case of user mapping/unmapping
+ * chunks of memory, so we'll need a comparator function to sort segments.
+ */
+static int
+user_mem_map_cmp(const void *a, const void *b)
+{
+	const struct user_mem_map *umm_a = a;
+	const struct user_mem_map *umm_b = b;
+
+	/* move null entries to end */
+	if (is_null_map(umm_a))
+		return 1;
+	if (is_null_map(umm_b))
+		return -1;
+
+	/* sort by iova first */
+	if (umm_a->iova < umm_b->iova)
+		return -1;
+	if (umm_a->iova > umm_b->iova)
+		return 1;
+
+	if (umm_a->addr < umm_b->addr)
+		return -1;
+	if (umm_a->addr > umm_b->addr)
+		return 1;
+
+	if (umm_a->len < umm_b->len)
+		return -1;
+	if (umm_a->len > umm_b->len)
+		return 1;
+
+	return 0;
+}
+
+/* adjust user map entry. this may result in shortening of existing map, or in
+ * splitting existing map in two pieces.
+ */
+static void
+adjust_map(struct user_mem_map *src, struct user_mem_map *end,
+		uint64_t remove_va_start, uint64_t remove_len)
+{
+	/* if va start is same as start address, we're simply moving start */
+	if (remove_va_start == src->addr) {
+		src->addr += remove_len;
+		src->iova += remove_len;
+		src->len -= remove_len;
+	} else if (remove_va_start + remove_len == src->addr + src->len) {
+		/* we're shrinking mapping from the end */
+		src->len -= remove_len;
+	} else {
+		/* we're blowing a hole in the middle */
+		struct user_mem_map tmp;
+		uint64_t total_len = src->len;
+
+		/* adjust source segment length */
+		src->len = remove_va_start - src->addr;
+
+		/* create temporary segment in the middle */
+		tmp.addr = src->addr + src->len;
+		tmp.iova = src->iova + src->len;
+		tmp.len = remove_len;
+
+		/* populate end segment - this one we will be keeping */
+		end->addr = tmp.addr + tmp.len;
+		end->iova = tmp.iova + tmp.len;
+		end->len = total_len - src->len - tmp.len;
+	}
+}
+
+/* try merging two maps into one, return 1 if succeeded */
+static int
+merge_map(struct user_mem_map *left, struct user_mem_map *right)
+{
+	if (left->addr + left->len != right->addr)
+		return 0;
+	if (left->iova + left->len != right->iova)
+		return 0;
+
+	left->len += right->len;
+
+	memset(right, 0, sizeof(*right));
+
+	return 1;
+}
+
+static struct user_mem_map *
+find_user_mem_map(uint64_t addr, uint64_t iova, uint64_t len)
+{
+	uint64_t va_end = addr + len;
+	uint64_t iova_end = iova + len;
+	int i;
+
+	for (i = 0; i < user_mem_maps.n_maps; i++) {
+		struct user_mem_map *map = &user_mem_maps.maps[i];
+		uint64_t map_va_end = map->addr + map->len;
+		uint64_t map_iova_end = map->iova + map->len;
+
+		/* check start VA */
+		if (addr < map->addr || addr >= map_va_end)
+			continue;
+		/* check if VA end is within boundaries */
+		if (va_end <= map->addr || va_end > map_va_end)
+			continue;
+
+		/* check start IOVA */
+		if (iova < map->iova || iova >= map_iova_end)
+			continue;
+		/* check if IOVA end is within boundaries */
+		if (iova_end <= map->iova || iova_end > map_iova_end)
+			continue;
+
+		/* we've found our map */
+		return map;
+	}
+	return NULL;
+}
+
+/* this will sort all user maps, and merge/compact any adjacent maps */
+static void
+compact_user_maps(void)
+{
+	int i, n_merged, cur_idx;
+
+	qsort(user_mem_maps.maps, user_mem_maps.n_maps,
+			sizeof(user_mem_maps.maps[0]), user_mem_map_cmp);
+
+	/* we'll go over the list backwards when merging */
+	n_merged = 0;
+	for (i = user_mem_maps.n_maps - 2; i >= 0; i--) {
+		struct user_mem_map *l, *r;
+
+		l = &user_mem_maps.maps[i];
+		r = &user_mem_maps.maps[i + 1];
+
+		if (is_null_map(l) || is_null_map(r))
+			continue;
+
+		if (merge_map(l, r))
+			n_merged++;
+	}
+
+	/* the entries are still sorted, but now they have holes in them, so
+	 * walk through the list and remove the holes
+	 */
+	if (n_merged > 0) {
+		cur_idx = 0;
+		for (i = 0; i < user_mem_maps.n_maps; i++) {
+			if (!is_null_map(&user_mem_maps.maps[i])) {
+				struct user_mem_map *src, *dst;
+
+				src = &user_mem_maps.maps[i];
+				dst = &user_mem_maps.maps[cur_idx++];
+
+				if (src != dst) {
+					memcpy(dst, src, sizeof(*src));
+					memset(src, 0, sizeof(*src));
+				}
+			}
+		}
+		user_mem_maps.n_maps = cur_idx;
+	}
+}
+
 int
 vfio_get_group_fd(int iommu_group_no)
 {
@@ -263,7 +473,7 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 	};
 	int vfio_group_fd;
 	int iommu_group_no;
-	int ret;
+	int i, ret;
 
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
@@ -333,9 +543,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		 */
 		if (internal_config.process_type == RTE_PROC_PRIMARY &&
 				vfio_cfg.vfio_active_groups == 1) {
+			const struct vfio_iommu_type *t;
+
 			/* select an IOMMU type which we will be using */
-			const struct vfio_iommu_type *t =
-				vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
+			t = vfio_set_iommu_type(vfio_cfg.vfio_container_fd);
 			if (!t) {
 				RTE_LOG(ERR, EAL,
 					"  %s failed to select IOMMU type\n",
@@ -353,6 +564,38 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+
+			vfio_cfg.vfio_iommu_type = t;
+
+			/* re-map all user-mapped segments */
+			rte_spinlock_lock(&user_mem_maps.lock);
+
+			/* this IOMMU type may not support DMA mapping, but
+			 * if we have mappings in the list - that means we have
+			 * previously mapped something successfully, so we can
+			 * be sure that DMA mapping is supported.
+			 */
+			for (i = 0; i < user_mem_maps.n_maps; i++) {
+				struct user_mem_map *map;
+				map = &user_mem_maps.maps[i];
+
+				ret = t->dma_user_map_func(
+						vfio_cfg.vfio_container_fd,
+						map->addr, map->iova, map->len,
+						1);
+				if (ret) {
+					RTE_LOG(ERR, EAL, "Couldn't map user memory for DMA: "
+							"va: 0x%" PRIx64 " "
+							"iova: 0x%" PRIx64 " "
+							"len: 0x%" PRIx64 "\n",
+							map->addr, map->iova,
+							map->len);
+					rte_spinlock_unlock(
+							&user_mem_maps.lock);
+					return -1;
+				}
+			}
+			rte_spinlock_unlock(&user_mem_maps.lock);
 		}
 	}
 
@@ -668,23 +911,49 @@ static int
 type1_map(const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
+
+	return vfio_type1_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
+static int
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
 	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
 	int ret;
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
 
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
 				errno, strerror(errno));
-		return -1;
+			return -1;
+		}
+	} else {
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
 	}
+
 	return 0;
 }
 
@@ -694,12 +963,78 @@ vfio_type1_dma_map(int vfio_container_fd)
 	return rte_memseg_walk(type1_map, &vfio_container_fd);
 }
 
+static int
+vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
+{
+	struct vfio_iommu_type1_dma_map dma_map;
+	struct vfio_iommu_type1_dma_unmap dma_unmap;
+	int ret;
+
+	if (do_map != 0) {
+		memset(&dma_map, 0, sizeof(dma_map));
+		dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+		dma_map.vaddr = vaddr;
+		dma_map.size = len;
+		dma_map.iova = iova;
+		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
+				VFIO_DMA_MAP_FLAG_WRITE;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
+				errno, strerror(errno));
+			return -1;
+		}
+
+	} else {
+		struct vfio_iommu_spapr_register_memory reg = {
+			.argsz = sizeof(reg),
+			.flags = 0
+		};
+		reg.vaddr = (uintptr_t) vaddr;
+		reg.size = len;
+
+		ret = ioctl(vfio_container_fd,
+				VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY, &reg);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot unregister vaddr for IOMMU, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+
+		memset(&dma_unmap, 0, sizeof(dma_unmap));
+		dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+		dma_unmap.size = len;
+		dma_unmap.iova = iova;
+
+		ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA,
+				&dma_unmap);
+		if (ret) {
+			RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, error %i (%s)\n",
+					errno, strerror(errno));
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+{
+	int *vfio_container_fd = arg;
+
+	return vfio_spapr_dma_mem_map(*vfio_container_fd, ms->addr_64, ms->iova,
+			ms->len, 1);
+}
+
 struct spapr_walk_param {
 	uint64_t window_size;
 	uint64_t hugepage_sz;
 };
 static int
-spapr_window_size(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
@@ -713,39 +1048,43 @@ spapr_window_size(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-spapr_map(const struct rte_memseg *ms, void *arg)
-{
-	struct vfio_iommu_type1_dma_map dma_map;
-	struct vfio_iommu_spapr_register_memory reg = {
-		.argsz = sizeof(reg),
-		.flags = 0
+vfio_spapr_create_new_dma_window(int vfio_container_fd,
+		struct vfio_iommu_spapr_tce_create *create) {
+	struct vfio_iommu_spapr_tce_remove remove = {
+		.argsz = sizeof(remove),
+	};
+	struct vfio_iommu_spapr_tce_info info = {
+		.argsz = sizeof(info),
 	};
-	int *vfio_container_fd = arg;
 	int ret;
 
-	reg.vaddr = (uintptr_t) ms->addr;
-	reg.size = ms->len;
-	ret = ioctl(*vfio_container_fd,
-		VFIO_IOMMU_SPAPR_REGISTER_MEMORY, &reg);
+	/* query spapr iommu info */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+				"error %i (%s)\n", errno, strerror(errno));
 		return -1;
 	}
 
-	memset(&dma_map, 0, sizeof(dma_map));
-	dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-	dma_map.vaddr = ms->addr_64;
-	dma_map.size = ms->len;
-	dma_map.iova = ms->iova;
-	dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
-			 VFIO_DMA_MAP_FLAG_WRITE;
-
-	ret = ioctl(*vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+	/* remove default DMA of 32 bit window */
+	remove.start_addr = info.dma32_window_start;
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+	if (ret) {
+		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
 
+	/* create new DMA window */
+	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, create);
 	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, error %i (%s)\n",
-				errno, strerror(errno));
+		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+				"error %i (%s)\n", errno, strerror(errno));
+		return -1;
+	}
+
+	if (create->start_addr != 0) {
+		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
 		return -1;
 	}
 
@@ -753,61 +1092,116 @@ spapr_map(const struct rte_memseg *ms, void *arg)
 }
 
 static int
-vfio_spapr_dma_map(int vfio_container_fd)
+vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map)
 {
 	struct spapr_walk_param param;
-	int ret;
-	struct vfio_iommu_spapr_tce_info info = {
-		.argsz = sizeof(info),
-	};
 	struct vfio_iommu_spapr_tce_create create = {
 		.argsz = sizeof(create),
 	};
-	struct vfio_iommu_spapr_tce_remove remove = {
-		.argsz = sizeof(remove),
-	};
+	int i, ret = 0;
+
+	rte_spinlock_lock(&user_mem_maps.lock);
 
+	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	/* query spapr iommu info */
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot get iommu info, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+		RTE_LOG(ERR, EAL, "Could not get window size\n");
+		ret = -1;
+		goto out;
 	}
 
-	/* remove default DMA of 32 bit window */
-	remove.start_addr = info.dma32_window_start;
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	/* also check user maps */
+	for (i = 0; i < user_mem_maps.n_maps; i++) {
+		uint64_t max = user_mem_maps.maps[i].iova +
+				user_mem_maps.maps[i].len;
+		param.window_size = RTE_MAX(param.window_size, max);
 	}
 
-	/* create DMA window from 0 to max(phys_addr + len) */
-	rte_memseg_walk(spapr_window_size, &param);
-
 	/* sPAPR requires window size to be a power of 2 */
 	create.window_size = rte_align64pow2(param.window_size);
 	create.page_shift = __builtin_ctzll(param.hugepage_sz);
 	create.levels = 1;
 
-	ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
-	if (ret) {
-		RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
-				"error %i (%s)\n", errno, strerror(errno));
-		return -1;
+	if (do_map) {
+		/* re-create window and remap the entire memory */
+		if (iova > create.window_size) {
+			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
+					&create) < 0) {
+				RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
+				ret = -1;
+				goto out;
+			}
+			if (rte_memseg_walk(vfio_spapr_map_walk,
+					&vfio_container_fd) < 0) {
+				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
+				ret = -1;
+				goto out;
+			}
+			/* remap all user maps */
+			for (i = 0; i < user_mem_maps.n_maps; i++) {
+				struct user_mem_map *map =
+						&user_mem_maps.maps[i];
+				if (vfio_spapr_dma_do_map(vfio_container_fd,
+						map->addr, map->iova, map->len,
+						1)) {
+					RTE_LOG(ERR, EAL, "Could not recreate user DMA maps\n");
+					ret = -1;
+					goto out;
+				}
+			}
+		}
+
+		/* now that we've remapped all of the memory that was present
+		 * before, map the segment that we were requested to map.
+		 */
+		if (vfio_spapr_dma_do_map(vfio_container_fd,
+				vaddr, iova, len, 1) < 0) {
+			RTE_LOG(ERR, EAL, "Could not map segment\n");
+			ret = -1;
+			goto out;
+		}
+	} else {
+		/* for unmap, check if iova within DMA window */
+		if (iova > create.window_size) {
+			RTE_LOG(ERR, EAL, "iova beyond DMA window for unmap\n");
+			ret = -1;
+			goto out;
+		}
+
+		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
 
-	if (create.start_addr != 0) {
-		RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
+static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+	struct vfio_iommu_spapr_tce_create create = {
+		.argsz = sizeof(create),
+	};
+	struct spapr_walk_param param;
+
+	memset(&param, 0, sizeof(param));
+
+	/* create DMA window from 0 to max(phys_addr + len) */
+	rte_memseg_walk(vfio_spapr_window_size_walk, &param);
+
+	/* sPAPR requires window size to be a power of 2 */
+	create.window_size = rte_align64pow2(param.window_size);
+	create.page_shift = __builtin_ctzll(param.hugepage_sz);
+	create.levels = 1;
+
+	if (vfio_spapr_create_new_dma_window(vfio_container_fd, &create) < 0) {
+		RTE_LOG(ERR, EAL, "Could not create new DMA window\n");
 		return -1;
 	}
 
 	/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-	if (rte_memseg_walk(spapr_map, &vfio_container_fd) < 0)
+	if (rte_memseg_walk(vfio_spapr_map_walk, &vfio_container_fd) < 0)
 		return -1;
 
 	return 0;
@@ -820,6 +1214,156 @@ vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
 	return 0;
 }
 
+static int
+vfio_noiommu_dma_mem_map(int __rte_unused vfio_container_fd,
+			 uint64_t __rte_unused vaddr,
+			 uint64_t __rte_unused iova, uint64_t __rte_unused len,
+			 int __rte_unused do_map)
+{
+	/* No-IOMMU mode does not need DMA mapping */
+	return 0;
+}
+
+static int
+vfio_dma_mem_map(uint64_t vaddr, uint64_t iova, uint64_t len, int do_map)
+{
+	const struct vfio_iommu_type *t = vfio_cfg.vfio_iommu_type;
+
+	if (!t) {
+		RTE_LOG(ERR, EAL, "  VFIO support not initialized\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
+	if (!t->dma_user_map_func) {
+		RTE_LOG(ERR, EAL,
+			"  VFIO custom DMA region mapping not supported by IOMMU %s\n",
+			t->name);
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+
+	return t->dma_user_map_func(vfio_cfg.vfio_container_fd, vaddr, iova,
+			len, do_map);
+}
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	struct user_mem_map *new_map;
+	int ret = 0;
+
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&user_mem_maps.lock);
+	if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
+		RTE_LOG(ERR, EAL, "No more space for user mem maps\n");
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto out;
+	}
+	/* map the entry */
+	if (vfio_dma_mem_map(vaddr, iova, len, 1)) {
+		/* technically, this will fail if there are currently no devices
+		 * plugged in, even though the mapping might have succeeded had a
+		 * device been added later. however, since we cannot verify that
+		 * this is a valid mapping without having a device attached,
+		 * consider it unsupported, because we can't just store any old
+		 * mapping and pollute the list of active mappings willy-nilly.
+		 */
+		RTE_LOG(ERR, EAL, "Couldn't map new region for DMA\n");
+		ret = -1;
+		goto out;
+	}
+	/* create new user mem map entry */
+	new_map = &user_mem_maps.maps[user_mem_maps.n_maps++];
+	new_map->addr = vaddr;
+	new_map->iova = iova;
+	new_map->len = len;
+
+	compact_user_maps();
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
+{
+	struct user_mem_map *map, *new_map = NULL;
+	int ret = 0;
+
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	rte_spinlock_lock(&user_mem_maps.lock);
+
+	/* find our mapping */
+	map = find_user_mem_map(vaddr, iova, len);
+	if (!map) {
+		RTE_LOG(ERR, EAL, "Couldn't find previously mapped region\n");
+		rte_errno = EINVAL;
+		ret = -1;
+		goto out;
+	}
+	if (map->addr != vaddr || map->iova != iova || map->len != len) {
+		/* we're partially unmapping a previously mapped region, so we
+		 * need to split entry into two.
+		 */
+		if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
+			RTE_LOG(ERR, EAL, "Not enough space to store partial mapping\n");
+			rte_errno = ENOMEM;
+			ret = -1;
+			goto out;
+		}
+		new_map = &user_mem_maps.maps[user_mem_maps.n_maps++];
+	}
+
+	/* unmap the entry */
+	if (vfio_dma_mem_map(vaddr, iova, len, 0)) {
+		/* there may not be any devices plugged in, so unmapping will
+		 * fail with ENODEV/ENOTSUP rte_errno values, but that doesn't
+		 * stop us from removing the mapping, as the assumption is we
+		 * won't be needing this memory any more and thus will want to
+		 * prevent it from being remapped again on hotplug. so, only
+		 * fail if we indeed failed to unmap (e.g. if the mapping was
+		 * within our mapped range but had invalid alignment).
+		 */
+		if (rte_errno != ENODEV && rte_errno != ENOTSUP) {
+			RTE_LOG(ERR, EAL, "Couldn't unmap region for DMA\n");
+			ret = -1;
+			goto out;
+		} else {
+			RTE_LOG(DEBUG, EAL, "DMA unmapping failed, but removing mappings anyway\n");
+		}
+	}
+	/* remove map from the list of active mappings */
+	if (new_map != NULL) {
+		adjust_map(map, new_map, vaddr, len);
+
+		/* if we've created a new map by splitting, sort everything */
+		if (!is_null_map(new_map)) {
+			compact_user_maps();
+		} else {
+			/* we've created a new mapping, but it was unused */
+			user_mem_maps.n_maps--;
+		}
+	} else {
+		memset(map, 0, sizeof(*map));
+		compact_user_maps();
+		user_mem_maps.n_maps--;
+	}
+
+out:
+	rte_spinlock_unlock(&user_mem_maps.lock);
+	return ret;
+}
+
 int
 rte_vfio_noiommu_is_enabled(void)
 {
@@ -852,4 +1396,20 @@ rte_vfio_noiommu_is_enabled(void)
 	return c == 'Y';
 }
 
+#else
+
+int __rte_experimental
+rte_vfio_dma_map(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		  uint64_t __rte_unused len)
+{
+	return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(uint64_t __rte_unused vaddr, uint64_t __rte_unused iova,
+		    uint64_t __rte_unused len)
+{
+	return -1;
+}
+
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 8059577..549f442 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.h
@@ -19,6 +19,7 @@
 
 #ifdef VFIO_PRESENT
 
+#include <stdint.h>
 #include <linux/vfio.h>
 
 #define RTE_VFIO_TYPE1 VFIO_TYPE1_IOMMU
@@ -26,6 +27,7 @@
 #ifndef VFIO_SPAPR_TCE_v2_IOMMU
 #define RTE_VFIO_SPAPR 7
 #define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17)
+#define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
 #define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
 #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
 
@@ -110,6 +112,7 @@ struct vfio_config {
 	int vfio_enabled;
 	int vfio_container_fd;
 	int vfio_active_groups;
+	const struct vfio_iommu_type *vfio_iommu_type;
 	struct vfio_group vfio_groups[VFIO_MAX_GROUPS];
 };
 
@@ -119,9 +122,18 @@ struct vfio_config {
  * */
 typedef int (*vfio_dma_func_t)(int);
 
+/* Custom memory region DMA mapping function prototype.
+ * Takes VFIO container fd, virtual address, physical address, length and
+ * operation type (0 to unmap, 1 to map) as parameters.
+ * Returns 0 on success, -1 on error.
+ */
+typedef int (*vfio_dma_user_func_t)(int fd, uint64_t vaddr, uint64_t iova,
+		uint64_t len, int do_map);
+
 struct vfio_iommu_type {
 	int type_id;
 	const char *name;
+	vfio_dma_user_func_t dma_user_map_func;
 	vfio_dma_func_t dma_map_func;
 };
 
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 79433b7..76209f9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -263,5 +263,7 @@ EXPERIMENTAL {
 	rte_service_start_with_defaults;
 	rte_socket_count;
 	rte_socket_id_by_idx;
+	rte_vfio_dma_map;
+	rte_vfio_dma_unmap;
 
 } DPDK_18.02;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 47/70] eal: add legacy memory option
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (47 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 46/70] vfio: allow to map other memory regions Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 48/70] eal: add shared indexed file-backed array Anatoly Burakov
                           ` (22 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This adds a "--legacy-mem" command-line switch. It will be used to
go back to the old memory behavior, one where we can't dynamically
allocate/free memory (the downside), but one where the user can
get physically contiguous memory, like before (the upside).

For now, nothing but the legacy behavior exists; the non-legacy
memory init sequence will be added later. For FreeBSD, non-legacy
memory init will never be enabled, while for Linux it is
disabled in this patch to avoid breaking bisect, and will be
enabled once non-legacy mode is fully operational.

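As an illustration (not part of this patch), an application opts into the
legacy behavior simply by passing the new switch through its EAL arguments.
A minimal sketch, assuming only rte_eal_init() and the switch added here
(the argv contents are illustrative):

#include <stdio.h>

#include <rte_eal.h>
#include <rte_errno.h>

int
main(void)
{
	/* request the old static, contiguous memory behavior */
	char *eal_args[] = { "app", "--legacy-mem" };

	if (rte_eal_init(2, eal_args) < 0) {
		printf("EAL init failed: %i\n", rte_errno);
		return -1;
	}
	/* all hugepage memory is now reserved up front, as before */
	return 0;
}
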
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal.c            |  3 +++
 lib/librte_eal/common/eal_common_options.c |  6 ++++++
 lib/librte_eal/common/eal_internal_cfg.h   |  4 ++++
 lib/librte_eal/common/eal_options.h        |  2 ++
 lib/librte_eal/linuxapp/eal/eal.c          |  3 +++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 24 ++++++++++++++++++++----
 6 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 032a5ea..f44b904 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -534,6 +534,9 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
+	/* FreeBSD always uses legacy memory model */
+	internal_config.legacy_mem = true;
+
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
 		rte_errno = EINVAL;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 8a51ade..5cc5a8a 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -73,6 +73,7 @@ eal_long_options[] = {
 	{OPT_VDEV,              1, NULL, OPT_VDEV_NUM             },
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
+	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1113,6 +1114,8 @@ eal_parse_common_option(int opt, const char *optarg,
 
 	case OPT_NO_HUGE_NUM:
 		conf->no_hugetlbfs = 1;
+		/* no-huge is legacy mem */
+		conf->legacy_mem = 1;
 		break;
 
 	case OPT_NO_PCI_NUM:
@@ -1184,6 +1187,9 @@ eal_parse_common_option(int opt, const char *optarg,
 
 		core_parsed = LCORE_OPT_MAP;
 		break;
+	case OPT_LEGACY_MEM_NUM:
+		conf->legacy_mem = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index a0082d1..fda087b 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -47,6 +47,10 @@ struct internal_config {
 	volatile unsigned force_sockets;
 	volatile uint64_t socket_mem[RTE_MAX_NUMA_NODES]; /**< amount of memory per socket */
 	uintptr_t base_virtaddr;          /**< base address to try and reserve memory from */
+	volatile unsigned legacy_mem;
+	/**< true to enable legacy memory behavior (no dynamic allocation,
+	 * IOVA-contiguous segments).
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index e86c711..d301d0b 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -55,6 +55,8 @@ enum {
 	OPT_VFIO_INTR_NUM,
 #define OPT_VMWARE_TSC_MAP    "vmware-tsc-map"
 	OPT_VMWARE_TSC_MAP_NUM,
+#define OPT_LEGACY_MEM    "legacy-mem"
+	OPT_LEGACY_MEM_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 77f6cb7..b34e57a 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -348,6 +348,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
+	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
@@ -767,6 +768,8 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
+	/* for now, always set legacy mem */
+	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 24e6b50..17c559f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -922,8 +922,8 @@ huge_recover_sigbus(void)
  *  6. unmap the first mapping
  *  7. fill memsegs in configuration with contiguous zones
  */
-int
-rte_eal_hugepage_init(void)
+static int
+eal_legacy_hugepage_init(void)
 {
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
@@ -1266,8 +1266,8 @@ getFileSize(int fd)
  * configuration and finds the hugepages which form that segment, mapping them
  * in order to form a contiguous block in the virtual memory space
  */
-int
-rte_eal_hugepage_attach(void)
+static int
+eal_legacy_hugepage_attach(void)
 {
 	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
@@ -1403,6 +1403,22 @@ rte_eal_hugepage_attach(void)
 }
 
 int
+rte_eal_hugepage_init(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_init();
+	return -1;
+}
+
+int
+rte_eal_hugepage_attach(void)
+{
+	if (internal_config.legacy_mem)
+		return eal_legacy_hugepage_attach();
+	return -1;
+}
+
+int
 rte_eal_using_phys_addrs(void)
 {
 	return phys_addrs_available;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 48/70] eal: add shared indexed file-backed array
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (48 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 47/70] eal: add legacy memory option Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 49/70] eal: replace memseg with memseg lists Anatoly Burakov
                           ` (21 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

rte_fbarray is a simple indexed array stored in shared memory
by mapping a file into memory. Rationale for its existence is the
following: since we are going to map memory page-by-page, there
could be quite a lot of memory segments to keep track of (for
smaller page sizes, page count can easily reach thousands). We
can't really make page lists truly dynamic and infinitely expandable,
because that involves reallocating memory (which is a big no-no in
multiprocess). What we can do instead is have a maximum capacity as
something really, really large, and decide at allocation time how
big the array is going to be. We map the entire file into memory,
which makes it possible to use fbarray as shared memory, provided
the structure itself is allocated in shared memory. Per-fbarray
locking is also used to avoid index data races (but not contents
data races - that is up to the user application to synchronize).

In addition, in understanding that we will frequently need to scan
this array for free space and iterating over array linearly can
become slow, rte_fbarray provides facilities to index array's
usage. The following use cases are covered:
 - find next free/used slot (useful either for adding new elements
   to fbarray, or walking the list)
 - find starting index for next N free/used slots (useful for when
   we want to allocate chunk of VA-contiguous memory composed of
   several pages)
 - find how many contiguous free/used slots there are, starting
   from specified index (useful for when we want to figure out
   how many pages we have until next hole in allocated memory, to
   speed up some bulk operations where we would otherwise have to
   walk the array and add pages one by one)

This is accomplished by storing a usage mask in-memory, right
after the data section of the array, and using some bit-level
magic to figure out the info we need.
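
To illustrate the intended flow, below is a minimal sketch (not part of this
patch) using only the API introduced here; it assumes EAL is already
initialized, and keeps the array in process-local memory, so it is a
single-process example (the helper name is made up):

#include <rte_fbarray.h>

static int
fbarray_example(void)
{
	struct rte_fbarray arr;
	int idx;
	int *slot;

	/* room for 1024 ints, backed by a file under the fbarray path */
	if (rte_fbarray_init(&arr, "example", 1024, sizeof(int)) < 0)
		return -1;

	/* find a free spot, write into it, then mark it as used */
	idx = rte_fbarray_find_next_free(&arr, 0);
	if (idx < 0)
		return -1;
	slot = rte_fbarray_get(&arr, idx);
	*slot = 42;
	rte_fbarray_set_used(&arr, idx);

	return rte_fbarray_destroy(&arr);
}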

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/Makefile          |   1 +
 lib/librte_eal/common/Makefile              |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c  | 859 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_filesystem.h      |  13 +
 lib/librte_eal/common/include/rte_fbarray.h | 353 ++++++++++++
 lib/librte_eal/common/meson.build           |   2 +
 lib/librte_eal/linuxapp/eal/Makefile        |   1 +
 lib/librte_eal/rte_eal_version.map          |  16 +
 8 files changed, 1246 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_eal/common/eal_common_fbarray.c
 create mode 100644 lib/librte_eal/common/include/rte_fbarray.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..1b43d77 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
diff --git a/lib/librte_eal/common/Makefile b/lib/librte_eal/common/Makefile
index ea824a3..48f870f 100644
--- a/lib/librte_eal/common/Makefile
+++ b/lib/librte_eal/common/Makefile
@@ -16,7 +16,7 @@ INC += rte_pci_dev_feature_defs.h rte_pci_dev_features.h
 INC += rte_malloc.h rte_keepalive.h rte_time.h
 INC += rte_service.h rte_service_component.h
 INC += rte_bitmap.h rte_vfio.h rte_hypervisor.h rte_test.h
-INC += rte_reciprocal.h
+INC += rte_reciprocal.h rte_fbarray.h
 
 GENERIC_INC := rte_atomic.h rte_byteorder.h rte_cycles.h rte_prefetch.h
 GENERIC_INC += rte_spinlock.h rte_memcpy.h rte_cpuflags.h rte_rwlock.h
diff --git a/lib/librte_eal/common/eal_common_fbarray.c b/lib/librte_eal/common/eal_common_fbarray.c
new file mode 100644
index 0000000..f65875d
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_fbarray.c
@@ -0,0 +1,859 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <errno.h>
+#include <sys/file.h>
+#include <string.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_errno.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+
+#include "rte_fbarray.h"
+
+#define MASK_SHIFT 6ULL
+#define MASK_ALIGN (1 << MASK_SHIFT)
+#define MASK_LEN_TO_IDX(x) ((x) >> MASK_SHIFT)
+#define MASK_LEN_TO_MOD(x) ((x) - RTE_ALIGN_FLOOR(x, MASK_ALIGN))
+#define MASK_GET_IDX(idx, mod) (((idx) << MASK_SHIFT) + (mod))
+
+/*
+ * This is a mask that is always stored at the end of the array, to provide
+ * a fast way of finding free/used spots without looping through each element.
+ */
+
+struct used_mask {
+	int n_masks;
+	uint64_t data[];
+};
+
+static size_t
+calc_mask_size(int len)
+{
+	/* mask must be multiple of MASK_ALIGN, even though length of array
+	 * itself may not be aligned on that boundary.
+	 */
+	len = RTE_ALIGN_CEIL(len, MASK_ALIGN);
+	return sizeof(struct used_mask) +
+			sizeof(uint64_t) * MASK_LEN_TO_IDX(len);
+}
+
+static size_t
+calc_data_size(size_t page_sz, int elt_sz, int len)
+{
+	size_t data_sz = elt_sz * len;
+	size_t msk_sz = calc_mask_size(len);
+	return RTE_ALIGN_CEIL(data_sz + msk_sz, page_sz);
+}
+
+static struct used_mask *
+get_used_mask(void *data, int elt_sz, int len)
+{
+	return (struct used_mask *) RTE_PTR_ADD(data, elt_sz * len);
+}
+
+static int
+resize_and_map(int fd, void *addr, size_t len)
+{
+	void *map_addr;
+
+	if (ftruncate(fd, len)) {
+		RTE_LOG(ERR, EAL, "Cannot truncate fbarray file: %s\n",
+				strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+
+	map_addr = mmap(addr, len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, fd, 0);
+	if (map_addr != addr) {
+		RTE_LOG(ERR, EAL, "mmap() failed: %s\n", strerror(errno));
+		/* pass errno up the chain */
+		rte_errno = errno;
+		return -1;
+	}
+	return 0;
+}
+
+static int
+find_next_n(const struct rte_fbarray *arr, int start, int n, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int msk_idx, lookahead_idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (msk_idx = first; msk_idx < msk->n_masks; msk_idx++) {
+		uint64_t cur_msk, lookahead_msk;
+		int run_start, clz, left;
+		bool found = false;
+		/*
+		 * The process of getting n consecutive bits for arbitrary n is
+		 * a bit involved, but here it is in a nutshell:
+		 *
+		 *  1. let n be the number of consecutive bits we're looking for
+		 *  2. check if n can fit in one mask, and if so, do n-1
+		 *     rshift-ands to see if there is an appropriate run inside
+		 *     our current mask
+		 *    2a. if we found a run, bail out early
+		 *    2b. if we didn't find a run, proceed
+		 *  3. invert the mask and count leading zeroes (that is, count
+		 *     how many consecutive set bits we had starting from the
+		 *     end of current mask) as k
+		 *    3a. if k is 0, continue to next mask
+		 *    3b. if k is not 0, we have a potential run
+		 *  4. to satisfy our requirements, next mask must have n-k
+		 *     consecutive set bits right at the start, so we will do
+		 *     (n-k-1) rshift-ands and check if first bit is set.
+		 *
+		 * Step 4 will need to be repeated if (n-k) > MASK_ALIGN until
+		 * we either run out of masks, lose the run, or find what we
+		 * were looking for.
+		 */
+		cur_msk = msk->data[msk_idx];
+		left = n;
+
+		/* if we're looking for free spaces, invert the mask */
+		if (!used)
+			cur_msk = ~cur_msk;
+
+		/* combine current ignore mask with last index ignore mask */
+		if (msk_idx == last)
+			ignore_msk |= last_msk;
+
+		/* if we have an ignore mask, ignore once */
+		if (ignore_msk) {
+			cur_msk &= ignore_msk;
+			ignore_msk = 0;
+		}
+
+		/* if n can fit in within a single mask, do a search */
+		if (n <= MASK_ALIGN) {
+			uint64_t tmp_msk = cur_msk;
+			int s_idx;
+			for (s_idx = 0; s_idx < n - 1; s_idx++)
+				tmp_msk &= tmp_msk >> 1ULL;
+			/* we found what we were looking for */
+			if (tmp_msk != 0) {
+				run_start = __builtin_ctzll(tmp_msk);
+				return MASK_GET_IDX(msk_idx, run_start);
+			}
+		}
+
+		/*
+		 * we didn't find our run within the mask, or n > MASK_ALIGN,
+		 * so we're going for plan B.
+		 */
+
+		/* count leading zeroes on inverted mask */
+		clz = __builtin_clzll(~cur_msk);
+
+		/* if there aren't any runs at the end either, just continue */
+		if (clz == 0)
+			continue;
+
+		/* we have a partial run at the end, so try looking ahead */
+		run_start = MASK_ALIGN - clz;
+		left -= clz;
+
+		for (lookahead_idx = msk_idx + 1; lookahead_idx < msk->n_masks;
+				lookahead_idx++) {
+			int s_idx, need;
+			lookahead_msk = msk->data[lookahead_idx];
+
+			/* if we're looking for free space, invert the mask */
+			if (!used)
+				lookahead_msk = ~lookahead_msk;
+
+			/* figure out how many consecutive bits we need here */
+			need = RTE_MIN(left, MASK_ALIGN);
+
+			for (s_idx = 0; s_idx < need - 1; s_idx++)
+				lookahead_msk &= lookahead_msk >> 1ULL;
+
+			/* if first bit is not set, we've lost the run */
+			if ((lookahead_msk & 1) == 0) {
+				/*
+				 * we've scanned this far, so we know there are
+				 * no runs in the space we've lookahead-scanned
+				 * as well, so skip that on next iteration.
+				 */
+				ignore_msk = ~((1ULL << need) - 1);
+				msk_idx = lookahead_idx;
+				break;
+			}
+
+			left -= need;
+
+			/* check if we've found what we were looking for */
+			if (left == 0) {
+				found = true;
+				break;
+			}
+		}
+
+		/* we didn't find anything, so continue */
+		if (!found)
+			continue;
+
+		return MASK_GET_IDX(msk_idx, run_start);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_next(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	uint64_t ignore_msk;
+
+	/*
+	 * mask only has granularity of MASK_ALIGN, but start may not be aligned
+	 * on that boundary, so construct a special mask to exclude anything we
+	 * don't want to see to avoid confusing ctz.
+	 */
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	ignore_msk = ~((1ULL << first_mod) - 1ULL);
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	for (idx = first; idx < msk->n_masks; idx++) {
+		uint64_t cur = msk->data[idx];
+		int found;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first)
+			cur &= ignore_msk;
+
+		/* check if we have any entries */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * find first set bit - that will correspond to whatever it is
+		 * that we're looking for.
+		 */
+		found = __builtin_ctzll(cur);
+		return MASK_GET_IDX(idx, found);
+	}
+	/* we didn't find anything */
+	rte_errno = used ? ENOENT : ENOSPC;
+	return -1;
+}
+
+static int
+find_contig(const struct rte_fbarray *arr, int start, bool used)
+{
+	const struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz,
+			arr->len);
+	int idx, first, first_mod;
+	int last, last_mod, last_msk;
+	int need_len, result = 0;
+
+	/* array length may not be aligned, so calculate ignore mask for last
+	 * mask index.
+	 */
+	last = MASK_LEN_TO_IDX(arr->len);
+	last_mod = MASK_LEN_TO_MOD(arr->len);
+	last_msk = ~(-(1ULL) << last_mod);
+
+	first = MASK_LEN_TO_IDX(start);
+	first_mod = MASK_LEN_TO_MOD(start);
+	for (idx = first; idx < msk->n_masks; idx++, result += need_len) {
+		uint64_t cur = msk->data[idx];
+		int run_len;
+
+		need_len = MASK_ALIGN;
+
+		/* if we're looking for free entries, invert mask */
+		if (!used)
+			cur = ~cur;
+
+		/* if this is last mask, ignore everything after last bit */
+		if (idx == last)
+			cur &= last_msk;
+
+		/* ignore everything before start on first iteration */
+		if (idx == first) {
+			cur >>= first_mod;
+			/* at the start, we don't need the full mask len */
+			need_len -= first_mod;
+		}
+
+		/* we will be looking for zeroes, so invert the mask */
+		cur = ~cur;
+
+		/* if mask is zero, we have a complete run */
+		if (cur == 0)
+			continue;
+
+		/*
+		 * see if current run ends before mask end.
+		 */
+		run_len = __builtin_ctzll(cur);
+
+		/* add however many zeroes we've had in the last run and quit */
+		if (run_len < need_len) {
+			result += run_len;
+			break;
+		}
+	}
+	return result;
+}
+
+static int
+set_used(struct rte_fbarray *arr, int idx, bool used)
+{
+	struct used_mask *msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	uint64_t msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+	int msk_idx = MASK_LEN_TO_IDX(idx);
+	bool already_used;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	ret = 0;
+
+	/* prevent array from changing under us */
+	rte_rwlock_write_lock(&arr->rwlock);
+
+	already_used = (msk->data[msk_idx] & msk_bit) != 0;
+
+	/* nothing to be done */
+	if (used == already_used)
+		goto out;
+
+	if (used) {
+		msk->data[msk_idx] |= msk_bit;
+		arr->count++;
+	} else {
+		msk->data[msk_idx] &= ~msk_bit;
+		arr->count--;
+	}
+out:
+	rte_rwlock_write_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+static int
+fully_validate(const char *name, unsigned int elt_sz, unsigned int len)
+{
+	if (name == NULL || elt_sz == 0 || len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (strnlen(name, RTE_FBARRAY_NAME_LEN) == RTE_FBARRAY_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len, int elt_sz)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	struct used_mask *msk;
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	if (fully_validate(name, elt_sz, len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	/* calculate our memory limits */
+	mmap_len = calc_data_size(page_sz, elt_sz, len);
+
+	data = eal_get_virtual_area(NULL, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), name);
+
+	/*
+	 * Each fbarray is unique to process namespace, i.e. the filename
+	 * depends on process prefix. Try to take out a lock and see if we
+	 * succeed. If we don't, someone else is using it already.
+	 */
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't open %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = errno;
+		goto fail;
+	} else if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't lock %s: %s\n", __func__,
+				path, strerror(errno));
+		rte_errno = EBUSY;
+		goto fail;
+	}
+
+	/* take out a non-exclusive lock, so that other processes could still
+	 * attach to it, but no other process could reinitialize it.
+	 */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	/* we've mmap'ed the file, we can now close the fd */
+	close(fd);
+
+	/* initialize the data */
+	memset(data, 0, mmap_len);
+
+	/* populate data structure */
+	snprintf(arr->name, sizeof(arr->name), "%s", name);
+	arr->data = data;
+	arr->len = len;
+	arr->elt_sz = elt_sz;
+	arr->count = 0;
+
+	msk = get_used_mask(data, elt_sz, len);
+	msk->n_masks = MASK_LEN_TO_IDX(RTE_ALIGN_CEIL(len, MASK_ALIGN));
+
+	rte_rwlock_init(&arr->rwlock);
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr)
+{
+	size_t page_sz, mmap_len;
+	char path[PATH_MAX];
+	void *data = NULL;
+	int fd = -1;
+
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize attach as two values we need (element
+	 * size and array length) are constant for the duration of life of
+	 * the array, so the parts we care about will not race.
+	 */
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len))
+		return -1;
+
+	page_sz = sysconf(_SC_PAGESIZE);
+
+	mmap_len = calc_data_size(page_sz, arr->elt_sz, arr->len);
+
+	data = eal_get_virtual_area(arr->data, &mmap_len, page_sz, 0, 0);
+	if (data == NULL)
+		goto fail;
+
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	/* lock the file, to let others know we're using it */
+	if (flock(fd, LOCK_SH | LOCK_NB)) {
+		rte_errno = errno;
+		goto fail;
+	}
+
+	if (resize_and_map(fd, data, mmap_len))
+		goto fail;
+
+	close(fd);
+
+	/* we're done */
+
+	return 0;
+fail:
+	if (data)
+		munmap(data, mmap_len);
+	if (fd >= 0)
+		close(fd);
+	return -1;
+}
+
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr)
+{
+	if (arr == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/*
+	 * we don't need to synchronize detach as two values we need (element
+	 * size and total capacity) are constant for the duration of life of
+	 * the array, so the parts we care about will not race. if the user is
+	 * detaching while doing something else in the same process, we can't
+	 * really do anything about it, things will blow up either way.
+	 */
+
+	size_t page_sz = sysconf(_SC_PAGESIZE);
+
+	/* this may already be unmapped (e.g. repeated call from a previously
+	 * failed destroy()), but that is on the user - we can't (easily) know
+	 * if this is still mapped.
+	 */
+	munmap(arr->data, calc_data_size(page_sz, arr->elt_sz, arr->len));
+
+	return 0;
+}
+
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr)
+{
+	int fd, ret;
+	char path[PATH_MAX];
+
+	ret = rte_fbarray_detach(arr);
+	if (ret)
+		return ret;
+
+	/* try deleting the file */
+	eal_get_fbarray_path(path, sizeof(path), arr->name);
+
+	fd = open(path, O_RDONLY);
+	if (flock(fd, LOCK_EX | LOCK_NB)) {
+		RTE_LOG(DEBUG, EAL, "Cannot destroy fbarray - another process is using it\n");
+		rte_errno = EBUSY;
+		ret = -1;
+	} else {
+		ret = 0;
+		unlink(path);
+		memset(arr, 0, sizeof(*arr));
+	}
+	close(fd);
+
+	return ret;
+}
+
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx)
+{
+	void *ret = NULL;
+	if (arr == NULL || idx < 0) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (idx >= arr->len) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	ret = RTE_PTR_ADD(arr->data, idx * arr->elt_sz);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, true);
+}
+
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx)
+{
+	return set_used(arr, idx, false);
+}
+
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx)
+{
+	struct used_mask *msk;
+	int msk_idx;
+	uint64_t msk_bit;
+	int ret = -1;
+
+	if (arr == NULL || idx < 0 || idx >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+	msk_idx = MASK_LEN_TO_IDX(idx);
+	msk_bit = 1ULL << MASK_LEN_TO_MOD(idx);
+
+	ret = (msk->data[msk_idx] & msk_bit) != 0;
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count == 0) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next(arr, start, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count || arr->len - arr->count < n) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len ||
+			n < 0 || n > arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->count < n) {
+		rte_errno = ENOENT;
+		goto out;
+	}
+
+	ret = find_next_n(arr, start, n, true);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	if (arr->len == arr->count) {
+		rte_errno = ENOSPC;
+		goto out;
+	}
+
+	if (arr->count == 0) {
+		ret = arr->len - start;
+		goto out;
+	}
+
+	ret = find_contig(arr, start, false);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start)
+{
+	int ret = -1;
+
+	if (arr == NULL || start < 0 || start >= arr->len) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	ret = find_contig(arr, start, true);
+
+	rte_rwlock_read_unlock(&arr->rwlock);
+	return ret;
+}
+
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt)
+{
+	void *end;
+	int ret = -1;
+
+	/*
+	 * no need to synchronize as it doesn't matter if underlying data
+	 * changes - we're doing pointer arithmetic here.
+	 */
+
+	if (arr == NULL || elt == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	end = RTE_PTR_ADD(arr->data, arr->elt_sz * arr->len);
+	if (elt < arr->data || elt >= end) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+
+	ret = RTE_PTR_DIFF(elt, arr->data) / arr->elt_sz;
+
+	return ret;
+}
+
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f)
+{
+	struct used_mask *msk;
+	int i;
+
+	if (arr == NULL || f == NULL) {
+		rte_errno = EINVAL;
+		return;
+	}
+
+	if (fully_validate(arr->name, arr->elt_sz, arr->len)) {
+		fprintf(f, "Invalid file-backed array\n");
+		goto out;
+	}
+
+	/* prevent array from changing under us */
+	rte_rwlock_read_lock(&arr->rwlock);
+
+	fprintf(f, "File-backed array: %s\n", arr->name);
+	fprintf(f, "size: %i occupied: %i elt_sz: %i\n",
+			arr->len, arr->count, arr->elt_sz);
+
+	msk = get_used_mask(arr->data, arr->elt_sz, arr->len);
+
+	for (i = 0; i < msk->n_masks; i++)
+		fprintf(f, "msk idx %i: 0x%016" PRIx64 "\n", i, msk->data[i]);
+out:
+	rte_rwlock_read_unlock(&arr->rwlock);
+}
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 4708dd5..1c6048b 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -13,6 +13,7 @@
 
 /** Path of rte config file. */
 #define RUNTIME_CONFIG_FMT "%s/.%s_config"
+#define FBARRAY_FMT "%s/%s_%s"
 
 #include <stdint.h>
 #include <limits.h>
@@ -55,6 +56,18 @@ eal_mp_socket_path(void)
 	return buffer;
 }
 
+static inline const char *
+eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) {
+	const char *directory = "/tmp";
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, buflen - 1, FBARRAY_FMT, directory,
+			internal_config.hugefile_prefix, name);
+	return buffer;
+}
+
 /** Path of hugepage info file. */
 #define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info"
 
diff --git a/lib/librte_eal/common/include/rte_fbarray.h b/lib/librte_eal/common/include/rte_fbarray.h
new file mode 100644
index 0000000..97df945
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_fbarray.h
@@ -0,0 +1,353 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef RTE_FBARRAY_H
+#define RTE_FBARRAY_H
+
+/**
+ * @file
+ *
+ * File-backed shared indexed array for DPDK.
+ *
+ * Basic workflow is expected to be the following:
+ *  1) Allocate array either using ``rte_fbarray_init()`` or
+ *     ``rte_fbarray_attach()`` (depending on whether it's shared between
+ *     multiple DPDK processes)
+ *  2) find free spots using ``rte_fbarray_find_next_free()``
+ *  3) get pointer to data in the free spot using ``rte_fbarray_get()``, and
+ *     copy data into the pointer (element size is fixed)
+ *  4) mark entry as used using ``rte_fbarray_set_used()``
+ *
+ * Calls to ``rte_fbarray_init()`` and ``rte_fbarray_destroy()`` will have
+ * consequences for all processes, while calls to ``rte_fbarray_attach()`` and
+ * ``rte_fbarray_detach()`` will only have consequences within a single process.
+ * Therefore, it is safe to call ``rte_fbarray_attach()`` or
+ * ``rte_fbarray_detach()`` while another process is using ``rte_fbarray``,
+ * provided no other thread within the same process will try to use
+ * ``rte_fbarray`` before attaching or after detaching. It is not safe to call
+ * ``rte_fbarray_init()`` or ``rte_fbarray_destroy()`` while another thread or
+ * another process is using ``rte_fbarray``.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#include <rte_compat.h>
+#include <rte_rwlock.h>
+
+#define RTE_FBARRAY_NAME_LEN 64
+
+struct rte_fbarray {
+	char name[RTE_FBARRAY_NAME_LEN]; /**< name associated with an array */
+	int count;                       /**< number of entries stored */
+	int len;                         /**< current length of the array */
+	int elt_sz;                      /**< size of each element */
+	void *data;                      /**< data pointer */
+	rte_rwlock_t rwlock;             /**< multiprocess lock */
+};
+
+/**
+ * Set up ``rte_fbarray`` structure and allocate underlying resources.
+ *
+ * Call this function to correctly set up ``rte_fbarray`` and allocate
+ * underlying files that will be backing the data in the current process. Note
+ * that in order to use and share ``rte_fbarray`` between multiple processes,
+ * the data pointed to by the ``arr`` pointer must itself be allocated in
+ * shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated ``rte_fbarray`` structure.
+ *
+ * @param name
+ *   Unique name to be assigned to this array.
+ *
+ * @param len
+ *   Number of elements initially available in the array.
+ *
+ * @param elt_sz
+ *   Size of each element.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_init(struct rte_fbarray *arr, const char *name, int len,
+		int elt_sz);
+
+
+/**
+ * Attach to a file backing an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to attach to file that will be backing the data in the
+ * current process. The structure must have been previously correctly set up
+ * with a call to ``rte_fbarray_init()``. Calls to ``rte_fbarray_attach()`` are
+ * usually meant to be performed in a multiprocessing scenario, with the data
+ * pointed to by the ``arr`` pointer allocated in shared memory.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up rte_fbarray structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_attach(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure, and remove the underlying file.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process. This will also
+ * zero-fill the data pointed to by the ``arr`` pointer and remove the
+ * underlying file backing the data, so it is expected that, by the time this
+ * function is called, all other processes have detached from this
+ * ``rte_fbarray``.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_destroy(struct rte_fbarray *arr);
+
+
+/**
+ * Deallocate resources for an already allocated and correctly set up
+ * ``rte_fbarray`` structure.
+ *
+ * Call this function to deallocate all resources associated with an
+ * ``rte_fbarray`` structure within the current process.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_detach(struct rte_fbarray *arr);
+
+
+/**
+ * Get pointer to element residing at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of an element to get a pointer to.
+ *
+ * @return
+ *  - non-NULL pointer on success.
+ *  - NULL on failure, with ``rte_errno`` indicating reason for failure.
+ */
+void * __rte_experimental
+rte_fbarray_get(const struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of a specified element within the array.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param elt
+ *   Pointer to the element whose index should be found.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_idx(const struct rte_fbarray *arr, const void *elt);
+
+
+/**
+ * Mark specified element as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as used.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Mark specified element as free.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Element index to mark as free.
+ *
+ * @return
+ *  - 0 on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_set_free(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Check whether element at specified index is marked as used.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param idx
+ *   Index of the element to check.
+ *
+ * @return
+ *  - 1 if element is used.
+ *  - 0 if element is unused.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_is_used(struct rte_fbarray *arr, int idx);
+
+
+/**
+ * Find index of next free element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next used element, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find index of next chunk of ``n`` free elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of free elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_free(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find index of next chunk of ``n`` used elements, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @param n
+ *   Number of used elements to look for.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_next_n_used(struct rte_fbarray *arr, int start, int n);
+
+
+/**
+ * Find how many more free entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_free(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Find how many more used entries there are, starting at specified index.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param start
+ *   Element index to start search from.
+ *
+ * @return
+ *  - non-negative integer on success.
+ *  - -1 on failure, with ``rte_errno`` indicating reason for failure.
+ */
+int __rte_experimental
+rte_fbarray_find_contig_used(struct rte_fbarray *arr, int start);
+
+
+/**
+ * Dump ``rte_fbarray`` metadata.
+ *
+ * @param arr
+ *   Valid pointer to allocated and correctly set up ``rte_fbarray`` structure.
+ *
+ * @param f
+ *   File object to dump information into.
+ */
+void __rte_experimental
+rte_fbarray_dump_metadata(struct rte_fbarray *arr, FILE *f);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* RTE_FBARRAY_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 82b8910..7d02191 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -11,6 +11,7 @@ common_sources = files(
 	'eal_common_devargs.c',
 	'eal_common_dev.c',
 	'eal_common_errno.c',
+	'eal_common_fbarray.c',
 	'eal_common_hexdump.c',
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
@@ -51,6 +52,7 @@ common_headers = files(
 	'include/rte_eal_memconfig.h',
 	'include/rte_eal_interrupts.h',
 	'include/rte_errno.h',
+	'include/rte_fbarray.h',
 	'include/rte_hexdump.h',
 	'include/rte_interrupts.h',
 	'include/rte_keepalive.h',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index b9c7727..c407a43 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -61,6 +61,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_dev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_options.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_proc.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 76209f9..0f542b1 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,22 @@ EXPERIMENTAL {
 	rte_eal_hotplug_add;
 	rte_eal_hotplug_remove;
 	rte_eal_mbuf_user_pool_ops;
+	rte_fbarray_attach;
+	rte_fbarray_destroy;
+	rte_fbarray_detach;
+	rte_fbarray_dump_metadata;
+	rte_fbarray_find_idx;
+	rte_fbarray_find_next_free;
+	rte_fbarray_find_next_used;
+	rte_fbarray_find_next_n_free;
+	rte_fbarray_find_next_n_used;
+	rte_fbarray_find_contig_free;
+	rte_fbarray_find_contig_used;
+	rte_fbarray_get;
+	rte_fbarray_init;
+	rte_fbarray_is_used;
+	rte_fbarray_set_free;
+	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 49/70] eal: replace memseg with memseg lists
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (49 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 48/70] eal: add shared indexed file-backed array Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 50/70] eal: replace memzone array with fbarray Anatoly Burakov
                           ` (20 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Thomas Monjalon, Bruce Richardson, Neil Horman, John McNamara,
	Marko Kovacevic, Hemant Agrawal, Shreyansh Jain, Akhil Goyal,
	Adrien Mazarguil, Nelio Laranjeiro, Yongseok Koh,
	Maxime Coquelin, Tiwei Bie, Olivier Matz, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	pepperjo, jerin.jacob, gowrishankar.m

Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the single memseg
list has been replaced with multiple memseg lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
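
To give a rough picture, the bookkeeping for each list used throughout
this patch looks approximately like the sketch below (simplified; the
actual definition in rte_eal_memconfig.h carries a few more fields):

    struct rte_memseg_list {
        void *base_va;       /* preallocated VA area backing this list */
        uint64_t page_sz;    /* page size of all memsegs in this list */
        int socket_id;       /* NUMA node this list belongs to */
        struct rte_fbarray memseg_arr; /* per-page rte_memseg entries */
    };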

In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit the amount of preallocated memory,
but can also be used to place an upper limit on the total amount of
VA memory that a DPDK application can allocate.
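
In rough pseudo-C, the size reserved for each list works out as in the
sketch below (mirroring the get_mem_amount() helper added further down
in this patch):

    max_mem = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_LIST << 20, max_mem);
    area_sz = RTE_MIN(page_sz * RTE_MAX_MEMSEG_PER_LIST, max_mem);
    area_sz = RTE_MAX(area_sz, page_sz);   /* at least one page */
    area_sz = RTE_ALIGN(area_sz, page_sz); /* round to a page boundary */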

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.

Pages in the list are also indexed by address. That is, in order
to figure out where a page belongs, one can simply look at the base
address of its memseg list. Similarly, figuring out the IOVA address
of a memzone is a matter of finding the right memseg list, taking
the offset from the list's base address, and dividing it by the page
size to get the appropriate memseg.
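
In other words, the lookup reduces to something like the following
(a condensed sketch of the virt2memseg() helper added in this patch):

    /* addr is known to fall within this list's preallocated VA area */
    ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
    ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);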

This commit also removes the rte_eal_get_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets, due to limited VA space, DPDK will no longer
spread memory across different sockets like before. Instead, it will
(by default) allocate all of the memory on the socket where the
master lcore runs. To override this behavior, --socket-mem must be
used.
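For example, passing something like --socket-mem=512,512 would request
512M on each of the first two sockets.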

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to the earlier switch to _walk()
functions, most of the changes are simple fixes; however, some
of the _walk() calls were switched to memseg list walks, where
it made sense to do so.

Additionally, we are switching locks from flock() to fcntl(). Down
the line, we will be introducing a single-file segments option, and
flock() cannot be used to lock parts of a file. Therefore, we will
use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start a legacy mem primary
process alongside an already running non-legacy mem primary process.
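
For reference, locking just a region of a file with fcntl() looks
roughly like the sketch below (seg_offset and seg_len are placeholders
for the range covered by one segment):

    struct flock fl = {
        .l_type = F_WRLCK,      /* exclusive write lock */
        .l_whence = SEEK_SET,
        .l_start = seg_offset,  /* placeholder: start of locked range */
        .l_len = seg_len,       /* placeholder: length of locked range */
    };
    if (fcntl(fd, F_SETLK, &fl) == -1)
        return -1;              /* conflicting lock already held */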

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 config/common_base                                |  15 +-
 config/defconfig_i686-native-linuxapp-gcc         |   3 +
 config/defconfig_i686-native-linuxapp-icc         |   3 +
 config/defconfig_x86_x32-native-linuxapp-gcc      |   3 +
 config/rte_config.h                               |   7 +-
 doc/guides/rel_notes/deprecation.rst              |   9 -
 drivers/bus/fslmc/fslmc_vfio.c                    |  10 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h           |   2 +-
 drivers/bus/pci/linux/pci.c                       |   8 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                |   2 +-
 drivers/net/mlx4/mlx4_mr.c                        |   4 +-
 drivers/net/mlx5/mlx5.c                           |   3 +-
 drivers/net/mlx5/mlx5_mr.c                        |   4 +-
 drivers/net/virtio/virtio_user/vhost_kernel.c     |   4 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  12 +-
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c     |  17 +-
 lib/librte_eal/bsdapp/eal/eal_memory.c            | 209 ++++-
 lib/librte_eal/common/eal_common_memory.c         | 603 ++++++++++++--
 lib/librte_eal/common/eal_common_memzone.c        |  48 +-
 lib/librte_eal/common/eal_hugepages.h             |   1 -
 lib/librte_eal/common/eal_internal_cfg.h          |   2 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |  22 +-
 lib/librte_eal/common/include/rte_memory.h        |  56 +-
 lib/librte_eal/common/include/rte_memzone.h       |   1 -
 lib/librte_eal/common/malloc_elem.c               |  12 +-
 lib/librte_eal/common/malloc_elem.h               |   6 +-
 lib/librte_eal/common/malloc_heap.c               |  62 +-
 lib/librte_eal/common/rte_malloc.c                |  22 +-
 lib/librte_eal/linuxapp/eal/eal.c                 |  15 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c   |  25 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c          | 914 +++++++++++++++-------
 lib/librte_eal/linuxapp/eal/eal_vfio.c            |   9 +-
 lib/librte_eal/rte_eal_version.map                |   3 +-
 lib/librte_mempool/rte_mempool.c                  |   9 +-
 test/test/test_malloc.c                           |  30 +-
 test/test/test_memory.c                           |  10 +-
 test/test/test_memzone.c                          |  12 +-
 37 files changed, 1587 insertions(+), 590 deletions(-)

diff --git a/config/common_base b/config/common_base
index c09c7cf..f557e6b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -61,7 +61,20 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 CONFIG_RTE_LIBRTE_EAL=y
 CONFIG_RTE_MAX_LCORE=128
 CONFIG_RTE_MAX_NUMA_NODES=8
-CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMSEG_LISTS=64
+# each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages
+# or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192
+CONFIG_RTE_MAX_MEM_MB_PER_LIST=32768
+# a "type" is a combination of page size and NUMA node. total number of memseg
+# lists per type will be limited to either RTE_MAX_MEMSEG_PER_TYPE pages (split
+# over multiple lists of RTE_MAX_MEMSEG_PER_LIST pages), or
+# RTE_MAX_MEM_MB_PER_TYPE megabytes of memory (split over multiple lists of
+# RTE_MAX_MEM_MB_PER_LIST), whichever is smaller
+CONFIG_RTE_MAX_MEMSEG_PER_TYPE=32768
+CONFIG_RTE_MAX_MEM_MB_PER_TYPE=131072
+# global maximum usable amount of VA, in megabytes
+CONFIG_RTE_MAX_MEM_MB=524288
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
diff --git a/config/defconfig_i686-native-linuxapp-gcc b/config/defconfig_i686-native-linuxapp-gcc
index a42ba4f..1178fe3 100644
--- a/config/defconfig_i686-native-linuxapp-gcc
+++ b/config/defconfig_i686-native-linuxapp-gcc
@@ -46,3 +46,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_i686-native-linuxapp-icc b/config/defconfig_i686-native-linuxapp-icc
index 144ba0a..f096e22 100644
--- a/config/defconfig_i686-native-linuxapp-icc
+++ b/config/defconfig_i686-native-linuxapp-icc
@@ -51,3 +51,6 @@ CONFIG_RTE_LIBRTE_PMD_ZUC=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/defconfig_x86_x32-native-linuxapp-gcc b/config/defconfig_x86_x32-native-linuxapp-gcc
index b6206a5..57d000d 100644
--- a/config/defconfig_x86_x32-native-linuxapp-gcc
+++ b/config/defconfig_x86_x32-native-linuxapp-gcc
@@ -26,3 +26,6 @@ CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
 # AVP PMD is not supported on 32-bit
 #
 CONFIG_RTE_LIBRTE_AVP_PMD=n
+
+# 32-bit doesn't break up memory in lists, but does have VA allocation limit
+CONFIG_RTE_MAX_MEM_MB=2048
diff --git a/config/rte_config.h b/config/rte_config.h
index db6ceb6..f293d9e 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -21,7 +21,12 @@
 /****** library defines ********/
 
 /* EAL defines */
-#define RTE_MAX_MEMSEG 512
+#define RTE_MAX_MEMSEG_LISTS 128
+#define RTE_MAX_MEMSEG_PER_LIST 8192
+#define RTE_MAX_MEM_MB_PER_LIST 32768
+#define RTE_MAX_MEMSEG_PER_TYPE 32768
+#define RTE_MAX_MEM_MB_PER_TYPE 65536
+#define RTE_MAX_MEM_MB 524288
 #define RTE_MAX_MEMZONE 2560
 #define RTE_MAX_TAILQ 32
 #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ec70b5f..c9f2703 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -38,15 +38,6 @@ Deprecation Notices
   success and failure, respectively.  This will change to 1 and 0 for true and
   false, respectively, to make use of the function more intuitive.
 
-* eal: due to internal data layout reorganization, there will be changes to
-  several structures and functions as a result of coming changes to support
-  memory hotplug in v18.05.
-  ``rte_eal_get_physmem_layout`` will be deprecated and removed in subsequent
-  releases.
-  ``rte_mem_config`` contents will change due to switch to memseg lists.
-  ``rte_memzone`` member ``memseg_id`` will no longer serve any useful purpose
-  and will be removed.
-
 * eal: a new set of mbuf mempool ops name APIs for user, platform and best
   mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
   ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 0c048dc..8b15312 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -190,7 +190,8 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 }
 
 static int
-fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
+fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
@@ -232,18 +233,11 @@ fslmc_vfio_map(const struct rte_memseg *ms, void *arg)
 
 int rte_fslmc_vfio_dmamap(void)
 {
-	const struct rte_memseg *memseg;
 	int i = 0;
 
 	if (is_dma_done)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		DPAA2_BUS_ERR("Cannot get physical layout");
-		return -ENODEV;
-	}
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 45fd41e..72aae43 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -274,7 +274,7 @@ static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr)
 	if (dpaa2_virt_mode)
 		return vaddr;
 
-	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr);
+	memseg = rte_mem_virt2memseg((void *)(uintptr_t)vaddr, NULL);
 	if (memseg)
 		return memseg->phys_addr + RTE_PTR_DIFF(vaddr, memseg->addr);
 	return (size_t)NULL;
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 6dda054..4630a80 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -117,9 +117,10 @@ rte_pci_unmap_device(struct rte_pci_device *dev)
 }
 
 static int
-find_max_end_va(const struct rte_memseg *ms, void *arg)
+find_max_end_va(const struct rte_memseg_list *msl, void *arg)
 {
-	void *end_va = RTE_PTR_ADD(ms->addr, ms->len);
+	size_t sz = msl->memseg_arr.len * msl->page_sz;
+	void *end_va = RTE_PTR_ADD(msl->base_va, sz);
 	void **max_va = arg;
 
 	if (*max_va < end_va)
@@ -132,10 +133,11 @@ pci_find_max_end_va(void)
 {
 	void *va = NULL;
 
-	rte_memseg_walk(find_max_end_va, &va);
+	rte_memseg_list_walk(find_max_end_va, &va);
 	return va;
 }
 
+
 /* parse one line of the "resource" sysfs file (note that the 'line'
  * string is modified)
  */
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c
index a14e669..b685220 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -95,7 +95,7 @@ dpaa_mem_vtop(void *vaddr)
 {
 	const struct rte_memseg *ms;
 
-	ms = rte_mem_virt2memseg(vaddr);
+	ms = rte_mem_virt2memseg(vaddr, NULL);
 	if (ms)
 		return ms->iova + RTE_PTR_DIFF(vaddr, ms->addr);
 	return (size_t)NULL;
diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index b7e910d..e69b433 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -141,10 +141,10 @@ mlx4_mr_get(struct priv *priv, struct rte_mempool *mp)
 	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 00c2c86..369ea45 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -478,7 +478,8 @@ static struct rte_pci_driver mlx5_driver;
 static void *uar_base;
 
 static int
-find_lower_va_bound(const struct rte_memseg *ms, void *arg)
+find_lower_va_bound(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	void **addr = arg;
 
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index c96e134..fdf7b3e 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -262,10 +262,10 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
 	mr->end = end;
 
 	/* Round start and end to page boundary if found in memory segments. */
-	ms = rte_mem_virt2memseg((void *)start);
+	ms = rte_mem_virt2memseg((void *)start, NULL);
 	if (ms != NULL)
 		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-	ms = rte_mem_virt2memseg((void *)end);
+	ms = rte_mem_virt2memseg((void *)end, NULL);
 	if (ms != NULL)
 		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
 
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index 93d7efe..b244409 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -75,7 +75,8 @@ struct walk_arg {
 	uint32_t region_nr;
 };
 static int
-add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
+add_memory_region(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
 {
 	struct walk_arg *wa = arg;
 	struct vhost_memory_region *mr;
@@ -95,7 +96,6 @@ add_memory_region(const struct rte_memseg *ms, size_t len, void *arg)
 	return 0;
 }
 
-
 /* By default, vhost kernel module allows 64 regions, but DPDK allows
  * 256 segments. As a relief, below function merges those virtually
  * adjacent memsegs into one region.
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index f44b904..d009cf0 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -64,8 +64,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -430,11 +430,11 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_socket(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
 	int *socket_id = arg;
 
-	if (ms->socket_id == *socket_id)
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
 		return 1;
 
 	return 0;
@@ -447,10 +447,11 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_socket, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
 
+
 static int
 sync_func(__attribute__((unused)) void *arg)
 {
@@ -561,7 +562,6 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
 	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
 			eal_hugepage_info_init() < 0) {
 		rte_eal_init_alert("Cannot get hugepage information.");
 		rte_errno = EACCES;
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index be2dbf0..ba44da0 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -47,12 +47,18 @@ eal_hugepage_info_init(void)
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
 
+	internal_config.num_hugepage_sizes = 1;
+
+	/* nothing more to be done for secondary */
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+		return 0;
+
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.num_buffers\n");
 		return -1;
 	}
 
@@ -61,7 +67,7 @@ eal_hugepage_info_init(void)
 			&sysctl_size, NULL, 0);
 
 	if (error != 0) {
-		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size");
+		RTE_LOG(ERR, EAL, "could not read sysctl hw.contigmem.buffer_size\n");
 		return -1;
 	}
 
@@ -81,22 +87,21 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	internal_config.num_hugepage_sizes = 1;
 	hpi->hugedir = CONTIGMEM_DEV;
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
 
 	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
-					sizeof(struct hugepage_info));
+			sizeof(internal_config.hugepage_info));
 	if (tmp_hpi == NULL ) {
 		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
 		return -1;
 	}
 
-	memcpy(tmp_hpi, hpi, sizeof(struct hugepage_info));
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
-	if ( munmap(tmp_hpi, sizeof(struct hugepage_info)) < 0) {
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
 	}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index bdfb882..2f5651d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -6,6 +6,8 @@
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <inttypes.h>
+#include <errno.h>
+#include <string.h>
 #include <fcntl.h>
 
 #include <rte_eal.h>
@@ -41,37 +43,135 @@ rte_eal_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	uint64_t total_mem = 0;
 	void *addr;
-	unsigned i, j, seg_idx = 0;
+	unsigned int i, j, seg_idx = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
 	/* for debug purposes, hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
-		addr = malloc(internal_config.memory);
-		mcfg->memseg[0].iova = (rte_iova_t)(uintptr_t)addr;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		struct rte_memseg_list *msl;
+		struct rte_fbarray *arr;
+		struct rte_memseg *ms;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+				sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
+		addr = mmap(NULL, internal_config.memory,
+				PROT_READ | PROT_WRITE,
+				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is 1 page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->len = page_sz;
+			ms->socket_id = 0;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, page_sz);
+		}
 		return 0;
 	}
 
 	/* map all hugepages and sort them */
 	for (i = 0; i < internal_config.num_hugepage_sizes; i ++){
 		struct hugepage_info *hpi;
+		uint64_t page_sz, mem_needed;
+		unsigned int n_pages, max_pages;
 
 		hpi = &internal_config.hugepage_info[i];
-		for (j = 0; j < hpi->num_pages[0]; j++) {
+		page_sz = hpi->hugepage_sz;
+		max_pages = hpi->num_pages[0];
+		mem_needed = RTE_ALIGN_CEIL(internal_config.memory - total_mem,
+				page_sz);
+
+		n_pages = RTE_MIN(mem_needed / page_sz, max_pages);
+
+		for (j = 0; j < n_pages; j++) {
+			struct rte_memseg_list *msl;
+			struct rte_fbarray *arr;
 			struct rte_memseg *seg;
+			int msl_idx, ms_idx;
 			rte_iova_t physaddr;
 			int error;
 			size_t sysctl_size = sizeof(physaddr);
 			char physaddr_str[64];
 
-			addr = mmap(NULL, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-				    MAP_SHARED, hpi->lock_descriptor,
-				    j * EAL_PAGE_SIZE);
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				bool empty;
+				msl = &mcfg->memsegs[msl_idx];
+				arr = &msl->memseg_arr;
+
+				if (msl->page_sz != page_sz)
+					continue;
+
+				empty = arr->count == 0;
+
+				/* we need 1, plus hole if not empty */
+				ms_idx = rte_fbarray_find_next_n_free(arr,
+						0, 1 + (empty ? 1 : 0));
+
+				/* memseg list is full? */
+				if (ms_idx < 0)
+					continue;
+
+				/* leave some space between memsegs, they are
+				 * not IOVA contiguous, so they shouldn't be VA
+				 * contiguous either.
+				 */
+				if (!empty)
+					ms_idx++;
+
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+					RTE_STR(CONFIG_RTE_MAX_MEM_MB_PER_TYPE));
+				return -1;
+			}
+			arr = &msl->memseg_arr;
+			seg = rte_fbarray_get(arr, ms_idx);
+
+			addr = RTE_PTR_ADD(msl->base_va,
+					(size_t)msl->page_sz * ms_idx);
+
+			/* address is already mapped in memseg list, so using
+			 * MAP_FIXED here is safe.
+			 */
+			addr = mmap(addr, page_sz, PROT_READ|PROT_WRITE,
+					MAP_SHARED | MAP_FIXED,
+					hpi->lock_descriptor,
+					j * EAL_PAGE_SIZE);
 			if (addr == MAP_FAILED) {
 				RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
 						j, hpi->hugedir);
@@ -88,33 +188,62 @@ rte_eal_hugepage_init(void)
 				return -1;
 			}
 
-			seg = &mcfg->memseg[seg_idx++];
 			seg->addr = addr;
 			seg->iova = physaddr;
-			seg->hugepage_sz = hpi->hugepage_sz;
-			seg->len = hpi->hugepage_sz;
+			seg->hugepage_sz = page_sz;
+			seg->len = page_sz;
 			seg->nchannel = mcfg->nchannel;
 			seg->nrank = mcfg->nrank;
 			seg->socket_id = 0;
 
+			rte_fbarray_set_used(arr, ms_idx);
+
 			RTE_LOG(INFO, EAL, "Mapped memory segment %u @ %p: physaddr:0x%"
 					PRIx64", len %zu\n",
-					seg_idx, addr, physaddr, hpi->hugepage_sz);
-			if (total_mem >= internal_config.memory ||
-					seg_idx >= RTE_MAX_MEMSEG)
-				break;
+					seg_idx, addr, physaddr, page_sz);
+
+			total_mem += seg->len;
 		}
+		if (total_mem >= internal_config.memory)
+			break;
+	}
+	if (total_mem < internal_config.memory) {
+		RTE_LOG(ERR, EAL, "Couldn't reserve requested memory, "
+				"requested: %" PRIu64 "M "
+				"available: %" PRIu64 "M\n",
+				internal_config.memory >> 20, total_mem >> 20);
+		return -1;
 	}
 	return 0;
 }
 
+struct attach_walk_args {
+	int fd_hugepage;
+	int seg_idx;
+};
+static int
+attach_segment(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
+{
+	struct attach_walk_args *wa = arg;
+	void *addr;
+
+	addr = mmap(ms->addr, ms->len, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_FIXED, wa->fd_hugepage,
+			wa->seg_idx * EAL_PAGE_SIZE);
+	if (addr == MAP_FAILED || addr != ms->addr)
+		return -1;
+	wa->seg_idx++;
+
+	return 0;
+}
+
 int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
 	int fd_hugepage_info, fd_hugepage = -1;
-	unsigned i = 0;
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	unsigned int i;
 
 	/* Obtain a file descriptor for hugepage_info */
 	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
@@ -124,41 +253,43 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(struct hugepage_info), PROT_READ, MAP_PRIVATE,
-			fd_hugepage_info, 0);
+	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
+			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
 	if (hpi == MAP_FAILED) {
 		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
 		goto error;
 	}
 
-	/* Obtain a file descriptor for contiguous memory */
-	fd_hugepage = open(hpi->hugedir, O_RDWR);
-	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", hpi->hugedir);
-		goto error;
-	}
+	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
+		const struct hugepage_info *cur_hpi = &hpi[i];
+		struct attach_walk_args wa;
 
-	/* Map the contiguous memory into each memory segment */
-	for (i = 0; i < hpi->num_pages[0]; i++) {
+		memset(&wa, 0, sizeof(wa));
 
-		void *addr;
-		struct rte_memseg *seg = &mcfg->memseg[i];
+		/* Obtain a file descriptor for contiguous memory */
+		fd_hugepage = open(cur_hpi->hugedir, O_RDWR);
+		if (fd_hugepage < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s\n",
+					cur_hpi->hugedir);
+			goto error;
+		}
+		wa.fd_hugepage = fd_hugepage;
+		wa.seg_idx = 0;
 
-		addr = mmap(seg->addr, hpi->hugepage_sz, PROT_READ|PROT_WRITE,
-			    MAP_SHARED|MAP_FIXED, fd_hugepage,
-			    i * EAL_PAGE_SIZE);
-		if (addr == MAP_FAILED || addr != seg->addr) {
+		/* Map the contiguous memory into each memory segment */
+		if (rte_memseg_walk(attach_segment, &wa) < 0) {
 			RTE_LOG(ERR, EAL, "Failed to mmap buffer %u from %s\n",
-				i, hpi->hugedir);
+				wa.seg_idx, cur_hpi->hugedir);
 			goto error;
 		}
 
+		close(fd_hugepage);
+		fd_hugepage = -1;
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(struct hugepage_info));
+	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
 	close(fd_hugepage_info);
-	close(fd_hugepage);
 	return 0;
 
 error:
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fd78d2f..d519f15 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -13,6 +13,7 @@
 #include <sys/mman.h>
 #include <sys/queue.h>
 
+#include <rte_fbarray.h>
 #include <rte_memory.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
@@ -30,6 +31,8 @@
  * which is a multiple of hugepage size.
  */
 
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+
 static uint64_t baseaddr_offset;
 static uint64_t system_page_sz;
 
@@ -120,15 +123,394 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
 	return aligned_addr;
 }
 
-/*
- * Return a pointer to a read-only table of struct rte_physmem_desc
- * elements, containing the layout of all addressable physical
- * memory. The last element of the table contains a NULL address.
- */
-const struct rte_memseg *
-rte_eal_get_physmem_layout(void)
+static uint64_t
+get_mem_amount(uint64_t page_sz, uint64_t max_mem)
+{
+	uint64_t area_sz, max_pages;
+
+	/* limit to RTE_MAX_MEMSEG_PER_LIST pages or RTE_MAX_MEM_MB_PER_LIST */
+	max_pages = RTE_MAX_MEMSEG_PER_LIST;
+	max_mem = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_LIST << 20, max_mem);
+
+	area_sz = RTE_MIN(page_sz * max_pages, max_mem);
+
+	/* make sure the list isn't smaller than the page size */
+	area_sz = RTE_MAX(area_sz, page_sz);
+
+	return RTE_ALIGN(area_sz, page_sz);
+}
+
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		uint64_t max_mem, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+	uint64_t mem_amount;
+	int max_segs;
+
+	mem_amount = get_mem_amount(page_sz, max_mem);
+	max_segs = mem_amount / page_sz;
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, max_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init_32(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int active_sockets, hpi_idx, msl_idx = 0;
+	unsigned int socket_id, i;
+	struct rte_memseg_list *msl;
+	uint64_t extra_mem_per_socket, total_extra_mem, total_requested_mem;
+	uint64_t max_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	/* this is a giant hack, but desperate times call for desperate
+	 * measures. in legacy 32-bit mode, we cannot preallocate VA space,
+	 * because having upwards of 2 gigabytes of VA space already mapped will
+	 * interfere with our ability to map and sort hugepages.
+	 *
+	 * therefore, in legacy 32-bit mode, we will be initializing memseg
+	 * lists much later - in eal_memory.c, right after we unmap all the
+	 * unneeded pages. this will not affect secondary processes, as those
+	 * should be able to mmap the space without (too many) problems.
+	 */
+	if (internal_config.legacy_mem)
+		return 0;
+
+	/* 32-bit mode is a very special case. we cannot know in advance where
+	 * the user will want to allocate their memory, so we have to do some
+	 * heuristics.
+	 */
+	active_sockets = 0;
+	total_requested_mem = 0;
+	if (internal_config.force_sockets)
+		for (i = 0; i < rte_socket_count(); i++) {
+			uint64_t mem;
+
+			socket_id = rte_socket_id_by_idx(i);
+			mem = internal_config.socket_mem[socket_id];
+
+			if (mem == 0)
+				continue;
+
+			active_sockets++;
+			total_requested_mem += mem;
+		}
+	else
+		total_requested_mem = internal_config.memory;
+
+	max_mem = (uint64_t) RTE_MAX_MEM_MB_PER_TYPE << 20;
+	if (total_requested_mem > max_mem) {
+		RTE_LOG(ERR, EAL, "Invalid parameters: 32-bit process can at most use %uM of memory\n",
+				(unsigned int)(max_mem >> 20));
+		return -1;
+	}
+	total_extra_mem = max_mem - total_requested_mem;
+	extra_mem_per_socket = active_sockets == 0 ? total_extra_mem :
+			total_extra_mem / active_sockets;
+
+	/* the allocation logic is a little bit convoluted, but here's how it
+	 * works, in a nutshell:
+	 *  - if user hasn't specified on which sockets to allocate memory via
+	 *    --socket-mem, we allocate all of our memory on master core socket.
+	 *  - if user has specified sockets to allocate memory on, there may be
+	 *    some "unused" memory left (e.g. if user has specified --socket-mem
+	 *    such that not all memory adds up to 2 gigabytes), so add it to all
+	 *    sockets that are in use equally.
+	 *
+	 * page sizes are sorted by size in descending order, so we can safely
+	 * assume that we dispense with bigger page sizes first.
+	 */
+
+	/* create memseg lists */
+	for (i = 0; i < rte_socket_count(); i++) {
+		int hp_sizes = (int) internal_config.num_hugepage_sizes;
+		uint64_t max_socket_mem, cur_socket_mem;
+		unsigned int master_lcore_socket;
+		struct rte_config *cfg = rte_eal_get_configuration();
+		bool skip;
+
+		socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+		if (socket_id > 0)
+			break;
+#endif
+
+		/* if we didn't specifically request memory on this socket */
+		skip = active_sockets != 0 &&
+				internal_config.socket_mem[socket_id] == 0;
+		/* ...or if we didn't specifically request memory on *any*
+		 * socket, and this is not master lcore
+		 */
+		master_lcore_socket = rte_lcore_to_socket_id(cfg->master_lcore);
+		skip |= active_sockets == 0 && socket_id != master_lcore_socket;
+
+		if (skip) {
+			RTE_LOG(DEBUG, EAL, "Will not preallocate memory on socket %u\n",
+					socket_id);
+			continue;
+		}
+
+		/* max amount of memory on this socket */
+		max_socket_mem = (active_sockets != 0 ?
+					internal_config.socket_mem[socket_id] :
+					internal_config.memory) +
+					extra_mem_per_socket;
+		cur_socket_mem = 0;
+
+		for (hpi_idx = 0; hpi_idx < hp_sizes; hpi_idx++) {
+			uint64_t max_pagesz_mem, cur_pagesz_mem = 0;
+			uint64_t hugepage_sz;
+			struct hugepage_info *hpi;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			hpi = &internal_config.hugepage_info[hpi_idx];
+			hugepage_sz = hpi->hugepage_sz;
+
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+			max_pagesz_mem = max_socket_mem - cur_socket_mem;
+
+			/* make it multiple of page size */
+			max_pagesz_mem = RTE_ALIGN_FLOOR(max_pagesz_mem,
+					hugepage_sz);
+
+			RTE_LOG(DEBUG, EAL, "Attempting to preallocate "
+					"%" PRIu64 "M on socket %i\n",
+					max_pagesz_mem >> 20, socket_id);
+
+			type_msl_idx = 0;
+			while (cur_pagesz_mem < max_pagesz_mem &&
+					total_segs < max_segs) {
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				if (alloc_memseg_list(msl, hugepage_sz,
+						max_pagesz_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				cur_pagesz_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			cur_socket_mem += cur_pagesz_mem;
+		}
+	}
+
+	return 0;
+}
+
+static int __rte_unused
+memseg_primary_init(void)
 {
-	return rte_eal_get_configuration()->mem_config->memseg;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, socket_id, hpi_idx, msl_idx = 0;
+	struct rte_memseg_list *msl;
+	uint64_t max_mem, total_mem;
+
+	/* no-huge does not need this at all */
+	if (internal_config.no_hugetlbfs)
+		return 0;
+
+	max_mem = (uint64_t)RTE_MAX_MEM_MB << 20;
+	total_mem = 0;
+
+	/* create memseg lists */
+	for (hpi_idx = 0; hpi_idx < (int) internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		struct hugepage_info *hpi;
+		uint64_t hugepage_sz;
+
+		hpi = &internal_config.hugepage_info[hpi_idx];
+		hugepage_sz = hpi->hugepage_sz;
+
+		for (i = 0; i < (int) rte_socket_count(); i++) {
+			uint64_t max_type_mem, total_type_mem = 0;
+			int type_msl_idx, max_segs, total_segs = 0;
+
+			socket_id = rte_socket_id_by_idx(i);
+
+#ifndef RTE_EAL_NUMA_AWARE_HUGEPAGES
+			if (socket_id > 0)
+				break;
+#endif
+
+			max_type_mem = RTE_MIN(max_mem - total_mem,
+				(uint64_t)RTE_MAX_MEM_MB_PER_TYPE << 20);
+			max_segs = RTE_MAX_MEMSEG_PER_TYPE;
+
+			type_msl_idx = 0;
+			while (total_type_mem < max_type_mem &&
+					total_segs < max_segs) {
+				uint64_t cur_max_mem;
+				if (msl_idx >= RTE_MAX_MEMSEG_LISTS) {
+					RTE_LOG(ERR, EAL,
+						"No more space in memseg lists, please increase %s\n",
+						RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+					return -1;
+				}
+
+				msl = &mcfg->memsegs[msl_idx++];
+
+				cur_max_mem = max_type_mem - total_type_mem;
+				if (alloc_memseg_list(msl, hugepage_sz,
+						cur_max_mem, socket_id,
+						type_msl_idx))
+					return -1;
+
+				total_segs += msl->memseg_arr.len;
+				total_type_mem = total_segs * hugepage_sz;
+				type_msl_idx++;
+
+				if (alloc_va_space(msl)) {
+					RTE_LOG(ERR, EAL, "Cannot allocate VA space for memseg list\n");
+					return -1;
+				}
+			}
+			total_mem += total_type_mem;
+		}
+	}
+	return 0;
+}
+
+static int
+memseg_secondary_init(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx = 0;
+	struct rte_memseg_list *msl;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+
+		msl = &mcfg->memsegs[msl_idx];
+
+		/* skip empty memseg lists */
+		if (msl->memseg_arr.len == 0)
+			continue;
+
+		if (rte_fbarray_attach(&msl->memseg_arr)) {
+			RTE_LOG(ERR, EAL, "Cannot attach to primary process memseg lists\n");
+			return -1;
+		}
+
+		/* preallocate VA space */
+		if (alloc_va_space(msl)) {
+			RTE_LOG(ERR, EAL, "Cannot preallocate VA space for hugepage memory\n");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static struct rte_memseg *
+virt2memseg(const void *addr, const struct rte_memseg_list *msl)
+{
+	const struct rte_fbarray *arr;
+	void *start, *end;
+	int ms_idx;
+
+	/* a memseg list was specified, check if it's the right one */
+	start = msl->base_va;
+	end = RTE_PTR_ADD(start, (size_t)msl->page_sz * msl->memseg_arr.len);
+
+	if (addr < start || addr >= end)
+		return NULL;
+
+	/* now, calculate index */
+	arr = &msl->memseg_arr;
+	ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
+	return rte_fbarray_get(arr, ms_idx);
+}
+
+static struct rte_memseg_list *
+virt2memseg_list(const void *addr)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	int msl_idx;
+
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		void *start, *end;
+		msl = &mcfg->memsegs[msl_idx];
+
+		start = msl->base_va;
+		end = RTE_PTR_ADD(start,
+				(size_t)msl->page_sz * msl->memseg_arr.len);
+		if (addr >= start && addr < end)
+			break;
+	}
+	/* if we didn't find our memseg list */
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS)
+		return NULL;
+	return msl;
+}
+
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *addr)
+{
+	return virt2memseg_list(addr);
 }
 
 struct virtiova {
@@ -136,7 +518,8 @@ struct virtiova {
 	void *virt;
 };
 static int
-find_virt(const struct rte_memseg *ms, void *arg)
+find_virt(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct virtiova *vi = arg;
 	if (vi->iova >= ms->iova && vi->iova < (ms->iova + ms->len)) {
@@ -147,6 +530,19 @@ find_virt(const struct rte_memseg *ms, void *arg)
 	}
 	return 0;
 }
+static int
+find_virt_legacy(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, size_t len, void *arg)
+{
+	struct virtiova *vi = arg;
+	if (vi->iova >= ms->iova && vi->iova < (ms->iova + len)) {
+		size_t offset = vi->iova - ms->iova;
+		vi->virt = RTE_PTR_ADD(ms->addr, offset);
+		/* stop the walk */
+		return 1;
+	}
+	return 0;
+}
 
 __rte_experimental void *
 rte_mem_iova2virt(rte_iova_t iova)
@@ -156,54 +552,30 @@ rte_mem_iova2virt(rte_iova_t iova)
 	memset(&vi, 0, sizeof(vi));
 
 	vi.iova = iova;
-	rte_memseg_walk(find_virt, &vi);
+	/* for legacy mem, we can get away with scanning VA-contiguous segments,
+	 * as we know they are PA-contiguous as well
+	 */
+	if (internal_config.legacy_mem)
+		rte_memseg_contig_walk(find_virt_legacy, &vi);
+	else
+		rte_memseg_walk(find_virt, &vi);
 
 	return vi.virt;
 }
 
-struct virtms {
-	const void *virt;
-	struct rte_memseg *ms;
-};
-static int
-find_memseg(const struct rte_memseg *ms, void *arg)
-{
-	struct virtms *vm = arg;
-
-	if (arg >= ms->addr && arg < RTE_PTR_ADD(ms->addr, ms->len)) {
-		struct rte_memseg *memseg, *found_ms;
-		int idx;
-
-		memseg = rte_eal_get_configuration()->mem_config->memseg;
-		idx = ms - memseg;
-		found_ms = &memseg[idx];
-
-		vm->ms = found_ms;
-		return 1;
-	}
-	return 0;
-}
-
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *addr)
+rte_mem_virt2memseg(const void *addr, const struct rte_memseg_list *msl)
 {
-	struct virtms vm;
-
-	memset(&vm, 0, sizeof(vm));
-
-	vm.virt = addr;
-
-	rte_memseg_walk(find_memseg, &vm);
-
-	return vm.ms;
+	return virt2memseg(addr, msl != NULL ? msl :
+			rte_mem_virt2memseg_list(addr));
 }
 
 static int
-physmem_size(const struct rte_memseg *ms, void *arg)
+physmem_size(const struct rte_memseg_list *msl, void *arg)
 {
 	uint64_t *total_len = arg;
 
-	*total_len += ms->len;
+	*total_len += msl->memseg_arr.count * msl->page_sz;
 
 	return 0;
 }
@@ -214,32 +586,39 @@ rte_eal_get_physmem_size(void)
 {
 	uint64_t total_len = 0;
 
-	rte_memseg_walk(physmem_size, &total_len);
+	rte_memseg_list_walk(physmem_size, &total_len);
 
 	return total_len;
 }
 
 static int
-dump_memseg(const struct rte_memseg *ms, void *arg)
+dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
+		void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i = ms - mcfg->memseg;
+	int msl_idx, ms_idx;
 	FILE *f = arg;
 
-	if (i < 0 || i >= RTE_MAX_MEMSEG)
+	msl_idx = msl - mcfg->memsegs;
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
 		return -1;
 
-	fprintf(f, "Segment %u: IOVA:0x%"PRIx64", len:%zu, "
+	ms_idx = rte_fbarray_find_idx(&msl->memseg_arr, ms);
+	if (ms_idx < 0)
+		return -1;
+
+	fprintf(f, "Segment %i-%i: IOVA:0x%"PRIx64", len:%zu, "
 			"virt:%p, socket_id:%"PRId32", "
 			"hugepage_sz:%"PRIu64", nchannel:%"PRIx32", "
-			"nrank:%"PRIx32"\n", i,
-			mcfg->memseg[i].iova,
-			mcfg->memseg[i].len,
-			mcfg->memseg[i].addr,
-			mcfg->memseg[i].socket_id,
-			mcfg->memseg[i].hugepage_sz,
-			mcfg->memseg[i].nchannel,
-			mcfg->memseg[i].nrank);
+			"nrank:%"PRIx32"\n",
+			msl_idx, ms_idx,
+			ms->iova,
+			ms->len,
+			ms->addr,
+			ms->socket_id,
+			ms->hugepage_sz,
+			ms->nchannel,
+			ms->nrank);
 
 	return 0;
 }
@@ -289,55 +668,89 @@ rte_mem_lock_page(const void *virt)
 }
 
 int __rte_experimental
-rte_memseg_walk(rte_memseg_walk_t func, void *arg)
+rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		ret = func(ms, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			int n_segs;
+			size_t len;
+
+			ms = rte_fbarray_get(arr, ms_idx);
+
+			/* find how many more segments there are, starting with
+			 * this one.
+			 */
+			n_segs = rte_fbarray_find_contig_used(arr, ms_idx);
+			len = n_segs * msl->page_sz;
+
+			ret = func(msl, ms, len, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr,
+					ms_idx + n_segs);
+		}
 	}
 	return 0;
 }
 
 int __rte_experimental
-rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
+rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int i, j, ret;
+	int i, ms_idx, ret = 0;
 
-	for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-		const struct rte_memseg *ms = &mcfg->memseg[i];
-		size_t total_len;
-		void *end_addr;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
 
-		if (ms->addr == NULL)
+		if (msl->memseg_arr.count == 0)
 			continue;
 
-		end_addr = RTE_PTR_ADD(ms->addr, ms->len);
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			else if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
 
-		/* check how many more segments are contiguous to this one */
-		for (j = i + 1; j < RTE_MAX_MEMSEG; j++) {
-			const struct rte_memseg *next = &mcfg->memseg[j];
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
 
-			if (next->addr != end_addr)
-				break;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
-			end_addr = RTE_PTR_ADD(next->addr, next->len);
-			i++;
-		}
-		total_len = RTE_PTR_DIFF(end_addr, ms->addr);
+		if (msl->base_va == NULL)
+			continue;
 
-		ret = func(ms, total_len, arg);
+		ret = func(msl, arg);
 		if (ret < 0)
 			return -1;
 		if (ret > 0)
@@ -350,9 +763,25 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 int
 rte_eal_memory_init(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int retval;
 	RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
-	const int retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+	if (!mcfg)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
+#ifndef RTE_ARCH_64
+			memseg_primary_init_32() :
+#else
+			memseg_primary_init() :
+#endif
+			memseg_secondary_init();
+
+	if (retval < 0)
+		return -1;
+
+	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
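
For reference, a minimal sketch of how a caller might consume the reworked per-segment walk above; the callback name and the per-socket accounting are illustrative, not part of this series:

    #include <stdint.h>
    #include <rte_eal_memconfig.h>
    #include <rte_memory.h>

    /* tally mapped hugepage bytes per socket using the new callback
     * signature, which passes the owning memseg list along with each memseg
     */
    static int
    count_socket_mem(const struct rte_memseg_list *msl,
            const struct rte_memseg *ms, void *arg)
    {
        uint64_t *per_socket = arg; /* assumed RTE_MAX_NUMA_NODES entries */

        per_socket[msl->socket_id] += ms->len;
        return 0; /* returning 0 continues the walk */
    }

    /* usage: uint64_t bytes[RTE_MAX_NUMA_NODES] = {0};
     *        rte_memseg_walk(count_socket_mem, bytes);
     */
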
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index d60bde7..1f5f753 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -239,10 +239,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
 	mz->len = (requested_len == 0 ? elem->size : requested_len);
-	mz->hugepage_sz = elem->ms->hugepage_sz;
-	mz->socket_id = elem->ms->socket_id;
+	mz->hugepage_sz = elem->msl->page_sz;
+	mz->socket_id = elem->msl->socket_id;
 	mz->flags = 0;
-	mz->memseg_id = elem->ms - rte_eal_get_configuration()->mem_config->memseg;
 
 	return mz;
 }
@@ -364,20 +363,50 @@ static void
 dump_memzone(const struct rte_memzone *mz, void *arg)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl = NULL;
+	void *cur_addr, *mz_end;
+	struct rte_memseg *ms;
+	int mz_idx, ms_idx;
+	size_t page_sz;
 	FILE *f = arg;
-	int mz_idx;
 
 	mz_idx = mz - mcfg->memzone;
 
-	fprintf(f, "Zone %u: name:<%s>, IO:0x%"PRIx64", len:0x%zx, virt:%p, "
+	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
 			mz_idx,
 			mz->name,
-			mz->iova,
 			mz->len,
 			mz->addr,
 			mz->socket_id,
 			mz->flags);
+
+	/* go through each page occupied by this memzone */
+	msl = rte_mem_virt2memseg_list(mz->addr);
+	if (!msl) {
+		RTE_LOG(DEBUG, EAL, "Skipping bad memzone\n");
+		return;
+	}
+	page_sz = (size_t)mz->hugepage_sz;
+	cur_addr = RTE_PTR_ALIGN_FLOOR(mz->addr, page_sz);
+	mz_end = RTE_PTR_ADD(cur_addr, mz->len);
+
+	fprintf(f, "physical segments used:\n");
+	ms_idx = RTE_PTR_DIFF(mz->addr, msl->base_va) / page_sz;
+	ms = rte_fbarray_get(&msl->memseg_arr, ms_idx);
+
+	do {
+		fprintf(f, "  addr: %p iova: 0x%" PRIx64 " "
+				"len: 0x%zx "
+				"pagesz: 0x%zx\n",
+			cur_addr, ms->iova, ms->len, page_sz);
+
+		/* advance VA to next page */
+		cur_addr = RTE_PTR_ADD(cur_addr, page_sz);
+
+		/* memzones occupy contiguous segments */
+		++ms;
+	} while (cur_addr < mz_end);
 }
 
 /* Dump all reserved memory zones on console */
@@ -394,7 +423,6 @@ int
 rte_eal_memzone_init(void)
 {
 	struct rte_mem_config *mcfg;
-	const struct rte_memseg *memseg;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
@@ -403,12 +431,6 @@ rte_eal_memzone_init(void)
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
-	memseg = rte_eal_get_physmem_layout();
-	if (memseg == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot get physical layout\n", __func__);
-		return -1;
-	}
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
 	/* delete all zones */
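
As a usage reference for the page-walking logic in dump_memzone() above, a minimal sketch of resolving an arbitrary mapped address to its backing memseg by the same index arithmetic; the helper name is made up, and the same result can be obtained directly from rte_mem_virt2memseg(addr, NULL) as the updated unit tests in this series do:

    #include <rte_common.h>
    #include <rte_eal_memconfig.h>
    #include <rte_fbarray.h>
    #include <rte_memory.h>

    /* resolve an address inside DPDK memory to its memseg via the owning
     * memseg list's base VA and page size
     */
    static const struct rte_memseg *
    addr_to_memseg(const void *addr)
    {
        struct rte_memseg_list *msl;
        int ms_idx;

        msl = rte_mem_virt2memseg_list(addr);
        if (msl == NULL)
            return NULL;

        ms_idx = RTE_PTR_DIFF(addr, msl->base_va) / msl->page_sz;
        return rte_fbarray_get(&msl->memseg_arr, ms_idx);
    }
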
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index 1d519bb..ad1b0b6 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -22,7 +22,6 @@ struct hugepage_file {
 	size_t size;        /**< the page size */
 	int socket_id;      /**< NUMA socket ID */
 	int file_id;        /**< the '%d' in HUGEFILE_FMT */
-	int memseg_id;      /**< the memory segment to which page belongs */
 	char filepath[MAX_HUGEPAGE_PATH]; /**< path to backing file on filesystem */
 };
 
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index fda087b..5cf7102 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -23,7 +23,7 @@ struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
 	const char *hugedir;    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
-				/**< number of hugepages of that size on each socket */
+	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
 };
 
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 29fa0b6..b745e18 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -12,12 +12,30 @@
 #include <rte_malloc_heap.h>
 #include <rte_rwlock.h>
 #include <rte_pause.h>
+#include <rte_fbarray.h>
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
 /**
+ * A memseg list is a special case, as we need to store additional data
+ * together with the array itself.
+ */
+struct rte_memseg_list {
+	RTE_STD_C11
+	union {
+		void *base_va;
+		/**< Base virtual address for this memseg list. */
+		uint64_t addr_64;
+		/**< Makes sure addr is always 64-bits */
+	};
+	int socket_id; /**< Socket ID for all memsegs in this list. */
+	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	struct rte_fbarray memseg_arr;
+};
+
+/**
  * the structure for the memory configuration for the RTE.
  * Used by the rte_config structure. It is separated out, as for multi-process
  * support, the memory details should be shared across instances
@@ -43,9 +61,11 @@ struct rte_mem_config {
 	uint32_t memzone_cnt; /**< Number of allocated memzones */
 
 	/* memory segments and zones */
-	struct rte_memseg memseg[RTE_MAX_MEMSEG];    /**< Physmem descriptors. */
 	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
 
+	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
+	/**< list of dynamic arrays holding memsegs */
+
 	struct rte_tailq_head tailq_head[RTE_MAX_TAILQ]; /**< Tailqs for objects */
 
 	/* Heaps of Malloc per socket */
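
To make the new layout concrete, a minimal sketch of inspecting these lists through the list-walk API added elsewhere in this patch; the callback name is illustrative, and memseg_arr.count / memseg_arr.len are the used and total slot counts of the backing fbarray:

    #include <inttypes.h>
    #include <stdio.h>
    #include <rte_eal_memconfig.h>
    #include <rte_memory.h>

    /* print socket, page size and occupancy of every allocated memseg list */
    static int
    dump_msl(const struct rte_memseg_list *msl, void *arg)
    {
        FILE *f = arg;

        fprintf(f, "socket %i, page size 0x%" PRIx64 ", %u of %u pages used\n",
                msl->socket_id, msl->page_sz,
                (unsigned int)msl->memseg_arr.count,
                (unsigned int)msl->memseg_arr.len);
        return 0;
    }

    /* usage: rte_memseg_list_walk(dump_msl, stdout); */
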
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b3d7e61..55383c4 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -23,6 +23,9 @@ extern "C" {
 #include <rte_compat.h>
 #include <rte_config.h>
 
+/* forward declaration for pointers */
+struct rte_memseg_list;
+
 __extension__
 enum rte_page_sizes {
 	RTE_PGSIZE_4K    = 1ULL << 12,
@@ -151,7 +154,18 @@ rte_mem_iova2virt(rte_iova_t iova);
  *   Memseg pointer on success, or NULL on error.
  */
 __rte_experimental struct rte_memseg *
-rte_mem_virt2memseg(const void *virt);
+rte_mem_virt2memseg(const void *virt, const struct rte_memseg_list *msl);
+
+/**
+ * Get memseg list corresponding to virtual memory address.
+ *
+ * @param virt
+ *   The virtual address.
+ * @return
+ *   Memseg list to which this virtual address belongs.
+ */
+__rte_experimental struct rte_memseg_list *
+rte_mem_virt2memseg_list(const void *virt);
 
 /**
  * Memseg walk function prototype.
@@ -160,7 +174,8 @@ rte_mem_virt2memseg(const void *virt);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
+typedef int (*rte_memseg_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, void *arg);
 
 /**
  * Memseg contig walk function prototype. This will trigger a callback on every
@@ -171,8 +186,19 @@ typedef int (*rte_memseg_walk_t)(const struct rte_memseg *ms, void *arg);
  * Returning 1 will stop the walk
  * Returning -1 will stop the walk and report error
  */
-typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg *ms,
-		size_t len, void *arg);
+typedef int (*rte_memseg_contig_walk_t)(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg);
+
+/**
+ * Memseg list walk function prototype. This will trigger a callback on every
+ * allocated memseg list.
+ *
+ * Returning 0 will continue walk
+ * Returning 1 will stop the walk
+ * Returning -1 will stop the walk and report error
+ */
+typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
+		void *arg);
 
 /**
  * Walk list of all memsegs.
@@ -205,21 +231,19 @@ int __rte_experimental
 rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 
 /**
- * Get the layout of the available physical memory.
- *
- * It can be useful for an application to have the full physical
- * memory layout to decide the size of a memory zone to reserve. This
- * table is stored in rte_config (see rte_eal_get_configuration()).
+ * Walk each allocated memseg list.
  *
+ * @param func
+ *   Iterator function
+ * @param arg
+ *   Argument passed to iterator
  * @return
- *  - On success, return a pointer to a read-only table of struct
- *    rte_physmem_desc elements, containing the layout of all
- *    addressable physical memory. The last element of the table
- *    contains a NULL address.
- *  - On error, return NULL. This should not happen since it is a fatal
- *    error that will probably cause the entire system to panic.
+ *   0 if walked over the entire list
+ *   1 if stopped by the user
+ *   -1 if user function reported error
  */
-const struct rte_memseg *rte_eal_get_physmem_layout(void);
+int __rte_experimental
+rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 
 /**
  * Dump the physical memory layout to a file.
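
A minimal sketch of the contiguous-walk prototype above in use, where each callback invocation covers one run of adjacent pages within a memseg list; the callback name is illustrative:

    #include <stddef.h>
    #include <rte_common.h>
    #include <rte_memory.h>

    /* track the largest contiguous chunk reported by the walk */
    static int
    biggest_chunk(const struct rte_memseg_list *msl __rte_unused,
            const struct rte_memseg *ms __rte_unused, size_t len, void *arg)
    {
        size_t *biggest = arg;

        if (len > *biggest)
            *biggest = len;
        return 0;
    }

    /* usage: size_t sz = 0; rte_memseg_contig_walk(biggest_chunk, &sz); */
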
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index e2630fd..0eeb94f 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -68,7 +68,6 @@ struct rte_memzone {
 	int32_t socket_id;                /**< NUMA socket ID. */
 
 	uint32_t flags;                   /**< Characteristics of this memzone. */
-	uint32_t memseg_id;               /**< Memseg it belongs. */
 } __attribute__((__packed__));
 
 /**
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 87695b9..685aac4 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -27,11 +27,11 @@
  * Initialize a general malloc_elem header structure
  */
 void
-malloc_elem_init(struct malloc_elem *elem,
-		struct malloc_heap *heap, const struct rte_memseg *ms, size_t size)
+malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
+		struct rte_memseg_list *msl, size_t size)
 {
 	elem->heap = heap;
-	elem->ms = ms;
+	elem->msl = msl;
 	elem->prev = NULL;
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
@@ -100,7 +100,7 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg *ms __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
 		void *start, size_t size)
 {
 	rte_iova_t cur, expected;
@@ -191,7 +191,7 @@ elem_start_pt(struct malloc_elem *elem, size_t size, unsigned align,
 			 * couldn't fit all data into one physically contiguous
 			 * block, try again with lower addresses.
 			 */
-			if (!elem_check_phys_contig(elem->ms,
+			if (!elem_check_phys_contig(elem->msl,
 					(void *)new_data_start,
 					new_data_size)) {
 				elem_size -= align;
@@ -225,7 +225,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t old_elem_size = (uintptr_t)split_pt - (uintptr_t)elem;
 	const size_t new_elem_size = elem->size - old_elem_size;
 
-	malloc_elem_init(split_pt, elem->heap, elem->ms, new_elem_size);
+	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 34bd268..620dd44 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -7,7 +7,7 @@
 
 #include <stdbool.h>
 
-#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /* dummy definition of struct so we can use pointers to it in malloc_elem struct */
 struct malloc_heap;
@@ -26,7 +26,7 @@ struct malloc_elem {
 	/**< points to next elem in memseg */
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
-	const struct rte_memseg *ms;
+	struct rte_memseg_list *msl;
 	volatile enum elem_state state;
 	uint32_t pad;
 	size_t size;
@@ -113,7 +113,7 @@ malloc_elem_from_data(const void *data)
 void
 malloc_elem_init(struct malloc_elem *elem,
 		struct malloc_heap *heap,
-		const struct rte_memseg *ms,
+		struct rte_memseg_list *msl,
 		size_t size);
 
 void
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 79914fc..0ef2c45 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -21,6 +21,7 @@
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
 
+#include "eal_internal_cfg.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -62,36 +63,49 @@ check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
 }
 
 /*
- * Expand the heap with a memseg.
- * This reserves the zone and sets a dummy malloc_elem header at the end
- * to prevent overflow. The rest of the zone is added to free list as a single
- * large free block
+ * Expand the heap with a memory area.
  */
+static struct malloc_elem *
+malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
+		void *start, size_t len)
+{
+	struct malloc_elem *elem = start;
+
+	malloc_elem_init(elem, heap, msl, len);
+
+	malloc_elem_insert(elem);
+
+	elem = malloc_elem_join_adjacent_free(elem);
+
+	malloc_elem_free_list_insert(elem);
+
+	heap->total_size += len;
+
+	return elem;
+}
+
 static int
-malloc_heap_add_memseg(const struct rte_memseg *ms, void *arg __rte_unused)
+malloc_add_seg(const struct rte_memseg_list *msl,
+		const struct rte_memseg *ms, size_t len, void *arg __rte_unused)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_elem *start_elem;
-	struct rte_memseg *found_ms;
+	struct rte_memseg_list *found_msl;
 	struct malloc_heap *heap;
-	size_t elem_size;
-	int ms_idx;
-
-	heap = &mcfg->malloc_heaps[ms->socket_id];
+	int msl_idx;
 
-	/* ms is const, so find it */
-	ms_idx = ms - mcfg->memseg;
-	found_ms = &mcfg->memseg[ms_idx];
+	heap = &mcfg->malloc_heaps[msl->socket_id];
 
-	start_elem = (struct malloc_elem *)found_ms->addr;
-	elem_size = ms->len - MALLOC_ELEM_OVERHEAD;
+	/* msl is const, so find it */
+	msl_idx = msl - mcfg->memsegs;
+	found_msl = &mcfg->memsegs[msl_idx];
 
-	malloc_elem_init(start_elem, heap, found_ms, elem_size);
-	malloc_elem_insert(start_elem);
-	malloc_elem_free_list_insert(start_elem);
+	if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
+		return -1;
 
-	heap->total_size += elem_size;
+	malloc_heap_add_memory(heap, found_msl, ms->addr, len);
 
+	RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20,
+			msl->socket_id);
 	return 0;
 }
 
@@ -114,7 +128,8 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
 				!!elem; elem = LIST_NEXT(elem, free_list)) {
 			if (malloc_elem_can_hold(elem, size, align, bound,
 					contig)) {
-				if (check_hugepage_sz(flags, elem->ms->hugepage_sz))
+				if (check_hugepage_sz(flags,
+						elem->msl->page_sz))
 					return elem;
 				if (alt_elem == NULL)
 					alt_elem = elem;
@@ -263,7 +278,6 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
-	rte_memseg_walk(malloc_heap_add_memseg, NULL);
-
-	return 0;
+	/* add all IOVA-contiguous areas to the heap */
+	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 436818a..c6d3e57 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -242,17 +242,21 @@ rte_malloc_set_limit(__rte_unused const char *type,
 rte_iova_t
 rte_malloc_virt2iova(const void *addr)
 {
-	rte_iova_t iova;
-	const struct malloc_elem *elem = malloc_elem_from_data(addr);
+	const struct rte_memseg *ms;
+	struct malloc_elem *elem = malloc_elem_from_data(addr);
+
 	if (elem == NULL)
 		return RTE_BAD_IOVA;
-	if (elem->ms->iova == RTE_BAD_IOVA)
-		return RTE_BAD_IOVA;
 
 	if (rte_eal_iova_mode() == RTE_IOVA_VA)
-		iova = (uintptr_t)addr;
-	else
-		iova = elem->ms->iova +
-			RTE_PTR_DIFF(addr, elem->ms->addr);
-	return iova;
+		return (uintptr_t) addr;
+
+	ms = rte_mem_virt2memseg(addr, elem->msl);
+	if (ms == NULL)
+		return RTE_BAD_IOVA;
+
+	if (ms->iova == RTE_BAD_IOVA)
+		return RTE_BAD_IOVA;
+
+	return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
 }
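
For reference, a standalone sketch of the same virt-to-IOVA resolution for an arbitrary address inside DPDK memory; the helper name is made up, the include list is approximate, and it assumes, as the updated unit test in this series does, that passing NULL for the memseg list argument makes rte_mem_virt2memseg() search all lists:

    #include <rte_common.h>
    #include <rte_eal.h>
    #include <rte_memory.h>

    static rte_iova_t
    any_virt2iova(const void *addr)
    {
        const struct rte_memseg *ms;

        /* in VA mode, the IOVA is the virtual address itself */
        if (rte_eal_iova_mode() == RTE_IOVA_VA)
            return (uintptr_t)addr;

        ms = rte_mem_virt2memseg(addr, NULL);
        if (ms == NULL || ms->iova == RTE_BAD_IOVA)
            return RTE_BAD_IOVA;

        /* add the offset of the address within its page */
        return ms->iova + RTE_PTR_DIFF(addr, ms->addr);
    }
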
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b34e57a..ffcbd71 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -74,8 +74,8 @@ static int mem_cfg_fd = -1;
 static struct flock wr_lock = {
 		.l_type = F_WRLCK,
 		.l_whence = SEEK_SET,
-		.l_start = offsetof(struct rte_mem_config, memseg),
-		.l_len = sizeof(early_mem_config.memseg),
+		.l_start = offsetof(struct rte_mem_config, memsegs),
+		.l_len = sizeof(early_mem_config.memsegs),
 };
 
 /* Address of global and public configuration */
@@ -640,11 +640,14 @@ eal_parse_args(int argc, char **argv)
 }
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg)
+check_socket(const struct rte_memseg_list *msl, void *arg)
 {
-	int *socket = arg;
+	int *socket_id = arg;
 
-	return ms->socket_id == *socket;
+	if (msl->socket_id == *socket_id && msl->memseg_arr.count != 0)
+		return 1;
+
+	return 0;
 }
 
 static void
@@ -654,7 +657,7 @@ eal_check_mem_on_local_socket(void)
 
 	socket_id = rte_lcore_to_socket_id(rte_config.master_lcore);
 
-	if (rte_memseg_walk(check_mem, &socket_id) == 0)
+	if (rte_memseg_list_walk(check_socket, &socket_id) == 0)
 		RTE_LOG(WARNING, EAL, "WARNING: Master core has no memory on local socket!\n");
 }
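
A hypothetical caller, shown only to illustrate the list-walk return values relied on here: 0 means every list was visited, 1 means the callback stopped the walk early, and -1 means the callback reported an error. check_socket is the callback defined above; the wrapper name is made up:

    #include <stdio.h>
    #include <rte_lcore.h>
    #include <rte_memory.h>

    static void
    warn_if_no_local_mem(unsigned int lcore_id)
    {
        int socket_id = (int)rte_lcore_to_socket_id(lcore_id);
        int ret = rte_memseg_list_walk(check_socket, &socket_id);

        if (ret < 0)
            printf("walk failed\n");        /* callback returned -1 */
        else if (ret == 0)
            printf("no memory on socket %d\n", socket_id);
        /* ret == 1: stopped early, the socket has at least one mapped page */
    }
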
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 8bbf771..afebd42 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -15,6 +15,7 @@
 #include <unistd.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/stat.h>
 
 #include <rte_memory.h>
 #include <rte_eal.h>
@@ -160,6 +161,18 @@ get_hugepage_dir(uint64_t hugepage_sz)
 }
 
 /*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
  * Clear the hugepage directory of whatever hugepage files
  * there are. Checks if the file is locked (i.e.
  * if it's in use by another DPDK process).
@@ -189,6 +202,8 @@ clear_hugedir(const char * hugedir)
 	}
 
 	while(dirent != NULL){
+		struct flock lck = {0};
+
 		/* skip files that don't match the hugepage pattern */
 		if (fnmatch(filter, dirent->d_name, 0) > 0) {
 			dirent = readdir(dir);
@@ -205,11 +220,17 @@ clear_hugedir(const char * hugedir)
 		}
 
 		/* non-blocking lock */
-		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = get_file_size(fd);
+
+		lck_result = fcntl(fd, F_SETLK, &lck);
 
 		/* if lock succeeds, unlock and remove the file */
 		if (lck_result != -1) {
-			flock(fd, LOCK_UN);
+			lck.l_type = F_UNLCK;
+			fcntl(fd, F_SETLK, &lck);
 			unlinkat(dir_fd, dirent->d_name, 0);
 		}
 		close (fd);
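
The switch from flock() to fcntl() record locks matters because a record lock can cover an arbitrary byte range of the hugepage file, which the code above uses to lock page-sized ranges, whereas flock() only ever locks the whole file. A standalone sketch of the pattern, independent of the EAL code (the function name is made up):

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* take a non-blocking shared lock on [start, start + len) of fd;
     * returns 0 on success, -1 if another process holds a conflicting lock
     */
    static int
    lock_range_shared(int fd, off_t start, off_t len)
    {
        struct flock lck;

        memset(&lck, 0, sizeof(lck));
        lck.l_type = F_RDLCK;      /* shared (read) lock */
        lck.l_whence = SEEK_SET;
        lck.l_start = start;
        lck.l_len = len;           /* 0 would mean "until end of file" */

        return fcntl(fd, F_SETLK, &lck);
    }
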
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 17c559f..daab364 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -253,13 +253,12 @@ void numa_error(char *where)
  */
 static unsigned
 map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
-		  uint64_t *essential_memory __rte_unused, int orig)
+		  uint64_t *essential_memory __rte_unused)
 {
 	int fd;
 	unsigned i;
 	void *virtaddr;
-	void *vma_addr = NULL;
-	size_t vma_len = 0;
+	struct flock lck = {0};
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
 	int node_id = -1;
 	int essential_prev = 0;
@@ -274,7 +273,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		have_numa = false;
 	}
 
-	if (orig && have_numa) {
+	if (have_numa) {
 		RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
 		if (get_mempolicy(&oldpolicy, oldmask->maskp,
 				  oldmask->size + 1, 0, 0) < 0) {
@@ -290,6 +289,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 #endif
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
+		struct hugepage_file *hf = &hugepg_tbl[i];
 		uint64_t hugepage_sz = hpi->hugepage_sz;
 
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
@@ -324,66 +324,14 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 #endif
 
-		if (orig) {
-			hugepg_tbl[i].file_id = i;
-			hugepg_tbl[i].size = hugepage_sz;
-			eal_get_hugefile_path(hugepg_tbl[i].filepath,
-					sizeof(hugepg_tbl[i].filepath), hpi->hugedir,
-					hugepg_tbl[i].file_id);
-			hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
-		}
-#ifndef RTE_ARCH_64
-		/* for 32-bit systems, don't remap 1G and 16G pages, just reuse
-		 * original map address as final map address.
-		 */
-		else if ((hugepage_sz == RTE_PGSIZE_1G)
-			|| (hugepage_sz == RTE_PGSIZE_16G)) {
-			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
-			hugepg_tbl[i].orig_va = NULL;
-			continue;
-		}
-#endif
-		else if (vma_len == 0) {
-			unsigned j, num_pages;
-
-			/* reserve a virtual area for next contiguous
-			 * physical block: count the number of
-			 * contiguous physical pages. */
-			for (j = i+1; j < hpi->num_pages[0] ; j++) {
-#ifdef RTE_ARCH_PPC_64
-				/* The physical addresses are sorted in
-				 * descending order on PPC64 */
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr - hugepage_sz)
-					break;
-#else
-				if (hugepg_tbl[j].physaddr !=
-				    hugepg_tbl[j-1].physaddr + hugepage_sz)
-					break;
-#endif
-			}
-			num_pages = j - i;
-			vma_len = num_pages * hugepage_sz;
-
-			/* get the biggest virtual memory area up to
-			 * vma_len. If it fails, vma_addr is NULL, so
-			 * let the kernel provide the address. */
-			vma_addr = eal_get_virtual_area(NULL, &vma_len,
-					hpi->hugepage_sz,
-					EAL_VIRTUAL_AREA_ALLOW_SHRINK |
-					EAL_VIRTUAL_AREA_UNMAP,
-#ifdef RTE_ARCH_PPC_64
-					MAP_HUGETLB
-#else
-					0
-#endif
-					);
-			if (vma_addr == NULL)
-				vma_len = hugepage_sz;
-		}
+		hf->file_id = i;
+		hf->size = hugepage_sz;
+		eal_get_hugefile_path(hf->filepath, sizeof(hf->filepath),
+				hpi->hugedir, hf->file_id);
+		hf->filepath[sizeof(hf->filepath) - 1] = '\0';
 
 		/* try to create hugepage file */
-		fd = open(hugepg_tbl[i].filepath, O_CREAT | O_RDWR, 0600);
+		fd = open(hf->filepath, O_CREAT | O_RDWR, 0600);
 		if (fd < 0) {
 			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
 					strerror(errno));
@@ -391,8 +339,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		/* map the segment, and populate page tables,
-		 * the kernel fills this segment with zeros */
-		virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
+		 * the kernel fills this segment with zeros. we don't care where
+		 * this gets mapped - we already have contiguous memory areas
+		 * ready for us to map into.
+		 */
+		virtaddr = mmap(NULL, hugepage_sz, PROT_READ | PROT_WRITE,
 				MAP_SHARED | MAP_POPULATE, fd, 0);
 		if (virtaddr == MAP_FAILED) {
 			RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
@@ -401,44 +352,38 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 			goto out;
 		}
 
-		if (orig) {
-			hugepg_tbl[i].orig_va = virtaddr;
-		}
-		else {
-			/* rewrite physical addresses in IOVA as VA mode */
-			if (rte_eal_iova_mode() == RTE_IOVA_VA)
-				hugepg_tbl[i].physaddr = (uintptr_t)virtaddr;
-			hugepg_tbl[i].final_va = virtaddr;
-		}
+		hf->orig_va = virtaddr;
 
-		if (orig) {
-			/* In linux, hugetlb limitations, like cgroup, are
-			 * enforced at fault time instead of mmap(), even
-			 * with the option of MAP_POPULATE. Kernel will send
-			 * a SIGBUS signal. To avoid to be killed, save stack
-			 * environment here, if SIGBUS happens, we can jump
-			 * back here.
-			 */
-			if (huge_wrap_sigsetjmp()) {
-				RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
-					"hugepages of size %u MB\n",
-					(unsigned)(hugepage_sz / 0x100000));
-				munmap(virtaddr, hugepage_sz);
-				close(fd);
-				unlink(hugepg_tbl[i].filepath);
+		/* In Linux, hugetlb limitations, like cgroup, are
+		 * enforced at fault time instead of mmap(), even
+		 * with the option of MAP_POPULATE. The kernel will send
+		 * a SIGBUS signal. To avoid being killed, save the stack
+		 * environment here; if SIGBUS happens, we can jump
+		 * back here.
+		 */
+		if (huge_wrap_sigsetjmp()) {
+			RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more "
+				"hugepages of size %u MB\n",
+				(unsigned int)(hugepage_sz / 0x100000));
+			munmap(virtaddr, hugepage_sz);
+			close(fd);
+			unlink(hugepg_tbl[i].filepath);
 #ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
-				if (maxnode)
-					essential_memory[node_id] =
-						essential_prev;
+			if (maxnode)
+				essential_memory[node_id] =
+					essential_prev;
 #endif
-				goto out;
-			}
-			*(int *)virtaddr = 0;
+			goto out;
 		}
+		*(int *)virtaddr = 0;
 
 
-		/* set shared flock on the file. */
-		if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = hugepage_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
 			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed:%s \n",
 				__func__, strerror(errno));
 			close(fd);
@@ -446,9 +391,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 		}
 
 		close(fd);
-
-		vma_addr = (char *)vma_addr + hugepage_sz;
-		vma_len -= hugepage_sz;
 	}
 
 out:
@@ -470,20 +412,6 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi,
 	return i;
 }
 
-/* Unmap all hugepages from original mapping */
-static int
-unmap_all_hugepages_orig(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
-{
-        unsigned i;
-        for (i = 0; i < hpi->num_pages[0]; i++) {
-                if (hugepg_tbl[i].orig_va) {
-                        munmap(hugepg_tbl[i].orig_va, hpi->hugepage_sz);
-                        hugepg_tbl[i].orig_va = NULL;
-                }
-        }
-        return 0;
-}
-
 /*
  * Parse /proc/self/numa_maps to get the NUMA socket ID for each huge
  * page.
@@ -623,7 +551,7 @@ copy_hugepages_to_shared_mem(struct hugepage_file * dst, int dest_size,
 	int src_pos, dst_pos = 0;
 
 	for (src_pos = 0; src_pos < src_size; src_pos++) {
-		if (src[src_pos].final_va != NULL) {
+		if (src[src_pos].orig_va != NULL) {
 			/* error on overflow attempt */
 			if (dst_pos == dest_size)
 				return -1;
@@ -694,9 +622,10 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 						unmap_len = hp->size;
 
 						/* get start addr and len of the remaining segment */
-						munmap(hp->final_va, (size_t) unmap_len);
+						munmap(hp->orig_va,
+							(size_t)unmap_len);
 
-						hp->final_va = NULL;
+						hp->orig_va = NULL;
 						if (unlink(hp->filepath) == -1) {
 							RTE_LOG(ERR, EAL, "%s(): Removing %s failed: %s\n",
 									__func__, hp->filepath, strerror(errno));
@@ -715,6 +644,413 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 	return 0;
 }
 
+static int
+remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *msl;
+	struct rte_fbarray *arr;
+	int cur_page, seg_len;
+	unsigned int msl_idx;
+	int ms_idx;
+	uint64_t page_sz;
+	size_t memseg_len;
+	int socket_id;
+
+	page_sz = hugepages[seg_start].size;
+	socket_id = hugepages[seg_start].socket_id;
+	seg_len = seg_end - seg_start;
+
+	RTE_LOG(DEBUG, EAL, "Attempting to map %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20ULL, socket_id);
+
+	/* find free space in memseg lists */
+	for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS; msl_idx++) {
+		bool empty;
+		msl = &mcfg->memsegs[msl_idx];
+		arr = &msl->memseg_arr;
+
+		if (msl->page_sz != page_sz)
+			continue;
+		if (msl->socket_id != socket_id)
+			continue;
+
+		/* leave space for a hole if array is not empty */
+		empty = arr->count == 0;
+		ms_idx = rte_fbarray_find_next_n_free(arr, 0,
+				seg_len + (empty ? 0 : 1));
+
+		/* memseg list is full? */
+		if (ms_idx < 0)
+			continue;
+
+		/* leave some space between memsegs, they are not IOVA
+		 * contiguous, so they shouldn't be VA contiguous either.
+		 */
+		if (!empty)
+			ms_idx++;
+		break;
+	}
+	if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+		RTE_LOG(ERR, EAL, "Could not find space for memseg. Please increase %s and/or %s in configuration.\n",
+				RTE_STR(CONFIG_RTE_MAX_MEMSEG_PER_TYPE),
+				RTE_STR(CONFIG_RTE_MAX_MEM_PER_TYPE));
+		return -1;
+	}
+
+#ifdef RTE_ARCH_PPC_64
+	/* for PPC64 we go through the list backwards */
+	for (cur_page = seg_end - 1; cur_page >= seg_start;
+			cur_page--, ms_idx++) {
+#else
+	for (cur_page = seg_start; cur_page < seg_end; cur_page++, ms_idx++) {
+#endif
+		struct hugepage_file *hfile = &hugepages[cur_page];
+		struct rte_memseg *ms = rte_fbarray_get(arr, ms_idx);
+		struct flock lck;
+		void *addr;
+		int fd;
+
+		fd = open(hfile->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			return -1;
+		}
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = page_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "Could not lock '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+		memseg_len = (size_t)page_sz;
+		addr = RTE_PTR_ADD(msl->base_va, ms_idx * memseg_len);
+
+		/* we know this address is already mmapped by memseg list, so
+		 * using MAP_FIXED here is safe
+		 */
+		addr = mmap(addr, page_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
+		if (addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Couldn't remap '%s': %s\n",
+					hfile->filepath, strerror(errno));
+			close(fd);
+			return -1;
+		}
+
+		/* we have a new address, so unmap previous one */
+#ifndef RTE_ARCH_64
+		/* in 32-bit legacy mode, we have already unmapped the page */
+		if (!internal_config.legacy_mem)
+			munmap(hfile->orig_va, page_sz);
+#else
+		munmap(hfile->orig_va, page_sz);
+#endif
+
+		hfile->orig_va = NULL;
+		hfile->final_va = addr;
+
+		/* rewrite physical addresses in IOVA as VA mode */
+		if (rte_eal_iova_mode() == RTE_IOVA_VA)
+			hfile->physaddr = (uintptr_t)addr;
+
+		/* set up memseg data */
+		ms->addr = addr;
+		ms->hugepage_sz = page_sz;
+		ms->len = memseg_len;
+		ms->iova = hfile->physaddr;
+		ms->socket_id = hfile->socket_id;
+		ms->nchannel = rte_memory_get_nchannel();
+		ms->nrank = rte_memory_get_nrank();
+
+		rte_fbarray_set_used(arr, ms_idx);
+
+		close(fd);
+	}
+	RTE_LOG(DEBUG, EAL, "Allocated %" PRIu64 "M on socket %i\n",
+			(seg_len * page_sz) >> 20, socket_id);
+	return 0;
+}
+
+#define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
+static int
+alloc_memseg_list(struct rte_memseg_list *msl, uint64_t page_sz,
+		int n_segs, int socket_id, int type_msl_idx)
+{
+	char name[RTE_FBARRAY_NAME_LEN];
+
+	snprintf(name, sizeof(name), MEMSEG_LIST_FMT, page_sz >> 10, socket_id,
+		 type_msl_idx);
+	if (rte_fbarray_init(&msl->memseg_arr, name, n_segs,
+			sizeof(struct rte_memseg))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memseg list: %s\n",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
+	msl->page_sz = page_sz;
+	msl->socket_id = socket_id;
+	msl->base_va = NULL;
+
+	RTE_LOG(DEBUG, EAL, "Memseg list allocated: 0x%zxkB at socket %i\n",
+			(size_t)page_sz >> 10, socket_id);
+
+	return 0;
+}
+
+static int
+alloc_va_space(struct rte_memseg_list *msl)
+{
+	uint64_t page_sz;
+	size_t mem_sz;
+	void *addr;
+	int flags = 0;
+
+#ifdef RTE_ARCH_PPC_64
+	flags |= MAP_HUGETLB;
+#endif
+
+	page_sz = msl->page_sz;
+	mem_sz = page_sz * msl->memseg_arr.len;
+
+	addr = eal_get_virtual_area(msl->base_va, &mem_sz, page_sz, 0, flags);
+	if (addr == NULL) {
+		if (rte_errno == EADDRNOTAVAIL)
+			RTE_LOG(ERR, EAL, "Could not mmap %llu bytes at [%p] - please use '--base-virtaddr' option\n",
+				(unsigned long long)mem_sz, msl->base_va);
+		else
+			RTE_LOG(ERR, EAL, "Cannot reserve memory\n");
+		return -1;
+	}
+	msl->base_va = addr;
+
+	return 0;
+}
+
+/*
+ * Our VA space is not preallocated yet, so preallocate it here. We need to know
+ * how many segments there are in order to map all pages into one address space,
+ * and leave appropriate holes between segments so that rte_malloc does not
+ * concatenate them into one big segment.
+ *
+ * we also need to unmap original pages to free up address space.
+ */
+static int __rte_unused
+prealloc_segments(struct hugepage_file *hugepages, int n_pages)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int cur_page, seg_start_page, end_seg, new_memseg;
+	unsigned int hpi_idx, socket, i;
+	int n_contig_segs, n_segs;
+	int msl_idx;
+
+	/* before we preallocate segments, we need to free up our VA space.
+	 * we're not removing files, and we already have information about
+	 * PA-contiguousness, so it is safe to unmap everything.
+	 */
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *hpi = &hugepages[cur_page];
+		munmap(hpi->orig_va, hpi->size);
+		hpi->orig_va = NULL;
+	}
+
+	/* we cannot know how many page sizes and sockets we have discovered, so
+	 * loop over all of them
+	 */
+	for (hpi_idx = 0; hpi_idx < internal_config.num_hugepage_sizes;
+			hpi_idx++) {
+		uint64_t page_sz =
+			internal_config.hugepage_info[hpi_idx].hugepage_sz;
+
+		for (i = 0; i < rte_socket_count(); i++) {
+			struct rte_memseg_list *msl;
+
+			socket = rte_socket_id_by_idx(i);
+			n_contig_segs = 0;
+			n_segs = 0;
+			seg_start_page = -1;
+
+			for (cur_page = 0; cur_page < n_pages; cur_page++) {
+				struct hugepage_file *prev, *cur;
+				int prev_seg_start_page = -1;
+
+				cur = &hugepages[cur_page];
+				prev = cur_page == 0 ? NULL :
+						&hugepages[cur_page - 1];
+
+				new_memseg = 0;
+				end_seg = 0;
+
+				if (cur->size == 0)
+					end_seg = 1;
+				else if (cur->socket_id != (int) socket)
+					end_seg = 1;
+				else if (cur->size != page_sz)
+					end_seg = 1;
+				else if (cur_page == 0)
+					new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+				/* On the PPC64 architecture, mmap always starts
+				 * from a higher address to a lower address. Here,
+				 * physical addresses are in descending order.
+				 */
+				else if ((prev->physaddr - cur->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#else
+				else if ((cur->physaddr - prev->physaddr) !=
+						cur->size)
+					new_memseg = 1;
+#endif
+				if (new_memseg) {
+					/* if we're already inside a segment,
+					 * new segment means end of current one
+					 */
+					if (seg_start_page != -1) {
+						end_seg = 1;
+						prev_seg_start_page =
+								seg_start_page;
+					}
+					seg_start_page = cur_page;
+				}
+
+				if (end_seg) {
+					if (prev_seg_start_page != -1) {
+						/* we've found a new segment */
+						n_contig_segs++;
+						n_segs += cur_page -
+							prev_seg_start_page;
+					} else if (seg_start_page != -1) {
+						/* we didn't find new segment,
+						 * but did end current one
+						 */
+						n_contig_segs++;
+						n_segs += cur_page -
+								seg_start_page;
+						seg_start_page = -1;
+						continue;
+					} else {
+						/* we're skipping this page */
+						continue;
+					}
+				}
+				/* segment continues */
+			}
+			/* check if we missed last segment */
+			if (seg_start_page != -1) {
+				n_contig_segs++;
+				n_segs += cur_page - seg_start_page;
+			}
+
+			/* if no segments were found, do not preallocate */
+			if (n_segs == 0)
+				continue;
+
+			/* we now have total number of pages that we will
+			 * allocate for this segment list. add separator pages
+			 * to the total count, and preallocate VA space.
+			 */
+			n_segs += n_contig_segs - 1;
+
+			/* now, preallocate VA space for these segments */
+
+			/* first, find suitable memseg list for this */
+			for (msl_idx = 0; msl_idx < RTE_MAX_MEMSEG_LISTS;
+					msl_idx++) {
+				msl = &mcfg->memsegs[msl_idx];
+
+				if (msl->base_va != NULL)
+					continue;
+				break;
+			}
+			if (msl_idx == RTE_MAX_MEMSEG_LISTS) {
+				RTE_LOG(ERR, EAL, "Not enough space in memseg lists, please increase %s\n",
+					RTE_STR(CONFIG_RTE_MAX_MEMSEG_LISTS));
+				return -1;
+			}
+
+			/* now, allocate fbarray itself */
+			if (alloc_memseg_list(msl, page_sz, n_segs, socket,
+						msl_idx) < 0)
+				return -1;
+
+			/* finally, allocate VA space */
+			if (alloc_va_space(msl) < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+/*
+ * We cannot reallocate memseg lists on the fly because PPC64 stores pages
+ * backwards, therefore we have to process the entire memseg first before
+ * remapping it into memseg list VA space.
+ */
+static int
+remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
+{
+	int cur_page, seg_start_page, new_memseg, ret;
+
+	seg_start_page = 0;
+	for (cur_page = 0; cur_page < n_pages; cur_page++) {
+		struct hugepage_file *prev, *cur;
+
+		new_memseg = 0;
+
+		cur = &hugepages[cur_page];
+		prev = cur_page == 0 ? NULL : &hugepages[cur_page - 1];
+
+		/* if size is zero, no more pages left */
+		if (cur->size == 0)
+			break;
+
+		if (cur_page == 0)
+			new_memseg = 1;
+		else if (cur->socket_id != prev->socket_id)
+			new_memseg = 1;
+		else if (cur->size != prev->size)
+			new_memseg = 1;
+#ifdef RTE_ARCH_PPC_64
+		/* On the PPC64 architecture, mmap always starts from a higher
+		 * address to a lower address. Here, physical addresses are in
+		 * descending order.
+		 */
+		else if ((prev->physaddr - cur->physaddr) != cur->size)
+			new_memseg = 1;
+#else
+		else if ((cur->physaddr - prev->physaddr) != cur->size)
+			new_memseg = 1;
+#endif
+
+		if (new_memseg) {
+			/* if this isn't the first time, remap segment */
+			if (cur_page != 0) {
+				ret = remap_segment(hugepages, seg_start_page,
+						cur_page);
+				if (ret != 0)
+					return -1;
+			}
+			/* remember where we started */
+			seg_start_page = cur_page;
+		}
+		/* continuation of previous memseg */
+	}
+	/* we were stopped, but we didn't remap the last segment, do it now */
+	if (cur_page != 0) {
+		ret = remap_segment(hugepages, seg_start_page,
+				cur_page);
+		if (ret != 0)
+			return -1;
+	}
+	return 0;
+}
+
 static inline uint64_t
 get_socket_mem_size(int socket)
 {
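
remap_segment() can use MAP_FIXED safely only because the memseg list's VA window was reserved beforehand (alloc_va_space() via eal_get_virtual_area()). A standalone sketch of that reserve-then-remap pattern, outside the EAL helpers; function names are made up and the exact protection and mmap flags used by eal_get_virtual_area() may differ:

    #include <stddef.h>
    #include <sys/mman.h>

    /* reserve a contiguous VA window: addresses only, no memory committed */
    static void *
    reserve_window(size_t sz)
    {
        void *va = mmap(NULL, sz, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return va == MAP_FAILED ? NULL : va;
    }

    /* map one hugepage file into a known slot inside the reserved window;
     * MAP_FIXED is safe because we own that address range
     */
    static void *
    map_page_into_window(void *slot, size_t page_sz, int fd)
    {
        void *va = mmap(slot, page_sz, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, 0);
        return va == MAP_FAILED ? NULL : va;
    }
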
@@ -753,8 +1089,10 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 	/* if specific memory amounts per socket weren't requested */
 	if (internal_config.force_sockets == 0) {
+		size_t total_size;
+#ifdef RTE_ARCH_64
 		int cpu_per_socket[RTE_MAX_NUMA_NODES];
-		size_t default_size, total_size;
+		size_t default_size;
 		unsigned lcore_id;
 
 		/* Compute number of cores per socket */
@@ -772,7 +1110,7 @@ calc_num_pages_per_socket(uint64_t * memory,
 
 			/* Set memory amount per socket */
 			default_size = (internal_config.memory * cpu_per_socket[socket])
-			                / rte_lcore_count();
+					/ rte_lcore_count();
 
 			/* Limit to maximum available memory on socket */
 			default_size = RTE_MIN(default_size, get_socket_mem_size(socket));
@@ -789,12 +1127,33 @@ calc_num_pages_per_socket(uint64_t * memory,
 		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0; socket++) {
 			/* take whatever is available */
 			default_size = RTE_MIN(get_socket_mem_size(socket) - memory[socket],
-			                       total_size);
+					       total_size);
 
 			/* Update sizes */
 			memory[socket] += default_size;
 			total_size -= default_size;
 		}
+#else
+		/* in 32-bit mode, allocate all of the memory only on master
+		 * lcore socket
+		 */
+		total_size = internal_config.memory;
+		for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_size != 0;
+				socket++) {
+			struct rte_config *cfg = rte_eal_get_configuration();
+			unsigned int master_lcore_socket;
+
+			master_lcore_socket =
+				rte_lcore_to_socket_id(cfg->master_lcore);
+
+			if (master_lcore_socket != socket)
+				continue;
+
+			/* Update sizes */
+			memory[socket] = total_size;
+			break;
+		}
+#endif
 	}
 
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
@@ -842,7 +1201,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 			}
 		}
 		/* if we didn't satisfy all memory requirements per socket */
-		if (memory[socket] > 0) {
+		if (memory[socket] > 0 &&
+				internal_config.socket_mem[socket] != 0) {
 			/* to prevent icc errors */
 			requested = (unsigned) (internal_config.socket_mem[socket] /
 					0x100000);
@@ -928,11 +1288,13 @@ eal_legacy_hugepage_init(void)
 	struct rte_mem_config *mcfg;
 	struct hugepage_file *hugepage = NULL, *tmp_hp = NULL;
 	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	struct rte_fbarray *arr;
+	struct rte_memseg *ms;
 
 	uint64_t memory[RTE_MAX_NUMA_NODES];
 
 	unsigned hp_offset;
-	int i, j, new_memseg;
+	int i, j;
 	int nr_hugefiles, nr_hugepages = 0;
 	void *addr;
 
@@ -945,6 +1307,25 @@ eal_legacy_hugepage_init(void)
 
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
+		struct rte_memseg_list *msl;
+		uint64_t page_sz;
+		int n_segs, cur_seg;
+
+		/* nohuge mode is legacy mode */
+		internal_config.legacy_mem = 1;
+
+		/* create a memseg list */
+		msl = &mcfg->memsegs[0];
+
+		page_sz = RTE_PGSIZE_4K;
+		n_segs = internal_config.memory / page_sz;
+
+		if (rte_fbarray_init(&msl->memseg_arr, "nohugemem", n_segs,
+					sizeof(struct rte_memseg))) {
+			RTE_LOG(ERR, EAL, "Cannot allocate memseg list\n");
+			return -1;
+		}
+
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
 				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
@@ -952,14 +1333,28 @@ eal_legacy_hugepage_init(void)
 					strerror(errno));
 			return -1;
 		}
-		if (rte_eal_iova_mode() == RTE_IOVA_VA)
-			mcfg->memseg[0].iova = (uintptr_t)addr;
-		else
-			mcfg->memseg[0].iova = RTE_BAD_IOVA;
-		mcfg->memseg[0].addr = addr;
-		mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
-		mcfg->memseg[0].len = internal_config.memory;
-		mcfg->memseg[0].socket_id = 0;
+		msl->base_va = addr;
+		msl->page_sz = page_sz;
+		msl->socket_id = 0;
+
+		/* populate memsegs. each memseg is one page long */
+		for (cur_seg = 0; cur_seg < n_segs; cur_seg++) {
+			arr = &msl->memseg_arr;
+
+			ms = rte_fbarray_get(arr, cur_seg);
+			if (rte_eal_iova_mode() == RTE_IOVA_VA)
+				ms->iova = (uintptr_t)addr;
+			else
+				ms->iova = RTE_BAD_IOVA;
+			ms->addr = addr;
+			ms->hugepage_sz = page_sz;
+			ms->socket_id = 0;
+			ms->len = page_sz;
+
+			rte_fbarray_set_used(arr, cur_seg);
+
+			addr = RTE_PTR_ADD(addr, (size_t)page_sz);
+		}
 		return 0;
 	}
 
@@ -992,7 +1387,6 @@ eal_legacy_hugepage_init(void)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		memory[i] = internal_config.socket_mem[i];
 
-
 	/* map all hugepages and sort them */
 	for (i = 0; i < (int)internal_config.num_hugepage_sizes; i ++){
 		unsigned pages_old, pages_new;
@@ -1010,8 +1404,7 @@ eal_legacy_hugepage_init(void)
 
 		/* map all hugepages available */
 		pages_old = hpi->num_pages[0];
-		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi,
-					      memory, 1);
+		pages_new = map_all_hugepages(&tmp_hp[hp_offset], hpi, memory);
 		if (pages_new < pages_old) {
 			RTE_LOG(DEBUG, EAL,
 				"%d not %d hugepages of size %u MB allocated\n",
@@ -1054,18 +1447,6 @@ eal_legacy_hugepage_init(void)
 		qsort(&tmp_hp[hp_offset], hpi->num_pages[0],
 		      sizeof(struct hugepage_file), cmp_physaddr);
 
-		/* remap all hugepages */
-		if (map_all_hugepages(&tmp_hp[hp_offset], hpi, NULL, 0) !=
-		    hpi->num_pages[0]) {
-			RTE_LOG(ERR, EAL, "Failed to remap %u MB pages\n",
-					(unsigned)(hpi->hugepage_sz / 0x100000));
-			goto fail;
-		}
-
-		/* unmap original mappings */
-		if (unmap_all_hugepages_orig(&tmp_hp[hp_offset], hpi) < 0)
-			goto fail;
-
 		/* we have processed a num of hugepages of this size, so inc offset */
 		hp_offset += hpi->num_pages[0];
 	}
@@ -1148,7 +1529,7 @@ eal_legacy_hugepage_init(void)
 
 	/*
 	 * copy stuff from malloc'd hugepage* to the actual shared memory.
-	 * this procedure only copies those hugepages that have final_va
+	 * this procedure only copies those hugepages that have orig_va
 	 * not NULL. has overflow protection.
 	 */
 	if (copy_hugepages_to_shared_mem(hugepage, nr_hugefiles,
@@ -1157,6 +1538,23 @@ eal_legacy_hugepage_init(void)
 		goto fail;
 	}
 
+#ifndef RTE_ARCH_64
+	/* for legacy 32-bit mode, we did not preallocate VA space, so do it */
+	if (internal_config.legacy_mem &&
+			prealloc_segments(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Could not preallocate VA space for hugepages\n");
+		goto fail;
+	}
+#endif
+
+	/* remap all pages we do need into memseg list VA space, so that those
+	 * pages become first-class citizens in DPDK memory subsystem
+	 */
+	if (remap_needed_hugepages(hugepage, nr_hugefiles)) {
+		RTE_LOG(ERR, EAL, "Couldn't remap hugepage files into memseg lists\n");
+		goto fail;
+	}
+
 	/* free the hugepage backing files */
 	if (internal_config.hugepage_unlink &&
 		unlink_hugepage_files(tmp_hp, internal_config.num_hugepage_sizes) < 0) {
@@ -1168,75 +1566,30 @@ eal_legacy_hugepage_init(void)
 	free(tmp_hp);
 	tmp_hp = NULL;
 
-	/* first memseg index shall be 0 after incrementing it below */
-	j = -1;
-	for (i = 0; i < nr_hugefiles; i++) {
-		new_memseg = 0;
-
-		/* if this is a new section, create a new memseg */
-		if (i == 0)
-			new_memseg = 1;
-		else if (hugepage[i].socket_id != hugepage[i-1].socket_id)
-			new_memseg = 1;
-		else if (hugepage[i].size != hugepage[i-1].size)
-			new_memseg = 1;
-
-#ifdef RTE_ARCH_PPC_64
-		/* On PPC64 architecture, the mmap always start from higher
-		 * virtual address to lower address. Here, both the physical
-		 * address and virtual address are in descending order */
-		else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i-1].final_va -
-		    (unsigned long)hugepage[i].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#else
-		else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
-		    hugepage[i].size)
-			new_memseg = 1;
-		else if (((unsigned long)hugepage[i].final_va -
-		    (unsigned long)hugepage[i-1].final_va) != hugepage[i].size)
-			new_memseg = 1;
-#endif
+	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
 
-		if (new_memseg) {
-			j += 1;
-			if (j == RTE_MAX_MEMSEG)
-				break;
+	/* we're not going to allocate more pages, so release VA space for
+	 * unused memseg lists
+	 */
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		size_t mem_sz;
 
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-			mcfg->memseg[j].len = hugepage[i].size;
-			mcfg->memseg[j].socket_id = hugepage[i].socket_id;
-			mcfg->memseg[j].hugepage_sz = hugepage[i].size;
-		}
-		/* continuation of previous memseg */
-		else {
-#ifdef RTE_ARCH_PPC_64
-		/* Use the phy and virt address of the last page as segment
-		 * address for IBM Power architecture */
-			mcfg->memseg[j].iova = hugepage[i].physaddr;
-			mcfg->memseg[j].addr = hugepage[i].final_va;
-#endif
-			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
-		}
-		hugepage[i].memseg_id = j;
-	}
+		/* skip inactive lists */
+		if (msl->base_va == NULL)
+			continue;
+		/* skip lists where there is at least one page allocated */
+		if (msl->memseg_arr.count > 0)
+			continue;
+		/* this is an unused list, deallocate it */
+		mem_sz = (size_t)msl->page_sz * msl->memseg_arr.len;
+		munmap(msl->base_va, mem_sz);
+		msl->base_va = NULL;
 
-	if (i < nr_hugefiles) {
-		RTE_LOG(ERR, EAL, "Can only reserve %d pages "
-			"from %d requested\n"
-			"Current %s=%d is not enough\n"
-			"Please either increase it or request less amount "
-			"of memory.\n",
-			i, nr_hugefiles, RTE_STR(CONFIG_RTE_MAX_MEMSEG),
-			RTE_MAX_MEMSEG);
-		goto fail;
+		/* destroy backing fbarray */
+		rte_fbarray_destroy(&msl->memseg_arr);
 	}
 
-	munmap(hugepage, nr_hugefiles * sizeof(struct hugepage_file));
-
 	return 0;
 
 fail:
@@ -1269,11 +1622,10 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
-	const struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
-	unsigned num_hp = 0;
-	unsigned i, s = 0; /* s used to track the segment number */
-	unsigned max_seg = RTE_MAX_MEMSEG;
+	unsigned int num_hp = 0;
+	unsigned int i = 0;
+	unsigned int cur_seg;
 	off_t size = 0;
 	int fd, fd_hugepage = -1;
 
@@ -1292,50 +1644,6 @@ eal_legacy_hugepage_attach(void)
 		goto error;
 	}
 
-	/* map all segments into memory to make sure we get the addrs */
-	for (s = 0; s < RTE_MAX_MEMSEG; ++s) {
-		void *base_addr;
-		size_t mmap_sz;
-		int mmap_flags = 0;
-
-		/*
-		 * the first memory segment with len==0 is the one that
-		 * follows the last valid segment.
-		 */
-		if (mcfg->memseg[s].len == 0)
-			break;
-
-		/* get identical addresses as the primary process.
-		 */
-#ifdef RTE_ARCH_PPC_64
-		mmap_flags |= MAP_HUGETLB;
-#endif
-		mmap_sz = mcfg->memseg[s].len;
-		base_addr = eal_get_virtual_area(mcfg->memseg[s].addr,
-				&mmap_sz, mcfg->memseg[s].hugepage_sz, 0,
-				mmap_flags);
-		if (base_addr == NULL) {
-			max_seg = s;
-			if (rte_errno == EADDRNOTAVAIL) {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p] - please use '--base-virtaddr' option\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr);
-			} else {
-				RTE_LOG(ERR, EAL, "Could not mmap %zu bytes at [%p]: '%s'\n",
-					mcfg->memseg[s].len,
-					mcfg->memseg[s].addr,
-					rte_strerror(rte_errno));
-			}
-			if (aslr_enabled() > 0) {
-				RTE_LOG(ERR, EAL, "It is recommended to "
-					"disable ASLR in the kernel "
-					"and retry running both primary "
-					"and secondary processes\n");
-			}
-			goto error;
-		}
-	}
-
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
@@ -1346,46 +1654,49 @@ eal_legacy_hugepage_attach(void)
 	num_hp = size / sizeof(struct hugepage_file);
 	RTE_LOG(DEBUG, EAL, "Analysing %u files\n", num_hp);
 
-	s = 0;
-	while (s < RTE_MAX_MEMSEG && mcfg->memseg[s].len > 0){
-		void *addr, *base_addr;
-		uintptr_t offset = 0;
-		size_t mapping_size;
-		/*
-		 * free previously mapped memory so we can map the
-		 * hugepages into the space
-		 */
-		base_addr = mcfg->memseg[s].addr;
-		munmap(base_addr, mcfg->memseg[s].len);
-
-		/* find the hugepages for this segment and map them
-		 * we don't need to worry about order, as the server sorted the
-		 * entries before it did the second mmap of them */
-		for (i = 0; i < num_hp && offset < mcfg->memseg[s].len; i++){
-			if (hp[i].memseg_id == (int)s){
-				fd = open(hp[i].filepath, O_RDWR);
-				if (fd < 0) {
-					RTE_LOG(ERR, EAL, "Could not open %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				mapping_size = hp[i].size;
-				addr = mmap(RTE_PTR_ADD(base_addr, offset),
-						mapping_size, PROT_READ | PROT_WRITE,
-						MAP_SHARED, fd, 0);
-				close(fd); /* close file both on success and on failure */
-				if (addr == MAP_FAILED ||
-						addr != RTE_PTR_ADD(base_addr, offset)) {
-					RTE_LOG(ERR, EAL, "Could not mmap %s\n",
-						hp[i].filepath);
-					goto error;
-				}
-				offset+=mapping_size;
-			}
+	/* map all segments into memory to make sure we get the addrs. The
+	 * segments themselves are already in the memseg lists (which are shared
+	 * and have their VA space preallocated), so we just need to map
+	 * everything at the correct addresses.
+	 */
+	for (i = 0; i < num_hp; i++) {
+		struct hugepage_file *hf = &hp[i];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+		struct flock lck;
+
+		/* if size is zero, no more pages left */
+		if (map_sz == 0)
+			break;
+
+		fd = open(hf->filepath, O_RDWR);
+		if (fd < 0) {
+			RTE_LOG(ERR, EAL, "Could not open %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
 		}
-		RTE_LOG(DEBUG, EAL, "Mapped segment %u of size 0x%llx\n", s,
-				(unsigned long long)mcfg->memseg[s].len);
-		s++;
+
+		map_addr = mmap(map_addr, map_sz, PROT_READ | PROT_WRITE,
+				MAP_SHARED | MAP_FIXED, fd, 0);
+		if (map_addr == MAP_FAILED) {
+			RTE_LOG(ERR, EAL, "Could not map %s: %s\n",
+				hf->filepath, strerror(errno));
+			goto error;
+		}
+
+		/* set shared lock on the file. */
+		lck.l_type = F_RDLCK;
+		lck.l_whence = SEEK_SET;
+		lck.l_start = 0;
+		lck.l_len = map_sz;
+		if (fcntl(fd, F_SETLK, &lck) == -1) {
+			RTE_LOG(DEBUG, EAL, "%s(): Locking file failed: %s\n",
+				__func__, strerror(errno));
+			close(fd);
+			goto error;
+		}
+
+		close(fd);
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1393,8 +1704,15 @@ eal_legacy_hugepage_attach(void)
 	return 0;
 
 error:
-	for (i = 0; i < max_seg && mcfg->memseg[i].len > 0; i++)
-		munmap(mcfg->memseg[i].addr, mcfg->memseg[i].len);
+	/* unwind: unmap whatever has been mapped so far */
+	for (cur_seg = 0; cur_seg < i; cur_seg++) {
+		struct hugepage_file *hf = &hp[cur_seg];
+		size_t map_sz = hf->size;
+		void *map_addr = hf->final_va;
+
+		munmap(map_addr, map_sz);
+	}
 	if (hp != NULL && hp != MAP_FAILED)
 		munmap(hp, size);
 	if (fd_hugepage >= 0)
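
The core of remap_needed_hugepages() is the boundary test that decides when a run of pages stops being one memseg. A standalone sketch of that test over a sorted page table; struct page_rec is a stand-in for struct hugepage_file, and the x86-style ascending-address case is shown:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct page_rec {
        uint64_t physaddr;
        uint64_t size;
        int socket_id;
    };

    /* a page starts a new memseg unless it shares socket and page size with
     * its predecessor and is physically adjacent to it
     */
    static bool
    starts_new_memseg(const struct page_rec *prev, const struct page_rec *cur)
    {
        if (prev == NULL)
            return true;    /* the first page always starts a segment */
        if (cur->socket_id != prev->socket_id)
            return true;
        if (cur->size != prev->size)
            return true;
        /* on PPC64 the table is sorted by descending physical address, so
         * the adjacency test would be prev->physaddr - cur->physaddr instead
         */
        return (cur->physaddr - prev->physaddr) != cur->size;
    }
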
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index c1f0f87..5101c04 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -908,7 +908,8 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-type1_map(const struct rte_memseg *ms, void *arg)
+type1_map(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -1021,7 +1022,8 @@ vfio_spapr_dma_do_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 }
 
 static int
-vfio_spapr_map_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_map_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	int *vfio_container_fd = arg;
 
@@ -1034,7 +1036,8 @@ struct spapr_walk_param {
 	uint64_t hugepage_sz;
 };
 static int
-vfio_spapr_window_size_walk(const struct rte_memseg *ms, void *arg)
+vfio_spapr_window_size_walk(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg)
 {
 	struct spapr_walk_param *param = arg;
 	uint64_t max = ms->iova + ms->len;
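
For reference, a minimal per-segment walk in the same spirit as the sPAPR window sizing above: it computes the highest IOVA in use, which a hypothetical caller could feed into a DMA window size calculation. The callback name is illustrative:

    #include <stdint.h>
    #include <rte_common.h>
    #include <rte_memory.h>

    static int
    max_iova_walk(const struct rte_memseg_list *msl __rte_unused,
            const struct rte_memseg *ms, void *arg)
    {
        uint64_t *max_iova = arg;
        uint64_t end = ms->iova + ms->len;

        if (end > *max_iova)
            *max_iova = end;
        return 0;
    }

    /* usage: uint64_t max = 0; rte_memseg_walk(max_iova_walk, &max); */
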
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 0f542b1..23b339e 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -25,7 +25,6 @@ DPDK_2.0 {
 	rte_eal_devargs_type_count;
 	rte_eal_get_configuration;
 	rte_eal_get_lcore_state;
-	rte_eal_get_physmem_layout;
 	rte_eal_get_physmem_size;
 	rte_eal_has_hugepages;
 	rte_eal_hpet_init;
@@ -241,7 +240,9 @@ EXPERIMENTAL {
 	rte_malloc_dump_heaps;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
+	rte_mem_virt2memseg_list;
 	rte_memseg_contig_walk;
+	rte_memseg_list_walk;
 	rte_memseg_walk;
 	rte_mp_action_register;
 	rte_mp_action_unregister;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 9731d4c..103c015 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -100,12 +100,12 @@ static unsigned optimize_object_size(unsigned obj_size)
 }
 
 static int
-find_min_pagesz(const struct rte_memseg *ms, void *arg)
+find_min_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	size_t *min = arg;
 
-	if (ms->hugepage_sz < *min)
-		*min = ms->hugepage_sz;
+	if (msl->page_sz < *min)
+		*min = msl->page_sz;
 
 	return 0;
 }
@@ -115,11 +115,12 @@ get_min_page_size(void)
 {
 	size_t min_pagesz = SIZE_MAX;
 
-	rte_memseg_walk(find_min_pagesz, &min_pagesz);
+	rte_memseg_list_walk(find_min_pagesz, &min_pagesz);
 
 	return min_pagesz == SIZE_MAX ? (size_t) getpagesize() : min_pagesz;
 }
 
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, rte_iova_t iova)
 {
diff --git a/test/test/test_malloc.c b/test/test/test_malloc.c
index 28c241f..4b5abb4 100644
--- a/test/test/test_malloc.c
+++ b/test/test/test_malloc.c
@@ -12,6 +12,7 @@
 
 #include <rte_common.h>
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 #include <rte_per_lcore.h>
 #include <rte_launch.h>
 #include <rte_eal.h>
@@ -706,36 +707,20 @@ test_malloc_bad_params(void)
 }
 
 static int
-check_socket_mem(const struct rte_memseg *ms, void *arg)
+check_socket_mem(const struct rte_memseg_list *msl, void *arg)
 {
 	int32_t *socket = arg;
 
-	return *socket == ms->socket_id;
+	return *socket == msl->socket_id;
 }
 
 /* Check if memory is available on a specific socket */
 static int
 is_mem_on_socket(int32_t socket)
 {
-	return rte_memseg_walk(check_socket_mem, &socket);
+	return rte_memseg_list_walk(check_socket_mem, &socket);
 }
 
-struct walk_param {
-	void *addr;
-	int32_t socket;
-};
-static int
-find_socket(const struct rte_memseg *ms, void *arg)
-{
-	struct walk_param *param = arg;
-
-	if (param->addr >= ms->addr &&
-			param->addr < RTE_PTR_ADD(ms->addr, ms->len)) {
-		param->socket = ms->socket_id;
-		return 1;
-	}
-	return 0;
-}
 
 /*
  * Find what socket a memory address is on. Only works for addresses within
@@ -744,10 +729,9 @@ find_socket(const struct rte_memseg *ms, void *arg)
 static int32_t
 addr_to_socket(void * addr)
 {
-	struct walk_param param = {.addr = addr, .socket = 0};
-	if (rte_memseg_walk(find_socket, &param) > 0)
-		return param.socket;
-	return -1;
+	const struct rte_memseg *ms = rte_mem_virt2memseg(addr, NULL);
+	return ms == NULL ? -1 : ms->socket_id;
+
 }
 
 /* Test using rte_[c|m|zm]alloc_socket() on a specific socket */
diff --git a/test/test/test_memory.c b/test/test/test_memory.c
index c9b287c..b96bca7 100644
--- a/test/test/test_memory.c
+++ b/test/test/test_memory.c
@@ -5,8 +5,11 @@
 #include <stdio.h>
 #include <stdint.h>
 
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
 #include <rte_memory.h>
 #include <rte_common.h>
+#include <rte_memzone.h>
 
 #include "test.h"
 
@@ -23,12 +26,13 @@
  */
 
 static int
-check_mem(const struct rte_memseg *ms, void *arg __rte_unused)
+check_mem(const struct rte_memseg_list *msl __rte_unused,
+		const struct rte_memseg *ms, void *arg __rte_unused)
 {
 	volatile uint8_t *mem = (volatile uint8_t *) ms->addr;
-	size_t i;
+	size_t i, max = ms->len;
 
-	for (i = 0; i < ms->len; i++, mem++)
+	for (i = 0; i < max; i++, mem++)
 		*mem;
 	return 0;
 }
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index cbf0cfa..0046f04 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -111,17 +111,17 @@ struct walk_arg {
 	int hugepage_16GB_avail;
 };
 static int
-find_available_pagesz(const struct rte_memseg *ms, void *arg)
+find_available_pagesz(const struct rte_memseg_list *msl, void *arg)
 {
 	struct walk_arg *wa = arg;
 
-	if (ms->hugepage_sz == RTE_PGSIZE_2M)
+	if (msl->page_sz == RTE_PGSIZE_2M)
 		wa->hugepage_2MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_1G)
+	if (msl->page_sz == RTE_PGSIZE_1G)
 		wa->hugepage_1GB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16M)
+	if (msl->page_sz == RTE_PGSIZE_16M)
 		wa->hugepage_16MB_avail = 1;
-	if (ms->hugepage_sz == RTE_PGSIZE_16G)
+	if (msl->page_sz == RTE_PGSIZE_16G)
 		wa->hugepage_16GB_avail = 1;
 
 	return 0;
@@ -138,7 +138,7 @@ test_memzone_reserve_flags(void)
 
 	memset(&wa, 0, sizeof(wa));
 
-	rte_memseg_walk(find_available_pagesz, &wa);
+	rte_memseg_list_walk(find_available_pagesz, &wa);
 
 	hugepage_2MB_avail = wa.hugepage_2MB_avail;
 	hugepage_1GB_avail = wa.hugepage_1GB_avail;
-- 
2.7.4

* [PATCH v6 50/70] eal: replace memzone array with fbarray
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (50 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 49/70] eal: replace memseg with memseg lists Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 51/70] mem: add support for mapping hugepages at runtime Anatoly Burakov
                           ` (19 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Marcin Wojtas, Michal Krawczyk, Guy Tzalik, Evgeny Schemeilin,
	Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

The fbarray API is already there, so we might as well use it; some
memzone operations will be sped up as a result.

Since we have to allocate an fbarray for memzones, we have to do it
before the memory subsystem is initialized: memory init in secondary
processes may (later) allocate additional fbarrays not present in the
primary process, and initializing memzones after that point would make
it impossible to attach to the memzone fbarray.
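
As an illustration (not part of the patch), this is the lookup pattern
the memzone code switches to; "count_used_zones" is a hypothetical
helper, and the mcfg->mlock locking taken by the real code is omitted
for brevity:

  #include <rte_eal.h>
  #include <rte_eal_memconfig.h>
  #include <rte_fbarray.h>
  #include <rte_memzone.h>

  /* count in-use memzone slots by visiting only the "used" entries */
  static unsigned int
  count_used_zones(void)
  {
          struct rte_mem_config *mcfg =
                  rte_eal_get_configuration()->mem_config;
          struct rte_fbarray *arr = &mcfg->memzones;
          unsigned int n = 0;
          int i;

          /* find_next_used() skips over free slots, so holes left by
           * rte_memzone_free() cost nothing to walk past
           */
          for (i = rte_fbarray_find_next_used(arr, 0); i >= 0;
                          i = rte_fbarray_find_next_used(arr, i + 1)) {
                  const struct rte_memzone *mz = rte_fbarray_get(arr, i);
                  if (mz->addr != NULL)
                          n++;
          }
          return n;
  }

(In practice the fbarray's own "count" field already tracks this, as the
updated test code relies on; the loop above just shows the
find_next_used()/get() iteration that the new lookup and walk code uses.)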

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/net/ena/Makefile                          |   3 +
 drivers/net/ena/ena_ethdev.c                      |  10 +-
 lib/librte_eal/bsdapp/eal/eal.c                   |  14 ++-
 lib/librte_eal/common/eal_common_memzone.c        | 113 ++++++++++++----------
 lib/librte_eal/common/include/rte_eal_memconfig.h |   4 +-
 lib/librte_eal/common/malloc_heap.c               |   4 +
 lib/librte_eal/linuxapp/eal/eal.c                 |  13 ++-
 test/test/test_memzone.c                          |   9 +-
 8 files changed, 103 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
index f9bfe05..43339f3 100644
--- a/drivers/net/ena/Makefile
+++ b/drivers/net/ena/Makefile
@@ -43,6 +43,9 @@ INCLUDES :=-I$(SRCDIR) -I$(SRCDIR)/base/ena_defs -I$(SRCDIR)/base
 EXPORT_MAP := rte_pmd_ena_version.map
 LIBABIVER := 1
 
+# rte_fbarray is not yet part of stable API
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
 VPATH += $(SRCDIR)/base
 #
 # all source are stored in SRCS-y
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 34b2a8d..f7bfc7a 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -264,11 +264,15 @@ static const struct eth_dev_ops ena_dev_ops = {
 static inline int ena_cpu_to_node(int cpu)
 {
 	struct rte_config *config = rte_eal_get_configuration();
+	struct rte_fbarray *arr = &config->mem_config->memzones;
+	const struct rte_memzone *mz;
 
-	if (likely(cpu < RTE_MAX_MEMZONE))
-		return config->mem_config->memzone[cpu].socket_id;
+	if (unlikely(cpu >= RTE_MAX_MEMZONE))
+		return NUMA_NO_NODE;
 
-	return NUMA_NO_NODE;
+	mz = rte_fbarray_get(arr, cpu);
+
+	return mz->socket_id;
 }
 
 static inline void ena_rx_mbuf_prepare(struct rte_mbuf *mbuf,
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index d009cf0..54330e1 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -599,14 +599,24 @@ rte_eal_init(int argc, char **argv)
 		}
 	}
 
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
+
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
 		rte_errno = ENOMEM;
 		return -1;
 	}
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 1f5f753..12ddd42 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -28,42 +28,30 @@
 static inline const struct rte_memzone *
 memzone_lookup_thread_unsafe(const char *name)
 {
-	const struct rte_mem_config *mcfg;
+	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	const struct rte_memzone *mz;
-	unsigned i = 0;
+	int i = 0;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/*
 	 * the algorithm is not optimal (linear), but there are few
 	 * zones and this function should be called at init only
 	 */
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		mz = &mcfg->memzone[i];
-		if (mz->addr != NULL && !strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
-			return &mcfg->memzone[i];
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		mz = rte_fbarray_get(arr, i);
+		if (mz->addr != NULL &&
+				!strncmp(name, mz->name, RTE_MEMZONE_NAMESIZE))
+			return mz;
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
-
 	return NULL;
 }
 
-static inline struct rte_memzone *
-get_next_free_memzone(void)
-{
-	struct rte_mem_config *mcfg;
-	unsigned i = 0;
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-
-	for (i = 0; i < RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr == NULL)
-			return &mcfg->memzone[i];
-	}
-
-	return NULL;
-}
 
 /* This function will return the greatest free block if a heap has been
  * specified. If no heap has been specified, it will return the heap and
@@ -103,15 +91,17 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 {
 	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i;
+	int socket, i, mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	/* no more room in config */
-	if (mcfg->memzone_cnt >= RTE_MAX_MEMZONE) {
+	if (arr->count >= arr->len) {
 		RTE_LOG(ERR, EAL, "%s(): No more room in config\n", __func__);
 		rte_errno = ENOSPC;
 		return NULL;
@@ -224,17 +214,22 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	mz = get_next_free_memzone();
+	mz_idx = rte_fbarray_find_next_free(arr, 0);
+
+	if (mz_idx < 0) {
+		mz = NULL;
+	} else {
+		rte_fbarray_set_used(arr, mz_idx);
+		mz = rte_fbarray_get(arr, mz_idx);
+	}
 
 	if (mz == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
-				"in config!\n", __func__);
+		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone\n", __func__);
 		malloc_elem_free(elem);
 		rte_errno = ENOSPC;
 		return NULL;
 	}
 
-	mcfg->memzone_cnt++;
 	snprintf(mz->name, sizeof(mz->name), "%s", name);
 	mz->iova = rte_malloc_virt2iova(mz_addr);
 	mz->addr = mz_addr;
@@ -307,34 +302,38 @@ int
 rte_memzone_free(const struct rte_memzone *mz)
 {
 	struct rte_mem_config *mcfg;
+	struct rte_fbarray *arr;
+	struct rte_memzone *found_mz;
 	int ret = 0;
-	void *addr;
+	void *addr = NULL;
 	unsigned idx;
 
 	if (mz == NULL)
 		return -EINVAL;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	idx = ((uintptr_t)mz - (uintptr_t)mcfg->memzone);
-	idx = idx / sizeof(struct rte_memzone);
+	idx = rte_fbarray_find_idx(arr, mz);
+	found_mz = rte_fbarray_get(arr, idx);
 
-	addr = mcfg->memzone[idx].addr;
-	if (addr == NULL)
+	if (found_mz == NULL) {
+		ret = -EINVAL;
+	} else if (found_mz->addr == NULL) {
+		RTE_LOG(ERR, EAL, "Memzone is not allocated\n");
 		ret = -EINVAL;
-	else if (mcfg->memzone_cnt == 0) {
-		rte_panic("%s(): memzone address not NULL but memzone_cnt is 0!\n",
-				__func__);
 	} else {
-		memset(&mcfg->memzone[idx], 0, sizeof(mcfg->memzone[idx]));
-		mcfg->memzone_cnt--;
+		addr = found_mz->addr;
+		memset(found_mz, 0, sizeof(*found_mz));
+		rte_fbarray_set_free(arr, idx);
 	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	rte_free(addr);
+	if (addr != NULL)
+		rte_free(addr);
 
 	return ret;
 }
@@ -370,7 +369,7 @@ dump_memzone(const struct rte_memzone *mz, void *arg)
 	size_t page_sz;
 	FILE *f = arg;
 
-	mz_idx = mz - mcfg->memzone;
+	mz_idx = rte_fbarray_find_idx(&mcfg->memzones, mz);
 
 	fprintf(f, "Zone %u: name:<%s>, len:0x%zx, virt:%p, "
 				"socket_id:%"PRId32", flags:%"PRIx32"\n",
@@ -427,19 +426,23 @@ rte_eal_memzone_init(void)
 	/* get pointer to global configuration */
 	mcfg = rte_eal_get_configuration()->mem_config;
 
-	/* secondary processes don't need to initialise anything */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	rte_rwlock_write_lock(&mcfg->mlock);
 
-	/* delete all zones */
-	mcfg->memzone_cnt = 0;
-	memset(mcfg->memzone, 0, sizeof(mcfg->memzone));
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
+			rte_fbarray_init(&mcfg->memzones, "memzone",
+			RTE_MAX_MEMZONE, sizeof(struct rte_memzone))) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memzone list\n");
+		return -1;
+	} else if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
+			rte_fbarray_attach(&mcfg->memzones)) {
+		RTE_LOG(ERR, EAL, "Cannot attach to memzone list\n");
+		rte_rwlock_write_unlock(&mcfg->mlock);
+		return -1;
+	}
 
 	rte_rwlock_write_unlock(&mcfg->mlock);
 
-	return rte_eal_malloc_heap_init();
+	return 0;
 }
 
 /* Walk all reserved memory zones */
@@ -447,14 +450,18 @@ void rte_memzone_walk(void (*func)(const struct rte_memzone *, void *),
 		      void *arg)
 {
 	struct rte_mem_config *mcfg;
-	unsigned i;
+	struct rte_fbarray *arr;
+	int i;
 
 	mcfg = rte_eal_get_configuration()->mem_config;
+	arr = &mcfg->memzones;
 
 	rte_rwlock_read_lock(&mcfg->mlock);
-	for (i=0; i<RTE_MAX_MEMZONE; i++) {
-		if (mcfg->memzone[i].addr != NULL)
-			(*func)(&mcfg->memzone[i], arg);
+	i = rte_fbarray_find_next_used(arr, 0);
+	while (i >= 0) {
+		struct rte_memzone *mz = rte_fbarray_get(arr, i);
+		(*func)(mz, arg);
+		i = rte_fbarray_find_next_used(arr, i + 1);
 	}
 	rte_rwlock_read_unlock(&mcfg->mlock);
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index b745e18..88cde8c 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -58,10 +58,8 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
-	uint32_t memzone_cnt; /**< Number of allocated memzones */
-
 	/* memory segments and zones */
-	struct rte_memzone memzone[RTE_MAX_MEMZONE]; /**< Memzone descriptors. */
+	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
 	struct rte_memseg_list memsegs[RTE_MAX_MEMSEG_LISTS];
 	/**< list of dynamic arrays holding memsegs */
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 0ef2c45..d798675 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -278,6 +278,10 @@ rte_eal_malloc_heap_init(void)
 	if (mcfg == NULL)
 		return -1;
 
+	/* secondary process does not need to initialize anything */
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+		return 0;
+
 	/* add all IOVA-contiguous areas to the heap */
 	return rte_memseg_contig_walk(malloc_add_seg, NULL);
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ffcbd71..9832551 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -858,6 +858,15 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 #endif
+	/* in secondary processes, memory init may allocate additional fbarrays
+	 * not present in primary processes, so to avoid any potential issues,
+	 * initialize memzones first.
+	 */
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_memory_init() < 0) {
 		rte_eal_init_alert("Cannot init memory\n");
@@ -868,8 +877,8 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0) {
-		rte_eal_init_alert("Cannot init memzone\n");
+	if (rte_eal_malloc_heap_init() < 0) {
+		rte_eal_init_alert("Cannot init malloc heap\n");
 		rte_errno = ENODEV;
 		return -1;
 	}
diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
index 0046f04..efcf732 100644
--- a/test/test/test_memzone.c
+++ b/test/test/test_memzone.c
@@ -909,7 +909,7 @@ test_memzone_basic(void)
 	const struct rte_memzone *mz;
 	int memzone_cnt_after, memzone_cnt_expected;
 	int memzone_cnt_before =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	memzone1 = rte_memzone_reserve(TEST_MEMZONE_NAME("testzone1"), 100,
 				SOCKET_ID_ANY, 0);
@@ -933,7 +933,7 @@ test_memzone_basic(void)
 			(memzone3 != NULL) + (memzone4 != NULL);
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	if (memzone_cnt_after != memzone_cnt_expected)
 		return -1;
@@ -1012,7 +1012,7 @@ test_memzone_basic(void)
 	}
 
 	memzone_cnt_after =
-			rte_eal_get_configuration()->mem_config->memzone_cnt;
+			rte_eal_get_configuration()->mem_config->memzones.count;
 	if (memzone_cnt_after != memzone_cnt_before)
 		return -1;
 
@@ -1033,7 +1033,8 @@ static int
 test_memzone(void)
 {
 	/* take note of how many memzones were allocated before running */
-	int memzone_cnt = rte_eal_get_configuration()->mem_config->memzone_cnt;
+	int memzone_cnt =
+			rte_eal_get_configuration()->mem_config->memzones.count;
 
 	printf("test basic memzone API\n");
 	if (test_memzone_basic() < 0)
-- 
2.7.4

* [PATCH v6 51/70] mem: add support for mapping hugepages at runtime
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (51 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 50/70] eal: replace memzone array with fbarray Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-17  2:06           ` Yongseok Koh
  2018-04-11 12:30         ` [PATCH v6 52/70] mem: add support for unmapping pages " Anatoly Burakov
                           ` (18 subsequent siblings)
  71 siblings, 1 reply; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Nothing uses this code yet. The bulk of it is copied from old
memory allocation code (linuxapp eal_memory.c). We provide an
EAL-internal API to allocate either one page or multiple pages,
guaranteeing that we'll get contiguous VA for all of the pages
that we requested.

Not supported on FreeBSD.

Locking is done via fcntl() because that way, when it comes to
taking out write locks or unlocking on deallocation, we don't have
to keep the original fd's around. Plus, fcntl() gives us the ability
to lock parts of a file, which is useful for the single-file
segments mode that is coming down the line.
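
To illustrate the point about fcntl() locks being per-process and
range-based, here is a minimal standalone sketch of the pattern
(essentially what the lock() helper added in this patch does);
"try_rdlock_range" is a hypothetical name, not part of the patch:

  #include <errno.h>
  #include <fcntl.h>
  #include <stdint.h>
  #include <string.h>
  #include <unistd.h>

  /* returns 1 if the shared lock was taken, 0 if another process holds
   * a conflicting lock, -1 on unexpected error
   */
  static int
  try_rdlock_range(int fd, uint64_t offset, uint64_t len)
  {
          struct flock lck;

          memset(&lck, 0, sizeof(lck));
          lck.l_type = F_RDLCK;   /* shared: many mappers may hold it */
          lck.l_whence = SEEK_SET;
          lck.l_start = offset;   /* lock only this page's byte range */
          lck.l_len = len;

          if (fcntl(fd, F_SETLK, &lck) == 0)
                  return 1;
          return (errno == EAGAIN || errno == EACCES) ? 0 : -1;
  }

Because the lock belongs to the process rather than to a particular fd,
any fd opened on the same file later can be used to upgrade the range to
F_WRLCK or release it with F_UNLCK.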

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/Makefile         |   1 +
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  26 ++
 lib/librte_eal/bsdapp/eal/meson.build      |   1 +
 lib/librte_eal/common/eal_memalloc.h       |  31 +++
 lib/librte_eal/linuxapp/eal/Makefile       |   2 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 429 +++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/meson.build    |   1 +
 7 files changed, 491 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_memalloc.c
 create mode 100644 lib/librte_eal/common/eal_memalloc.h
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 1b43d77..19f9322 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -29,6 +29,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_debug.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..8c30670
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <inttypes.h>
+
+#include <rte_log.h>
+#include <rte_memory.h>
+
+#include "eal_memalloc.h"
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int __rte_unused n_segs, size_t __rte_unused page_sz,
+		int __rte_unused socket, bool __rte_unused exact)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return NULL;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..4b40223 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -8,6 +8,7 @@ env_sources = files('eal_alarm.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
 		'eal_lcore.c',
+		'eal_memalloc.c',
 		'eal_thread.c',
 		'eal_timer.c',
 		'eal.c',
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
new file mode 100644
index 0000000..1b18f20
--- /dev/null
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#ifndef EAL_MEMALLOC_H
+#define EAL_MEMALLOC_H
+
+#include <stdbool.h>
+
+#include <rte_memory.h>
+
+/*
+ * Allocate segment of specified page size.
+ */
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket);
+
+/*
+ * Allocate `n_segs` segments.
+ *
+ * Note: `ms` can be NULL.
+ *
+ * Note: it is possible to request best-effort allocation by setting `exact` to
+ * `false`, in which case allocator will return however many pages it managed to
+ * allocate successfully.
+ */
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact);
+
+#endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index c407a43..af6b9be 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -36,6 +36,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_thread.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_vfio_mp_sync.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_debug.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_timer.c
@@ -82,6 +83,7 @@ CFLAGS_eal_interrupts.o := -D_GNU_SOURCE
 CFLAGS_eal_vfio_mp_sync.o := -D_GNU_SOURCE
 CFLAGS_eal_timer.o := -D_GNU_SOURCE
 CFLAGS_eal_lcore.o := -D_GNU_SOURCE
+CFLAGS_eal_memalloc.o := -D_GNU_SOURCE
 CFLAGS_eal_thread.o := -D_GNU_SOURCE
 CFLAGS_eal_log.o := -D_GNU_SOURCE
 CFLAGS_eal_common_log.o := -D_GNU_SOURCE
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
new file mode 100644
index 0000000..45ea0ad
--- /dev/null
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -0,0 +1,429 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#define _FILE_OFFSET_BITS 64
+#include <errno.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/queue.h>
+#include <sys/file.h>
+#include <unistd.h>
+#include <limits.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <signal.h>
+#include <setjmp.h>
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+#include <numa.h>
+#include <numaif.h>
+#endif
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_eal_memconfig.h>
+#include <rte_eal.h>
+#include <rte_memory.h>
+#include <rte_spinlock.h>
+
+#include "eal_filesystem.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+static sigjmp_buf huge_jmpenv;
+
+static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
+{
+	siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrapper function to avoid a compile error. Any
+ * non-volatile, non-static local variable in the stack frame calling
+ * sigsetjmp might be clobbered by a call to longjmp.
+ */
+static int __rte_unused huge_wrap_sigsetjmp(void)
+{
+	return sigsetjmp(huge_jmpenv, 1);
+}
+
+static struct sigaction huge_action_old;
+static int huge_need_recover;
+
+static void __rte_unused
+huge_register_sigbus(void)
+{
+	sigset_t mask;
+	struct sigaction action;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGBUS);
+	action.sa_flags = 0;
+	action.sa_mask = mask;
+	action.sa_handler = huge_sigbus_handler;
+
+	huge_need_recover = !sigaction(SIGBUS, &action, &huge_action_old);
+}
+
+static void __rte_unused
+huge_recover_sigbus(void)
+{
+	if (huge_need_recover) {
+		sigaction(SIGBUS, &huge_action_old, NULL);
+		huge_need_recover = 0;
+	}
+}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+static bool
+check_numa(void)
+{
+	bool ret = true;
+	/* Check if kernel supports NUMA. */
+	if (numa_available() != 0) {
+		RTE_LOG(DEBUG, EAL, "NUMA is not supported.\n");
+		ret = false;
+	}
+	return ret;
+}
+
+static void
+prepare_numa(int *oldpolicy, struct bitmask *oldmask, int socket_id)
+{
+	RTE_LOG(DEBUG, EAL, "Trying to obtain current memory policy.\n");
+	if (get_mempolicy(oldpolicy, oldmask->maskp,
+			  oldmask->size + 1, 0, 0) < 0) {
+		RTE_LOG(ERR, EAL,
+			"Failed to get current mempolicy: %s. "
+			"Assuming MPOL_DEFAULT.\n", strerror(errno));
+		*oldpolicy = MPOL_DEFAULT;
+	}
+	RTE_LOG(DEBUG, EAL,
+		"Setting policy MPOL_PREFERRED for socket %d\n",
+		socket_id);
+	numa_set_preferred(socket_id);
+}
+
+static void
+resotre_numa(int *oldpolicy, struct bitmask *oldmask)
+{
+	RTE_LOG(DEBUG, EAL,
+		"Restoring previous memory policy: %d\n", *oldpolicy);
+	if (*oldpolicy == MPOL_DEFAULT) {
+		numa_set_localalloc();
+	} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
+				 oldmask->size + 1) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to restore mempolicy: %s\n",
+			strerror(errno));
+		numa_set_localalloc();
+	}
+	numa_free_cpumask(oldmask);
+}
+#endif
+
+static int
+get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	int fd;
+	eal_get_hugefile_path(path, buflen, hi->hugedir,
+			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+	fd = open(path, O_CREAT | O_RDWR, 0600);
+	if (fd < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+				strerror(errno));
+		return -1;
+	}
+	return fd;
+}
+
+/* returns 1 on successful lock, 0 on unsuccessful lock, -1 on error */
+static int lock(int fd, uint64_t offset, uint64_t len, int type)
+{
+	struct flock lck;
+	int ret;
+
+	memset(&lck, 0, sizeof(lck));
+
+	lck.l_type = type;
+	lck.l_whence = SEEK_SET;
+	lck.l_start = offset;
+	lck.l_len = len;
+
+	ret = fcntl(fd, F_SETLK, &lck);
+
+	if (ret && (errno == EAGAIN || errno == EACCES)) {
+		/* locked by another process, not an error */
+		return 0;
+	} else if (ret) {
+		RTE_LOG(ERR, EAL, "%s(): error calling fcntl(): %s\n",
+			__func__, strerror(errno));
+		/* we've encountered an unexpected error */
+		return -1;
+	}
+	return 1;
+}
+
+static int
+alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
+		struct hugepage_info *hi, unsigned int list_idx,
+		unsigned int seg_idx)
+{
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	int cur_socket_id = 0;
+#endif
+	uint64_t map_offset;
+	char path[PATH_MAX];
+	int ret = 0;
+	int fd;
+	size_t alloc_sz;
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	alloc_sz = hi->hugepage_sz;
+
+	map_offset = 0;
+	if (ftruncate(fd, alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+			__func__, strerror(errno));
+		goto resized;
+	}
+	/* we've allocated a page - take out a read lock. we're using fcntl()
+	 * locks rather than flock() here because doing that gives us one huge
+	 * advantage - fcntl() locks are per-process, not per-file descriptor,
+	 * which means that we don't have to keep the original fd's around to
+	 * keep a lock on the file.
+	 *
+	 * this is useful, because when it comes to unmapping pages, we will
+	 * have to take out a write lock (to figure out if another process still
+	 * has this page mapped), and to do it with flock() we'll have to use
+	 * original fd, as lock is associated with that particular fd. with
+	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
+	 * on that.
+	 */
+	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+	/* this should not fail */
+	if (ret != 1) {
+		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+			__func__,
+			strerror(errno));
+		goto resized;
+	}
+
+	/*
+	 * map the segment, and populate page tables, the kernel fills this
+	 * segment with zeros if it's a new page.
+	 */
+	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
+	close(fd);
+
+	if (va == MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
+			strerror(errno));
+		goto resized;
+	}
+	if (va != addr) {
+		RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
+		goto mapped;
+	}
+
+	rte_iova_t iova = rte_mem_virt2iova(addr);
+	if (iova == RTE_BAD_PHYS_ADDR) {
+		RTE_LOG(DEBUG, EAL, "%s(): can't get IOVA addr\n",
+			__func__);
+		goto mapped;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	move_pages(getpid(), 1, &addr, NULL, &cur_socket_id, 0);
+
+	if (cur_socket_id != socket_id) {
+		RTE_LOG(DEBUG, EAL,
+				"%s(): allocation happened on wrong socket (wanted %d, got %d)\n",
+			__func__, socket_id, cur_socket_id);
+		goto mapped;
+	}
+#endif
+
+	/* In Linux, hugetlb limitations, like cgroup, are
+	 * enforced at fault time instead of mmap(), even
+	 * with the option of MAP_POPULATE. The kernel will
+	 * send a SIGBUS signal. To avoid being killed, save
+	 * the stack environment here; if SIGBUS happens, we
+	 * can jump back to this point.
+	 */
+	if (huge_wrap_sigsetjmp()) {
+		RTE_LOG(DEBUG, EAL, "SIGBUS: Cannot mmap more hugepages of size %uMB\n",
+			(unsigned int)(alloc_sz >> 20));
+		goto mapped;
+	}
+	*(int *)addr = *(int *)addr;
+
+	ms->addr = addr;
+	ms->hugepage_sz = alloc_sz;
+	ms->len = alloc_sz;
+	ms->nchannel = rte_memory_get_nchannel();
+	ms->nrank = rte_memory_get_nrank();
+	ms->iova = iova;
+	ms->socket_id = socket_id;
+
+	return 0;
+
+mapped:
+	munmap(addr, alloc_sz);
+resized:
+	close(fd);
+	unlink(path);
+	return -1;
+}
+
+struct alloc_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg **ms;
+	size_t page_sz;
+	unsigned int segs_allocated;
+	unsigned int n_segs;
+	int socket;
+	bool exact;
+};
+static int
+alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct alloc_walk_param *wa = arg;
+	struct rte_memseg_list *cur_msl;
+	size_t page_sz;
+	int cur_idx;
+	unsigned int msl_idx, need, i;
+
+	if (msl->page_sz != wa->page_sz)
+		return 0;
+	if (msl->socket_id != wa->socket)
+		return 0;
+
+	page_sz = (size_t)msl->page_sz;
+
+	msl_idx = msl - mcfg->memsegs;
+	cur_msl = &mcfg->memsegs[msl_idx];
+
+	need = wa->n_segs;
+
+	/* try finding space in memseg list */
+	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
+	if (cur_idx < 0)
+		return 0;
+
+	for (i = 0; i < need; i++, cur_idx++) {
+		struct rte_memseg *cur;
+		void *map_addr;
+
+		cur = rte_fbarray_get(&cur_msl->memseg_arr, cur_idx);
+		map_addr = RTE_PTR_ADD(cur_msl->base_va,
+				cur_idx * page_sz);
+
+		if (alloc_seg(cur, map_addr, wa->socket, wa->hi,
+				msl_idx, cur_idx)) {
+			RTE_LOG(DEBUG, EAL, "attempted to allocate %i segments, but only %i were allocated\n",
+				need, i);
+
+			/* if exact number wasn't requested, stop */
+			if (!wa->exact)
+				goto out;
+			return -1;
+		}
+		if (wa->ms)
+			wa->ms[i] = cur;
+
+		rte_fbarray_set_used(&cur_msl->memseg_arr, cur_idx);
+	}
+out:
+	wa->segs_allocated = i;
+	return 1;
+
+}
+
+int
+eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
+		int socket, bool exact)
+{
+	int i, ret = -1;
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	bool have_numa = false;
+	int oldpolicy;
+	struct bitmask *oldmask;
+#endif
+	struct alloc_walk_param wa;
+	struct hugepage_info *hi = NULL;
+
+	memset(&wa, 0, sizeof(wa));
+
+	/* dynamic allocation not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (i = 0; i < (int) RTE_DIM(internal_config.hugepage_info); i++) {
+		if (page_sz ==
+				internal_config.hugepage_info[i].hugepage_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "%s(): can't find relevant hugepage_info entry\n",
+			__func__);
+		return -1;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (check_numa()) {
+		oldmask = numa_allocate_nodemask();
+		prepare_numa(&oldpolicy, oldmask, socket);
+		have_numa = true;
+	}
+#endif
+
+	wa.exact = exact;
+	wa.hi = hi;
+	wa.ms = ms;
+	wa.n_segs = n_segs;
+	wa.page_sz = page_sz;
+	wa.socket = socket;
+	wa.segs_allocated = 0;
+
+	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	if (ret == 0) {
+		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
+			__func__);
+		ret = -1;
+	} else if (ret > 0) {
+		ret = (int)wa.segs_allocated;
+	}
+
+#ifdef RTE_EAL_NUMA_AWARE_HUGEPAGES
+	if (have_numa)
+		resotre_numa(&oldpolicy, oldmask);
+#endif
+	return ret;
+}
+
+struct rte_memseg *
+eal_memalloc_alloc_seg(size_t page_sz, int socket)
+{
+	struct rte_memseg *ms;
+	if (eal_memalloc_alloc_seg_bulk(&ms, 1, page_sz, socket, true) < 0)
+		return NULL;
+	/* return pointer to newly allocated memseg */
+	return ms;
+}
diff --git a/lib/librte_eal/linuxapp/eal/meson.build b/lib/librte_eal/linuxapp/eal/meson.build
index 03974ff..5254c6c 100644
--- a/lib/librte_eal/linuxapp/eal/meson.build
+++ b/lib/librte_eal/linuxapp/eal/meson.build
@@ -10,6 +10,7 @@ env_sources = files('eal_alarm.c',
 		'eal_debug.c',
 		'eal_hugepage_info.c',
 		'eal_interrupts.c',
+		'eal_memalloc.c',
 		'eal_lcore.c',
 		'eal_log.c',
 		'eal_thread.c',
-- 
2.7.4

* [PATCH v6 52/70] mem: add support for unmapping pages at runtime
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (52 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 51/70] mem: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 53/70] eal: add single file segments command-line option Anatoly Burakov
                           ` (17 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This isn't used anywhere yet, but the support is now there. Also,
add cleanup to the allocation procedures, so that if we fail to
allocate everything we asked for, we free what was already
allocated back to the system.
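
As a side note on how a page is actually given back: below is a minimal
sketch (assuming Linux mmap() semantics) of the trick free_seg() in this
patch uses to release a hugepage while keeping its preallocated VA range
reserved; "park_va_range" is a hypothetical helper name, not the patch's
API:

  #include <stddef.h>
  #include <sys/mman.h>

  static int
  park_va_range(void *addr, size_t len)
  {
          /* replace the hugepage mapping in place with an anonymous
           * one: the VA range stays reserved for future allocations,
           * while the hugepage itself is returned to the system once
           * its file-backed mapping is gone
           */
          void *va = mmap(addr, len, PROT_READ,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
          return va == MAP_FAILED ? -1 : 0;
  }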

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  15 +++
 lib/librte_eal/common/eal_memalloc.h       |  14 +++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 148 ++++++++++++++++++++++++++++-
 3 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index 8c30670..e7bcd2b 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -24,3 +24,18 @@ eal_memalloc_alloc_seg(size_t __rte_unused page_sz, int __rte_unused socket)
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return NULL;
 }
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
+		int n_segs __rte_unused)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 1b18f20..8616793 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -28,4 +28,18 @@ int
 eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 		int socket, bool exact);
 
+/*
+ * Deallocate segment
+ */
+int
+eal_memalloc_free_seg(struct rte_memseg *ms);
+
+/*
+ * Deallocate `n_segs` segments. Returns 0 on successful deallocation of all
+ * segments, -1 on error. Any segments that could be deallocated will still be
+ * deallocated even in case of error.
+ */
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 45ea0ad..11ef742 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -289,6 +289,48 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	return -1;
 }
 
+static int
+free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
+		unsigned int list_idx, unsigned int seg_idx)
+{
+	char path[PATH_MAX];
+	int fd, ret;
+
+	/* erase page data */
+	memset(ms->addr, 0, ms->len);
+
+	if (mmap(ms->addr, ms->len, PROT_READ,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) ==
+				MAP_FAILED) {
+		RTE_LOG(DEBUG, EAL, "couldn't unmap page\n");
+		return -1;
+	}
+
+	fd = get_seg_fd(path, sizeof(path), hi, list_idx, seg_idx);
+	if (fd < 0)
+		return -1;
+
+	/* if we're able to take out a write lock, we're the last one
+	 * holding onto this page.
+	 */
+
+	ret = lock(fd, 0, ms->len, F_WRLCK);
+	if (ret >= 0) {
+		/* no one else is using this page */
+		if (ret == 1)
+			unlink(path);
+		ret = lock(fd, 0, ms->len, F_UNLCK);
+		if (ret != 1)
+			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+				__func__, path);
+	}
+	close(fd);
+
+	memset(ms, 0, sizeof(*ms));
+
+	return ret;
+}
+
 struct alloc_walk_param {
 	struct hugepage_info *hi;
 	struct rte_memseg **ms;
@@ -305,7 +347,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	struct alloc_walk_param *wa = arg;
 	struct rte_memseg_list *cur_msl;
 	size_t page_sz;
-	int cur_idx;
+	int cur_idx, start_idx, j;
 	unsigned int msl_idx, need, i;
 
 	if (msl->page_sz != wa->page_sz)
@@ -324,6 +366,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	cur_idx = rte_fbarray_find_next_n_free(&cur_msl->memseg_arr, 0, need);
 	if (cur_idx < 0)
 		return 0;
+	start_idx = cur_idx;
 
 	for (i = 0; i < need; i++, cur_idx++) {
 		struct rte_memseg *cur;
@@ -341,6 +384,25 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 			/* if exact number wasn't requested, stop */
 			if (!wa->exact)
 				goto out;
+
+			/* clean up */
+			for (j = start_idx; j < cur_idx; j++) {
+				struct rte_memseg *tmp;
+				struct rte_fbarray *arr =
+						&cur_msl->memseg_arr;
+
+				tmp = rte_fbarray_get(arr, j);
+				if (free_seg(tmp, wa->hi, msl_idx,
+						start_idx + j)) {
+					RTE_LOG(ERR, EAL, "Cannot free page\n");
+					continue;
+				}
+
+				rte_fbarray_set_free(arr, j);
+			}
+			/* clear the list */
+			if (wa->ms)
+				memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs);
 			return -1;
 		}
 		if (wa->ms)
@@ -351,7 +413,39 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 out:
 	wa->segs_allocated = i;
 	return 1;
+}
+
+struct free_walk_param {
+	struct hugepage_info *hi;
+	struct rte_memseg *ms;
+};
+static int
+free_seg_walk(const struct rte_memseg_list *msl, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *found_msl;
+	struct free_walk_param *wa = arg;
+	uintptr_t start_addr, end_addr;
+	int msl_idx, seg_idx;
 
+	start_addr = (uintptr_t) msl->base_va;
+	end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz;
+
+	if ((uintptr_t)wa->ms->addr < start_addr ||
+			(uintptr_t)wa->ms->addr >= end_addr)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+	seg_idx = RTE_PTR_DIFF(wa->ms->addr, start_addr) / msl->page_sz;
+
+	/* msl is const */
+	found_msl = &mcfg->memsegs[msl_idx];
+
+	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
+		return -1;
+
+	return 1;
 }
 
 int
@@ -427,3 +521,55 @@ eal_memalloc_alloc_seg(size_t page_sz, int socket)
 	/* return pointer to newly allocated memseg */
 	return ms;
 }
+
+int
+eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
+{
+	int seg, ret = 0;
+
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	for (seg = 0; seg < n_segs; seg++) {
+		struct rte_memseg *cur = ms[seg];
+		struct hugepage_info *hi = NULL;
+		struct free_walk_param wa;
+		int i, walk_res;
+
+		memset(&wa, 0, sizeof(wa));
+
+		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
+				i++) {
+			hi = &internal_config.hugepage_info[i];
+			if (cur->hugepage_sz == hi->hugepage_sz)
+				break;
+		}
+		if (i == (int)RTE_DIM(internal_config.hugepage_info)) {
+			RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+			ret = -1;
+			continue;
+		}
+
+		wa.ms = cur;
+		wa.hi = hi;
+
+		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		if (walk_res == 1)
+			continue;
+		if (walk_res == 0)
+			RTE_LOG(ERR, EAL, "Couldn't find memseg list\n");
+		ret = -1;
+	}
+	return ret;
+}
+
+int
+eal_memalloc_free_seg(struct rte_memseg *ms)
+{
+	/* dynamic free not supported in legacy mode */
+	if (internal_config.legacy_mem)
+		return -1;
+
+	return eal_memalloc_free_seg_bulk(&ms, 1);
+}
-- 
2.7.4

* [PATCH v6 53/70] eal: add single file segments command-line option
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (53 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 52/70] mem: add support for unmapping pages " Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 54/70] mem: add internal API to check if memory is contiguous Anatoly Burakov
                           ` (16 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Currently, DPDK stores all pages as separate files in hugetlbfs.
This option will allow storing all pages in one file (one file
per memseg list).

We do this by using fallocate() calls, which on hugetlbfs are only
supported by fairly recent (4.3+) Linux kernels, so an ftruncate()
fallback is provided to grow (but not shrink) hugepage files.
The naming scheme is deterministic, so both primary and secondary
processes will be able to easily map the needed files and offsets.

For multi-file segments, we can close fd's right away. For
single-file segments, we can reuse the same fd and reduce the
number of fd's needed to map/use hugepages. However, we need to
store the fd's somewhere, so we add a tailq.
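
For illustration, a minimal sketch of the per-offset grow/punch idea
(Linux-specific, and assuming a kernel where fallocate() works on
hugetlbfs); "grow_page" and "punch_page" are hypothetical names, the
patch's own helper for this is resize_hugefile():

  #define _GNU_SOURCE     /* for fallocate() and FALLOC_FL_* flags */
  #include <fcntl.h>
  #include <stdint.h>

  /* each page lives at a deterministic offset: seg_idx * page_sz */

  static int
  grow_page(int fd, uint64_t seg_idx, uint64_t page_sz)
  {
          /* ensure [offset, offset + page_sz) is backed by the file */
          return fallocate(fd, 0, seg_idx * page_sz, page_sz);
  }

  static int
  punch_page(int fd, uint64_t seg_idx, uint64_t page_sz)
  {
          /* free only this page's range; apparent file size is kept */
          return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                          seg_idx * page_sz, page_sz);
  }

On kernels without hugetlbfs fallocate() support only the grow path can
be emulated (with ftruncate()), which is why hugepage deallocation gets
disabled in that case.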

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_options.c |   4 +
 lib/librte_eal/common/eal_internal_cfg.h   |   4 +
 lib/librte_eal/common/eal_options.h        |   2 +
 lib/librte_eal/linuxapp/eal/eal.c          |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 337 ++++++++++++++++++++++++-----
 5 files changed, 297 insertions(+), 51 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index 5cc5a8a..e764e43 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -74,6 +74,7 @@ eal_long_options[] = {
 	{OPT_VFIO_INTR,         1, NULL, OPT_VFIO_INTR_NUM        },
 	{OPT_VMWARE_TSC_MAP,    0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
 	{OPT_LEGACY_MEM,        0, NULL, OPT_LEGACY_MEM_NUM       },
+	{OPT_SINGLE_FILE_SEGMENTS, 0, NULL, OPT_SINGLE_FILE_SEGMENTS_NUM},
 	{0,                     0, NULL, 0                        }
 };
 
@@ -1190,6 +1191,9 @@ eal_parse_common_option(int opt, const char *optarg,
 	case OPT_LEGACY_MEM_NUM:
 		conf->legacy_mem = 1;
 		break;
+	case OPT_SINGLE_FILE_SEGMENTS_NUM:
+		conf->single_file_segments = 1;
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 5cf7102..9d33cf4 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -51,6 +51,10 @@ struct internal_config {
 	/**< true to enable legacy memory behavior (no dynamic allocation,
 	 * IOVA-contiguous segments).
 	 */
+	volatile unsigned single_file_segments;
+	/**< true if storing all pages within single files (per-page-size,
+	 * per-node) non-legacy mode only.
+	 */
 	volatile int syslog_facility;	  /**< facility passed to openlog() */
 	/** default interrupt mode for VFIO */
 	volatile enum rte_intr_mode vfio_intr_mode;
diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h
index d301d0b..211ae06 100644
--- a/lib/librte_eal/common/eal_options.h
+++ b/lib/librte_eal/common/eal_options.h
@@ -57,6 +57,8 @@ enum {
 	OPT_VMWARE_TSC_MAP_NUM,
 #define OPT_LEGACY_MEM    "legacy-mem"
 	OPT_LEGACY_MEM_NUM,
+#define OPT_SINGLE_FILE_SEGMENTS    "single-file-segments"
+	OPT_SINGLE_FILE_SEGMENTS_NUM,
 	OPT_LONG_MAX_NUM
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9832551..2c12811 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -349,6 +349,7 @@ eal_usage(const char *prgname)
 	       "  --"OPT_CREATE_UIO_DEV"    Create /dev/uioX (usually done by hotplug)\n"
 	       "  --"OPT_VFIO_INTR"         Interrupt mode for VFIO (legacy|msi|msix)\n"
 	       "  --"OPT_LEGACY_MEM"        Legacy memory mode (no dynamic allocation, contiguous segments)\n"
+	       "  --"OPT_SINGLE_FILE_SEGMENTS" Put all hugepage memory in single files\n"
 	       "\n");
 	/* Allow the application to print its usage message too if hook is set */
 	if ( rte_application_usage_hook ) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 11ef742..46b71e3 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -39,6 +39,31 @@
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+/*
+ * not all kernel versions support fallocate on hugetlbfs, so fall back to
+ * ftruncate and disallow deallocation if fallocate is not supported.
+ */
+static int fallocate_supported = -1; /* unknown */
+
+/*
+ * If each page is in a separate file, we can close fd's since we need each fd
+ * only once. However, in single file segments mode, we can get away with using
+ * a single fd per memseg list, but we need to store these fd's somewhere. Each
+ * fd is different within each process, so we'll store them in a local tailq.
+ */
+struct msl_entry {
+	TAILQ_ENTRY(msl_entry) next;
+	unsigned int msl_idx;
+	int fd;
+};
+
+/** Double linked list of memseg list fd's. */
+TAILQ_HEAD(msl_entry_list, msl_entry);
+
+static struct msl_entry_list msl_entry_list =
+		TAILQ_HEAD_INITIALIZER(msl_entry_list);
+static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -129,18 +154,100 @@ resotre_numa(int *oldpolicy, struct bitmask *oldmask)
 }
 #endif
 
+static struct msl_entry *
+get_msl_entry_by_idx(unsigned int list_idx)
+{
+	struct msl_entry *te;
+
+	rte_spinlock_lock(&tailq_lock);
+
+	TAILQ_FOREACH(te, &msl_entry_list, next) {
+		if (te->msl_idx == list_idx)
+			break;
+	}
+	if (te == NULL) {
+		/* doesn't exist, so create it and set fd to -1 */
+
+		te = malloc(sizeof(*te));
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			goto unlock;
+		}
+		te->msl_idx = list_idx;
+		te->fd = -1;
+		TAILQ_INSERT_TAIL(&msl_entry_list, te, next);
+	}
+unlock:
+	rte_spinlock_unlock(&tailq_lock);
+	return te;
+}
+
+/*
+ * uses fstat to report the size of a file on disk
+ */
+static off_t
+get_file_size(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return 0;
+	return st.st_size;
+}
+
+/*
+ * uses fstat to check if the file occupies any blocks on disk (st_size alone
+ * won't reflect holes punched by fallocate, so check allocated blocks instead)
+ */
+static bool
+is_zero_length(int fd)
+{
+	struct stat st;
+	if (fstat(fd, &st) < 0)
+		return false;
+	return st.st_blocks == 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
 	int fd;
-	eal_get_hugefile_path(path, buflen, hi->hugedir,
-			list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
-	fd = open(path, O_CREAT | O_RDWR, 0600);
-	if (fd < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
-				strerror(errno));
-		return -1;
+
+	if (internal_config.single_file_segments) {
+		/*
+		 * try to find a tailq entry for this memseg list, or create
+		 * one if it doesn't exist.
+		 */
+		struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+		if (te == NULL) {
+			RTE_LOG(ERR, EAL, "%s(): cannot allocate tailq entry for memseg list\n",
+				__func__);
+			return -1;
+		} else if (te->fd < 0) {
+			/* create a hugepage file */
+			eal_get_hugefile_path(path, buflen, hi->hugedir,
+					list_idx);
+			fd = open(path, O_CREAT | O_RDWR, 0600);
+			if (fd < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			te->fd = fd;
+		} else {
+			fd = te->fd;
+		}
+	} else {
+		/* one file per page, just create it */
+		eal_get_hugefile_path(path, buflen, hi->hugedir,
+				list_idx * RTE_MAX_MEMSEG_PER_LIST + seg_idx);
+		fd = open(path, O_CREAT | O_RDWR, 0600);
+		if (fd < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): open failed: %s\n", __func__,
+					strerror(errno));
+			return -1;
+		}
 	}
 	return fd;
 }
@@ -173,6 +280,94 @@ static int lock(int fd, uint64_t offset, uint64_t len, int type)
 }
 
 static int
+resize_hugefile(int fd, uint64_t fa_offset, uint64_t page_sz,
+		bool grow)
+{
+	bool again = false;
+	do {
+		if (fallocate_supported == 0) {
+			/* we cannot deallocate memory if fallocate() is not
+			 * supported, but locks are still needed to prevent
+			 * primary process' initialization from clearing out
+			 * huge pages used by this process.
+			 */
+
+			if (!grow) {
+				RTE_LOG(DEBUG, EAL, "%s(): fallocate not supported, not freeing page back to the system\n",
+					__func__);
+				return -1;
+			}
+			uint64_t new_size = fa_offset + page_sz;
+			uint64_t cur_size = get_file_size(fd);
+
+			/* fallocate isn't supported, fall back to ftruncate */
+			if (new_size > cur_size &&
+					ftruncate(fd, new_size) < 0) {
+				RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+					__func__, strerror(errno));
+				return -1;
+			}
+			/* not being able to take out a read lock is an error */
+			if (lock(fd, fa_offset, page_sz, F_RDLCK) != 1)
+				return -1;
+		} else {
+			int flags = grow ? 0 : FALLOC_FL_PUNCH_HOLE |
+					FALLOC_FL_KEEP_SIZE;
+			int ret;
+
+			/* if fallocate() is supported, we need to take out a
+			 * read lock on allocate (to prevent other processes
+			 * from deallocating this page), and take out a write
+			 * lock on deallocate (to ensure nobody else is using
+			 * this page).
+			 *
+			 * we can't use flock() for this, as we actually need to
+			 * lock part of the file, not the entire file.
+			 */
+
+			if (!grow) {
+				ret = lock(fd, fa_offset, page_sz, F_WRLCK);
+
+				if (ret < 0)
+					return -1;
+				else if (ret == 0)
+					/* failed to lock, not an error */
+					return 0;
+			}
+			if (fallocate(fd, flags, fa_offset, page_sz) < 0) {
+				if (fallocate_supported == -1 &&
+						errno == ENOTSUP) {
+					RTE_LOG(ERR, EAL, "%s(): fallocate() not supported, hugepage deallocation will be disabled\n",
+						__func__);
+					again = true;
+					fallocate_supported = 0;
+				} else {
+					RTE_LOG(DEBUG, EAL, "%s(): fallocate() failed: %s\n",
+						__func__,
+						strerror(errno));
+					return -1;
+				}
+			} else {
+				fallocate_supported = 1;
+
+				if (grow) {
+					/* if can't read lock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_RDLCK) != 1)
+						return -1;
+				} else {
+					/* if can't unlock, it's an error */
+					if (lock(fd, fa_offset, page_sz,
+							F_UNLCK) != 1)
+						return -1;
+				}
+			}
+		}
+	} while (again);
+	return 0;
+}
+
+static int
 alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		struct hugepage_info *hi, unsigned int list_idx,
 		unsigned int seg_idx)
@@ -191,34 +386,40 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 		return -1;
 
 	alloc_sz = hi->hugepage_sz;
-
-	map_offset = 0;
-	if (ftruncate(fd, alloc_sz) < 0) {
-		RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
-			__func__, strerror(errno));
-		goto resized;
-	}
-	/* we've allocated a page - take out a read lock. we're using fcntl()
-	 * locks rather than flock() here because doing that gives us one huge
-	 * advantage - fcntl() locks are per-process, not per-file descriptor,
-	 * which means that we don't have to keep the original fd's around to
-	 * keep a lock on the file.
-	 *
-	 * this is useful, because when it comes to unmapping pages, we will
-	 * have to take out a write lock (to figure out if another process still
-	 * has this page mapped), and to do itwith flock() we'll have to use
-	 * original fd, as lock is associated with that particular fd. with
-	 * fcntl(), this is not necessary - we can open a new fd and use fcntl()
-	 * on that.
-	 */
-	ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
-
-	/* this should not fail */
-	if (ret != 1) {
-		RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
-			__func__,
-			strerror(errno));
-		goto resized;
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * alloc_sz;
+		ret = resize_hugefile(fd, map_offset, alloc_sz, true);
+		if (ret < 1)
+			goto resized;
+	} else {
+		map_offset = 0;
+		if (ftruncate(fd, alloc_sz) < 0) {
+			RTE_LOG(DEBUG, EAL, "%s(): ftruncate() failed: %s\n",
+				__func__, strerror(errno));
+			goto resized;
+		}
+		/* we've allocated a page - take out a read lock. we're using
+		 * fcntl() locks rather than flock() here because doing that
+		 * gives us one huge advantage - fcntl() locks are per-process,
+		 * not per-file descriptor, which means that we don't have to
+		 * keep the original fd's around to keep a lock on the file.
+		 *
+		 * this is useful, because when it comes to unmapping pages, we
+		 * will have to take out a write lock (to figure out if another
+		 * process still has this page mapped), and to do it with flock()
+		 * we'll have to use original fd, as lock is associated with
+		 * that particular fd. with fcntl(), this is not necessary - we
+		 * can open a new fd and use fcntl() on that.
+		 */
+		ret = lock(fd, map_offset, alloc_sz, F_RDLCK);
+
+		/* this should not fail */
+		if (ret != 1) {
+			RTE_LOG(ERR, EAL, "%s(): error locking file: %s\n",
+				__func__,
+				strerror(errno));
+			goto resized;
+		}
 	}
 
 	/*
@@ -227,7 +428,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	 */
 	void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
 			MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
-	close(fd);
+	/* for non-single file segments, we can close fd here */
+	if (!internal_config.single_file_segments)
+		close(fd);
 
 	if (va == MAP_FAILED) {
 		RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
@@ -284,8 +487,21 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 mapped:
 	munmap(addr, alloc_sz);
 resized:
-	close(fd);
-	unlink(path);
+	if (internal_config.single_file_segments) {
+		resize_hugefile(fd, map_offset, alloc_sz, false);
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
+			/* ignore errors, can't make it any worse */
+			unlink(path);
+		}
+	} else {
+		close(fd);
+		unlink(path);
+	}
 	return -1;
 }
 
@@ -293,6 +509,7 @@ static int
 free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
 {
+	uint64_t map_offset;
 	char path[PATH_MAX];
 	int fd, ret;
 
@@ -310,21 +527,39 @@ free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
 	if (fd < 0)
 		return -1;
 
-	/* if we're able to take out a write lock, we're the last one
-	 * holding onto this page.
-	 */
-
-	ret = lock(fd, 0, ms->len, F_WRLCK);
-	if (ret >= 0) {
-		/* no one else is using this page */
-		if (ret == 1)
+	if (internal_config.single_file_segments) {
+		map_offset = seg_idx * ms->len;
+		if (resize_hugefile(fd, map_offset, ms->len, false))
+			return -1;
+		/* if file is zero-length, we've already shrunk it, so it's
+		 * safe to remove.
+		 */
+		if (is_zero_length(fd)) {
+			struct msl_entry *te = get_msl_entry_by_idx(list_idx);
+			if (te != NULL && te->fd >= 0) {
+				close(te->fd);
+				te->fd = -1;
+			}
 			unlink(path);
-		ret = lock(fd, 0, ms->len, F_UNLCK);
-		if (ret != 1)
-			RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
-				__func__, path);
+		}
+		ret = 0;
+	} else {
+		/* if we're able to take out a write lock, we're the last one
+		 * holding onto this page.
+		 */
+
+		ret = lock(fd, 0, ms->len, F_WRLCK);
+		if (ret >= 0) {
+			/* no one else is using this page */
+			if (ret == 1)
+				unlink(path);
+			ret = lock(fd, 0, ms->len, F_UNLCK);
+			if (ret != 1)
+				RTE_LOG(ERR, EAL, "%s(): unable to unlock file %s\n",
+					__func__, path);
+		}
+		close(fd);
 	}
-	close(fd);
 
 	memset(ms, 0, sizeof(*ms));
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 54/70] mem: add internal API to check if memory is contiguous
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (54 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 53/70] eal: add single file segments command-line option Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 55/70] mem: prepare memseg lists for multiprocess sync Anatoly Burakov
                           ` (15 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For now, memory is always contiguous because legacy mem mode is
enabled unconditionally, but this function will be helpful down
the line when we implement support for allocating physically
non-contiguous memory. We can no longer guarantee physically
contiguous memory unless we're in legacy or IOVA_AS_VA mode, but
we can certainly try and see if we succeed.

In addition, this will be useful for e.g. PMDs that may allocate
chunks that are smaller than the page size but must not cross a
page boundary, in which case we will be able to accommodate that
request. This function will also support non-hugepage memory.
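
As a minimal usage sketch (illustrative only, not code added by this
patch), an EAL-internal caller could use the new helper as below before
handing a buffer to hardware. rte_mem_virt2memseg_list() is assumed here
to be available (it is introduced elsewhere in this series) to find the
memseg list that owns the address:

  #include <rte_memory.h>
  #include "eal_memalloc.h"

  /* return 1 if the region can be treated as one IOVA-contiguous block */
  static int
  region_is_iova_contig(void *addr, size_t len)
  {
  	const struct rte_memseg_list *msl;

  	msl = rte_mem_virt2memseg_list(addr);
  	if (msl == NULL)
  		return 0;
  	return eal_memalloc_is_contig(msl, addr, len) ? 1 : 0;
  }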

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/Makefile          |  1 +
 lib/librte_eal/common/eal_common_memalloc.c | 90 +++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_memalloc.h        |  9 +++
 lib/librte_eal/common/malloc_elem.c         | 40 +------------
 lib/librte_eal/common/meson.build           |  1 +
 lib/librte_eal/linuxapp/eal/Makefile        |  1 +
 6 files changed, 105 insertions(+), 37 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_memalloc.c

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 19f9322..907e30d 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -41,6 +41,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_errno.c
diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
new file mode 100644
index 0000000..607ec3f
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -0,0 +1,90 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Intel Corporation
+ */
+
+#include <rte_lcore.h>
+#include <rte_fbarray.h>
+#include <rte_memzone.h>
+#include <rte_memory.h>
+#include <rte_eal_memconfig.h>
+
+#include "eal_private.h"
+#include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
+
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len)
+{
+	void *end, *aligned_start, *aligned_end;
+	size_t pgsz = (size_t)msl->page_sz;
+	const struct rte_memseg *ms;
+
+	/* for IOVA_VA, it's always contiguous */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA)
+		return true;
+
+	/* for legacy memory, it's always contiguous */
+	if (internal_config.legacy_mem)
+		return true;
+
+	end = RTE_PTR_ADD(start, len);
+
+	/* for nohuge, we check pagemap, otherwise check memseg */
+	if (!rte_eal_has_hugepages()) {
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		cur = rte_mem_virt2iova(aligned_start);
+		expected = cur + pgsz;
+		aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+
+		while (aligned_start < aligned_end) {
+			cur = rte_mem_virt2iova(aligned_start);
+			if (cur != expected)
+				return false;
+			aligned_start = RTE_PTR_ADD(aligned_start, pgsz);
+			expected += pgsz;
+		}
+	} else {
+		int start_seg, end_seg, cur_seg;
+		rte_iova_t cur, expected;
+
+		aligned_start = RTE_PTR_ALIGN_FLOOR(start, pgsz);
+		aligned_end = RTE_PTR_ALIGN_CEIL(end, pgsz);
+
+		start_seg = RTE_PTR_DIFF(aligned_start, msl->base_va) /
+				pgsz;
+		end_seg = RTE_PTR_DIFF(aligned_end, msl->base_va) /
+				pgsz;
+
+		/* if start and end are on the same page, bail out early */
+		if (RTE_PTR_DIFF(aligned_end, aligned_start) == pgsz)
+			return true;
+
+		/* skip first iteration */
+		ms = rte_fbarray_get(&msl->memseg_arr, start_seg);
+		cur = ms->iova;
+		expected = cur + pgsz;
+
+		/* if we can't access IOVA addresses, assume non-contiguous */
+		if (cur == RTE_BAD_IOVA)
+			return false;
+
+		for (cur_seg = start_seg + 1; cur_seg < end_seg;
+				cur_seg++, expected += pgsz) {
+			ms = rte_fbarray_get(&msl->memseg_arr, cur_seg);
+
+			if (ms->iova != expected)
+				return false;
+		}
+	}
+	return true;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 8616793..c4a4abe 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -8,6 +8,7 @@
 #include <stdbool.h>
 
 #include <rte_memory.h>
+#include <rte_eal_memconfig.h>
 
 /*
  * Allocate segment of specified page size.
@@ -42,4 +43,12 @@ eal_memalloc_free_seg(struct rte_memseg *ms);
 int
 eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs);
 
+/*
+ * Check if memory pointed to by `start` and of `length` that resides in
+ * memseg list `msl` is IOVA-contiguous.
+ */
+bool
+eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
+		size_t len);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 685aac4..9db416f 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -18,6 +18,7 @@
 #include <rte_common.h>
 #include <rte_spinlock.h>
 
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -100,45 +101,10 @@ malloc_elem_insert(struct malloc_elem *elem)
  * so we just check the page addresses.
  */
 static bool
-elem_check_phys_contig(const struct rte_memseg_list *msl __rte_unused,
+elem_check_phys_contig(const struct rte_memseg_list *msl,
 		void *start, size_t size)
 {
-	rte_iova_t cur, expected;
-	void *start_page, *end_page, *cur_page;
-	size_t pagesz;
-
-	/* for hugepage memory or IOVA as VA, it's always contiguous */
-	if (rte_eal_has_hugepages() || rte_eal_iova_mode() == RTE_IOVA_VA)
-		return true;
-
-	/* otherwise, check if start and end are within the same page */
-	pagesz = getpagesize();
-
-	start_page = RTE_PTR_ALIGN_FLOOR(start, pagesz);
-	end_page = RTE_PTR_ALIGN_FLOOR(RTE_PTR_ADD(start, size - 1), pagesz);
-
-	if (start_page == end_page)
-		return true;
-
-	/* if they are from different pages, check if they are contiguous */
-
-	/* if we can't access physical addresses, assume non-contiguous */
-	if (!rte_eal_using_phys_addrs())
-		return false;
-
-	/* skip first iteration */
-	cur = rte_mem_virt2iova(start_page);
-	expected = cur + pagesz;
-	cur_page = RTE_PTR_ADD(start_page, pagesz);
-
-	while (cur_page <= end_page) {
-		cur = rte_mem_virt2iova(cur_page);
-		if (cur != expected)
-			return false;
-		cur_page = RTE_PTR_ADD(cur_page, pagesz);
-		expected += pagesz;
-	}
-	return true;
+	return eal_memalloc_is_contig(msl, start, size);
 }
 
 /*
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index 7d02191..a1ada24 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -16,6 +16,7 @@ common_sources = files(
 	'eal_common_launch.c',
 	'eal_common_lcore.c',
 	'eal_common_log.c',
+	'eal_common_memalloc.c',
 	'eal_common_memory.c',
 	'eal_common_memzone.c',
 	'eal_common_options.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index af6b9be..5380ba8 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -49,6 +49,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memalloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memory.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_tailqs.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_errno.c
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 55/70] mem: prepare memseg lists for multiprocess sync
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (55 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 54/70] mem: add internal API to check if memory is contiguous Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
                           ` (14 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

In preparation for implementing multiprocess support, we are adding
a version number to memseg lists. We will not need any locks, because
memory hotplug will have a global lock (so any time memory map and
thus version number might change, we will already be holding a lock).

There are two ways of implementing multiprocess support for memory
hotplug: either all information about mapped memory is shared
between processes, and secondary processes simply attempt to
map/unmap memory based on requests from the primary, or secondary
processes store their own maps and only check if they are in sync
with the primary process' maps.

This implementation opts for the latter option: primary process
shared mappings will be authoritative, and each secondary process
will use its own internal view of mapped memory, and will attempt
to synchronize on these mappings using versioning.

Under this model, only the primary process will decide which pages
get mapped, and secondary processes will only copy the primary's
page maps and get notified of the changes via the IPC mechanism
(coming in later commits).
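
As a minimal sketch of the versioning idea (illustrative only; the real
logic is in sync_walk() added below, which additionally creates the
secondary's local fbarray and finds the matching hugepage_info entry):

  #include <stdbool.h>
  #include <rte_eal_memconfig.h>

  /* does this memseg list need to be re-synchronized with the primary? */
  static bool
  msl_needs_sync(const struct rte_memseg_list *primary_msl,
  		const struct rte_memseg_list *local_msl)
  {
  	/* first use, or the primary changed the map since we last looked */
  	return local_msl->base_va == NULL ||
  			local_msl->version != primary_msl->version;
  }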

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c          |   7 +
 lib/librte_eal/common/eal_memalloc.h              |   4 +
 lib/librte_eal/common/include/rte_eal_memconfig.h |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        | 250 ++++++++++++++++++++++
 4 files changed, 262 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index e7bcd2b..461732f 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -39,3 +39,10 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms __rte_unused,
 	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
 	return -1;
 }
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
+	return -1;
+}
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index c4a4abe..8ca1fac 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -51,4 +51,8 @@ bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len);
 
+/* synchronize local memory map to primary process */
+int
+eal_memalloc_sync_with_primary(void);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index 88cde8c..a781793 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -32,6 +32,7 @@ struct rte_memseg_list {
 	};
 	int socket_id; /**< Socket ID for all memsegs in this list. */
 	uint64_t page_sz; /**< Page size for all memsegs in this list. */
+	volatile uint32_t version; /**< version number for multiprocess sync. */
 	struct rte_fbarray memseg_arr;
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 46b71e3..4876d07 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -64,6 +64,9 @@ static struct msl_entry_list msl_entry_list =
 		TAILQ_HEAD_INITIALIZER(msl_entry_list);
 static rte_spinlock_t tailq_lock = RTE_SPINLOCK_INITIALIZER;
 
+/** local copy of a memory map, used to synchronize memory hotplug in MP */
+static struct rte_memseg_list local_memsegs[RTE_MAX_MEMSEG_LISTS];
+
 static sigjmp_buf huge_jmpenv;
 
 static void __rte_unused huge_sigbus_handler(int signo __rte_unused)
@@ -647,6 +650,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	}
 out:
 	wa->segs_allocated = i;
+	if (i > 0)
+		cur_msl->version++;
 	return 1;
 }
 
@@ -676,7 +681,10 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
 	/* msl is const */
 	found_msl = &mcfg->memsegs[msl_idx];
 
+	found_msl->version++;
+
 	rte_fbarray_set_free(&found_msl->memseg_arr, seg_idx);
+
 	if (free_seg(wa->ms, wa->hi, msl_idx, seg_idx))
 		return -1;
 
@@ -808,3 +816,245 @@ eal_memalloc_free_seg(struct rte_memseg *ms)
 
 	return eal_memalloc_free_seg_bulk(&ms, 1);
 }
+
+static int
+sync_chunk(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used, int start, int end)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int i, ret, chunk_len, diff_len;
+
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	/* we need to aggregate allocations/deallocations into bigger chunks,
+	 * as we don't want to spam the user with per-page callbacks.
+	 *
+	 * to avoid any potential issues, we also want to trigger
+	 * deallocation callbacks *before* we actually deallocate
+	 * memory, so that the user application could wrap up its use
+	 * before it goes away.
+	 */
+
+	chunk_len = end - start;
+
+	/* find how many contiguous pages we can map/unmap for this chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_free(l_arr, start) :
+			rte_fbarray_find_contig_used(l_arr, start);
+
+	/* has to be at least one page */
+	if (diff_len < 1)
+		return -1;
+
+	diff_len = RTE_MIN(chunk_len, diff_len);
+
+	for (i = 0; i < diff_len; i++) {
+		struct rte_memseg *p_ms, *l_ms;
+		int seg_idx = start + i;
+
+		l_ms = rte_fbarray_get(l_arr, seg_idx);
+		p_ms = rte_fbarray_get(p_arr, seg_idx);
+
+		if (l_ms == NULL || p_ms == NULL)
+			return -1;
+
+		if (used) {
+			ret = alloc_seg(l_ms, p_ms->addr,
+					p_ms->socket_id, hi,
+					msl_idx, seg_idx);
+			if (ret < 0)
+				return -1;
+			rte_fbarray_set_used(l_arr, seg_idx);
+		} else {
+			ret = free_seg(l_ms, hi, msl_idx, seg_idx);
+			rte_fbarray_set_free(l_arr, seg_idx);
+			if (ret < 0)
+				return -1;
+		}
+	}
+
+	/* calculate how much we can advance until next chunk */
+	diff_len = used ?
+			rte_fbarray_find_contig_used(l_arr, start) :
+			rte_fbarray_find_contig_free(l_arr, start);
+	ret = RTE_MIN(chunk_len, diff_len);
+
+	return ret;
+}
+
+static int
+sync_status(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx, bool used)
+{
+	struct rte_fbarray *l_arr, *p_arr;
+	int p_idx, l_chunk_len, p_chunk_len, ret;
+	int start, end;
+
+	/* this is a little bit tricky, but the basic idea is - walk both lists
+	 * and spot any places where there are discrepancies. walking both lists
+	 * and noting discrepancies in a single go is a hard problem, so we do
+	 * it in two passes - first we spot any places where allocated segments
+	 * mismatch (i.e. ensure that everything that's allocated in the primary
+	 * is also allocated in the secondary), and then we do it by looking at
+	 * free segments instead.
+	 *
+	 * we also need to aggregate changes into chunks, as we have to call
+	 * callbacks per allocation, not per page.
+	 */
+	l_arr = &local_msl->memseg_arr;
+	p_arr = &primary_msl->memseg_arr;
+
+	if (used)
+		p_idx = rte_fbarray_find_next_used(p_arr, 0);
+	else
+		p_idx = rte_fbarray_find_next_free(p_arr, 0);
+
+	while (p_idx >= 0) {
+		int next_chunk_search_idx;
+
+		if (used) {
+			p_chunk_len = rte_fbarray_find_contig_used(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_used(l_arr,
+					p_idx);
+		} else {
+			p_chunk_len = rte_fbarray_find_contig_free(p_arr,
+					p_idx);
+			l_chunk_len = rte_fbarray_find_contig_free(l_arr,
+					p_idx);
+		}
+		/* best case scenario - no differences (or bigger, which will be
+		 * fixed during next iteration), look for next chunk
+		 */
+		if (l_chunk_len >= p_chunk_len) {
+			next_chunk_search_idx = p_idx + p_chunk_len;
+			goto next_chunk;
+		}
+
+		/* if both chunks start at the same point, skip parts we know
+		 * are identical, and sync the rest. each call to sync_chunk
+		 * will only sync contiguous segments, so we need to call this
+		 * until we are sure there are no more differences in this
+		 * chunk.
+		 */
+		start = p_idx + l_chunk_len;
+		end = p_idx + p_chunk_len;
+		do {
+			ret = sync_chunk(primary_msl, local_msl, hi, msl_idx,
+					used, start, end);
+			start += ret;
+		} while (start < end && ret >= 0);
+		/* if ret is negative, something went wrong */
+		if (ret < 0)
+			return -1;
+
+		next_chunk_search_idx = p_idx + p_chunk_len;
+next_chunk:
+		/* skip to end of this chunk */
+		if (used) {
+			p_idx = rte_fbarray_find_next_used(p_arr,
+					next_chunk_search_idx);
+		} else {
+			p_idx = rte_fbarray_find_next_free(p_arr,
+					next_chunk_search_idx);
+		}
+	}
+	return 0;
+}
+
+static int
+sync_existing(struct rte_memseg_list *primary_msl,
+		struct rte_memseg_list *local_msl, struct hugepage_info *hi,
+		unsigned int msl_idx)
+{
+	int ret;
+
+	/* ensure all allocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, true);
+	if (ret < 0)
+		return -1;
+
+	/* ensure all unallocated space is the same in both lists */
+	ret = sync_status(primary_msl, local_msl, hi, msl_idx, false);
+	if (ret < 0)
+		return -1;
+
+	/* update version number */
+	local_msl->version = primary_msl->version;
+
+	return 0;
+}
+
+static int
+sync_walk(const struct rte_memseg_list *msl, void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *primary_msl, *local_msl;
+	struct hugepage_info *hi = NULL;
+	unsigned int i;
+	int msl_idx;
+	bool new_msl = false;
+
+	msl_idx = msl - mcfg->memsegs;
+	primary_msl = &mcfg->memsegs[msl_idx];
+	local_msl = &local_memsegs[msl_idx];
+
+	/* check if secondary has this memseg list set up */
+	if (local_msl->base_va == NULL) {
+		char name[PATH_MAX];
+		int ret;
+		new_msl = true;
+
+		/* create distinct fbarrays for each secondary */
+		snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+			primary_msl->memseg_arr.name, getpid());
+
+		ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+			primary_msl->memseg_arr.len,
+			primary_msl->memseg_arr.elt_sz);
+		if (ret < 0) {
+			RTE_LOG(ERR, EAL, "Cannot initialize local memory map\n");
+			return -1;
+		}
+
+		local_msl->base_va = primary_msl->base_va;
+	}
+
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		uint64_t cur_sz =
+			internal_config.hugepage_info[i].hugepage_sz;
+		uint64_t msl_sz = primary_msl->page_sz;
+		if (msl_sz == cur_sz) {
+			hi = &internal_config.hugepage_info[i];
+			break;
+		}
+	}
+	if (!hi) {
+		RTE_LOG(ERR, EAL, "Can't find relevant hugepage_info entry\n");
+		return -1;
+	}
+
+	/* if versions don't match or if we have just allocated a new
+	 * memseg list, synchronize everything
+	 */
+	if ((new_msl || local_msl->version != primary_msl->version) &&
+			sync_existing(primary_msl, local_msl, hi, msl_idx))
+		return -1;
+	return 0;
+}
+
+
+int
+eal_memalloc_sync_with_primary(void)
+{
+	/* nothing to be done in primary */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	if (rte_memseg_list_walk(sync_walk, NULL))
+		return -1;
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 56/70] eal: read hugepage counts from node-specific sysfs path
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (56 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 55/70] mem: prepare memseg lists for multiprocess sync Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 57/70] eal: make use of memory hotplug for init Anatoly Burakov
                           ` (13 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

For non-legacy memory init mode, instead of looking at the generic
sysfs path, look at the sysfs paths pertaining to each NUMA node
for hugepage counts. Note that the per-NUMA node paths do not
provide information about reserved pages, so the counts may not be
exact, but this saves us from the whole mapping/remapping business
that was previously needed before we could tell which page is on
which socket, since we no longer require our memory to be
physically contiguous.

Legacy memory init will not use this.
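
For reference, a minimal sketch of the per-node lookup (illustrative
only; get_num_hugepages_on_node() below builds the same kind of path
from the node index and size directory, and the exact hugepages-<size>
directory names depend on the system):

  #include <limits.h>
  #include <stdio.h>
  #include "eal_filesystem.h"	/* for eal_parse_sysfs_value() */

  /* read the free 2M hugepage count reported by the kernel for node 0 */
  static unsigned long
  free_2m_pages_on_node0(void)
  {
  	char path[PATH_MAX];
  	unsigned long num_pages = 0;

  	snprintf(path, sizeof(path), "%s",
  		"/sys/devices/system/node/node0/hugepages/"
  		"hugepages-2048kB/free_hugepages");
  	if (eal_parse_sysfs_value(path, &num_pages) < 0)
  		return 0; /* node or size directory not present */
  	return num_pages;
  }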

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 80 +++++++++++++++++++++++--
 1 file changed, 74 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index afebd42..2e0819f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -31,6 +31,7 @@
 #include "eal_filesystem.h"
 
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
+static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
@@ -71,6 +72,45 @@ get_num_hugepages(const char *subdir)
 	return num_pages;
 }
 
+static uint32_t
+get_num_hugepages_on_node(const char *subdir, unsigned int socket)
+{
+	char path[PATH_MAX], socketpath[PATH_MAX];
+	DIR *socketdir;
+	unsigned long num_pages = 0;
+	const char *nr_hp_file = "free_hugepages";
+
+	snprintf(socketpath, sizeof(socketpath), "%s/node%u/hugepages",
+		sys_pages_numa_dir_path, socket);
+
+	socketdir = opendir(socketpath);
+	if (socketdir) {
+		/* Keep calm and carry on */
+		closedir(socketdir);
+	} else {
+		/* Can't find socket dir, so ignore it */
+		return 0;
+	}
+
+	snprintf(path, sizeof(path), "%s/%s/%s",
+			socketpath, subdir, nr_hp_file);
+	if (eal_parse_sysfs_value(path, &num_pages) < 0)
+		return 0;
+
+	if (num_pages == 0)
+		RTE_LOG(WARNING, EAL, "No free hugepages reported in %s\n",
+				subdir);
+
+	/*
+	 * we want to return a uint32_t and more than this looks suspicious
+	 * anyway ...
+	 */
+	if (num_pages > UINT32_MAX)
+		num_pages = UINT32_MAX;
+
+	return num_pages;
+}
+
 static uint64_t
 get_default_hp_size(void)
 {
@@ -269,7 +309,7 @@ eal_hugepage_info_init(void)
 {
 	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
-	unsigned i, num_sizes = 0;
+	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
 	struct dirent *dirent;
 
@@ -323,9 +363,28 @@ eal_hugepage_info_init(void)
 		if (clear_hugedir(hpi->hugedir) == -1)
 			break;
 
-		/* for now, put all pages into socket 0,
-		 * later they will be sorted */
-		hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
+		/*
+		 * first, try to put all hugepages into relevant sockets, but
+		 * if the first attempt fails, fall back to collecting all pages
+		 * in one socket and sorting them later
+		 */
+		total_pages = 0;
+		/* we also don't want to do this for legacy init */
+		if (!internal_config.legacy_mem)
+			for (i = 0; i < rte_socket_count(); i++) {
+				int socket = rte_socket_id_by_idx(i);
+				unsigned int num_pages =
+						get_num_hugepages_on_node(
+							dirent->d_name, socket);
+				hpi->num_pages[socket] = num_pages;
+				total_pages += num_pages;
+			}
+		/*
+		 * we failed to sort memory from the get-go, so fall
+		 * back to the old way
+		 */
+		if (total_pages == 0)
+			hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
 #ifndef RTE_ARCH_64
 		/* for 32-bit systems, limit number of hugepages to
@@ -349,10 +408,19 @@ eal_hugepage_info_init(void)
 	      sizeof(internal_config.hugepage_info[0]), compare_hpi);
 
 	/* now we have all info, check we have at least one valid size */
-	for (i = 0; i < num_sizes; i++)
+	for (i = 0; i < num_sizes; i++) {
+		/* pages may no longer all be on socket 0, so check all */
+		unsigned int j, num_pages = 0;
+
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
+			struct hugepage_info *hpi =
+					&internal_config.hugepage_info[i];
+			num_pages += hpi->num_pages[j];
+		}
 		if (internal_config.hugepage_info[i].hugedir != NULL &&
-		    internal_config.hugepage_info[i].num_pages[0] > 0)
+				num_pages > 0)
 			return 0;
+	}
 
 	/* no valid hugepage mounts available, return error */
 	return -1;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 57/70] eal: make use of memory hotplug for init
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (57 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
                           ` (12 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Add a new (non-legacy) memory init path for EAL. It uses the
new memory hotplug facilities.

If no -m or --socket-mem switches are specified, the new init
will not allocate anything, whereas if those switches are passed,
appropriate amounts of pages are requested, just like for
legacy init.

Allocated pages will be physically discontiguous (or rather, they're
not guaranteed to be physically contiguous - they may still be so by
accident) unless RTE_IOVA_VA mode is used.
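
To illustrate the init-time difference described above (a minimal
sketch; the command-line values are arbitrary and the application code
itself does not change):

  #include <rte_eal.h>
  #include <rte_debug.h>

  /* run as `./app --socket-mem=512,512` to have the new init path request
   * 512 MB on sockets 0 and 1 up front; run with neither -m nor
   * --socket-mem and the memory map starts empty, growing on demand
   * through rte_malloc().
   */
  int
  main(int argc, char **argv)
  {
  	if (rte_eal_init(argc, argv) < 0)
  		rte_panic("Cannot init EAL\n");
  	/* ... application code ... */
  	return 0;
  }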

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 62 ++++++++++++++++++++++++++++++--
 1 file changed, 59 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index daab364..c68db32 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -40,6 +40,7 @@
 #include <rte_string_fns.h>
 
 #include "eal_private.h"
+#include "eal_memalloc.h"
 #include "eal_internal_cfg.h"
 #include "eal_filesystem.h"
 #include "eal_hugepages.h"
@@ -1601,6 +1602,61 @@ eal_legacy_hugepage_init(void)
 	return -1;
 }
 
+static int
+eal_hugepage_init(void)
+{
+	struct hugepage_info used_hp[MAX_HUGEPAGE_SIZES];
+	uint64_t memory[RTE_MAX_NUMA_NODES];
+	int hp_sz_idx, socket_id;
+
+	test_phys_addrs_available();
+
+	memset(used_hp, 0, sizeof(used_hp));
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int) internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		/* also initialize used_hp hugepage sizes in used_hp */
+		struct hugepage_info *hpi;
+		hpi = &internal_config.hugepage_info[hp_sz_idx];
+		used_hp[hp_sz_idx].hugepage_sz = hpi->hugepage_sz;
+	}
+
+	/* make a copy of socket_mem, needed for balanced allocation. */
+	for (hp_sz_idx = 0; hp_sz_idx < RTE_MAX_NUMA_NODES; hp_sz_idx++)
+		memory[hp_sz_idx] = internal_config.socket_mem[hp_sz_idx];
+
+	/* calculate final number of pages */
+	if (calc_num_pages_per_socket(memory,
+			internal_config.hugepage_info, used_hp,
+			internal_config.num_hugepage_sizes) < 0)
+		return -1;
+
+	for (hp_sz_idx = 0;
+			hp_sz_idx < (int)internal_config.num_hugepage_sizes;
+			hp_sz_idx++) {
+		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
+				socket_id++) {
+			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
+			unsigned int num_pages = hpi->num_pages[socket_id];
+			int num_pages_alloc;
+
+			if (num_pages == 0)
+				continue;
+
+			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
+				num_pages, hpi->hugepage_sz >> 20, socket_id);
+
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+					num_pages, hpi->hugepage_sz,
+					socket_id, true);
+			if (num_pages_alloc < 0)
+				return -1;
+		}
+	}
+	return 0;
+}
+
 /*
  * uses fstat to report the size of a file on disk
  */
@@ -1723,9 +1779,9 @@ eal_legacy_hugepage_attach(void)
 int
 rte_eal_hugepage_init(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_init();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_init() :
+			eal_hugepage_init();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 58/70] eal: share hugepage info primary and secondary
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (58 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 57/70] eal: make use of memory hotplug for init Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
                           ` (11 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Since we are going to need to map hugepages in both primary and
secondary processes, we need to know where we should look for
hugetlbfs mountpoints. So, share those with secondary processes,
and map them on init.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/eal.c                 |  19 ++--
 lib/librte_eal/bsdapp/eal/eal_hugepage_info.c   |  56 +++++++++--
 lib/librte_eal/bsdapp/eal/eal_memory.c          |  21 +---
 lib/librte_eal/common/eal_common_options.c      |   5 +-
 lib/librte_eal/common/eal_filesystem.h          |  17 ++++
 lib/librte_eal/common/eal_hugepages.h           |  10 +-
 lib/librte_eal/common/eal_internal_cfg.h        |   2 +-
 lib/librte_eal/linuxapp/eal/eal.c               |  18 ++--
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 121 ++++++++++++++++++++----
 lib/librte_eal/linuxapp/eal/eal_memory.c        |  15 +--
 10 files changed, 217 insertions(+), 67 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 54330e1..727adc5 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -289,7 +289,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -561,12 +561,17 @@ rte_eal_init(int argc, char **argv)
 	/* autodetect the iova mapping mode (default is iova_pa) */
 	rte_eal_get_configuration()->iova_mode = rte_bus_get_iommu_class();
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+			eal_hugepage_info_init() :
+			eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
index ba44da0..38d143c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/bsdapp/eal/eal_hugepage_info.c
@@ -19,10 +19,10 @@
  * Used in this file to store the hugepage file map on disk
  */
 static void *
-create_shared_memory(const char *filename, const size_t mem_size)
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
 {
 	void *retval;
-	int fd = open(filename, O_CREAT | O_RDWR, 0666);
+	int fd = open(filename, flags, 0666);
 	if (fd < 0)
 		return NULL;
 	if (ftruncate(fd, mem_size) < 0) {
@@ -34,6 +34,18 @@ create_shared_memory(const char *filename, const size_t mem_size)
 	return retval;
 }
 
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /*
  * No hugepage support on freebsd, but we dummy it, using contigmem driver
  */
@@ -46,13 +58,10 @@ eal_hugepage_info_init(void)
 	/* re-use the linux "internal config" structure for our memory data */
 	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
 	struct hugepage_info *tmp_hpi;
+	unsigned int i;
 
 	internal_config.num_hugepage_sizes = 1;
 
-	/* nothing more to be done for secondary */
-	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
-		return 0;
-
 	sysctl_size = sizeof(num_buffers);
 	error = sysctlbyname("hw.contigmem.num_buffers", &num_buffers,
 			&sysctl_size, NULL, 0);
@@ -87,7 +96,7 @@ eal_hugepage_info_init(void)
 		RTE_LOG(INFO, EAL, "Contigmem driver has %d buffers, each of size %dKB\n",
 				num_buffers, (int)(buffer_size>>10));
 
-	hpi->hugedir = CONTIGMEM_DEV;
+	snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", CONTIGMEM_DEV);
 	hpi->hugepage_sz = buffer_size;
 	hpi->num_pages[0] = num_buffers;
 	hpi->lock_descriptor = fd;
@@ -101,6 +110,14 @@ eal_hugepage_info_init(void)
 
 	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
 
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
 	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
 		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
 		return -1;
@@ -108,3 +125,28 @@ eal_hugepage_info_init(void)
 
 	return 0;
 }
+
+/* copy stuff from shared info into internal config */
+int
+eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	internal_config.num_hugepage_sizes = 1;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/bsdapp/eal/eal_memory.c b/lib/librte_eal/bsdapp/eal/eal_memory.c
index 2f5651d..b27262c 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memory.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memory.c
@@ -242,23 +242,10 @@ int
 rte_eal_hugepage_attach(void)
 {
 	const struct hugepage_info *hpi;
-	int fd_hugepage_info, fd_hugepage = -1;
+	int fd_hugepage = -1;
 	unsigned int i;
 
-	/* Obtain a file descriptor for hugepage_info */
-	fd_hugepage_info = open(eal_hugepage_info_path(), O_RDONLY);
-	if (fd_hugepage_info < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
-		return -1;
-	}
-
-	/* Map the shared hugepage_info into the process address spaces */
-	hpi = mmap(NULL, sizeof(internal_config.hugepage_info),
-			PROT_READ, MAP_PRIVATE, fd_hugepage_info, 0);
-	if (hpi == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
-		goto error;
-	}
+	hpi = &internal_config.hugepage_info[0];
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		const struct hugepage_info *cur_hpi = &hpi[i];
@@ -288,13 +275,9 @@ rte_eal_hugepage_attach(void)
 	}
 
 	/* hugepage_info is no longer required */
-	munmap((void *)(uintptr_t)hpi, sizeof(internal_config.hugepage_info));
-	close(fd_hugepage_info);
 	return 0;
 
 error:
-	if (fd_hugepage_info >= 0)
-		close(fd_hugepage_info);
 	if (fd_hugepage >= 0)
 		close(fd_hugepage);
 	return -1;
diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
index e764e43..40c5b26 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -179,8 +179,11 @@ eal_reset_internal_config(struct internal_config *internal_cfg)
 	for (i = 0; i < RTE_MAX_NUMA_NODES; i++)
 		internal_cfg->socket_mem[i] = 0;
 	/* zero out hugedir descriptors */
-	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++)
+	for (i = 0; i < MAX_HUGEPAGE_SIZES; i++) {
+		memset(&internal_cfg->hugepage_info[i], 0,
+				sizeof(internal_cfg->hugepage_info[0]));
 		internal_cfg->hugepage_info[i].lock_descriptor = -1;
+	}
 	internal_cfg->base_virtaddr = 0;
 
 	internal_cfg->syslog_facility = LOG_DAEMON;
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 1c6048b..ad059ef 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -85,6 +85,23 @@ eal_hugepage_info_path(void)
 	return buffer;
 }
 
+/** Path of hugepage info file. */
+#define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file"
+
+static inline const char *
+eal_hugepage_file_path(void)
+{
+	static char buffer[PATH_MAX]; /* static so auto-zeroed */
+	const char *directory = default_config_dir;
+	const char *home_dir = getenv("HOME");
+
+	if (getuid() != 0 && home_dir != NULL)
+		directory = home_dir;
+	snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_FILE_FMT, directory,
+			internal_config.hugefile_prefix);
+	return buffer;
+}
+
 /** String format for hugepage map files. */
 #define HUGEFILE_FMT "%s/%smap_%d"
 #define TEMP_HUGEFILE_FMT "%s/%smap_temp_%d"
diff --git a/lib/librte_eal/common/eal_hugepages.h b/lib/librte_eal/common/eal_hugepages.h
index ad1b0b6..4582f19 100644
--- a/lib/librte_eal/common/eal_hugepages.h
+++ b/lib/librte_eal/common/eal_hugepages.h
@@ -26,9 +26,15 @@ struct hugepage_file {
 };
 
 /**
- * Read the information from linux on what hugepages are available
- * for the EAL to use
+ * Read the information on what hugepages are available for the EAL to use,
+ * clearing out any unused ones.
  */
 int eal_hugepage_info_init(void);
 
+/**
+ * Read whatever information primary process has shared about hugepages into
+ * secondary process.
+ */
+int eal_hugepage_info_read(void);
+
 #endif /* EAL_HUGEPAGES_H */
diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h
index 9d33cf4..c4cbf3a 100644
--- a/lib/librte_eal/common/eal_internal_cfg.h
+++ b/lib/librte_eal/common/eal_internal_cfg.h
@@ -21,7 +21,7 @@
  */
 struct hugepage_info {
 	uint64_t hugepage_sz;   /**< size of a huge page */
-	const char *hugedir;    /**< dir where hugetlbfs is mounted */
+	char hugedir[PATH_MAX];    /**< dir where hugetlbfs is mounted */
 	uint32_t num_pages[RTE_MAX_NUMA_NODES];
 	/**< number of hugepages of that size on each socket */
 	int lock_descriptor;    /**< file descriptor for hugepage dir */
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2c12811..e7c6dcf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -807,13 +807,17 @@ rte_eal_init(int argc, char **argv)
 			"KNI module inserted\n");
 	}
 
-	if (internal_config.no_hugetlbfs == 0 &&
-			internal_config.process_type != RTE_PROC_SECONDARY &&
-			eal_hugepage_info_init() < 0) {
-		rte_eal_init_alert("Cannot get hugepage information.");
-		rte_errno = EACCES;
-		rte_atomic32_clear(&run_once);
-		return -1;
+	if (internal_config.no_hugetlbfs == 0) {
+		/* rte_config isn't initialized yet */
+		ret = internal_config.process_type == RTE_PROC_PRIMARY ?
+				eal_hugepage_info_init() :
+				eal_hugepage_info_read();
+		if (ret < 0) {
+			rte_eal_init_alert("Cannot get hugepage information.");
+			rte_errno = EACCES;
+			rte_atomic32_clear(&run_once);
+			return -1;
+		}
 	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 2e0819f..fb4b667 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -14,6 +14,7 @@
 #include <stdarg.h>
 #include <unistd.h>
 #include <errno.h>
+#include <sys/mman.h>
 #include <sys/queue.h>
 #include <sys/stat.h>
 
@@ -33,6 +34,39 @@
 static const char sys_dir_path[] = "/sys/kernel/mm/hugepages";
 static const char sys_pages_numa_dir_path[] = "/sys/devices/system/node";
 
+/*
+ * Uses mmap to create a shared memory area for storage of data
+ * Used in this file to store the hugepage file map on disk
+ */
+static void *
+map_shared_memory(const char *filename, const size_t mem_size, int flags)
+{
+	void *retval;
+	int fd = open(filename, flags, 0666);
+	if (fd < 0)
+		return NULL;
+	if (ftruncate(fd, mem_size) < 0) {
+		close(fd);
+		return NULL;
+	}
+	retval = mmap(NULL, mem_size, PROT_READ | PROT_WRITE,
+			MAP_SHARED, fd, 0);
+	close(fd);
+	return retval;
+}
+
+static void *
+open_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR);
+}
+
+static void *
+create_shared_memory(const char *filename, const size_t mem_size)
+{
+	return map_shared_memory(filename, mem_size, O_RDWR | O_CREAT);
+}
+
 /* this function is only called from eal_hugepage_info_init which itself
  * is only called from a primary process */
 static uint32_t
@@ -299,15 +333,9 @@ compare_hpi(const void *a, const void *b)
 	return hpi_b->hugepage_sz - hpi_a->hugepage_sz;
 }
 
-/*
- * when we initialize the hugepage info, everything goes
- * to socket 0 by default. it will later get sorted by memory
- * initialization procedure.
- */
-int
-eal_hugepage_info_init(void)
-{
-	const char dirent_start_text[] = "hugepages-";
+static int
+hugepage_info_init(void)
+{
+	const char dirent_start_text[] = "hugepages-";
 	const size_t dirent_start_len = sizeof(dirent_start_text) - 1;
 	unsigned int i, total_pages, num_sizes = 0;
 	DIR *dir;
@@ -323,6 +351,7 @@ eal_hugepage_info_init(void)
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
+		const char *hugedir;
 
 		if (strncmp(dirent->d_name, dirent_start_text,
 			    dirent_start_len) != 0)
@@ -334,10 +363,10 @@ eal_hugepage_info_init(void)
 		hpi = &internal_config.hugepage_info[num_sizes];
 		hpi->hugepage_sz =
 			rte_str_to_size(&dirent->d_name[dirent_start_len]);
-		hpi->hugedir = get_hugepage_dir(hpi->hugepage_sz);
+		hugedir = get_hugepage_dir(hpi->hugepage_sz);
 
 		/* first, check if we have a mountpoint */
-		if (hpi->hugedir == NULL) {
+		if (hugedir == NULL) {
 			uint32_t num_pages;
 
 			num_pages = get_num_hugepages(dirent->d_name);
@@ -349,6 +378,7 @@ eal_hugepage_info_init(void)
 					num_pages, hpi->hugepage_sz);
 			continue;
 		}
+		snprintf(hpi->hugedir, sizeof(hpi->hugedir), "%s", hugedir);
 
 		/* try to obtain a writelock */
 		hpi->lock_descriptor = open(hpi->hugedir, O_RDONLY);
@@ -411,13 +441,11 @@ eal_hugepage_info_init(void)
 	for (i = 0; i < num_sizes; i++) {
 		/* pages may no longer all be on socket 0, so check all */
 		unsigned int j, num_pages = 0;
+		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
 
-		for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-			struct hugepage_info *hpi =
-					&internal_config.hugepage_info[i];
+		for (j = 0; j < RTE_MAX_NUMA_NODES; j++)
 			num_pages += hpi->num_pages[j];
-		}
-		if (internal_config.hugepage_info[i].hugedir != NULL &&
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0 &&
 				num_pages > 0)
 			return 0;
 	}
@@ -425,3 +453,64 @@ eal_hugepage_info_init(void)
 	/* no valid hugepage mounts available, return error */
 	return -1;
 }
+
+/*
+ * when we initialize the hugepage info, everything goes
+ * to socket 0 by default. it will later get sorted by memory
+ * initialization procedure.
+ */
+int
+eal_hugepage_info_init(void)
+{
+	struct hugepage_info *hpi, *tmp_hpi;
+	unsigned int i;
+
+	if (hugepage_info_init() < 0)
+		return -1;
+
+	hpi = &internal_config.hugepage_info[0];
+
+	tmp_hpi = create_shared_memory(eal_hugepage_info_path(),
+			sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to create shared memory!\n");
+		return -1;
+	}
+
+	memcpy(tmp_hpi, hpi, sizeof(internal_config.hugepage_info));
+
+	/* we've copied file descriptors along with everything else, but they
+	 * will be invalid in secondary process, so overwrite them
+	 */
+	for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
+		struct hugepage_info *tmp = &tmp_hpi[i];
+		tmp->lock_descriptor = -1;
+	}
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
+
+int
+eal_hugepage_info_read(void)
+{
+	struct hugepage_info *hpi = &internal_config.hugepage_info[0];
+	struct hugepage_info *tmp_hpi;
+
+	tmp_hpi = open_shared_memory(eal_hugepage_info_path(),
+				  sizeof(internal_config.hugepage_info));
+	if (tmp_hpi == NULL) {
+		RTE_LOG(ERR, EAL, "Failed to open shared memory!\n");
+		return -1;
+	}
+
+	memcpy(hpi, tmp_hpi, sizeof(internal_config.hugepage_info));
+
+	if (munmap(tmp_hpi, sizeof(internal_config.hugepage_info)) < 0) {
+		RTE_LOG(ERR, EAL, "Failed to unmap shared memory!\n");
+		return -1;
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index c68db32..d919247 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1060,7 +1060,7 @@ get_socket_mem_size(int socket)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++){
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL)
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0)
 			size += hpi->hugepage_sz * hpi->num_pages[socket];
 	}
 
@@ -1160,7 +1160,8 @@ calc_num_pages_per_socket(uint64_t * memory,
 	for (socket = 0; socket < RTE_MAX_NUMA_NODES && total_mem != 0; socket++) {
 		/* skips if the memory on specific socket wasn't requested */
 		for (i = 0; i < num_hp_info && memory[socket] != 0; i++){
-			hp_used[i].hugedir = hp_info[i].hugedir;
+			snprintf(hp_used[i].hugedir, sizeof(hp_used[i].hugedir),
+					"%s", hp_info[i].hugedir);
 			hp_used[i].num_pages[socket] = RTE_MIN(
 					memory[socket] / hp_info[i].hugepage_sz,
 					hp_info[i].num_pages[socket]);
@@ -1235,7 +1236,7 @@ eal_get_hugepage_mem_size(void)
 
 	for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
 		struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-		if (hpi->hugedir != NULL) {
+		if (strnlen(hpi->hugedir, sizeof(hpi->hugedir)) != 0) {
 			for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
 				size += hpi->hugepage_sz * hpi->num_pages[j];
 			}
@@ -1509,7 +1510,7 @@ eal_legacy_hugepage_init(void)
 	}
 
 	/* create shared memory */
-	hugepage = create_shared_memory(eal_hugepage_info_path(),
+	hugepage = create_shared_memory(eal_hugepage_file_path(),
 			nr_hugefiles * sizeof(struct hugepage_file));
 
 	if (hugepage == NULL) {
@@ -1694,16 +1695,16 @@ eal_legacy_hugepage_attach(void)
 
 	test_phys_addrs_available();
 
-	fd_hugepage = open(eal_hugepage_info_path(), O_RDONLY);
+	fd_hugepage = open(eal_hugepage_file_path(), O_RDONLY);
 	if (fd_hugepage < 0) {
-		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
 	size = getFileSize(fd_hugepage);
 	hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0);
 	if (hp == MAP_FAILED) {
-		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_info_path());
+		RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_file_path());
 		goto error;
 	}
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 59/70] eal: add secondary process init with memory hotplug
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (59 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 60/70] malloc: enable memory hotplug support Anatoly Burakov
                           ` (10 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Secondary process initialization will just sync the memory map with
the primary process.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memory.c |  1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c  | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index d519f15..fe5fdfc 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -20,6 +20,7 @@
 #include <rte_errno.h>
 #include <rte_log.h>
 
+#include "eal_memalloc.h"
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index d919247..eb430a0 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1777,6 +1777,18 @@ eal_legacy_hugepage_attach(void)
 	return -1;
 }
 
+static int
+eal_hugepage_attach(void)
+{
+	if (eal_memalloc_sync_with_primary()) {
+		RTE_LOG(ERR, EAL, "Could not map memory from primary process\n");
+		if (aslr_enabled() > 0)
+			RTE_LOG(ERR, EAL, "It is recommended to disable ASLR in the kernel and retry running both primary and secondary processes\n");
+		return -1;
+	}
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
@@ -1788,9 +1800,9 @@ rte_eal_hugepage_init(void)
 int
 rte_eal_hugepage_attach(void)
 {
-	if (internal_config.legacy_mem)
-		return eal_legacy_hugepage_attach();
-	return -1;
+	return internal_config.legacy_mem ?
+			eal_legacy_hugepage_attach() :
+			eal_hugepage_attach();
 }
 
 int
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 60/70] malloc: enable memory hotplug support
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (60 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 61/70] malloc: add support for multiprocess memory hotplug Anatoly Burakov
                           ` (9 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This set of changes enables rte_malloc to allocate and free memory
as needed. Currently, the feature is disabled because legacy mem
mode is enabled unconditionally.

The way it works is: first, malloc checks if there is enough memory
already allocated to satisfy the user's request. If there isn't, we
try to allocate more memory. The reverse happens with free - we free
an element, check its size (after merging adjacent free elements),
and see if it is bigger than the hugepage size and whether its start
and end span a hugepage or more. If so, we remove the area from the
malloc heap (adjusting element lengths where appropriate) and
deallocate the underlying pages.
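
As a rough illustration of that boundary check, here is a minimal,
self-contained sketch; can_return_pages() is a made-up name, and the
real logic lives in malloc_heap_free() in this patch (using the
RTE_PTR_ALIGN_* helpers):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Does freeing [start, start + len) give back at least one whole
     * page? Assumes page_sz is a power of two, as hugepage sizes are.
     */
    static bool
    can_return_pages(void *start, size_t len, size_t page_sz)
    {
        uintptr_t s = (uintptr_t)start;
        uintptr_t e = s + len;
        uintptr_t mask = ~((uintptr_t)page_sz - 1);
        /* round the start up and the end down to page boundaries */
        uintptr_t aligned_start = (s + page_sz - 1) & mask;
        uintptr_t aligned_end = e & mask;

        if (aligned_end <= aligned_start)
            return false;
        return (aligned_end - aligned_start) >= page_sz;
    }

For example, with 2MB pages, a 3MB element may cover one full page
or none at all depending on where it starts, so checking the
element's size alone is not enough.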

For legacy mode, runtime alloc/free of pages is disabled.

It is worth noting that memseg lists are sorted by page size, and
that we try our best to satisfy the user's request. That is, if the
user requests an element from 2MB-page memory, we will first check
if we can satisfy that request from existing memory; if not, we try
to allocate more 2MB pages. If that fails and the user also
specified a "size is hint" flag, we then check other page sizes and
try to allocate from there. If that fails too, then, depending on
flags, we may try allocating from other sockets. In other words, we
try our best to give the user what they asked for, but going to
other sockets is a last resort - first we try to allocate more
memory on the same socket.
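
A compact sketch of that ordering follows; try_existing() and
try_expand() are hypothetical stand-ins, and the real logic lives in
heap_alloc_on_socket() and alloc_mem_on_socket() in this patch:

    #include <stdbool.h>
    #include <stddef.h>

    /* placeholder stand-ins: "allocate from what the heap already
     * holds" and "ask for more pages of a given size" (0 == any size) */
    static void *try_existing(size_t sz, unsigned int pg_flags)
    { (void)sz; (void)pg_flags; return NULL; }
    static int try_expand(size_t sz, unsigned int pg_flags)
    { (void)sz; (void)pg_flags; return -1; }

    static void *
    alloc_on_socket_sketch(size_t sz, unsigned int pg_flags,
            bool size_is_hint)
    {
        void *p;

        /* 1. existing free space of the requested page size */
        p = try_existing(sz, pg_flags);
        if (p != NULL)
            return p;
        /* 2. grow the heap with more pages of the requested size */
        if (try_expand(sz, pg_flags) == 0)
            return try_existing(sz, pg_flags);
        /* 3. page size was only a hint - accept any page size */
        if (size_is_hint) {
            p = try_existing(sz, 0);
            if (p != NULL)
                return p;
            if (try_expand(sz, 0) == 0)
                return try_existing(sz, 0);
        }
        /* 4. other sockets are tried by the caller, as a last resort */
        return NULL;
    }

Trying other sockets stays one level up, mirroring the split between
heap_alloc_on_socket() and malloc_heap_alloc() below.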

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memzone.c |  26 +--
 lib/librte_eal/common/malloc_elem.c        |  92 ++++++++
 lib/librte_eal/common/malloc_elem.h        |   3 +
 lib/librte_eal/common/malloc_heap.c        | 347 ++++++++++++++++++++++++++++-
 lib/librte_eal/common/malloc_heap.h        |   4 +-
 lib/librte_eal/common/rte_malloc.c         |  31 +--
 6 files changed, 439 insertions(+), 64 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 12ddd42..bce3321 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -93,7 +93,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	struct rte_mem_config *mcfg;
 	struct rte_fbarray *arr;
 	size_t requested_len;
-	int socket, i, mz_idx;
+	int mz_idx;
 	bool contig;
 
 	/* get pointer to global configuration */
@@ -183,29 +183,9 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		}
 	}
 
-	if (socket_id == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_id;
-
 	/* allocate memory on heap */
-	void *mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[socket], NULL,
-			requested_len, flags, align, bound, contig);
-
-	if ((mz_addr == NULL) && (socket_id == SOCKET_ID_ANY)) {
-		/* try other heaps */
-		for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-			if (socket == i)
-				continue;
-
-			mz_addr = malloc_heap_alloc(&mcfg->malloc_heaps[i],
-					NULL, requested_len, flags, align,
-					bound, contig);
-			if (mz_addr != NULL)
-				break;
-		}
-	}
-
+	void *mz_addr = malloc_heap_alloc(NULL, requested_len, socket_id, flags,
+			align, bound, contig);
 	if (mz_addr == NULL) {
 		rte_errno = ENOMEM;
 		return NULL;
diff --git a/lib/librte_eal/common/malloc_elem.c b/lib/librte_eal/common/malloc_elem.c
index 9db416f..ee79dcd 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -447,6 +447,98 @@ malloc_elem_free(struct malloc_elem *elem)
 	return elem;
 }
 
+/* assume all checks were already done */
+void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len)
+{
+	struct malloc_elem *hide_start, *hide_end, *prev, *next;
+	size_t len_before, len_after;
+
+	hide_start = start;
+	hide_end = RTE_PTR_ADD(start, len);
+
+	prev = elem->prev;
+	next = elem->next;
+
+	/* we cannot do anything with non-adjacent elements */
+	if (next && next_elem_is_adjacent(elem)) {
+		len_after = RTE_PTR_DIFF(next, hide_end);
+		if (len_after >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split after */
+			split_elem(elem, hide_end);
+
+			malloc_elem_free_list_insert(hide_end);
+		} else if (len_after >= MALLOC_ELEM_HEADER_LEN) {
+			/* shrink current element */
+			elem->size -= len_after;
+			memset(hide_end, 0, sizeof(*hide_end));
+
+			/* copy next element's data to our pad */
+			memcpy(hide_end, next, sizeof(*hide_end));
+
+			/* pad next element */
+			next->state = ELEM_PAD;
+			next->pad = len_after;
+			next->size -= len_after;
+
+			/* next element busy, would've been merged otherwise */
+			hide_end->pad = len_after;
+			hide_end->size += len_after;
+
+			/* adjust pointers to point to our new pad */
+			if (next->next)
+				next->next->prev = hide_end;
+			elem->next = hide_end;
+		} else if (len_after > 0) {
+			RTE_LOG(ERR, EAL, "Unaligned element, heap is probably corrupt\n");
+			return;
+		}
+	}
+
+	/* we cannot do anything with non-adjacent elements */
+	if (prev && prev_elem_is_adjacent(elem)) {
+		len_before = RTE_PTR_DIFF(hide_start, elem);
+		if (len_before >= MALLOC_ELEM_OVERHEAD + MIN_DATA_SIZE) {
+			/* split before */
+			split_elem(elem, hide_start);
+
+			prev = elem;
+			elem = hide_start;
+
+			malloc_elem_free_list_insert(prev);
+		} else if (len_before > 0) {
+			/*
+			 * unlike with elements after current, here we don't
+			 * need to pad elements, but rather just increase the
+			 * size of previous element, copy the old header and set
+			 * up trailer.
+			 */
+			void *trailer = RTE_PTR_ADD(prev,
+					prev->size - MALLOC_ELEM_TRAILER_LEN);
+
+			memcpy(hide_start, elem, sizeof(*elem));
+			hide_start->size = len;
+
+			prev->size += len_before;
+			set_trailer(prev);
+
+			/* update pointers */
+			prev->next = hide_start;
+			if (next)
+				next->prev = hide_start;
+
+			/* erase old trailer */
+			memset(trailer, 0, MALLOC_ELEM_TRAILER_LEN);
+			/* erase old header */
+			memset(elem, 0, sizeof(*elem));
+
+			elem = hide_start;
+		}
+	}
+
+	remove_elem(elem);
+}
+
 /*
  * attempt to resize a malloc_elem by expanding into any free space
  * immediately after it in memory.
diff --git a/lib/librte_eal/common/malloc_elem.h b/lib/librte_eal/common/malloc_elem.h
index 620dd44..8f4aef8 100644
--- a/lib/librte_eal/common/malloc_elem.h
+++ b/lib/librte_eal/common/malloc_elem.h
@@ -154,6 +154,9 @@ int
 malloc_elem_resize(struct malloc_elem *elem, size_t size);
 
 void
+malloc_elem_hide_region(struct malloc_elem *elem, void *start, size_t len);
+
+void
 malloc_elem_free_list_remove(struct malloc_elem *elem);
 
 /*
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index d798675..5f8c643 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -20,8 +20,10 @@
 #include <rte_spinlock.h>
 #include <rte_memcpy.h>
 #include <rte_atomic.h>
+#include <rte_fbarray.h>
 
 #include "eal_internal_cfg.h"
+#include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
 
@@ -149,48 +151,371 @@ find_suitable_element(struct malloc_heap *heap, size_t size,
  * scan fails. Once the new memseg is added, it re-scans and should return
  * the new element after releasing the lock.
  */
-void *
-malloc_heap_alloc(struct malloc_heap *heap,
-		const char *type __attribute__((unused)), size_t size, unsigned flags,
-		size_t align, size_t bound, bool contig)
+static void *
+heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct malloc_elem *elem;
 
 	size = RTE_CACHE_LINE_ROUNDUP(size);
 	align = RTE_CACHE_LINE_ROUNDUP(align);
 
-	rte_spinlock_lock(&heap->lock);
-
 	elem = find_suitable_element(heap, size, flags, align, bound, contig);
 	if (elem != NULL) {
 		elem = malloc_elem_alloc(elem, size, align, bound, contig);
+
 		/* increase heap's count of allocated elements */
 		heap->alloc_count++;
 	}
-	rte_spinlock_unlock(&heap->lock);
 
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
+static int
+try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	size_t map_len;
+	struct rte_memseg_list *msl;
+	struct rte_memseg **ms;
+	struct malloc_elem *elem;
+	int n_segs, allocd_pages;
+	void *ret, *map_addr;
+
+	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
+	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
+			pg_sz);
+
+	n_segs = map_len / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+
+	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
+			socket, true);
+
+	/* make sure we've allocated our pages... */
+	if (allocd_pages < 0)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+	msl = rte_mem_virt2memseg_list(map_addr);
+
+	/* check if we wanted contiguous memory but didn't get it */
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
+				__func__);
+		goto free_pages;
+	}
+
+	/* add newly minted memsegs to malloc heap */
+	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+
+	/* try once more, as now we have allocated new memory */
+	ret = find_suitable_element(heap, elt_size, flags, align, bound,
+			contig);
+
+	if (ret == NULL)
+		goto free_elem;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
+		socket, map_len >> 20ULL);
+
+	free(ms);
+
+	return 0;
+
+free_elem:
+	malloc_elem_free_list_remove(elem);
+	malloc_elem_hide_region(elem, map_addr, map_len);
+	heap->total_size -= map_len;
+
+free_pages:
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+free_ms:
+	free(ms);
+
+	return -1;
+}
+
+static int
+compare_pagesz(const void *a, const void *b)
+{
+	const struct rte_memseg_list * const*mpa = a;
+	const struct rte_memseg_list * const*mpb = b;
+	const struct rte_memseg_list *msla = *mpa;
+	const struct rte_memseg_list *mslb = *mpb;
+	uint64_t pg_sz_a = msla->page_sz;
+	uint64_t pg_sz_b = mslb->page_sz;
+
+	if (pg_sz_a < pg_sz_b)
+		return -1;
+	if (pg_sz_a > pg_sz_b)
+		return 1;
+	return 0;
+}
+
+static int
+alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
+	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
+	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t other_pg_sz[RTE_MAX_MEMSEG_LISTS];
+	uint64_t prev_pg_sz;
+	int i, n_other_msls, n_other_pg_sz, n_requested_msls, n_requested_pg_sz;
+	bool size_hint = (flags & RTE_MEMZONE_SIZE_HINT_ONLY) > 0;
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	memset(requested_msls, 0, sizeof(requested_msls));
+	memset(other_msls, 0, sizeof(other_msls));
+	memset(requested_pg_sz, 0, sizeof(requested_pg_sz));
+	memset(other_pg_sz, 0, sizeof(other_pg_sz));
+
+	/*
+	 * go through memseg list and take note of all the page sizes available,
+	 * and if any of them were specifically requested by the user.
+	 */
+	n_requested_msls = 0;
+	n_other_msls = 0;
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->socket_id != socket)
+			continue;
+
+		if (msl->base_va == NULL)
+			continue;
+
+		/* if pages of specific size were requested */
+		if (size_flags != 0 && check_hugepage_sz(size_flags,
+				msl->page_sz))
+			requested_msls[n_requested_msls++] = msl;
+		else if (size_flags == 0 || size_hint)
+			other_msls[n_other_msls++] = msl;
+	}
+
+	/* sort the lists, smallest first */
+	qsort(requested_msls, n_requested_msls, sizeof(requested_msls[0]),
+			compare_pagesz);
+	qsort(other_msls, n_other_msls, sizeof(other_msls[0]),
+			compare_pagesz);
+
+	/* now, extract page sizes we are supposed to try */
+	prev_pg_sz = 0;
+	n_requested_pg_sz = 0;
+	for (i = 0; i < n_requested_msls; i++) {
+		uint64_t pg_sz = requested_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			requested_pg_sz[n_requested_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+	prev_pg_sz = 0;
+	n_other_pg_sz = 0;
+	for (i = 0; i < n_other_msls; i++) {
+		uint64_t pg_sz = other_msls[i]->page_sz;
+
+		if (prev_pg_sz != pg_sz) {
+			other_pg_sz[n_other_pg_sz++] = pg_sz;
+			prev_pg_sz = pg_sz;
+		}
+	}
+
+	/* finally, try allocating memory of specified page sizes, starting from
+	 * the smallest sizes
+	 */
+	for (i = 0; i < n_requested_pg_sz; i++) {
+		uint64_t pg_sz = requested_pg_sz[i];
+
+		/*
+		 * do not pass the size hint here, as user expects other page
+		 * sizes first, before resorting to best effort allocation.
+		 */
+		if (!try_expand_heap(heap, pg_sz, size, socket, size_flags,
+				align, bound, contig))
+			return 0;
+	}
+	if (n_other_pg_sz == 0)
+		return -1;
+
+	/* now, check if we can reserve anything with size hint */
+	ret = find_suitable_element(heap, size, flags, align, bound, contig);
+	if (ret != NULL)
+		return 0;
+
+	/*
+	 * we still couldn't reserve memory, so try expanding heap with other
+	 * page sizes, if there are any
+	 */
+	for (i = 0; i < n_other_pg_sz; i++) {
+		uint64_t pg_sz = other_pg_sz[i];
+
+		if (!try_expand_heap(heap, pg_sz, size, socket, flags,
+				align, bound, contig))
+			return 0;
+	}
+	return -1;
+}
+
+/* this will try lower page sizes first */
+static void *
+heap_alloc_on_socket(const char *type, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
+	unsigned int size_flags = flags & ~RTE_MEMZONE_SIZE_HINT_ONLY;
+	void *ret;
+
+	rte_spinlock_lock(&(heap->lock));
+
+	align = align == 0 ? 1 : align;
+
+	/* for legacy mode, try once and with all flags */
+	if (internal_config.legacy_mem) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+		goto alloc_unlock;
+	}
+
+	/*
+	 * we do not pass the size hint here, because even if allocation fails,
+	 * we may still be able to allocate memory from appropriate page sizes,
+	 * we just need to request more memory first.
+	 */
+	ret = heap_alloc(heap, type, size, size_flags, align, bound, contig);
+	if (ret != NULL)
+		goto alloc_unlock;
+
+	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
+
+		/* this should have succeeded */
+		if (ret == NULL)
+			RTE_LOG(ERR, EAL, "Error allocating from heap\n");
+	}
+alloc_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
+}
+
+void *
+malloc_heap_alloc(const char *type, size_t size, int socket_arg,
+		unsigned int flags, size_t align, size_t bound, bool contig)
+{
+	int socket, i, cur_socket;
+	void *ret;
+
+	/* return NULL if size is 0 or alignment is not power-of-2 */
+	if (size == 0 || (align && !rte_is_power_of_2(align)))
+		return NULL;
+
+	if (!rte_eal_has_hugepages())
+		socket_arg = SOCKET_ID_ANY;
+
+	if (socket_arg == SOCKET_ID_ANY)
+		socket = malloc_get_numa_socket();
+	else
+		socket = socket_arg;
+
+	/* Check socket parameter */
+	if (socket >= RTE_MAX_NUMA_NODES)
+		return NULL;
+
+	ret = heap_alloc_on_socket(type, size, socket, flags, align, bound,
+			contig);
+	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
+		return ret;
+
+	/* try other heaps */
+	for (i = 0; i < (int) rte_socket_count(); i++) {
+		cur_socket = rte_socket_id_by_idx(i);
+		if (cur_socket == socket)
+			continue;
+		ret = heap_alloc_on_socket(type, size, cur_socket, flags,
+				align, bound, contig);
+		if (ret != NULL)
+			return ret;
+	}
+	return NULL;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
 	struct malloc_heap *heap;
-	struct malloc_elem *ret;
+	void *start, *aligned_start, *end, *aligned_end;
+	size_t len, aligned_len, page_sz;
+	struct rte_memseg_list *msl;
+	int n_segs, seg_idx, max_seg_idx, ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
 
 	/* elem may be merged with previous element, so keep heap address */
 	heap = elem->heap;
+	msl = elem->msl;
+	page_sz = (size_t)msl->page_sz;
 
 	rte_spinlock_lock(&(heap->lock));
 
-	ret = malloc_elem_free(elem);
+	/* mark element as free */
+	elem->state = ELEM_FREE;
 
-	rte_spinlock_unlock(&(heap->lock));
+	elem = malloc_elem_free(elem);
+
+	/* anything after this is a bonus */
+	ret = 0;
+
+	/* ...of which we can't avail if we are in legacy mode */
+	if (internal_config.legacy_mem)
+		goto free_unlock;
+
+	/* check if we can free any memory back to the system */
+	if (elem->size < page_sz)
+		goto free_unlock;
 
-	return ret != NULL ? 0 : -1;
+	/* probably, but let's make sure, as we may not be using up full page */
+	start = elem;
+	len = elem->size;
+	aligned_start = RTE_PTR_ALIGN_CEIL(start, page_sz);
+	end = RTE_PTR_ADD(elem, len);
+	aligned_end = RTE_PTR_ALIGN_FLOOR(end, page_sz);
+
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+
+	/* can't free anything */
+	if (aligned_len < page_sz)
+		goto free_unlock;
+
+	malloc_elem_free_list_remove(elem);
+
+	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
+
+	/* we don't really care if we fail to deallocate memory */
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	heap->total_size -= aligned_len;
+
+	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
+		msl->socket_id, aligned_len >> 20ULL);
+free_unlock:
+	rte_spinlock_unlock(&(heap->lock));
+	return ret;
 }
 
 int
diff --git a/lib/librte_eal/common/malloc_heap.h b/lib/librte_eal/common/malloc_heap.h
index c57b59a..03b8014 100644
--- a/lib/librte_eal/common/malloc_heap.h
+++ b/lib/librte_eal/common/malloc_heap.h
@@ -26,8 +26,8 @@ malloc_get_numa_socket(void)
 }
 
 void *
-malloc_heap_alloc(struct malloc_heap *heap,	const char *type, size_t size,
-		unsigned int flags, size_t align, size_t bound, bool contig);
+malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int flags,
+		size_t align, size_t bound, bool contig);
 
 int
 malloc_heap_free(struct malloc_elem *elem);
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index c6d3e57..b51a6d1 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -40,10 +40,6 @@ void *
 rte_malloc_socket(const char *type, size_t size, unsigned int align,
 		int socket_arg)
 {
-	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int socket, i;
-	void *ret;
-
 	/* return NULL if size is 0 or alignment is not power-of-2 */
 	if (size == 0 || (align && !rte_is_power_of_2(align)))
 		return NULL;
@@ -51,33 +47,12 @@ rte_malloc_socket(const char *type, size_t size, unsigned int align,
 	if (!rte_eal_has_hugepages())
 		socket_arg = SOCKET_ID_ANY;
 
-	if (socket_arg == SOCKET_ID_ANY)
-		socket = malloc_get_numa_socket();
-	else
-		socket = socket_arg;
-
 	/* Check socket parameter */
-	if (socket >= RTE_MAX_NUMA_NODES)
+	if (socket_arg >= RTE_MAX_NUMA_NODES)
 		return NULL;
 
-	ret = malloc_heap_alloc(&mcfg->malloc_heaps[socket], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-	if (ret != NULL || socket_arg != SOCKET_ID_ANY)
-		return ret;
-
-	/* try other heaps */
-	for (i = 0; i < RTE_MAX_NUMA_NODES; i++) {
-		/* we already tried this one */
-		if (i == socket)
-			continue;
-
-		ret = malloc_heap_alloc(&mcfg->malloc_heaps[i], type,
-				size, 0, align == 0 ? 1 : align, 0, false);
-		if (ret != NULL)
-			return ret;
-	}
-
-	return NULL;
+	return malloc_heap_alloc(type, size, socket_arg, 0,
+			align == 0 ? 1 : align, 0, false);
 }
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 61/70] malloc: add support for multiprocess memory hotplug
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (61 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 60/70] malloc: enable memory hotplug support Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 62/70] malloc: add support for callbacks on memory events Anatoly Burakov
                           ` (8 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Bruce Richardson, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This enables multiprocess synchronization for memory hotplug
requests at runtime (as opposed to initialization).

The basic workflow is the following. The primary process always does
the initial mapping and unmapping, and secondary processes always
follow the primary's page map. Only one allocation request can be
active at any one time.

When the primary allocates memory, it ensures that all other
processes have allocated the same set of hugepages successfully;
otherwise, any allocations made are rolled back and the memory is
freed back. The heap is locked throughout the process, and there is
also a global memory hotplug lock, so no race conditions can happen.
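
As a rough, self-contained sketch of that commit-or-rollback
handshake (map_pages(), sync_secondaries() and unmap_pages() are
hypothetical stand-ins for the real primitives in this patch, such
as alloc_pages_on_heap(), request_sync() and rollback_expand_heap()):

    #include <stddef.h>

    /* hypothetical stand-ins for the real primitives */
    static int map_pages(void *a, size_t l) { (void)a; (void)l; return 0; }
    static int sync_secondaries(void) { return 0; }
    static void unmap_pages(void *a, size_t l) { (void)a; (void)l; }

    /* primary-side allocation: grow the heap accounting only once
     * every process has mapped the new pages, otherwise undo it all */
    static int
    primary_alloc_sketch(void *addr, size_t len, size_t *heap_size)
    {
        if (map_pages(addr, len) != 0)
            return -1;
        if (sync_secondaries() != 0) {
            /* a secondary failed to map: unmap locally and tell
             * everyone else to drop the pages as well */
            unmap_pages(addr, len);
            sync_secondaries();
            return -1;
        }
        *heap_size += len;
        return 0;
    }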

When the primary frees memory, it frees the affected region from the
heap, deallocates the affected pages, and notifies other processes
of the deallocation. Since the heap no longer contains that memory
chunk, the area basically becomes invisible to other processes even
if they happen to fail to unmap that specific set of pages, so it is
completely safe to ignore the results of sync requests.

When a secondary allocates memory, it does not do so by itself.
Instead, it sends a request to the primary process to try to
allocate pages of the specified size and on the specified socket,
such that the originating heap allocation request could complete.
The primary process then sends all secondaries (including the
requestor) a separate notification of the allocated pages, and
expects all secondary processes to report success before considering
the pages as "allocated".

Only after the primary process ensures that the memory has been
successfully allocated in all secondary processes does it respond
positively to the initial request and let the secondary proceed with
the allocation. Since the heap now has memory that can satisfy the
allocation request, and it was locked all this time (so no other
allocations could take place), the secondary process will be able to
allocate memory from the heap.

When a secondary frees memory, it hides the pages to be deallocated
from the heap. Then, it sends a deallocation request to the primary
process, so that the primary deallocates the pages itself and then
sends a separate sync request to all other processes (including the
requestor) to unmap the same pages. This way, even if the secondary
fails to notify other processes of this deallocation, that memory
becomes invisible to other processes and will not be allocated from
again.

So, to summarize: address space will only become part of the heap if
the primary process can ensure that all other processes have
allocated this memory successfully. If anything goes wrong, the
worst that can happen is that a page will "leak" and will not be
available to either DPDK or the system, as some process will still
hold onto it. It is not an actual leak, as we can account for the
page - it is just that none of the processes will be able to use
this page for anything useful until it is allocated again by the
primary.

Since the underlying DPDK IPC implementation is single-threaded,
some asynchronous magic had to be done, as we need to complete
several requests before we can definitively allow the secondary
process to use the allocated memory (namely, it has to be present in
all other secondary processes before it can be used). Additionally,
only one allocation request may be submitted at a time.

Memory allocation requests are only allowed when there are no
secondary processes currently initializing. To enforce that, a
shared rwlock is used: it is taken for reading on init (so that
several secondaries can initialize concurrently), and for writing
when making allocation requests (so that either secondary init has
to wait, or the allocation request has to wait until all processes
have initialized).

Any other function that wishes to iterate over memory or prevent
allocations should use the memory hotplug lock.
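
A minimal sketch of that locking discipline, assuming DPDK's
rte_rwlock API (the same one this patch uses for
mcfg->memory_hotplug_lock); the function names here are illustrative
only:

    #include <rte_rwlock.h>

    static rte_rwlock_t hotplug_lock = RTE_RWLOCK_INITIALIZER;

    /* readers: secondary init and memseg walks may run concurrently */
    static void
    walk_memsegs_sketch(void)
    {
        rte_rwlock_read_lock(&hotplug_lock);
        /* ... iterate over memseg lists safely ... */
        rte_rwlock_read_unlock(&hotplug_lock);
    }

    /* writer: an allocation or free request excludes init and walks */
    static void
    hotplug_request_sketch(void)
    {
        rte_rwlock_write_lock(&hotplug_lock);
        /* ... map or unmap pages and sync with other processes ... */
        rte_rwlock_write_unlock(&hotplug_lock);
    }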

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/bsdapp/eal/Makefile                |   1 +
 lib/librte_eal/common/eal_common_memory.c         |  67 +-
 lib/librte_eal/common/include/rte_eal_memconfig.h |   3 +
 lib/librte_eal/common/malloc_heap.c               | 255 ++++++--
 lib/librte_eal/common/malloc_mp.c                 | 744 ++++++++++++++++++++++
 lib/librte_eal/common/malloc_mp.h                 |  86 +++
 lib/librte_eal/common/meson.build                 |   1 +
 lib/librte_eal/linuxapp/eal/Makefile              |   1 +
 lib/librte_eal/linuxapp/eal/eal_memalloc.c        |  32 +-
 9 files changed, 1126 insertions(+), 64 deletions(-)
 create mode 100644 lib/librte_eal/common/malloc_mp.c
 create mode 100644 lib/librte_eal/common/malloc_mp.h

diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile
index 907e30d..250d5c1 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -59,6 +59,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fe5fdfc..22365c1 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -674,6 +674,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -698,15 +701,20 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg)
 			len = n_segs * msl->page_sz;
 
 			ret = func(msl, ms, len, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr,
 					ms_idx + n_segs);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -715,6 +723,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ms_idx, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 		const struct rte_memseg *ms;
@@ -729,14 +740,19 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg)
 		while (ms_idx >= 0) {
 			ms = rte_fbarray_get(arr, ms_idx);
 			ret = func(msl, ms, arg);
-			if (ret < 0)
-				return -1;
-			else if (ret > 0)
-				return 1;
+			if (ret < 0) {
+				ret = -1;
+				goto out;
+			} else if (ret > 0) {
+				ret = 1;
+				goto out;
+			}
 			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
 		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 int __rte_experimental
@@ -745,6 +761,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	int i, ret = 0;
 
+	/* do not allow allocations/frees/init while we iterate */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
 		struct rte_memseg_list *msl = &mcfg->memsegs[i];
 
@@ -752,12 +771,18 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg)
 			continue;
 
 		ret = func(msl, arg);
-		if (ret < 0)
-			return -1;
-		if (ret > 0)
-			return 1;
+		if (ret < 0) {
+			ret = -1;
+			goto out;
+		}
+		if (ret > 0) {
+			ret = 1;
+			goto out;
+		}
 	}
-	return 0;
+out:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
 }
 
 /* init memory subsystem */
@@ -771,6 +796,9 @@ rte_eal_memory_init(void)
 	if (!mcfg)
 		return -1;
 
+	/* lock mem hotplug here, to prevent races while we init */
+	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 #ifndef RTE_ARCH_64
 			memseg_primary_init_32() :
@@ -780,16 +808,19 @@ rte_eal_memory_init(void)
 			memseg_secondary_init();
 
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
 			rte_eal_hugepage_init() :
 			rte_eal_hugepage_attach();
 	if (retval < 0)
-		return -1;
+		goto fail;
 
 	if (internal_config.no_shconf == 0 && rte_eal_memdevice_init() < 0)
-		return -1;
+		goto fail;
 
 	return 0;
+fail:
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+	return -1;
 }
diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index a781793..aff0688 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -59,6 +59,9 @@ struct rte_mem_config {
 	rte_rwlock_t qlock;   /**< used for tailq operation for thread safe. */
 	rte_rwlock_t mplock;  /**< only used by mempool LIB for thread-safe. */
 
+	rte_rwlock_t memory_hotplug_lock;
+	/**< indicates whether memory hotplug request is in progress. */
+
 	/* memory segments and zones */
 	struct rte_fbarray memzones; /**< Memzone descriptors. */
 
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 5f8c643..be39250 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -10,6 +10,7 @@
 #include <sys/queue.h>
 
 #include <rte_memory.h>
+#include <rte_errno.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
 #include <rte_launch.h>
@@ -26,6 +27,7 @@
 #include "eal_memalloc.h"
 #include "malloc_elem.h"
 #include "malloc_heap.h"
+#include "malloc_mp.h"
 
 static unsigned
 check_hugepage_sz(unsigned flags, uint64_t hugepage_sz)
@@ -81,8 +83,6 @@ malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
 
 	malloc_elem_free_list_insert(elem);
 
-	heap->total_size += len;
-
 	return elem;
 }
 
@@ -171,68 +171,118 @@ heap_alloc(struct malloc_heap *heap, const char *type __rte_unused, size_t size,
 	return elem == NULL ? NULL : (void *)(&elem[1]);
 }
 
-static int
-try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
+/* this function is exposed in malloc_mp.h */
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len)
+{
+	if (elem != NULL) {
+		malloc_elem_free_list_remove(elem);
+		malloc_elem_hide_region(elem, map_addr, map_len);
+	}
+
+	eal_memalloc_free_seg_bulk(ms, n_segs);
+}
+
+/* this function is exposed in malloc_mp.h */
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		int socket, unsigned int flags, size_t align, size_t bound,
-		bool contig)
+		bool contig, struct rte_memseg **ms, int n_segs)
 {
-	size_t map_len;
 	struct rte_memseg_list *msl;
-	struct rte_memseg **ms;
-	struct malloc_elem *elem;
-	int n_segs, allocd_pages;
+	struct malloc_elem *elem = NULL;
+	size_t alloc_sz;
+	int allocd_pages;
 	void *ret, *map_addr;
 
-	align = RTE_MAX(align, MALLOC_ELEM_HEADER_LEN);
-	map_len = RTE_ALIGN_CEIL(align + elt_size + MALLOC_ELEM_TRAILER_LEN,
-			pg_sz);
-
-	n_segs = map_len / pg_sz;
-
-	/* we can't know in advance how many pages we'll need, so malloc */
-	ms = malloc(sizeof(*ms) * n_segs);
-
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
 	/* make sure we've allocated our pages... */
 	if (allocd_pages < 0)
-		goto free_ms;
+		return NULL;
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
+	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
-	if (contig && !eal_memalloc_is_contig(msl, map_addr, map_len)) {
+	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
 		RTE_LOG(DEBUG, EAL, "%s(): couldn't allocate physically contiguous space\n",
 				__func__);
-		goto free_pages;
+		goto fail;
 	}
 
 	/* add newly minted memsegs to malloc heap */
-	elem = malloc_heap_add_memory(heap, msl, map_addr, map_len);
+	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
 
 	/* try once more, as now we have allocated new memory */
 	ret = find_suitable_element(heap, elt_size, flags, align, bound,
 			contig);
 
 	if (ret == NULL)
+		goto fail;
+
+	return elem;
+
+fail:
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
+	return NULL;
+}
+
+static int
+try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	void *map_addr;
+	size_t alloc_sz;
+	int n_segs;
+
+	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
+			MALLOC_ELEM_TRAILER_LEN, pg_sz);
+	n_segs = alloc_sz / pg_sz;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+	if (ms == NULL)
+		return -1;
+
+	/* zero the array only after checking that the allocation succeeded */
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, pg_sz, elt_size, socket, flags, align,
+			bound, contig, ms, n_segs);
+
+	if (elem == NULL)
+		goto free_ms;
+
+	map_addr = ms[0]->addr;
+
+	/* notify other processes that this has happened */
+	if (request_sync()) {
+		/* we couldn't ensure all processes have mapped memory,
+		 * so free it back and notify everyone that it's been
+		 * freed back.
+		 */
 		goto free_elem;
+	}
+	heap->total_size += alloc_sz;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was expanded by %zdMB\n",
-		socket, map_len >> 20ULL);
+		socket, alloc_sz >> 20ULL);
 
 	free(ms);
 
 	return 0;
 
 free_elem:
-	malloc_elem_free_list_remove(elem);
-	malloc_elem_hide_region(elem, map_addr, map_len);
-	heap->total_size -= map_len;
+	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
-free_pages:
-	eal_memalloc_free_seg_bulk(ms, n_segs);
+	request_sync();
 free_ms:
 	free(ms);
 
@@ -240,6 +290,59 @@ try_expand_heap(struct malloc_heap *heap, size_t pg_sz, size_t elt_size,
 }
 
 static int
+try_expand_heap_secondary(struct malloc_heap *heap, uint64_t pg_sz,
+		size_t elt_size, int socket, unsigned int flags, size_t align,
+		size_t bound, bool contig)
+{
+	struct malloc_mp_req req;
+	int req_result;
+
+	memset(&req, 0, sizeof(req));
+
+	req.t = REQ_TYPE_ALLOC;
+	req.alloc_req.align = align;
+	req.alloc_req.bound = bound;
+	req.alloc_req.contig = contig;
+	req.alloc_req.flags = flags;
+	req.alloc_req.elt_size = elt_size;
+	req.alloc_req.page_sz = pg_sz;
+	req.alloc_req.socket = socket;
+	req.alloc_req.heap = heap; /* it's in shared memory */
+
+	req_result = request_to_primary(&req);
+
+	if (req_result != 0)
+		return -1;
+
+	if (req.result != REQ_RESULT_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
+static int
+try_expand_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int ret;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		ret = try_expand_heap_primary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	} else {
+		ret = try_expand_heap_secondary(heap, pg_sz, elt_size, socket,
+				flags, align, bound, contig);
+	}
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+	return ret;
+}
+
+static int
 compare_pagesz(const void *a, const void *b)
 {
 	const struct rte_memseg_list * const*mpa = a;
@@ -257,11 +360,10 @@ compare_pagesz(const void *a, const void *b)
 }
 
 static int
-alloc_mem_on_socket(size_t size, int socket, unsigned int flags, size_t align,
-		size_t bound, bool contig)
+alloc_more_mem_on_socket(struct malloc_heap *heap, size_t size, int socket,
+		unsigned int flags, size_t align, size_t bound, bool contig)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	struct malloc_heap *heap = &mcfg->malloc_heaps[socket];
 	struct rte_memseg_list *requested_msls[RTE_MAX_MEMSEG_LISTS];
 	struct rte_memseg_list *other_msls[RTE_MAX_MEMSEG_LISTS];
 	uint64_t requested_pg_sz[RTE_MAX_MEMSEG_LISTS];
@@ -393,7 +495,8 @@ heap_alloc_on_socket(const char *type, size_t size, int socket,
 	if (ret != NULL)
 		goto alloc_unlock;
 
-	if (!alloc_mem_on_socket(size, socket, flags, align, bound, contig)) {
+	if (!alloc_more_mem_on_socket(heap, size, socket, flags, align, bound,
+			contig)) {
 		ret = heap_alloc(heap, type, size, flags, align, bound, contig);
 
 		/* this should have succeeded */
@@ -446,14 +549,41 @@ malloc_heap_alloc(const char *type, size_t size, int socket_arg,
 	return NULL;
 }
 
+/* this function is exposed in malloc_mp.h */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len)
+{
+	int n_segs, seg_idx, max_seg_idx;
+	struct rte_memseg_list *msl;
+	size_t page_sz;
+
+	msl = rte_mem_virt2memseg_list(aligned_start);
+	if (msl == NULL)
+		return -1;
+
+	page_sz = (size_t)msl->page_sz;
+	n_segs = aligned_len / page_sz;
+	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
+	max_seg_idx = seg_idx + n_segs;
+
+	for (; seg_idx < max_seg_idx; seg_idx++) {
+		struct rte_memseg *ms;
+
+		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
+		eal_memalloc_free_seg(ms);
+	}
+	return 0;
+}
+
 int
 malloc_heap_free(struct malloc_elem *elem)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct malloc_heap *heap;
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
-	int n_segs, seg_idx, max_seg_idx, ret;
+	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
 		return -1;
@@ -494,25 +624,56 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we allow secondary processes to clear the heap of this allocated
+	 * memory because it is safe to do so, as even if notifications about
+	 * unmapped pages don't make it to other processes, heap is shared
+	 * across all processes, and will become empty of this memory anyway,
+	 * and nothing can allocate it back unless primary process will be able
+	 * to deliver allocation message to every single running process.
+	 */
+
 	malloc_elem_free_list_remove(elem);
 
 	malloc_elem_hide_region(elem, (void *) aligned_start, aligned_len);
 
-	/* we don't really care if we fail to deallocate memory */
-	n_segs = aligned_len / page_sz;
-	seg_idx = RTE_PTR_DIFF(aligned_start, msl->base_va) / page_sz;
-	max_seg_idx = seg_idx + n_segs;
+	heap->total_size -= aligned_len;
 
-	for (; seg_idx < max_seg_idx; seg_idx++) {
-		struct rte_memseg *ms;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* don't care if any of this fails */
+		malloc_heap_free_pages(aligned_start, aligned_len);
 
-		ms = rte_fbarray_get(&msl->memseg_arr, seg_idx);
-		eal_memalloc_free_seg(ms);
+		request_sync();
+	} else {
+		struct malloc_mp_req req;
+
+		memset(&req, 0, sizeof(req));
+
+		req.t = REQ_TYPE_FREE;
+		req.free_req.addr = aligned_start;
+		req.free_req.len = aligned_len;
+
+		/*
+		 * we request primary to deallocate pages, but we don't do it
+		 * in this thread. instead, we notify primary that we would like
+		 * to deallocate pages, and this process will receive another
+		 * request (in parallel) that will do it for us on another
+		 * thread.
+		 *
+		 * we also don't really care if this succeeds - the data is
+		 * already removed from the heap, so it is, for all intents and
+		 * purposes, hidden from the rest of DPDK even if some other
+		 * process (including this one) may have these pages mapped.
+		 */
+		request_to_primary(&req);
 	}
-	heap->total_size -= aligned_len;
 
 	RTE_LOG(DEBUG, EAL, "Heap on socket %d was shrunk by %zdMB\n",
 		msl->socket_id, aligned_len >> 20ULL);
+
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
 free_unlock:
 	rte_spinlock_unlock(&(heap->lock));
 	return ret;
@@ -600,8 +761,16 @@ rte_eal_malloc_heap_init(void)
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 
-	if (mcfg == NULL)
+	if (register_mp_requests()) {
+		RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess actions\n");
 		return -1;
+	}
+
+	/* unlock mem hotplug here. it's safe for primary as no requests can
+	 * even come before primary itself is fully initialized, and secondaries
+	 * do not need to initialize the heap.
+	 */
+	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 
 	/* secondary process does not need to initialize anything */
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
new file mode 100644
index 0000000..72b1f4c
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -0,0 +1,744 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include <string.h>
+#include <sys/time.h>
+
+#include <rte_alarm.h>
+#include <rte_errno.h>
+
+#include "eal_memalloc.h"
+
+#include "malloc_elem.h"
+#include "malloc_mp.h"
+
+#define MP_ACTION_SYNC "mp_malloc_sync"
+/**< request sent by primary process to notify of changes in memory map */
+#define MP_ACTION_ROLLBACK "mp_malloc_rollback"
+/**< request sent by primary process to notify of changes in memory map. this is
+ * essentially a regular sync request, but we cannot send sync requests while
+ * another one is in progress, and we might have to - therefore, we do this as
+ * a separate callback.
+ */
+#define MP_ACTION_REQUEST "mp_malloc_request"
+/**< request sent by secondary process to ask for allocation/deallocation */
+#define MP_ACTION_RESPONSE "mp_malloc_response"
+/**< response sent to secondary process to indicate result of request */
+
+/* forward declarations */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply);
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+/* when we're allocating, we need to store some state to ensure that we can
+ * roll back later
+ */
+struct primary_alloc_req_state {
+	struct malloc_heap *heap;
+	struct rte_memseg **ms;
+	int ms_len;
+	struct malloc_elem *elem;
+	void *map_addr;
+	size_t map_len;
+};
+
+enum req_state {
+	REQ_STATE_INACTIVE = 0,
+	REQ_STATE_ACTIVE,
+	REQ_STATE_COMPLETE
+};
+
+struct mp_request {
+	TAILQ_ENTRY(mp_request) next;
+	struct malloc_mp_req user_req; /**< contents of request */
+	pthread_cond_t cond; /**< variable we use to time out on this request */
+	enum req_state state; /**< indicate status of this request */
+	struct primary_alloc_req_state alloc_state;
+};
+
+/*
+ * We could've used just a single request, but it may be possible for
+ * secondaries to timeout earlier than the primary, and send a new request while
+ * primary is still expecting replies to the old one. Therefore, each new
+ * request will get assigned a new ID, which is how we will distinguish between
+ * expected and unexpected messages.
+ */
+TAILQ_HEAD(mp_request_list, mp_request);
+static struct {
+	struct mp_request_list list;
+	pthread_mutex_t lock;
+} mp_request_list = {
+	.list = TAILQ_HEAD_INITIALIZER(mp_request_list.list),
+	.lock = PTHREAD_MUTEX_INITIALIZER
+};
+
+/**
+ * General workflow is the following:
+ *
+ * Allocation:
+ * S: send request to primary
+ * P: attempt to allocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    if success, sendmsg success
+ *    if failure, roll back allocation and send a rollback request
+ * S: if received msg of success, quit
+ *    if received rollback request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * Aside from timeouts, there are three points where we can quit:
+ *  - if allocation failed straight away
+ *  - if allocation and sync request succeeded
+ *  - if allocation succeeded, sync request failed, allocation rolled back and
+ *    rollback request received (irrespective of whether it succeeded or failed)
+ *
+ * Deallocation:
+ * S: send request to primary
+ * P: attempt to deallocate memory
+ *    if failed, sendmsg failure
+ *    if success, send sync request
+ * S: if received msg of failure, quit
+ *    if received sync request, synchronize memory map and reply with result
+ * P: if received sync request result
+ *    sendmsg sync request result
+ * S: if received msg, quit
+ *
+ * There is no "rollback" from deallocation, as it's safe to have some memory
+ * mapped in some processes - it's absent from the heap, so it won't get used.
+ */
+
+static struct mp_request *
+find_request_by_id(uint64_t id)
+{
+	struct mp_request *req;
+	TAILQ_FOREACH(req, &mp_request_list.list, next) {
+		if (req->user_req.id == id)
+			break;
+	}
+	return req;
+}
+
+/* this ID is, like, totally guaranteed to be absolutely unique. pinky swear. */
+static uint64_t
+get_unique_id(void)
+{
+	uint64_t id;
+	do {
+		id = rte_rand();
+	} while (find_request_by_id(id) != NULL);
+	return id;
+}
+
+/* secondary will respond to sync requests thusly */
+static int
+handle_sync(const struct rte_mp_msg *msg, const void *peer)
+{
+	struct rte_mp_msg reply;
+	const struct malloc_mp_req *req =
+			(const struct malloc_mp_req *)msg->param;
+	struct malloc_mp_req *resp =
+			(struct malloc_mp_req *)reply.param;
+	int ret;
+
+	if (req->t != REQ_TYPE_SYNC) {
+		RTE_LOG(ERR, EAL, "Unexpected request from primary\n");
+		return -1;
+	}
+
+	memset(&reply, 0, sizeof(reply));
+
+	reply.num_fds = 0;
+	snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+	reply.len_param = sizeof(*resp);
+
+	ret = eal_memalloc_sync_with_primary();
+
+	resp->t = REQ_TYPE_SYNC;
+	resp->id = req->id;
+	resp->result = ret == 0 ? REQ_RESULT_SUCCESS : REQ_RESULT_FAIL;
+
+	rte_mp_reply(&reply, peer);
+
+	return 0;
+}
+
+static int
+handle_alloc_request(const struct malloc_mp_req *m,
+		struct mp_request *req)
+{
+	const struct malloc_req_alloc *ar = &m->alloc_req;
+	struct malloc_heap *heap;
+	struct malloc_elem *elem;
+	struct rte_memseg **ms;
+	size_t alloc_sz;
+	int n_segs;
+	void *map_addr;
+
+	alloc_sz = RTE_ALIGN_CEIL(ar->align + ar->elt_size +
+			MALLOC_ELEM_TRAILER_LEN, ar->page_sz);
+	n_segs = alloc_sz / ar->page_sz;
+
+	heap = ar->heap;
+
+	/* we can't know in advance how many pages we'll need, so we malloc */
+	ms = malloc(sizeof(*ms) * n_segs);
+	if (ms == NULL) {
+		RTE_LOG(ERR, EAL, "Couldn't allocate memory for request state\n");
+		goto fail;
+	}
+
+	/* zero the array only after checking that the allocation succeeded */
+	memset(ms, 0, sizeof(*ms) * n_segs);
+
+	elem = alloc_pages_on_heap(heap, ar->page_sz, ar->elt_size, ar->socket,
+			ar->flags, ar->align, ar->bound, ar->contig, ms,
+			n_segs);
+
+	if (elem == NULL)
+		goto fail;
+
+	map_addr = ms[0]->addr;
+
+	/* we have succeeded in allocating memory, but we still need to sync
+	 * with other processes. however, since DPDK IPC is single-threaded, we
+	 * send an asynchronous request and exit this callback.
+	 */
+
+	req->alloc_state.ms = ms;
+	req->alloc_state.ms_len = n_segs;
+	req->alloc_state.map_addr = map_addr;
+	req->alloc_state.map_len = alloc_sz;
+	req->alloc_state.elem = elem;
+	req->alloc_state.heap = heap;
+
+	return 0;
+fail:
+	free(ms);
+	return -1;
+}
+
+/* first stage of primary handling requests from secondary */
+static int
+handle_request(const struct rte_mp_msg *msg, const void *peer __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+	int ret;
+
+	/* lock access to request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	/* make sure it's not a dupe */
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		RTE_LOG(ERR, EAL, "Duplicate request id\n");
+		goto fail;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Unable to allocate memory for request\n");
+		goto fail;
+	}
+
+	/* erase all data */
+	memset(entry, 0, sizeof(*entry));
+
+	if (m->t == REQ_TYPE_ALLOC) {
+		ret = handle_alloc_request(m, entry);
+	} else if (m->t == REQ_TYPE_FREE) {
+		ret = malloc_heap_free_pages(m->free_req.addr,
+				m->free_req.len);
+	} else {
+		RTE_LOG(ERR, EAL, "Unexpected request from secondary\n");
+		goto fail;
+	}
+
+	if (ret != 0) {
+		struct rte_mp_msg resp_msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)resp_msg.param;
+
+		/* send failure message straight away */
+		resp_msg.num_fds = 0;
+		resp_msg.len_param = sizeof(*resp);
+		snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
+				MP_ACTION_RESPONSE);
+
+		resp->t = m->t;
+		resp->result = REQ_RESULT_FAIL;
+		resp->id = m->id;
+
+		if (rte_mp_sendmsg(&resp_msg)) {
+			RTE_LOG(ERR, EAL, "Couldn't send response\n");
+			goto fail;
+		}
+		/* we did not modify the request */
+		free(entry);
+	} else {
+		struct rte_mp_msg sr_msg;
+		struct malloc_mp_req *sr =
+				(struct malloc_mp_req *)sr_msg.param;
+		struct timespec ts;
+
+		memset(&sr_msg, 0, sizeof(sr_msg));
+
+		/* we can do something, so send sync request asynchronously */
+		sr_msg.num_fds = 0;
+		sr_msg.len_param = sizeof(*sr);
+		snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
+				MP_ACTION_SYNC);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		sr->t = REQ_TYPE_SYNC;
+		sr->id = m->id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&sr_msg, &ts,
+					handle_sync_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Couldn't send sync request\n");
+			if (m->t == REQ_TYPE_ALLOC)
+				free(entry->alloc_state.ms);
+			goto fail;
+		}
+
+		/* mark request as in progress */
+		memcpy(&entry->user_req, m, sizeof(*m));
+		entry->state = REQ_STATE_ACTIVE;
+
+		TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+	}
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+/* callback for asynchronous sync requests for primary. this will either do a
+ * sendmsg with results, or trigger rollback request.
+ */
+static int
+handle_sync_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply)
+{
+	enum malloc_req_result result;
+	struct mp_request *entry;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	int i;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	result = REQ_RESULT_SUCCESS;
+
+	if (reply->nb_received != reply->nb_sent)
+		result = REQ_RESULT_FAIL;
+
+	for (i = 0; i < reply->nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply->msgs[i].param;
+
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response to sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->id != entry->user_req.id) {
+			RTE_LOG(ERR, EAL, "Response to wrong sync request\n");
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+		if (resp->result == REQ_RESULT_FAIL) {
+			result = REQ_RESULT_FAIL;
+			break;
+		}
+	}
+
+	if (entry->user_req.t == REQ_TYPE_FREE) {
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		/* this is a free request, just sendmsg result */
+		resp->t = REQ_TYPE_FREE;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_SUCCESS) {
+		struct malloc_heap *heap = entry->alloc_state.heap;
+		struct rte_mp_msg msg;
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)msg.param;
+
+		memset(&msg, 0, sizeof(msg));
+
+		heap->total_size += entry->alloc_state.map_len;
+
+		/* result is success, so just notify secondary about this */
+		resp->t = REQ_TYPE_ALLOC;
+		resp->result = result;
+		resp->id = entry->user_req.id;
+		msg.num_fds = 0;
+		msg.len_param = sizeof(*resp);
+		snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+		if (rte_mp_sendmsg(&msg))
+			RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+		TAILQ_REMOVE(&mp_request_list.list, entry, next);
+		free(entry->alloc_state.ms);
+		free(entry);
+	} else if (entry->user_req.t == REQ_TYPE_ALLOC &&
+			result == REQ_RESULT_FAIL) {
+		struct rte_mp_msg rb_msg;
+		struct malloc_mp_req *rb =
+				(struct malloc_mp_req *)rb_msg.param;
+		struct timespec ts;
+		struct primary_alloc_req_state *state =
+				&entry->alloc_state;
+		int ret;
+
+		memset(&rb_msg, 0, sizeof(rb_msg));
+
+		/* we've failed to sync, so do a rollback */
+		rollback_expand_heap(state->ms, state->ms_len, state->elem,
+				state->map_addr, state->map_len);
+
+		/* send rollback request */
+		rb_msg.num_fds = 0;
+		rb_msg.len_param = sizeof(*rb);
+		snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
+				MP_ACTION_ROLLBACK);
+
+		ts.tv_nsec = 0;
+		ts.tv_sec = MP_TIMEOUT_S;
+
+		/* sync requests carry no data */
+		rb->t = REQ_TYPE_SYNC;
+		rb->id = entry->user_req.id;
+
+		/* there may be stray timeout still waiting */
+		do {
+			ret = rte_mp_request_async(&rb_msg, &ts,
+					handle_rollback_response);
+		} while (ret != 0 && rte_errno == EEXIST);
+		if (ret != 0) {
+			RTE_LOG(ERR, EAL, "Could not send rollback request to secondary process\n");
+
+			/* we couldn't send rollback request, but that's OK -
+			 * secondary will time out, and memory has been removed
+			 * from heap anyway.
+			 */
+			TAILQ_REMOVE(&mp_request_list.list, entry, next);
+			free(state->ms);
+			free(entry);
+			goto fail;
+		}
+	} else {
+		RTE_LOG(ERR, EAL, "Response to sync request of unknown type\n");
+		goto fail;
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+static int
+handle_rollback_response(const struct rte_mp_msg *request,
+		const struct rte_mp_reply *reply __rte_unused)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *resp = (struct malloc_mp_req *)msg.param;
+	const struct malloc_mp_req *mpreq =
+			(const struct malloc_mp_req *)request->param;
+	struct mp_request *entry;
+
+	/* lock the request */
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	memset(&msg, 0, sizeof(msg));
+
+	entry = find_request_by_id(mpreq->id);
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Wrong request ID\n");
+		goto fail;
+	}
+
+	if (entry->user_req.t != REQ_TYPE_ALLOC) {
+		RTE_LOG(ERR, EAL, "Unexpected active request\n");
+		goto fail;
+	}
+
+	/* we don't care if rollback succeeded, request still failed */
+	resp->t = REQ_TYPE_ALLOC;
+	resp->result = REQ_RESULT_FAIL;
+	resp->id = mpreq->id;
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*resp);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+
+	if (rte_mp_sendmsg(&msg))
+		RTE_LOG(ERR, EAL, "Could not send message to secondary process\n");
+
+	/* clean up */
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry->alloc_state.ms);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return 0;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return -1;
+}
+
+/* final stage of the request from secondary */
+static int
+handle_response(const struct rte_mp_msg *msg, const void *peer  __rte_unused)
+{
+	const struct malloc_mp_req *m =
+			(const struct malloc_mp_req *)msg->param;
+	struct mp_request *entry;
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = find_request_by_id(m->id);
+	if (entry != NULL) {
+		/* update request status */
+		entry->user_req.result = m->result;
+
+		entry->state = REQ_STATE_COMPLETE;
+
+		/* trigger thread wakeup */
+		pthread_cond_signal(&entry->cond);
+	}
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+
+	return 0;
+}
+
+/* synchronously request memory map sync, this is only called whenever primary
+ * process initiates the allocation.
+ */
+int
+request_sync(void)
+{
+	struct rte_mp_msg msg;
+	struct rte_mp_reply reply;
+	struct malloc_mp_req *req = (struct malloc_mp_req *)msg.param;
+	struct timespec ts;
+	int i, ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&reply, 0, sizeof(reply));
+
+	/* no need to create tailq entries as this is entirely synchronous */
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+
+	/* sync request carries no data */
+	req->t = REQ_TYPE_SYNC;
+	req->id = get_unique_id();
+
+	ts.tv_nsec = 0;
+	ts.tv_sec = MP_TIMEOUT_S;
+
+	/* there may be stray timeout still waiting */
+	do {
+		ret = rte_mp_request_sync(&msg, &reply, &ts);
+	} while (ret != 0 && rte_errno == EEXIST);
+	if (ret != 0) {
+		RTE_LOG(ERR, EAL, "Could not send sync request to secondary process\n");
+		ret = -1;
+		goto out;
+	}
+
+	if (reply.nb_received != reply.nb_sent) {
+		RTE_LOG(ERR, EAL, "Not all secondaries have responded\n");
+		ret = -1;
+		goto out;
+	}
+
+	for (i = 0; i < reply.nb_received; i++) {
+		struct malloc_mp_req *resp =
+				(struct malloc_mp_req *)reply.msgs[i].param;
+		if (resp->t != REQ_TYPE_SYNC) {
+			RTE_LOG(ERR, EAL, "Unexpected response from secondary\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->id != req->id) {
+			RTE_LOG(ERR, EAL, "Wrong request ID\n");
+			ret = -1;
+			goto out;
+		}
+		if (resp->result != REQ_RESULT_SUCCESS) {
+			RTE_LOG(ERR, EAL, "Secondary process failed to synchronize\n");
+			ret = -1;
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	free(reply.msgs);
+	return ret;
+}
+
+/* this is a synchronous wrapper around a bunch of asynchronous requests to
+ * primary process. this will initiate a request and wait until responses come.
+ */
+int
+request_to_primary(struct malloc_mp_req *user_req)
+{
+	struct rte_mp_msg msg;
+	struct malloc_mp_req *msg_req = (struct malloc_mp_req *)msg.param;
+	struct mp_request *entry;
+	struct timespec ts;
+	struct timeval now;
+	int ret;
+
+	memset(&msg, 0, sizeof(msg));
+	memset(&ts, 0, sizeof(ts));
+
+	pthread_mutex_lock(&mp_request_list.lock);
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot allocate memory for request\n");
+		goto fail;
+	}
+
+	memset(entry, 0, sizeof(*entry));
+
+	if (gettimeofday(&now, NULL) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get current time\n");
+		goto fail;
+	}
+
+	ts.tv_nsec = (now.tv_usec * 1000) % 1000000000;
+	ts.tv_sec = now.tv_sec + MP_TIMEOUT_S +
+			(now.tv_usec * 1000) / 1000000000;
+
+	/* initialize the request */
+	pthread_cond_init(&entry->cond, NULL);
+
+	msg.num_fds = 0;
+	msg.len_param = sizeof(*msg_req);
+	snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+
+	/* (attempt to) get a unique id */
+	user_req->id = get_unique_id();
+
+	/* copy contents of user request into the message */
+	memcpy(msg_req, user_req, sizeof(*msg_req));
+
+	if (rte_mp_sendmsg(&msg)) {
+		RTE_LOG(ERR, EAL, "Cannot send message to primary\n");
+		goto fail;
+	}
+
+	/* copy contents of user request into active request */
+	memcpy(&entry->user_req, user_req, sizeof(*user_req));
+
+	/* mark request as in progress */
+	entry->state = REQ_STATE_ACTIVE;
+
+	TAILQ_INSERT_TAIL(&mp_request_list.list, entry, next);
+
+	/* finally, wait on timeout */
+	do {
+		ret = pthread_cond_timedwait(&entry->cond,
+				&mp_request_list.lock, &ts);
+	} while (ret != 0 && ret != ETIMEDOUT);
+
+	if (entry->state != REQ_STATE_COMPLETE) {
+		RTE_LOG(ERR, EAL, "Request timed out\n");
+		ret = -1;
+	} else {
+		ret = 0;
+		user_req->result = entry->user_req.result;
+	}
+	TAILQ_REMOVE(&mp_request_list.list, entry, next);
+	free(entry);
+
+	pthread_mutex_unlock(&mp_request_list.lock);
+	return ret;
+fail:
+	pthread_mutex_unlock(&mp_request_list.lock);
+	free(entry);
+	return -1;
+}
+
+int
+register_mp_requests(void)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		if (rte_mp_action_register(MP_ACTION_REQUEST, handle_request)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_REQUEST);
+			return -1;
+		}
+	} else {
+		if (rte_mp_action_register(MP_ACTION_SYNC, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_SYNC);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_ROLLBACK, handle_sync)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_ROLLBACK);
+			return -1;
+		}
+		if (rte_mp_action_register(MP_ACTION_RESPONSE,
+				handle_response)) {
+			RTE_LOG(ERR, EAL, "Couldn't register '%s' action\n",
+				MP_ACTION_RESPONSE);
+			return -1;
+		}
+	}
+	return 0;
+}
diff --git a/lib/librte_eal/common/malloc_mp.h b/lib/librte_eal/common/malloc_mp.h
new file mode 100644
index 0000000..2b86b76
--- /dev/null
+++ b/lib/librte_eal/common/malloc_mp.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#ifndef MALLOC_MP_H
+#define MALLOC_MP_H
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <rte_common.h>
+#include <rte_random.h>
+#include <rte_spinlock.h>
+#include <rte_tailq.h>
+
+/* forward declarations */
+struct malloc_heap;
+struct rte_memseg;
+
+/* multiprocess synchronization structures for malloc */
+enum malloc_req_type {
+	REQ_TYPE_ALLOC,     /**< ask primary to allocate */
+	REQ_TYPE_FREE,      /**< ask primary to free */
+	REQ_TYPE_SYNC       /**< ask secondary to synchronize its memory map */
+};
+
+enum malloc_req_result {
+	REQ_RESULT_SUCCESS,
+	REQ_RESULT_FAIL
+};
+
+struct malloc_req_alloc {
+	struct malloc_heap *heap;
+	uint64_t page_sz;
+	size_t elt_size;
+	int socket;
+	unsigned int flags;
+	size_t align;
+	size_t bound;
+	bool contig;
+};
+
+struct malloc_req_free {
+	RTE_STD_C11
+	union {
+		void *addr;
+		uint64_t addr_64;
+	};
+	uint64_t len;
+};
+
+struct malloc_mp_req {
+	enum malloc_req_type t;
+	RTE_STD_C11
+	union {
+		struct malloc_req_alloc alloc_req;
+		struct malloc_req_free free_req;
+	};
+	uint64_t id; /**< not to be populated by caller */
+	enum malloc_req_result result;
+};
+
+int
+register_mp_requests(void);
+
+int
+request_to_primary(struct malloc_mp_req *req);
+
+/* synchronous memory map sync request */
+int
+request_sync(void);
+
+/* functions from malloc_heap exposed here */
+int
+malloc_heap_free_pages(void *aligned_start, size_t aligned_len);
+
+struct malloc_elem *
+alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
+		int socket, unsigned int flags, size_t align, size_t bound,
+		bool contig, struct rte_memseg **ms, int n_segs);
+
+void
+rollback_expand_heap(struct rte_memseg **ms, int n_segs,
+		struct malloc_elem *elem, void *map_addr, size_t map_len);
+
+#endif /* MALLOC_MP_H */
diff --git a/lib/librte_eal/common/meson.build b/lib/librte_eal/common/meson.build
index a1ada24..8a3dcfe 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -27,6 +27,7 @@ common_sources = files(
 	'eal_common_timer.c',
 	'malloc_elem.c',
 	'malloc_heap.c',
+	'malloc_mp.c',
 	'rte_keepalive.c',
 	'rte_malloc.c',
 	'rte_reciprocal.c',
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 5380ba8..542bf7e 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -67,6 +67,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_fbarray.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_malloc.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_elem.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_heap.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += malloc_mp.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_keepalive.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += rte_reciprocal.c
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 4876d07..75f2b0c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -211,6 +211,32 @@ is_zero_length(int fd)
 	return st.st_blocks == 0;
 }
 
+/* we cannot use rte_memseg_list_walk() here because we will be holding a
+ * write lock whenever we enter any function in this file; however, copying
+ * the same iteration code everywhere is not ideal either, so use a lockless
+ * copy of memseg list walk here.
+ */
+static int
+memseg_list_walk_thread_unsafe(rte_memseg_list_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		if (msl->base_va == NULL)
+			continue;
+
+		ret = func(msl, arg);
+		if (ret < 0)
+			return -1;
+		if (ret > 0)
+			return 1;
+	}
+	return 0;
+}
+
 static int
 get_seg_fd(char *path, int buflen, struct hugepage_info *hi,
 		unsigned int list_idx, unsigned int seg_idx)
@@ -739,7 +765,7 @@ eal_memalloc_alloc_seg_bulk(struct rte_memseg **ms, int n_segs, size_t page_sz,
 	wa.socket = socket;
 	wa.segs_allocated = 0;
 
-	ret = rte_memseg_list_walk(alloc_seg_walk, &wa);
+	ret = memseg_list_walk_thread_unsafe(alloc_seg_walk, &wa);
 	if (ret == 0) {
 		RTE_LOG(ERR, EAL, "%s(): couldn't find suitable memseg_list\n",
 			__func__);
@@ -797,7 +823,7 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		wa.ms = cur;
 		wa.hi = hi;
 
-		walk_res = rte_memseg_list_walk(free_seg_walk, &wa);
+		walk_res = memseg_list_walk_thread_unsafe(free_seg_walk, &wa);
 		if (walk_res == 1)
 			continue;
 		if (walk_res == 0)
@@ -1054,7 +1080,7 @@ eal_memalloc_sync_with_primary(void)
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		return 0;
 
-	if (rte_memseg_list_walk(sync_walk, NULL))
+	if (memseg_list_walk_thread_unsafe(sync_walk, NULL))
 		return -1;
 	return 0;
 }
-- 
2.7.4


* [PATCH v6 62/70] malloc: add support for callbacks on memory events
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (62 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 61/70] malloc: add support for multiprocess memory hotplug Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 63/70] malloc: enable callbacks on alloc/free and mp sync Anatoly Burakov
                           ` (7 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Each process will have its own callbacks. Callbacks will indicate
whether it was an allocation or a deallocation that happened, and
will also provide the start VA address and length of the affected
block.

Since memory hotplug isn't supported on FreeBSD or in legacy mem
mode, it will not be possible to register callbacks in either case.

Callbacks are called whenever something happens to the memory map of
the current process; at those times, the memory hotplug subsystem is
write-locked, so calling functions that take the same lock (such as
the memseg walk API) from within a callback leads to deadlock.
Document the limitation.

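For illustration, registering such a callback from application code
could look roughly like the sketch below (the API is the one added by
this patch; the function name, callback name and log output are made
up for the example):

	#include <rte_log.h>
	#include <rte_memory.h>

	/* log every change to the memory map of this process */
	static void
	log_mem_event(enum rte_mem_event event_type, const void *addr,
			size_t len)
	{
		/* the hotplug lock is held here, so do not call
		 * rte_memseg_walk() and friends from this callback
		 */
		RTE_LOG(INFO, USER1, "mem %s: addr %p, len %zu\n",
			event_type == RTE_MEM_EVENT_ALLOC ? "alloc" : "free",
			addr, len);
	}

	static void
	register_log_mem_event(void)
	{
		/* call after rte_eal_init(); fails with rte_errno set to
		 * ENOTSUP in legacy mem mode and on FreeBSD
		 */
		if (rte_mem_event_callback_register("log-mem-event",
				log_mem_event) != 0)
			RTE_LOG(WARNING, USER1,
				"cannot register mem event callback\n");
	}
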
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memalloc.c | 133 ++++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_memory.c   |  28 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  11 +++
 lib/librte_eal/common/include/rte_memory.h  |  71 +++++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 245 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 607ec3f..2d2d46f 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -2,16 +2,46 @@
  * Copyright(c) 2017-2018 Intel Corporation
  */
 
+#include <string.h>
+
+#include <rte_errno.h>
 #include <rte_lcore.h>
 #include <rte_fbarray.h>
 #include <rte_memzone.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>
+#include <rte_rwlock.h>
 
 #include "eal_private.h"
 #include "eal_internal_cfg.h"
 #include "eal_memalloc.h"
 
+struct mem_event_callback_entry {
+	TAILQ_ENTRY(mem_event_callback_entry) next;
+	char name[RTE_MEM_EVENT_CALLBACK_NAME_LEN];
+	rte_mem_event_callback_t clb;
+};
+
+/** Double linked list of actions. */
+TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+
+static struct mem_event_callback_entry_list mem_event_callback_list =
+	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
+
+static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
+
+static struct mem_event_callback_entry *
+find_mem_event_callback(const char *name)
+{
+	struct mem_event_callback_entry *r;
+
+	TAILQ_FOREACH(r, &mem_event_callback_list, next) {
+		if (!strcmp(r->name, name))
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -88,3 +118,106 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 	}
 	return true;
 }
+
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_event_callback_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' registered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name)
+{
+	struct mem_event_callback_entry *entry;
+	int ret, len;
+
+	if (name == NULL) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_EVENT_CALLBACK_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_event_rwlock);
+
+	entry = find_mem_event_callback(name);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_event_callback_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem event callback '%s' unregistered\n", name);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_event_rwlock);
+	return ret;
+}
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len)
+{
+	struct mem_event_callback_entry *entry;
+
+	rte_rwlock_read_lock(&mem_event_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_event_callback_list, next) {
+		RTE_LOG(DEBUG, EAL, "Calling mem event callback %s\n",
+			entry->name);
+		entry->clb(event, start, len);
+	}
+
+	rte_rwlock_read_unlock(&mem_event_rwlock);
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 22365c1..1f15ff7 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -624,6 +624,34 @@ dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 	return 0;
 }
 
+/*
+ * Defining here because declared in rte_memory.h, but the actual implementation
+ * is in eal_common_memalloc.c, like all other memalloc internals.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_register(name, clb);
+}
+
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem event callbacks not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_event_callback_unregister(name);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 8ca1fac..98b31da 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -55,4 +55,15 @@ eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 int
 eal_memalloc_sync_with_primary(void);
 
+int
+eal_memalloc_mem_event_callback_register(const char *name,
+		rte_mem_event_callback_t clb);
+
+int
+eal_memalloc_mem_event_callback_unregister(const char *name);
+
+void
+eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
+		size_t len);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 55383c4..398ca55 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -136,6 +136,9 @@ rte_iova_t rte_mem_virt2iova(const void *virt);
 /**
  * Get virtual memory address corresponding to iova address.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param iova
  *   The iova address.
  * @return
@@ -203,6 +206,9 @@ typedef int (*rte_memseg_list_walk_t)(const struct rte_memseg_list *msl,
 /**
  * Walk list of all memsegs.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -218,6 +224,9 @@ rte_memseg_walk(rte_memseg_walk_t func, void *arg);
 /**
  * Walk each VA-contiguous area.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -233,6 +242,9 @@ rte_memseg_contig_walk(rte_memseg_contig_walk_t func, void *arg);
 /**
  * Walk each allocated memseg list.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param func
  *   Iterator function
  * @param arg
@@ -248,6 +260,9 @@ rte_memseg_list_walk(rte_memseg_list_walk_t func, void *arg);
 /**
  * Dump the physical memory layout to a file.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @param f
  *   A pointer to a file for output
  */
@@ -256,6 +271,9 @@ void rte_dump_physmem_layout(FILE *f);
 /**
  * Get the total amount of available physical memory.
  *
+ * @note This function read-locks the memory hotplug subsystem, and thus cannot
+ *       be used within memory-related callback functions.
+ *
  * @return
  *    The total amount of available physical memory in bytes.
  */
@@ -290,6 +308,59 @@ unsigned rte_memory_get_nrank(void);
  */
 int rte_eal_using_phys_addrs(void);
 
+
+/**
+ * Enum indicating which kind of memory event has happened. Used by callbacks to
+ * distinguish between memory allocations and deallocations.
+ */
+enum rte_mem_event {
+	RTE_MEM_EVENT_ALLOC = 0, /**< Allocation event. */
+	RTE_MEM_EVENT_FREE,      /**< Deallocation event. */
+};
+#define RTE_MEM_EVENT_CALLBACK_NAME_LEN 64
+/**< maximum length of callback name */
+
+/**
+ * Function typedef used to register callbacks for memory events.
+ */
+typedef void (*rte_mem_event_callback_t)(enum rte_mem_event event_type,
+		const void *addr, size_t len);
+
+/**
+ * Function used to register callbacks for memory events.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
+
+/**
+ * Function used to unregister callbacks for memory events.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_event_callback_unregister(const char *name);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 23b339e..d1ac9ea 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_event_callback_register;
+	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
 	rte_mem_virt2memseg;
 	rte_mem_virt2memseg_list;
-- 
2.7.4


* [PATCH v6 63/70] malloc: enable callbacks on alloc/free and mp sync
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (63 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 62/70] malloc: add support for callbacks on memory events Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
                           ` (6 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Callbacks will be triggered just after allocation and just
before deallocation, to ensure that the memory address space
referenced in the callback is always valid by the time the
callback is called.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_heap.c        | 21 +++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 30 ++++++++++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_vfio.c     | 15 +++++++++++++--
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index be39250..18c7b69 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -241,6 +241,7 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	void *map_addr;
 	size_t alloc_sz;
 	int n_segs;
+	bool callback_triggered = false;
 
 	alloc_sz = RTE_ALIGN_CEIL(align + elt_size +
 			MALLOC_ELEM_TRAILER_LEN, pg_sz);
@@ -262,12 +263,22 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 
 	map_addr = ms[0]->addr;
 
+	/* notify user about changes in memory map */
+	eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC, map_addr, alloc_sz);
+
 	/* notify other processes that this has happened */
 	if (request_sync()) {
 		/* we couldn't ensure all processes have mapped memory,
 		 * so free it back and notify everyone that it's been
 		 * freed back.
+		 *
+		 * technically, we could've avoided adding memory addresses to
+		 * the map, but that would've led to inconsistent behavior
+		 * between primary and secondary processes, as those get
+		 * callbacks during sync. therefore, force primary process to
+		 * do alloc-and-rollback syncs as well.
 		 */
+		callback_triggered = true;
 		goto free_elem;
 	}
 	heap->total_size += alloc_sz;
@@ -280,6 +291,10 @@ try_expand_heap_primary(struct malloc_heap *heap, uint64_t pg_sz,
 	return 0;
 
 free_elem:
+	if (callback_triggered)
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				map_addr, alloc_sz);
+
 	rollback_expand_heap(ms, n_segs, elem, map_addr, alloc_sz);
 
 	request_sync();
@@ -642,6 +657,10 @@ malloc_heap_free(struct malloc_elem *elem)
 	heap->total_size -= aligned_len;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* notify user about changes in memory map */
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				aligned_start, aligned_len);
+
 		/* don't care if any of this fails */
 		malloc_heap_free_pages(aligned_start, aligned_len);
 
@@ -666,6 +685,8 @@ malloc_heap_free(struct malloc_elem *elem)
 		 * already removed from the heap, so it is, for all intents and
 		 * purposes, hidden from the rest of DPDK even if some other
 		 * process (including this one) may have these pages mapped.
+		 *
+		 * notifications about deallocated memory happen during sync.
 		 */
 		request_to_primary(&req);
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 75f2b0c..93f80bb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -876,6 +876,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 
 	diff_len = RTE_MIN(chunk_len, diff_len);
 
+	/* if we are freeing memory, notify the application */
+	if (!used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_FREE,
+				start_va, len);
+	}
+
 	for (i = 0; i < diff_len; i++) {
 		struct rte_memseg *p_ms, *l_ms;
 		int seg_idx = start + i;
@@ -901,6 +916,21 @@ sync_chunk(struct rte_memseg_list *primary_msl,
 		}
 	}
 
+	/* if we just allocated memory, notify the application */
+	if (used) {
+		struct rte_memseg *ms;
+		void *start_va;
+		size_t len, page_sz;
+
+		ms = rte_fbarray_get(l_arr, start);
+		start_va = ms->addr;
+		page_sz = (size_t)primary_msl->page_sz;
+		len = page_sz * diff_len;
+
+		eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
+				start_va, len);
+	}
+
 	/* calculate how much we can advance until next chunk */
 	diff_len = used ?
 			rte_fbarray_find_contig_used(l_arr, start) :
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 5101c04..2eea3b8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -1128,6 +1128,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	create.levels = 1;
 
 	if (do_map) {
+		void *addr;
 		/* re-create window and remap the entire memory */
 		if (iova > create.window_size) {
 			if (vfio_spapr_create_new_dma_window(vfio_container_fd,
@@ -1158,9 +1159,19 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 
 		/* now that we've remapped all of the memory that was present
 		 * before, map the segment that we were requested to map.
+		 *
+		 * however, if we were called by the callback, the memory we
+		 * were called with was already in the memseg list, so previous
+		 * mapping should've mapped that segment already.
+		 *
+		 * virt2memseg_list is a relatively cheap check, so use that. if
+		 * memory is within any memseg list, it's a memseg, so it's
+		 * already mapped.
 		 */
-		if (vfio_spapr_dma_do_map(vfio_container_fd,
-				vaddr, iova, len, 1) < 0) {
+		addr = (void *)(uintptr_t)vaddr;
+		if (rte_mem_virt2memseg_list(addr) == NULL &&
+				vfio_spapr_dma_do_map(vfio_container_fd,
+					vaddr, iova, len, 1) < 0) {
 			RTE_LOG(ERR, EAL, "Could not map segment\n");
 			ret = -1;
 			goto out;
-- 
2.7.4


* [PATCH v6 64/70] vfio: enable support for mem event callbacks
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (64 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 63/70] malloc: enable callbacks on alloc/free and mp sync Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
                           ` (5 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Enable callbacks on first device attach, disable callbacks
on last device detach.

PPC64 IOMMU does a memseg walk, which will cause a deadlock if
attempted inside a callback, so provide a local,
thread-unsafe copy of the memseg walk.

PPC64 IOMMU may also remap the entire memory map for DMA while
adding new elements to it, so change the user map list lock to a
recursive lock. That way, we can safely enter rte_vfio_dma_map(),
lock the user map list, enter the DMA mapping function and lock the
list again (to read previously existing maps).

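In a simplified sketch, the nesting that the recursive lock permits
looks like the following (illustrative functions only, not the actual
VFIO code; the lock API is the existing rte_spinlock_recursive one):

	#include <rte_spinlock.h>

	static rte_spinlock_recursive_t maps_lock =
			RTE_SPINLOCK_RECURSIVE_INITIALIZER;

	/* called both directly and from within dma_map_entry_point() */
	static void
	remap_existing_maps(void)
	{
		/* same thread takes the lock again - fine with a recursive
		 * lock, would deadlock with a plain rte_spinlock_t
		 */
		rte_spinlock_recursive_lock(&maps_lock);
		/* ... read previously recorded user maps ... */
		rte_spinlock_recursive_unlock(&maps_lock);
	}

	static void
	dma_map_entry_point(void)
	{
		rte_spinlock_recursive_lock(&maps_lock);
		remap_existing_maps(); /* nested acquisition */
		rte_spinlock_recursive_unlock(&maps_lock);
	}
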
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal_vfio.c | 157 +++++++++++++++++++++++++++++----
 1 file changed, 138 insertions(+), 19 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 2eea3b8..589d7d4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -20,6 +20,8 @@
 
 #ifdef VFIO_PRESENT
 
+#define VFIO_MEM_EVENT_CLB_NAME "vfio_mem_event_clb"
+
 /* per-process VFIO config */
 static struct vfio_config vfio_cfg;
 
@@ -69,13 +71,49 @@ struct user_mem_map {
 	uint64_t len;
 };
 static struct {
-	rte_spinlock_t lock;
+	rte_spinlock_recursive_t lock;
 	int n_maps;
 	struct user_mem_map maps[VFIO_MAX_USER_MEM_MAPS];
 } user_mem_maps = {
-	.lock = RTE_SPINLOCK_INITIALIZER
+	.lock = RTE_SPINLOCK_RECURSIVE_INITIALIZER
 };
 
+/* for sPAPR IOMMU, we will need to walk memseg list, but we cannot use
+ * rte_memseg_walk() because by the time we enter callback we will be holding a
+ * write lock, so a regular rte_memseg_walk() will deadlock. Copying the same
+ * iteration code everywhere is not ideal either, so use a lockless copy of
+ * memseg walk here.
+ */
+static int
+memseg_walk_thread_unsafe(rte_memseg_walk_t func, void *arg)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int i, ms_idx, ret = 0;
+
+	for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+		const struct rte_memseg *ms;
+		struct rte_fbarray *arr;
+
+		if (msl->memseg_arr.count == 0)
+			continue;
+
+		arr = &msl->memseg_arr;
+
+		ms_idx = rte_fbarray_find_next_used(arr, 0);
+		while (ms_idx >= 0) {
+			ms = rte_fbarray_get(arr, ms_idx);
+			ret = func(msl, ms, arg);
+			if (ret < 0)
+				return -1;
+			if (ret > 0)
+				return 1;
+			ms_idx = rte_fbarray_find_next_used(arr, ms_idx + 1);
+		}
+	}
+	return 0;
+}
+
 static int
 is_null_map(const struct user_mem_map *map)
 {
@@ -406,6 +444,38 @@ vfio_group_device_count(int vfio_group_fd)
 	return vfio_cfg.vfio_groups[i].devices;
 }
 
+static void
+vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	/* for IOVA as VA mode, no need to care for IOVA addresses */
+	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
+		uint64_t vfio_va = (uint64_t)(uintptr_t)addr;
+		if (type == RTE_MEM_EVENT_ALLOC)
+			vfio_dma_mem_map(vfio_va, vfio_va, len, 1);
+		else
+			vfio_dma_mem_map(vfio_va, vfio_va, len, 0);
+		return;
+	}
+
+	/* memsegs are contiguous in memory */
+	ms = rte_mem_virt2memseg(addr, msl);
+	while (cur_len < len) {
+		if (type == RTE_MEM_EVENT_ALLOC)
+			vfio_dma_mem_map(ms->addr_64, ms->iova, ms->len, 1);
+		else
+			vfio_dma_mem_map(ms->addr_64, ms->iova, ms->len, 0);
+
+		cur_len += ms->len;
+		++ms;
+	}
+}
+
 int
 rte_vfio_clear_group(int vfio_group_fd)
 {
@@ -468,6 +538,8 @@ int
 rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 		int *vfio_dev_fd, struct vfio_device_info *device_info)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -555,6 +627,10 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 				rte_vfio_clear_group(vfio_group_fd);
 				return -1;
 			}
+			/* lock memory hotplug before mapping and release it
+			 * after registering callback, to prevent races
+			 */
+			rte_rwlock_read_lock(mem_lock);
 			ret = t->dma_map_func(vfio_cfg.vfio_container_fd);
 			if (ret) {
 				RTE_LOG(ERR, EAL,
@@ -562,13 +638,14 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 					dev_addr, errno, strerror(errno));
 				close(vfio_group_fd);
 				rte_vfio_clear_group(vfio_group_fd);
+				rte_rwlock_read_unlock(mem_lock);
 				return -1;
 			}
 
 			vfio_cfg.vfio_iommu_type = t;
 
 			/* re-map all user-mapped segments */
-			rte_spinlock_lock(&user_mem_maps.lock);
+			rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 			/* this IOMMU type may not support DMA mapping, but
 			 * if we have mappings in the list - that means we have
@@ -590,12 +667,29 @@ rte_vfio_setup_device(const char *sysfs_base, const char *dev_addr,
 							"len: 0x%" PRIu64 "\n",
 							map->addr, map->iova,
 							map->len);
-					rte_spinlock_unlock(
+					rte_spinlock_recursive_unlock(
 							&user_mem_maps.lock);
+					rte_rwlock_read_unlock(mem_lock);
 					return -1;
 				}
 			}
-			rte_spinlock_unlock(&user_mem_maps.lock);
+			rte_spinlock_recursive_unlock(&user_mem_maps.lock);
+
+			/* register callback for mem events */
+			ret = rte_mem_event_callback_register(
+					VFIO_MEM_EVENT_CLB_NAME,
+					vfio_mem_event_callback);
+			/* unlock memory hotplug */
+			rte_rwlock_read_unlock(mem_lock);
+
+			if (ret && rte_errno != ENOTSUP) {
+				RTE_LOG(ERR, EAL, "Could not install memory event callback for VFIO\n");
+				return -1;
+			}
+			if (ret)
+				RTE_LOG(DEBUG, EAL, "Memory event callbacks not supported\n");
+			else
+				RTE_LOG(DEBUG, EAL, "Installed memory event callback for VFIO\n");
 		}
 	}
 
@@ -633,6 +727,8 @@ int
 rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		    int vfio_dev_fd)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
 	struct vfio_group_status group_status = {
 			.argsz = sizeof(group_status)
 	};
@@ -640,13 +736,20 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	int iommu_group_no;
 	int ret;
 
+	/* we don't want any DMA mapping messages to come while we're detaching
+	 * VFIO device, because this might be the last device and we might need
+	 * to unregister the callback.
+	 */
+	rte_rwlock_read_lock(mem_lock);
+
 	/* get group number */
 	ret = vfio_get_group_no(sysfs_base, dev_addr, &iommu_group_no);
 	if (ret <= 0) {
 		RTE_LOG(WARNING, EAL, "  %s not managed by VFIO driver\n",
 			dev_addr);
 		/* This is an error at this point. */
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* get the actual group fd */
@@ -654,7 +757,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (vfio_group_fd <= 0) {
 		RTE_LOG(INFO, EAL, "vfio_get_group_fd failed for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* At this point we got an active group. Closing it will make the
@@ -666,7 +770,8 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 	if (close(vfio_dev_fd) < 0) {
 		RTE_LOG(INFO, EAL, "Error when closing vfio_dev_fd for %s\n",
 				   dev_addr);
-		return -1;
+		ret = -1;
+		goto out;
 	}
 
 	/* An VFIO group can have several devices attached. Just when there is
@@ -678,17 +783,30 @@ rte_vfio_release_device(const char *sysfs_base, const char *dev_addr,
 		if (close(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when closing vfio_group_fd for %s\n",
 				dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 
 		if (rte_vfio_clear_group(vfio_group_fd) < 0) {
 			RTE_LOG(INFO, EAL, "Error when clearing group for %s\n",
 					   dev_addr);
-			return -1;
+			ret = -1;
+			goto out;
 		}
 	}
 
-	return 0;
+	/* if there are no active device groups, unregister the callback to
+	 * avoid spurious attempts to map/unmap memory from VFIO.
+	 */
+	if (vfio_cfg.vfio_active_groups == 0)
+		rte_mem_event_callback_unregister(VFIO_MEM_EVENT_CLB_NAME);
+
+	/* success */
+	ret = 0;
+
+out:
+	rte_rwlock_read_unlock(mem_lock);
+	return ret;
 }
 
 int
@@ -1104,12 +1222,13 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 	};
 	int i, ret = 0;
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 	/* check if window size needs to be adjusted */
 	memset(&param, 0, sizeof(param));
 
-	if (rte_memseg_walk(vfio_spapr_window_size_walk, &param) < 0) {
+	if (memseg_walk_thread_unsafe(vfio_spapr_window_size_walk,
+				&param) < 0) {
 		RTE_LOG(ERR, EAL, "Could not get window size\n");
 		ret = -1;
 		goto out;
@@ -1137,7 +1256,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 				ret = -1;
 				goto out;
 			}
-			if (rte_memseg_walk(vfio_spapr_map_walk,
+			if (memseg_walk_thread_unsafe(vfio_spapr_map_walk,
 					&vfio_container_fd) < 0) {
 				RTE_LOG(ERR, EAL, "Could not recreate DMA maps\n");
 				ret = -1;
@@ -1187,7 +1306,7 @@ vfio_spapr_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
 		vfio_spapr_dma_do_map(vfio_container_fd, vaddr, iova, len, 0);
 	}
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
@@ -1272,7 +1391,7 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
 		return -1;
 	}
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 	if (user_mem_maps.n_maps == VFIO_MAX_USER_MEM_MAPS) {
 		RTE_LOG(ERR, EAL, "No more space for user mem maps\n");
 		rte_errno = ENOMEM;
@@ -1300,7 +1419,7 @@ rte_vfio_dma_map(uint64_t vaddr, uint64_t iova, uint64_t len)
 
 	compact_user_maps();
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
@@ -1315,7 +1434,7 @@ rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
 		return -1;
 	}
 
-	rte_spinlock_lock(&user_mem_maps.lock);
+	rte_spinlock_recursive_lock(&user_mem_maps.lock);
 
 	/* find our mapping */
 	map = find_user_mem_map(vaddr, iova, len);
@@ -1374,7 +1493,7 @@ rte_vfio_dma_unmap(uint64_t vaddr, uint64_t iova, uint64_t len)
 	}
 
 out:
-	rte_spinlock_unlock(&user_mem_maps.lock);
+	rte_spinlock_recursive_unlock(&user_mem_maps.lock);
 	return ret;
 }
 
-- 
2.7.4


* [PATCH v6 65/70] bus/fslmc: move vfio DMA map into bus probe
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (65 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
                           ` (4 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

fslmc bus needs to map all allocated memory for VFIO before
device probe. This bus doesn't support hotplug, so at the time
of this call, all possible devices that could be present are
present. This will also be the place where we install the VFIO
callback, although that change will come in the next patch.

Since rte_fslmc_vfio_dmamap() is now only called at bus probe,
there is no longer any need to check if DMA mappings have been
already done.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/fslmc/fslmc_bus.c    | 11 +++++++++++
 drivers/bus/fslmc/fslmc_vfio.c   |  6 ------
 drivers/net/dpaa2/dpaa2_ethdev.c |  1 -
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index d6806df..d0b3261 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -286,6 +286,17 @@ rte_fslmc_probe(void)
 		return 0;
 	}
 
+	/* Map existing segments as well as, in case of hotpluggable memory,
+	 * install callback handler.
+	 */
+	ret = rte_fslmc_vfio_dmamap();
+	if (ret) {
+		DPAA2_BUS_ERR("Unable to DMA map existing VAs: (%d)", ret);
+		/* Not continuing ahead */
+		DPAA2_BUS_ERR("FSLMC VFIO Mapping failed");
+		return 0;
+	}
+
 	ret = fslmc_vfio_process_group();
 	if (ret) {
 		DPAA2_BUS_ERR("Unable to setup devices %d", ret);
diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index 8b15312..db3eb61 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -51,7 +51,6 @@ static int container_device_fd;
 static char *g_container;
 static uint32_t *msi_intr_vaddr;
 void *(*rte_mcp_ptr_list);
-static int is_dma_done;
 
 static struct rte_dpaa2_object_list dpaa2_obj_list =
 	TAILQ_HEAD_INITIALIZER(dpaa2_obj_list);
@@ -235,9 +234,6 @@ int rte_fslmc_vfio_dmamap(void)
 {
 	int i = 0;
 
-	if (is_dma_done)
-		return 0;
-
 	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
 		return -1;
 
@@ -254,8 +250,6 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
-	is_dma_done = 1;
-
 	return 0;
 }
 
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 281483d..5b8f30a 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -1845,7 +1845,6 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->rx_pkt_burst = dpaa2_dev_prefetch_rx;
 	eth_dev->tx_pkt_burst = dpaa2_dev_tx;
-	rte_fslmc_vfio_dmamap();
 
 	DPAA2_PMD_INFO("%s: netdev created", eth_dev->data->name);
 	return 0;
-- 
2.7.4


* [PATCH v6 66/70] bus/fslmc: enable support for mem event callbacks for vfio
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (66 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 67/70] eal: enable non-legacy memory mode Anatoly Burakov
                           ` (3 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: Hemant Agrawal, Shreyansh Jain, keith.wiles, jianfeng.tan,
	andras.kovacs, laszlo.vadkeri, benjamin.walker, bruce.richardson,
	thomas, konstantin.ananyev, kuralamudhan.ramakrishnan,
	louise.m.daly, nelio.laranjeiro, yskoh, pepperjo, jerin.jacob,
	olivier.matz, gowrishankar.m

VFIO needs to map and unmap segments for DMA whenever they
become available or unavailable, so register a callback for
memory events, and provide map/unmap functions.

Remove unneeded check for number of segments, as in non-legacy
mode this now becomes a valid scenario.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 drivers/bus/fslmc/fslmc_vfio.c | 153 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 135 insertions(+), 18 deletions(-)

diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
index db3eb61..625fa7c 100644
--- a/drivers/bus/fslmc/fslmc_vfio.c
+++ b/drivers/bus/fslmc/fslmc_vfio.c
@@ -30,6 +30,7 @@
 #include <rte_kvargs.h>
 #include <rte_dev.h>
 #include <rte_bus.h>
+#include <rte_eal_memconfig.h>
 
 #include "rte_fslmc.h"
 #include "fslmc_vfio.h"
@@ -188,11 +189,62 @@ static int vfio_map_irq_region(struct fslmc_vfio_group *group)
 	return -errno;
 }
 
+static int fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+static int fslmc_unmap_dma(uint64_t vaddr, rte_iova_t iovaddr, size_t len);
+
+static void
+fslmc_memevent_cb(enum rte_mem_event type, const void *addr, size_t len)
+{
+	struct rte_memseg_list *msl;
+	struct rte_memseg *ms;
+	size_t cur_len = 0, map_len = 0;
+	uint64_t virt_addr;
+	rte_iova_t iova_addr;
+	int ret;
+
+	msl = rte_mem_virt2memseg_list(addr);
+
+	while (cur_len < len) {
+		const void *va = RTE_PTR_ADD(addr, cur_len);
+
+		ms = rte_mem_virt2memseg(va, msl);
+		iova_addr = ms->iova;
+		virt_addr = ms->addr_64;
+		map_len = ms->len;
+
+		DPAA2_BUS_DEBUG("Request for %s, va=%p, "
+				"virt_addr=0x%" PRIx64 ", "
+				"iova=0x%" PRIx64 ", map_len=%zu",
+				type == RTE_MEM_EVENT_ALLOC ?
+					"alloc" : "dealloc",
+				va, virt_addr, iova_addr, map_len);
+
+		if (type == RTE_MEM_EVENT_ALLOC)
+			ret = fslmc_map_dma(virt_addr, iova_addr, map_len);
+		else
+			ret = fslmc_unmap_dma(virt_addr, iova_addr, map_len);
+
+		if (ret != 0) {
+			DPAA2_BUS_ERR("DMA Mapping/Unmapping failed. "
+					"Map=%d, addr=%p, len=%zu, err:(%d)",
+					type, va, map_len, ret);
+			return;
+		}
+
+		cur_len += map_len;
+	}
+
+	if (type == RTE_MEM_EVENT_ALLOC)
+		DPAA2_BUS_DEBUG("Total Mapped: addr=%p, len=%zu",
+				addr, len);
+	else
+		DPAA2_BUS_DEBUG("Total Unmapped: addr=%p, len=%zu",
+				addr, len);
+}
+
 static int
-fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
-		const struct rte_memseg *ms, void *arg)
+fslmc_map_dma(uint64_t vaddr, rte_iova_t iovaddr __rte_unused, size_t len)
 {
-	int *n_segs = arg;
 	struct fslmc_vfio_group *group;
 	struct vfio_iommu_type1_dma_map dma_map = {
 		.argsz = sizeof(struct vfio_iommu_type1_dma_map),
@@ -200,10 +252,11 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 	};
 	int ret;
 
-	dma_map.size = ms->len;
-	dma_map.vaddr = ms->addr_64;
+	dma_map.size = len;
+	dma_map.vaddr = vaddr;
+
 #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-	dma_map.iova = ms->iova;
+	dma_map.iova = iovaddr;
 #else
 	dma_map.iova = dma_map.vaddr;
 #endif
@@ -216,32 +269,91 @@ fslmc_vfio_map(const struct rte_memseg_list *msl __rte_unused,
 		return -1;
 	}
 
-	DPAA2_BUS_DEBUG("-->Initial SHM Virtual ADDR %llX",
-			dma_map.vaddr);
-	DPAA2_BUS_DEBUG("-----> DMA size 0x%llX", dma_map.size);
-	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA,
-			&dma_map);
+	DPAA2_BUS_DEBUG("--> Map address: %llX, size: 0x%llX",
+			dma_map.vaddr, dma_map.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_MAP_DMA, &dma_map);
 	if (ret) {
 		DPAA2_BUS_ERR("VFIO_IOMMU_MAP_DMA API(errno = %d)",
 				errno);
 		return -1;
 	}
-	(*n_segs)++;
+
 	return 0;
 }
 
-int rte_fslmc_vfio_dmamap(void)
+static int
+fslmc_unmap_dma(uint64_t vaddr, uint64_t iovaddr __rte_unused, size_t len)
 {
-	int i = 0;
+	struct fslmc_vfio_group *group;
+	struct vfio_iommu_type1_dma_unmap dma_unmap = {
+		.argsz = sizeof(struct vfio_iommu_type1_dma_unmap),
+		.flags = 0,
+	};
+	int ret;
+
+	dma_unmap.size = len;
+	dma_unmap.iova = vaddr;
 
-	if (rte_memseg_walk(fslmc_vfio_map, &i) < 0)
+	/* SET DMA MAP for IOMMU */
+	group = &vfio_group;
+
+	if (!group->container) {
+		DPAA2_BUS_ERR("Container is not connected ");
 		return -1;
+	}
 
-	/* Verifying that at least single segment is available */
-	if (i <= 0) {
-		DPAA2_BUS_ERR("No Segments found for VFIO Mapping");
+	DPAA2_BUS_DEBUG("--> Unmap address: %llX, size: 0x%llX",
+			dma_unmap.iova, dma_unmap.size);
+	ret = ioctl(group->container->fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
+	if (ret) {
+		DPAA2_BUS_ERR("VFIO_IOMMU_UNMAP_DMA API(errno = %d)",
+				errno);
 		return -1;
 	}
+
+	return 0;
+}
+
+static int
+fslmc_dmamap_seg(const struct rte_memseg_list *msl __rte_unused,
+		 const struct rte_memseg *ms, void *arg)
+{
+	int *n_segs = arg;
+	int ret;
+
+	ret = fslmc_map_dma(ms->addr_64, ms->iova, ms->len);
+	if (ret)
+		DPAA2_BUS_ERR("Unable to VFIO map (addr=%p, len=%zu)",
+				ms->addr, ms->len);
+	else
+		(*n_segs)++;
+
+	return ret;
+}
+
+int rte_fslmc_vfio_dmamap(void)
+{
+	int i = 0, ret;
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	rte_rwlock_t *mem_lock = &mcfg->memory_hotplug_lock;
+
+	/* Lock before parsing and registering callback to memory subsystem */
+	rte_rwlock_read_lock(mem_lock);
+
+	if (rte_memseg_walk(fslmc_dmamap_seg, &i) < 0) {
+		rte_rwlock_read_unlock(mem_lock);
+		return -1;
+	}
+
+	ret = rte_mem_event_callback_register("fslmc_memevent_clb",
+					      fslmc_memevent_cb);
+	if (ret && rte_errno == ENOTSUP)
+		DPAA2_BUS_DEBUG("Memory event callbacks not supported");
+	else if (ret)
+		DPAA2_BUS_DEBUG("Unable to install memory handler");
+	else
+		DPAA2_BUS_DEBUG("Installed memory callback handler");
+
 	DPAA2_BUS_DEBUG("Total %d segments found.", i);
 
 	/* TODO - This is a W.A. as VFIO currently does not add the mapping of
@@ -250,6 +362,11 @@ int rte_fslmc_vfio_dmamap(void)
 	 */
 	vfio_map_irq_region(&vfio_group);
 
+	/* Existing segments have been mapped and memory callback for hotplug
+	 * has been installed.
+	 */
+	rte_rwlock_read_unlock(mem_lock);
+
 	return 0;
 }
 
-- 
2.7.4


* [PATCH v6 67/70] eal: enable non-legacy memory mode
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (67 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 68/70] eal: add memory validator callback Anatoly Burakov
                           ` (2 subsequent siblings)
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Now that every other piece of the puzzle is in place, enable non-legacy
init mode.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e7c6dcf..99c2242 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -772,8 +772,6 @@ rte_eal_init(int argc, char **argv)
 		rte_atomic32_clear(&run_once);
 		return -1;
 	}
-	/* for now, always set legacy mem */
-	internal_config.legacy_mem = 1;
 
 	if (eal_plugins_init() < 0) {
 		rte_eal_init_alert("Cannot init plugins\n");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 68/70] eal: add memory validator callback
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (68 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 67/70] eal: enable non-legacy memory mode Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 69/70] malloc: enable validation before new page allocation Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 70/70] mem: prevent preallocated pages from being freed Anatoly Burakov
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

This API will enable applications to register for notifications on page
allocations that are about to happen, giving the application a chance to
allow or deny the allocation when the resulting total memory utilization
would be above the specified limit on the specified socket.
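
For illustration only - none of the following is part of this patch, and the
validator name, log type and 1 GB limit are made-up values - an application
could use the new (experimental) API roughly like this:

#include <rte_log.h>
#include <rte_memory.h>

/* Hard cap: the allocator only consults validators once the prospective
 * total usage on the socket reaches the registered limit, so simply
 * returning -1 here rejects any allocation beyond that point. */
static int
socket0_mem_limit(int socket_id, size_t cur_limit, size_t new_len)
{
	RTE_LOG(WARNING, USER1,
		"denying allocation on socket %d: %zu would exceed %zu\n",
		socket_id, new_len, cur_limit);
	return -1; /* cancel the pending allocation */
}

/* Watch socket 0 with a 1 GB limit. */
static int
setup_socket0_limit(void)
{
	return rte_mem_alloc_validator_register("app-socket0-limit",
			socket0_mem_limit, 0, (size_t)1 << 30);
}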

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/eal_common_memalloc.c | 138 +++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_common_memory.c   |  26 ++++++
 lib/librte_eal/common/eal_memalloc.h        |  10 ++
 lib/librte_eal/common/include/rte_memory.h  |  63 +++++++++++++
 lib/librte_eal/rte_eal_version.map          |   2 +
 5 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c b/lib/librte_eal/common/eal_common_memalloc.c
index 2d2d46f..49fd53c 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -22,14 +22,26 @@ struct mem_event_callback_entry {
 	rte_mem_event_callback_t clb;
 };
 
+struct mem_alloc_validator_entry {
+	TAILQ_ENTRY(mem_alloc_validator_entry) next;
+	char name[RTE_MEM_ALLOC_VALIDATOR_NAME_LEN];
+	rte_mem_alloc_validator_t clb;
+	int socket_id;
+	size_t limit;
+};
+
 /** Double linked list of actions. */
 TAILQ_HEAD(mem_event_callback_entry_list, mem_event_callback_entry);
+TAILQ_HEAD(mem_alloc_validator_entry_list, mem_alloc_validator_entry);
 
 static struct mem_event_callback_entry_list mem_event_callback_list =
 	TAILQ_HEAD_INITIALIZER(mem_event_callback_list);
-
 static rte_rwlock_t mem_event_rwlock = RTE_RWLOCK_INITIALIZER;
 
+static struct mem_alloc_validator_entry_list mem_alloc_validator_list =
+	TAILQ_HEAD_INITIALIZER(mem_alloc_validator_list);
+static rte_rwlock_t mem_alloc_validator_rwlock = RTE_RWLOCK_INITIALIZER;
+
 static struct mem_event_callback_entry *
 find_mem_event_callback(const char *name)
 {
@@ -42,6 +54,18 @@ find_mem_event_callback(const char *name)
 	return r;
 }
 
+static struct mem_alloc_validator_entry *
+find_mem_alloc_validator(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *r;
+
+	TAILQ_FOREACH(r, &mem_alloc_validator_list, next) {
+		if (!strcmp(r->name, name) && r->socket_id == socket_id)
+			break;
+	}
+	return r;
+}
+
 bool
 eal_memalloc_is_contig(const struct rte_memseg_list *msl, void *start,
 		size_t len)
@@ -221,3 +245,115 @@ eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 
 	rte_rwlock_read_unlock(&mem_event_rwlock);
 }
+
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+	if (name == NULL || clb == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry != NULL) {
+		rte_errno = EEXIST;
+		ret = -1;
+		goto unlock;
+	}
+
+	entry = malloc(sizeof(*entry));
+	if (entry == NULL) {
+		rte_errno = ENOMEM;
+		ret = -1;
+		goto unlock;
+	}
+
+	/* callback successfully created and is valid, add it to the list */
+	entry->clb = clb;
+	entry->socket_id = socket_id;
+	entry->limit = limit;
+	snprintf(entry->name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN, "%s", name);
+	TAILQ_INSERT_TAIL(&mem_alloc_validator_list, entry, next);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i with limit %zu registered\n",
+		name, socket_id, limit);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret, len;
+
+	if (name == NULL || socket_id < 0) {
+		rte_errno = EINVAL;
+		return -1;
+	}
+	len = strnlen(name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
+	if (len == 0) {
+		rte_errno = EINVAL;
+		return -1;
+	} else if (len == RTE_MEM_ALLOC_VALIDATOR_NAME_LEN) {
+		rte_errno = ENAMETOOLONG;
+		return -1;
+	}
+	rte_rwlock_write_lock(&mem_alloc_validator_rwlock);
+
+	entry = find_mem_alloc_validator(name, socket_id);
+	if (entry == NULL) {
+		rte_errno = ENOENT;
+		ret = -1;
+		goto unlock;
+	}
+	TAILQ_REMOVE(&mem_alloc_validator_list, entry, next);
+	free(entry);
+
+	ret = 0;
+
+	RTE_LOG(DEBUG, EAL, "Mem alloc validator '%s' on socket %i unregistered\n",
+		name, socket_id);
+
+unlock:
+	rte_rwlock_write_unlock(&mem_alloc_validator_rwlock);
+	return ret;
+}
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len)
+{
+	struct mem_alloc_validator_entry *entry;
+	int ret = 0;
+
+	rte_rwlock_read_lock(&mem_alloc_validator_rwlock);
+
+	TAILQ_FOREACH(entry, &mem_alloc_validator_list, next) {
+		if (entry->socket_id != socket_id || entry->limit > new_len)
+			continue;
+		RTE_LOG(DEBUG, EAL, "Calling mem alloc validator '%s' on socket %i\n",
+			entry->name, entry->socket_id);
+		if (entry->clb(socket_id, entry->limit, new_len) < 0)
+			ret = -1;
+	}
+
+	rte_rwlock_read_unlock(&mem_alloc_validator_rwlock);
+
+	return ret;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 1f15ff7..24a9ed5 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -652,6 +652,32 @@ rte_mem_event_callback_unregister(const char *name)
 	return eal_memalloc_mem_event_callback_unregister(name);
 }
 
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Registering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_register(name, clb, socket_id,
+			limit);
+}
+
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id)
+{
+	/* FreeBSD boots with legacy mem enabled by default */
+	if (internal_config.legacy_mem) {
+		RTE_LOG(DEBUG, EAL, "Unregistering mem alloc validators not supported\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
+	return eal_memalloc_mem_alloc_validator_unregister(name, socket_id);
+}
+
 /* Dump the physical memory layout on console */
 void
 rte_dump_physmem_layout(FILE *f)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index 98b31da..6736fa3 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -66,4 +66,14 @@ void
 eal_memalloc_mem_event_notify(enum rte_mem_event event, const void *start,
 		size_t len);
 
+int
+eal_memalloc_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+int
+eal_memalloc_mem_alloc_validator_unregister(const char *name, int socket_id);
+
+int
+eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 398ca55..b085a8b 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -361,6 +361,69 @@ rte_mem_event_callback_register(const char *name, rte_mem_event_callback_t clb);
 int __rte_experimental
 rte_mem_event_callback_unregister(const char *name);
 
+
+#define RTE_MEM_ALLOC_VALIDATOR_NAME_LEN 64
+/**< maximum length of alloc validator name */
+/**
+ * Function typedef used to register memory allocation validation callbacks.
+ *
+ * Returning 0 will allow allocation attempt to continue. Returning -1 will
+ * prevent allocation from succeeding.
+ */
+typedef int (*rte_mem_alloc_validator_t)(int socket_id,
+		size_t cur_limit, size_t new_len);
+
+/**
+ * @brief Register validator callback for memory allocations.
+ *
+ * Callbacks registered by this function will be called right before the
+ * memory allocator is about to trigger allocation of more pages from the
+ * system, if said allocation would bring total memory usage above the
+ * specified limit on the specified socket. The user will be able to cancel
+ * the pending allocation if the callback returns -1.
+ *
+ * @note callbacks will happen while memory hotplug subsystem is write-locked,
+ *       therefore some functions (e.g. `rte_memseg_walk()`) will cause a
+ *       deadlock when called from within such callbacks.
+ *
+ * @param name
+ *   Name associated with specified callback to be added to the list.
+ *
+ * @param clb
+ *   Callback function pointer.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @param limit
+ *   Limit above which to trigger callbacks.
+ *
+ * @return
+ *   0 on successful callback register
+ *   -1 on unsuccessful callback register, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_register(const char *name,
+		rte_mem_alloc_validator_t clb, int socket_id, size_t limit);
+
+/**
+ * @brief Unregister validator callback for memory allocations.
+ *
+ * @param name
+ *   Name associated with specified callback to be removed from the list.
+ *
+ * @param socket_id
+ *   Socket ID on which to watch for allocations.
+ *
+ * @return
+ *   0 on successful callback unregister
+ *   -1 on unsuccessful callback unregister, with rte_errno value indicating
+ *   reason for failure.
+ */
+int __rte_experimental
+rte_mem_alloc_validator_unregister(const char *name, int socket_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index d1ac9ea..2b5b1dc 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -238,6 +238,8 @@ EXPERIMENTAL {
 	rte_fbarray_set_used;
 	rte_log_register_type_and_pick_level;
 	rte_malloc_dump_heaps;
+	rte_mem_alloc_validator_register;
+	rte_mem_alloc_validator_unregister;
 	rte_mem_event_callback_register;
 	rte_mem_event_callback_unregister;
 	rte_mem_iova2virt;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 69/70] malloc: enable validation before new page allocation
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (69 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 68/70] eal: add memory validator callback Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  2018-04-11 12:30         ` [PATCH v6 70/70] mem: prevent preallocated pages from being freed Anatoly Burakov
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

Before allocating a new page, give the user a chance to allow or deny the
allocation via callbacks.
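
As a minimal illustration (not part of this patch; the helper name is made
up), apart from the EAL debug log added above, a denied allocation simply
surfaces to the caller as an ordinary allocation failure, e.g.
rte_malloc_socket() returning NULL:

#include <stddef.h>

#include <rte_malloc.h>

static void *
app_try_alloc(size_t sz, int socket)
{
	void *p = rte_malloc_socket("app_buf", sz, 0, socket);

	if (p == NULL) {
		/* Either genuinely out of memory, or a registered validator
		 * refused to let the heap on this socket grow further. */
	}
	return p;
}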

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/malloc_heap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index 18c7b69..f8daf84 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -196,6 +196,15 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	int allocd_pages;
 	void *ret, *map_addr;
 
+	alloc_sz = (size_t)pg_sz * n_segs;
+
+	/* first, check if we're allowed to allocate this memory */
+	if (eal_memalloc_mem_alloc_validate(socket,
+			heap->total_size + alloc_sz) < 0) {
+		RTE_LOG(DEBUG, EAL, "User has disallowed allocation\n");
+		return NULL;
+	}
+
 	allocd_pages = eal_memalloc_alloc_seg_bulk(ms, n_segs, pg_sz,
 			socket, true);
 
@@ -205,7 +214,6 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 
 	map_addr = ms[0]->addr;
 	msl = rte_mem_virt2memseg_list(map_addr);
-	alloc_sz = (size_t)msl->page_sz * allocd_pages;
 
 	/* check if we wanted contiguous memory but didn't get it */
 	if (contig && !eal_memalloc_is_contig(msl, map_addr, alloc_sz)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* [PATCH v6 70/70] mem: prevent preallocated pages from being freed
  2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
                           ` (70 preceding siblings ...)
  2018-04-11 12:30         ` [PATCH v6 69/70] malloc: enable validation before new page allocation Anatoly Burakov
@ 2018-04-11 12:30         ` Anatoly Burakov
  71 siblings, 0 replies; 471+ messages in thread
From: Anatoly Burakov @ 2018-04-11 12:30 UTC (permalink / raw)
  To: dev
  Cc: keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

It is reasonable to expect that a DPDK process will not deallocate any
pages that were preallocated via the "-m" or "--socket-mem" flags - yet,
currently, the DPDK memory subsystem will do exactly that once it finds
that the pages are unused.

Fix this by marking such pages as unfreeable, and preventing malloc from
ever trying to free them.
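
As an aside (illustrative only, helper names made up), the new flag is part
of the public struct rte_memseg, so an application can check which segments
are pinned, for example by counting them with rte_memseg_walk(). Note that
this must not be done from within a memory event or validator callback,
where the hotplug lock is already held:

#include <rte_common.h>
#include <rte_memory.h>

static int
count_unfreeable_cb(const struct rte_memseg_list *msl __rte_unused,
		const struct rte_memseg *ms, void *arg)
{
	unsigned int *cnt = arg;

	if (ms->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE)
		(*cnt)++;
	return 0; /* continue the walk */
}

static unsigned int
count_unfreeable_segments(void)
{
	unsigned int cnt = 0;

	rte_memseg_walk(count_unfreeable_cb, &cnt);
	return cnt;
}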

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
---
 lib/librte_eal/common/include/rte_memory.h |  3 +++
 lib/librte_eal/common/malloc_heap.c        | 23 +++++++++++++++++++++++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c |  7 +++++++
 lib/librte_eal/linuxapp/eal/eal_memory.c   | 18 +++++++++++++++---
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index b085a8b..a18fe27 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -83,6 +83,8 @@ typedef uint64_t rte_iova_t;
 /**
  * Physical memory segment descriptor.
  */
+#define RTE_MEMSEG_FLAG_DO_NOT_FREE (1 << 0)
+/**< Prevent this segment from being freed back to the OS. */
 struct rte_memseg {
 	RTE_STD_C11
 	union {
@@ -99,6 +101,7 @@ struct rte_memseg {
 	int32_t socket_id;          /**< NUMA socket ID. */
 	uint32_t nchannel;          /**< Number of channels. */
 	uint32_t nrank;             /**< Number of ranks. */
+	uint32_t flags;             /**< Memseg-specific flags */
 } __rte_packed;
 
 /**
diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f8daf84..41c14a8 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -606,6 +606,7 @@ malloc_heap_free(struct malloc_elem *elem)
 	void *start, *aligned_start, *end, *aligned_end;
 	size_t len, aligned_len, page_sz;
 	struct rte_memseg_list *msl;
+	unsigned int i, n_segs;
 	int ret;
 
 	if (!malloc_elem_cookies_ok(elem) || elem->state != ELEM_BUSY)
@@ -647,6 +648,28 @@ malloc_heap_free(struct malloc_elem *elem)
 	if (aligned_len < page_sz)
 		goto free_unlock;
 
+	/* we can free something. however, some of these pages may be marked as
+	 * unfreeable, so also check that as well
+	 */
+	n_segs = aligned_len / page_sz;
+	for (i = 0; i < n_segs; i++) {
+		const struct rte_memseg *tmp =
+				rte_mem_virt2memseg(aligned_start, msl);
+
+		if (tmp->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			/* this is an unfreeable segment, so move start */
+			aligned_start = RTE_PTR_ADD(tmp->addr, tmp->len);
+		}
+	}
+
+	/* recalculate length and number of segments */
+	aligned_len = RTE_PTR_DIFF(aligned_end, aligned_start);
+	n_segs = aligned_len / page_sz;
+
+	/* check if we can still free some pages */
+	if (n_segs == 0)
+		goto free_unlock;
+
 	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
 
 	/*
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 93f80bb..7bbbf30 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -806,6 +806,13 @@ eal_memalloc_free_seg_bulk(struct rte_memseg **ms, int n_segs)
 		struct free_walk_param wa;
 		int i, walk_res;
 
+		/* if this page is marked as unfreeable, fail */
+		if (cur->flags & RTE_MEMSEG_FLAG_DO_NOT_FREE) {
+			RTE_LOG(DEBUG, EAL, "Page is not allowed to be freed\n");
+			ret = -1;
+			continue;
+		}
+
 		memset(&wa, 0, sizeof(wa));
 
 		for (i = 0; i < (int)RTE_DIM(internal_config.hugepage_info);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index eb430a0..7cdd304 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1638,21 +1638,33 @@ eal_hugepage_init(void)
 			hp_sz_idx++) {
 		for (socket_id = 0; socket_id < RTE_MAX_NUMA_NODES;
 				socket_id++) {
+			struct rte_memseg **pages;
 			struct hugepage_info *hpi = &used_hp[hp_sz_idx];
 			unsigned int num_pages = hpi->num_pages[socket_id];
-			int num_pages_alloc;
+			int num_pages_alloc, i;
 
 			if (num_pages == 0)
 				continue;
 
+			pages = malloc(sizeof(*pages) * num_pages);
+
 			RTE_LOG(DEBUG, EAL, "Allocating %u pages of size %" PRIu64 "M on socket %i\n",
 				num_pages, hpi->hugepage_sz >> 20, socket_id);
 
-			num_pages_alloc = eal_memalloc_alloc_seg_bulk(NULL,
+			num_pages_alloc = eal_memalloc_alloc_seg_bulk(pages,
 					num_pages, hpi->hugepage_sz,
 					socket_id, true);
-			if (num_pages_alloc < 0)
+			if (num_pages_alloc < 0) {
+				free(pages);
 				return -1;
+			}
+
+			/* mark preallocated pages as unfreeable */
+			for (i = 0; i < num_pages_alloc; i++) {
+				struct rte_memseg *ms = pages[i];
+				ms->flags |= RTE_MEMSEG_FLAG_DO_NOT_FREE;
+			}
+			free(pages);
 		}
 	}
 	return 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 24/70] mempool: add support for the new allocation methods
  2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
@ 2018-04-11 14:35           ` Olivier Matz
  2018-04-11 14:35           ` Olivier Matz
  2018-04-11 14:43           ` Andrew Rybchenko
  2 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-04-11 14:35 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, shreyansh.jain,
	gowrishankar.m

On Wed, Apr 11, 2018 at 01:29:59PM +0100, Anatoly Burakov wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation API's instead of normal ones.
> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
> 
> However, for the non-IOVA contiguous case, existing mempool size
> calculation function doesn't give us expected results, because it
> will return memzone sizes aligned to page size (e.g. a 1MB mempool
> may use an entire 1GB page), therefore in cases where we weren't
> specifically asked to reserve non-contiguous memory, first try
> reserving a memzone as IOVA-contiguous, and if that fails, then
> try reserving with page-aligned size/alignment.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 24/70] mempool: add support for the new allocation methods
  2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
  2018-04-11 14:35           ` Olivier Matz
@ 2018-04-11 14:35           ` Olivier Matz
  2018-04-11 14:43           ` Andrew Rybchenko
  2 siblings, 0 replies; 471+ messages in thread
From: Olivier Matz @ 2018-04-11 14:35 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, thomas, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, shreyansh.jain,
	gowrishankar.m

On Wed, Apr 11, 2018 at 01:29:59PM +0100, Anatoly Burakov wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation API's instead of normal ones.
> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
> 
> However, for the non-IOVA contiguous case, existing mempool size
> calculation function doesn't give us expected results, because it
> will return memzone sizes aligned to page size (e.g. a 1MB mempool
> may use an entire 1GB page), therefore in cases where we weren't
> specifically asked to reserve non-contiguous memory, first try
> reserving a memzone as IOVA-contiguous, and if that fails, then
> try reserving with page-aligned size/alignment.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>

Acked-by: Olivier Matz <olivier.matz@6wind.com>

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 24/70] mempool: add support for the new allocation methods
  2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
  2018-04-11 14:35           ` Olivier Matz
  2018-04-11 14:35           ` Olivier Matz
@ 2018-04-11 14:43           ` Andrew Rybchenko
  2018-04-11 15:03             ` Burakov, Anatoly
  2 siblings, 1 reply; 471+ messages in thread
From: Andrew Rybchenko @ 2018-04-11 14:43 UTC (permalink / raw)
  To: Anatoly Burakov, dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

On 04/11/2018 03:29 PM, Anatoly Burakov wrote:
> If a user has specified that the zone should have contiguous memory,
> use the new _contig allocation API's instead of normal ones.

Just one minor nit..
As I understand _contig above is an artefact of the previous approach
with dedicated function for contiguous allocation. If so, description
should be updated to avoid confusion.

> Otherwise, account for the fact that unless we're in IOVA_AS_VA
> mode, we cannot guarantee that the pages would be physically
> contiguous, so we calculate the memzone size and alignments as if
> we were getting the smallest page size available.
>
> However, for the non-IOVA contiguous case, existing mempool size
> calculation function doesn't give us expected results, because it
> will return memzone sizes aligned to page size (e.g. a 1MB mempool
> may use an entire 1GB page), therefore in cases where we weren't
> specifically asked to reserve non-contiguous memory, first try
> reserving a memzone as IOVA-contiguous, and if that fails, then
> try reserving with page-aligned size/alignment.
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> ---
>   lib/librte_mempool/rte_mempool.c | 148 +++++++++++++++++++++++++++++++++------
>   1 file changed, 127 insertions(+), 21 deletions(-)

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 24/70] mempool: add support for the new allocation methods
  2018-04-11 14:43           ` Andrew Rybchenko
@ 2018-04-11 15:03             ` Burakov, Anatoly
  0 siblings, 0 replies; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-11 15:03 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: Olivier Matz, keith.wiles, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, benjamin.walker, bruce.richardson, thomas,
	konstantin.ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	nelio.laranjeiro, yskoh, pepperjo, jerin.jacob, hemant.agrawal,
	shreyansh.jain, gowrishankar.m

On 11-Apr-18 3:43 PM, Andrew Rybchenko wrote:
> On 04/11/2018 03:29 PM, Anatoly Burakov wrote:
>> If a user has specified that the zone should have contiguous memory,
>> use the new _contig allocation API's instead of normal ones.
> 
> Just one minor nit..
> As I understand _contig above is an artefact of the previous approach
> with dedicated function for contiguous allocation. If so, description
> should be updated to avoid confusion.
> 
>> Otherwise, account for the fact that unless we're in IOVA_AS_VA
>> mode, we cannot guarantee that the pages would be physically
>> contiguous, so we calculate the memzone size and alignments as if
>> we were getting the smallest page size available.
>>
>> However, for the non-IOVA contiguous case, existing mempool size
>> calculation function doesn't give us expected results, because it
>> will return memzone sizes aligned to page size (e.g. a 1MB mempool
>> may use an entire 1GB page), therefore in cases where we weren't
>> specifically asked to reserve non-contiguous memory, first try
>> reserving a memzone as IOVA-contiguous, and if that fails, then
>> try reserving with page-aligned size/alignment.
>>
>> Signed-off-by: Anatoly Burakov<anatoly.burakov@intel.com>
>> Tested-by: Santosh Shukla<santosh.shukla@caviumnetworks.com>
>> Tested-by: Hemant Agrawal<hemant.agrawal@nxp.com>
>> Tested-by: Gowrishankar Muthukrishnan<gowrishankar.m@linux.vnet.ibm.com>
>> ---
>>   lib/librte_mempool/rte_mempool.c | 148 +++++++++++++++++++++++++++++++++------
>>   1 file changed, 127 insertions(+), 21 deletions(-)


OK.

Thomas, please change text to the following on apply:

If a user has specified that the zone should have contiguous memory,
add a memzone flag to request contiguous memory. Otherwise, account
for the fact that unless we're in IOVA_AS_VA mode, we cannot
guarantee that the pages would be physically contiguous, so we
calculate the memzone size and alignments as if we were getting
the smallest page size available.

However, for the non-IOVA contiguous case, existing mempool size
calculation function doesn't give us expected results, because it
will return memzone sizes aligned to page size (e.g. a 1MB mempool
may use an entire 1GB page), therefore in cases where we weren't
specifically asked to reserve non-contiguous memory, first try
reserving a memzone as IOVA-contiguous, and if that fails, then
try reserving with page-aligned size/alignment.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 00/70] Memory Hotplug for DPDK
  2018-04-11 12:29         ` [PATCH v6 " Anatoly Burakov
@ 2018-04-11 18:07           ` Thomas Monjalon
  0 siblings, 0 replies; 471+ messages in thread
From: Thomas Monjalon @ 2018-04-11 18:07 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, keith.wiles, jianfeng.tan, andras.kovacs, laszlo.vadkeri,
	benjamin.walker, bruce.richardson, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, nelio.laranjeiro,
	yskoh, pepperjo, jerin.jacob, hemant.agrawal, olivier.matz,
	shreyansh.jain, gowrishankar.m

11/04/2018 14:29, Anatoly Burakov:
> This patchset introduces dynamic memory allocation for DPDK (aka memory
> hotplug). Based upon RFC submitted in December [1].

Applied (with some fixes discussed together), thanks for the huge work!

Now waiting for adjustments for some devices or platforms.

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 51/70] mem: add support for mapping hugepages at runtime
  2018-04-11 12:30         ` [PATCH v6 51/70] mem: add support for mapping hugepages at runtime Anatoly Burakov
@ 2018-04-17  2:06           ` Yongseok Koh
  2018-04-17  7:20             ` Thomas Monjalon
  0 siblings, 1 reply; 471+ messages in thread
From: Yongseok Koh @ 2018-04-17  2:06 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Bruce Richardson, Wiles, Keith, jianfeng.tan, andras.kovacs,
	laszlo.vadkeri, Walker, Benjamin, Thomas Monjalon,
	Konstantin Ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	Nélio Laranjeiro, pepperjo, Jerin Jacob, Hemant Agrawal,
	Olivier Matz, Shreyansh Jain


> On Apr 11, 2018, at 5:30 AM, Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> 
> Nothing uses this code yet. The bulk of it is copied from old
> memory allocation code (linuxapp eal_memory.c). We provide an
> EAL-internal API to allocate either one page or multiple pages,
> guaranteeing that we'll get contiguous VA for all of the pages
> that we requested.
> 
> Not supported on FreeBSD.
> 
> Locking is done via fcntl() because that way, when it comes to
> taking out write locks or unlocking on deallocation, we don't
> have to keep original fd's around. Plus, using fcntl() gives us
> ability to lock parts of a file, which is useful for single-file
> segments, which are coming down the line.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> ---
[...]
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> new file mode 100644
> index 0000000..45ea0ad
> --- /dev/null
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -0,0 +1,429 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017-2018 Intel Corporation
> + */
> +
> +#define _FILE_OFFSET_BITS 64
> +#include <errno.h>
> +#include <stdarg.h>
> +#include <stdbool.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <sys/queue.h>
> +#include <sys/file.h>
> +#include <unistd.h>
> +#include <limits.h>
> +#include <fcntl.h>

There's a bug in the old RedHat release:
Bug 1476120 - glibc headers don't include linux/falloc.h, and therefore doesn't include fallocate() flags [1]

How about adding "#include <linux/falloc.h>" ahead of fcntl.h?  I'm quite lazy
to update my host and using CentOS 7.2.1511, then it failed to compile due to
this bug.


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1476120

Thanks,
Yongseok

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-11 12:30         ` [PATCH v6 44/70] net/mlx5: " Anatoly Burakov
@ 2018-04-17  2:48           ` Yongseok Koh
  2018-04-17  9:03             ` Burakov, Anatoly
  0 siblings, 1 reply; 471+ messages in thread
From: Yongseok Koh @ 2018-04-17  2:48 UTC (permalink / raw)
  To: Anatoly Burakov
  Cc: dev, Adrien Mazarguil, Nélio Laranjeiro, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, Thomas Monjalon, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal


> On Apr 11, 2018, at 5:30 AM, Anatoly Burakov <anatoly.burakov@intel.com> wrote:
> 
> Reduce dependency on internal details of EAL memory subsystem, and
> simplify code.
> 
> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
> ---
> drivers/net/mlx5/mlx5_mr.c | 19 ++++++++-----------
> 1 file changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index 58afeb7..c96e134 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -234,10 +234,9 @@ struct mlx5_mr *
> mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
> {
> 	struct priv *priv = dev->data->dev_private;
> -	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
> +	const struct rte_memseg *ms;
> 	uintptr_t start;
> 	uintptr_t end;
> -	unsigned int i;
> 	struct mlx5_mr *mr;
> 
> 	mr = rte_zmalloc_socket(__func__, sizeof(*mr), 0, mp->socket_id);
> @@ -261,17 +260,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
> 	/* Save original addresses for exact MR lookup. */
> 	mr->start = start;
> 	mr->end = end;
> +
> 	/* Round start and end to page boundary if found in memory segments. */
> -	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
> -		uintptr_t addr = (uintptr_t)ms[i].addr;
> -		size_t len = ms[i].len;
> -		unsigned int align = ms[i].hugepage_sz;
> +	ms = rte_mem_virt2memseg((void *)start);
> +	if (ms != NULL)
> +		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
> +	ms = rte_mem_virt2memseg((void *)end);
> +	if (ms != NULL)
> +		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);

It is buggy. The memory region is [start, end), so if the memseg of 'end' isn't
allocated yet, the returned ms will have zero entries and this will make 'end'
zero. Instead, the following will be fine.

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index fdf7b3e88..39bbe2481 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -265,9 +265,7 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
        ms = rte_mem_virt2memseg((void *)start, NULL);
        if (ms != NULL)
                start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
-       ms = rte_mem_virt2memseg((void *)end, NULL);
-       if (ms != NULL)
-               end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
+       end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);

        DRV_LOG(DEBUG,
                "port %u mempool %p using start=%p end=%p size=%zu for memory"

Same for mlx4. Please fix both mlx5 and mlx4 so that we can verify the new design.

However, this code block will be removed eventually. I've done a patchset to
accommodate your memory hotplug design and I'll send it out soon.


Thanks in advance.
Yongseok

> -		if ((start > addr) && (start < addr + len))
> -			start = RTE_ALIGN_FLOOR(start, align);
> -		if ((end > addr) && (end < addr + len))
> -			end = RTE_ALIGN_CEIL(end, align);
> -	}
> 	DRV_LOG(DEBUG,
> 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
> 		" region",
> -- 
> 2.7.4

^ permalink raw reply related	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 51/70] mem: add support for mapping hugepages at runtime
  2018-04-17  2:06           ` Yongseok Koh
@ 2018-04-17  7:20             ` Thomas Monjalon
  2018-04-17 18:13               ` Yongseok Koh
  0 siblings, 1 reply; 471+ messages in thread
From: Thomas Monjalon @ 2018-04-17  7:20 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: Anatoly Burakov, dev, Bruce Richardson, Wiles, Keith,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, Walker, Benjamin,
	Konstantin Ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	Nélio Laranjeiro, pepperjo, Jerin Jacob, Hemant Agrawal,
	Olivier Matz, Shreyansh Jain

17/04/2018 04:06, Yongseok Koh:
> There's a bug in the old RedHat release:
> Bug 1476120 - glibc headers don't include linux/falloc.h, and therefore doesn't include fallocate() flags [1]
> 
> How about adding "#include <linux/falloc.h>" ahead of fcntl.h?  I'm quite lazy
> to update my host and using CentOS 7.2.1511, then it failed to compile due to
> this bug.

It's strange, it is already fixed (at least in master).
Please can you double check what's wrong?


> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1476120

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-17  2:48           ` Yongseok Koh
@ 2018-04-17  9:03             ` Burakov, Anatoly
  2018-04-17 18:08               ` Yongseok Koh
  0 siblings, 1 reply; 471+ messages in thread
From: Burakov, Anatoly @ 2018-04-17  9:03 UTC (permalink / raw)
  To: Yongseok Koh
  Cc: dev, Adrien Mazarguil, Nélio Laranjeiro, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, Thomas Monjalon, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal

On 17-Apr-18 3:48 AM, Yongseok Koh wrote:
> 
>> On Apr 11, 2018, at 5:30 AM, Anatoly Burakov <anatoly.burakov@intel.com> wrote:
>>
>> Reduce dependency on internal details of EAL memory subsystem, and
>> simplify code.
>>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>> ---
>> drivers/net/mlx5/mlx5_mr.c | 19 ++++++++-----------
>> 1 file changed, 8 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>> index 58afeb7..c96e134 100644
>> --- a/drivers/net/mlx5/mlx5_mr.c
>> +++ b/drivers/net/mlx5/mlx5_mr.c
>> @@ -234,10 +234,9 @@ struct mlx5_mr *
>> mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>> {
>> 	struct priv *priv = dev->data->dev_private;
>> -	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
>> +	const struct rte_memseg *ms;
>> 	uintptr_t start;
>> 	uintptr_t end;
>> -	unsigned int i;
>> 	struct mlx5_mr *mr;
>>
>> 	mr = rte_zmalloc_socket(__func__, sizeof(*mr), 0, mp->socket_id);
>> @@ -261,17 +260,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>> 	/* Save original addresses for exact MR lookup. */
>> 	mr->start = start;
>> 	mr->end = end;
>> +
>> 	/* Round start and end to page boundary if found in memory segments. */
>> -	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
>> -		uintptr_t addr = (uintptr_t)ms[i].addr;
>> -		size_t len = ms[i].len;
>> -		unsigned int align = ms[i].hugepage_sz;
>> +	ms = rte_mem_virt2memseg((void *)start);
>> +	if (ms != NULL)
>> +		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
>> +	ms = rte_mem_virt2memseg((void *)end);
>> +	if (ms != NULL)
>> +		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
> 
> It is buggy. The memory region is [start, end), so if the memseg of 'end' isn't
> allocated yet, the returned ms will have zero entries and this will make 'end'
> zero. Instead, the following will be fine.
> 
> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> index fdf7b3e88..39bbe2481 100644
> --- a/drivers/net/mlx5/mlx5_mr.c
> +++ b/drivers/net/mlx5/mlx5_mr.c
> @@ -265,9 +265,7 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>          ms = rte_mem_virt2memseg((void *)start, NULL);
>          if (ms != NULL)
>                  start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
> -       ms = rte_mem_virt2memseg((void *)end, NULL);
> -       if (ms != NULL)
> -               end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
> +       end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
> 
>          DRV_LOG(DEBUG,
>                  "port %u mempool %p using start=%p end=%p size=%zu for memory"
> 
> Same for mlx4. Please fix both mlx5 and mlx4 so that we can verify the new design.
> 
> However, this code block will be removed eventually. I've done a patchset to
> accommodate your memory hotplug design and I'll send it out soon.

Hi,

Thanks for raising this. I'll submit a patch shortly.

> 
> 
> Thanks in advance.
> Yongseok
> 
>> -		if ((start > addr) && (start < addr + len))
>> -			start = RTE_ALIGN_FLOOR(start, align);
>> -		if ((end > addr) && (end < addr + len))
>> -			end = RTE_ALIGN_CEIL(end, align);
>> -	}
>> 	DRV_LOG(DEBUG,
>> 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
>> 		" region",
>> -- 
>> 2.7.4
> 
> 


-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 44/70] net/mlx5: use virt2memseg instead of iteration
  2018-04-17  9:03             ` Burakov, Anatoly
@ 2018-04-17 18:08               ` Yongseok Koh
  0 siblings, 0 replies; 471+ messages in thread
From: Yongseok Koh @ 2018-04-17 18:08 UTC (permalink / raw)
  To: Burakov, Anatoly
  Cc: dev, Adrien Mazarguil, Nélio Laranjeiro, keith.wiles,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, benjamin.walker,
	bruce.richardson, Thomas Monjalon, konstantin.ananyev,
	kuralamudhan.ramakrishnan, louise.m.daly, pepperjo, jerin.jacob,
	hemant.agrawal


> On Apr 17, 2018, at 2:03 AM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:
> 
> On 17-Apr-18 3:48 AM, Yongseok Koh wrote:
>>> On Apr 11, 2018, at 5:30 AM, Anatoly Burakov <anatoly.burakov@intel.com> wrote:
>>> 
>>> Reduce dependency on internal details of EAL memory subsystem, and
>>> simplify code.
>>> 
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
>>> Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>>> Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
>>> ---
>>> drivers/net/mlx5/mlx5_mr.c | 19 ++++++++-----------
>>> 1 file changed, 8 insertions(+), 11 deletions(-)
>>> 
>>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>>> index 58afeb7..c96e134 100644
>>> --- a/drivers/net/mlx5/mlx5_mr.c
>>> +++ b/drivers/net/mlx5/mlx5_mr.c
>>> @@ -234,10 +234,9 @@ struct mlx5_mr *
>>> mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>>> {
>>> 	struct priv *priv = dev->data->dev_private;
>>> -	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
>>> +	const struct rte_memseg *ms;
>>> 	uintptr_t start;
>>> 	uintptr_t end;
>>> -	unsigned int i;
>>> 	struct mlx5_mr *mr;
>>> 
>>> 	mr = rte_zmalloc_socket(__func__, sizeof(*mr), 0, mp->socket_id);
>>> @@ -261,17 +260,15 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>>> 	/* Save original addresses for exact MR lookup. */
>>> 	mr->start = start;
>>> 	mr->end = end;
>>> +
>>> 	/* Round start and end to page boundary if found in memory segments. */
>>> -	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
>>> -		uintptr_t addr = (uintptr_t)ms[i].addr;
>>> -		size_t len = ms[i].len;
>>> -		unsigned int align = ms[i].hugepage_sz;
>>> +	ms = rte_mem_virt2memseg((void *)start);
>>> +	if (ms != NULL)
>>> +		start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
>>> +	ms = rte_mem_virt2memseg((void *)end);
>>> +	if (ms != NULL)
>>> +		end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
>> It is buggy. The memory region is [start, end), so if the memseg of 'end' isn't
>> allocated yet, the returned ms will have zero entries and this will make 'end'
>> zero. Instead, the following will be fine.
>> diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
>> index fdf7b3e88..39bbe2481 100644
>> --- a/drivers/net/mlx5/mlx5_mr.c
>> +++ b/drivers/net/mlx5/mlx5_mr.c
>> @@ -265,9 +265,7 @@ mlx5_mr_new(struct rte_eth_dev *dev, struct rte_mempool *mp)
>>         ms = rte_mem_virt2memseg((void *)start, NULL);
>>         if (ms != NULL)
>>                 start = RTE_ALIGN_FLOOR(start, ms->hugepage_sz);
>> -       ms = rte_mem_virt2memseg((void *)end, NULL);
>> -       if (ms != NULL)
>> -               end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
>> +       end = RTE_ALIGN_CEIL(end, ms->hugepage_sz);
>>         DRV_LOG(DEBUG,
>>                 "port %u mempool %p using start=%p end=%p size=%zu for memory"
>> Same for mlx4. Please fix both mlx5 and mlx4 so that we can verify the new design.
>> However, this code block will be removed eventually. I've done a patchset to
>> accommodate your memory hotplug design and I'll send it out soon.
> 
> Hi,
> 
> Thanks for raising this. I'll submit a patch shortly.

I didn't notice that your patchset has been merged. I thought you were to send out
a new version.

Never mind. I'll send out a fix.

Thanks,
Yongseok


> 
>> Thanks in advance.
>> Yongseok
>>> -		if ((start > addr) && (start < addr + len))
>>> -			start = RTE_ALIGN_FLOOR(start, align);
>>> -		if ((end > addr) && (end < addr + len))
>>> -			end = RTE_ALIGN_CEIL(end, align);
>>> -	}
>>> 	DRV_LOG(DEBUG,
>>> 		"port %u mempool %p using start=%p end=%p size=%zu for memory"
>>> 		" region",
>>> -- 
>>> 2.7.4
> 
> 
> -- 
> Thanks,
> Anatoly

^ permalink raw reply	[flat|nested] 471+ messages in thread

* Re: [PATCH v6 51/70] mem: add support for mapping hugepages at runtime
  2018-04-17  7:20             ` Thomas Monjalon
@ 2018-04-17 18:13               ` Yongseok Koh
  0 siblings, 0 replies; 471+ messages in thread
From: Yongseok Koh @ 2018-04-17 18:13 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Anatoly Burakov, dev, Bruce Richardson, Wiles, Keith,
	jianfeng.tan, andras.kovacs, laszlo.vadkeri, Walker, Benjamin,
	Konstantin Ananyev, kuralamudhan.ramakrishnan, louise.m.daly,
	Nélio Laranjeiro, pepperjo, Jerin Jacob, Hemant Agrawal,
	Olivier Matz, Shreyansh Jain



> On Apr 17, 2018, at 12:20 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> 17/04/2018 04:06, Yongseok Koh:
>> There's a bug in the old RedHat release:
>> Bug 1476120 - glibc headers don't include linux/falloc.h, and therefore doesn't include fallocate() flags [1]
>> 
>> How about adding "#include <linux/falloc.h>" ahead of fcntl.h?  I'm quite lazy
>> to update my host and using CentOS 7.2.1511, then it failed to compile due to
>> this bug.
> 
> It's strange, it is already fixed (at least in master).
> Please can you double check what's wrong?

My bad, I was still following Anatoly's GitHub.
I didn't notice it's been merged.

I'm seeing it's been fixed.

Thanks,
Yongseok

>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1476120

^ permalink raw reply	[flat|nested] 471+ messages in thread

end of thread, other threads:[~2018-04-17 18:13 UTC | newest]

Thread overview: 471+ messages
2018-03-03 13:45 [PATCH 00/41] Memory Hotplug for DPDK Anatoly Burakov
2018-03-03 13:45 ` [PATCH 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-03-03 13:45 ` [PATCH 02/41] eal: move all locking to heap Anatoly Burakov
2018-03-03 13:45 ` [PATCH 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
2018-03-19 17:33   ` Olivier Matz
2018-03-20  9:39     ` Burakov, Anatoly
2018-03-03 13:45 ` [PATCH 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
2018-03-03 13:45 ` [PATCH 05/41] test: add command " Anatoly Burakov
2018-03-03 13:45 ` [PATCH 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-03-03 13:45 ` [PATCH 07/41] eal: make malloc free list remove public Anatoly Burakov
2018-03-03 13:45 ` [PATCH 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
2018-03-19 17:34   ` Olivier Matz
2018-03-20  9:40     ` Burakov, Anatoly
2018-03-03 13:45 ` [PATCH 09/41] eal: add rte_fbarray Anatoly Burakov
2018-03-03 13:45 ` [PATCH 10/41] eal: add "single file segments" command-line option Anatoly Burakov
2018-03-03 13:45 ` [PATCH 11/41] eal: add "legacy memory" option Anatoly Burakov
2018-03-03 13:46 ` [PATCH 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-03-03 13:46 ` [PATCH 13/41] eal: replace memseg with memseg lists Anatoly Burakov
2018-03-19 17:39   ` Olivier Matz
2018-03-20  9:47     ` Burakov, Anatoly
2018-03-03 13:46 ` [PATCH 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
2018-03-19 17:42   ` Olivier Matz
2018-03-03 13:46 ` [PATCH 15/41] eal: add support for unmapping pages " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 16/41] eal: make use of memory hotplug for init Anatoly Burakov
2018-03-03 13:46 ` [PATCH 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
2018-03-19 17:46   ` Olivier Matz
2018-03-03 13:46 ` [PATCH 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
2018-03-19 17:49   ` Olivier Matz
2018-03-03 13:46 ` [PATCH 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
2018-03-03 13:46 ` [PATCH 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
2018-03-03 13:46 ` [PATCH 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
2018-03-03 13:46 ` [PATCH 22/41] eal: replace memzone array with fbarray Anatoly Burakov
2018-03-03 13:46 ` [PATCH 23/41] mempool: add support for the new allocation methods Anatoly Burakov
2018-03-03 13:46 ` [PATCH 24/41] vfio: allow to map other memory regions Anatoly Burakov
2018-03-03 13:46 ` [PATCH 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
2018-03-03 13:46 ` [PATCH 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-03-03 13:46 ` [PATCH 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
2018-03-03 13:46 ` [PATCH 28/41] eal: add support for multiprocess " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 29/41] eal: add support for callbacks on " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
2018-03-03 13:46 ` [PATCH 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-03-03 14:05   ` Andrew Rybchenko
2018-03-05  9:08     ` Burakov, Anatoly
2018-03-05  9:15       ` Andrew Rybchenko
2018-03-05 10:00         ` Burakov, Anatoly
2018-03-03 13:46 ` [PATCH 32/41] crypto/qat: " Anatoly Burakov
2018-03-05 11:06   ` Trahe, Fiona
2018-03-03 13:46 ` [PATCH 33/41] net/avf: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 34/41] net/bnx2x: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 35/41] net/cxgbe: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 36/41] net/ena: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 37/41] net/enic: " Anatoly Burakov
2018-03-05 19:45   ` John Daley (johndale)
2018-03-03 13:46 ` [PATCH 38/41] net/i40e: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 39/41] net/qede: " Anatoly Burakov
2018-03-03 13:46 ` [PATCH 40/41] net/virtio: " Anatoly Burakov
2018-03-03 16:52   ` Venkatesh Srinivas
2018-03-03 13:46 ` [PATCH 41/41] net/vmxnet3: " Anatoly Burakov
2018-03-06 11:04 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
2018-03-07 15:27 ` Nélio Laranjeiro
2018-03-07 16:05   ` Burakov, Anatoly
2018-03-08  9:37     ` Burakov, Anatoly
2018-03-08 10:53       ` Nélio Laranjeiro
2018-03-08 12:12         ` Burakov, Anatoly
2018-03-08 12:14           ` Bruce Richardson
2018-03-07 16:11   ` Burakov, Anatoly
2018-03-07 16:56 ` [PATCH v2 " Anatoly Burakov
2018-03-08 10:18   ` Pavan Nikhilesh
2018-03-08 10:46     ` Burakov, Anatoly
2018-03-08 11:13       ` Pavan Nikhilesh
2018-03-08 13:36         ` Pavan Nikhilesh
2018-03-08 14:36           ` Burakov, Anatoly
2018-03-08 20:11             ` Burakov, Anatoly
2018-03-08 20:33               ` Burakov, Anatoly
2018-03-09  9:15                 ` Pavan Nikhilesh
2018-03-09 10:42                   ` Burakov, Anatoly
2018-03-12 15:58                     ` Nélio Laranjeiro
2018-03-13  5:17                     ` Shreyansh Jain
2018-03-15 14:01                       ` Shreyansh Jain
2018-03-21 13:45                         ` Shreyansh Jain
2018-03-21 14:48                           ` Burakov, Anatoly
2018-03-22  5:09                             ` Shreyansh Jain
2018-03-22  9:24                               ` Burakov, Anatoly
2018-03-19  8:58   ` Shreyansh Jain
2018-03-20 10:07     ` Burakov, Anatoly
2018-03-29 10:57       ` Shreyansh Jain
2018-04-03 23:21   ` [PATCH v3 00/68] " Anatoly Burakov
2018-04-05 14:24     ` Shreyansh Jain
2018-04-05 14:12       ` Thomas Monjalon
2018-04-05 14:20         ` Hemant Agrawal
2018-04-06 12:01           ` Hemant Agrawal
2018-04-06 12:55             ` Burakov, Anatoly
2018-04-05 18:59     ` santosh
2018-04-08 20:17     ` [PATCH v4 00/70] " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 " Anatoly Burakov
2018-04-09 18:35         ` gowrishankar muthukrishnan
2018-04-11 12:29         ` [PATCH v6 " Anatoly Burakov
2018-04-11 18:07           ` Thomas Monjalon
2018-04-11 12:29         ` [PATCH v6 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 02/70] malloc: move all locking to heap Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 03/70] malloc: make heap a doubly-linked list Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 04/70] malloc: add function to dump heap contents Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 05/70] test: add command to dump malloc " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 06/70] malloc: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 07/70] malloc: make elem_free_list_remove public Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 08/70] malloc: make free return resulting element Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 09/70] malloc: replace panics with error messages Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 10/70] malloc: add support for contiguous allocation Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 11/70] memzone: enable reserving IOVA-contiguous memzones Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 13/70] crypto/qat: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 14/70] net/avf: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 15/70] net/bnx2x: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 16/70] net/bnxt: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 17/70] net/cxgbe: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 18/70] net/ena: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 19/70] net/enic: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 20/70] net/i40e: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 21/70] net/qede: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 22/70] net/virtio: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 23/70] net/vmxnet3: " Anatoly Burakov
2018-04-11 12:29         ` [PATCH v6 24/70] mempool: add support for the new allocation methods Anatoly Burakov
2018-04-11 14:35           ` Olivier Matz
2018-04-11 14:35           ` Olivier Matz
2018-04-11 14:43           ` Andrew Rybchenko
2018-04-11 15:03             ` Burakov, Anatoly
2018-04-11 12:30         ` [PATCH v6 25/70] eal: add function to walk all memsegs Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 27/70] bus/pci: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 28/70] net/mlx5: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 29/70] eal: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 30/70] mempool: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 31/70] test: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 32/70] vfio/type1: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 33/70] vfio/spapr: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 34/70] eal: add contig walk function Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 36/70] eal: add iova2virt function Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 38/70] bus/fslmc: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 39/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 40/70] eal: add virt2memseg function Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 42/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 43/70] net/mlx4: " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 44/70] net/mlx5: " Anatoly Burakov
2018-04-17  2:48           ` Yongseok Koh
2018-04-17  9:03             ` Burakov, Anatoly
2018-04-17 18:08               ` Yongseok Koh
2018-04-11 12:30         ` [PATCH v6 45/70] memzone: use walk instead of iteration for dumping Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 46/70] vfio: allow to map other memory regions Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 47/70] eal: add legacy memory option Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 48/70] eal: add shared indexed file-backed array Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 49/70] eal: replace memseg with memseg lists Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 50/70] eal: replace memzone array with fbarray Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 51/70] mem: add support for mapping hugepages at runtime Anatoly Burakov
2018-04-17  2:06           ` Yongseok Koh
2018-04-17  7:20             ` Thomas Monjalon
2018-04-17 18:13               ` Yongseok Koh
2018-04-11 12:30         ` [PATCH v6 52/70] mem: add support for unmapping pages " Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 53/70] eal: add single file segments command-line option Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 54/70] mem: add internal API to check if memory is contiguous Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 55/70] mem: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 57/70] eal: make use of memory hotplug for init Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 60/70] malloc: enable memory hotplug support Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 61/70] malloc: add support for multiprocess memory hotplug Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 62/70] malloc: add support for callbacks on memory events Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 63/70] malloc: enable callbacks on alloc/free and mp sync Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 67/70] eal: enable non-legacy memory mode Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 68/70] eal: add memory validator callback Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 69/70] malloc: enable validation before new page allocation Anatoly Burakov
2018-04-11 12:30         ` [PATCH v6 70/70] mem: prevent preallocated pages from being freed Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 02/70] eal: move all locking to heap Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 05/70] test: add command " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 07/70] eal: make malloc free list remove public Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 13/70] crypto/qat: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 14/70] net/avf: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 15/70] net/bnx2x: " Anatoly Burakov
2018-04-11  9:12         ` Thomas Monjalon
2018-04-11  9:18           ` Burakov, Anatoly
2018-04-09 18:00       ` [PATCH v5 16/70] net/bnxt: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 17/70] net/cxgbe: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 18/70] net/ena: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 19/70] net/enic: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 20/70] net/i40e: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 21/70] net/qede: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 22/70] net/virtio: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 23/70] net/vmxnet3: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 24/70] mempool: add support for the new allocation methods Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 25/70] eal: add function to walk all memsegs Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 27/70] bus/pci: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 28/70] net/mlx5: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 29/70] eal: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 30/70] mempool: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 31/70] test: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 32/70] vfio/type1: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 33/70] vfio/spapr: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 34/70] eal: add contig walk function Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 36/70] eal: add iova2virt function Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 38/70] bus/fslmc: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 39/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 40/70] eal: add virt2memseg function Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 42/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 43/70] net/mlx4: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 44/70] net/mlx5: " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 45/70] eal: use memzone walk " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 46/70] vfio: allow to map other memory regions Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 47/70] eal: add "legacy memory" option Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 48/70] eal: add rte_fbarray Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 49/70] eal: replace memseg with memseg lists Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 50/70] eal: replace memzone array with fbarray Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 52/70] eal: add support for unmapping pages " Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 53/70] eal: add "single file segments" command-line option Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-04-09 18:00       ` [PATCH v5 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 57/70] eal: make use of memory hotplug for init Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 62/70] eal: add support for callbacks on " Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 67/70] eal: enable non-legacy memory mode Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 68/70] eal: add memory validator callback Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 69/70] eal: enable validation before new page allocation Anatoly Burakov
2018-04-09 18:01       ` [PATCH v5 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 01/70] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 02/70] eal: move all locking to heap Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 03/70] eal: make malloc heap a doubly-linked list Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 04/70] eal: add function to dump malloc heap contents Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 05/70] test: add command " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 06/70] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 07/70] eal: make malloc free list remove public Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 08/70] eal: make malloc free return resulting malloc element Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 09/70] eal: replace panics with error messages in malloc Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 10/70] eal: add backend support for contiguous allocation Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 11/70] eal: enable reserving physically contiguous memzones Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 12/70] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 13/70] crypto/qat: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 14/70] net/avf: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 15/70] net/bnx2x: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 16/70] net/bnxt: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 17/70] net/cxgbe: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 18/70] net/ena: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 19/70] net/enic: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 20/70] net/i40e: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 21/70] net/qede: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 22/70] net/virtio: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 23/70] net/vmxnet3: " Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 24/70] mempool: add support for the new allocation methods Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 25/70] eal: add function to walk all memsegs Anatoly Burakov
2018-04-08 20:17     ` [PATCH v4 26/70] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 27/70] bus/pci: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 28/70] net/mlx5: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 29/70] eal: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 30/70] mempool: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 31/70] test: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 32/70] vfio/type1: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 33/70] vfio/spapr: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 34/70] eal: add contig walk function Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 35/70] virtio: use memseg contig walk instead of iteration Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 36/70] eal: add iova2virt function Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 37/70] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 38/70] bus/fslmc: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 39/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 40/70] eal: add virt2memseg function Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 41/70] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 42/70] crypto/dpaa_sec: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 43/70] net/mlx4: " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 44/70] net/mlx5: " Anatoly Burakov
2018-04-09 10:26       ` gowrishankar muthukrishnan
2018-04-08 20:18     ` [PATCH v4 45/70] eal: use memzone walk " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 46/70] vfio: allow to map other memory regions Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 47/70] eal: add "legacy memory" option Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 48/70] eal: add rte_fbarray Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 49/70] eal: replace memseg with memseg lists Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 50/70] eal: replace memzone array with fbarray Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 51/70] eal: add support for mapping hugepages at runtime Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 52/70] eal: add support for unmapping pages " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 53/70] eal: add "single file segments" command-line option Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 54/70] eal: add API to check if memory is contiguous Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 55/70] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 56/70] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 57/70] eal: make use of memory hotplug for init Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 58/70] eal: share hugepage info primary and secondary Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 59/70] eal: add secondary process init with memory hotplug Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 60/70] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 61/70] eal: add support for multiprocess memory hotplug Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 62/70] eal: add support for callbacks on " Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 63/70] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 64/70] vfio: enable support for mem event callbacks Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 65/70] bus/fslmc: move vfio DMA map into bus probe Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 66/70] bus/fslmc: enable support for mem event callbacks for vfio Anatoly Burakov
2018-04-09 10:01       ` Shreyansh Jain
2018-04-09 10:55         ` Burakov, Anatoly
2018-04-09 12:09           ` Shreyansh Jain
2018-04-09 12:35             ` Burakov, Anatoly
2018-04-08 20:18     ` [PATCH v4 67/70] eal: enable non-legacy memory mode Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 68/70] eal: add memory validator callback Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 69/70] eal: enable validation before new page allocation Anatoly Burakov
2018-04-08 20:18     ` [PATCH v4 70/70] eal: prevent preallocated pages from being freed Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 01/68] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 02/68] eal: move all locking to heap Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 03/68] eal: make malloc heap a doubly-linked list Anatoly Burakov
2018-04-03 23:32     ` Stephen Hemminger
2018-04-04  8:05       ` Burakov, Anatoly
2018-04-03 23:21   ` [PATCH v3 04/68] eal: add function to dump malloc heap contents Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 05/68] test: add command " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 06/68] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 07/68] eal: make malloc free list remove public Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 08/68] eal: make malloc free return resulting malloc element Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 09/68] eal: replace panics with error messages in malloc Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 10/68] eal: add backend support for contiguous allocation Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 11/68] eal: enable reserving physically contiguous memzones Anatoly Burakov
2018-04-03 23:41     ` Stephen Hemminger
2018-04-04  8:01       ` Burakov, Anatoly
2018-04-03 23:21   ` [PATCH v3 12/68] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 13/68] crypto/qat: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 14/68] net/avf: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 15/68] net/bnx2x: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 16/68] net/cxgbe: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 17/68] net/ena: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 18/68] net/enic: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 19/68] net/i40e: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 20/68] net/qede: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 21/68] net/virtio: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 22/68] net/vmxnet3: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 23/68] net/bnxt: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 24/68] mempool: add support for the new allocation methods Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 25/68] eal: add function to walk all memsegs Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 26/68] bus/fslmc: use memseg walk instead of iteration Anatoly Burakov
2018-04-05 14:06     ` Shreyansh Jain
2018-04-05 14:14     ` [PATCH] bus/fslmc: support for hotplugging of memory Shreyansh Jain
2018-04-08 17:14       ` Burakov, Anatoly
2018-04-09  7:49         ` Shreyansh Jain
2018-04-09 15:49           ` Burakov, Anatoly
2018-04-09 16:06             ` Shreyansh Jain
2018-04-03 23:21   ` [PATCH v3 27/68] bus/pci: use memseg walk instead of iteration Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 28/68] net/mlx5: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 29/68] eal: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 30/68] mempool: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 31/68] test: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 32/68] vfio/type1: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 33/68] vfio/spapr: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 34/68] eal: add contig walk function Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 35/68] virtio: use memseg contig walk instead of iteration Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 36/68] eal: add iova2virt function Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 37/68] bus/dpaa: use iova2virt instead of memseg iteration Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 38/68] bus/fslmc: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 39/68] crypto/dpaa_sec: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 40/68] eal: add virt2memseg function Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 41/68] bus/fslmc: use virt2memseg instead of iteration Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 42/68] net/mlx4: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 43/68] net/mlx5: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 44/68] crypto/dpaa_sec: " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 45/68] eal: use memzone walk " Anatoly Burakov
2018-04-03 23:21   ` [PATCH v3 46/68] vfio: allow to map other memory regions Anatoly Burakov
2018-04-04 11:27     ` Burakov, Anatoly
2018-04-05 11:30     ` Burakov, Anatoly
2018-04-03 23:21   ` [PATCH v3 47/68] eal: add "legacy memory" option Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 48/68] eal: add rte_fbarray Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 49/68] eal: replace memseg with memseg lists Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 50/68] eal: replace memzone array with fbarray Anatoly Burakov
2018-04-05 14:23     ` Shreyansh Jain
2018-04-03 23:22   ` [PATCH v3 51/68] eal: add support for mapping hugepages at runtime Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 52/68] eal: add support for unmapping pages " Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 53/68] eal: add "single file segments" command-line option Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 54/68] eal: add API to check if memory is contiguous Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 55/68] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 56/68] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 57/68] eal: make use of memory hotplug for init Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 58/68] eal: share hugepage info primary and secondary Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 59/68] eal: add secondary process init with memory hotplug Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 60/68] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 61/68] eal: add support for multiprocess memory hotplug Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 62/68] eal: add support for callbacks on " Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 63/68] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 64/68] vfio: enable support for mem event callbacks Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 65/68] eal: enable non-legacy memory mode Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 66/68] eal: add memory validator callback Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 67/68] eal: enable validation before new page allocation Anatoly Burakov
2018-04-03 23:22   ` [PATCH v3 68/68] eal: prevent preallocated pages from being freed Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 01/41] eal: move get_virtual_area out of linuxapp eal_memory.c Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 02/41] eal: move all locking to heap Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 03/41] eal: make malloc heap a doubly-linked list Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 04/41] eal: add function to dump malloc heap contents Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 05/41] test: add command " Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 06/41] eal: make malloc_elem_join_adjacent_free public Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 07/41] eal: make malloc free list remove public Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 08/41] eal: make malloc free return resulting malloc element Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 09/41] eal: add rte_fbarray Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 10/41] eal: add "single file segments" command-line option Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 11/41] eal: add "legacy memory" option Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 12/41] eal: read hugepage counts from node-specific sysfs path Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 13/41] eal: replace memseg with memseg lists Anatoly Burakov
2018-03-24  6:01   ` santosh
2018-03-24 11:08     ` Burakov, Anatoly
2018-03-24 12:23       ` santosh
2018-03-24 12:32         ` Burakov, Anatoly
2018-03-07 16:56 ` [PATCH v2 14/41] eal: add support for mapping hugepages at runtime Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 15/41] eal: add support for unmapping pages " Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 16/41] eal: make use of memory hotplug for init Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 17/41] eal: enable memory hotplug support in rte_malloc Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 18/41] test: fix malloc autotest to support memory hotplug Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 19/41] eal: add API to check if memory is contiguous Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 20/41] eal: add backend support for contiguous allocation Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 21/41] eal: enable reserving physically contiguous memzones Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 22/41] eal: replace memzone array with fbarray Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 23/41] mempool: add support for the new allocation methods Anatoly Burakov
2018-03-19 17:11   ` Olivier Matz
2018-03-21  7:49     ` Andrew Rybchenko
2018-03-21  8:32       ` Olivier Matz
2018-03-20 11:35   ` Shreyansh Jain
2018-03-20 12:17     ` Burakov, Anatoly
2018-03-23 11:25     ` Burakov, Anatoly
2018-03-07 16:56 ` [PATCH v2 24/41] vfio: allow to map other memory regions Anatoly Burakov
2018-03-30  9:42   ` Gowrishankar
2018-04-02 11:36   ` Gowrishankar
2018-03-07 16:56 ` [PATCH v2 25/41] eal: map/unmap memory with VFIO when alloc/free pages Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 26/41] eal: prepare memseg lists for multiprocess sync Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 27/41] eal: add multiprocess init with memory hotplug Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 28/41] eal: add support for multiprocess " Anatoly Burakov
2018-03-23 15:44   ` Tan, Jianfeng
2018-03-07 16:56 ` [PATCH v2 29/41] eal: add support for callbacks on " Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 30/41] eal: enable callbacks on malloc/free and mp sync Anatoly Burakov
2018-03-07 16:56 ` [PATCH v2 31/41] ethdev: use contiguous allocation for DMA memory Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 32/41] crypto/qat: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 33/41] net/avf: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 34/41] net/bnx2x: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 35/41] net/cxgbe: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 36/41] net/ena: " Anatoly Burakov
2018-03-08  9:40   ` Michał Krawczyk
2018-03-07 16:57 ` [PATCH v2 37/41] net/enic: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 38/41] net/i40e: " Anatoly Burakov
2018-03-07 16:57 ` [PATCH v2 39/41] net/qede: " Anatoly Burakov
2018-03-07 22:55   ` Patil, Harish
2018-03-07 16:57 ` [PATCH v2 40/41] net/virtio: " Anatoly Burakov
2018-03-28 11:58   ` Maxime Coquelin
2018-03-07 16:57 ` [PATCH v2 41/41] net/vmxnet3: " Anatoly Burakov
2018-03-08 14:40 ` [PATCH 00/41] Memory Hotplug for DPDK Burakov, Anatoly
2018-03-19 17:30 ` Olivier Matz
2018-03-20 10:27   ` Burakov, Anatoly
2018-03-20 12:42     ` Olivier Matz
2018-03-20 13:51       ` Burakov, Anatoly
2018-03-20 14:18         ` Olivier Matz
2018-03-20 14:46           ` Burakov, Anatoly
2018-03-21  9:09 ` gowrishankar muthukrishnan
