* [RFC 00/35] mempool: rework memory allocation
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

This series is a rework of mempool. For those who don't want to read
all the cover letter, here is a summary:

- it is not possible to allocate large mempools if there is not enough
  contiguous memory; this series solves that issue
- introduce new APIs with fewer arguments: "create, populate, obj_init"
- allow freeing a mempool
- split the code into smaller functions, which will ease the
  introduction of ext_handler
- remove the test-pmd anonymous mempool creation
- remove most of the dom0-specific mempool code
- open the door for an eal_memory rework: we probably don't need large
  contiguous memory areas anymore; working with pages would be enough.

This will clearly break the ABI, but as there are already 2 other changes that
will break it for 16.07, the target for this series is 16.07. I plan to send a
deprecation notice for 16.04 soon.

The API stays almost the same: no modification is needed in the example
apps or in test-pmd. Only KNI and the Mellanox drivers are slightly
modified.

Description of the initial issue
--------------------------------

The allocation of an mbuf pool can fail even if there is enough memory.
The problem is related to the way the memory is allocated and used in
DPDK. It is particularly annoying with mbuf pools, but it can also occur
in other use cases that allocate a large amount of memory.

- rte_malloc() allocates physically contiguous memory, which is needed
  for mempools, but useless most of the time.

  Allocating a large physically contiguous zone is often impossible
  because the system provides hugepages that may not be physically
  contiguous.

- rte_mempool_create() (and therefore rte_pktmbuf_pool_create())
  requires a physically contiguous zone.

- rte_mempool_xmem_create() does not solve the issue as it still
  needs the memory to be virtually contiguous, and there is no
  way in DPDK to allocate virtually contiguous memory that is
  not also physically contiguous.

How to reproduce the issue
--------------------------

- start DPDK with some 2MB hugepages (the issue can also occur with 1GB)
- allocate a large mempool
- even if there is enough memory, the allocation can fail

Example:

  git clone http://dpdk.org/git/dpdk
  cd dpdk
  make config T=x86_64-native-linuxapp-gcc
  make -j32
  mkdir -p /mnt/huge
  mount -t hugetlbfs nodev /mnt/huge
  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

  # we try to allocate a mempool whose size is ~450MB; it fails
  ./build/app/testpmd -l 2,4 -- --total-num-mbufs=200000 -i

The EAL log lines "EAL: Virtual area found at..." show that there are
several zones, but all smaller than 450MB.

Workarounds:

- Use 1GB hugepages: it sometimes works, but for very large
  pools (millions of mbufs) the same issue appears. Moreover, it
  consumes at least 1GB of memory, which can be a lot
  in some cases.

- Reboot the machine or allocate hugepages at boot time: this increases
  the chances of having more contiguous memory, but does not completely
  solve the issue.

Solutions
---------

Below is a list of proposed solutions. I implemented a quick and dirty
PoC of solution 1, but it does not work in all conditions and it is
really an ugly hack. This series implements solution 4, which looks
the best to me; it does not prevent further enhancements to DPDK
memory in the future (solution 3, for instance).

Solution 1: in application
--------------------------

- allocate several hugepages using rte_malloc() or rte_memzone_reserve()
  (only keeping complete hugepages)
- parse memsegs and /proc/maps to check which files mmaps these pages
- mmap the files in a contiguous virtual area
- use rte_mempool_xmem_create()

Cons:

- 1a. parsing the memsegs of rte config in the application does not
  use a public API, and can be broken if internal dpdk code changes
- 1b. some memory is lost due to malloc headers. Also, if the memory is
  very fragmented (ex: all 2MB pages are physically separated), it does
  not work at all because we cannot get any complete page. It is not
  possible to use a lower level allocator since commit fafcc11985a.
- 1c. we cannot use rte_pktmbuf_pool_create(), so we need to use the
  mempool API and do part of the job manually
- 1d. it breaks secondary processes as the virtual addresses won't be
  mmap'd at the same place in the secondary process
- 1e. it only fixes the issue for the mbuf pool of the application,
  internal pools in dpdk libraries are not modified
- 1f. this is a pure linux solution (rte_map files)
- 1g. The application has to be aware of the RTE_EAL_SINGLE_SEGMENTS
  option that changes the way hugepages are mapped. By the way, it's
  strange to have such a compile-time option; we should probably have
  only one behavior that works all the time.

Solution 2: in dpdk memory allocator
------------------------------------

- do the same as solution 1 in a new function rte_malloc_non_contig():
  allocate several chunks and mmap them into a contiguous virtual memory
  area
- a flag has to be added in the malloc header to do the proper cleanup in
  rte_free() (free all the chunks, munmap the memory)
- introduce a new rte_mem_get_physmap(*physmap, addr, len) that returns
  the virt2phys mapping of a virtual area in DPDK
- add a mempool flag MEMPOOL_F_NON_PHYS_CONTIG to use
  rte_malloc_non_contig() to allocate the area storing the objects
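
A rough sketch of what these hypothetical interfaces could look like
(names and signatures are TBD; nothing below exists in DPDK today):

  /* Hypothetical: allocate len bytes that are virtually contiguous
   * but possibly backed by several physically separate chunks. */
  void *rte_malloc_non_contig(const char *type, size_t len, unsigned align);

  /* Hypothetical: fill physmap[] with the physical address of each
   * page of the virtual area [addr, addr + len). */
  int rte_mem_get_physmap(phys_addr_t *physmap, void *addr, size_t len);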

Cons:

- 2a. same as 1d: it breaks secondary processes if the mempool flag is
  used.
- 2b. same as 1b: some memory is lost due to malloc headers, and it
  cannot work if memory is too fragmented.
- 2c. rte_malloc_virt2phy() cannot be used on these zones. It would
  return the physical address of the first page. It would be better to
  return an error in this case.
- 2d. need to check how to implement this on bsd (TBD)

Solution 3: in dpdk eal memory
------------------------------

- Rework the way hugepages are mmap'd in DPDK: instead of having several
  rte_map* files, just mmap one file per node. It may drastically
  simplify EAL memory management in DPDK.
- An API should be added to retrieve the physical mapping of a virtual
  area (ex: rte_mem_get_physmap(*physmap, addr, len))
- rte_malloc() and rte_memzone_reserve() won't allocate physically
  contiguous memory anymore (TBD)
- Update mempool to always use the rte_mempool_xmem_create() version

Cons:

- 3a. a lot of rework in EAL memory; it will induce some behavior
  changes and maybe API changes
- 3b. possible conflicts with xen_dom0 mempool

Solution 4: in mempool
----------------------

- Introduce a new API to fill a mempool with zones that are not
  virtually contiguous. This requires adding new functions to create and
  populate a mempool; see the usage sketch after this list. Example
  prototypes (TBD):

  - rte_mempool_create_empty(name, n, elt_size, cache_size, priv_size)
  - rte_mempool_populate(mp, addr, len): add virtual memory for objects
  - rte_mempool_obj_iter(mp, obj_cb, arg): call a cb for each object

- update rte_mempool_create() to allocate objects in several memory
  chunks by default if there is no physically contiguous memory area
  large enough.
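
A usage sketch of the proposed flow, based on the TBD prototypes above
(mempool_is_full() and get_next_chunk() are placeholders for
application or library logic, not existing functions):

  struct rte_mempool *mp;
  void *addr;
  size_t len;

  /* Sketch only: create an empty pool, then feed it with as many
   * virtual memory chunks as needed to store all the objects. */
  mp = rte_mempool_create_empty("foo", n, elt_size, cache_size, priv_size);
  while (!mempool_is_full(mp)) {               /* placeholder */
          addr = get_next_chunk(&len);         /* placeholder */
          rte_mempool_populate(mp, addr, len);
  }
  rte_mempool_obj_iter(mp, my_obj_init, NULL); /* init each object */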

Tests done
----------

Compilation
~~~~~~~~~~~

The following targets:

 x86_64-native-linuxapp-gcc
 i686-native-linuxapp-gcc
 x86_x32-native-linuxapp-gcc
 x86_64-native-linuxapp-clang

Libraries with and without debug, in static and shared mode, plus the
examples.

autotests
~~~~~~~~~

cd /root/dpdk.org
make config T=x86_64-native-linuxapp-gcc O=x86_64-native-linuxapp-gcc
make -j4 O=x86_64-native-linuxapp-gcc EXTRA_CFLAGS="-g -O0"
modprobe uio_pci_generic
python tools/dpdk_nic_bind.py -b uio_pci_generic 0000:03:00.0
python tools/dpdk_nic_bind.py -b uio_pci_generic 0000:08:00.0
mkdir -p /mnt/huge
mount -t hugetlbfs nodev /mnt/huge
echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

./x86_64-native-linuxapp-gcc/app/test -l 0,2,4 -n 4
  memory_autotest       OK
  memzone_autotest      OK
  ring_autotest         OK
  ring_perf_autotest    OK
  mempool_autotest      OK
  mempool_perf_autotest OK

same with --no-huge
  memory_autotest       OK
  memzone_autotest      KO  (was already KO)
  mempool_autotest      OK
  mempool_perf_autotest OK
  ring_autotest         OK
  ring_perf_autotest    OK

test-pmd
~~~~~~~~

# now starts fine; it was failing before when memory was too fragmented
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -- -i --port-topology=chained

# still ok
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 -- -i --port-topology=chained --mp-anon
set fwd txonly
start
stop

# fails, but was failing before too. The problem is that the physical
# addresses are not properly set when using --no-huge. The mempool phys
# addresses are now correct, but the zones allocated through
# memzone_reserve() are still wrong. This could be fixed in a future
# series.
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 --no-huge -- -i --port-topology=chained
set fwd txonly
start
stop

Tests not done (for now)
------------------------

- Mellanox drivers (they are slightly modified)
- kni (it is slightly modified)
- compilation and test with freebsd
- compilation and test with xen
- compilation and test on other archs


Olivier Matz (35):
  mempool: fix comments and style
  mempool: replace elt_size by total_elt_size
  mempool: uninline function to check cookies
  mempool: use sizeof to get the size of header and trailer
  mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t
  mempool: update library version
  mempool: list objects when added in the mempool
  mempool: remove const attribute in mempool_walk
  mempool: use the list to iterate the mempool elements
  eal: introduce RTE_DECONST macro
  mempool: use the list to audit all elements
  mempool: use the list to initialize mempool objects
  mempool: create the internal ring in a specific function
  mempool: store physaddr in mempool objects
  mempool: remove MEMPOOL_IS_CONTIG()
  mempool: store memory chunks in a list
  mempool: new function to iterate the memory chunks
  mempool: simplify xmem_usage
  mempool: introduce a free callback for memory chunks
  mempool: make page size optional when getting xmem size
  mempool: default allocation in several memory chunks
  eal: lock memory when using no-huge
  mempool: support no-hugepage mode
  mempool: replace mempool physaddr by a memzone pointer
  mempool: introduce a function to free a mempool
  mempool: introduce a function to create an empty mempool
  eal/xen: return machine address without knowing memseg id
  mempool: rework support of xen dom0
  mempool: create the internal ring when populating
  mempool: populate a mempool with anonymous memory
  test-pmd: remove specific anon mempool code
  mempool: make mempool populate and free api public
  mem: avoid memzone/mempool/ring name truncation
  mempool: new flag when phys contig mem is not needed
  mempool: update copyright

 app/test-pmd/Makefile                        |    4 -
 app/test-pmd/mempool_anon.c                  |  201 -----
 app/test-pmd/mempool_osdep.h                 |   54 --
 app/test-pmd/testpmd.c                       |   17 +-
 app/test/test_mempool.c                      |   21 +-
 doc/guides/rel_notes/release_16_04.rst       |    2 +-
 drivers/net/mlx4/mlx4.c                      |   71 +-
 drivers/net/mlx5/mlx5_rxq.c                  |    9 +-
 drivers/net/mlx5/mlx5_rxtx.c                 |   62 +-
 drivers/net/mlx5/mlx5_rxtx.h                 |    2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h        |    2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c   |    4 +-
 lib/librte_eal/common/eal_common_log.c       |    2 +-
 lib/librte_eal/common/eal_common_memzone.c   |   10 +-
 lib/librte_eal/common/include/rte_common.h   |    9 +
 lib/librte_eal/common/include/rte_memory.h   |   11 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c     |    2 +-
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c |   17 +-
 lib/librte_kni/rte_kni.c                     |   12 +-
 lib/librte_mempool/Makefile                  |    5 +-
 lib/librte_mempool/rte_dom0_mempool.c        |  133 ----
 lib/librte_mempool/rte_mempool.c             | 1031 +++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h             |  590 +++++++--------
 lib/librte_mempool/rte_mempool_version.map   |   18 +-
 lib/librte_ring/rte_ring.c                   |   16 +-
 25 files changed, 1121 insertions(+), 1184 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

-- 
2.1.4

* [RFC 01/35] mempool: fix comments and style
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

No functional change, just fix some comments and styling issues.
Also avoid duplicating comments between rte_mempool_create()
and rte_mempool_xmem_create().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++---
 lib/librte_mempool/rte_mempool.h | 59 +++++++++-------------------------------
 2 files changed, 26 insertions(+), 50 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 73ca770..6db02ee 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -152,6 +152,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
+/* Iterate through objects at the given address
+ *
+ * Given the pointer to the memory, and its topology in physical memory
+ * (the physical addresses table), iterate through the "elt_num" objects
+ * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * chunk, invoke a callback. It returns the effective number of objects
+ * in this memory. */
 uint32_t
 rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
@@ -341,10 +348,8 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/*
- * Calculate how much memory would be actually required with the
- * given memory footprint to store required number of elements.
- */
+/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
+ * argument to the end of the object. */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
 	__rte_unused uint32_t idx)
@@ -352,6 +357,10 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
 
+/*
+ * Calculate how much memory would be actually required with the
+ * given memory footprint to store required number of elements.
+ */
 ssize_t
 rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8595e77..bd78df5 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -214,7 +214,7 @@ struct rte_mempool {
 
 }  __rte_cache_aligned;
 
-#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread in memory. */
+#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
@@ -270,7 +270,8 @@ struct rte_mempool {
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objhdr));
+	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj,
+		sizeof(struct rte_mempool_objhdr));
 }
 
 /**
@@ -544,8 +545,9 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 /**
  * Create a new mempool named *name* in memory.
  *
- * This function uses ``memzone_reserve()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
+ * The pool contains n elements of elt_size. Its size is set to n.
+ * This function uses ``memzone_reserve()`` to allocate the mempool header
+ * (and the objects if vaddr is NULL).
  * Depending on the input parameters, mempool elements can be either allocated
  * together with the mempool header, or an externally provided memory buffer
  * could be used to store mempool objects. In later case, that external
@@ -560,18 +562,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  * @param elt_size
  *   The size of each element.
  * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
+ *   Size of the cache. See rte_mempool_create() for details.
  * @param private_data_size
  *   The size of the private data appended after the mempool
  *   structure. This is useful for storing some private data after the
@@ -585,35 +576,17 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   An opaque pointer to data that can be used in the mempool
  *   constructor function.
  * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
+ *   A function called for each object at initialization of the pool.
+ *   See rte_mempool_create() for details.
  * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
+ *   An opaque pointer passed to the object constructor function.
  * @param socket_id
  *   The *socket_id* argument is the socket identifier in the case of
  *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
  *   constraint for the reserved zone.
  * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
  * @param vaddr
  *   Virtual address of the externally allocated memory buffer.
  *   Will be used to store mempool objects.
@@ -626,13 +599,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   LOG2 of the physical pages size.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
  */
 struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-- 
2.1.4

* [RFC 02/35] mempool: replace elt_size by total_elt_size
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

In some mempool functions, we use the size of the elements as an
argument or in variables. There is confusion about whether the size
includes the header and trailer or not.

To avoid this confusion:
- update the API documentation
- rename the variables and arguments to "elt_size" when the size does
  not include the header and trailer, and to "total_elt_size" when it
  does.
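
For instance, the total size of an element can be derived from the
user-visible size with rte_mempool_calc_obj_size() (sketch):

  struct rte_mempool_objsz sz;
  uint32_t total_elt_sz;

  /* elt_size is what the user asked for; the returned value also
   * accounts for the object header and trailer. */
  total_elt_sz = rte_mempool_calc_obj_size(elt_size, flags, &sz);
  /* total_elt_sz == sz.header_size + sz.elt_size + sz.trailer_size */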

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 21 +++++++++++----------
 lib/librte_mempool/rte_mempool.h | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 6db02ee..25181d4 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -156,13 +156,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  *
  * Given the pointer to the memory, and its topology in physical memory
  * (the physical addresses table), iterate through the "elt_num" objects
- * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory. */
 uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
+rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
 	uint32_t i, j, k;
 	uint32_t pgn, pgf;
@@ -178,7 +178,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	while (i != elt_num && j != pg_num) {
 
 		start = RTE_ALIGN_CEIL(va, align);
-		end = start + elt_sz;
+		end = start + total_elt_sz;
 
 		/* index of the first page for the next element. */
 		pgf = (end >> pg_shift) - (start >> pg_shift);
@@ -255,6 +255,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 		mempool_obj_populate, &arg);
 }
 
+/* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 	struct rte_mempool_objsz *sz)
@@ -332,17 +333,17 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  * Calculate maximum amount of memory required to store given number of objects.
  */
 size_t
-rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
+rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
 	size_t n, pg_num, pg_sz, sz;
 
 	pg_sz = (size_t)1 << pg_shift;
 
-	if ((n = pg_sz / elt_sz) > 0) {
+	if ((n = pg_sz / total_elt_sz) > 0) {
 		pg_num = (elt_num + n - 1) / n;
 		sz = pg_num << pg_shift;
 	} else {
-		sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
+		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 	}
 
 	return sz;
@@ -362,7 +363,7 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
+rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	uint32_t n;
@@ -373,7 +374,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, elt_sz, 1,
+	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index bd78df5..ca4657f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1289,7 +1289,7 @@ struct rte_mempool *rte_mempool_lookup(const char *name);
  * calculates header, trailer, body and total sizes of the mempool object.
  *
  * @param elt_size
- *   The size of each element.
+ *   The size of each element, without header and trailer.
  * @param flags
  *   The flags used for the mempool creation.
  *   Consult rte_mempool_create() for more information about possible values.
@@ -1315,14 +1315,15 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param pg_shift
  *   LOG2 of the physical pages size.
  * @return
  *   Required memory size aligned at page boundary.
  */
-size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
+size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pg_shift);
 
 /**
@@ -1336,8 +1337,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   Will be used to store mempool objects.
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param paddr
  *   Array of physical addresses of the pages that comprises given memory
  *   buffer.
@@ -1351,8 +1353,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   buffer is too small, return a negative value whose absolute value
  *   is the actual number of elements that can be stored in that buffer.
  */
-ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
+ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift);
 
 /**
  * Walk list of all memory pools
-- 
2.1.4

* [RFC 03/35] mempool: uninline function to check cookies
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

There's no reason to keep this function inlined. Move it to
rte_mempool.c.

Note: we don't see it in the patch, but the #pragma ignoring
"-Wcast-qual" is still there in the C file.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 68 +++++++++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool.h | 77 ++--------------------------------------
 2 files changed, 71 insertions(+), 74 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 25181d4..8188442 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -709,6 +709,74 @@ struct mempool_audit_arg {
 	uint32_t obj_num;
 };
 
+/* check and update cookies or panic (internal) */
+void __mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free)
+{
+	struct rte_mempool_objhdr *hdr;
+	struct rte_mempool_objtlr *tlr;
+	uint64_t cookie;
+	void *tmp;
+	void *obj;
+	void **obj_table;
+
+	/* Force to drop the "const" attribute. This is done only when
+	 * DEBUG is enabled */
+	tmp = (void *) obj_table_const;
+	obj_table = (void **) tmp;
+
+	while (n--) {
+		obj = obj_table[n];
+
+		if (rte_mempool_from_obj(obj) != mp)
+			rte_panic("MEMPOOL: object is owned by another "
+				  "mempool\n");
+
+		hdr = __mempool_get_header(obj);
+		cookie = hdr->cookie;
+
+		if (free == 0) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (put)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
+		}
+		else if (free == 1) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (get)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
+		}
+		else if (free == 2) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
+			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (audit)\n");
+			}
+		}
+		tlr = __mempool_get_trailer(obj);
+		cookie = tlr->cookie;
+		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
+			rte_log_set_history(0);
+			RTE_LOG(CRIT, MEMPOOL,
+				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+				obj, (const void *) mp, cookie);
+			rte_panic("MEMPOOL: bad trailer cookie\n");
+		}
+	}
+}
+
 static void
 mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
 {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index ca4657f..6d98cdf 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -296,6 +296,7 @@ static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
 	return (struct rte_mempool_objtlr *)RTE_PTR_ADD(obj, mp->elt_size);
 }
 
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 /**
  * @internal Check and update cookies or panic.
  *
@@ -310,80 +311,8 @@ static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
  *   - 1: object is supposed to be free, mark it as allocated
  *   - 2: just check that cookie is valid (free or allocated)
  */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic ignored "-Wcast-qual"
-#endif
-static inline void __mempool_check_cookies(const struct rte_mempool *mp,
-					   void * const *obj_table_const,
-					   unsigned n, int free)
-{
-	struct rte_mempool_objhdr *hdr;
-	struct rte_mempool_objtlr *tlr;
-	uint64_t cookie;
-	void *tmp;
-	void *obj;
-	void **obj_table;
-
-	/* Force to drop the "const" attribute. This is done only when
-	 * DEBUG is enabled */
-	tmp = (void *) obj_table_const;
-	obj_table = (void **) tmp;
-
-	while (n--) {
-		obj = obj_table[n];
-
-		if (rte_mempool_from_obj(obj) != mp)
-			rte_panic("MEMPOOL: object is owned by another "
-				  "mempool\n");
-
-		hdr = __mempool_get_header(obj);
-		cookie = hdr->cookie;
-
-		if (free == 0) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (put)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
-		}
-		else if (free == 1) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (get)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
-		}
-		else if (free == 2) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
-			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (audit)\n");
-			}
-		}
-		tlr = __mempool_get_trailer(obj);
-		cookie = tlr->cookie;
-		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
-			rte_log_set_history(0);
-			RTE_LOG(CRIT, MEMPOOL,
-				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-				obj, (const void *) mp, cookie);
-			rte_panic("MEMPOOL: bad trailer cookie\n");
-		}
-	}
-}
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic error "-Wcast-qual"
-#endif
+void __mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free);
 #else
 #define __mempool_check_cookies(mp, obj_table_const, n, free) do {} while(0)
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
-- 
2.1.4

* [RFC 04/35] mempool: use sizeof to get the size of header and trailer
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

Since commits d2e0ca22f and 97e7e685b, the headers and trailers of
mempool objects are defined as structures. We can get their size
using sizeof instead of doing a calculation that would become wrong
at the first structure update.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 8188442..ce0470d 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -264,24 +264,13 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 
 	sz = (sz != NULL) ? sz : &lsz;
 
-	/*
-	 * In header, we have at least the pointer to the pool, and
-	 * optionaly a 64 bits cookie.
-	 */
-	sz->header_size = 0;
-	sz->header_size += sizeof(struct rte_mempool *); /* ptr to pool */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->header_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->header_size = sizeof(struct rte_mempool_objhdr);
 	if ((flags & MEMPOOL_F_NO_CACHE_ALIGN) == 0)
 		sz->header_size = RTE_ALIGN_CEIL(sz->header_size,
 			RTE_MEMPOOL_ALIGN);
 
-	/* trailer contains the cookie in debug mode */
-	sz->trailer_size = 0;
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->trailer_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->trailer_size = sizeof(struct rte_mempool_objtlr);
+
 	/* element size is 8 bytes-aligned at least */
 	sz->elt_size = RTE_ALIGN_CEIL(elt_size, sizeof(uint64_t));
 
-- 
2.1.4

* [RFC 05/35] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

In the next commits, we will add the ability to populate the
mempool and iterate through objects using the same function.
We will use the same callback type for that. As the callback is
no longer only a constructor, rename it to rte_mempool_obj_cb_t.

The rte_mempool_obj_iter_t that was used to iterate over objects
will be removed in the next commits.

No functional change.
In this commit, the API is preserved through a compat typedef.
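
For reference, a user callback matching the renamed type could look
like this (a sketch; rte_pktmbuf_init() in librte_mbuf is an existing
callback of this kind):

  /* Sketch: zero each object when the pool is populated. */
  static void
  my_obj_init(struct rte_mempool *mp, void *opaque, void *obj,
              unsigned obj_idx)
  {
          (void)opaque;
          (void)obj_idx;
          memset(obj, 0, mp->elt_size);
  }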

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/mempool_anon.c                |  4 ++--
 app/test-pmd/mempool_osdep.h               |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h      |  2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c |  4 ++--
 lib/librte_mempool/rte_dom0_mempool.c      |  2 +-
 lib/librte_mempool/rte_mempool.c           |  8 ++++----
 lib/librte_mempool/rte_mempool.h           | 27 ++++++++++++++-------------
 7 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
index 4730432..5e23848 100644
--- a/app/test-pmd/mempool_anon.c
+++ b/app/test-pmd/mempool_anon.c
@@ -86,7 +86,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp;
@@ -190,7 +190,7 @@ mempool_anon_create(__rte_unused const char *name,
 	__rte_unused unsigned private_data_size,
 	__rte_unused rte_mempool_ctor_t *mp_init,
 	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_ctor_t *obj_init,
+	__rte_unused rte_mempool_obj_cb_t *obj_init,
 	__rte_unused void *obj_init_arg,
 	__rte_unused int socket_id, __rte_unused unsigned flags)
 {
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
index 6b8df68..7ce7297 100644
--- a/app/test-pmd/mempool_osdep.h
+++ b/app/test-pmd/mempool_osdep.h
@@ -48,7 +48,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags);
 
 #endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.h b/drivers/net/xenvirt/rte_eth_xenvirt.h
index fc15a63..4995a9b 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.h
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.h
@@ -51,7 +51,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 
diff --git a/drivers/net/xenvirt/rte_mempool_gntalloc.c b/drivers/net/xenvirt/rte_mempool_gntalloc.c
index 7bfbfda..69b9231 100644
--- a/drivers/net/xenvirt/rte_mempool_gntalloc.c
+++ b/drivers/net/xenvirt/rte_mempool_gntalloc.c
@@ -78,7 +78,7 @@ static struct _mempool_gntalloc_info
 _create_mempool(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct _mempool_gntalloc_info mgi;
@@ -253,7 +253,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	int rv;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0d6d750..0051bd5 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -83,7 +83,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp = NULL;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ce0470d..83e7ed6 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -128,7 +128,7 @@ static unsigned optimize_object_size(unsigned obj_size)
 
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -224,7 +224,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 
 struct mempool_populate_arg {
 	struct rte_mempool     *mp;
-	rte_mempool_obj_ctor_t *obj_init;
+	rte_mempool_obj_cb_t   *obj_init;
 	void                   *obj_init_arg;
 };
 
@@ -239,7 +239,7 @@ mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
 
 static void
 mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	uint32_t elt_sz;
 	struct mempool_populate_arg arg;
@@ -429,7 +429,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 6d98cdf..da04021 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -318,6 +318,17 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
 
 /**
+ * An object callback function for mempool.
+ *
+ * Arguments are the mempool, the opaque pointer given by the user in
+ * rte_mempool_create(), the pointer to the element and the index of
+ * the element in the pool.
+ */
+typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
+		void *opaque, void *obj, unsigned obj_idx);
+typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
+
+/**
  * A mempool object iterator callback function.
  */
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
@@ -366,16 +377,6 @@ uint32_t rte_mempool_obj_iter(void *vaddr,
 	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
 
 /**
- * An object constructor callback function for mempool.
- *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
- */
-typedef void (rte_mempool_obj_ctor_t)(struct rte_mempool *, void *,
-				      void *, unsigned);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -468,7 +469,7 @@ struct rte_mempool *
 rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 /**
@@ -534,7 +535,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
@@ -623,7 +624,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags);
 
 
-- 
2.1.4

* [RFC 06/35] mempool: update library version
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

The next changes of this patch series are too heavy to keep a
compatibility layer, so bump the version number of the library.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/release_16_04.rst     | 2 +-
 lib/librte_mempool/Makefile                | 2 +-
 lib/librte_mempool/rte_mempool_version.map | 6 ++++++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 8273817..1ef8fa4 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -137,7 +137,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_kvargs.so.1
      librte_lpm.so.2
      librte_mbuf.so.2
-     librte_mempool.so.1
+   + librte_mempool.so.2
      librte_meter.so.1
      librte_pipeline.so.2
      librte_pmd_bond.so.1
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index a6898ef..706f844 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -38,7 +38,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 
 EXPORT_MAP := rte_mempool_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 17151e0..8c157d0 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -17,3 +17,9 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_16.07 {
+	global:
+
+	local: *;
+} DPDK_2.0;
-- 
2.1.4

* [RFC 07/35] mempool: list objects when added in the mempool
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

Introduce a list entry in the object header so objects can be listed
and browsed. The objective is to provide a simpler way to browse the
elements of a mempool.

The next commits will update rte_mempool_obj_iter() to use this list,
and remove the previous complex implementation.
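
With this list in place, browsing all objects of a pool boils down to
something like the sketch below (this is what the next commits
implement):

  struct rte_mempool_objhdr *hdr;
  void *obj;
  unsigned n = 0;

  STAILQ_FOREACH(hdr, &mp->elt_list, next) {
          obj = (char *)hdr + sizeof(*hdr); /* header sits just before obj */
          /* ... invoke a callback on obj ... */
          n++;
  }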

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c |  2 ++
 lib/librte_mempool/rte_mempool.h | 15 ++++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 83e7ed6..1fe102f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -138,6 +138,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -585,6 +586,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_size = cache_size;
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
+	STAILQ_INIT(&mp->elt_list);
 
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index da04021..469bcbc 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -150,11 +150,13 @@ struct rte_mempool_objsz {
  * Mempool object header structure
  *
  * Each object stored in mempools are prefixed by this header structure,
- * it allows to retrieve the mempool pointer from the object. When debug
- * is enabled, a cookie is also added in this structure preventing
- * corruptions and double-frees.
+ * it allows to retrieve the mempool pointer from the object and to
+ * iterate on all objects attached to a mempool. When debug is enabled,
+ * a cookie is also added in this structure preventing corruptions and
+ * double-frees.
  */
 struct rte_mempool_objhdr {
+	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
@@ -162,6 +164,11 @@ struct rte_mempool_objhdr {
 };
 
 /**
+ * A list of object headers type
+ */
+STAILQ_HEAD(rte_mempool_objhdr_list, rte_mempool_objhdr);
+
+/**
  * Mempool object trailer structure
  *
  * In debug mode, each object stored in mempools are suffixed by this
@@ -194,6 +201,8 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
-- 
2.1.4

* [RFC 08/35] mempool: remove const attribute in mempool_walk
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

Most functions that operate on a mempool require a non-const mempool
pointer, except the dump and the audit. Therefore, rte_mempool_walk()
is more useful if the mempool pointer passed to the callback is not
const.

This is required by the next commit, where the Mellanox drivers use
rte_mempool_walk() to iterate over the mempools, then
rte_mempool_obj_iter() to iterate over the objects in each mempool.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 2 +-
 lib/librte_mempool/rte_mempool.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1fe102f..237ba69 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -965,7 +965,7 @@ rte_mempool_lookup(const char *name)
 	return mp;
 }
 
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
 		      void *arg)
 {
 	struct rte_tailq_entry *te = NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 469bcbc..54a5917 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1304,7 +1304,7 @@ ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
  * @param arg
  *   Argument passed to iterator
  */
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *arg),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
 		      void *arg);
 
 #ifdef __cplusplus
-- 
2.1.4

* [RFC 09/35] mempool: use the list to iterate the mempool elements
From: Olivier Matz @ 2016-03-09 16:19 UTC
  To: dev

Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of the rte_mempool_obj_iter() API,
which no longer needs to take as many arguments as before. The previous
function is kept as a private function, and renamed in this commit. It
will be removed in a later commit of the patch series.

The only internal users of this function are the Mellanox drivers. The
code is updated accordingly.

Introducing API compatibility for this function has been considered,
but it is not easy to do without keeping the old code, as the previous
function could also be used to browse elements that were not added to a
mempool. Moreover, the API is already broken by other patches in this
release.
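
With the reworked API, a caller only passes the mempool, a callback and
an opaque pointer, and the return value is the number of objects
iterated. A sketch modeled on the mlx4/mlx5 update below:

  struct check_data { int ret; };

  static void
  check_cb(struct rte_mempool *mp, void *arg, void *obj,
           __rte_unused uint32_t idx)
  {
          struct check_data *data = arg;
          struct rte_mbuf *buf = obj;

          if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
                  data->ret = -1;
  }

  /* in the caller: */
  struct check_data data = { .ret = 0 };

  if (rte_mempool_obj_iter(mp, check_cb, &data) == 0 || data.ret == -1)
          return; /* empty pool, or not an mbuf pool */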

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c                    | 53 +++++++---------------
 drivers/net/mlx5/mlx5_rxtx.c               | 53 +++++++---------------
 drivers/net/mlx5/mlx5_rxtx.h               |  2 +-
 lib/librte_mempool/rte_mempool.c           | 36 ++++++++++++---
 lib/librte_mempool/rte_mempool.h           | 70 ++++++++----------------------
 lib/librte_mempool/rte_mempool_version.map |  3 +-
 6 files changed, 85 insertions(+), 132 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index ee00151..d9b2291 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1265,7 +1265,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -1273,34 +1272,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	__rte_unused uint32_t index)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -1314,28 +1305,16 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 static void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index fa5e648..f002ca2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -193,7 +193,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -201,34 +200,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	__rte_unused uint32_t index)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -242,28 +233,16 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index e1e1925..4b1b88c 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -264,7 +264,7 @@ void mlx5_tx_queue_release(void *);
 
 /* mlx5_rxtx.c */
 
-void txq_mp2mr_iter(const struct rte_mempool *, void *);
+void txq_mp2mr_iter(struct rte_mempool *, void *);
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst_sp(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 237ba69..0f7c41f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -126,6 +126,14 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+/**
+ * A mempool object iterator callback function.
+ */
+typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
+	void * /*obj_start*/,
+	void * /*obj_end*/,
+	uint32_t /*obj_index */);
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
@@ -160,8 +168,8 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory. */
-uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+static uint32_t
+rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
 	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
@@ -219,6 +227,24 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	return i;
 }
 
+/* call obj_cb() for each mempool element */
+uint32_t
+rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	struct rte_mempool_objhdr *hdr;
+	void *obj;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->elt_list, next) {
+		obj = (char *)hdr + sizeof(*hdr);
+		obj_cb(mp, obj_cb_arg, obj, n);
+		n++;
+	}
+
+	return n;
+}
+
 /*
  * Populate  mempool with the objects.
  */
@@ -250,7 +276,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 	arg.obj_init = obj_init;
 	arg.obj_init_arg = obj_init_arg;
 
-	mp->size = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_populate, &arg);
@@ -364,7 +390,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
+	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
@@ -792,7 +818,7 @@ mempool_audit_cookies(const struct rte_mempool *mp)
 	arg.obj_end = mp->elt_va_start;
 	arg.obj_num = 0;
 
-	num = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		mp->size, elt_sz, 1,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_audit, &arg);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 54a5917..ca37120 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -329,63 +329,13 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 /**
  * An object callback function for mempool.
  *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
+ * Used by rte_mempool_create() and rte_mempool_obj_iter().
  */
 typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 		void *opaque, void *obj, unsigned obj_idx);
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */);
-
-/**
- * Call a function for each mempool object in a memory chunk
- *
- * Iterate across objects of the given size and alignment in the
- * provided chunk of memory. The given memory buffer can consist of
- * disjointed physical pages.
- *
- * For each object, call the provided callback (if any). This function
- * is used to populate a mempool, or walk through all the elements of a
- * mempool, or estimate how many elements of the given size could be
- * created in the given memory buffer.
- *
- * @param vaddr
- *   Virtual address of the memory buffer.
- * @param elt_num
- *   Maximum number of objects to iterate through.
- * @param elt_sz
- *   Size of each object.
- * @param align
- *   Alignment of each object.
- * @param paddr
- *   Array of physical addresses of the pages that comprises given memory
- *   buffer.
- * @param pg_num
- *   Number of elements in the paddr array.
- * @param pg_shift
- *   LOG2 of the physical pages size.
- * @param obj_iter
- *   Object iterator callback function (could be NULL).
- * @param obj_iter_arg
- *   User defined parameter for the object iterator callback function.
- *
- * @return
- *   Number of objects iterated through.
- */
-uint32_t rte_mempool_obj_iter(void *vaddr,
-	uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -638,6 +588,24 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 
 
 /**
+ * Call a function for each mempool element
+ *
+ * Iterate across all objects attached to a rte_mempool and call the
+ * callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param obj_cb
+ *   A function pointer that is called for each object.
+ * @param obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of objects iterated.
+ */
+uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 8c157d0..4db75ca 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -9,7 +9,6 @@ DPDK_2.0 {
 	rte_mempool_dump;
 	rte_mempool_list_dump;
 	rte_mempool_lookup;
-	rte_mempool_obj_iter;
 	rte_mempool_walk;
 	rte_mempool_xmem_create;
 	rte_mempool_xmem_size;
@@ -21,5 +20,7 @@ DPDK_2.0 {
 DPDK_16.07 {
 	global:
 
+	rte_mempool_obj_iter;
+
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

* [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (8 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 09/35] mempool: use the list to iterate the mempool elements Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 18:53   ` Stephen Hemminger
  2016-03-09 16:19 ` [RFC 11/35] mempool: use the list to audit all elements Olivier Matz
                   ` (27 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

This macro removes the const attribute of a variable. It must be used
with care, and only in specific situations. Using this macro rather than
a manual cast is preferable, as it makes the developer's intention
explicit.

This macro is used in the next commit of the series.
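
For example (a sketch; the lookup name is hypothetical):

  const struct rte_mempool *mp = rte_mempool_lookup("my_pool");

  /* explicit, greppable const removal instead of a bare cast */
  struct rte_mempool *writable = RTE_DECONST(struct rte_mempool *, mp);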

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_common.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
index 332f2a4..dc0fc83 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -285,6 +285,15 @@ rte_align64pow2(uint64_t v)
 
 /*********** Other general functions / macros ********/
 
+/**
+ * Remove the const attribute of a variable
+ *
+ * This must be used with care in specific situations. It's better to
+ * use this macro instead of a manual cast, as it explicitly shows the
+ * intention of the developer.
+ */
+#define RTE_DECONST(type, var) ((type)(uintptr_t)(const void *)(var))
+
 #ifdef __SSE2__
 #include <emmintrin.h>
 /**
-- 
2.1.4

* [RFC 11/35] mempool: use the list to audit all elements
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (9 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 10/35] eal: introduce RTE_DECONST macro Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 12/35] mempool: use the list to initialize mempool objects Olivier Matz
                   ` (26 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Use the new rte_mempool_obj_iter() instead of the old
rte_mempool_obj_mem_iter() to iterate over the objects and audit them
(check for cookies).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 42 +++++++---------------------------------
 1 file changed, 7 insertions(+), 35 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0f7c41f..a9af2fc 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -720,12 +720,6 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
 
-struct mempool_audit_arg {
-	const struct rte_mempool *mp;
-	uintptr_t obj_end;
-	uint32_t obj_num;
-};
-
 /* check and update cookies or panic (internal) */
 void __mempool_check_cookies(const struct rte_mempool *mp,
 	void * const *obj_table_const, unsigned n, int free)
@@ -795,45 +789,23 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 }
 
 static void
-mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_audit(struct rte_mempool *mp, __rte_unused void *opaque,
+	void *obj, __rte_unused unsigned idx)
 {
-	struct mempool_audit_arg *pa = arg;
-	void *obj;
-
-	obj = (char *)start + pa->mp->header_size;
-	pa->obj_end = (uintptr_t)end;
-	pa->obj_num = idx + 1;
-	__mempool_check_cookies(pa->mp, &obj, 1, 2);
+	__mempool_check_cookies(mp, &obj, 1, 2);
 }
 
 static void
 mempool_audit_cookies(const struct rte_mempool *mp)
 {
-	uint32_t elt_sz, num;
-	struct mempool_audit_arg arg;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	arg.mp = mp;
-	arg.obj_end = mp->elt_va_start;
-	arg.obj_num = 0;
-
-	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		mp->size, elt_sz, 1,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_audit, &arg);
+	unsigned num;
 
+	num = rte_mempool_obj_iter(RTE_DECONST(void *, mp),
+		mempool_obj_audit, NULL);
 	if (num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
+		rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
 			"iterated only over %u elements\n",
 			mp, mp->size, num);
-	} else if (arg.obj_end != mp->elt_va_end || arg.obj_num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
-			"last callback va_end: %#tx (%#tx expeceted), "
-			"num of objects: %u (%u expected)\n",
-			mp, mp->size,
-			arg.obj_end, mp->elt_va_end,
-			arg.obj_num, mp->size);
 	}
 }
 
-- 
2.1.4

* [RFC 12/35] mempool: use the list to initialize mempool objects
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (10 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 11/35] mempool: use the list to audit all elements Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 13/35] mempool: create the internal ring in a specific function Olivier Matz
                   ` (25 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Before this patch, the mempool elements were initialized at the time
they were added to the mempool. This patch moves the initialization of
all objects to after the mempool is populated, using the
rte_mempool_obj_iter() function introduced in a previous commit.

Thanks to this modification, we are getting closer to a new API
that would allow us to do:
  mempool_init()
  mempool_populate(mem1)
  mempool_populate(mem2)
  mempool_populate(mem3)
  mempool_init_obj()

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index a9af2fc..4145e2e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -135,8 +135,7 @@ typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	uint32_t /*obj_index */);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_add_elem(struct rte_mempool *mp, void *obj)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -153,9 +152,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	tlr = __mempool_get_trailer(obj);
 	tlr->cookie = RTE_MEMPOOL_TRAILER_COOKIE;
 #endif
-	/* call the initializer */
-	if (obj_init)
-		obj_init(mp, obj_init_arg, obj, obj_idx);
 
 	/* enqueue in ring */
 	rte_ring_sp_enqueue(mp->ring, obj);
@@ -249,37 +245,27 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
  * Populate  mempool with the objects.
  */
 
-struct mempool_populate_arg {
-	struct rte_mempool     *mp;
-	rte_mempool_obj_cb_t   *obj_init;
-	void                   *obj_init_arg;
-};
-
 static void
-mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_populate(void *arg, void *start, void *end,
+	__rte_unused uint32_t idx)
 {
-	struct mempool_populate_arg *pa = arg;
+	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(pa->mp, start, idx, pa->obj_init, pa->obj_init_arg);
-	pa->mp->elt_va_end = (uintptr_t)end;
+	mempool_add_elem(mp, start);
+	mp->elt_va_end = (uintptr_t)end;
 }
 
 static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
 {
 	uint32_t elt_sz;
-	struct mempool_populate_arg arg;
 
 	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-	arg.mp = mp;
-	arg.obj_init = obj_init;
-	arg.obj_init_arg = obj_init_arg;
 
 	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, &arg);
+		mempool_obj_populate, mp);
 }
 
 /* get the header, trailer and total size of a mempool element. */
@@ -648,7 +634,11 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	mempool_populate(mp, n, 1, obj_init, obj_init_arg);
+	mempool_populate(mp, n, 1);
+
+	/* call the initializer */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	te->data = (void *) mp;
 
-- 
2.1.4

* [RFC 13/35] mempool: create the internal ring in a specific function
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (11 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 12/35] mempool: use the list to initialize mempool objects Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 14/35] mempool: store physaddr in mempool objects Olivier Matz
                   ` (24 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

This makes the code of rte_mempool_create() clearer, and it will ease
the introduction of an external mempool handler (in another patch
series): the new function isolates the ring-specific part, which could
be replaced by something else in the future.

This commit also adds a socket_id field to the mempool structure, which
is used by this new function.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 55 +++++++++++++++++++++++++---------------
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4145e2e..d533484 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -431,6 +431,35 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 					       MEMPOOL_PG_SHIFT_MAX);
 }
 
+/* create the internal ring */
+static int
+rte_mempool_ring_create(struct rte_mempool *mp)
+{
+	int rg_flags = 0;
+	char rg_name[RTE_RING_NAMESIZE];
+	struct rte_ring *r;
+
+	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+
+	/* ring flags */
+	if (mp->flags & MEMPOOL_F_SP_PUT)
+		rg_flags |= RING_F_SP_ENQ;
+	if (mp->flags & MEMPOOL_F_SC_GET)
+		rg_flags |= RING_F_SC_DEQ;
+
+	/* Allocate the ring that will be used to store objects.
+	 * Ring functions will return appropriate errors if we are
+	 * running as a secondary process etc., so no checks made
+	 * in this function for that condition. */
+	r = rte_ring_create(rg_name, rte_align32pow2(mp->size + 1),
+		mp->socket_id, rg_flags);
+	if (r == NULL)
+		return -rte_errno;
+
+	mp->ring = r;
+	return 0;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -447,15 +476,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
-	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
 	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te = NULL;
-	struct rte_ring *r = NULL;
 	const struct rte_memzone *mz;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	int rg_flags = 0;
 	void *obj;
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
@@ -498,12 +524,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
 
-	/* ring flags */
-	if (flags & MEMPOOL_F_SP_PUT)
-		rg_flags |= RING_F_SP_ENQ;
-	if (flags & MEMPOOL_F_SC_GET)
-		rg_flags |= RING_F_SC_DEQ;
-
 	/* calculate mempool object sizes. */
 	if (!rte_mempool_calc_obj_size(elt_size, flags, &objsz)) {
 		rte_errno = EINVAL;
@@ -512,15 +532,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 	rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK);
 
-	/* allocate the ring that will be used to store objects */
-	/* Ring functions will return appropriate errors if we are
-	 * running as a secondary process etc., so no checks made
-	 * in this function for that condition */
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, name);
-	r = rte_ring_create(rg_name, rte_align32pow2(n+1), socket_id, rg_flags);
-	if (r == NULL)
-		goto exit_unlock;
-
 	/*
 	 * reserve a memory zone for this mempool: private data is
 	 * cache-aligned
@@ -589,7 +600,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
-	mp->ring = r;
+	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
@@ -600,6 +611,9 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
 
+	if (rte_mempool_ring_create(mp) < 0)
+		goto exit_unlock;
+
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
@@ -651,7 +665,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	rte_ring_free(r);
+	if (mp != NULL)
+		rte_ring_free(mp->ring);
 	rte_free(te);
 
 	return NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index ca37120..5b760f0 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -188,6 +188,7 @@ struct rte_mempool {
 	struct rte_ring *ring;           /**< Ring to store objects. */
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
+	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
-- 
2.1.4

* [RFC 14/35] mempool: store physaddr in mempool objects
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (12 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 13/35] mempool: create the internal ring in a specific function Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 15/35] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
                   ` (23 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Store the physical address of each object in its header. This
simplifies rte_mempool_virt2phy() and prepares for the removal of the
paddr[] table from the mempool header.
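
As a sketch of the effect, getting the physical address of an object now
reduces to reading its header (rte_mempool_get()/rte_mempool_put() are
the standard get/put API; error handling omitted):

  void *obj;
  phys_addr_t pa;

  if (rte_mempool_get(mp, &obj) == 0) {
          /* reads hdr->physaddr: no more page-table arithmetic */
          pa = rte_mempool_virt2phy(mp, obj);
          rte_mempool_put(mp, obj);
  }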

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
 lib/librte_mempool/rte_mempool.h | 10 ++++++----
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index d533484..7aedc89 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -132,19 +132,22 @@ static unsigned optimize_object_size(unsigned obj_size)
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	void * /*obj_start*/,
 	void * /*obj_end*/,
-	uint32_t /*obj_index */);
+	uint32_t /*obj_index */,
+	phys_addr_t /*physaddr*/);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj)
+mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
 	obj = (char *)obj + mp->header_size;
+	physaddr += mp->header_size;
 
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
@@ -173,6 +176,7 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pgn, pgf;
 	uintptr_t end, start, va;
 	uintptr_t pg_sz;
+	phys_addr_t physaddr;
 
 	pg_sz = (uintptr_t)1 << pg_shift;
 	va = (uintptr_t)vaddr;
@@ -208,9 +212,10 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 		 * otherwise, just skip that chunk unused.
 		 */
 		if (k == pgn) {
+			physaddr = paddr[k] + (start & (pg_sz - 1));
 			if (obj_iter != NULL)
 				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i);
+					(void *)end, i, physaddr);
 			va = end;
 			j += pgf;
 			i++;
@@ -247,11 +252,11 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 
 static void
 mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, phys_addr_t physaddr)
 {
 	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(mp, start);
+	mempool_add_elem(mp, start, physaddr);
 	mp->elt_va_end = (uintptr_t)end;
 }
 
@@ -355,7 +360,7 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
  * argument to the end of the object. */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
 {
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 5b760f0..f32d705 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -158,6 +158,7 @@ struct rte_mempool_objsz {
 struct rte_mempool_objhdr {
 	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
+	phys_addr_t physaddr;            /**< Physical address of the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
 #endif
@@ -1125,13 +1126,14 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   The physical address of the elt element.
  */
 static inline phys_addr_t
-rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
+rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
 	if (rte_eal_has_hugepages()) {
-		uintptr_t off;
+		const struct rte_mempool_objhdr *hdr;
 
-		off = (const char *)elt - (const char *)mp->elt_va_start;
-		return mp->elt_pa[off >> mp->pg_shift] + (off & mp->pg_mask);
+		hdr = (const struct rte_mempool_objhdr *)
+			((const char *)elt - sizeof(*hdr));
+		return hdr->physaddr;
 	} else {
 		/*
 		 * If huge pages are disabled, we cannot assume the
-- 
2.1.4

* [RFC 15/35] mempool: remove MEMPOOL_IS_CONTIG()
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (13 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 14/35] mempool: store physaddr in mempool objects Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 16/35] mempool: store memory chunks in a list Olivier Matz
                   ` (22 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

The next commits will change the behavior of the mempool library so
that the objects will never be allocated in the same memzone as the
mempool header. Therefore, there is no reason to keep this macro, which
would always return 0.

This macro was only used in app/test.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          | 7 +++----
 lib/librte_mempool/rte_mempool.h | 7 -------
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 10e1fa4..1503bcf 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -126,12 +126,11 @@ test_mempool_basic(void)
 			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
 		return -1;
 
+#ifndef RTE_EXEC_ENV_BSD /* rte_mem_virt2phy() not supported on bsd */
 	printf("get physical address of an object\n");
-	if (MEMPOOL_IS_CONTIG(mp) &&
-			rte_mempool_virt2phy(mp, obj) !=
-			(phys_addr_t) (mp->phys_addr +
-			(phys_addr_t) ((char*) obj - (char*) mp)))
+	if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
 		return -1;
+#endif
 
 	printf("put the object back\n");
 	rte_mempool_put(mp, obj);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index f32d705..3bfdf4d 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -271,13 +271,6 @@ struct rte_mempool {
 	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
-/**
- * Return true if the whole mempool is in contiguous memory.
- */
-#define	MEMPOOL_IS_CONTIG(mp)                      \
-	((mp)->pg_num == MEMPOOL_PG_NUM_DEFAULT && \
-	(mp)->phys_addr == (mp)->elt_pa[0])
-
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-- 
2.1.4

* [RFC 16/35] mempool: store memory chunks in a list
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (14 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 15/35] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 17/35] mempool: new function to iterate the memory chunks Olivier Matz
                   ` (21 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Do not use the paddr table to store the mempool memory chunks: store
them in a list instead. This will allow a mempool to have several chunks
with different virtual addresses.
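
Each chunk is described by a struct rte_mempool_memhdr chained in
mp->mem_list (see the header changes below). For instance, the total
memory attached to a mempool can be computed by walking the list, as
rte_mempool_dump() does after this patch:

  struct rte_mempool_memhdr *memhdr;
  size_t mem_len = 0;

  STAILQ_FOREACH(memhdr, &mp->mem_list, next)
          mem_len += memhdr->len;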

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          |   2 +-
 lib/librte_mempool/rte_mempool.c | 205 ++++++++++++++++++++++++++-------------
 lib/librte_mempool/rte_mempool.h |  51 +++++-----
 3 files changed, 165 insertions(+), 93 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 1503bcf..80d95d5 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -123,7 +123,7 @@ test_mempool_basic(void)
 
 	printf("get private data\n");
 	if (rte_mempool_get_priv(mp) != (char *)mp +
-			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
+			MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
 		return -1;
 
 #ifndef RTE_EXEC_ENV_BSD /* rte_mem_virt2phy() not supported on bsd */
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7aedc89..ff84f81 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -141,14 +141,12 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
-	obj = (char *)obj + mp->header_size;
-	physaddr += mp->header_size;
-
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
 	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
+	mp->populated_size++;
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -246,33 +244,6 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
-/*
- * Populate  mempool with the objects.
- */
-
-static void
-mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx, phys_addr_t physaddr)
-{
-	struct rte_mempool *mp = arg;
-
-	mempool_add_elem(mp, start, physaddr);
-	mp->elt_va_end = (uintptr_t)end;
-}
-
-static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
-{
-	uint32_t elt_sz;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		num, elt_sz, align,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, mp);
-}
-
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
@@ -465,6 +436,108 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* Free memory chunks used by a mempool. Objects must be in pool */
+static void
+rte_mempool_free_memchunks(struct rte_mempool *mp)
+{
+	struct rte_mempool_memhdr *memhdr;
+	void *elt;
+
+	while (!STAILQ_EMPTY(&mp->elt_list)) {
+		rte_ring_sc_dequeue(mp->ring, &elt);
+		(void)elt;
+		STAILQ_REMOVE_HEAD(&mp->elt_list, next);
+		mp->populated_size--;
+	}
+
+	while (!STAILQ_EMPTY(&mp->mem_list)) {
+		memhdr = STAILQ_FIRST(&mp->mem_list);
+		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		rte_free(memhdr);
+		mp->nb_mem_chunks--;
+	}
+}
+
+/* Add objects in the pool, using a physically contiguous memory
+ * zone. Return the number of objects added, or a negative value
+ * on error. */
+static int
+rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len)
+{
+	unsigned total_elt_sz;
+	unsigned i = 0;
+	size_t off;
+	struct rte_mempool_memhdr *memhdr;
+
+	/* mempool is already populated */
+	if (mp->populated_size >= mp->size)
+		return -ENOSPC;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
+	if (memhdr == NULL)
+		return -ENOMEM;
+
+	memhdr->mp = mp;
+	memhdr->addr = vaddr;
+	memhdr->phys_addr = paddr;
+	memhdr->len = len;
+
+	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
+		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
+	else
+		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
+
+	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
+		off += mp->header_size;
+		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		off += mp->elt_size + mp->trailer_size;
+		i++;
+	}
+
+	/* not enough room to store one object */
+	if (i == 0)
+		return -EINVAL;
+
+	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
+	mp->nb_mem_chunks++;
+	return i;
+}
+
+/* Add objects in the pool, using a table of physical pages. Return the
+ * number of objects added, or a negative value on error. */
+static int
+rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	uint32_t i, n;
+	int ret, cnt = 0;
+	size_t pg_sz = (size_t)1 << pg_shift;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
+
+		/* populate with the largest group of contiguous pages */
+		for (n = 1; (i + n) < pg_num &&
+			     paddr[i] + pg_sz == paddr[i+n]; n++)
+			;
+
+		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
+			paddr[i], n * pg_sz);
+		if (ret < 0) {
+			rte_mempool_free_memchunks(mp);
+			return ret;
+		}
+		cnt += ret;
+	}
+	return cnt;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -491,6 +564,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
 	int page_size = getpagesize();
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -520,7 +594,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	}
 
 	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num < RTE_DIM(mp->elt_pa) || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
@@ -567,7 +641,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * store mempool objects. Otherwise reserve a memzone that is large
 	 * enough to hold mempool header and metadata plus mempool objects.
 	 */
-	mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size);
+	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 	if (vaddr == NULL)
@@ -615,6 +689,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
+	STAILQ_INIT(&mp->mem_list);
 
 	if (rte_mempool_ring_create(mp) < 0)
 		goto exit_unlock;
@@ -624,37 +699,31 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * The local_cache points to just past the elt_pa[] array.
 	 */
 	mp->local_cache = (struct rte_mempool_cache *)
-			((char *)mp + MEMPOOL_HEADER_SIZE(mp, pg_num, 0));
+			((char *)mp + MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* calculate address of the first element for continuous mempool. */
-	obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size) +
-		private_data_size;
-	obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
-
-	/* populate address translation fields. */
-	mp->pg_num = pg_num;
-	mp->pg_shift = pg_shift;
-	mp->pg_mask = RTE_LEN2MASK(mp->pg_shift, typeof(mp->pg_mask));
+	/* call the initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
 	if (vaddr == NULL) {
-		mp->elt_va_start = (uintptr_t)obj;
-		mp->elt_pa[0] = mp->phys_addr +
-			(mp->elt_va_start - (uintptr_t)mp);
+		/* calculate address of the first element for continuous mempool. */
+		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
+			private_data_size;
+		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
+
+		ret = rte_mempool_populate_phys(mp, obj,
+			mp->phys_addr + ((char *)obj - (char *)mp),
+			objsz.total_size * n);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	} else {
-		/* mempool elements in a separate chunk of memory. */
-		mp->elt_va_start = (uintptr_t)vaddr;
-		memcpy(mp->elt_pa, paddr, sizeof (mp->elt_pa[0]) * pg_num);
+		ret = rte_mempool_populate_phys_tab(mp, vaddr,
+			paddr, pg_num, pg_shift);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	}
 
-	mp->elt_va_end = mp->elt_va_start;
-
-	/* call the initializer */
-	if (mp_init)
-		mp_init(mp, mp_init_arg);
-
-	mempool_populate(mp, n, 1);
-
 	/* call the initializer */
 	if (obj_init)
 		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
@@ -670,8 +739,10 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL)
+	if (mp != NULL) {
+		rte_mempool_free_memchunks(mp);
 		rte_ring_free(mp->ring);
+	}
 	rte_free(te);
 
 	return NULL;
@@ -864,8 +935,10 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 	struct rte_mempool_debug_stats sum;
 	unsigned lcore_id;
 #endif
+	struct rte_mempool_memhdr *memhdr;
 	unsigned common_count;
 	unsigned cache_count;
+	size_t mem_len = 0;
 
 	RTE_VERIFY(f != NULL);
 	RTE_VERIFY(mp != NULL);
@@ -874,7 +947,9 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
 	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
+	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
 	fprintf(f, "  header_size=%"PRIu32"\n", mp->header_size);
 	fprintf(f, "  elt_size=%"PRIu32"\n", mp->elt_size);
 	fprintf(f, "  trailer_size=%"PRIu32"\n", mp->trailer_size);
@@ -882,17 +957,13 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 	       mp->header_size + mp->elt_size + mp->trailer_size);
 
 	fprintf(f, "  private_data_size=%"PRIu32"\n", mp->private_data_size);
-	fprintf(f, "  pg_num=%"PRIu32"\n", mp->pg_num);
-	fprintf(f, "  pg_shift=%"PRIu32"\n", mp->pg_shift);
-	fprintf(f, "  pg_mask=%#tx\n", mp->pg_mask);
-	fprintf(f, "  elt_va_start=%#tx\n", mp->elt_va_start);
-	fprintf(f, "  elt_va_end=%#tx\n", mp->elt_va_end);
-	fprintf(f, "  elt_pa[0]=0x%" PRIx64 "\n", mp->elt_pa[0]);
-
-	if (mp->size != 0)
+
+	STAILQ_FOREACH(memhdr, &mp->mem_list, next)
+		mem_len += memhdr->len;
+	if (mem_len != 0) {
 		fprintf(f, "  avg bytes/object=%#Lf\n",
-			(long double)(mp->elt_va_end - mp->elt_va_start) /
-			mp->size);
+			(long double)mem_len / mp->size);
+	}
 
 	cache_count = rte_mempool_dump_cache(f, mp);
 	common_count = rte_ring_count(mp->ring);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3bfdf4d..08bfe05 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -182,6 +182,25 @@ struct rte_mempool_objtlr {
 };
 
 /**
+ * A list of memory where objects are stored
+ */
+STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
+
+/**
+ * Mempool objects memory header structure
+ *
+ * The memory chunks where objects are stored. Each chunk is virtually
+ * and physically contiguous.
+ */
+struct rte_mempool_memhdr {
+	STAILQ_ENTRY(rte_mempool_memhdr) next; /**< Next in list. */
+	struct rte_mempool *mp;  /**< The mempool owning the chunk */
+	void *addr;              /**< Virtual address of the chunk */
+	phys_addr_t phys_addr;   /**< Physical address of the chunk */
+	size_t len;              /**< length of the chunk */
+};
+
+/**
  * The RTE mempool structure.
  */
 struct rte_mempool {
@@ -190,7 +209,7 @@ struct rte_mempool {
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
-	uint32_t size;                   /**< Size of the mempool. */
+	uint32_t size;                   /**< Max size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
 	/**< Threshold before we flush excess elements. */
@@ -203,26 +222,15 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	uint32_t populated_size;         /**< Number of populated objects. */
 	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+	uint32_t nb_mem_chunks;          /**< Number of memory chunks */
+	struct rte_mempool_memhdr_list mem_list; /**< List of memory chunks */
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
 #endif
-
-	/* Address translation support, starts from next cache line. */
-
-	/** Number of elements in the elt_pa array. */
-	uint32_t    pg_num __rte_cache_aligned;
-	uint32_t    pg_shift;     /**< LOG2 of the physical pages. */
-	uintptr_t   pg_mask;      /**< physical page mask value. */
-	uintptr_t   elt_va_start;
-	/**< Virtual address of the first mempool object. */
-	uintptr_t   elt_va_end;
-	/**< Virtual address of the <size + 1> mempool object. */
-	phys_addr_t elt_pa[MEMPOOL_PG_NUM_DEFAULT];
-	/**< Array of physical page addresses for the mempool objects buffer. */
-
 }  __rte_cache_aligned;
 
 #define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
@@ -253,13 +261,6 @@ struct rte_mempool {
 #endif
 
 /**
- * Size of elt_pa array size based on number of pages. (Internal use)
- */
-#define __PA_SIZE(mp, pgn) \
-	RTE_ALIGN_CEIL((((pgn) - RTE_DIM((mp)->elt_pa)) * \
-	sizeof((mp)->elt_pa[0])), RTE_CACHE_LINE_SIZE)
-
-/**
  * Calculate the size of the mempool header.
  *
  * @param mp
@@ -267,8 +268,8 @@ struct rte_mempool {
  * @param pgn
  *   Number of pages used to store mempool objects.
  */
-#define MEMPOOL_HEADER_SIZE(mp, pgn, cs) \
-	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
+#define MEMPOOL_HEADER_SIZE(mp, cs) \
+	(sizeof(*(mp)) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
 /* return the header of a mempool object (internal) */
@@ -1160,7 +1161,7 @@ void rte_mempool_audit(const struct rte_mempool *mp);
 static inline void *rte_mempool_get_priv(struct rte_mempool *mp)
 {
 	return (char *)mp +
-		MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size);
+		MEMPOOL_HEADER_SIZE(mp, mp->cache_size);
 }
 
 /**
-- 
2.1.4

* [RFC 17/35] mempool: new function to iterate the memory chunks
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (15 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 16/35] mempool: store memory chunks in a list Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 18/35] mempool: simplify xmem_usage Olivier Matz
                   ` (20 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Following the same model as rte_mempool_obj_iter(), introduce
rte_mempool_mem_iter() to iterate over the memory chunks attached to the
mempool.
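
A minimal usage sketch (chunk_dump_cb is a hypothetical callback
following the rte_mempool_mem_cb_t prototype introduced below):

  static void
  chunk_dump_cb(struct rte_mempool *mp, void *opaque, void *mem,
          unsigned mem_idx)
  {
          FILE *f = opaque;

          fprintf(f, "%s: chunk %u at %p\n", mp->name, mem_idx, mem);
  }

  uint32_t n = rte_mempool_mem_iter(mp, chunk_dump_cb, stdout);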

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           | 18 ++++++++++++++++++
 lib/librte_mempool/rte_mempool.h           | 26 ++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |  1 +
 3 files changed, 45 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ff84f81..0220fa3 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -244,6 +244,24 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
+/* call mem_cb() for each mempool memory chunk */
+uint32_t
+rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg)
+{
+	struct rte_mempool_memhdr *hdr;
+	void *mem;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->mem_list, next) {
+		mem = (char *)hdr + sizeof(*hdr);
+		mem_cb(mp, mem_cb_arg, mem, n);
+		n++;
+	}
+
+	return n;
+}
+
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 08bfe05..184d40d 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -332,6 +332,14 @@ typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
+ * A memory callback function for mempool.
+ *
+ * Used by rte_mempool_mem_iter().
+ */
+typedef void (rte_mempool_mem_cb_t)(struct rte_mempool *mp,
+		void *opaque, void *mem, unsigned mem_idx);
+
+/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -602,6 +610,24 @@ uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
 	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
 
 /**
+ * Call a function for each mempool memory chunk
+ *
+ * Iterate across all memory chunks attached to a rte_mempool and call
+ * the callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param mem_cb
+ *   A function pointer that is called for each memory chunk.
+ * @param mem_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of memory chunks iterated.
+ */
+uint32_t rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 4db75ca..ca887b5 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -21,6 +21,7 @@ DPDK_16.07 {
 	global:
 
 	rte_mempool_obj_iter;
+	rte_mempool_mem_iter;
 
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

* [RFC 18/35] mempool: simplify xmem_usage
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (16 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 17/35] mempool: new function to iterate the memory chunks Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 19/35] mempool: introduce a free callback for memory chunks Olivier Matz
                   ` (19 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Since the previous commit, the function rte_mempool_xmem_usage() is the
last user of rte_mempool_obj_mem_iter(). This complex code can now be
moved inside rte_mempool_xmem_usage(): we can get rid of the callback
and simplify the code to make it more readable.
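
For example, with the reworked code, the footprint of elt_num objects in
physically contiguous memory can be estimated by passing a NULL paddr
table (a hypothetical call; total_elt_sz, pg_num and pg_shift describe
the target memory):

  /* returns the needed size in bytes, or -n if only n objects fit */
  ssize_t usz = rte_mempool_xmem_usage(NULL, elt_num, total_elt_sz,
          NULL, pg_num, pg_shift);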

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 138 +++++++++++----------------------------
 1 file changed, 37 insertions(+), 101 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0220fa3..905387f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -126,15 +126,6 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
-/**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */,
-	phys_addr_t /*physaddr*/);
-
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
@@ -158,74 +149,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
-/* Iterate through objects at the given address
- *
- * Given the pointer to the memory, and its topology in physical memory
- * (the physical addresses table), iterate through the "elt_num" objects
- * of size "elt_sz" aligned at "align". For each object in this memory
- * chunk, invoke a callback. It returns the effective number of objects
- * in this memory. */
-static uint32_t
-rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
-	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
-{
-	uint32_t i, j, k;
-	uint32_t pgn, pgf;
-	uintptr_t end, start, va;
-	uintptr_t pg_sz;
-	phys_addr_t physaddr;
-
-	pg_sz = (uintptr_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-
-	i = 0;
-	j = 0;
-
-	while (i != elt_num && j != pg_num) {
-
-		start = RTE_ALIGN_CEIL(va, align);
-		end = start + total_elt_sz;
-
-		/* index of the first page for the next element. */
-		pgf = (end >> pg_shift) - (start >> pg_shift);
-
-		/* index of the last page for the current element. */
-		pgn = ((end - 1) >> pg_shift) - (start >> pg_shift);
-		pgn += j;
-
-		/* do we have enough space left for the element. */
-		if (pgn >= pg_num)
-			break;
-
-		for (k = j;
-				k != pgn &&
-				paddr[k] + pg_sz == paddr[k + 1];
-				k++)
-			;
-
-		/*
-		 * if next pgn chunks of memory physically continuous,
-		 * use it to create next element.
-		 * otherwise, just skip that chunk unused.
-		 */
-		if (k == pgn) {
-			physaddr = paddr[k] + (start & (pg_sz - 1));
-			if (obj_iter != NULL)
-				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i, physaddr);
-			va = end;
-			j += pgf;
-			i++;
-		} else {
-			va = RTE_ALIGN_CEIL((va + 1), pg_sz);
-			j++;
-		}
-	}
-
-	return i;
-}
-
 /* call obj_cb() for each mempool element */
 uint32_t
 rte_mempool_obj_iter(struct rte_mempool *mp,
@@ -345,40 +268,53 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
- * argument to the end of the object. */
-static void
-mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
-{
-	*(uintptr_t *)arg = (uintptr_t)end;
-}
-
 /*
  * Calculate how much memory would be actually required with the
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift)
 {
-	uint32_t n;
-	uintptr_t va, uv;
-	size_t pg_sz, usz;
+	uint32_t elt_cnt = 0;
+	phys_addr_t start, end;
+	uint32_t paddr_idx;
+	size_t pg_sz = (size_t)1 << pg_shift;
 
-	pg_sz = (size_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-	uv = va;
+	/* if paddr is NULL, assume contiguous memory */
+	if (paddr == NULL) {
+		start = 0;
+		end = pg_sz * pg_num;
+		paddr_idx = pg_num;
+	} else {
+		start = paddr[0];
+		end = paddr[0] + pg_sz;
+		paddr_idx = 1;
+	}
+	while (elt_cnt < elt_num) {
+
+		if (end - start >= total_elt_sz) {
+			/* enough contiguous memory, add an object */
+			start += total_elt_sz;
+			elt_cnt++;
+		} else if (paddr_idx < pg_num) {
+			/* no room to store one obj, add a page */
+			if (end == paddr[paddr_idx]) {
+				end += pg_sz;
+			} else {
+				start = paddr[paddr_idx];
+				end = paddr[paddr_idx] + pg_sz;
+			}
+			paddr_idx++;
 
-	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
-			paddr, pg_num, pg_shift, mempool_lelem_iter,
-			&uv)) != elt_num) {
-		return -(ssize_t)n;
+		} else {
+			/* no more pages, return how many elements fit */
+			return -(ssize_t)elt_cnt;
+		}
 	}
 
-	uv = RTE_ALIGN_CEIL(uv, pg_sz);
-	usz = uv - va;
-	return usz;
+	return (size_t)paddr_idx << pg_shift;
 }
 
 #ifndef RTE_LIBRTE_XEN_DOM0
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 19/35] mempool: introduce a free callback for memory chunks
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (17 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 18/35] mempool: simplify xmem_usage Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 20/35] mempool: make page size optional when getting xmem size Olivier Matz
                   ` (18 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Introduce a free callback that is passed to the populate* functions,
and that is used when freeing a mempool. The callback is unused for
now, but the next commits will populate the mempool with several chunks
of memory, and we will need a way to free them properly on error.

Later in the series, we will also introduce a public rte_mempool_free()
and the ability for users to populate a mempool with their own memory.
For that, we also need a free callback.
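
For illustration, a user-provided callback matching the new
rte_mempool_memchunk_free_cb_t type could look like the sketch below;
the helper name is hypothetical and only the memhdr fields added by
this patch are relied on:

  /* hypothetical callback: release an anonymous mapping that was
   * used as a mempool memory chunk (requires <sys/mman.h>) */
  static void
  memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
          void *opaque __rte_unused)
  {
          munmap(memhdr->addr, memhdr->len);
  }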

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 27 ++++++++++++++++++++++-----
 lib/librte_mempool/rte_mempool.h |  8 ++++++++
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 905387f..5bfe4cb 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -390,6 +390,15 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* free a memchunk allocated with rte_memzone_reserve() */
+__rte_unused static void
+rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	const struct rte_memzone *mz = opaque;
+	rte_memzone_free(mz);
+}
+
 /* Free memory chunks used by a mempool. Objects must be in pool */
 static void
 rte_mempool_free_memchunks(struct rte_mempool *mp)
@@ -407,6 +416,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
 	while (!STAILQ_EMPTY(&mp->mem_list)) {
 		memhdr = STAILQ_FIRST(&mp->mem_list);
 		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		if (memhdr->free_cb != NULL)
+			memhdr->free_cb(memhdr, memhdr->opaque);
 		rte_free(memhdr);
 		mp->nb_mem_chunks--;
 	}
@@ -417,7 +428,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
  * on error. */
 static int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
-	phys_addr_t paddr, size_t len)
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
 {
 	unsigned total_elt_sz;
 	unsigned i = 0;
@@ -438,6 +450,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	memhdr->addr = vaddr;
 	memhdr->phys_addr = paddr;
 	memhdr->len = len;
+	memhdr->free_cb = free_cb;
+	memhdr->opaque = opaque;
 
 	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
@@ -464,7 +478,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
  * number of objects added, or a negative value on error. */
 static int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
 {
 	uint32_t i, n;
 	int ret, cnt = 0;
@@ -482,11 +497,13 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 			;
 
 		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
-			paddr[i], n * pg_sz);
+			paddr[i], n * pg_sz, free_cb, opaque);
 		if (ret < 0) {
 			rte_mempool_free_memchunks(mp);
 			return ret;
 		}
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
 		cnt += ret;
 	}
 	return cnt;
@@ -668,12 +685,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 		ret = rte_mempool_populate_phys(mp, obj,
 			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n);
+			objsz.total_size * n, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	} else {
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift);
+			paddr, pg_num, pg_shift, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 184d40d..dacdf6c 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -187,6 +187,12 @@ struct rte_mempool_objtlr {
 STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
 
 /**
+ * Callback used to free a memory chunk
+ */
+typedef void (rte_mempool_memchunk_free_cb_t)(struct rte_mempool_memhdr *memhdr,
+	void *opaque);
+
+/**
  * Mempool objects memory header structure
  *
  * The memory chunks where objects are stored. Each chunk is virtually
@@ -198,6 +204,8 @@ struct rte_mempool_memhdr {
 	void *addr;              /**< Virtual address of the chunk */
 	phys_addr_t phys_addr;   /**< Physical address of the chunk */
 	size_t len;              /**< length of the chunk */
+	rte_mempool_memchunk_free_cb_t *free_cb; /**< Free callback */
+	void *opaque;            /**< Argument passed to the free callback */
 };
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 20/35] mempool: make page size optional when getting xmem size
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (18 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 19/35] mempool: introduce a free callback for memory chunks Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 21/35] mempool: default allocation in several memory chunks Olivier Matz
                   ` (17 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Update rte_mempool_xmem_size() so that when the pg_shift argument is
set to 0, memory is assumed to be physically contiguous and page
boundaries are ignored. This will be used in the next commits.

While at it, rename the variable 'n' to 'obj_per_page' and avoid the
assignment inside the if().
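
For example, with illustrative numbers (1000 objects of 2176 bytes
each; in real code total_elt_sz would come from
rte_mempool_calc_obj_size()):

  /* 2MB pages (pg_shift = 21): 963 objects fit per page without
   * crossing a boundary, so 2 pages = 4MB are required */
  size_t sz_paged  = rte_mempool_xmem_size(1000, 2176, 21);
  /* pg_shift = 0: memory assumed physically contiguous, so
   * exactly 1000 * 2176 = 2176000 bytes are required */
  size_t sz_contig = rte_mempool_xmem_size(1000, 2176, 0);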

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 18 +++++++++---------
 lib/librte_mempool/rte_mempool.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5bfe4cb..805ac19 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -254,18 +254,18 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 size_t
 rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
-	size_t n, pg_num, pg_sz, sz;
+	size_t obj_per_page, pg_num, pg_sz;
 
-	pg_sz = (size_t)1 << pg_shift;
+	if (pg_shift == 0)
+		return total_elt_sz * elt_num;
 
-	if ((n = pg_sz / total_elt_sz) > 0) {
-		pg_num = (elt_num + n - 1) / n;
-		sz = pg_num << pg_shift;
-	} else {
-		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
-	}
+	pg_sz = (size_t)1 << pg_shift;
+	obj_per_page = pg_sz / total_elt_sz;
+	if (obj_per_page == 0)
+		return RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 
-	return sz;
+	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
+	return pg_num << pg_shift;
 }
 
 /*
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index dacdf6c..2cce7ee 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1257,7 +1257,7 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *   The size of each element, including header and trailer, as returned
  *   by rte_mempool_calc_obj_size().
  * @param pg_shift
- *   LOG2 of the physical pages size.
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
  * @return
  *   Required memory size aligned at page boundary.
  */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 21/35] mempool: default allocation in several memory chunks
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (19 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 20/35] mempool: make page size optional when getting xmem size Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 22/35] eal: lock memory when using no-huge Olivier Matz
                   ` (16 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
much of the specific behavior that was required when hugepages are not
enabled and rte_mempool_xmem_create() is used.

This change requires updating how the kni and mellanox drivers look up
mbuf memory. This will only work if there is a single memory chunk (as
is the case today), but we could make use of rte_mempool_mem_iter() to
support more memory chunks, as sketched below.
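
A rough sketch of that direction; rte_mempool_mem_iter() and its
callback type only appear later in this series, so the signature below
is assumed:

  /* assumed callback shape: invoked once per memory chunk */
  static void
  txq_mp2mr_chunk(struct rte_mempool *mp __rte_unused, void *opaque,
          struct rte_mempool_memhdr *memhdr,
          unsigned mem_idx __rte_unused)
  {
          struct txq *txq = opaque;
          struct ibv_mr *mr;

          /* register one MR per physically contiguous chunk */
          mr = ibv_reg_mr(txq->priv->pd, memhdr->addr, memhdr->len,
                  IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
          /* a real driver would store mr in its MR table here */
          (void)mr;
  }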

We can also remove RTE_MEMPOOL_OBJ_NAME, which is no longer required
for the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (as it was before) when
there is no hugepage support (rte_mempool_xmem_create() has to be used
instead). This is fixed in the next commit.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c               |  18 ++++--
 drivers/net/mlx5/mlx5_rxq.c           |   9 ++-
 drivers/net/mlx5/mlx5_rxtx.c          |   9 ++-
 lib/librte_kni/rte_kni.c              |  12 +++-
 lib/librte_mempool/rte_dom0_mempool.c |   2 +-
 lib/librte_mempool/rte_mempool.c      | 116 +++++++++++++++++++---------------
 lib/librte_mempool/rte_mempool.h      |  11 ----
 7 files changed, 102 insertions(+), 75 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index d9b2291..405324c 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1237,9 +1237,14 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
 	      (void *)txq, mp->name, (const void *)mp);
+	if (mp->nb_mem_chunks != 1) {
+		DEBUG("%p: only 1 memory chunk is supported in mempool",
+			(void *)txq);
+		return (uint32_t)-1;
+	}
 	mr = ibv_reg_mr(txq->priv->pd,
-			(void *)mp->elt_va_start,
-			(mp->elt_va_end - mp->elt_va_start),
+			(void *)STAILQ_FIRST(&mp->mem_list)->addr,
+			STAILQ_FIRST(&mp->mem_list)->len,
 			(IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
@@ -3675,6 +3680,11 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 		      " multiple of %d)", (void *)dev, MLX4_PMD_SGE_WR_N);
 		return EINVAL;
 	}
+	if (mp->nb_mem_chunks != 1) {
+		ERROR("%p: only 1 memory chunk is supported in mempool",
+			(void *)dev);
+		return EINVAL;
+	}
 	/* Get mbuf length. */
 	buf = rte_pktmbuf_alloc(mp);
 	if (buf == NULL) {
@@ -3702,8 +3712,8 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      (void *)dev, (tmpl.sp ? "enabling" : "disabling"), desc);
 	/* Use the entire RX mempool as the memory region. */
 	tmpl.mr = ibv_reg_mr(priv->pd,
-			     (void *)mp->elt_va_start,
-			     (mp->elt_va_end - mp->elt_va_start),
+			     (void *)STAILQ_FIRST(&mp->mem_list)->addr,
+			     STAILQ_FIRST(&mp->mem_list)->len,
 			     (IBV_ACCESS_LOCAL_WRITE |
 			      IBV_ACCESS_REMOTE_WRITE));
 	if (tmpl.mr == NULL) {
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ebbe186..1513b37 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1113,6 +1113,11 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 		      " multiple of %d)", (void *)dev, MLX5_PMD_SGE_WR_N);
 		return EINVAL;
 	}
+	if (mp->nb_mem_chunks != 1) {
+		ERROR("%p: only 1 memory chunk is supported in mempool",
+			(void *)dev);
+		return EINVAL;
+	}
 	/* Get mbuf length. */
 	buf = rte_pktmbuf_alloc(mp);
 	if (buf == NULL) {
@@ -1140,8 +1145,8 @@ rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
 	      (void *)dev, (tmpl.sp ? "enabling" : "disabling"), desc);
 	/* Use the entire RX mempool as the memory region. */
 	tmpl.mr = ibv_reg_mr(priv->pd,
-			     (void *)mp->elt_va_start,
-			     (mp->elt_va_end - mp->elt_va_start),
+			     (void *)STAILQ_FIRST(&mp->mem_list)->addr,
+			     STAILQ_FIRST(&mp->mem_list)->len,
 			     (IBV_ACCESS_LOCAL_WRITE |
 			      IBV_ACCESS_REMOTE_WRITE));
 	if (tmpl.mr == NULL) {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f002ca2..4ff88fc 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -165,9 +165,14 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
 	      (void *)txq, mp->name, (const void *)mp);
+	if (mp->nb_mem_chunks != 1) {
+		DEBUG("%p: only 1 memory chunk is supported in mempool",
+			(void *)txq);
+		return (uint32_t)-1;
+	}
 	mr = ibv_reg_mr(txq->priv->pd,
-			(void *)mp->elt_va_start,
-			(mp->elt_va_end - mp->elt_va_start),
+			(void *)STAILQ_FIRST(&mp->mem_list)->addr,
+			STAILQ_FIRST(&mp->mem_list)->len,
 			(IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE));
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index ea9baf4..3028fd4 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -323,6 +323,7 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 	char intf_name[RTE_KNI_NAMESIZE];
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
+	const struct rte_mempool *mp;
 	struct rte_kni_memzone_slot *slot = NULL;
 
 	if (!pktmbuf_pool || !conf || !conf->name[0])
@@ -415,12 +416,17 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 
 
 	/* MBUF mempool */
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT,
 		pktmbuf_pool->name);
 	mz = rte_memzone_lookup(mz_name);
 	KNI_MEM_CHECK(mz == NULL);
-	dev_info.mbuf_va = mz->addr;
-	dev_info.mbuf_phys = mz->phys_addr;
+	mp = (struct rte_mempool *)mz->addr;
+	/* KNI currently requires a single memory chunk */
+	if (mp->nb_mem_chunks != 1)
+		goto kni_fail;
+
+	dev_info.mbuf_va = STAILQ_FIRST(&mp->mem_list)->addr;
+	dev_info.mbuf_phys = STAILQ_FIRST(&mp->mem_list)->phys_addr;
 	ctx->pktmbuf_pool = pktmbuf_pool;
 	ctx->group_id = conf->group_id;
 	ctx->slot_id = slot->id;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0051bd5..dad755c 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -110,7 +110,7 @@ rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	if (pa == NULL)
 		return mp;
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, name);
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
 	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
 	if (mz == NULL) {
 		free(pa);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 805ac19..7fd2bb4 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -391,7 +391,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 }
 
 /* free a memchunk allocated with rte_memzone_reserve() */
-__rte_unused static void
+static void
 rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
 	void *opaque)
 {
@@ -509,6 +509,59 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
+/* Default function to populate the mempool: allocate memory in mezones,
+ * and populate them. Return the number of objects added, or a negative
+ * value on error. */
+static int rte_mempool_populate_default(struct rte_mempool *mp)
+{
+	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	size_t size, total_elt_sz, align;
+	unsigned mz_id, n;
+	int ret;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	align = RTE_CACHE_LINE_SIZE;
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
+		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			ret = -ENAMETOOLONG;
+			goto fail;
+		}
+
+		mz = rte_memzone_reserve_aligned(mz_name, size,
+			mp->socket_id, mz_flags, align);
+		/* not enough memory, retry with the biggest zone we have */
+		if (mz == NULL)
+			mz = rte_memzone_reserve_aligned(mz_name, 0,
+				mp->socket_id, mz_flags, align);
+		if (mz == NULL) {
+			ret = -rte_errno;
+			goto fail;
+		}
+
+		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
+			mz->len, rte_mempool_memchunk_mz_free,
+			RTE_DECONST(void *, mz));
+		if (ret < 0)
+			goto fail;
+	}
+
+	return mp->size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -531,10 +584,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	const struct rte_memzone *mz;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	void *obj;
 	struct rte_mempool_objsz objsz;
-	void *startaddr;
-	int page_size = getpagesize();
 	int ret;
 
 	/* compilation-time checks */
@@ -589,16 +639,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	private_data_size = (private_data_size +
 			     RTE_MEMPOOL_ALIGN_MASK) & (~RTE_MEMPOOL_ALIGN_MASK);
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * expand private data size to a whole page, so that the
-		 * first pool element will start on a new standard page
-		 */
-		int head = sizeof(struct rte_mempool);
-		int new_size = (private_data_size + head) % page_size;
-		if (new_size)
-			private_data_size += page_size - new_size;
-	}
 
 	/* try to allocate tailq entry */
 	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
@@ -615,17 +655,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
-	if (vaddr == NULL)
-		mempool_size += (size_t)objsz.total_size * n;
-
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * we want the memory pool to start on a page boundary,
-		 * because pool elements crossing page boundaries would
-		 * result in discontiguous physical addresses
-		 */
-		mempool_size += page_size;
-	}
 
 	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
 
@@ -633,20 +662,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mz == NULL)
 		goto exit_unlock;
 
-	if (rte_eal_has_hugepages()) {
-		startaddr = (void*)mz->addr;
-	} else {
-		/* align memory pool start address on a page boundary */
-		unsigned long addr = (unsigned long)mz->addr;
-		if (addr & (page_size - 1)) {
-			addr += page_size;
-			addr &= ~(page_size - 1);
-		}
-		startaddr = (void*)addr;
-	}
-
 	/* init the mempool structure */
-	mp = startaddr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
@@ -677,22 +693,17 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
-	if (vaddr == NULL) {
-		/* calculate address of the first element for continuous mempool. */
-		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
-			private_data_size;
-		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
-
-		ret = rte_mempool_populate_phys(mp, obj,
-			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
-	} else {
+	if (vaddr == NULL)
+		ret = rte_mempool_populate_default(mp);
+	else
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
 			paddr, pg_num, pg_shift, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto exit_unlock;
+	} else if (ret != (int)mp->size) {
+		rte_errno = EINVAL;
+		goto exit_unlock;
 	}
 
 	/* call the initializer */
@@ -715,6 +726,7 @@ exit_unlock:
 		rte_ring_free(mp->ring);
 	}
 	rte_free(te);
+	rte_memzone_free(mz);
 
 	return NULL;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 2cce7ee..2770d80 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -124,17 +124,6 @@ struct rte_mempool_objsz {
 /* "MP_<name>" */
 #define	RTE_MEMPOOL_MZ_FORMAT	RTE_MEMPOOL_MZ_PREFIX "%s"
 
-#ifdef RTE_LIBRTE_XEN_DOM0
-
-/* "<name>_MP_elt" */
-#define	RTE_MEMPOOL_OBJ_NAME	"%s_" RTE_MEMPOOL_MZ_PREFIX "elt"
-
-#else
-
-#define	RTE_MEMPOOL_OBJ_NAME	RTE_MEMPOOL_MZ_FORMAT
-
-#endif /* RTE_LIBRTE_XEN_DOM0 */
-
 #define	MEMPOOL_PG_SHIFT_MAX	(sizeof(uintptr_t) * CHAR_BIT - 1)
 
 /** Mempool over one chunk of physically continuous memory */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 22/35] eal: lock memory when using no-huge
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (20 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 21/35] mempool: default allocation in several memory chunks Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 23/35] mempool: support no-hugepage mode Olivier Matz
                   ` (15 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Although the physical address stored in the memory segment won't be
correct, this at least allows the physical address to be retrieved
using rte_mem_virt2phy(). Indeed, if a page is not locked, it may not
be present in physical memory.

Together with the next commit, this allows a mempool to have properly
filled physical addresses when the --no-huge option is used.
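
A minimal sketch of the mapping call, outside any DPDK context (the
patch itself keeps the fd argument used by the existing EAL code):

  #include <sys/mman.h>

  /* MAP_LOCKED faults the pages in and locks them, so a later
   * /proc/self/pagemap lookup (what rte_mem_virt2phy() does) finds
   * a valid physical frame for every address */
  static void *
  anon_map_locked(size_t len)
  {
          return mmap(NULL, len, PROT_READ | PROT_WRITE,
                  MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }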

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 6008533..c2a5799 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1105,7 +1105,7 @@ rte_eal_hugepage_init(void)
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+			MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
 					strerror(errno));
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 23/35] mempool: support no-hugepage mode
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (21 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 22/35] eal: lock memory when using no-huge Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 24/35] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
                   ` (14 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Introduce a new function rte_mempool_populate_virt() that is now called
by default when hugepages are not supported. This function populates
the mempool with several physically contiguous chunks whose minimum
size is the page size of the system (see the sketch below).

Thanks to this, rte_mempool_create() will work properly without
hugepages (if the object size is smaller than the page size), and two
specific workarounds can be removed:

- the trailer_size was artificially extended to a page size
- rte_mempool_virt2phy() did not rely on the object's physical address
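
A hedged sketch of the alignment contract the new helper enforces
(rte_mempool_populate_virt() is static in this patch; the surrounding
variables are illustrative):

  size_t pg_sz = getpagesize();
  /* addr and len must be page-aligned, otherwise -EINVAL */
  char *addr = RTE_PTR_ALIGN_CEIL(raw_addr, pg_sz);
  size_t len = RTE_ALIGN_FLOOR(raw_len - (addr - raw_addr), pg_sz);
  int ret = rte_mempool_populate_virt(mp, addr, len, pg_sz,
          free_cb, opaque);
  /* internally, physically contiguous pages are grouped together and
   * each group is added as one chunk via rte_mempool_populate_phys() */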

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 106 ++++++++++++++++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h |  19 ++-----
 2 files changed, 86 insertions(+), 39 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7fd2bb4..7ec6709 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -224,23 +224,6 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 		sz->trailer_size = new_size - sz->header_size - sz->elt_size;
 	}
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * compute trailer size so that pool elements fit exactly in
-		 * a standard page
-		 */
-		int page_size = getpagesize();
-		int new_size = page_size - sz->header_size - sz->elt_size;
-		if (new_size < 0 || (unsigned int)new_size < sz->trailer_size) {
-			printf("When hugepages are disabled, pool objects "
-			       "can't exceed PAGE_SIZE: %d + %d + %d > %d\n",
-			       sz->header_size, sz->elt_size, sz->trailer_size,
-			       page_size);
-			return 0;
-		}
-		sz->trailer_size = new_size;
-	}
-
 	/* this is the size of an object, including header and trailer */
 	sz->total_size = sz->header_size + sz->elt_size + sz->trailer_size;
 
@@ -509,15 +492,72 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
-/* Default function to populate the mempool: allocate memory in mezones,
+/* Populate the mempool with a virtual area. Return the number of
+ * objects added, or a negative value on error. */
+static int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
+{
+	phys_addr_t paddr;
+	size_t off, phys_len;
+	int ret, cnt = 0;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+	/* address and len must be page-aligned */
+	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
+		return -EINVAL;
+	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
+		return -EINVAL;
+
+	for (off = 0; off + pg_sz <= len &&
+		     mp->populated_size < mp->size; off += phys_len) {
+
+		paddr = rte_mem_virt2phy(addr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR) {
+			ret = -EINVAL;
+			goto fail;
+		}
+
+		/* populate with the largest group of contiguous pages */
+		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
+			phys_addr_t paddr_tmp;
+
+			paddr_tmp = rte_mem_virt2phy(addr + off + phys_len);
+			paddr_tmp = rte_mem_phy2mch(-1, paddr_tmp);
+
+			if (paddr_tmp != paddr + phys_len)
+				break;
+		}
+
+		ret = rte_mempool_populate_phys(mp, addr + off, paddr,
+			phys_len, free_cb, opaque);
+		if (ret < 0)
+			goto fail;
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
+		cnt += ret;
+	}
+
+	return cnt;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
+/* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error. */
-static int rte_mempool_populate_default(struct rte_mempool *mp)
+static int
+rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
-	size_t size, total_elt_sz, align;
+	size_t size, total_elt_sz, align, pg_sz, pg_shift;
 	unsigned mz_id, n;
 	int ret;
 
@@ -525,10 +565,19 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
-	align = RTE_CACHE_LINE_SIZE;
+	if (rte_eal_has_hugepages()) {
+		pg_shift = 0; /* not needed, zone is physically contiguous */
+		pg_sz = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else {
+		pg_sz = getpagesize();
+		pg_shift = rte_bsf32(pg_sz);
+		align = pg_sz;
+	}
+
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -548,9 +597,16 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
-			mz->len, rte_mempool_memchunk_mz_free,
-			RTE_DECONST(void *, mz));
+		if (rte_eal_has_hugepages())
+			ret = rte_mempool_populate_phys(mp, mz->addr,
+				mz->phys_addr, mz->len,
+				rte_mempool_memchunk_mz_free,
+				RTE_DECONST(void *, mz));
+		else
+			ret = rte_mempool_populate_virt(mp, mz->addr,
+				mz->len, pg_sz,
+				rte_mempool_memchunk_mz_free,
+				RTE_DECONST(void *, mz));
 		if (ret < 0)
 			goto fail;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 2770d80..7222c14 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1145,20 +1145,11 @@ rte_mempool_empty(const struct rte_mempool *mp)
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
-	if (rte_eal_has_hugepages()) {
-		const struct rte_mempool_objhdr *hdr;
-
-		hdr = (const struct rte_mempool_objhdr *)
-			((const char *)elt - sizeof(*hdr));
-		return hdr->physaddr;
-	} else {
-		/*
-		 * If huge pages are disabled, we cannot assume the
-		 * memory region to be physically contiguous.
-		 * Lookup for each element.
-		 */
-		return rte_mem_virt2phy(elt);
-	}
+	const struct rte_mempool_objhdr *hdr;
+
+	hdr = (const struct rte_mempool_objhdr *)
+		((const char *)elt - sizeof(*hdr));
+	return hdr->physaddr;
 }
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 24/35] mempool: replace mempool physaddr by a memzone pointer
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (22 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 23/35] mempool: support no-hugepage mode Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 25/35] mempool: introduce a function to free a mempool Olivier Matz
                   ` (13 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Storing a pointer to the memzone instead of the physical address
provides more information than the physical address alone: for
instance, the memzone flags.

Moreover, keeping the memzone pointer will allow us to free the mempool
(this is done later in the series).
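
For instance (field names from this patch; the rte_memzone fields are
the standard ones):

  /* what used to be mp->phys_addr is still reachable, plus more: */
  phys_addr_t pa = mp->mz->phys_addr;  /* physical address */
  uint32_t flags = mp->mz->flags;      /* memzone flags, new info */
  size_t mz_len  = mp->mz->len;        /* needed to free the zone */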

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 4 ++--
 lib/librte_mempool/rte_mempool.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7ec6709..9e2b72b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -721,7 +721,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
-	mp->phys_addr = mz->phys_addr;
+	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
@@ -985,7 +985,7 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 	fprintf(f, "mempool <%s>@%p\n", mp->name, mp);
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
-	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->mz->phys_addr);
 	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
 	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7222c14..05241e1 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -203,7 +203,7 @@ struct rte_mempool_memhdr {
 struct rte_mempool {
 	char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
 	struct rte_ring *ring;           /**< Ring to store objects. */
-	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
+	const struct rte_memzone *mz;    /**< Memzone where mempool is allocated */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Max size of the mempool. */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 25/35] mempool: introduce a function to free a mempool
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (23 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 24/35] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 26/35] mempool: introduce a function to create an empty mempool Olivier Matz
                   ` (12 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Introduce rte_mempool_free() that:

- unlinks the mempool from the global list if it is found
- frees all the memory chunks using their free callbacks
- frees the internal ring
- frees the memzone containing the mempool

Currently this function is only used in error cases when
creating a new mempool, but it will be made public later
in the patch series.
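
A usage sketch for when the function becomes public (it is still
static here; the pool parameters are illustrative):

  struct rte_mempool *mp;

  mp = rte_mempool_create("example_pool", 1024, 64, 32, 0,
          NULL, NULL, NULL, NULL, SOCKET_ID_ANY, 0);
  if (mp != NULL)
          rte_mempool_free(mp); /* tailq unlink, chunks, ring, memzone */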

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 9e2b72b..4b74ffd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -618,6 +618,35 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* free a mempool */
+static void
+rte_mempool_free(struct rte_mempool *mp)
+{
+	struct rte_mempool_list *mempool_list = NULL;
+	struct rte_tailq_entry *te;
+
+	if (mp == NULL)
+		return;
+
+	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	/* find out tailq entry */
+	TAILQ_FOREACH(te, mempool_list, next) {
+		if (te->data == (void *)mp)
+			break;
+	}
+
+	if (te != NULL) {
+		TAILQ_REMOVE(mempool_list, te, next);
+		rte_free(te);
+	}
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	rte_mempool_free_memchunks(mp);
+	rte_ring_free(mp->ring);
+	rte_memzone_free(mp->mz);
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -777,12 +806,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL) {
-		rte_mempool_free_memchunks(mp);
-		rte_ring_free(mp->ring);
-	}
-	rte_free(te);
-	rte_memzone_free(mz);
+	rte_mempool_free(mp);
 
 	return NULL;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 26/35] mempool: introduce a function to create an empty mempool
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (24 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 25/35] mempool: introduce a function to free a mempool Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 27/35] eal/xen: return machine address without knowing memseg id Olivier Matz
                   ` (11 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Introduce a new function rte_mempool_create_empty()
that allocates a mempool that is not populated.

The functions rte_mempool_create() and rte_mempool_xmem_create()
now make use of it, making their code much easier to read.
Currently, they are the only users of rte_mempool_create_empty(),
but the function will be made public in later commits.
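
Condensed from the patch below, the flow inside rte_mempool_create()
becomes:

  mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
          private_data_size, socket_id, flags);
  if (mp == NULL)
          return NULL;
  if (mp_init)
          mp_init(mp, mp_init_arg);          /* pool-level initializer */
  if (rte_mempool_populate_default(mp) < 0) {
          rte_mempool_free(mp);              /* allocate and populate */
          return NULL;
  }
  if (obj_init)
          rte_mempool_obj_iter(mp, obj_init, obj_init_arg); /* per object */
  return mp;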

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 185 ++++++++++++++++++++++-----------------
 1 file changed, 107 insertions(+), 78 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4b74ffd..afb2992 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -320,30 +320,6 @@ rte_dom0_mempool_create(const char *name __rte_unused,
 }
 #endif
 
-/* create the mempool */
-struct rte_mempool *
-rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-	else
-		return rte_mempool_xmem_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags,
-					       NULL, NULL, MEMPOOL_PG_NUM_DEFAULT,
-					       MEMPOOL_PG_SHIFT_MAX);
-}
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -647,20 +623,11 @@ rte_mempool_free(struct rte_mempool *mp)
 	rte_memzone_free(mp->mz);
 }
 
-/*
- * Create the mempool over already allocated chunk of memory.
- * That external memory buffer can consists of physically disjoint pages.
- * Setting vaddr to NULL, makes mempool to fallback to original behaviour
- * and allocate space for mempool and it's elements as one big chunk of
- * physically continuos memory.
- * */
-struct rte_mempool *
-rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags, void *vaddr,
-		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+/* create an empty mempool */
+static struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
@@ -670,7 +637,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
-	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -693,18 +659,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		return NULL;
 	}
 
-	/* check that we have both VA and PA */
-	if (vaddr != NULL && paddr == NULL) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
-	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
 	/* "no cache align" imply "no spread" */
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
@@ -732,11 +686,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 	}
 
-	/*
-	 * If user provided an external memory buffer, then use it to
-	 * store mempool objects. Otherwise reserve a memzone that is large
-	 * enough to hold mempool header and metadata plus mempool objects.
-	 */
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
@@ -748,12 +697,14 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 
 	/* init the mempool structure */
+	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
+	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
 	mp->header_size = objsz.header_size;
 	mp->trailer_size = objsz.trailer_size;
@@ -773,41 +724,119 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->local_cache = (struct rte_mempool_cache *)
 			((char *)mp + MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* call the initializer */
+	te->data = mp;
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_INSERT_TAIL(mempool_list, te, next);
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+
+	return mp;
+
+exit_unlock:
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	rte_free(te);
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/* create the mempool */
+struct rte_mempool *
+rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+	int socket_id, unsigned flags)
+{
+	struct rte_mempool *mp;
+
+	if (rte_xen_dom0_supported())
+		return rte_dom0_mempool_create(name, n, elt_size,
+					       cache_size, private_data_size,
+					       mp_init, mp_init_arg,
+					       obj_init, obj_init_arg,
+					       socket_id, flags);
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
+
+	/* call the mempool priv initializer */
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	/* mempool elements allocated together with mempool */
+	if (rte_mempool_populate_default(mp) < 0)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+
+	return mp;
+
+ fail:
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/*
+ * Create the mempool over already allocated chunk of memory.
+ * That external memory buffer can consist of physically disjoint pages.
+ * Setting vaddr to NULL makes the mempool fall back to the original
+ * behaviour and allocate space for the mempool and its elements as one
+ * big chunk of physically contiguous memory.
+ */
+struct rte_mempool *
+rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
+		unsigned cache_size, unsigned private_data_size,
+		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+		int socket_id, unsigned flags, void *vaddr,
+		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	struct rte_mempool *mp = NULL;
+	int ret;
+
+	/* no virtual address supplied, use rte_mempool_create() */
 	if (vaddr == NULL)
-		ret = rte_mempool_populate_default(mp);
-	else
-		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift, NULL, NULL);
-	if (ret < 0) {
-		rte_errno = -ret;
-		goto exit_unlock;
-	} else if (ret != (int)mp->size) {
+		return rte_mempool_create(name, n, elt_size, cache_size,
+			private_data_size, mp_init, mp_init_arg,
+			obj_init, obj_init_arg, socket_id, flags);
+
+	/* check that we have both VA and PA */
+	if (paddr == NULL) {
 		rte_errno = EINVAL;
-		goto exit_unlock;
+		return NULL;
 	}
 
-	/* call the initializer */
-	if (obj_init)
-		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+	/* Check that pg_shift parameter is valid. */
+	if (pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
 
-	te->data = (void *) mp;
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
 
-	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
-	TAILQ_INSERT_TAIL(mempool_list, te, next);
-	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	/* call the mempool priv initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
+
+	ret = rte_mempool_populate_phys_tab(mp, vaddr, paddr, pg_num, pg_shift,
+		NULL, NULL);
+	if (ret < 0 || ret != (int)mp->size)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	return mp;
 
-exit_unlock:
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+ fail:
 	rte_mempool_free(mp);
-
 	return NULL;
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 27/35] eal/xen: return machine address without knowing memseg id
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (25 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 26/35] mempool: introduce a function to create an empty mempool Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 28/35] mempool: rework support of xen dom0 Olivier Matz
                   ` (10 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

The conversion from guest physical address to machine physical address
is fast when the caller knows the memseg corresponding to the guest
physical address.

When the caller does not know this information, find it by scanning the
memory segments. This feature will be used by the next commit.
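
Both call forms after this patch:

  phys_addr_t mach;

  mach = rte_mem_phy2mch(memseg_id, paddr); /* fast path: segment known */
  mach = rte_mem_phy2mch(-1, paddr);        /* scan the segments; returns
                                             * RTE_BAD_PHYS_ADDR if none
                                             * owns the address */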

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_memory.h   | 11 ++++++-----
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c | 17 +++++++++++++++--
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index f8dbece..0661109 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -200,21 +200,22 @@ unsigned rte_memory_get_nrank(void);
 int rte_xen_dom0_supported(void);
 
 /**< Internal use only - phys to virt mapping for xen */
-phys_addr_t rte_xen_mem_phy2mch(uint32_t, const phys_addr_t);
+phys_addr_t rte_xen_mem_phy2mch(int32_t, const phys_addr_t);
 
 /**
  * Return the physical address of elt, which is an element of the pool mp.
  *
  * @param memseg_id
- *   The mempool is from which memory segment.
+ *   Identifier of the memory segment owning the physical address. If
+ *   set to -1, find it automatically.
  * @param phy_addr
  *   physical address of elt.
  *
  * @return
- *   The physical address or error.
+ *   The physical address or RTE_BAD_PHYS_ADDR on error.
  */
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
 	if (rte_xen_dom0_supported())
 		return rte_xen_mem_phy2mch(memseg_id, phy_addr);
@@ -250,7 +251,7 @@ static inline int rte_xen_dom0_supported(void)
 }
 
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
 {
 	return phy_addr;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
index 495eef9..efbd374 100644
--- a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
@@ -156,13 +156,26 @@ get_xen_memory_size(void)
 * Based on physical address to calculate MFN in Xen Dom0.
  */
 phys_addr_t
-rte_xen_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_xen_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
-	int mfn_id;
+	int mfn_id, i;
 	uint64_t mfn, mfn_offset;
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct rte_memseg *memseg = mcfg->memseg;
 
+	/* find the memory segment owning the physical address */
+	if (memseg_id == -1) {
+		for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+			if ((phy_addr >= memseg[i].phys_addr) &&
+				(phy_addr < memseg[i].phys_addr + memseg[i].size)) {
+				memseg_id = i;
+				break;
+			}
+		}
+		if (memseg_id == -1)
+			return RTE_BAD_PHYS_ADDR;
+	}
+
 	mfn_id = (phy_addr - memseg[memseg_id].phys_addr) / RTE_PGSIZE_2M;
 
 	/*the MFN is contiguous in 2M */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 28/35] mempool: rework support of xen dom0
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (26 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 27/35] eal/xen: return machine address without knowing memseg id Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 29/35] mempool: create the internal ring when populating Olivier Matz
                   ` (9 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Avoid having a specific file for that, and remove the #ifdefs. Now
that we have introduced a function to populate a mempool with a
virtual area, supporting xen dom0 is much easier.

The only thing we need to do is convert the guest physical address
into the machine physical address using rte_mem_phy2mch(). This
function does nothing when not running on xen.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/Makefile                |   3 -
 lib/librte_mempool/rte_dom0_mempool.c      | 133 -----------------------------
 lib/librte_mempool/rte_mempool.c           |  33 ++-----
 lib/librte_mempool/rte_mempool.h           |  89 -------------------
 lib/librte_mempool/rte_mempool_version.map |   1 -
 5 files changed, 5 insertions(+), 254 deletions(-)
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 706f844..43423e0 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,9 +42,6 @@ LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
-ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
-SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
-endif
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h
 
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
deleted file mode 100644
index dad755c..0000000
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ /dev/null
@@ -1,133 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <unistd.h>
-#include <stdarg.h>
-#include <inttypes.h>
-#include <errno.h>
-#include <sys/queue.h>
-
-#include <rte_common.h>
-#include <rte_log.h>
-#include <rte_debug.h>
-#include <rte_memory.h>
-#include <rte_memzone.h>
-#include <rte_atomic.h>
-#include <rte_launch.h>
-#include <rte_eal.h>
-#include <rte_eal_memconfig.h>
-#include <rte_per_lcore.h>
-#include <rte_lcore.h>
-#include <rte_branch_prediction.h>
-#include <rte_ring.h>
-#include <rte_errno.h>
-#include <rte_string_fns.h>
-#include <rte_spinlock.h>
-
-#include "rte_mempool.h"
-
-static void
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num,
-	uint32_t pg_sz, uint32_t memseg_id)
-{
-	uint32_t i;
-	uint64_t virt_addr, mfn_id;
-	struct rte_mem_config *mcfg;
-	uint32_t page_size = getpagesize();
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	virt_addr = (uintptr_t) mcfg->memseg[memseg_id].addr;
-
-	for (i = 0; i != pg_num; i++) {
-		mfn_id = ((uintptr_t)va + i * pg_sz - virt_addr) / RTE_PGSIZE_2M;
-		pa[i] = mcfg->memseg[memseg_id].mfn[mfn_id] * page_size;
-	}
-}
-
-/* create the mempool for supporting Dom0 */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp = NULL;
-	phys_addr_t *pa;
-	char *va;
-	size_t sz;
-	uint32_t pg_num, pg_shift, pg_sz, total_size;
-	const struct rte_memzone *mz;
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-
-	pg_sz = RTE_PGSIZE_2M;
-
-	pg_shift = rte_bsf32(pg_sz);
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift) +
-		RTE_PGSIZE_2M;
-	pg_num = sz >> pg_shift;
-
-	/* extract physical mappings of the allocated memory. */
-	pa = calloc(pg_num, sizeof (*pa));
-	if (pa == NULL)
-		return mp;
-
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
-	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
-	if (mz == NULL) {
-		free(pa);
-		return mp;
-	}
-
-	va = (char *)RTE_ALIGN_CEIL((uintptr_t)mz->addr, RTE_PGSIZE_2M);
-	/* extract physical mappings of the allocated memory. */
-	get_phys_map(va, pa, pg_num, pg_sz, mz->memseg_id);
-
-	mp = rte_mempool_xmem_create(name, elt_num, elt_size,
-		cache_size, private_data_size,
-		mp_init, mp_init_arg,
-		obj_init, obj_init_arg,
-		socket_id, flags, va, pa, pg_num, pg_shift);
-
-	free(pa);
-
-	return mp;
-}
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index afb2992..0f4cb4e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -300,26 +300,6 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 	return (size_t)paddr_idx << pg_shift;
 }
 
-#ifndef RTE_LIBRTE_XEN_DOM0
-/* stub if DOM0 support not configured */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name __rte_unused,
-			unsigned n __rte_unused,
-			unsigned elt_size __rte_unused,
-			unsigned cache_size __rte_unused,
-			unsigned private_data_size __rte_unused,
-			rte_mempool_ctor_t *mp_init __rte_unused,
-			void *mp_init_arg __rte_unused,
-			rte_mempool_obj_ctor_t *obj_init __rte_unused,
-			void *obj_init_arg __rte_unused,
-			int socket_id __rte_unused,
-			unsigned flags __rte_unused)
-{
-	rte_errno = EINVAL;
-	return NULL;
-}
-#endif
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -492,6 +472,9 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 		     mp->populated_size < mp->size; off += phys_len) {
 
 		paddr = rte_mem_virt2phy(addr + off);
+		/* required for xen_dom0 to get the machine address */
+		paddr = rte_mem_phy2mch(-1, paddr);
+
 		if (paddr == RTE_BAD_PHYS_ADDR) {
 			ret = -EINVAL;
 			goto fail;
@@ -573,7 +556,8 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		if (rte_eal_has_hugepages())
+		/* use memzone physical address if it is valid */
+		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
 				mz->phys_addr, mz->len,
 				rte_mempool_memchunk_mz_free,
@@ -749,13 +733,6 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 {
 	struct rte_mempool *mp;
 
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-
 	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
 		private_data_size, socket_id, flags);
 	if (mp == NULL)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 05241e1..47743a6 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -500,95 +500,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
- * Create a new mempool named *name* in memory on Xen Dom0.
- *
- * This function uses ``rte_mempool_xmem_create()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
- * All elements of the mempool are allocated together with the mempool header,
- * and memory buffer can consist of set of disjoint physical pages.
- *
- * @param name
- *   The name of the mempool.
- * @param n
- *   The number of elements in the mempool. The optimum size (in terms of
- *   memory usage) for a mempool is when n is a power of two minus one:
- *   n = (2^q - 1).
- * @param elt_size
- *   The size of each element.
- * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
- * @param private_data_size
- *   The size of the private data appended after the mempool
- *   structure. This is useful for storing some private data after the
- *   mempool structure, as is done for rte_mbuf_pool for example.
- * @param mp_init
- *   A function pointer that is called for initialization of the pool,
- *   before object initialization. The user can initialize the private
- *   data in this function if needed. This parameter can be NULL if
- *   not needed.
- * @param mp_init_arg
- *   An opaque pointer to data that can be used in the mempool
- *   constructor function.
- * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
- * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
- * @param socket_id
- *   The *socket_id* argument is the socket identifier in the case of
- *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
- *   constraint for the reserved zone.
- * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
- * @return
- *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
- */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags);
-
-
-/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index ca887b5..c4f2da0 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -1,7 +1,6 @@
 DPDK_2.0 {
 	global:
 
-	rte_dom0_mempool_create;
 	rte_mempool_audit;
 	rte_mempool_calc_obj_size;
 	rte_mempool_count;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 29/35] mempool: create the internal ring when populating
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (27 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 28/35] mempool: rework support of xen dom0 Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 30/35] mempool: populate a mempool with anonymous memory Olivier Matz
                   ` (8 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Instead of creating the internal ring at mempool creation, do
it when populating the mempool with the first memory chunk. The
objective is to ease the introduction of the external handler
in a later series.

For instance, this will be possible:

  mp = rte_mempool_create_empty(...)
  rte_mempool_set_ext_handler(mp, my_handler)
  rte_mempool_populate_default(mp)

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 12 +++++++++---
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0f4cb4e..2546740 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -326,6 +326,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 		return -rte_errno;
 
 	mp->ring = r;
+	mp->flags |= MEMPOOL_F_RING_CREATED;
 	return 0;
 }
 
@@ -374,6 +375,14 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	unsigned i = 0;
 	size_t off;
 	struct rte_mempool_memhdr *memhdr;
+	int ret;
+
+	/* create the internal ring if not already done */
+	if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
+		ret = rte_mempool_ring_create(mp);
+		if (ret < 0)
+			return ret;
+	}
 
 	/* mempool is already populated */
 	if (mp->populated_size >= mp->size)
@@ -698,9 +707,6 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	STAILQ_INIT(&mp->elt_list);
 	STAILQ_INIT(&mp->mem_list);
 
-	if (rte_mempool_ring_create(mp) < 0)
-		goto exit_unlock;
-
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 47743a6..e0549c6 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -234,6 +234,7 @@ struct rte_mempool {
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
+#define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
 
 /**
  * @internal When debug is enabled, store some statistics.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 30/35] mempool: populate a mempool with anonymous memory
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (28 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 29/35] mempool: create the internal ring when populating Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 31/35] test-pmd: remove specific anon mempool code Olivier Matz
                   ` (7 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Now that we can populate a mempool with any virtual memory,
it is easier to introduce a function that populates a mempool
with memory coming from an anonymous mapping, as is currently
done in test-pmd.

The next commit will replace test-pmd anonymous mapping by
this function.
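
A usage sketch (illustrative; rte_mempool_populate_anon() and
rte_mempool_free() only become public later in the series, and a
return value of 0 means failure here):

  mp = rte_mempool_create_empty("anon_pool", num_elts, elt_size,
          cache_size, 0, SOCKET_ID_ANY, 0);
  if (mp == NULL)
      return -1;
  if (rte_mempool_populate_anon(mp) == 0) {
      rte_mempool_free(mp);
      return -1;
  }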

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 58 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2546740..1f5ba50 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -39,6 +39,7 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/mman.h>
 
 #include <rte_common.h>
 #include <rte_log.h>
@@ -587,6 +588,63 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* return the memory size required for mempool objects in anonymous mem */
+static size_t
+get_anon_size(const struct rte_mempool *mp)
+{
+	size_t size, total_elt_sz, pg_sz, pg_shift;
+
+	pg_sz = getpagesize();
+	pg_shift = rte_bsf32(pg_sz);
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	size = rte_mempool_xmem_size(mp->size, total_elt_sz, pg_shift);
+
+	return size;
+}
+
+/* unmap a memory zone mapped by rte_mempool_populate_anon() */
+static void
+rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	munmap(opaque, get_anon_size(memhdr->mp));
+}
+
+/* populate the mempool with an anonymous mapping */
+__rte_unused static int
+rte_mempool_populate_anon(struct rte_mempool *mp)
+{
+	size_t size;
+	int ret;
+	char *addr;
+
+	/* mempool is already populated, error */
+	if (!STAILQ_EMPTY(&mp->mem_list)) {
+		rte_errno = EINVAL;
+		return 0;
+	}
+
+	/* get chunk of virtually continuous memory */
+	size = get_anon_size(mp);
+	addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
+		MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
+	if (addr == MAP_FAILED) {
+		rte_errno = errno;
+		return 0;
+	}
+
+	ret = rte_mempool_populate_virt(mp, addr, size, getpagesize(),
+		rte_mempool_memchunk_anon_free, addr);
+	if (ret <= 0)
+		goto fail;
+
+	return mp->populated_size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return 0;
+}
+
 /* free a mempool */
 static void
 rte_mempool_free(struct rte_mempool *mp)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 31/35] test-pmd: remove specific anon mempool code
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (29 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 30/35] mempool: populate a mempool with anonymous memory Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 32/35] mempool: make mempool populate and free api public Olivier Matz
                   ` (6 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Now that the mempool library provides functions to populate a mempool
with anonymous mmap'd memory, we can remove this specific code from
test-pmd.
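
The replacement boils down to the following sketch (same flow as the
testpmd change below):

  rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf, mb_size,
          mb_mempool_cache, sizeof(struct rte_pktmbuf_pool_private),
          socket_id, 0);
  if (rte_mp != NULL && rte_mempool_populate_anon(rte_mp) == 0) {
      rte_mempool_free(rte_mp);    /* could not populate, release it */
      rte_mp = NULL;
  }
  if (rte_mp != NULL) {
      rte_pktmbuf_pool_init(rte_mp, NULL);
      rte_mempool_obj_iter(rte_mp, rte_pktmbuf_init, NULL);
  }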

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/Makefile        |   4 -
 app/test-pmd/mempool_anon.c  | 201 -------------------------------------------
 app/test-pmd/mempool_osdep.h |  54 ------------
 app/test-pmd/testpmd.c       |  19 +++--
 4 files changed, 13 insertions(+), 265 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 72426f3..40039a1 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -58,11 +58,7 @@ SRCS-y += txonly.c
 SRCS-y += csumonly.c
 SRCS-y += icmpecho.c
 SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
-SRCS-y += mempool_anon.c
 
-ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
-CFLAGS_mempool_anon.o := -D_GNU_SOURCE
-endif
 CFLAGS_cmdline.o := -D_GNU_SOURCE
 
 # this application needs libraries first
diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
deleted file mode 100644
index 5e23848..0000000
--- a/app/test-pmd/mempool_anon.c
+++ /dev/null
@@ -1,201 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <sys/types.h>
-#include <sys/stat.h>
-#include "mempool_osdep.h"
-#include <rte_errno.h>
-
-#ifdef RTE_EXEC_ENV_LINUXAPP
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/mman.h>
-
-
-#define	PAGEMAP_FNAME		"/proc/self/pagemap"
-
-/*
- * the pfn (page frame number) are bits 0-54 (see pagemap.txt in linux
- * Documentation).
- */
-#define	PAGEMAP_PFN_BITS	54
-#define	PAGEMAP_PFN_MASK	RTE_LEN2MASK(PAGEMAP_PFN_BITS, phys_addr_t)
-
-
-static int
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num, uint32_t pg_sz)
-{
-	int32_t fd, rc;
-	uint32_t i, nb;
-	off_t ofs;
-
-	ofs = (uintptr_t)va / pg_sz * sizeof(*pa);
-	nb = pg_num * sizeof(*pa);
-
-	if ((fd = open(PAGEMAP_FNAME, O_RDONLY)) < 0)
-		return ENOENT;
-
-	if ((rc = pread(fd, pa, nb, ofs)) < 0 || (rc -= nb) != 0) {
-
-		RTE_LOG(ERR, USER1, "failed read of %u bytes from \'%s\' "
-			"at offset %zu, error code: %d\n",
-			nb, PAGEMAP_FNAME, (size_t)ofs, errno);
-		rc = ENOENT;
-	}
-
-	close(fd);
-
-	for (i = 0; i != pg_num; i++)
-		pa[i] = (pa[i] & PAGEMAP_PFN_MASK) * pg_sz;
-
-	return rc;
-}
-
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp;
-	phys_addr_t *pa;
-	char *va, *uv;
-	uint32_t n, pg_num, pg_shift, pg_sz, total_size;
-	size_t sz;
-	ssize_t usz;
-	int32_t rc;
-
-	rc = ENOMEM;
-	mp = NULL;
-
-	pg_sz = getpagesize();
-	if (rte_is_power_of_2(pg_sz) == 0) {
-		rte_errno = EINVAL;
-		return mp;
-	}
-
-	pg_shift = rte_bsf32(pg_sz);
-
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift);
-	pg_num = sz >> pg_shift;
-
-	/* get chunk of virtually continuos memory.*/
-	if ((va = mmap(NULL, sz, PROT_READ | PROT_WRITE,
-			MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED,
-			-1, 0)) == MAP_FAILED) {
-		RTE_LOG(ERR, USER1, "%s(%s) failed mmap of %zu bytes, "
-			"error code: %d\n",
-			__func__, name, sz, errno);
-		rte_errno = rc;
-		return mp;
-	}
-
-	/* extract physical mappings of the allocated memory. */
-	if ((pa = calloc(pg_num, sizeof (*pa))) != NULL &&
-			(rc = get_phys_map(va, pa, pg_num, pg_sz)) == 0) {
-
-		/*
-		 * Check that allocated size is big enough to hold elt_num
-		 * objects and a calcualte how many bytes are actually required.
-		 */
-
-		if ((usz = rte_mempool_xmem_usage(va, elt_num, total_size, pa,
-				pg_num, pg_shift)) < 0) {
-
-			n = -usz;
-			rc = ENOENT;
-			RTE_LOG(ERR, USER1, "%s(%s) only %u objects from %u "
-				"requested can  be created over "
-				"mmaped region %p of %zu bytes\n",
-				__func__, name, n, elt_num, va, sz);
-		} else {
-
-			/* unmap unused pages if any */
-			if ((size_t)usz < sz) {
-
-				uv = va + usz;
-				usz = sz - usz;
-
-				RTE_LOG(INFO, USER1,
-					"%s(%s): unmap unused %zu of %zu "
-					"mmaped bytes @%p\n",
-					__func__, name, (size_t)usz, sz, uv);
-				munmap(uv, usz);
-				sz -= usz;
-				pg_num = sz >> pg_shift;
-			}
-
-			if ((mp = rte_mempool_xmem_create(name, elt_num,
-					elt_size, cache_size, private_data_size,
-					mp_init, mp_init_arg,
-					obj_init, obj_init_arg,
-					socket_id, flags, va, pa, pg_num,
-					pg_shift)) != NULL)
-
-				RTE_VERIFY(elt_num == mp->size);
-		}
-	}
-
-	if (mp == NULL) {
-		munmap(va, sz);
-		rte_errno = rc;
-	}
-
-	free(pa);
-	return mp;
-}
-
-#else /* RTE_EXEC_ENV_LINUXAPP */
-
-
-struct rte_mempool *
-mempool_anon_create(__rte_unused const char *name,
-	__rte_unused unsigned elt_num, __rte_unused unsigned elt_size,
-	__rte_unused unsigned cache_size,
-	__rte_unused unsigned private_data_size,
-	__rte_unused rte_mempool_ctor_t *mp_init,
-	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_cb_t *obj_init,
-	__rte_unused void *obj_init_arg,
-	__rte_unused int socket_id, __rte_unused unsigned flags)
-{
-	rte_errno = ENOTSUP;
-	return NULL;
-}
-
-#endif /* RTE_EXEC_ENV_LINUXAPP */
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
deleted file mode 100644
index 7ce7297..0000000
--- a/app/test-pmd/mempool_osdep.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _MEMPOOL_OSDEP_H_
-#define _MEMPOOL_OSDEP_H_
-
-#include <rte_mempool.h>
-
-/**
- * @file
- * mempool OS specific header.
- */
-
-/*
- * Create mempool over objects from mmap(..., MAP_ANONYMOUS, ...).
- */
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags);
-
-#endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 1319917..a8adf93 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -77,7 +77,6 @@
 #endif
 
 #include "testpmd.h"
-#include "mempool_osdep.h"
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 
@@ -427,17 +426,25 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf,
 
 
 #else
-	if (mp_anon != 0)
-		rte_mp = mempool_anon_create(pool_name, nb_mbuf, mb_size,
+	if (mp_anon != 0) {
+		rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf, mb_size,
 				    (unsigned) mb_mempool_cache,
 				    sizeof(struct rte_pktmbuf_pool_private),
-				    rte_pktmbuf_pool_init, NULL,
-				    rte_pktmbuf_init, NULL,
 				    socket_id, 0);
-	else
+
+		if (rte_mp != NULL &&
+		    rte_mempool_populate_anon(rte_mp) == 0) {
+			rte_mempool_free(rte_mp);
+			rte_mp = NULL;
+		} else if (rte_mp != NULL) {
+			rte_pktmbuf_pool_init(rte_mp, NULL);
+			rte_mempool_obj_iter(rte_mp, rte_pktmbuf_init, NULL);
+		}
+	} else {
 		/* wrapper to rte_mempool_create() */
 		rte_mp = rte_pktmbuf_pool_create(pool_name, nb_mbuf,
 			mb_mempool_cache, 0, mbuf_seg_size, socket_id);
+	}
 
 #endif
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 32/35] mempool: make mempool populate and free api public
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (30 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 31/35] test-pmd: remove specific anon mempool code Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 33/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
                   ` (5 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Add the following functions to the public mempool API:

- rte_mempool_create_empty()
- rte_mempool_populate_phys()
- rte_mempool_populate_phys_tab()
- rte_mempool_populate_virt()
- rte_mempool_populate_default()
- rte_mempool_populate_anon()
- rte_mempool_free()
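
Taken together, they allow the following usage (a sketch; my_obj_init
is a hypothetical per-object constructor):

  mp = rte_mempool_create_empty("pool", n, elt_size, cache_size,
          priv_size, SOCKET_ID_ANY, 0);
  if (mp == NULL)
      return -1;
  if (rte_mempool_populate_default(mp) < 0) {
      rte_mempool_free(mp);
      return -1;
  }
  rte_mempool_obj_iter(mp, my_obj_init, NULL); /* optional object init */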

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           |  14 +--
 lib/librte_mempool/rte_mempool.h           | 168 +++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |   9 +-
 3 files changed, 183 insertions(+), 8 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1f5ba50..2a7d6cd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -367,7 +367,7 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
 /* Add objects in the pool, using a physically contiguous memory
  * zone. Return the number of objects added, or a negative value
  * on error. */
-static int
+int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -425,7 +425,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 
 /* Add objects in the pool, using a table of physical pages. Return the
  * number of objects added, or a negative value on error. */
-static int
+int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
 	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
@@ -460,7 +460,7 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 
 /* Populate the mempool with a virtual area. Return the number of
  * objects added, or a negative value on error. */
-static int
+int
 rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -520,7 +520,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 /* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error. */
-static int
+int
 rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
@@ -611,7 +611,7 @@ rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
 }
 
 /* populate the mempool with an anonymous mapping */
-__rte_unused static int
+int
 rte_mempool_populate_anon(struct rte_mempool *mp)
 {
 	size_t size;
@@ -646,7 +646,7 @@ rte_mempool_populate_anon(struct rte_mempool *mp)
 }
 
 /* free a mempool */
-static void
+void
 rte_mempool_free(struct rte_mempool *mp)
 {
 	struct rte_mempool_list *mempool_list = NULL;
@@ -675,7 +675,7 @@ rte_mempool_free(struct rte_mempool *mp)
 }
 
 /* create an empty mempool */
-static struct rte_mempool *
+struct rte_mempool *
 rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	int socket_id, unsigned flags)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index e0549c6..7a3e652 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -501,6 +501,174 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
+ * Create an empty mempool
+ *
+ * The mempool is allocated and initialized, but it is not populated: no
+ * memory is allocated for the mempool elements. The user has to call
+ * rte_mempool_populate_*() to add memory chunks to the pool. Once
+ * populated, the user may also want to initialize each object with
+ * rte_mempool_obj_iter().
+ *
+ * @param name
+ *   The name of the mempool.
+ * @param n
+ *   The maximum number of elements that can be added in the mempool.
+ *   The optimum size (in terms of memory usage) for a mempool is when n
+ *   is a power of two minus one: n = (2^q - 1).
+ * @param elt_size
+ *   The size of each element.
+ * @param cache_size
+ *   Size of the cache. See rte_mempool_create() for details.
+ * @param private_data_size
+ *   The size of the private data appended after the mempool
+ *   structure. This is useful for storing some private data after the
+ *   mempool structure, as is done for rte_mbuf_pool for example.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in the case of
+ *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
+ */
+struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags);
+/**
+ * Free a mempool
+ *
+ * Unlink the mempool from global list, free the memory chunks, and all
+ * memory referenced by the mempool. The objects must not be used by
+ * other cores as they will be freed.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ */
+void
+rte_mempool_free(struct rte_mempool *mp);
+
+/**
+ * Add physically contiguous memory for objects in the pool at init
+ *
+ * Add a virtually and physically contiguous memory chunk in the pool
+ * where objects can be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   The physical address
+ * @param len
+ *   The length of memory in bytes.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add physical memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated. The physical addresses corresponding to the virtual
+ * area are described in paddr[], pg_num, pg_shift.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   An array of physical addresses of each page composing the virtual
+ *   area.
+ * @param pg_num
+ *   Number of elements in the paddr array.
+ * @param pg_shift
+ *   LOG2 of the physical pages size.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunks are not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque);
+
+/**
+ * Add virtually contiguous memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param addr
+ *   The virtual address of memory that should be used to store objects.
+ *   Must be page-aligned.
+ * @param len
+ *   The length of memory in bytes. Must be page-aligned.
+ * @param pg_sz
+ *   The size of memory pages in this virtual area.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add memory for objects in the pool at init
+ *
+ * This is the default function used by rte_mempool_create() to populate
+ * the mempool. It adds memory allocated using rte_memzone_reserve().
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_default(struct rte_mempool *mp);
+
+/**
+ * Add memory from anonymous mapping for objects in the pool at init
+ *
+ * This function mmaps an anonymous memory zone that is locked in
+ * memory to store the objects of the mempool.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, 0 is returned, rte_errno is set, and no chunk is
+ *   added in the memory list of the mempool.
+ */
+int rte_mempool_populate_anon(struct rte_mempool *mp);
+
+/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index c4f2da0..7d1f670 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -16,11 +16,18 @@ DPDK_2.0 {
 	local: *;
 };
 
-DPDK_16.7 {
+DPDK_16.07 {
 	global:
 
 	rte_mempool_obj_iter;
 	rte_mempool_mem_iter;
+	rte_mempool_create_empty;
+	rte_mempool_populate_phys;
+	rte_mempool_populate_phys_tab;
+	rte_mempool_populate_virt;
+	rte_mempool_populate_default;
+	rte_mempool_populate_anon;
+	rte_mempool_free;
 
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 33/35] mem: avoid memzone/mempool/ring name truncation
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (31 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 32/35] mempool: make mempool populate and free api public Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 34/35] mempool: new flag when phys contig mem is not needed Olivier Matz
                   ` (4 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Check the return value of snprintf to ensure that the name of
the object is not truncated.

Also update the test so that it does not trigger an error in
that case.
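
The pattern is the usual snprintf truncation check, for instance:

  ret = snprintf(mp->name, sizeof(mp->name), "%s", name);
  if (ret < 0 || ret >= (int)sizeof(mp->name)) {
      rte_errno = ENAMETOOLONG;
      goto exit_unlock;
  }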

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c                    | 12 ++++++++----
 lib/librte_eal/common/eal_common_memzone.c | 10 +++++++++-
 lib/librte_mempool/rte_mempool.c           | 20 ++++++++++++++++----
 lib/librte_ring/rte_ring.c                 | 16 +++++++++++++---
 4 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 80d95d5..93098b3 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -407,21 +407,25 @@ test_mempool_same_name_twice_creation(void)
 {
 	struct rte_mempool *mp_tc;
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL == mp_tc)
+	if (NULL == mp_tc) {
+		printf("cannot create mempool\n");
 		return -1;
+	}
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL != mp_tc)
+	if (NULL != mp_tc) {
+		printf("should not be able to create mempool\n");
 		return -1;
+	}
 
 	return 0;
 }
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 711c845..774eb5d 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -126,6 +126,7 @@ static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
+	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
@@ -148,6 +149,13 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
+	if (strlen(name) >= sizeof(mz->name) - 1) {
+		RTE_LOG(DEBUG, EAL, "%s(): memzone <%s>: name too long\n",
+			__func__, name);
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	/* if alignment is not a power of two */
 	if (align && !rte_is_power_of_2(align)) {
 		RTE_LOG(ERR, EAL, "%s(): Invalid alignment: %u\n", __func__,
@@ -223,7 +231,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	struct rte_memzone *mz = get_next_free_memzone();
+	mz = get_next_free_memzone();
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2a7d6cd..397e6ec 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -305,11 +305,14 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
 {
-	int rg_flags = 0;
+	int rg_flags = 0, ret;
 	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_ring *r;
 
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	ret = snprintf(rg_name, sizeof(rg_name),
+		RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	if (ret < 0 || ret >= (int)sizeof(rg_name))
+		return -ENAMETOOLONG;
 
 	/* ring flags */
 	if (mp->flags & MEMPOOL_F_SP_PUT)
@@ -688,6 +691,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -741,7 +745,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	ret = snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 
 	mz = rte_memzone_reserve(mz_name, mempool_size, socket_id, mz_flags);
 	if (mz == NULL)
@@ -750,7 +758,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
-	snprintf(mp->name, sizeof(mp->name), "%s", name);
+	ret = snprintf(mp->name, sizeof(mp->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(mp->name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index d80faf3..ca0a108 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -122,6 +122,8 @@ int
 rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	unsigned flags)
 {
+	int ret;
+
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
@@ -140,7 +142,9 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
-	snprintf(r->name, sizeof(r->name), "%s", name);
+	ret = snprintf(r->name, sizeof(r->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(r->name))
+		return -ENAMETOOLONG;
 	r->flags = flags;
 	r->prod.watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
@@ -165,6 +169,7 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 	ssize_t ring_size;
 	int mz_flags = 0;
 	struct rte_ring_list* ring_list = NULL;
+	int ret;
 
 	ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
 
@@ -174,6 +179,13 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
+	ret = snprintf(mz_name, sizeof(mz_name), "%s%s",
+		RTE_RING_MZ_PREFIX, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	te = rte_zmalloc("RING_TAILQ_ENTRY", sizeof(*te), 0);
 	if (te == NULL) {
 		RTE_LOG(ERR, RING, "Cannot reserve memory for tailq\n");
@@ -181,8 +193,6 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
-	snprintf(mz_name, sizeof(mz_name), "%s%s", RTE_RING_MZ_PREFIX, name);
-
 	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
 
 	/* reserve a memory zone for this ring. If we can't get rte_config or
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 34/35] mempool: new flag when phys contig mem is not needed
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (32 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 33/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 16:19 ` [RFC 35/35] mempool: update copyright Olivier Matz
                   ` (3 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Add a new flag to remove the constraint of having physically contiguous
objects inside a mempool.

To start with, add this flag to the log history mempool; it could be
added in most cases where the objects are not mbufs.
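
For example (a sketch with illustrative arguments), a pool of objects
that are only accessed through their virtual address can now be
created with:

  mp = rte_mempool_create("ctrl_pool", n, elt_size, 0, 0,
          NULL, NULL, NULL, NULL,
          SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);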

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/eal_common_log.c |  2 +-
 lib/librte_mempool/rte_mempool.c       | 23 ++++++++++++++++++++---
 lib/librte_mempool/rte_mempool.h       |  5 +++++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index 1ae8de7..9122b34 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -322,7 +322,7 @@ rte_eal_common_log_init(FILE *default_log)
 				LOG_ELT_SIZE, 0, 0,
 				NULL, NULL,
 				NULL, NULL,
-				SOCKET_ID_ANY, 0);
+				SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);
 
 	if ((log_history_mp == NULL) &&
 	    ((log_history_mp = rte_mempool_lookup(LOG_HISTORY_MP_NAME)) == NULL)){
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 397e6ec..209449a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -412,7 +412,11 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 
 	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
 		off += mp->header_size;
-		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR)
+			mempool_add_elem(mp, (char *)vaddr + off,
+				RTE_BAD_PHYS_ADDR);
+		else
+			mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
 		off += mp->elt_size + mp->trailer_size;
 		i++;
 	}
@@ -441,6 +445,10 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, vaddr, RTE_BAD_PHYS_ADDR,
+			pg_num * pg_sz, free_cb, opaque);
+
 	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
 
 		/* populate with the largest group of contiguous pages */
@@ -481,6 +489,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
 		return -EINVAL;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, addr, RTE_BAD_PHYS_ADDR,
+			len, free_cb, opaque);
+
 	for (off = 0; off + pg_sz <= len &&
 		     mp->populated_size < mp->size; off += phys_len) {
 
@@ -530,6 +542,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 	size_t size, total_elt_sz, align, pg_sz, pg_shift;
+	phys_addr_t paddr;
 	unsigned mz_id, n;
 	int ret;
 
@@ -569,10 +582,14 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		/* use memzone physical address if it is valid */
+		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+			paddr = RTE_BAD_PHYS_ADDR;
+		else
+			paddr = mz->phys_addr;
+
 		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
-				mz->phys_addr, mz->len,
+				paddr, mz->len,
 				rte_mempool_memchunk_mz_free,
 				RTE_DECONST(void *, mz));
 		else
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7a3e652..7599790 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -235,6 +235,7 @@ struct rte_mempool {
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
 #define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
+#define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
 
 /**
  * @internal When debug is enabled, store some statistics.
@@ -416,6 +417,8 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
  *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
  *     when using rte_mempool_get() or rte_mempool_get_bulk() is
  *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   - MEMPOOL_F_NO_PHYS_CONTIG: If set, allocated objects won't
+ *     necessarily be contiguous in physical memory.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
  *   with rte_errno set appropriately. Possible rte_errno values include:
@@ -1221,6 +1224,8 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   A pointer (virtual address) to the element of the pool.
  * @return
  *   The physical address of the elt element.
+ *   If the mempool was created with MEMPOOL_F_NO_PHYS_CONTIG, the
+ *   returned value is RTE_BAD_PHYS_ADDR.
  */
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [RFC 35/35] mempool: update copyright
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (33 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 34/35] mempool: new flag when phys contig mem is not needed Olivier Matz
@ 2016-03-09 16:19 ` Olivier Matz
  2016-03-09 18:52   ` Stephen Hemminger
  2016-03-09 16:44 ` [RFC 00/35] mempool: rework memory allocation Olivier MATZ
                   ` (2 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-03-09 16:19 UTC (permalink / raw)
  To: dev

Update the copyright of files touched by this patch series.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 1 +
 lib/librte_mempool/rte_mempool.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 209449a..3851edd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7599790..56220a4 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [RFC 00/35] mempool: rework memory allocation
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (34 preceding siblings ...)
  2016-03-09 16:19 ` [RFC 35/35] mempool: update copyright Olivier Matz
@ 2016-03-09 16:44 ` Olivier MATZ
  2016-03-17  9:05 ` [PATCH] doc: mempool ABI deprecation notice for 16.07 Olivier Matz
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier MATZ @ 2016-03-09 16:44 UTC (permalink / raw)
  To: dev


On 03/09/2016 05:19 PM, Olivier Matz wrote:
> This series is a rework of mempool.
> 
> [...]

I forgot to mention that this series applies on top of Keith's
patch, which is also planned for 16.07:
http://www.dpdk.org/dev/patchwork/patch/10492/


Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 35/35] mempool: update copyright
  2016-03-09 16:19 ` [RFC 35/35] mempool: update copyright Olivier Matz
@ 2016-03-09 18:52   ` Stephen Hemminger
  2016-03-10 14:57     ` Panu Matilainen
  0 siblings, 1 reply; 150+ messages in thread
From: Stephen Hemminger @ 2016-03-09 18:52 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

I understand that 6Wind has made major contributions to DPDK in many places.

I would prefer that each file not get copyright additions from each
contributor; otherwise this sets a bad precedent where the source gets
cluttered with every contributor's name.


On Wed, Mar 9, 2016 at 8:19 AM, Olivier Matz <olivier.matz@6wind.com> wrote:

> Update the copyright of files touched by this patch series.
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_mempool/rte_mempool.c | 1 +
>  lib/librte_mempool/rte_mempool.h | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/lib/librte_mempool/rte_mempool.c
> b/lib/librte_mempool/rte_mempool.c
> index 209449a..3851edd 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -2,6 +2,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   Copyright(c) 2016 6WIND S.A.
>   *   All rights reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
> diff --git a/lib/librte_mempool/rte_mempool.h
> b/lib/librte_mempool/rte_mempool.h
> index 7599790..56220a4 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -2,6 +2,7 @@
>   *   BSD LICENSE
>   *
>   *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   Copyright(c) 2016 6WIND S.A.
>   *   All rights reserved.
>   *
>   *   Redistribution and use in source and binary forms, with or without
> --
> 2.1.4
>
>

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 16:19 ` [RFC 10/35] eal: introduce RTE_DECONST macro Olivier Matz
@ 2016-03-09 18:53   ` Stephen Hemminger
  2016-03-09 20:47     ` Olivier MATZ
  0 siblings, 1 reply; 150+ messages in thread
From: Stephen Hemminger @ 2016-03-09 18:53 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev

Can't we just write correct code rather than trying to trick the compiler?

On Wed, Mar 9, 2016 at 8:19 AM, Olivier Matz <olivier.matz@6wind.com> wrote:

> This macro removes the const attribute of a variable. It must be used
> with care in specific situations. It's better to use this macro instead
> of a manual cast, as it explicitly shows the intention of the developer.
>
> This macro is used in the next commit of the series.
>
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  lib/librte_eal/common/include/rte_common.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/rte_common.h
> b/lib/librte_eal/common/include/rte_common.h
> index 332f2a4..dc0fc83 100644
> --- a/lib/librte_eal/common/include/rte_common.h
> +++ b/lib/librte_eal/common/include/rte_common.h
> @@ -285,6 +285,15 @@ rte_align64pow2(uint64_t v)
>
>  /*********** Other general functions / macros ********/
>
> +/**
> + * Remove the const attribute of a variable
> + *
> + * This must be used with care in specific situations. It's better to
> + * use this macro instead of a manual cast, as it explicitly shows the
> + * intention of the developer.
> + */
> +#define RTE_DECONST(type, var) ((type)(uintptr_t)(const void *)(var))
> +
>  #ifdef __SSE2__
>  #include <emmintrin.h>
>  /**
> --
> 2.1.4
>
>

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 18:53   ` Stephen Hemminger
@ 2016-03-09 20:47     ` Olivier MATZ
  2016-03-09 21:01       ` Stephen Hemminger
  2016-03-09 21:22       ` Bruce Richardson
  0 siblings, 2 replies; 150+ messages in thread
From: Olivier MATZ @ 2016-03-09 20:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev

Hi,

On 03/09/2016 07:53 PM, Stephen Hemminger wrote:
> Can't we just write correct code rather than trying to trick the compiler?

Thank you for your comment. This macro is introduced for next
commit, I would be happy if you could help me to remove it.

My opinion is that using a macro like this is cleaner than doing a
discreet cast that nobody, because it is explicit. The const qualifier
is not only for the compiler, but also for people reading the code.

In this case, the objective is to be able to do the following:

uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
       rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
{
	/* call a function on all objects of a mempool */
}

static void
mempool_obj_audit(struct rte_mempool *mp,
	__rte_unused void *opaque, void *obj, __rte_unused unsigned idx)
{
	/* do some check on one mempool object */
}


void rte_mempool_audit(const struct rte_mempool *mp)
{
	/* iterate objects in mempool using rte_mempool_obj_iter() */
}


In the public API:

- rte_mempool_obj_iter() has the proper prototype: this function
  can be used to make rw access to the mempool
- rte_mempool_audit() has the proper public prototype: this function
  won't modify the mempool

Internally:
- we use a deconst to be able to make use of rte_mempool_obj_iter(),
  but we call a static function that won't modify the mempool.
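
Concretely, the wrapper above would look like this (sketch):

void rte_mempool_audit(const struct rte_mempool *mp)
{
	/* mempool_obj_audit() does not modify the mempool, so
	 * dropping the const is safe, and the macro makes it
	 * explicit to the reader */
	rte_mempool_obj_iter(RTE_DECONST(struct rte_mempool *, mp),
		mempool_obj_audit, NULL);
}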

Note that this kind of macro is also used in projects like FreeBSD:
http://fxr.watson.org/fxr/ident?i=__DECONST

You can also find many examples in Linux kernel where const qualifier
is silently dropped. For instance, you can grep the following in Linux:
 "git grep 'iov_base = (void \*)'"

If you have a better alternative, without duplicating the code,
I'll be happy to learn.


Thanks,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 20:47     ` Olivier MATZ
@ 2016-03-09 21:01       ` Stephen Hemminger
  2016-03-10  8:11         ` Olivier MATZ
  2016-03-09 21:22       ` Bruce Richardson
  1 sibling, 1 reply; 150+ messages in thread
From: Stephen Hemminger @ 2016-03-09 21:01 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

On Wed, 9 Mar 2016 21:47:35 +0100
Olivier MATZ <olivier.matz@6wind.com> wrote:

> Hi,
> 
> On 03/09/2016 07:53 PM, Stephen Hemminger wrote:
> > Can't we just write correct code rather than trying to trick the compiler?
> 
> Thank you for your comment. This macro is introduced for the next
> commit; I would be happy if you could help me remove it.
> 
> My opinion is that using a macro like this is cleaner than doing a
> discreet cast that nobody notices, because it is explicit. The const
> qualifier is not only for the compiler, but also for people reading
> the code.
> 
> In this case, the objective is to be able to do the following:
> 
> uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
>        rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
> {
> 	/* call a function on all objects of a mempool */
> }
> 
> static void
> mempool_obj_audit(struct rte_mempool *mp,
> 	__rte_unused void *opaque, void *obj, __rte_unused unsigned idx)
> {
> 	/* do some check on one mempool object */
> }
> 
> 
> void rte_mempool_audit(const struct rte_mempool *mp)
> {
> 	/* iterate objects in mempool using rte_mempool_obj_iter() */
> }
> 
> 
> In the public API:
> 
> - rte_mempool_obj_iter() has the proper prototype: this function
>   can be used to make rw access to the mempool
> - rte_mempool_audit() has the proper public prototype: this function
>   won't modify the mempool
> 
> Internally:
> - we use a deconst to be able to make use of rte_mempool_obj_iter(),
>   but we call a static function that won't modify the mempool.
> 
> Note that this kind of macro is also used in projects like FreeBSD:
> http://fxr.watson.org/fxr/ident?i=__DECONST
> 
> You can also find many examples in Linux kernel where const qualifier
> is silently dropped. For instance, you can grep the following in Linux:
>  "git grep 'iov_base = (void \*)'"
> 
> If you have a better alternative, without duplicating the code,
> I'll be happy to learn.

I would rather have the mempool_audit code take a non-const argument.
The macro method sets a bad precedent and will encourage more bad code.
Plus code checkers are likely to flag any such usage as suspect.

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 20:47     ` Olivier MATZ
  2016-03-09 21:01       ` Stephen Hemminger
@ 2016-03-09 21:22       ` Bruce Richardson
  2016-03-10  8:29         ` Olivier MATZ
  1 sibling, 1 reply; 150+ messages in thread
From: Bruce Richardson @ 2016-03-09 21:22 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

On Wed, Mar 09, 2016 at 09:47:35PM +0100, Olivier MATZ wrote:
> Hi,
> 
> On 03/09/2016 07:53 PM, Stephen Hemminger wrote:
> > Can't we just write correct code rather than trying to trick the compiler?
> 
> Thank you for your comment. This macro is introduced for the next
> commit; I would be happy if you could help me remove it.
> 
> My opinion is that using a macro like this is cleaner than doing a
> discreet cast that nobody notices, because it is explicit. The const
> qualifier is not only for the compiler, but also for people reading
> the code.
> 
> In this case, the objective is to be able to do the following:
> 
> uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
>        rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
> {
> 	/* call a function on all objects of a mempool */
> }
> 
> static void
> mempool_obj_audit(struct rte_mempool *mp,
> 	__rte_unused void *opaque, void *obj, __rte_unused unsigned idx)
> {
> 	/* do some check on one mempool object */
> }
> 
> 
> void rte_mempool_audit(const struct rte_mempool *mp)
> {
> 	/* iterate objects in mempool using rte_mempool_obj_iter() */
> }
> 
> 
> In the public API:
> 
> - rte_mempool_obj_iter() has the proper prototype: this function
>   can be used to make rw access to the mempool
> - rte_mempool_audit() has the proper public prototype: this function
>   won't modify the mempool
> 
> Internally:
> - we use a deconst to be able to make use of rte_mempool_obj_iter(),
>   but we call a static function that won't modify the mempool.
> 
> Note that this kind of macro is also used in projects like FreeBSD:
> http://fxr.watson.org/fxr/ident?i=__DECONST
> 
> You can also find many examples in Linux kernel where const qualifier
> is silently dropped. For instance, you can grep the following in Linux:
>  "git grep 'iov_base = (void \*)'"
> 
> If you have a better alternative, without duplicating the code,
> I'll be happy to learn.
> 
I really don't like this dropping of const either, but I do see the problem.
I'd nearly rather see two copies of the function than start dropping the const
in such a way. Also, I'd see having the function itself be a wrapper around a
macro as a better alternative too, assuming such a construction is possible.

/Bruce

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 21:01       ` Stephen Hemminger
@ 2016-03-10  8:11         ` Olivier MATZ
  2016-03-11 21:47           ` Stephen Hemminger
  0 siblings, 1 reply; 150+ messages in thread
From: Olivier MATZ @ 2016-03-10  8:11 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev


> I would rather have the mempool_audit code take a non-const argument.
> The macro method sets a bad precedent and will encourage more bad code.
> Plus code checkers are likely to flag any such usage as suspect.

Doing that would imply dropping the const qualifier in several
functions:

- rte_mempool_dump()
- rte_mempool_audit()
- mempool_audit_cookies()
- mempool_audit_cache()

This is maybe acceptable, but I think it is more important to
keep the const in the API, explicitly telling the API user that
this parameter is read-only.


Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-09 21:22       ` Bruce Richardson
@ 2016-03-10  8:29         ` Olivier MATZ
  2016-03-10  9:26           ` Bruce Richardson
  0 siblings, 1 reply; 150+ messages in thread
From: Olivier MATZ @ 2016-03-10  8:29 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

>> If you have a better alternative, without duplicating the code,
>> I'll be happy to learn.
> 
> I really don't like this dropping of const either, but I do see the problem.
> I'd nearly rather see two copies of the function than start dropping the const
> in such a way.

I don't think duplicating the code is a good option.

> Also, I'd see having the function itself be a wrapper around a
> macro as a better alternative too, assuming such a construction is possible.

Sorry, I'm not sure I understand. Could you please elaborate?


Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-10  8:29         ` Olivier MATZ
@ 2016-03-10  9:26           ` Bruce Richardson
  2016-03-10 10:05             ` Olivier MATZ
  0 siblings, 1 reply; 150+ messages in thread
From: Bruce Richardson @ 2016-03-10  9:26 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

On Thu, Mar 10, 2016 at 09:29:03AM +0100, Olivier MATZ wrote:
> >> If you have a better alternative, without duplicating the code,
> >> I'll be happy to learn.
> > 
> > I really don't like this dropping of const either, but I do see the problem.
> > I'd nearly rather see two copies of the function than start dropping the const
> > in such a way.
> 
> I don't think duplicating the code is a good option.

Personally, I'd actually prefer it to eliminating const-ness. I'm a big fan of
having the compiler work for its pay by doing typechecking for us. :-)
However, I would hope that by using a macro, as I suggest below, we could have
two functions without duplicating all the code.

> 
> > Also, I'd see having the function itself be a wrapper around a
> > macro as a better alternative too, assuming such a construction is possible.
> 
> Sorry, I'm not sure I understand. Could you please elaborate?
> 
The part of the code which iterates through the elements and calls a function
for each could be a macro, which would mean that it would be fine to use the
macro with a const mempool so long as the function being called took const
parameters too, i.e. the type checking is done post-expansion. Basically,
doing a multi-type function via macro (like MIN/MAX macros etc).

Haven't tried writing the code for it though, so no idea if it's actually doable
or what the result looks like. However, at worst I would think you could 
extract the body of the function to make it a macro, and then call it from two
wrapper functions, one of which takes non-const param, the other of which
takes const param. The macro itself could use typeof() internally to maintain
const-ness or not.
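
To make the idea concrete, here is a minimal standalone sketch of
one possible shape of this construction (the `struct pool` type and
all names are hypothetical, not DPDK code; the real iteration logic
would go in the macro body):

#include <stddef.h>

struct pool { void *objs[16]; size_t n; };

/* the iteration body lives in a macro: type checking happens after
 * expansion, so it works for const and non-const pools alike */
#define POOL_OBJ_ITER(p, cb, arg) do {				\
	size_t __i;						\
	for (__i = 0; __i < (p)->n; __i++)			\
		(cb)((p), (p)->objs[__i], (arg));		\
} while (0)

/* read-write wrapper */
static void pool_obj_iter(struct pool *p,
	void (*cb)(struct pool *, void *, void *), void *arg)
{
	POOL_OBJ_ITER(p, cb, arg);
}

/* const wrapper: same macro, const-qualified pool and callback */
static void pool_obj_iter_const(const struct pool *p,
	void (*cb)(const struct pool *, void *, void *), void *arg)
{
	POOL_OBJ_ITER(p, cb, arg);
}

As suggested above, typeof() could additionally be used inside the
macro if it needed a local copy of the pool pointer that keeps the
caller's qualifier.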

/Bruce

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-10  9:26           ` Bruce Richardson
@ 2016-03-10 10:05             ` Olivier MATZ
  0 siblings, 0 replies; 150+ messages in thread
From: Olivier MATZ @ 2016-03-10 10:05 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Hi Bruce,

On 03/10/2016 10:26 AM, Bruce Richardson wrote:
> On Thu, Mar 10, 2016 at 09:29:03AM +0100, Olivier MATZ wrote:
>>>> If you have a better alternative, without duplicating the code,
>>>> I'll be happy to learn.
>>>
>>> I really don't like this dropping of const either, but I do see the problem.
>>> I'd nearly rather see two copies of the function than start dropping the const
>>> in such a way.
>>
>> I don't think duplicating the code is a good option.
> 
> Personally, I'd actually prefer it to eliminating const-ness. I'm a big fan of
> having the compiler work for its pay by doing typechecking for us. :-)
> However, I would hope that by using a macro, as I suggest below, we could have
> two functions without duplicating all the code.

Does that mean we should duplicate all iterate-like functions in
DPDK to have a const and a non-const version?
I would personally find that quite odd.


>>> Also, I'd see having the function itself be a wrapper around a
>>> macro as a better alternative too, assuming such a construction is possible.
>>
>> Sorry, I'm not sure I understand. Could you please elaborate?
>>
> The part of the code which iterates through the elements and calls a function
> for each could be a macro, which would mean that it would be fine to use the
> macro with a const mempool so long as the function being called took const
> parameters too, i.e. the type checking is done post-expansion. Basically,
> doing a multi-type function via macro (like MIN/MAX macros etc).
> 
> Haven't tried writing the code for it though, so no idea if it's actually doable
> or what the result looks like. However, at worst I would think you could 
> extract the body of the function to make it a macro, and then call it from two
> wrapper functions, one of which takes non-const param, the other of which
> takes const param. The macro itself could use typeof() internally to maintain
> const-ness or not.

OK, it's clearer, thanks.
But I'm not sure having several lines of code inside a macro
is something we should encourage either.


To summarize, I see 4 solutions:

1 do a discreet cast: I think that's what people usually do in these
  cases; I would not be surprised to find several in current DPDK
2 use a RTE_DECONST() macro: it points out that we are doing a bad cast,
  which is valuable information for the reviewer (proof: you saw it :) )
3 duplicate the iterate functions to have a const and a non-const
  version, and use a macro to duplicate the code
4 remove the const from these iterate functions, implying the removal of
  const in some other functions like the dumps


I still personally prefer solution 2, because it keeps the API clean,
giving the proper information to the user and the compiler: "this mempool
structure won't be modified". Using a cast like this should of course
be avoided most of the time, but I think it's acceptable in cases like
this, knowing that it is properly pointed out by the RTE_DECONST macro.



Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 35/35] mempool: update copyright
  2016-03-09 18:52   ` Stephen Hemminger
@ 2016-03-10 14:57     ` Panu Matilainen
  0 siblings, 0 replies; 150+ messages in thread
From: Panu Matilainen @ 2016-03-10 14:57 UTC (permalink / raw)
  To: Stephen Hemminger, Olivier Matz; +Cc: dev

On 03/09/2016 08:52 PM, Stephen Hemminger wrote:
> I understand that 6Wind has made major contributions to DPDK in many places.
>
> I would prefer that each file not get copyright additions from each
> contributor,
> otherwise this starts a bad precedent where the source gets cluttered with
> every contributor.

That, and they also add rather useless noise to patches and commit 
history just because people feel compelled to update the copyright years.

Many projects have a separate credits file where contributors get noted,
but I guess those tend to be under copyleft licenses; the BSD license
expects somebody to claim copyright.

Anyway, I'd much rather see one toplevel license where all such updates
go. It'd make life easier for packagers whose distros require including
a license file in packages, and it'd also help fix the first impression
of dpdk being under [L]GPL (which easily happens if you just glance at
the toplevel source directory).

This is of course getting a bit side-tracked for this patch...

	- Panu -

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [RFC 10/35] eal: introduce RTE_DECONST macro
  2016-03-10  8:11         ` Olivier MATZ
@ 2016-03-11 21:47           ` Stephen Hemminger
  0 siblings, 0 replies; 150+ messages in thread
From: Stephen Hemminger @ 2016-03-11 21:47 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

On Thu, 10 Mar 2016 09:11:52 +0100
Olivier MATZ <olivier.matz@6wind.com> wrote:

> 
> > I would rather have the mempool_audit code take a non-const argument.
> > The macro method sets a bad precedent and will encourage more bad code.
> > Plus code checkers are likely to flag any such usage as suspect.
> 
> Doing that would imply dropping the const qualifier in several
> functions:
> 
> - rte_mempool_dump()
> - rte_mempool_audit()
> - mempool_audit_cookies()
> - mempool_audit_cache()

Sure, no problem.

^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH] doc: mempool ABI deprecation notice for 16.07
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (35 preceding siblings ...)
  2016-03-09 16:44 ` [RFC 00/35] mempool: rework memory allocation Olivier MATZ
@ 2016-03-17  9:05 ` Olivier Matz
  2016-04-04 14:38   ` Thomas Monjalon
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-03-17  9:05 UTC (permalink / raw)
  To: dev

Add a deprecation notice for coming changes in mempool for 16.07.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 252a096..3e8e327 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -33,3 +33,11 @@ Deprecation Notices
 * ABI changes are planned for adding four new flow types. This impacts
   RTE_ETH_FLOW_MAX. The release 2.2 does not contain these ABI changes,
   but release 2.3 will.
+
+* librte_mempool: new fixes and features will be added in 16.07:
+  allocation of large mempool in several virtual memory chunks, new API
+  to populate a mempool, new API to free a mempool, allocation in
+  anonymous mapping, drop of specific dom0 code. These changes will
+  induce a modification of the rte_mempool structure, plus a
+  modification of the API of rte_mempool_obj_iter(), implying a breakage
+  of the ABI.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [PATCH] doc: mempool ABI deprecation notice for 16.07
  2016-03-17  9:05 ` [PATCH] doc: mempool ABI deprecation notice for 16.07 Olivier Matz
@ 2016-04-04 14:38   ` Thomas Monjalon
  2016-04-05  9:27     ` Hunt, David
  0 siblings, 1 reply; 150+ messages in thread
From: Thomas Monjalon @ 2016-04-04 14:38 UTC (permalink / raw)
  To: dev; +Cc: Olivier Matz

2016-03-17 10:05, Olivier Matz:
> Add a deprecation notice for coming changes in mempool for 16.07.
[...]
> +* librte_mempool: new fixes and features will be added in 16.07:
> +  allocation of large mempool in several virtual memory chunks, new API
> +  to populate a mempool, new API to free a mempool, allocation in
> +  anonymous mapping, drop of specific dom0 code. These changes will
> +  induce a modification of the rte_mempool structure, plus a
> +  modification of the API of rte_mempool_obj_iter(), implying a breakage
> +  of the ABI.

Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>

Other people involved in the discussion wanting to bring their support?

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH] doc: mempool ABI deprecation notice for 16.07
  2016-04-04 14:38   ` Thomas Monjalon
@ 2016-04-05  9:27     ` Hunt, David
  2016-04-05 14:08       ` Wiles, Keith
  0 siblings, 1 reply; 150+ messages in thread
From: Hunt, David @ 2016-04-05  9:27 UTC (permalink / raw)
  To: Thomas Monjalon, dev; +Cc: Olivier Matz


On 4/4/2016 3:38 PM, Thomas Monjalon wrote:
> 2016-03-17 10:05, Olivier Matz:
>> Add a deprecation notice for coming changes in mempool for 16.07.
> [...]
>> +* librte_mempool: new fixes and features will be added in 16.07:
>> +  allocation of large mempool in several virtual memory chunks, new API
>> +  to populate a mempool, new API to free a mempool, allocation in
>> +  anonymous mapping, drop of specific dom0 code. These changes will
>> +  induce a modification of the rte_mempool structure, plus a
>> +  modification of the API of rte_mempool_obj_iter(), implying a breakage
>> +  of the ABI.
> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
>
> Other people involved in the discussion wanting to bring their support?

Acked-by: David Hunt <david.hunt@intel.com>


Regards,
David.

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH] doc: mempool ABI deprecation notice for 16.07
  2016-04-05  9:27     ` Hunt, David
@ 2016-04-05 14:08       ` Wiles, Keith
  2016-04-05 15:17         ` Thomas Monjalon
  0 siblings, 1 reply; 150+ messages in thread
From: Wiles, Keith @ 2016-04-05 14:08 UTC (permalink / raw)
  To: Hunt, David, Thomas Monjalon, dev; +Cc: Olivier Matz

>
>On 4/4/2016 3:38 PM, Thomas Monjalon wrote:
>> 2016-03-17 10:05, Olivier Matz:
>>> Add a deprecation notice for coming changes in mempool for 16.07.
>> [...]
>>> +* librte_mempool: new fixes and features will be added in 16.07:
>>> +  allocation of large mempool in several virtual memory chunks, new API
>>> +  to populate a mempool, new API to free a mempool, allocation in
>>> +  anonymous mapping, drop of specific dom0 code. These changes will
>>> +  induce a modification of the rte_mempool structure, plus a
>>> +  modification of the API of rte_mempool_obj_iter(), implying a breakage
>>> +  of the ABI.
>> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
>>
>> Other people involved in the discussion wanting to bring their support?
>
>Acked-by: David Hunt <david.hunt@intel.com>

Acked-by: Keith Wiles <keith.wiles@intel.com>
>
>
>Regards,
>David.
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH] doc: mempool ABI deprecation notice for 16.07
  2016-04-05 14:08       ` Wiles, Keith
@ 2016-04-05 15:17         ` Thomas Monjalon
  0 siblings, 0 replies; 150+ messages in thread
From: Thomas Monjalon @ 2016-04-05 15:17 UTC (permalink / raw)
  To: Olivier Matz; +Cc: Wiles, Keith, Hunt, David, dev

> >>> Add a deprecation notice for coming changes in mempool for 16.07.
> >> [...]
> >>> +* librte_mempool: new fixes and features will be added in 16.07:
> >>> +  allocation of large mempool in several virtual memory chunks, new API
> >>> +  to populate a mempool, new API to free a mempool, allocation in
> >>> +  anonymous mapping, drop of specific dom0 code. These changes will
> >>> +  induce a modification of the rte_mempool structure, plus a
> >>> +  modification of the API of rte_mempool_obj_iter(), implying a breakage
> >>> +  of the ABI.
> >> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
> >>
> >> Other people involved in the discussion wanting to bring their support?
> >
> >Acked-by: David Hunt <david.hunt@intel.com>
> 
> Acked-by: Keith Wiles <keith.wiles@intel.com>

Applied, thanks

^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH 00/36] mempool: rework memory allocation
  2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
                   ` (36 preceding siblings ...)
  2016-03-17  9:05 ` [PATCH] doc: mempool ABI deprecation notice for 16.07 Olivier Matz
@ 2016-04-14 10:19 ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 01/36] mempool: fix comments and style Olivier Matz
                     ` (37 more replies)
  37 siblings, 38 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

This series is a rework of mempool. For those who don't want to read
the whole cover letter, here is a summary:

- it is not possible to allocate large mempools if there is not enough
  contiguous memory; this series solves this issue
- introduce new APIs with fewer arguments: "create, populate, obj_init"
- allow freeing a mempool
- split code into smaller functions, which will ease the introduction of
  ext_handler
- remove test-pmd anonymous mempool creation
- remove most of the dom0-specific mempool code
- open the door for an eal_memory rework: we probably don't need large
  contiguous memory areas anymore; working with pages would be enough

This breaks the ABI, as announced in the deprecation notice merged in 16.04.
The API stays almost the same, no modification is needed in examples app
or in test-pmd. Only kni and mellanox drivers are slightly modified.

This patch applies on top of 16.04 + v5 of Keith's patch:
"mempool: reduce rte_mempool structure size"

Changes RFC -> v1:

- remove the rte_deconst macro, and remove some const qualifiers in
  dump/audit functions
- rework modifications in mellanox drivers to ensure the mempool is
  virtually contiguous
- fix mempool memory chunk iteration (bad pointer was used)
- fix compilation on freebsd: replace MAP_LOCKED flag by mlock()
- fix compilation on tilera (pointer arithmetics)
- slightly rework and clean the mempool autotest
- fix mempool autotest on bsd
- more validation (especially mellanox drivers and kni that were not
  tested in RFC)
- passed autotests (x86_64-native-linuxapp-gcc and x86_64-native-bsdapp-gcc)
- rebase on head, reorder the patches a bit and fix minor split issues


Description of the initial issue
--------------------------------

The allocation of an mbuf pool can fail even if there is enough memory.
The problem is related to the way the memory is allocated and used in
dpdk. It is particularly annoying with mbuf pools, but allocation can
also fail in other use cases that need a large amount of memory.

- rte_malloc() allocates physically contiguous memory, which is needed
  for mempools, but useless most of the time.

  Allocating a large physically contiguous zone is often impossible
  because the system provides hugepages, which may not be contiguous.

- rte_mempool_create() (and therefore rte_pktmbuf_pool_create())
  requires a physically contiguous zone.

- rte_mempool_xmem_create() does not solve the issue as it still
  needs the memory to be virtually contiguous, and there is no
  way in dpdk to allocate a virtually contiguous memory that is
  not also physically contiguous.

How to reproduce the issue
--------------------------

- start the dpdk with some 2MB hugepages (it can also occur with 1GB)
- allocate a large mempool
- even if there is enough memory, the allocation can fail

Example:

  git clone http://dpdk.org/git/dpdk
  cd dpdk
  make config T=x86_64-native-linuxapp-gcc
  make -j32
  mkdir -p /mnt/huge
  mount -t hugetlbfs nodev /mnt/huge
  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

  # we try to allocate a mempool whose size is ~450MB, it fails
  ./build/app/testpmd -l 2,4 -- --total-num-mbufs=200000 -i

The EAL logs ("EAL: Virtual area found at...") show that there are
several zones, but all smaller than 450MB.

Workarounds:

- Use 1GB hugepages: it sometimes works, but for very large
  pools (millions of mbufs) there is the same issue. Moreover,
  it would consume at least 1GB of memory, which can be a lot
  in some cases.

- Reboot the machine or allocate hugepages at boot time: this increases
  the chances of getting more contiguous memory, but does not completely
  solve the issue.

Solutions
---------

Below is a list of proposed solutions. I implemented a quick and dirty
PoC of solution 1, but it does not work in all conditions and it's
really an ugly hack. This series implements solution 4, which looks
the best to me, knowing it does not prevent further enhancements
to dpdk memory in the future (solution 3, for instance).

Solution 1: in application
--------------------------

- allocate several hugepages using rte_malloc() or rte_memzone_reserve()
  (only keeping complete hugepages)
- parse memsegs and /proc/maps to check which files mmaps these pages
- mmap the files in a contiguous virtual area
- use rte_mempool_xmem_create()

Cons:

- 1a. parsing the memsegs of rte config in the application does not
  use a public API, and can be broken if internal dpdk code changes
- 1b. some memory is lost due to malloc headers. Also, if the memory is
  very fragmented (ex: all 2MB pages are physically separated), it does
  not work at all because we cannot get any complete page. It is not
  possible to use a lower level allocator since commit fafcc11985a.
- 1c. we cannot use rte_pktmbuf_pool_create(), so we need to use mempool
  api and do a part of the job manually
- 1d. it breaks secondary processes as the virtual addresses won't be
  mmap'd at the same place in the secondary process
- 1e. it only fixes the issue for the mbuf pool of the application,
  internal pools in dpdk libraries are not modified
- 1f. this is a pure linux solution (rte_map files)
- 1g. The application has to be aware of the RTE_EAL_SINGLE_SEGMENTS option
  that changes the way hugepages are mapped. By the way, it's strange
  to have such a compile-time option; we should probably have only
  one behavior that works all the time.

Solution 2: in dpdk memory allocator
------------------------------------

- do the same than solution 1 in a new function rte_malloc_non_contig():
  allocate several chunks and mmap them in a contiguous virtual memory
- a flag has to be added in malloc header to do the proper cleanup in
  rte_free() (free all the chunks, munmap the memory)
- introduce a new rte_mem_get_physmap(*physmap,addr, len) that returns
  the virt2phys mapping of a virtual area in dpdk
- add a mempool flag MEMPOOL_F_NON_PHYS_CONTIG to use
  rte_malloc_non_contig() to allocate the area storing the objects

Cons:

- 2a. same as 1d: it breaks secondary processes if the mempool flag is
  used.
- 2b. same as 1b: some memory is lost due to malloc headers, and it
  cannot work if memory is too fragmented.
- 2c. rte_malloc_virt2phy() cannot be used on these zones. It would
  return the physical address of the first page. It would be better to
  return an error in this case.
- 2d. need to check how to implement this on bsd (TBD)

Solution 3: in dpdk eal memory
------------------------------

- Rework the way hugepages are mmap'd in dpdk: instead of having several
  rte_map* files, just mmap one file per node. It may drastically
  simplify EAL memory management in dpdk.
- An API should be added to retrieve the physical mapping of a virtual
  area (ex: rte_mem_get_physmap(*physmap, addr, len))
- rte_malloc() and rte_memzone_reserve() won't allocate physically
  contiguous memory anymore (TBD)
- Update mempool to always use the rte_mempool_xmem_create() version

Cons:

- 3a. lot of rework in eal memory, it will induce some behavior changes
  and maybe api changes
- 3b. possible conflicts with xen_dom0 mempool

Solution 4: in mempool
----------------------

- Introduce a new API to fill a mempool with zones that are not
  virtually contiguous. It requires adding new functions to create and
  populate a mempool. Example (TBD; a usage sketch follows this list):

  - rte_mempool_create_empty(name, n, elt_size, cache_size, priv_size)
  - rte_mempool_populate(mp, addr, len): add virtual memory for objects
  - rte_mempool_obj_iter(mp, obj_cb, arg): call a cb for each object

- update rte_mempool_create() to allocate objects in several memory
  chunks by default if there is no large enough physically contiguous
  memory.
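
  As a sketch of the intended flow (the signatures above are TBD, so
  the argument lists and variable names below are illustrative only):

    struct rte_mempool *mp;

    /* allocate the mempool header only, no object memory yet */
    mp = rte_mempool_create_empty("pool", n, elt_size, cache_size,
        priv_size);

    /* add object memory in as many virtually contiguous chunks as
     * needed; each call may come from a different allocation */
    rte_mempool_populate(mp, addr1, len1);
    rte_mempool_populate(mp, addr2, len2);

    /* finally, run the object initializer on each object */
    rte_mempool_obj_iter(mp, obj_init, obj_init_arg);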

Tests done
----------

Compilation
~~~~~~~~~~~

The following targets:

 x86_64-native-linuxapp-gcc
 i686-native-linuxapp-gcc
 x86_x32-native-linuxapp-gcc
 x86_64-native-linuxapp-clang
 x86_64-native-bsdapp-gcc
 ppc_64-power8-linuxapp-gcc
 tile-tilegx-linuxapp-gcc (only the mempool files; the full target does not compile)

Libraries with and without debug, in static and shared mode + examples.

autotests
~~~~~~~~~

Passed all autotests on x86_64-native-linuxapp-gcc (including kni) and
mempool-related autotests on x86_64-native-bsdapp-gcc.

test-pmd
~~~~~~~~

# now starts fine; was failing before if the mempool was too fragmented
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -- -i --port-topology=chained

# still ok
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 -- -i --port-topology=chained --mp-anon
set fwd txonly
start
stop

# fails, but was failing before too. The problem is that the physical
# addresses are not properly set when using --no-huge. The mempool phys addrs
# are now correct, but the zones allocated through memzone_reserve() are
# still wrong. This could be fixed in a future series.
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 --no-huge -- -i --port-topology=chained
set fwd txonly
start
stop


Olivier Matz (36):
  mempool: fix comments and style
  mempool: replace elt_size by total_elt_size
  mempool: uninline function to check cookies
  mempool: use sizeof to get the size of header and trailer
  mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t
  mempool: update library version
  mempool: list objects when added in the mempool
  mempool: remove const attribute in mempool_walk
  mempool: remove const qualifier in dump and audit
  mempool: use the list to iterate the mempool elements
  mempool: use the list to audit all elements
  mempool: use the list to initialize mempool objects
  mempool: create the internal ring in a specific function
  mempool: store physaddr in mempool objects
  mempool: remove MEMPOOL_IS_CONTIG()
  mempool: store memory chunks in a list
  mempool: new function to iterate the memory chunks
  mempool: simplify xmem_usage
  mempool: introduce a free callback for memory chunks
  mempool: make page size optional when getting xmem size
  mempool: default allocation in several memory chunks
  eal: lock memory when using no-huge
  mempool: support no-hugepage mode
  mempool: replace mempool physaddr by a memzone pointer
  mempool: introduce a function to free a mempool
  mempool: introduce a function to create an empty mempool
  eal/xen: return machine address without knowing memseg id
  mempool: rework support of xen dom0
  mempool: create the internal ring when populating
  mempool: populate a mempool with anonymous memory
  mempool: make mempool populate and free api public
  test-pmd: remove specific anon mempool code
  mem: avoid memzone/mempool/ring name truncation
  mempool: new flag when phys contig mem is not needed
  app/test: rework mempool test
  mempool: update copyright

 app/test-pmd/Makefile                        |    4 -
 app/test-pmd/mempool_anon.c                  |  201 -----
 app/test-pmd/mempool_osdep.h                 |   54 --
 app/test-pmd/testpmd.c                       |   23 +-
 app/test/test_mempool.c                      |  243 +++---
 doc/guides/rel_notes/release_16_04.rst       |    2 +-
 drivers/net/mlx4/mlx4.c                      |  140 ++--
 drivers/net/mlx5/mlx5_rxtx.c                 |  140 ++--
 drivers/net/mlx5/mlx5_rxtx.h                 |    4 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h        |    2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c   |    4 +-
 lib/librte_eal/common/eal_common_log.c       |    2 +-
 lib/librte_eal/common/eal_common_memzone.c   |   10 +-
 lib/librte_eal/common/include/rte_memory.h   |   11 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c     |    2 +-
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c |   17 +-
 lib/librte_kni/rte_kni.c                     |   12 +-
 lib/librte_mempool/Makefile                  |    5 +-
 lib/librte_mempool/rte_dom0_mempool.c        |  133 ----
 lib/librte_mempool/rte_mempool.c             | 1042 +++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h             |  594 +++++++--------
 lib/librte_mempool/rte_mempool_version.map   |   18 +-
 lib/librte_ring/rte_ring.c                   |   16 +-
 23 files changed, 1377 insertions(+), 1302 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

-- 
2.1.4

^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH 01/36] mempool: fix comments and style
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 14:15     ` Wiles, Keith
  2016-04-14 10:19   ` [PATCH 02/36] mempool: replace elt_size by total_elt_size Olivier Matz
                     ` (36 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

No functional change, just fix some comments and styling issues.
Also avoid duplicating comments between rte_mempool_create()
and rte_mempool_xmem_create().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++---
 lib/librte_mempool/rte_mempool.h | 59 +++++++++-------------------------------
 2 files changed, 26 insertions(+), 50 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7a0e07e..ce78476 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -152,6 +152,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
+/* Iterate through objects at the given address
+ *
+ * Given the pointer to the memory, and its topology in physical memory
+ * (the physical addresses table), iterate through the "elt_num" objects
+ * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * chunk, invoke a callback. It returns the effective number of objects
+ * in this memory. */
 uint32_t
 rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
@@ -341,10 +348,8 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/*
- * Calculate how much memory would be actually required with the
- * given memory footprint to store required number of elements.
- */
+/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
+ * argument to the end of the object. */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
 	__rte_unused uint32_t idx)
@@ -352,6 +357,10 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
 
+/*
+ * Calculate how much memory would be actually required with the
+ * given memory footprint to store required number of elements.
+ */
 ssize_t
 rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8595e77..bd78df5 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -214,7 +214,7 @@ struct rte_mempool {
 
 }  __rte_cache_aligned;
 
-#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread in memory. */
+#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
@@ -270,7 +270,8 @@ struct rte_mempool {
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objhdr));
+	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj,
+		sizeof(struct rte_mempool_objhdr));
 }
 
 /**
@@ -544,8 +545,9 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 /**
  * Create a new mempool named *name* in memory.
  *
- * This function uses ``memzone_reserve()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
+ * The pool contains n elements of elt_size. Its size is set to n.
+ * This function uses ``memzone_reserve()`` to allocate the mempool header
+ * (and the objects if vaddr is NULL).
  * Depending on the input parameters, mempool elements can be either allocated
  * together with the mempool header, or an externally provided memory buffer
  * could be used to store mempool objects. In later case, that external
@@ -560,18 +562,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  * @param elt_size
  *   The size of each element.
  * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
+ *   Size of the cache. See rte_mempool_create() for details.
  * @param private_data_size
  *   The size of the private data appended after the mempool
  *   structure. This is useful for storing some private data after the
@@ -585,35 +576,17 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   An opaque pointer to data that can be used in the mempool
  *   constructor function.
  * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
+ *   A function called for each object at initialization of the pool.
+ *   See rte_mempool_create() for details.
  * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
+ *   An opaque pointer passed to the object constructor function.
  * @param socket_id
  *   The *socket_id* argument is the socket identifier in the case of
  *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
  *   constraint for the reserved zone.
  * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
  * @param vaddr
  *   Virtual address of the externally allocated memory buffer.
  *   Will be used to store mempool objects.
@@ -626,13 +599,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   LOG2 of the physical pages size.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
  */
 struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 02/36] mempool: replace elt_size by total_elt_size
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
  2016-04-14 10:19   ` [PATCH 01/36] mempool: fix comments and style Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 14:18     ` Wiles, Keith
  2016-04-14 10:19   ` [PATCH 03/36] mempool: uninline function to check cookies Olivier Matz
                     ` (35 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

In some mempool functions, we use the size of the elements as arguments or in
variables. There is confusion about whether the size includes the header
and trailer or not.

To avoid this confusion:
- update the API documentation
- rename the variables and argument names as "elt_size" when the size does not
  include the header and trailer, or else as "total_elt_size".

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 21 +++++++++++----------
 lib/librte_mempool/rte_mempool.h | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ce78476..90b5b1b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -156,13 +156,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  *
  * Given the pointer to the memory, and its topology in physical memory
  * (the physical addresses table), iterate through the "elt_num" objects
- * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory. */
 uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
+rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
 	uint32_t i, j, k;
 	uint32_t pgn, pgf;
@@ -178,7 +178,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	while (i != elt_num && j != pg_num) {
 
 		start = RTE_ALIGN_CEIL(va, align);
-		end = start + elt_sz;
+		end = start + total_elt_sz;
 
 		/* index of the first page for the next element. */
 		pgf = (end >> pg_shift) - (start >> pg_shift);
@@ -255,6 +255,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 		mempool_obj_populate, &arg);
 }
 
+/* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 	struct rte_mempool_objsz *sz)
@@ -332,17 +333,17 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  * Calculate maximum amount of memory required to store given number of objects.
  */
 size_t
-rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
+rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
 	size_t n, pg_num, pg_sz, sz;
 
 	pg_sz = (size_t)1 << pg_shift;
 
-	if ((n = pg_sz / elt_sz) > 0) {
+	if ((n = pg_sz / total_elt_sz) > 0) {
 		pg_num = (elt_num + n - 1) / n;
 		sz = pg_num << pg_shift;
 	} else {
-		sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
+		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 	}
 
 	return sz;
@@ -362,7 +363,7 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
+rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	uint32_t n;
@@ -373,7 +374,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, elt_sz, 1,
+	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index bd78df5..ca4657f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1289,7 +1289,7 @@ struct rte_mempool *rte_mempool_lookup(const char *name);
  * calculates header, trailer, body and total sizes of the mempool object.
  *
  * @param elt_size
- *   The size of each element.
+ *   The size of each element, without header and trailer.
  * @param flags
  *   The flags used for the mempool creation.
  *   Consult rte_mempool_create() for more information about possible values.
@@ -1315,14 +1315,15 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param pg_shift
  *   LOG2 of the physical pages size.
  * @return
  *   Required memory size aligned at page boundary.
  */
-size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
+size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pg_shift);
 
 /**
@@ -1336,8 +1337,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   Will be used to store mempool objects.
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param paddr
  *   Array of physical addresses of the pages that comprises given memory
  *   buffer.
@@ -1351,8 +1353,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   buffer is too small, return a negative value whose absolute value
  *   is the actual number of elements that can be stored in that buffer.
  */
-ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
+ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift);
 
 /**
  * Walk list of all memory pools
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 03/36] mempool: uninline function to check cookies
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
  2016-04-14 10:19   ` [PATCH 01/36] mempool: fix comments and style Olivier Matz
  2016-04-14 10:19   ` [PATCH 02/36] mempool: replace elt_size by total_elt_size Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 04/36] mempool: use sizeof to get the size of header and trailer Olivier Matz
                     ` (34 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

There's no reason to keep this function inlined. Move it to
rte_mempool.c.

Note: we don't see it in the patch, but the #pragma ignoring
"-Wcast-qual" is still there in the C file.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 68 +++++++++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool.h | 77 ++--------------------------------------
 2 files changed, 71 insertions(+), 74 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 90b5b1b..2e1ccc0 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -709,6 +709,74 @@ struct mempool_audit_arg {
 	uint32_t obj_num;
 };
 
+/* check and update cookies or panic (internal) */
+void __mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free)
+{
+	struct rte_mempool_objhdr *hdr;
+	struct rte_mempool_objtlr *tlr;
+	uint64_t cookie;
+	void *tmp;
+	void *obj;
+	void **obj_table;
+
+	/* Force to drop the "const" attribute. This is done only when
+	 * DEBUG is enabled */
+	tmp = (void *) obj_table_const;
+	obj_table = (void **) tmp;
+
+	while (n--) {
+		obj = obj_table[n];
+
+		if (rte_mempool_from_obj(obj) != mp)
+			rte_panic("MEMPOOL: object is owned by another "
+				  "mempool\n");
+
+		hdr = __mempool_get_header(obj);
+		cookie = hdr->cookie;
+
+		if (free == 0) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (put)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
+		}
+		else if (free == 1) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (get)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
+		}
+		else if (free == 2) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
+			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (audit)\n");
+			}
+		}
+		tlr = __mempool_get_trailer(obj);
+		cookie = tlr->cookie;
+		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
+			rte_log_set_history(0);
+			RTE_LOG(CRIT, MEMPOOL,
+				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+				obj, (const void *) mp, cookie);
+			rte_panic("MEMPOOL: bad trailer cookie\n");
+		}
+	}
+}
+
 static void
 mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
 {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index ca4657f..6d98cdf 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -296,6 +296,7 @@ static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
 	return (struct rte_mempool_objtlr *)RTE_PTR_ADD(obj, mp->elt_size);
 }
 
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 /**
  * @internal Check and update cookies or panic.
  *
@@ -310,80 +311,8 @@ static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
  *   - 1: object is supposed to be free, mark it as allocated
  *   - 2: just check that cookie is valid (free or allocated)
  */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic ignored "-Wcast-qual"
-#endif
-static inline void __mempool_check_cookies(const struct rte_mempool *mp,
-					   void * const *obj_table_const,
-					   unsigned n, int free)
-{
-	struct rte_mempool_objhdr *hdr;
-	struct rte_mempool_objtlr *tlr;
-	uint64_t cookie;
-	void *tmp;
-	void *obj;
-	void **obj_table;
-
-	/* Force to drop the "const" attribute. This is done only when
-	 * DEBUG is enabled */
-	tmp = (void *) obj_table_const;
-	obj_table = (void **) tmp;
-
-	while (n--) {
-		obj = obj_table[n];
-
-		if (rte_mempool_from_obj(obj) != mp)
-			rte_panic("MEMPOOL: object is owned by another "
-				  "mempool\n");
-
-		hdr = __mempool_get_header(obj);
-		cookie = hdr->cookie;
-
-		if (free == 0) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (put)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
-		}
-		else if (free == 1) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (get)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
-		}
-		else if (free == 2) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
-			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (audit)\n");
-			}
-		}
-		tlr = __mempool_get_trailer(obj);
-		cookie = tlr->cookie;
-		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
-			rte_log_set_history(0);
-			RTE_LOG(CRIT, MEMPOOL,
-				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-				obj, (const void *) mp, cookie);
-			rte_panic("MEMPOOL: bad trailer cookie\n");
-		}
-	}
-}
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic error "-Wcast-qual"
-#endif
+void __mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free);
 #else
 #define __mempool_check_cookies(mp, obj_table_const, n, free) do {} while(0)
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 04/36] mempool: use sizeof to get the size of header and trailer
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (2 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 03/36] mempool: uninline function to check cookies Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 05/36] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t Olivier Matz
                     ` (33 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Since commits d2e0ca22f and 97e7e685b the headers and trailers
of the mempool are defined as structures. We can get their
size using sizeof() instead of a hand-written calculation that
would become wrong at the first structure update.
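
As a side note (an illustration, not part of the patch): on a 64-bit
build without RTE_LIBRTE_MEMPOOL_DEBUG, the header struct is a single
pointer, and the subsequent alignment step pads it to a full cache
line unless MEMPOOL_F_NO_CACHE_ALIGN is given:

  size_t hdr_sz = sizeof(struct rte_mempool_objhdr); /* 8 bytes here */
  /* RTE_ALIGN_CEIL(8, 64) == 64 with the usual 64-byte
   * RTE_MEMPOOL_ALIGN */
  hdr_sz = RTE_ALIGN_CEIL(hdr_sz, RTE_MEMPOOL_ALIGN);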

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2e1ccc0..b5b87e7 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -264,24 +264,13 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 
 	sz = (sz != NULL) ? sz : &lsz;
 
-	/*
-	 * In header, we have at least the pointer to the pool, and
-	 * optionaly a 64 bits cookie.
-	 */
-	sz->header_size = 0;
-	sz->header_size += sizeof(struct rte_mempool *); /* ptr to pool */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->header_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->header_size = sizeof(struct rte_mempool_objhdr);
 	if ((flags & MEMPOOL_F_NO_CACHE_ALIGN) == 0)
 		sz->header_size = RTE_ALIGN_CEIL(sz->header_size,
 			RTE_MEMPOOL_ALIGN);
 
-	/* trailer contains the cookie in debug mode */
-	sz->trailer_size = 0;
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->trailer_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->trailer_size = sizeof(struct rte_mempool_objtlr);
+
 	/* element size is 8 bytes-aligned at least */
 	sz->elt_size = RTE_ALIGN_CEIL(elt_size, sizeof(uint64_t));
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 05/36] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (3 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 04/36] mempool: use sizeof to get the size of header and trailer Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 06/36] mempool: update library version Olivier Matz
                     ` (32 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

In the next commits, we will add the ability to populate the
mempool and to iterate through its objects using the same function,
so the same callback type will be used for both. As the callback is
no longer a constructor, rename it to rte_mempool_obj_cb_t.

The rte_mempool_obj_iter_t type that was used to iterate over objects
will be removed in the next commits.

No functional change.
In this commit, the API is preserved through a compat typedef.
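
A minimal sketch of what the compat typedef preserves: a callback
declared with the old type name keeps compiling, since both names now
refer to the same function type (my_obj_init below is illustrative,
not code from the tree):

  static void
  my_obj_init(struct rte_mempool *mp, void *arg,
              void *obj, unsigned obj_idx)
  {
          (void)arg;
          (void)obj_idx;
          memset(obj, 0, mp->elt_size); /* needs <string.h> */
  }

  rte_mempool_obj_ctor_t *old_name = my_obj_init; /* still valid */
  rte_mempool_obj_cb_t *new_name = my_obj_init;   /* same type */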

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/mempool_anon.c                |  4 ++--
 app/test-pmd/mempool_osdep.h               |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h      |  2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c |  4 ++--
 lib/librte_mempool/rte_dom0_mempool.c      |  2 +-
 lib/librte_mempool/rte_mempool.c           |  8 ++++----
 lib/librte_mempool/rte_mempool.h           | 27 ++++++++++++++-------------
 7 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
index 4730432..5e23848 100644
--- a/app/test-pmd/mempool_anon.c
+++ b/app/test-pmd/mempool_anon.c
@@ -86,7 +86,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp;
@@ -190,7 +190,7 @@ mempool_anon_create(__rte_unused const char *name,
 	__rte_unused unsigned private_data_size,
 	__rte_unused rte_mempool_ctor_t *mp_init,
 	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_ctor_t *obj_init,
+	__rte_unused rte_mempool_obj_cb_t *obj_init,
 	__rte_unused void *obj_init_arg,
 	__rte_unused int socket_id, __rte_unused unsigned flags)
 {
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
index 6b8df68..7ce7297 100644
--- a/app/test-pmd/mempool_osdep.h
+++ b/app/test-pmd/mempool_osdep.h
@@ -48,7 +48,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags);
 
 #endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.h b/drivers/net/xenvirt/rte_eth_xenvirt.h
index fc15a63..4995a9b 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.h
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.h
@@ -51,7 +51,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 
diff --git a/drivers/net/xenvirt/rte_mempool_gntalloc.c b/drivers/net/xenvirt/rte_mempool_gntalloc.c
index 7bfbfda..69b9231 100644
--- a/drivers/net/xenvirt/rte_mempool_gntalloc.c
+++ b/drivers/net/xenvirt/rte_mempool_gntalloc.c
@@ -78,7 +78,7 @@ static struct _mempool_gntalloc_info
 _create_mempool(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct _mempool_gntalloc_info mgi;
@@ -253,7 +253,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	int rv;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0d6d750..0051bd5 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -83,7 +83,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp = NULL;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index b5b87e7..99b3eab 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -128,7 +128,7 @@ static unsigned optimize_object_size(unsigned obj_size)
 
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -224,7 +224,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 
 struct mempool_populate_arg {
 	struct rte_mempool     *mp;
-	rte_mempool_obj_ctor_t *obj_init;
+	rte_mempool_obj_cb_t   *obj_init;
 	void                   *obj_init_arg;
 };
 
@@ -239,7 +239,7 @@ mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
 
 static void
 mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	uint32_t elt_sz;
 	struct mempool_populate_arg arg;
@@ -429,7 +429,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 6d98cdf..da04021 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -318,6 +318,17 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
 
 /**
+ * An object callback function for mempool.
+ *
+ * Arguments are the mempool, the opaque pointer given by the user in
+ * rte_mempool_create(), the pointer to the element and the index of
+ * the element in the pool.
+ */
+typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
+		void *opaque, void *obj, unsigned obj_idx);
+typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
+
+/**
  * A mempool object iterator callback function.
  */
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
@@ -366,16 +377,6 @@ uint32_t rte_mempool_obj_iter(void *vaddr,
 	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
 
 /**
- * An object constructor callback function for mempool.
- *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
- */
-typedef void (rte_mempool_obj_ctor_t)(struct rte_mempool *, void *,
-				      void *, unsigned);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -468,7 +469,7 @@ struct rte_mempool *
 rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 /**
@@ -534,7 +535,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
@@ -623,7 +624,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags);
 
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 06/36] mempool: update library version
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (4 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 05/36] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-15 12:38     ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 07/36] mempool: list objects when added in the mempool Olivier Matz
                     ` (31 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

The next changes of this patch series are too heavy to keep a
compatibility layer, so bump the version number of the library.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/release_16_04.rst     | 2 +-
 lib/librte_mempool/Makefile                | 2 +-
 lib/librte_mempool/rte_mempool_version.map | 6 ++++++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index d0a09ef..5fe172d 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -513,7 +513,7 @@ The libraries prepended with a plus sign were incremented in this version.
      librte_kvargs.so.1
      librte_lpm.so.2
      librte_mbuf.so.2
-     librte_mempool.so.1
+   + librte_mempool.so.2
      librte_meter.so.1
    + librte_pipeline.so.3
      librte_pmd_bond.so.1
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index a6898ef..706f844 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -38,7 +38,7 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 
 EXPORT_MAP := rte_mempool_version.map
 
-LIBABIVER := 1
+LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 17151e0..8c157d0 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -17,3 +17,9 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_16.07 {
+	global:
+
+	local: *;
+} DPDK_2.0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 07/36] mempool: list objects when added in the mempool
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (5 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 06/36] mempool: update library version Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 08/36] mempool: remove const attribute in mempool_walk Olivier Matz
                     ` (30 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce a list entry in the object header so that objects can be
listed and browsed. The objective is to provide a simpler way to
browse the elements of a mempool.

The next commits will update rte_mempool_obj_iter() to use this list,
and remove the previous complex implementation.
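
For readers less familiar with the <sys/queue.h> macros, here is a
self-contained sketch of the STAILQ pattern adopted here (names are
illustrative, not the mempool code):

  #include <sys/queue.h>

  struct hdr {
          STAILQ_ENTRY(hdr) next;    /* embedded link */
  };
  STAILQ_HEAD(hdr_list, hdr);        /* declares the list head type */

  void example(void)
  {
          struct hdr_list list = STAILQ_HEAD_INITIALIZER(list);
          struct hdr a, b, *h;

          STAILQ_INSERT_TAIL(&list, &a, next);
          STAILQ_INSERT_TAIL(&list, &b, next);
          STAILQ_FOREACH(h, &list, next) {
                  /* visits a then b, in insertion order */
          }
  }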

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c |  2 ++
 lib/librte_mempool/rte_mempool.h | 15 ++++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 99b3eab..83afda8 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -138,6 +138,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -585,6 +586,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_size = cache_size;
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
+	STAILQ_INIT(&mp->elt_list);
 
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index da04021..469bcbc 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -150,11 +150,13 @@ struct rte_mempool_objsz {
  * Mempool object header structure
  *
  * Each object stored in mempools are prefixed by this header structure,
- * it allows to retrieve the mempool pointer from the object. When debug
- * is enabled, a cookie is also added in this structure preventing
- * corruptions and double-frees.
+ * it allows to retrieve the mempool pointer from the object and to
+ * iterate on all objects attached to a mempool. When debug is enabled,
+ * a cookie is also added in this structure preventing corruptions and
+ * double-frees.
  */
 struct rte_mempool_objhdr {
+	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
@@ -162,6 +164,11 @@ struct rte_mempool_objhdr {
 };
 
 /**
+ * A list of object headers type
+ */
+STAILQ_HEAD(rte_mempool_objhdr_list, rte_mempool_objhdr);
+
+/**
  * Mempool object trailer structure
  *
  * In debug mode, each object stored in mempools are suffixed by this
@@ -194,6 +201,8 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 08/36] mempool: remove const attribute in mempool_walk
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (6 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 07/36] mempool: list objects when added in the mempool Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 09/36] mempool: remove const qualifier in dump and audit Olivier Matz
                     ` (29 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Most operations that can be done on a mempool require a non-const
mempool pointer, except the dump and the audit. Therefore,
rte_mempool_walk() is more useful if the mempool pointer it passes to
the callback is not const.

This is required by the next commits, where the Mellanox drivers use
rte_mempool_walk() to iterate over the mempools, then
rte_mempool_obj_iter() to iterate over the objects in each mempool.
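
A sketch of the pattern this change enables (the callback below is
hypothetical; the real users are the Mellanox drivers updated later
in this series): a walk callback may now hand the mempool to
functions taking a non-const pointer, such as the list-based object
iterator introduced in a later patch:

  static void
  my_walk_cb(struct rte_mempool *mp, void *arg)
  {
          /* legal only with a non-const mp */
          rte_mempool_obj_iter(mp, my_obj_cb, arg);
  }

  /* rte_mempool_walk(my_walk_cb, NULL); */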

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c          | 2 +-
 drivers/net/mlx5/mlx5_rxtx.c     | 2 +-
 drivers/net/mlx5/mlx5_rxtx.h     | 2 +-
 lib/librte_mempool/rte_mempool.c | 2 +-
 lib/librte_mempool/rte_mempool.h | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 4f21dbe..41453cb 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1369,7 +1369,7 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 static void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 9d1380a..88226b6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -311,7 +311,7 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 0e2b607..db054d6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -342,7 +342,7 @@ uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
 /* mlx5_rxtx.c */
 
 struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *);
-void txq_mp2mr_iter(const struct rte_mempool *, void *);
+void txq_mp2mr_iter(struct rte_mempool *, void *);
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst_sp(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 83afda8..664a2bf 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -965,7 +965,7 @@ rte_mempool_lookup(const char *name)
 	return mp;
 }
 
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
 		      void *arg)
 {
 	struct rte_tailq_entry *te = NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 469bcbc..54a5917 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1304,7 +1304,7 @@ ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
  * @param arg
  *   Argument passed to iterator
  */
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *arg),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
 		      void *arg);
 
 #ifdef __cplusplus
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 09/36] mempool: remove const qualifier in dump and audit
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (7 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 08/36] mempool: remove const attribute in mempool_walk Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 10/36] mempool: use the list to iterate the mempool elements Olivier Matz
                     ` (28 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

In the next commits, rte_mempool_audit() will use an iterator to walk
through the objects in the mempool. This iterator takes a "struct
rte_mempool *" as a parameter because the callback function is
assumed to be able to modify the mempool.

The previous approach was to introduce a RTE_DECONST() macro, but
after discussion it seems that removing the const qualifier is
better: it avoids fooling the compiler, and these functions are not
used in the datapath, so the possible compiler optimizations enabled
by const are not critical.
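
For reference, the rejected alternative was a de-const macro along
these lines (an illustrative reconstruction, not code from the tree),
which merely hides the cast from the compiler:

  #define RTE_DECONST(type, ptr) \
          ((type)(uintptr_t)(const void *)(ptr))

  /* would have allowed:
   * struct rte_mempool *mp = RTE_DECONST(struct rte_mempool *, cmp);
   */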

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 8 ++++----
 lib/librte_mempool/rte_mempool.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 664a2bf..0fd244b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -781,7 +781,7 @@ mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
 }
 
 static void
-mempool_audit_cookies(const struct rte_mempool *mp)
+mempool_audit_cookies(struct rte_mempool *mp)
 {
 	uint32_t elt_sz, num;
 	struct mempool_audit_arg arg;
@@ -839,7 +839,7 @@ mempool_audit_cache(const struct rte_mempool *mp)
 
 /* check the consistency of mempool (size, cookies, ...) */
 void
-rte_mempool_audit(const struct rte_mempool *mp)
+rte_mempool_audit(struct rte_mempool *mp)
 {
 	mempool_audit_cache(mp);
 	mempool_audit_cookies(mp);
@@ -850,7 +850,7 @@ rte_mempool_audit(const struct rte_mempool *mp)
 
 /* dump the status of the mempool on the console */
 void
-rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
+rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 {
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	struct rte_mempool_debug_stats sum;
@@ -921,7 +921,7 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 void
 rte_mempool_list_dump(FILE *f)
 {
-	const struct rte_mempool *mp = NULL;
+	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te;
 	struct rte_mempool_list *mempool_list;
 
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 54a5917..a80335f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -645,7 +645,7 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
  * @param mp
  *   A pointer to the mempool structure.
  */
-void rte_mempool_dump(FILE *f, const struct rte_mempool *mp);
+void rte_mempool_dump(FILE *f, struct rte_mempool *mp);
 
 /**
  * @internal Put several objects back in the mempool; used internally.
@@ -1183,7 +1183,7 @@ rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
  * @param mp
  *   A pointer to the mempool structure.
  */
-void rte_mempool_audit(const struct rte_mempool *mp);
+void rte_mempool_audit(struct rte_mempool *mp);
 
 /**
  * Return a pointer to the private data in an mempool structure.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 10/36] mempool: use the list to iterate the mempool elements
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (8 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 09/36] mempool: remove const qualifier in dump and audit Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 15:33     ` Wiles, Keith
  2016-05-11 10:02     ` [PATCH v2 " Olivier Matz
  2016-04-14 10:19   ` [PATCH 11/36] mempool: use the list to audit all elements Olivier Matz
                     ` (27 subsequent siblings)
  37 siblings, 2 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of the rte_mempool_obj_iter() API,
which no longer needs to take as many arguments as before. The
previous function is kept as a private function and renamed in this
commit; it will be removed in a later commit of the patch series.

The only internal users of this function are the Mellanox drivers. The
code is updated accordingly.

Introducing an API compatibility layer for this function has been
considered, but it is not easy to do without keeping the old code, as
the previous function could also be used to browse elements that were
not added to a mempool. Moreover, the API is already broken by other
patches in this version.
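
A usage sketch of the reworked API (the counting callback is
illustrative):

  static void
  count_cb(struct rte_mempool *mp, void *arg, void *obj,
           unsigned obj_idx)
  {
          unsigned *count = arg;

          (void)mp;
          (void)obj;
          (void)obj_idx;
          (*count)++;
  }

  unsigned count = 0;
  uint32_t n = rte_mempool_obj_iter(mp, count_cb, &count);
  /* n == count == number of objects attached to the mempool */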

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c                    | 51 +++++++---------------
 drivers/net/mlx5/mlx5_rxtx.c               | 51 +++++++---------------
 lib/librte_mempool/rte_mempool.c           | 36 ++++++++++++---
 lib/librte_mempool/rte_mempool.h           | 70 ++++++++----------------------
 lib/librte_mempool/rte_mempool_version.map |  3 +-
 5 files changed, 82 insertions(+), 129 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 41453cb..089bbec 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1320,7 +1320,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -1328,34 +1327,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	__rte_unused uint32_t index)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -1373,24 +1364,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 88226b6..afd4338 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -262,7 +262,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -270,34 +269,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	__rte_unused uint32_t index)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -315,24 +306,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0fd244b..5cb58db 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -126,6 +126,14 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+/**
+ * A mempool object iterator callback function.
+ */
+typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
+	void * /*obj_start*/,
+	void * /*obj_end*/,
+	uint32_t /*obj_index */);
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
@@ -160,8 +168,8 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory. */
-uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+static uint32_t
+rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
 	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
@@ -219,6 +227,24 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	return i;
 }
 
+/* call obj_cb() for each mempool element */
+uint32_t
+rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	struct rte_mempool_objhdr *hdr;
+	void *obj;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->elt_list, next) {
+		obj = (char *)hdr + sizeof(*hdr);
+		obj_cb(mp, obj_cb_arg, obj, n);
+		n++;
+	}
+
+	return n;
+}
+
 /*
  * Populate  mempool with the objects.
  */
@@ -250,7 +276,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 	arg.obj_init = obj_init;
 	arg.obj_init_arg = obj_init_arg;
 
-	mp->size = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_populate, &arg);
@@ -364,7 +390,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
+	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
@@ -792,7 +818,7 @@ mempool_audit_cookies(struct rte_mempool *mp)
 	arg.obj_end = mp->elt_va_start;
 	arg.obj_num = 0;
 
-	num = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		mp->size, elt_sz, 1,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_audit, &arg);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a80335f..f5f6752 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -329,63 +329,13 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 /**
  * An object callback function for mempool.
  *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
+ * Used by rte_mempool_create() and rte_mempool_obj_iter().
  */
 typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 		void *opaque, void *obj, unsigned obj_idx);
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */);
-
-/**
- * Call a function for each mempool object in a memory chunk
- *
- * Iterate across objects of the given size and alignment in the
- * provided chunk of memory. The given memory buffer can consist of
- * disjointed physical pages.
- *
- * For each object, call the provided callback (if any). This function
- * is used to populate a mempool, or walk through all the elements of a
- * mempool, or estimate how many elements of the given size could be
- * created in the given memory buffer.
- *
- * @param vaddr
- *   Virtual address of the memory buffer.
- * @param elt_num
- *   Maximum number of objects to iterate through.
- * @param elt_sz
- *   Size of each object.
- * @param align
- *   Alignment of each object.
- * @param paddr
- *   Array of physical addresses of the pages that comprises given memory
- *   buffer.
- * @param pg_num
- *   Number of elements in the paddr array.
- * @param pg_shift
- *   LOG2 of the physical pages size.
- * @param obj_iter
- *   Object iterator callback function (could be NULL).
- * @param obj_iter_arg
- *   User defined parameter for the object iterator callback function.
- *
- * @return
- *   Number of objects iterated through.
- */
-uint32_t rte_mempool_obj_iter(void *vaddr,
-	uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -638,6 +588,24 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 
 
 /**
+ * Call a function for each mempool element
+ *
+ * Iterate across all objects attached to a rte_mempool and call the
+ * callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param obj_cb
+ *   A function pointer that is called for each object.
+ * @param obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of objects iterated.
+ */
+uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 8c157d0..4db75ca 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -9,7 +9,6 @@ DPDK_2.0 {
 	rte_mempool_dump;
 	rte_mempool_list_dump;
 	rte_mempool_lookup;
-	rte_mempool_obj_iter;
 	rte_mempool_walk;
 	rte_mempool_xmem_create;
 	rte_mempool_xmem_size;
@@ -21,5 +20,7 @@ DPDK_2.0 {
 DPDK_16.07 {
 	global:
 
+	rte_mempool_obj_iter;
+
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 11/36] mempool: use the list to audit all elements
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (9 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 10/36] mempool: use the list to iterate the mempool elements Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 12/36] mempool: use the list to initialize mempool objects Olivier Matz
                     ` (26 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Use the new rte_mempool_obj_iter() instead of the old iterator
(renamed rte_mempool_obj_mem_iter() in a previous commit) to iterate
over the objects and audit them (check for cookies).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 41 ++++++----------------------------------
 1 file changed, 6 insertions(+), 35 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5cb58db..2266f38 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -720,12 +720,6 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
 
-struct mempool_audit_arg {
-	const struct rte_mempool *mp;
-	uintptr_t obj_end;
-	uint32_t obj_num;
-};
-
 /* check and update cookies or panic (internal) */
 void __mempool_check_cookies(const struct rte_mempool *mp,
 	void * const *obj_table_const, unsigned n, int free)
@@ -795,45 +789,22 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 }
 
 static void
-mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_audit(struct rte_mempool *mp, __rte_unused void *opaque,
+	void *obj, __rte_unused unsigned idx)
 {
-	struct mempool_audit_arg *pa = arg;
-	void *obj;
-
-	obj = (char *)start + pa->mp->header_size;
-	pa->obj_end = (uintptr_t)end;
-	pa->obj_num = idx + 1;
-	__mempool_check_cookies(pa->mp, &obj, 1, 2);
+	__mempool_check_cookies(mp, &obj, 1, 2);
 }
 
 static void
 mempool_audit_cookies(struct rte_mempool *mp)
 {
-	uint32_t elt_sz, num;
-	struct mempool_audit_arg arg;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	arg.mp = mp;
-	arg.obj_end = mp->elt_va_start;
-	arg.obj_num = 0;
-
-	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		mp->size, elt_sz, 1,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_audit, &arg);
+	unsigned num;
 
+	num = rte_mempool_obj_iter(mp, mempool_obj_audit, NULL);
 	if (num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
+		rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
 			"iterated only over %u elements\n",
 			mp, mp->size, num);
-	} else if (arg.obj_end != mp->elt_va_end || arg.obj_num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
-			"last callback va_end: %#tx (%#tx expeceted), "
-			"num of objects: %u (%u expected)\n",
-			mp, mp->size,
-			arg.obj_end, mp->elt_va_end,
-			arg.obj_num, mp->size);
 	}
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 12/36] mempool: use the list to initialize mempool objects
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (10 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 11/36] mempool: use the list to audit all elements Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 13/36] mempool: create the internal ring in a specific function Olivier Matz
                     ` (25 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Before this patch, the mempool elements were initialized at the time
they were added to the mempool. This patch changes this to initialize
all objects in a single pass once the mempool is populated, using the
rte_mempool_obj_iter() function introduced in the previous commits.

Thanks to this modification, we are getting closer to a new API
that would allow us to do:
  mempool_init()
  mempool_populate(mem1)
  mempool_populate(mem2)
  mempool_populate(mem3)
  mempool_init_obj()
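
Fleshed out with hypothetical arguments (none of these exact
signatures exist at this point of the series; this only illustrates
the intended direction):

  /* hypothetical future flow, for illustration only */
  mp = mempool_init("pool", elt_size, socket_id);
  mempool_populate(mp, mem1_addr, mem1_len);
  mempool_populate(mp, mem2_addr, mem2_len);
  mempool_populate(mp, mem3_addr, mem3_len);
  mempool_init_obj(mp, obj_init, obj_init_arg);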

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 2266f38..5d957b1 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -135,8 +135,7 @@ typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	uint32_t /*obj_index */);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_add_elem(struct rte_mempool *mp, void *obj)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -153,9 +152,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	tlr = __mempool_get_trailer(obj);
 	tlr->cookie = RTE_MEMPOOL_TRAILER_COOKIE;
 #endif
-	/* call the initializer */
-	if (obj_init)
-		obj_init(mp, obj_init_arg, obj, obj_idx);
 
 	/* enqueue in ring */
 	rte_ring_sp_enqueue(mp->ring, obj);
@@ -249,37 +245,27 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
  * Populate  mempool with the objects.
  */
 
-struct mempool_populate_arg {
-	struct rte_mempool     *mp;
-	rte_mempool_obj_cb_t   *obj_init;
-	void                   *obj_init_arg;
-};
-
 static void
-mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_populate(void *arg, void *start, void *end,
+	__rte_unused uint32_t idx)
 {
-	struct mempool_populate_arg *pa = arg;
+	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(pa->mp, start, idx, pa->obj_init, pa->obj_init_arg);
-	pa->mp->elt_va_end = (uintptr_t)end;
+	mempool_add_elem(mp, start);
+	mp->elt_va_end = (uintptr_t)end;
 }
 
 static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
 {
 	uint32_t elt_sz;
-	struct mempool_populate_arg arg;
 
 	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-	arg.mp = mp;
-	arg.obj_init = obj_init;
-	arg.obj_init_arg = obj_init_arg;
 
 	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, &arg);
+		mempool_obj_populate, mp);
 }
 
 /* get the header, trailer and total size of a mempool element. */
@@ -648,7 +634,11 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	mempool_populate(mp, n, 1, obj_init, obj_init_arg);
+	mempool_populate(mp, n, 1);
+
+	/* call the initializer */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	te->data = (void *) mp;
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 13/36] mempool: create the internal ring in a specific function
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (11 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 12/36] mempool: use the list to initialize mempool objects Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 14/36] mempool: store physaddr in mempool objects Olivier Matz
                     ` (24 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

This makes the code of rte_mempool_create() clearer, and it will make
the introduction of an external mempool handler easier (in another
patch series). Indeed, the new function contains the ring-specific
part, which could be replaced by something else in the future.

This commit also adds a socket_id field in the mempool structure that
is used by this new function.
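
One detail worth noting in the new function: a rte_ring of size
"count" can hold at most count - 1 entries, hence the "mp->size + 1"
before rounding up to a power of two. Illustrative arithmetic:

  /* for a pool of 200000 elements: */
  rte_align32pow2(200000 + 1); /* == 262144 */
  /* such a ring holds up to 262143 entries, enough for all
   * 200000 objects */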

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 55 +++++++++++++++++++++++++---------------
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5d957b1..839b828 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -431,6 +431,35 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 					       MEMPOOL_PG_SHIFT_MAX);
 }
 
+/* create the internal ring */
+static int
+rte_mempool_ring_create(struct rte_mempool *mp)
+{
+	int rg_flags = 0;
+	char rg_name[RTE_RING_NAMESIZE];
+	struct rte_ring *r;
+
+	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+
+	/* ring flags */
+	if (mp->flags & MEMPOOL_F_SP_PUT)
+		rg_flags |= RING_F_SP_ENQ;
+	if (mp->flags & MEMPOOL_F_SC_GET)
+		rg_flags |= RING_F_SC_DEQ;
+
+	/* Allocate the ring that will be used to store objects.
+	 * Ring functions will return appropriate errors if we are
+	 * running as a secondary process etc., so no checks made
+	 * in this function for that condition. */
+	r = rte_ring_create(rg_name, rte_align32pow2(mp->size + 1),
+		mp->socket_id, rg_flags);
+	if (r == NULL)
+		return -rte_errno;
+
+	mp->ring = r;
+	return 0;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -447,15 +476,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
-	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
 	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te = NULL;
-	struct rte_ring *r = NULL;
 	const struct rte_memzone *mz;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	int rg_flags = 0;
 	void *obj;
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
@@ -498,12 +524,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
 
-	/* ring flags */
-	if (flags & MEMPOOL_F_SP_PUT)
-		rg_flags |= RING_F_SP_ENQ;
-	if (flags & MEMPOOL_F_SC_GET)
-		rg_flags |= RING_F_SC_DEQ;
-
 	/* calculate mempool object sizes. */
 	if (!rte_mempool_calc_obj_size(elt_size, flags, &objsz)) {
 		rte_errno = EINVAL;
@@ -512,15 +532,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 	rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK);
 
-	/* allocate the ring that will be used to store objects */
-	/* Ring functions will return appropriate errors if we are
-	 * running as a secondary process etc., so no checks made
-	 * in this function for that condition */
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, name);
-	r = rte_ring_create(rg_name, rte_align32pow2(n+1), socket_id, rg_flags);
-	if (r == NULL)
-		goto exit_unlock;
-
 	/*
 	 * reserve a memory zone for this mempool: private data is
 	 * cache-aligned
@@ -589,7 +600,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
-	mp->ring = r;
+	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
@@ -600,6 +611,9 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
 
+	if (rte_mempool_ring_create(mp) < 0)
+		goto exit_unlock;
+
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
@@ -651,7 +665,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	rte_ring_free(r);
+	if (mp != NULL)
+		rte_ring_free(mp->ring);
 	rte_free(te);
 
 	return NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index f5f6752..0153e62 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -188,6 +188,7 @@ struct rte_mempool {
 	struct rte_ring *ring;           /**< Ring to store objects. */
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
+	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 14/36] mempool: store physaddr in mempool objects
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (12 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 13/36] mempool: create the internal ring in a specific function Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 15:40     ` Wiles, Keith
  2016-04-14 10:19   ` [PATCH 15/36] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
                     ` (23 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Store the physical address of each object in its header. This
simplifies rte_mempool_virt2phy() and prepares for the removal of the
paddr[] table in the mempool header.
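
The practical effect, in short: with hugepages enabled, the
translation no longer needs the pool's page table, so it keeps
working once the paddr[] table goes away later in the series:

  /* before: off = elt - mp->elt_va_start;
   *         pa  = mp->elt_pa[off >> mp->pg_shift]
   *               + (off & mp->pg_mask);
   * after:  a single load from the object header */
  phys_addr_t pa = rte_mempool_virt2phy(mp, elt);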

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
 lib/librte_mempool/rte_mempool.h | 11 ++++++-----
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 839b828..b8e46fc 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -132,19 +132,22 @@ static unsigned optimize_object_size(unsigned obj_size)
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	void * /*obj_start*/,
 	void * /*obj_end*/,
-	uint32_t /*obj_index */);
+	uint32_t /*obj_index */,
+	phys_addr_t /*physaddr*/);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj)
+mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
 	obj = (char *)obj + mp->header_size;
+	physaddr += mp->header_size;
 
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
@@ -173,6 +176,7 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pgn, pgf;
 	uintptr_t end, start, va;
 	uintptr_t pg_sz;
+	phys_addr_t physaddr;
 
 	pg_sz = (uintptr_t)1 << pg_shift;
 	va = (uintptr_t)vaddr;
@@ -208,9 +212,10 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 		 * otherwise, just skip that chunk unused.
 		 */
 		if (k == pgn) {
+			physaddr = paddr[k] + (start & (pg_sz - 1));
 			if (obj_iter != NULL)
 				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i);
+					(void *)end, i, physaddr);
 			va = end;
 			j += pgf;
 			i++;
@@ -247,11 +252,11 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 
 static void
 mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, phys_addr_t physaddr)
 {
 	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(mp, start);
+	mempool_add_elem(mp, start, physaddr);
 	mp->elt_va_end = (uintptr_t)end;
 }
 
@@ -355,7 +360,7 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
  * argument to the end of the object. */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
 {
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 0153e62..00ca087 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -158,6 +158,7 @@ struct rte_mempool_objsz {
 struct rte_mempool_objhdr {
 	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
+	phys_addr_t physaddr;            /**< Physical address of the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
 #endif
@@ -1125,13 +1126,13 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   The physical address of the elt element.
  */
 static inline phys_addr_t
-rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
+rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
 	if (rte_eal_has_hugepages()) {
-		uintptr_t off;
-
-		off = (const char *)elt - (const char *)mp->elt_va_start;
-		return mp->elt_pa[off >> mp->pg_shift] + (off & mp->pg_mask);
+		const struct rte_mempool_objhdr *hdr;
+		hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
+			sizeof(*hdr));
+		return hdr->physaddr;
 	} else {
 		/*
 		 * If huge pages are disabled, we cannot assume the
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 15/36] mempool: remove MEMPOOL_IS_CONTIG()
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (13 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 14/36] mempool: store physaddr in mempool objects Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 16/36] mempool: store memory chunks in a list Olivier Matz
                     ` (22 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

The next commits will change the behavior of the mempool library so that
the objects will never be allocated in the same memzone as the mempool
header. Therefore, there is no reason to keep this macro, which would
always return 0.

This macro was only used in app/test.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          | 7 +++----
 lib/librte_mempool/rte_mempool.h | 7 -------
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 10e1fa4..2f317f2 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -126,12 +126,11 @@ test_mempool_basic(void)
 			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
 		return -1;
 
+#ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
 	printf("get physical address of an object\n");
-	if (MEMPOOL_IS_CONTIG(mp) &&
-			rte_mempool_virt2phy(mp, obj) !=
-			(phys_addr_t) (mp->phys_addr +
-			(phys_addr_t) ((char*) obj - (char*) mp)))
+	if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
 		return -1;
+#endif
 
 	printf("put the object back\n");
 	rte_mempool_put(mp, obj);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 00ca087..74cecd6 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -271,13 +271,6 @@ struct rte_mempool {
 	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
-/**
- * Return true if the whole mempool is in contiguous memory.
- */
-#define	MEMPOOL_IS_CONTIG(mp)                      \
-	((mp)->pg_num == MEMPOOL_PG_NUM_DEFAULT && \
-	(mp)->phys_addr == (mp)->elt_pa[0])
-
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 16/36] mempool: store memory chunks in a list
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (14 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 15/36] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 17/36] mempool: new function to iterate the memory chunks Olivier Matz
                     ` (21 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Do not use the paddr table to store the mempool memory chunks.
This will make it possible to have several chunks with different
virtual addresses.
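
For illustration, a minimal sketch of walking the new list, similar to
what rte_mempool_dump() now does (the variable names are illustrative):

  struct rte_mempool_memhdr *memhdr;

  STAILQ_FOREACH(memhdr, &mp->mem_list, next)
          printf("chunk %p: len=%zu phys=0x%" PRIx64 "\n",
                  memhdr->addr, memhdr->len,
                  (uint64_t)memhdr->phys_addr);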

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          |   2 +-
 lib/librte_mempool/rte_mempool.c | 205 ++++++++++++++++++++++++++-------------
 lib/librte_mempool/rte_mempool.h |  51 +++++-----
 3 files changed, 165 insertions(+), 93 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 2f317f2..2bc3ac0 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -123,7 +123,7 @@ test_mempool_basic(void)
 
 	printf("get private data\n");
 	if (rte_mempool_get_priv(mp) != (char *)mp +
-			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
+			MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
 		return -1;
 
 #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index b8e46fc..9e3cfde 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -141,14 +141,12 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
-	obj = (char *)obj + mp->header_size;
-	physaddr += mp->header_size;
-
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
 	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
+	mp->populated_size++;
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -246,33 +244,6 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
-/*
- * Populate  mempool with the objects.
- */
-
-static void
-mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx, phys_addr_t physaddr)
-{
-	struct rte_mempool *mp = arg;
-
-	mempool_add_elem(mp, start, physaddr);
-	mp->elt_va_end = (uintptr_t)end;
-}
-
-static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
-{
-	uint32_t elt_sz;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		num, elt_sz, align,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, mp);
-}
-
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
@@ -465,6 +436,108 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* Free memory chunks used by a mempool. Objects must be in pool */
+static void
+rte_mempool_free_memchunks(struct rte_mempool *mp)
+{
+	struct rte_mempool_memhdr *memhdr;
+	void *elt;
+
+	while (!STAILQ_EMPTY(&mp->elt_list)) {
+		rte_ring_sc_dequeue(mp->ring, &elt);
+		(void)elt;
+		STAILQ_REMOVE_HEAD(&mp->elt_list, next);
+		mp->populated_size--;
+	}
+
+	while (!STAILQ_EMPTY(&mp->mem_list)) {
+		memhdr = STAILQ_FIRST(&mp->mem_list);
+		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		rte_free(memhdr);
+		mp->nb_mem_chunks--;
+	}
+}
+
+/* Add objects in the pool, using a physically contiguous memory
+ * zone. Return the number of objects added, or a negative value
+ * on error. */
+static int
+rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len)
+{
+	unsigned total_elt_sz;
+	unsigned i = 0;
+	size_t off;
+	struct rte_mempool_memhdr *memhdr;
+
+	/* mempool is already populated */
+	if (mp->populated_size >= mp->size)
+		return -ENOSPC;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
+	if (memhdr == NULL)
+		return -ENOMEM;
+
+	memhdr->mp = mp;
+	memhdr->addr = vaddr;
+	memhdr->phys_addr = paddr;
+	memhdr->len = len;
+
+	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
+		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
+	else
+		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
+
+	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
+		off += mp->header_size;
+		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		off += mp->elt_size + mp->trailer_size;
+		i++;
+	}
+
+	/* not enough room to store one object */
+	if (i == 0)
+		return -EINVAL;
+
+	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
+	mp->nb_mem_chunks++;
+	return i;
+}
+
+/* Add objects in the pool, using a table of physical pages. Return the
+ * number of objects added, or a negative value on error. */
+static int
+rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	uint32_t i, n;
+	int ret, cnt = 0;
+	size_t pg_sz = (size_t)1 << pg_shift;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
+
+		/* populate with the largest group of contiguous pages */
+		for (n = 1; (i + n) < pg_num &&
+			     paddr[i] + pg_sz == paddr[i+n]; n++)
+			;
+
+		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
+			paddr[i], n * pg_sz);
+		if (ret < 0) {
+			rte_mempool_free_memchunks(mp);
+			return ret;
+		}
+		cnt += ret;
+	}
+	return cnt;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
 * That external memory buffer can consist of physically disjoint pages.
@@ -491,6 +564,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
 	int page_size = getpagesize();
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -520,7 +594,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	}
 
 	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num < RTE_DIM(mp->elt_pa) || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
@@ -567,7 +641,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * store mempool objects. Otherwise reserve a memzone that is large
 	 * enough to hold mempool header and metadata plus mempool objects.
 	 */
-	mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size);
+	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 	if (vaddr == NULL)
@@ -615,6 +689,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
+	STAILQ_INIT(&mp->mem_list);
 
 	if (rte_mempool_ring_create(mp) < 0)
 		goto exit_unlock;
@@ -624,37 +699,31 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * The local_cache points to just past the elt_pa[] array.
 	 */
 	mp->local_cache = (struct rte_mempool_cache *)
-		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, pg_num, 0));
+		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* calculate address of the first element for continuous mempool. */
-	obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size) +
-		private_data_size;
-	obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
-
-	/* populate address translation fields. */
-	mp->pg_num = pg_num;
-	mp->pg_shift = pg_shift;
-	mp->pg_mask = RTE_LEN2MASK(mp->pg_shift, typeof(mp->pg_mask));
+	/* call the initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
 	if (vaddr == NULL) {
-		mp->elt_va_start = (uintptr_t)obj;
-		mp->elt_pa[0] = mp->phys_addr +
-			(mp->elt_va_start - (uintptr_t)mp);
+		/* calculate address of the first element for continuous mempool. */
+		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
+			private_data_size;
+		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
+
+		ret = rte_mempool_populate_phys(mp, obj,
+			mp->phys_addr + ((char *)obj - (char *)mp),
+			objsz.total_size * n);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	} else {
-		/* mempool elements in a separate chunk of memory. */
-		mp->elt_va_start = (uintptr_t)vaddr;
-		memcpy(mp->elt_pa, paddr, sizeof (mp->elt_pa[0]) * pg_num);
+		ret = rte_mempool_populate_phys_tab(mp, vaddr,
+			paddr, pg_num, pg_shift);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	}
 
-	mp->elt_va_end = mp->elt_va_start;
-
-	/* call the initializer */
-	if (mp_init)
-		mp_init(mp, mp_init_arg);
-
-	mempool_populate(mp, n, 1);
-
 	/* call the initializer */
 	if (obj_init)
 		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
@@ -670,8 +739,10 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL)
+	if (mp != NULL) {
+		rte_mempool_free_memchunks(mp);
 		rte_ring_free(mp->ring);
+	}
 	rte_free(te);
 
 	return NULL;
@@ -863,8 +934,10 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	struct rte_mempool_debug_stats sum;
 	unsigned lcore_id;
 #endif
+	struct rte_mempool_memhdr *memhdr;
 	unsigned common_count;
 	unsigned cache_count;
+	size_t mem_len = 0;
 
 	RTE_VERIFY(f != NULL);
 	RTE_VERIFY(mp != NULL);
@@ -873,7 +946,9 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
 	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
+	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
 	fprintf(f, "  header_size=%"PRIu32"\n", mp->header_size);
 	fprintf(f, "  elt_size=%"PRIu32"\n", mp->elt_size);
 	fprintf(f, "  trailer_size=%"PRIu32"\n", mp->trailer_size);
@@ -881,17 +956,13 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	       mp->header_size + mp->elt_size + mp->trailer_size);
 
 	fprintf(f, "  private_data_size=%"PRIu32"\n", mp->private_data_size);
-	fprintf(f, "  pg_num=%"PRIu32"\n", mp->pg_num);
-	fprintf(f, "  pg_shift=%"PRIu32"\n", mp->pg_shift);
-	fprintf(f, "  pg_mask=%#tx\n", mp->pg_mask);
-	fprintf(f, "  elt_va_start=%#tx\n", mp->elt_va_start);
-	fprintf(f, "  elt_va_end=%#tx\n", mp->elt_va_end);
-	fprintf(f, "  elt_pa[0]=0x%" PRIx64 "\n", mp->elt_pa[0]);
-
-	if (mp->size != 0)
+
+	STAILQ_FOREACH(memhdr, &mp->mem_list, next)
+		mem_len += memhdr->len;
+	if (mem_len != 0) {
 		fprintf(f, "  avg bytes/object=%#Lf\n",
-			(long double)(mp->elt_va_end - mp->elt_va_start) /
-			mp->size);
+			(long double)mem_len / mp->size);
+	}
 
 	cache_count = rte_mempool_dump_cache(f, mp);
 	common_count = rte_ring_count(mp->ring);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 74cecd6..7011a18 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -182,6 +182,25 @@ struct rte_mempool_objtlr {
 };
 
 /**
+ * A list of memory chunks where objects are stored
+ */
+STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
+
+/**
+ * Mempool objects memory header structure
+ *
+ * The memory chunks where objects are stored. Each chunk is virtually
+ * and physically contiguous.
+ */
+struct rte_mempool_memhdr {
+	STAILQ_ENTRY(rte_mempool_memhdr) next; /**< Next in list. */
+	struct rte_mempool *mp;  /**< The mempool owning the chunk */
+	void *addr;              /**< Virtual address of the chunk */
+	phys_addr_t phys_addr;   /**< Physical address of the chunk */
+	size_t len;              /**< length of the chunk */
+};
+
+/**
  * The RTE mempool structure.
  */
 struct rte_mempool {
@@ -190,7 +209,7 @@ struct rte_mempool {
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
-	uint32_t size;                   /**< Size of the mempool. */
+	uint32_t size;                   /**< Max size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
 	/**< Threshold before we flush excess elements. */
@@ -203,26 +222,15 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	uint32_t populated_size;         /**< Number of populated objects. */
 	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+	uint32_t nb_mem_chunks;          /**< Number of memory chunks */
+	struct rte_mempool_memhdr_list mem_list; /**< List of memory chunks */
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
 #endif
-
-	/* Address translation support, starts from next cache line. */
-
-	/** Number of elements in the elt_pa array. */
-	uint32_t    pg_num __rte_cache_aligned;
-	uint32_t    pg_shift;     /**< LOG2 of the physical pages. */
-	uintptr_t   pg_mask;      /**< physical page mask value. */
-	uintptr_t   elt_va_start;
-	/**< Virtual address of the first mempool object. */
-	uintptr_t   elt_va_end;
-	/**< Virtual address of the <size + 1> mempool object. */
-	phys_addr_t elt_pa[MEMPOOL_PG_NUM_DEFAULT];
-	/**< Array of physical page addresses for the mempool objects buffer. */
-
 }  __rte_cache_aligned;
 
 #define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
@@ -253,13 +261,6 @@ struct rte_mempool {
 #endif
 
 /**
- * Size of elt_pa array size based on number of pages. (Internal use)
- */
-#define __PA_SIZE(mp, pgn) \
-	RTE_ALIGN_CEIL((((pgn) - RTE_DIM((mp)->elt_pa)) * \
-	sizeof((mp)->elt_pa[0])), RTE_CACHE_LINE_SIZE)
-
-/**
  * Calculate the size of the mempool header.
  *
  * @param mp
@@ -267,8 +268,8 @@ struct rte_mempool {
  * @param pgn
  *   Number of pages used to store mempool objects.
  */
-#define MEMPOOL_HEADER_SIZE(mp, pgn, cs) \
-	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
+#define MEMPOOL_HEADER_SIZE(mp, cs) \
+	(sizeof(*(mp)) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
 /* return the header of a mempool object (internal) */
@@ -1159,7 +1160,7 @@ void rte_mempool_audit(struct rte_mempool *mp);
 static inline void *rte_mempool_get_priv(struct rte_mempool *mp)
 {
 	return (char *)mp +
-		MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size);
+		MEMPOOL_HEADER_SIZE(mp, mp->cache_size);
 }
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 17/36] mempool: new function to iterate the memory chunks
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (15 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 16/36] mempool: store memory chunks in a list Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 18/36] mempool: simplify xmem_usage Olivier Matz
                     ` (20 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Following the same model as rte_mempool_obj_iter(), introduce
rte_mempool_mem_iter() to iterate over the memory chunks attached
to the mempool.
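
As an illustration, a callback summing the total chunk length could look
like this sketch (chunk_len_cb() and the variable names are hypothetical;
only rte_mempool_mem_iter() and struct rte_mempool_memhdr come from the
library):

  static void
  chunk_len_cb(struct rte_mempool *mp, void *opaque,
          struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
  {
          size_t *total = opaque;

          (void)mp;
          (void)mem_idx;
          *total += memhdr->len;
  }

  /* in application code */
  size_t total = 0;
  rte_mempool_mem_iter(mp, chunk_len_cb, &total);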

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           | 16 ++++++++++++++++
 lib/librte_mempool/rte_mempool.h           | 27 +++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |  1 +
 3 files changed, 44 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 9e3cfde..3e9d686 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -244,6 +244,22 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
+/* call mem_cb() for each mempool memory chunk */
+uint32_t
+rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg)
+{
+	struct rte_mempool_memhdr *hdr;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->mem_list, next) {
+		mem_cb(mp, mem_cb_arg, hdr, n);
+		n++;
+	}
+
+	return n;
+}
+
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 7011a18..0e4641e 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -332,6 +332,15 @@ typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
+ * A memory callback function for mempool.
+ *
+ * Used by rte_mempool_mem_iter().
+ */
+typedef void (rte_mempool_mem_cb_t)(struct rte_mempool *mp,
+		void *opaque, struct rte_mempool_memhdr *memhdr,
+		unsigned mem_idx);
+
+/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -602,6 +611,24 @@ uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
 	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
 
 /**
+ * Call a function for each mempool memory chunk
+ *
+ * Iterate across all memory chunks attached to a rte_mempool and call
+ * the callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param mem_cb
+ *   A function pointer that is called for each memory chunk.
+ * @param mem_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of memory chunks iterated.
+ */
+uint32_t rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 4db75ca..ca887b5 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -21,6 +21,7 @@ DPDK_16.07 {
 	global:
 
 	rte_mempool_obj_iter;
+	rte_mempool_mem_iter;
 
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 18/36] mempool: simplify xmem_usage
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (16 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 17/36] mempool: new function to iterate the memory chunks Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 19/36] mempool: introduce a free callback for memory chunks Olivier Matz
                     ` (19 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Since the previous commit, the function rte_mempool_xmem_usage() is
now the last user of rte_mempool_obj_mem_iter(). This complex code
can now be moved inside the function. We can get rid of the callback
and simplify the code to make it more readable.
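
The semantics are unchanged: a negative return value still encodes how
many elements actually fit in the given memory. A usage sketch (variable
names are illustrative):

  ssize_t usage;

  usage = rte_mempool_xmem_usage(vaddr, elt_num, total_elt_sz,
          paddr, pg_num, pg_shift);
  if (usage < 0)
          printf("only %zd of %u objects fit\n", -usage, elt_num);
  else
          printf("%zd bytes needed for %u objects\n", usage, elt_num);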

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 138 +++++++++++----------------------------
 1 file changed, 37 insertions(+), 101 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 3e9d686..f2f7846 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -126,15 +126,6 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
-/**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */,
-	phys_addr_t /*physaddr*/);
-
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
@@ -158,74 +149,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
-/* Iterate through objects at the given address
- *
- * Given the pointer to the memory, and its topology in physical memory
- * (the physical addresses table), iterate through the "elt_num" objects
- * of size "elt_sz" aligned at "align". For each object in this memory
- * chunk, invoke a callback. It returns the effective number of objects
- * in this memory. */
-static uint32_t
-rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
-	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
-{
-	uint32_t i, j, k;
-	uint32_t pgn, pgf;
-	uintptr_t end, start, va;
-	uintptr_t pg_sz;
-	phys_addr_t physaddr;
-
-	pg_sz = (uintptr_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-
-	i = 0;
-	j = 0;
-
-	while (i != elt_num && j != pg_num) {
-
-		start = RTE_ALIGN_CEIL(va, align);
-		end = start + total_elt_sz;
-
-		/* index of the first page for the next element. */
-		pgf = (end >> pg_shift) - (start >> pg_shift);
-
-		/* index of the last page for the current element. */
-		pgn = ((end - 1) >> pg_shift) - (start >> pg_shift);
-		pgn += j;
-
-		/* do we have enough space left for the element. */
-		if (pgn >= pg_num)
-			break;
-
-		for (k = j;
-				k != pgn &&
-				paddr[k] + pg_sz == paddr[k + 1];
-				k++)
-			;
-
-		/*
-		 * if next pgn chunks of memory physically continuous,
-		 * use it to create next element.
-		 * otherwise, just skip that chunk unused.
-		 */
-		if (k == pgn) {
-			physaddr = paddr[k] + (start & (pg_sz - 1));
-			if (obj_iter != NULL)
-				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i, physaddr);
-			va = end;
-			j += pgf;
-			i++;
-		} else {
-			va = RTE_ALIGN_CEIL((va + 1), pg_sz);
-			j++;
-		}
-	}
-
-	return i;
-}
-
 /* call obj_cb() for each mempool element */
 uint32_t
 rte_mempool_obj_iter(struct rte_mempool *mp,
@@ -343,40 +266,53 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
- * argument to the end of the object. */
-static void
-mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
-{
-	*(uintptr_t *)arg = (uintptr_t)end;
-}
-
 /*
  * Calculate how much memory would be actually required with the
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift)
 {
-	uint32_t n;
-	uintptr_t va, uv;
-	size_t pg_sz, usz;
+	uint32_t elt_cnt = 0;
+	phys_addr_t start, end;
+	uint32_t paddr_idx;
+	size_t pg_sz = (size_t)1 << pg_shift;
 
-	pg_sz = (size_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-	uv = va;
+	/* if paddr is NULL, assume contiguous memory */
+	if (paddr == NULL) {
+		start = 0;
+		end = pg_sz * pg_num;
+		paddr_idx = pg_num;
+	} else {
+		start = paddr[0];
+		end = paddr[0] + pg_sz;
+		paddr_idx = 1;
+	}
+	while (elt_cnt < elt_num) {
+
+		if (end - start >= total_elt_sz) {
+			/* enough contiguous memory, add an object */
+			start += total_elt_sz;
+			elt_cnt++;
+		} else if (paddr_idx < pg_num) {
+			/* no room to store one obj, add a page */
+			if (end == paddr[paddr_idx]) {
+				end += pg_sz;
+			} else {
+				start = paddr[paddr_idx];
+				end = paddr[paddr_idx] + pg_sz;
+			}
+			paddr_idx++;
 
-	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
-			paddr, pg_num, pg_shift, mempool_lelem_iter,
-			&uv)) != elt_num) {
-		return -(ssize_t)n;
+		} else {
+			/* no more pages, return how many elements fit */
+			return -(ssize_t)elt_cnt;
+		}
 	}
 
-	uv = RTE_ALIGN_CEIL(uv, pg_sz);
-	usz = uv - va;
-	return usz;
+	return (size_t)paddr_idx << pg_shift;
 }
 
 #ifndef RTE_LIBRTE_XEN_DOM0
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 19/36] mempool: introduce a free callback for memory chunks
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (17 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 18/36] mempool: simplify xmem_usage Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 20/36] mempool: make page size optional when getting xmem size Olivier Matz
                     ` (18 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce a free callback that is passed to the populate* functions,
which is used when freeing a mempool. This is unused for now, but as the
next commits will populate the mempool with several chunks of memory, we
need a way to free them properly on error.

Later in the series, we will also introduce a public rte_mempool_free()
and the ability for the user to populate a mempool with its own memory.
For that, we also need a free callback.
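
As an illustration, a user-provided callback could look like the sketch
below; my_chunk_free() and the munmap()-backed storage are hypothetical,
only the callback prototype comes from this patch:

  #include <sys/mman.h>

  /* free one memory chunk backed by an anonymous mapping */
  static void
  my_chunk_free(struct rte_mempool_memhdr *memhdr, void *opaque)
  {
          size_t len = (size_t)(uintptr_t)opaque; /* given at populate time */

          munmap(memhdr->addr, len);
  }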

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 27 ++++++++++++++++++++++-----
 lib/librte_mempool/rte_mempool.h |  8 ++++++++
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index f2f7846..0ae899b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -388,6 +388,15 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* free a memchunk allocated with rte_memzone_reserve() */
+__rte_unused static void
+rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	const struct rte_memzone *mz = opaque;
+	rte_memzone_free(mz);
+}
+
 /* Free memory chunks used by a mempool. Objects must be in pool */
 static void
 rte_mempool_free_memchunks(struct rte_mempool *mp)
@@ -405,6 +414,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
 	while (!STAILQ_EMPTY(&mp->mem_list)) {
 		memhdr = STAILQ_FIRST(&mp->mem_list);
 		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		if (memhdr->free_cb != NULL)
+			memhdr->free_cb(memhdr, memhdr->opaque);
 		rte_free(memhdr);
 		mp->nb_mem_chunks--;
 	}
@@ -415,7 +426,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
  * on error. */
 static int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
-	phys_addr_t paddr, size_t len)
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
 {
 	unsigned total_elt_sz;
 	unsigned i = 0;
@@ -436,6 +448,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	memhdr->addr = vaddr;
 	memhdr->phys_addr = paddr;
 	memhdr->len = len;
+	memhdr->free_cb = free_cb;
+	memhdr->opaque = opaque;
 
 	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
@@ -462,7 +476,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
  * number of objects added, or a negative value on error. */
 static int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
 {
 	uint32_t i, n;
 	int ret, cnt = 0;
@@ -480,11 +495,13 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 			;
 
 		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
-			paddr[i], n * pg_sz);
+			paddr[i], n * pg_sz, free_cb, opaque);
 		if (ret < 0) {
 			rte_mempool_free_memchunks(mp);
 			return ret;
 		}
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
 		cnt += ret;
 	}
 	return cnt;
@@ -666,12 +683,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 		ret = rte_mempool_populate_phys(mp, obj,
 			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n);
+			objsz.total_size * n, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	} else {
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift);
+			paddr, pg_num, pg_shift, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 0e4641e..e06ccfc 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -187,6 +187,12 @@ struct rte_mempool_objtlr {
 STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
 
 /**
+ * Callback used to free a memory chunk
+ */
+typedef void (rte_mempool_memchunk_free_cb_t)(struct rte_mempool_memhdr *memhdr,
+	void *opaque);
+
+/**
  * Mempool objects memory header structure
  *
  * The memory chunks where objects are stored. Each chunk is virtually
@@ -198,6 +204,8 @@ struct rte_mempool_memhdr {
 	void *addr;              /**< Virtual address of the chunk */
 	phys_addr_t phys_addr;   /**< Physical address of the chunk */
 	size_t len;              /**< length of the chunk */
+	rte_mempool_memchunk_free_cb_t *free_cb; /**< Free callback */
+	void *opaque;            /**< Argument passed to the free callback */
 };
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 20/36] mempool: make page size optional when getting xmem size
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (18 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 19/36] mempool: introduce a free callback for memory chunks Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 21/36] mempool: default allocation in several memory chunks Olivier Matz
                     ` (17 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Update rte_mempool_xmem_size() so that when the page_shift argument is
set to 0, memory is assumed to be physically contiguous, allowing page
boundaries to be ignored. This will be used in the next commits.

While at it, rename the variable 'n' to 'obj_per_page' and avoid the
assignment inside the if().
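
A quick numeric illustration of the new behavior (arbitrary values):

  /* 1000 objects of 128 bytes each (total_elt_sz) */
  rte_mempool_xmem_size(1000, 128, 0);  /* contiguous: 128000 bytes */
  rte_mempool_xmem_size(1000, 128, 12); /* 4KB pages: 32 objects/page,
                                         * 32 pages -> 131072 bytes */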

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 18 +++++++++---------
 lib/librte_mempool/rte_mempool.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0ae899b..edf26ae 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -252,18 +252,18 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 size_t
 rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
-	size_t n, pg_num, pg_sz, sz;
+	size_t obj_per_page, pg_num, pg_sz;
 
-	pg_sz = (size_t)1 << pg_shift;
+	if (pg_shift == 0)
+		return total_elt_sz * elt_num;
 
-	if ((n = pg_sz / total_elt_sz) > 0) {
-		pg_num = (elt_num + n - 1) / n;
-		sz = pg_num << pg_shift;
-	} else {
-		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
-	}
+	pg_sz = (size_t)1 << pg_shift;
+	obj_per_page = pg_sz / total_elt_sz;
+	if (obj_per_page == 0)
+		return RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 
-	return sz;
+	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
+	return pg_num << pg_shift;
 }
 
 /*
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index e06ccfc..38e5abd 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1257,7 +1257,7 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *   The size of each element, including header and trailer, as returned
  *   by rte_mempool_calc_obj_size().
  * @param pg_shift
- *   LOG2 of the physical pages size.
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
  * @return
  *   Required memory size aligned at page boundary.
  */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 21/36] mempool: default allocation in several memory chunks
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (19 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 20/36] mempool: make page size optional when getting xmem size Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 22/36] eal: lock memory when using no-huge Olivier Matz
                     ` (16 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
much of the specific behavior that was required when hugepages are
not enabled in the rte_mempool_xmem_create() case.

This change requires updating how the kni and mellanox drivers look up
mbuf memory. For now, this will only work if there is a single memory
chunk (like today), but we could make use of rte_mempool_mem_iter() to
support more memory chunks, as sketched below.

We can also remove RTE_MEMPOOL_OBJ_NAME that is not required anymore for
the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (as it was before) when
there is no hugepage support (rte_mempool_xmem_create() has to be used
instead). This is fixed in a later commit.
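
A possible multi-chunk lookup could be sketched as follows; the
register_dma() helper and struct my_ctx are hypothetical driver code,
not part of this patch:

  /* register each memory chunk of the pool for DMA */
  static void
  register_chunk_cb(struct rte_mempool *mp, void *opaque,
          struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
  {
          struct my_ctx *ctx = opaque;

          (void)mp;
          (void)mem_idx;
          register_dma(ctx, memhdr->addr, memhdr->phys_addr, memhdr->len);
  }

  rte_mempool_mem_iter(pktmbuf_pool, register_chunk_cb, ctx);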

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
(cherry picked from commit e2ccba488aec7bfa5f06c12b4f7b771134255296)
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c               |  87 ++++++++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.c          |  87 ++++++++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h          |   2 +-
 lib/librte_kni/rte_kni.c              |  12 +++-
 lib/librte_mempool/rte_dom0_mempool.c |   2 +-
 lib/librte_mempool/rte_mempool.c      | 119 +++++++++++++++++++---------------
 lib/librte_mempool/rte_mempool.h      |  11 ----
 7 files changed, 233 insertions(+), 87 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 089bbec..c8481a7 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1198,8 +1198,71 @@ txq_complete(struct txq *txq)
 	return 0;
 }
 
+struct mlx4_check_mempool_data {
+	int ret;
+	char *start;
+	char *end;
+};
+
+/* Called by mlx4_check_mempool() when iterating the memory chunks. */
+static void mlx4_check_mempool_cb(struct rte_mempool *mp,
+	void *opaque, struct rte_mempool_memhdr *memhdr,
+	unsigned mem_idx)
+{
+	struct mlx4_check_mempool_data *data = opaque;
+
+	(void)mp;
+	(void)mem_idx;
+
+	/* It already failed, skip the next chunks. */
+	if (data->ret != 0)
+		return;
+	/* It is the first chunk. */
+	if (data->start == NULL && data->end == NULL) {
+		data->start = memhdr->addr;
+		data->end = data->start + memhdr->len;
+		return;
+	}
+	if (data->end == memhdr->addr) {
+		data->end += memhdr->len;
+		return;
+	}
+	if (data->start == (char *)memhdr->addr + memhdr->len) {
+		data->start -= memhdr->len;
+		return;
+	}
+	/* Error, mempool is not virtually contiguous. */
+	data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * @return
+ *   0 on success (mempool is virtually contiguous), -1 on error.
+ */
+static int mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start,
+	uintptr_t *end)
+{
+	struct mlx4_check_mempool_data data;
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_mem_iter(mp, mlx4_check_mempool_cb, &data);
+	*start = (uintptr_t)data.start;
+	*end = (uintptr_t)data.end;
+
+	return data.ret;
+}
+
 /* For best performance, this function should not be inlined. */
-static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, const struct rte_mempool *)
+static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, struct rte_mempool *)
 	__attribute__((noinline));
 
 /**
@@ -1214,15 +1277,21 @@ static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, const struct rte_mempool *)
  *   Memory region pointer, NULL in case of error.
  */
 static struct ibv_mr *
-mlx4_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
+mlx4_mp2mr(struct ibv_pd *pd, struct rte_mempool *mp)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	uintptr_t start = mp->elt_va_start;
-	uintptr_t end = mp->elt_va_end;
+	uintptr_t start;
+	uintptr_t end;
 	unsigned int i;
 
+	if (mlx4_check_mempool(mp, &start, &end) != 0) {
+		ERROR("mempool %p: not virtually contiguous",
+			(void *)mp);
+		return NULL;
+	}
+
 	DEBUG("mempool %p area start=%p end=%p size=%zu",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
 	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
@@ -1236,7 +1305,7 @@ mlx4_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
 			end = RTE_ALIGN_CEIL(end, align);
 	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	return ibv_reg_mr(pd,
 			  (void *)start,
@@ -1276,7 +1345,7 @@ txq_mb2mp(struct rte_mbuf *buf)
  *   mr->lkey on success, (uint32_t)-1 on failure.
  */
 static uint32_t
-txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
+txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
 {
 	unsigned int i;
 	struct ibv_mr *mr;
@@ -1294,7 +1363,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	}
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
-	      (void *)txq, mp->name, (const void *)mp);
+	      (void *)txq, mp->name, (void *)mp);
 	mr = mlx4_mp2mr(txq->priv->pd, mp);
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
@@ -1315,7 +1384,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	txq->mp2mr[i].mr = mr;
 	txq->mp2mr[i].lkey = mr->lkey;
 	DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
-	      (void *)txq, mp->name, (const void *)mp, txq->mp2mr[i].lkey);
+	      (void *)txq, mp->name, (void *)mp, txq->mp2mr[i].lkey);
 	return txq->mp2mr[i].lkey;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index afd4338..d1e57d2 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -140,8 +140,71 @@ txq_complete(struct txq *txq)
 	return 0;
 }
 
+struct mlx5_check_mempool_data {
+	int ret;
+	char *start;
+	char *end;
+};
+
+/* Called by mlx5_check_mempool() when iterating the memory chunks. */
+static void mlx5_check_mempool_cb(struct rte_mempool *mp,
+	void *opaque, struct rte_mempool_memhdr *memhdr,
+	unsigned mem_idx)
+{
+	struct mlx5_check_mempool_data *data = opaque;
+
+	(void)mp;
+	(void)mem_idx;
+
+	/* It already failed, skip the next chunks. */
+	if (data->ret != 0)
+		return;
+	/* It is the first chunk. */
+	if (data->start == NULL && data->end == NULL) {
+		data->start = memhdr->addr;
+		data->end = data->start + memhdr->len;
+		return;
+	}
+	if (data->end == memhdr->addr) {
+		data->end += memhdr->len;
+		return;
+	}
+	if (data->start == (char *)memhdr->addr + memhdr->len) {
+		data->start -= memhdr->len;
+		return;
+	}
+	/* Error, mempool is not virtually contiguous. */
+	data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * @return
+ *   0 on success (mempool is virtually contiguous), -1 on error.
+ */
+static int mlx5_check_mempool(struct rte_mempool *mp, uintptr_t *start,
+	uintptr_t *end)
+{
+	struct mlx5_check_mempool_data data;
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_mem_iter(mp, mlx5_check_mempool_cb, &data);
+	*start = (uintptr_t)data.start;
+	*end = (uintptr_t)data.end;
+
+	return data.ret;
+}
+
 /* For best performance, this function should not be inlined. */
-struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *)
+struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, struct rte_mempool *)
 	__attribute__((noinline));
 
 /**
@@ -156,15 +219,21 @@ struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *)
  *   Memory region pointer, NULL in case of error.
  */
 struct ibv_mr *
-mlx5_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
+mlx5_mp2mr(struct ibv_pd *pd, struct rte_mempool *mp)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	uintptr_t start = mp->elt_va_start;
-	uintptr_t end = mp->elt_va_end;
+	uintptr_t start;
+	uintptr_t end;
 	unsigned int i;
 
+	if (mlx5_check_mempool(mp, &start, &end) != 0) {
+		ERROR("mempool %p: not virtually contiguous",
+			(void *)mp);
+		return NULL;
+	}
+
 	DEBUG("mempool %p area start=%p end=%p size=%zu",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
 	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
@@ -178,7 +247,7 @@ mlx5_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
 			end = RTE_ALIGN_CEIL(end, align);
 	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	return ibv_reg_mr(pd,
 			  (void *)start,
@@ -218,7 +287,7 @@ txq_mb2mp(struct rte_mbuf *buf)
  *   mr->lkey on success, (uint32_t)-1 on failure.
  */
 static uint32_t
-txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
+txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
 {
 	unsigned int i;
 	struct ibv_mr *mr;
@@ -236,7 +305,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	}
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
-	      (void *)txq, mp->name, (const void *)mp);
+	      (void *)txq, mp->name, (void *)mp);
 	mr = mlx5_mp2mr(txq->priv->pd, mp);
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
@@ -257,7 +326,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	txq->mp2mr[i].mr = mr;
 	txq->mp2mr[i].lkey = mr->lkey;
 	DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
-	      (void *)txq, mp->name, (const void *)mp, txq->mp2mr[i].lkey);
+	      (void *)txq, mp->name, (void *)mp, txq->mp2mr[i].lkey);
 	return txq->mp2mr[i].lkey;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index db054d6..d522f70 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -341,7 +341,7 @@ uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
 
 /* mlx5_rxtx.c */
 
-struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *);
+struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, struct rte_mempool *);
 void txq_mp2mr_iter(struct rte_mempool *, void *);
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst_sp(void *, struct rte_mbuf **, uint16_t);
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index ea9baf4..3028fd4 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -323,6 +323,7 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 	char intf_name[RTE_KNI_NAMESIZE];
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
+	const struct rte_mempool *mp;
 	struct rte_kni_memzone_slot *slot = NULL;
 
 	if (!pktmbuf_pool || !conf || !conf->name[0])
@@ -415,12 +416,17 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 
 
 	/* MBUF mempool */
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT,
 		pktmbuf_pool->name);
 	mz = rte_memzone_lookup(mz_name);
 	KNI_MEM_CHECK(mz == NULL);
-	dev_info.mbuf_va = mz->addr;
-	dev_info.mbuf_phys = mz->phys_addr;
+	mp = (struct rte_mempool *)mz->addr;
+	/* KNI currently requires to have only one memory chunk */
+	if (mp->nb_mem_chunks != 1)
+		goto kni_fail;
+
+	dev_info.mbuf_va = STAILQ_FIRST(&mp->mem_list)->addr;
+	dev_info.mbuf_phys = STAILQ_FIRST(&mp->mem_list)->phys_addr;
 	ctx->pktmbuf_pool = pktmbuf_pool;
 	ctx->group_id = conf->group_id;
 	ctx->slot_id = slot->id;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0051bd5..dad755c 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -110,7 +110,7 @@ rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	if (pa == NULL)
 		return mp;
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, name);
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
 	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
 	if (mz == NULL) {
 		free(pa);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index edf26ae..5b21d0a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -389,7 +389,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 }
 
 /* free a memchunk allocated with rte_memzone_reserve() */
-__rte_unused static void
+static void
 rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
 	void *opaque)
 {
@@ -507,6 +507,59 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
+/* Default function to populate the mempool: allocate memory in memzones,
+ * and populate them. Return the number of objects added, or a negative
+ * value on error. */
+static int rte_mempool_populate_default(struct rte_mempool *mp)
+{
+	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	size_t size, total_elt_sz, align;
+	unsigned mz_id, n;
+	int ret;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	align = RTE_CACHE_LINE_SIZE;
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
+		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			ret = -ENAMETOOLONG;
+			goto fail;
+		}
+
+		mz = rte_memzone_reserve_aligned(mz_name, size,
+			mp->socket_id, mz_flags, align);
+		/* not enough memory, retry with the biggest zone we have */
+		if (mz == NULL)
+			mz = rte_memzone_reserve_aligned(mz_name, 0,
+				mp->socket_id, mz_flags, align);
+		if (mz == NULL) {
+			ret = -rte_errno;
+			goto fail;
+		}
+
+		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
+			mz->len, rte_mempool_memchunk_mz_free,
+			(void *)(uintptr_t)mz);
+		if (ret < 0)
+			goto fail;
+	}
+
+	return mp->size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
 * That external memory buffer can consist of physically disjoint pages.
@@ -526,13 +579,10 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	struct rte_mempool_list *mempool_list;
 	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te = NULL;
-	const struct rte_memzone *mz;
+	const struct rte_memzone *mz = NULL;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	void *obj;
 	struct rte_mempool_objsz objsz;
-	void *startaddr;
-	int page_size = getpagesize();
 	int ret;
 
 	/* compilation-time checks */
@@ -587,16 +637,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	private_data_size = (private_data_size +
 			     RTE_MEMPOOL_ALIGN_MASK) & (~RTE_MEMPOOL_ALIGN_MASK);
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * expand private data size to a whole page, so that the
-		 * first pool element will start on a new standard page
-		 */
-		int head = sizeof(struct rte_mempool);
-		int new_size = (private_data_size + head) % page_size;
-		if (new_size)
-			private_data_size += page_size - new_size;
-	}
 
 	/* try to allocate tailq entry */
 	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
@@ -613,17 +653,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
-	if (vaddr == NULL)
-		mempool_size += (size_t)objsz.total_size * n;
-
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * we want the memory pool to start on a page boundary,
-		 * because pool elements crossing page boundaries would
-		 * result in discontiguous physical addresses
-		 */
-		mempool_size += page_size;
-	}
 
 	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
 
@@ -631,20 +660,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mz == NULL)
 		goto exit_unlock;
 
-	if (rte_eal_has_hugepages()) {
-		startaddr = (void*)mz->addr;
-	} else {
-		/* align memory pool start address on a page boundary */
-		unsigned long addr = (unsigned long)mz->addr;
-		if (addr & (page_size - 1)) {
-			addr += page_size;
-			addr &= ~(page_size - 1);
-		}
-		startaddr = (void*)addr;
-	}
-
 	/* init the mempool structure */
-	mp = startaddr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
@@ -675,22 +691,17 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
-	if (vaddr == NULL) {
-		/* calculate address of the first element for continuous mempool. */
-		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
-			private_data_size;
-		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
-
-		ret = rte_mempool_populate_phys(mp, obj,
-			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
-	} else {
+	if (vaddr == NULL)
+		ret = rte_mempool_populate_default(mp);
+	else
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
 			paddr, pg_num, pg_shift, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto exit_unlock;
+	} else if (ret != (int)mp->size) {
+		rte_errno = EINVAL;
+		goto exit_unlock;
 	}
 
 	/* call the initializer */
@@ -713,6 +724,8 @@ exit_unlock:
 		rte_ring_free(mp->ring);
 	}
 	rte_free(te);
+	if (mz != NULL)
+		rte_memzone_free(mz);
 
 	return NULL;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 38e5abd..a579953 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -124,17 +124,6 @@ struct rte_mempool_objsz {
 /* "MP_<name>" */
 #define	RTE_MEMPOOL_MZ_FORMAT	RTE_MEMPOOL_MZ_PREFIX "%s"
 
-#ifdef RTE_LIBRTE_XEN_DOM0
-
-/* "<name>_MP_elt" */
-#define	RTE_MEMPOOL_OBJ_NAME	"%s_" RTE_MEMPOOL_MZ_PREFIX "elt"
-
-#else
-
-#define	RTE_MEMPOOL_OBJ_NAME	RTE_MEMPOOL_MZ_FORMAT
-
-#endif /* RTE_LIBRTE_XEN_DOM0 */
-
 #define	MEMPOOL_PG_SHIFT_MAX	(sizeof(uintptr_t) * CHAR_BIT - 1)
 
 /** Mempool over one chunk of physically continuous memory */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 22/36] eal: lock memory when using no-huge
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (20 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 21/36] mempool: default allocation in several memory chunks Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 23/36] mempool: support no-hugepage mode Olivier Matz
                     ` (15 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Although the physical address won't be correct in the memory segment,
this at least allows the physical address to be retrieved using
rte_mem_virt2phy(). Indeed, if a page is not locked, it may not be
present in physical memory.

With the next commit, this allows a mempool to have properly filled
physical addresses when using the --no-huge option.
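
A minimal sketch of why the flag matters, assuming <sys/mman.h> and the
existing rte_mem_virt2phy()/RTE_BAD_PHYS_ADDR API:

  void *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
          MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  if (addr != MAP_FAILED) {
          /* the page is resident, so the pagemap lookup can succeed;
           * without MAP_LOCKED it may return RTE_BAD_PHYS_ADDR */
          phys_addr_t pa = rte_mem_virt2phy(addr);
          (void)pa;
  }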

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..79d1d2d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1074,7 +1074,7 @@ rte_eal_hugepage_init(void)
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+			MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
 					strerror(errno));
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 23/36] mempool: support no-hugepage mode
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (21 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 22/36] eal: lock memory when using no-huge Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 24/36] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
                     ` (14 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce a new function rte_mempool_populate_virt() that is now called
by default when hugepages are not supported. This function populates
the mempool with several physically contiguous chunks whose minimum
size is the page size of the system.

Thanks to this, rte_mempool_create() will work properly without
hugepages (if the object size is smaller than a page size), and 2
specific workarounds can be removed:

- trailer_size was artificially extended to a page size
- rte_mempool_virt2phy() did not rely on the object physical address
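
To see the chunking rule in isolation, here is a toy, self-contained
illustration (not DPDK code, the physical addresses are made up): pages
whose physical addresses are consecutive are merged into a single
chunk, and any gap starts a new one:

  #include <stdio.h>
  #include <stdint.h>
  #include <stddef.h>

  #define PG_SZ 4096UL

  /* hypothetical physical addresses of a 4-page virtual area */
  static const uint64_t page_pa[] = {
          0x100000, 0x101000, 0x102000, /* consecutive: one chunk */
          0x250000,                     /* gap: starts a new chunk */
  };

  int main(void)
  {
          size_t i = 0, n = sizeof(page_pa) / sizeof(page_pa[0]);

          while (i < n) {
                  size_t start = i++;
                  /* merge pages while physical addresses stay consecutive */
                  while (i < n && page_pa[i] == page_pa[i - 1] + PG_SZ)
                          i++;
                  printf("chunk: pa=0x%llx, %zu page(s)\n",
                         (unsigned long long)page_pa[start], i - start);
          }
          return 0;
  }

This prints one 3-page chunk and one 1-page chunk, which is exactly how
rte_mempool_populate_virt() carves a virtual area before handing each
chunk to rte_mempool_populate_phys().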

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 106 ++++++++++++++++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h |  17 ++-----
 2 files changed, 85 insertions(+), 38 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5b21d0a..54f2ab2 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -222,23 +222,6 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 		sz->trailer_size = new_size - sz->header_size - sz->elt_size;
 	}
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * compute trailer size so that pool elements fit exactly in
-		 * a standard page
-		 */
-		int page_size = getpagesize();
-		int new_size = page_size - sz->header_size - sz->elt_size;
-		if (new_size < 0 || (unsigned int)new_size < sz->trailer_size) {
-			printf("When hugepages are disabled, pool objects "
-			       "can't exceed PAGE_SIZE: %d + %d + %d > %d\n",
-			       sz->header_size, sz->elt_size, sz->trailer_size,
-			       page_size);
-			return 0;
-		}
-		sz->trailer_size = new_size;
-	}
-
 	/* this is the size of an object, including header and trailer */
 	sz->total_size = sz->header_size + sz->elt_size + sz->trailer_size;
 
@@ -507,15 +490,72 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
-/* Default function to populate the mempool: allocate memory in mezones,
+/* Populate the mempool with a virtual area. Return the number of
+ * objects added, or a negative value on error. */
+static int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
+{
+	phys_addr_t paddr;
+	size_t off, phys_len;
+	int ret, cnt = 0;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+	/* address and len must be page-aligned */
+	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
+		return -EINVAL;
+	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
+		return -EINVAL;
+
+	for (off = 0; off + pg_sz <= len &&
+		     mp->populated_size < mp->size; off += phys_len) {
+
+		paddr = rte_mem_virt2phy(addr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR) {
+			ret = -EINVAL;
+			goto fail;
+		}
+
+		/* populate with the largest group of contiguous pages */
+		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
+			phys_addr_t paddr_tmp;
+
+			paddr_tmp = rte_mem_virt2phy(addr + off + phys_len);
+			paddr_tmp = rte_mem_phy2mch(-1, paddr_tmp);
+
+			if (paddr_tmp != paddr + phys_len)
+				break;
+		}
+
+		ret = rte_mempool_populate_phys(mp, addr + off, paddr,
+			phys_len, free_cb, opaque);
+		if (ret < 0)
+			goto fail;
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
+		cnt += ret;
+	}
+
+	return cnt;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
+/* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error. */
-static int rte_mempool_populate_default(struct rte_mempool *mp)
+static int
+rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
-	size_t size, total_elt_sz, align;
+	size_t size, total_elt_sz, align, pg_sz, pg_shift;
 	unsigned mz_id, n;
 	int ret;
 
@@ -523,10 +563,19 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
-	align = RTE_CACHE_LINE_SIZE;
+	if (rte_eal_has_hugepages()) {
+		pg_shift = 0; /* not needed, zone is physically contiguous */
+		pg_sz = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else {
+		pg_sz = getpagesize();
+		pg_shift = rte_bsf32(pg_sz);
+		align = pg_sz;
+	}
+
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -546,9 +595,16 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
-			mz->len, rte_mempool_memchunk_mz_free,
-			(void *)(uintptr_t)mz);
+		if (rte_eal_has_hugepages())
+			ret = rte_mempool_populate_phys(mp, mz->addr,
+				mz->phys_addr, mz->len,
+				rte_mempool_memchunk_mz_free,
+				(void *)(uintptr_t)mz);
+		else
+			ret = rte_mempool_populate_virt(mp, mz->addr,
+				mz->len, pg_sz,
+				rte_mempool_memchunk_mz_free,
+				(void *)(uintptr_t)mz);
 		if (ret < 0)
 			goto fail;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a579953..82b0334 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1146,19 +1146,10 @@ rte_mempool_empty(const struct rte_mempool *mp)
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
-	if (rte_eal_has_hugepages()) {
-		const struct rte_mempool_objhdr *hdr;
-		hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
-			sizeof(*hdr));
-		return hdr->physaddr;
-	} else {
-		/*
-		 * If huge pages are disabled, we cannot assume the
-		 * memory region to be physically contiguous.
-		 * Lookup for each element.
-		 */
-		return rte_mem_virt2phy(elt);
-	}
+	const struct rte_mempool_objhdr *hdr;
+	hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
+		sizeof(*hdr));
+	return hdr->physaddr;
 }
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 24/36] mempool: replace mempool physaddr by a memzone pointer
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (22 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 23/36] mempool: support no-hugepage mode Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 25/36] mempool: introduce a function to free a mempool Olivier Matz
                     ` (13 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Storing a pointer to the memzone, rather than only its physical
address, gives access to more information, for instance the memzone
flags.

Moreover, keeping the memzone pointer will allow us to free the mempool
(this is done later in the series).
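
A fragment of what the stored pointer makes possible (hypothetical
usage, not part of the patch):

  /* the memzone flags are now reachable from the mempool */
  if (mp->mz->flags & RTE_MEMZONE_1GB)
          printf("mempool header is in a 1GB hugepage zone\n");

  /* and the backing zone can be released when freeing the pool */
  rte_memzone_free(mp->mz);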

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 4 ++--
 lib/librte_mempool/rte_mempool.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 54f2ab2..7336616 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -719,7 +719,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
-	mp->phys_addr = mz->phys_addr;
+	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
@@ -983,7 +983,7 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	fprintf(f, "mempool <%s>@%p\n", mp->name, mp);
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
-	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->mz->phys_addr);
 	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
 	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 82b0334..4a8c76b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -203,7 +203,7 @@ struct rte_mempool_memhdr {
 struct rte_mempool {
 	char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
 	struct rte_ring *ring;           /**< Ring to store objects. */
-	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
+	const struct rte_memzone *mz;    /**< Memzone where mempool is allocated */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Max size of the mempool. */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 25/36] mempool: introduce a function to free a mempool
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (23 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 24/36] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 26/36] mempool: introduce a function to create an empty mempool Olivier Matz
                     ` (12 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce rte_mempool_free() that:

- unlinks the mempool from the global list if it is found
- frees all the memory chunks using their free callbacks
- frees the internal ring
- frees the memzone containing the mempool

Currently this function is only used in error cases when
creating a new mempool, but it will be made public later
in the patch series.
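
A fragment sketching the eventual caller-side pairing (the function is
still static in this patch and is exported later in the series;
parameter values are illustrative):

  struct rte_mempool *mp;

  mp = rte_mempool_create("example_pool", 1023, 2048, 32, 0,
                          NULL, NULL, NULL, NULL, SOCKET_ID_ANY, 0);
  if (mp == NULL)
          return; /* rte_errno is set */
  /* ... use the pool ... */
  rte_mempool_free(mp); /* unlinks it, frees chunks, ring and memzone */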

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7336616..b432aae 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -616,6 +616,35 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* free a mempool */
+static void
+rte_mempool_free(struct rte_mempool *mp)
+{
+	struct rte_mempool_list *mempool_list = NULL;
+	struct rte_tailq_entry *te;
+
+	if (mp == NULL)
+		return;
+
+	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	/* find out tailq entry */
+	TAILQ_FOREACH(te, mempool_list, next) {
+		if (te->data == (void *)mp)
+			break;
+	}
+
+	if (te != NULL) {
+		TAILQ_REMOVE(mempool_list, te, next);
+		rte_free(te);
+	}
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	rte_mempool_free_memchunks(mp);
+	rte_ring_free(mp->ring);
+	rte_memzone_free(mp->mz);
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -775,13 +804,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL) {
-		rte_mempool_free_memchunks(mp);
-		rte_ring_free(mp->ring);
-	}
-	rte_free(te);
-	if (mz != NULL)
-		rte_memzone_free(mz);
+	rte_mempool_free(mp);
 
 	return NULL;
 }
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 26/36] mempool: introduce a function to create an empty mempool
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (24 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 25/36] mempool: introduce a function to free a mempool Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 15:57     ` Wiles, Keith
  2016-04-14 10:19   ` [PATCH 27/36] eal/xen: return machine address without knowing memseg id Olivier Matz
                     ` (11 subsequent siblings)
  37 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Introduce a new function rte_mempool_create_empty()
that allocates a mempool that is not populated.

The functions rte_mempool_create() and rte_mempool_xmem_create()
now make use of it, which makes their code much easier to read.
Currently, they are the only users of rte_mempool_create_empty(),
but the function will be made public in later commits.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 185 ++++++++++++++++++++++-----------------
 1 file changed, 107 insertions(+), 78 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index b432aae..03d506a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -318,30 +318,6 @@ rte_dom0_mempool_create(const char *name __rte_unused,
 }
 #endif
 
-/* create the mempool */
-struct rte_mempool *
-rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-	else
-		return rte_mempool_xmem_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags,
-					       NULL, NULL, MEMPOOL_PG_NUM_DEFAULT,
-					       MEMPOOL_PG_SHIFT_MAX);
-}
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -645,20 +621,11 @@ rte_mempool_free(struct rte_mempool *mp)
 	rte_memzone_free(mp->mz);
 }
 
-/*
- * Create the mempool over already allocated chunk of memory.
- * That external memory buffer can consists of physically disjoint pages.
- * Setting vaddr to NULL, makes mempool to fallback to original behaviour
- * and allocate space for mempool and it's elements as one big chunk of
- * physically continuos memory.
- * */
-struct rte_mempool *
-rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags, void *vaddr,
-		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+/* create an empty mempool */
+static struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
@@ -668,7 +635,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
-	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -691,18 +657,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		return NULL;
 	}
 
-	/* check that we have both VA and PA */
-	if (vaddr != NULL && paddr == NULL) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
-	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
 	/* "no cache align" imply "no spread" */
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
@@ -730,11 +684,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 	}
 
-	/*
-	 * If user provided an external memory buffer, then use it to
-	 * store mempool objects. Otherwise reserve a memzone that is large
-	 * enough to hold mempool header and metadata plus mempool objects.
-	 */
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
@@ -746,12 +695,14 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 
 	/* init the mempool structure */
+	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
 	mp->header_size = objsz.header_size;
 	mp->trailer_size = objsz.trailer_size;
@@ -771,41 +722,119 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->local_cache = (struct rte_mempool_cache *)
 		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* call the initializer */
+	te->data = mp;
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_INSERT_TAIL(mempool_list, te, next);
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+
+	return mp;
+
+exit_unlock:
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	rte_free(te);
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/* create the mempool */
+struct rte_mempool *
+rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+	int socket_id, unsigned flags)
+{
+	struct rte_mempool *mp;
+
+	if (rte_xen_dom0_supported())
+		return rte_dom0_mempool_create(name, n, elt_size,
+					       cache_size, private_data_size,
+					       mp_init, mp_init_arg,
+					       obj_init, obj_init_arg,
+					       socket_id, flags);
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
+
+	/* call the mempool priv initializer */
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	/* mempool elements allocated together with mempool */
+	if (rte_mempool_populate_default(mp) < 0)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+
+	return mp;
+
+ fail:
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/*
+ * Create the mempool over an already allocated chunk of memory.
+ * That external memory buffer can consist of physically disjoint pages.
+ * Setting vaddr to NULL makes the mempool fall back to the original
+ * behaviour and allocate space for the mempool and its elements as one
+ * big chunk of physically contiguous memory.
+ */
+struct rte_mempool *
+rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
+		unsigned cache_size, unsigned private_data_size,
+		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+		int socket_id, unsigned flags, void *vaddr,
+		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	struct rte_mempool *mp = NULL;
+	int ret;
+
+	/* no virtual address supplied, use rte_mempool_create() */
 	if (vaddr == NULL)
-		ret = rte_mempool_populate_default(mp);
-	else
-		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift, NULL, NULL);
-	if (ret < 0) {
-		rte_errno = -ret;
-		goto exit_unlock;
-	} else if (ret != (int)mp->size) {
+		return rte_mempool_create(name, n, elt_size, cache_size,
+			private_data_size, mp_init, mp_init_arg,
+			obj_init, obj_init_arg, socket_id, flags);
+
+	/* check that we have both VA and PA */
+	if (paddr == NULL) {
 		rte_errno = EINVAL;
-		goto exit_unlock;
+		return NULL;
 	}
 
-	/* call the initializer */
-	if (obj_init)
-		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+	/* Check that pg_shift parameter is valid. */
+	if (pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
 
-	te->data = (void *) mp;
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
 
-	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
-	TAILQ_INSERT_TAIL(mempool_list, te, next);
-	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	/* call the mempool priv initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
+
+	ret = rte_mempool_populate_phys_tab(mp, vaddr, paddr, pg_num, pg_shift,
+		NULL, NULL);
+	if (ret < 0 || ret != (int)mp->size)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	return mp;
 
-exit_unlock:
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+ fail:
 	rte_mempool_free(mp);
-
 	return NULL;
 }
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 27/36] eal/xen: return machine address without knowing memseg id
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (25 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 26/36] mempool: introduce a function to create an empty mempool Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 28/36] mempool: rework support of xen dom0 Olivier Matz
                     ` (10 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

The conversion from a guest physical address to a machine physical
address is fast when the caller knows which memseg the gpa belongs to.

When the caller does not know it, just find it by browsing the
segments. This feature will be used by the next commit.
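
A caller-side fragment of the new convenience (seg_id, gpa and
handle_error() are hypothetical):

  phys_addr_t mach;

  /* memseg id known by the caller: fast path */
  mach = rte_mem_phy2mch(seg_id, gpa);

  /* memseg id unknown: pass -1 and let the function browse them */
  mach = rte_mem_phy2mch(-1, gpa);
  if (mach == RTE_BAD_PHYS_ADDR)
          handle_error(); /* gpa is not backed by any memseg */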

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_memory.h   | 11 ++++++-----
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c | 17 +++++++++++++++--
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index f8dbece..0661109 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -200,21 +200,22 @@ unsigned rte_memory_get_nrank(void);
 int rte_xen_dom0_supported(void);
 
 /**< Internal use only - phys to virt mapping for xen */
-phys_addr_t rte_xen_mem_phy2mch(uint32_t, const phys_addr_t);
+phys_addr_t rte_xen_mem_phy2mch(int32_t, const phys_addr_t);
 
 /**
  * Return the physical address of elt, which is an element of the pool mp.
  *
  * @param memseg_id
- *   The mempool is from which memory segment.
+ *   Identifier of the memory segment owning the physical address. If
+ *   set to -1, find it automatically.
  * @param phy_addr
  *   physical address of elt.
  *
  * @return
- *   The physical address or error.
+ *   The physical address or RTE_BAD_PHYS_ADDR on error.
  */
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
 	if (rte_xen_dom0_supported())
 		return rte_xen_mem_phy2mch(memseg_id, phy_addr);
@@ -250,7 +251,7 @@ static inline int rte_xen_dom0_supported(void)
 }
 
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
 {
 	return phy_addr;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
index 495eef9..efbd374 100644
--- a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
@@ -156,13 +156,26 @@ get_xen_memory_size(void)
 * Based on physical address to calculate MFN in Xen Dom0.
  */
 phys_addr_t
-rte_xen_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_xen_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
-	int mfn_id;
+	int mfn_id, i;
 	uint64_t mfn, mfn_offset;
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct rte_memseg *memseg = mcfg->memseg;
 
+	/* find the memory segment owning the physical address */
+	if (memseg_id == -1) {
+		for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+			if ((phy_addr >= memseg[i].phys_addr) &&
+				(phy_addr < memseg[i].phys_addr + memseg[i].size)) {
+				memseg_id = i;
+				break;
+			}
+		}
+		if (memseg_id == -1)
+			return RTE_BAD_PHYS_ADDR;
+	}
+
 	mfn_id = (phy_addr - memseg[memseg_id].phys_addr) / RTE_PGSIZE_2M;
 
 	/*the MFN is contiguous in 2M */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 28/36] mempool: rework support of xen dom0
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (26 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 27/36] eal/xen: return machine address without knowing memseg id Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 29/36] mempool: create the internal ring when populating Olivier Matz
                     ` (9 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Avoid having a specific file for that, and remove the #ifdefs.
Now that we have introduced a function to populate a mempool
with a virtual area, supporting xen dom0 is much easier.

The only thing we need to do is convert the guest physical
address into the machine physical address using rte_mem_phy2mch().
This function does nothing when not running xen.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/Makefile                |   3 -
 lib/librte_mempool/rte_dom0_mempool.c      | 133 -----------------------------
 lib/librte_mempool/rte_mempool.c           |  33 ++-----
 lib/librte_mempool/rte_mempool.h           |  89 -------------------
 lib/librte_mempool/rte_mempool_version.map |   1 -
 5 files changed, 5 insertions(+), 254 deletions(-)
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 706f844..43423e0 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,9 +42,6 @@ LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
-ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
-SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
-endif
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h
 
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
deleted file mode 100644
index dad755c..0000000
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ /dev/null
@@ -1,133 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <unistd.h>
-#include <stdarg.h>
-#include <inttypes.h>
-#include <errno.h>
-#include <sys/queue.h>
-
-#include <rte_common.h>
-#include <rte_log.h>
-#include <rte_debug.h>
-#include <rte_memory.h>
-#include <rte_memzone.h>
-#include <rte_atomic.h>
-#include <rte_launch.h>
-#include <rte_eal.h>
-#include <rte_eal_memconfig.h>
-#include <rte_per_lcore.h>
-#include <rte_lcore.h>
-#include <rte_branch_prediction.h>
-#include <rte_ring.h>
-#include <rte_errno.h>
-#include <rte_string_fns.h>
-#include <rte_spinlock.h>
-
-#include "rte_mempool.h"
-
-static void
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num,
-	uint32_t pg_sz, uint32_t memseg_id)
-{
-	uint32_t i;
-	uint64_t virt_addr, mfn_id;
-	struct rte_mem_config *mcfg;
-	uint32_t page_size = getpagesize();
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	virt_addr = (uintptr_t) mcfg->memseg[memseg_id].addr;
-
-	for (i = 0; i != pg_num; i++) {
-		mfn_id = ((uintptr_t)va + i * pg_sz - virt_addr) / RTE_PGSIZE_2M;
-		pa[i] = mcfg->memseg[memseg_id].mfn[mfn_id] * page_size;
-	}
-}
-
-/* create the mempool for supporting Dom0 */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp = NULL;
-	phys_addr_t *pa;
-	char *va;
-	size_t sz;
-	uint32_t pg_num, pg_shift, pg_sz, total_size;
-	const struct rte_memzone *mz;
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-
-	pg_sz = RTE_PGSIZE_2M;
-
-	pg_shift = rte_bsf32(pg_sz);
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift) +
-		RTE_PGSIZE_2M;
-	pg_num = sz >> pg_shift;
-
-	/* extract physical mappings of the allocated memory. */
-	pa = calloc(pg_num, sizeof (*pa));
-	if (pa == NULL)
-		return mp;
-
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
-	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
-	if (mz == NULL) {
-		free(pa);
-		return mp;
-	}
-
-	va = (char *)RTE_ALIGN_CEIL((uintptr_t)mz->addr, RTE_PGSIZE_2M);
-	/* extract physical mappings of the allocated memory. */
-	get_phys_map(va, pa, pg_num, pg_sz, mz->memseg_id);
-
-	mp = rte_mempool_xmem_create(name, elt_num, elt_size,
-		cache_size, private_data_size,
-		mp_init, mp_init_arg,
-		obj_init, obj_init_arg,
-		socket_id, flags, va, pa, pg_num, pg_shift);
-
-	free(pa);
-
-	return mp;
-}
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 03d506a..5f9ec63 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -298,26 +298,6 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 	return (size_t)paddr_idx << pg_shift;
 }
 
-#ifndef RTE_LIBRTE_XEN_DOM0
-/* stub if DOM0 support not configured */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name __rte_unused,
-			unsigned n __rte_unused,
-			unsigned elt_size __rte_unused,
-			unsigned cache_size __rte_unused,
-			unsigned private_data_size __rte_unused,
-			rte_mempool_ctor_t *mp_init __rte_unused,
-			void *mp_init_arg __rte_unused,
-			rte_mempool_obj_ctor_t *obj_init __rte_unused,
-			void *obj_init_arg __rte_unused,
-			int socket_id __rte_unused,
-			unsigned flags __rte_unused)
-{
-	rte_errno = EINVAL;
-	return NULL;
-}
-#endif
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -490,6 +470,9 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 		     mp->populated_size < mp->size; off += phys_len) {
 
 		paddr = rte_mem_virt2phy(addr + off);
+		/* required for xen_dom0 to get the machine address */
+		paddr = rte_mem_phy2mch(-1, paddr);
+
 		if (paddr == RTE_BAD_PHYS_ADDR) {
 			ret = -EINVAL;
 			goto fail;
@@ -571,7 +554,8 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		if (rte_eal_has_hugepages())
+		/* use memzone physical address if it is valid */
+		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
 				mz->phys_addr, mz->len,
 				rte_mempool_memchunk_mz_free,
@@ -747,13 +731,6 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 {
 	struct rte_mempool *mp;
 
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-
 	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
 		private_data_size, socket_id, flags);
 	if (mp == NULL)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 4a8c76b..658d4a2 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -501,95 +501,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
- * Create a new mempool named *name* in memory on Xen Dom0.
- *
- * This function uses ``rte_mempool_xmem_create()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
- * All elements of the mempool are allocated together with the mempool header,
- * and memory buffer can consist of set of disjoint physical pages.
- *
- * @param name
- *   The name of the mempool.
- * @param n
- *   The number of elements in the mempool. The optimum size (in terms of
- *   memory usage) for a mempool is when n is a power of two minus one:
- *   n = (2^q - 1).
- * @param elt_size
- *   The size of each element.
- * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
- * @param private_data_size
- *   The size of the private data appended after the mempool
- *   structure. This is useful for storing some private data after the
- *   mempool structure, as is done for rte_mbuf_pool for example.
- * @param mp_init
- *   A function pointer that is called for initialization of the pool,
- *   before object initialization. The user can initialize the private
- *   data in this function if needed. This parameter can be NULL if
- *   not needed.
- * @param mp_init_arg
- *   An opaque pointer to data that can be used in the mempool
- *   constructor function.
- * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
- * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
- * @param socket_id
- *   The *socket_id* argument is the socket identifier in the case of
- *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
- *   constraint for the reserved zone.
- * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
- * @return
- *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
- */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags);
-
-
-/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index ca887b5..c4f2da0 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -1,7 +1,6 @@
 DPDK_2.0 {
 	global:
 
-	rte_dom0_mempool_create;
 	rte_mempool_audit;
 	rte_mempool_calc_obj_size;
 	rte_mempool_count;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 29/36] mempool: create the internal ring when populating
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (27 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 28/36] mempool: rework support of xen dom0 Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 30/36] mempool: populate a mempool with anonymous memory Olivier Matz
                     ` (8 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Instead of creating the internal ring at mempool creation, do
it when populating the mempool with the first memory chunk. The
objective here is to simplify the introduction of the external
handler.

For instance, this will be possible:

  mp = rte_mempool_create_empty(...)
  rte_mempool_set_ext_handler(mp, my_handler)
  rte_mempool_populate_default(mp)

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 12 +++++++++---
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5f9ec63..eaae5a0 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -324,6 +324,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 		return -rte_errno;
 
 	mp->ring = r;
+	mp->flags |= MEMPOOL_F_RING_CREATED;
 	return 0;
 }
 
@@ -372,6 +373,14 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	unsigned i = 0;
 	size_t off;
 	struct rte_mempool_memhdr *memhdr;
+	int ret;
+
+	/* create the internal ring if not already done */
+	if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
+		ret = rte_mempool_ring_create(mp);
+		if (ret < 0)
+			return ret;
+	}
 
 	/* mempool is already populated */
 	if (mp->populated_size >= mp->size)
@@ -696,9 +705,6 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	STAILQ_INIT(&mp->elt_list);
 	STAILQ_INIT(&mp->mem_list);
 
-	if (rte_mempool_ring_create(mp) < 0)
-		goto exit_unlock;
-
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 658d4a2..721d8e7 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -234,6 +234,7 @@ struct rte_mempool {
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
+#define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
 
 /**
  * @internal When debug is enabled, store some statistics.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 30/36] mempool: populate a mempool with anonymous memory
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (28 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 29/36] mempool: create the internal ring when populating Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 31/36] mempool: make mempool populate and free api public Olivier Matz
                     ` (7 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Now that we can populate a mempool with any virtual memory,
it is easier to introduce a function that populates a mempool
with memory coming from an anonymous mapping, as is done
in test-pmd.

The next commit will replace test-pmd's anonymous mapping with
this function.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 64 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index eaae5a0..5c21f08 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -39,6 +39,7 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/mman.h>
 
 #include <rte_common.h>
 #include <rte_log.h>
@@ -585,6 +586,69 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* return the memory size required for mempool objects in anonymous mem */
+static size_t
+get_anon_size(const struct rte_mempool *mp)
+{
+	size_t size, total_elt_sz, pg_sz, pg_shift;
+
+	pg_sz = getpagesize();
+	pg_shift = rte_bsf32(pg_sz);
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	size = rte_mempool_xmem_size(mp->size, total_elt_sz, pg_shift);
+
+	return size;
+}
+
+/* unmap a memory zone mapped by rte_mempool_populate_anon() */
+static void
+rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	munmap(opaque, get_anon_size(memhdr->mp));
+}
+
+/* populate the mempool with an anonymous mapping */
+__rte_unused static int
+rte_mempool_populate_anon(struct rte_mempool *mp)
+{
+	size_t size;
+	int ret;
+	char *addr;
+
+	/* mempool is already populated, error */
+	if (!STAILQ_EMPTY(&mp->mem_list)) {
+		rte_errno = EINVAL;
+		return 0;
+	}
+
+	/* get chunk of virtually continuous memory */
+	size = get_anon_size(mp);
+	addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
+		MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (addr == MAP_FAILED) {
+		rte_errno = errno;
+		return 0;
+	}
+	/* can't use MAP_LOCKED, it does not exist on BSD */
+	if (mlock(addr, size) < 0) {
+		rte_errno = errno;
+		munmap(addr, size);
+		return 0;
+	}
+
+	ret = rte_mempool_populate_virt(mp, addr, size, getpagesize(),
+		rte_mempool_memchunk_anon_free, addr);
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto fail;
+	}
+
+	return mp->populated_size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return 0;
+}
+
 /* free a mempool */
 static void
 rte_mempool_free(struct rte_mempool *mp)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 31/36] mempool: make mempool populate and free api public
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (29 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 30/36] mempool: populate a mempool with anonymous memory Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 32/36] test-pmd: remove specific anon mempool code Olivier Matz
                     ` (6 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Add the following functions to the public mempool API (a combined
usage sketch follows the list):

- rte_mempool_create_empty()
- rte_mempool_populate_phys()
- rte_mempool_populate_phys_tab()
- rte_mempool_populate_virt()
- rte_mempool_populate_default()
- rte_mempool_populate_anon()
- rte_mempool_free()
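
A combined usage sketch of the new API (illustrative sizes; only the
anonymous-mapping populate variant is shown):

  #include <rte_mempool.h>

  static struct rte_mempool *
  make_pool(void)
  {
          struct rte_mempool *mp;

          mp = rte_mempool_create_empty("sketch_pool", 1023, 2048,
                                        32, 0, SOCKET_ID_ANY, 0);
          if (mp == NULL)
                  return NULL;

          /* any rte_mempool_populate_*() variant can be used here */
          if (rte_mempool_populate_anon(mp) == 0) {
                  rte_mempool_free(mp);
                  return NULL;
          }
          return mp;
  }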

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           |  14 +--
 lib/librte_mempool/rte_mempool.h           | 168 +++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |   9 +-
 3 files changed, 183 insertions(+), 8 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 5c21f08..4850f5d 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -365,7 +365,7 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
 /* Add objects in the pool, using a physically contiguous memory
  * zone. Return the number of objects added, or a negative value
  * on error. */
-static int
+int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -423,7 +423,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 
 /* Add objects in the pool, using a table of physical pages. Return the
  * number of objects added, or a negative value on error. */
-static int
+int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
 	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
@@ -458,7 +458,7 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 
 /* Populate the mempool with a virtual area. Return the number of
  * objects added, or a negative value on error. */
-static int
+int
 rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -518,7 +518,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 /* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error. */
-static int
+int
 rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
@@ -609,7 +609,7 @@ rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
 }
 
 /* populate the mempool with an anonymous mapping */
-__rte_unused static int
+int
 rte_mempool_populate_anon(struct rte_mempool *mp)
 {
 	size_t size;
@@ -650,7 +650,7 @@ rte_mempool_populate_anon(struct rte_mempool *mp)
 }
 
 /* free a mempool */
-static void
+void
 rte_mempool_free(struct rte_mempool *mp)
 {
 	struct rte_mempool_list *mempool_list = NULL;
@@ -679,7 +679,7 @@ rte_mempool_free(struct rte_mempool *mp)
 }
 
 /* create an empty mempool */
-static struct rte_mempool *
+struct rte_mempool *
 rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	int socket_id, unsigned flags)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 721d8e7..fe4e6fd 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -502,6 +502,174 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
+ * Create an empty mempool
+ *
+ * The mempool is allocated and initialized, but it is not populated: no
+ * memory is allocated for the mempool elements. The user has to call
+ * rte_mempool_populate_*() to add memory chunks to the pool. Once
+ * populated, the user may also want to initialize each object with
+ * rte_mempool_obj_iter().
+ *
+ * @param name
+ *   The name of the mempool.
+ * @param n
+ *   The maximum number of elements that can be added in the mempool.
+ *   The optimum size (in terms of memory usage) for a mempool is when n
+ *   is a power of two minus one: n = (2^q - 1).
+ * @param elt_size
+ *   The size of each element.
+ * @param cache_size
+ *   Size of the cache. See rte_mempool_create() for details.
+ * @param private_data_size
+ *   The size of the private data appended after the mempool
+ *   structure. This is useful for storing some private data after the
+ *   mempool structure, as is done for rte_mbuf_pool for example.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in the case of
+ *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
+ */
+struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags);
+
+/**
+ * Free a mempool
+ *
+ * Unlink the mempool from global list, free the memory chunks, and all
+ * memory referenced by the mempool. The objects must not be used by
+ * other cores as they will be freed.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ */
+void
+rte_mempool_free(struct rte_mempool *mp);
+
+/**
+ * Add physically contiguous memory for objects in the pool at init
+ *
+ * Add a virtually and physically contiguous memory chunk in the pool
+ * where objects can be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   The physical address of the memory area.
+ * @param len
+ *   The length of memory in bytes.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add physical memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated. The physical addresses corresponding to the virtual
+ * area are described in paddr[], pg_num, pg_shift.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   An array of physical addresses of each page composing the virtual
+ *   area.
+ * @param pg_num
+ *   Number of elements in the paddr array.
+ * @param pg_shift
+ *   LOG2 of the physical pages size.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunks are not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque);
+
+/**
+ * Add virtually contiguous memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param addr
+ *   The virtual address of memory that should be used to store objects.
+ *   Must be page-aligned.
+ * @param len
+ *   The length of memory in bytes. Must be page-aligned.
+ * @param pg_sz
+ *   The size of memory pages in this virtual area.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add memory for objects in the pool at init
+ *
+ * This is the default function used by rte_mempool_create() to populate
+ * the mempool. It adds memory allocated using rte_memzone_reserve().
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_default(struct rte_mempool *mp);
+
+/**
+ * Add memory from anonymous mapping for objects in the pool at init
+ *
+ * This function mmaps an anonymous memory zone that is locked in
+ * memory to store the objects of the mempool.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, 0 is returned, rte_errno is set, and the chunk is not
+ *   added in the memory list of the mempool.
+ */
+int rte_mempool_populate_anon(struct rte_mempool *mp);
+
+/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index c4f2da0..7d1f670 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -16,11 +16,18 @@ DPDK_2.0 {
 	local: *;
 };
 
-DPDK_16.07 {
+DPDK_16.07 {
 	global:
 
 	rte_mempool_obj_iter;
 	rte_mempool_mem_iter;
+	rte_mempool_create_empty;
+	rte_mempool_populate_phys;
+	rte_mempool_populate_phys_tab;
+	rte_mempool_populate_virt;
+	rte_mempool_populate_default;
+	rte_mempool_populate_anon;
+	rte_mempool_free;
 
 	local: *;
 } DPDK_2.0;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 32/36] test-pmd: remove specific anon mempool code
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (30 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 31/36] mempool: make mempool populate and free api public Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 33/36] mem: avoid memzone/mempool/ring name truncation Olivier Matz
                     ` (5 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Now that the mempool library provides functions to populate a mempool
with anonymous mmap'd memory, we can remove this specific code from
test-pmd.
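
A minimal sketch of the replacement idiom, matching the testpmd change
below (nb_mbuf, mb_size and mb_mempool_cache are the usual testpmd
parameters):

  rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf, mb_size,
          (unsigned) mb_mempool_cache,
          sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
  if (rte_mp == NULL || rte_mempool_populate_anon(rte_mp) <= 0) {
          rte_mempool_free(rte_mp);
          rte_mp = NULL;
  } else {
          rte_pktmbuf_pool_init(rte_mp, NULL);
          rte_mempool_obj_iter(rte_mp, rte_pktmbuf_init, NULL);
  }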

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/Makefile        |   4 -
 app/test-pmd/mempool_anon.c  | 201 -------------------------------------------
 app/test-pmd/mempool_osdep.h |  54 ------------
 app/test-pmd/testpmd.c       |  26 ++++--
 4 files changed, 17 insertions(+), 268 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 72426f3..40039a1 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -58,11 +58,7 @@ SRCS-y += txonly.c
 SRCS-y += csumonly.c
 SRCS-y += icmpecho.c
 SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
-SRCS-y += mempool_anon.c
 
-ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
-CFLAGS_mempool_anon.o := -D_GNU_SOURCE
-endif
 CFLAGS_cmdline.o := -D_GNU_SOURCE
 
 # this application needs libraries first
diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
deleted file mode 100644
index 5e23848..0000000
--- a/app/test-pmd/mempool_anon.c
+++ /dev/null
@@ -1,201 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <sys/types.h>
-#include <sys/stat.h>
-#include "mempool_osdep.h"
-#include <rte_errno.h>
-
-#ifdef RTE_EXEC_ENV_LINUXAPP
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/mman.h>
-
-
-#define	PAGEMAP_FNAME		"/proc/self/pagemap"
-
-/*
- * the pfn (page frame number) are bits 0-54 (see pagemap.txt in linux
- * Documentation).
- */
-#define	PAGEMAP_PFN_BITS	54
-#define	PAGEMAP_PFN_MASK	RTE_LEN2MASK(PAGEMAP_PFN_BITS, phys_addr_t)
-
-
-static int
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num, uint32_t pg_sz)
-{
-	int32_t fd, rc;
-	uint32_t i, nb;
-	off_t ofs;
-
-	ofs = (uintptr_t)va / pg_sz * sizeof(*pa);
-	nb = pg_num * sizeof(*pa);
-
-	if ((fd = open(PAGEMAP_FNAME, O_RDONLY)) < 0)
-		return ENOENT;
-
-	if ((rc = pread(fd, pa, nb, ofs)) < 0 || (rc -= nb) != 0) {
-
-		RTE_LOG(ERR, USER1, "failed read of %u bytes from \'%s\' "
-			"at offset %zu, error code: %d\n",
-			nb, PAGEMAP_FNAME, (size_t)ofs, errno);
-		rc = ENOENT;
-	}
-
-	close(fd);
-
-	for (i = 0; i != pg_num; i++)
-		pa[i] = (pa[i] & PAGEMAP_PFN_MASK) * pg_sz;
-
-	return rc;
-}
-
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp;
-	phys_addr_t *pa;
-	char *va, *uv;
-	uint32_t n, pg_num, pg_shift, pg_sz, total_size;
-	size_t sz;
-	ssize_t usz;
-	int32_t rc;
-
-	rc = ENOMEM;
-	mp = NULL;
-
-	pg_sz = getpagesize();
-	if (rte_is_power_of_2(pg_sz) == 0) {
-		rte_errno = EINVAL;
-		return mp;
-	}
-
-	pg_shift = rte_bsf32(pg_sz);
-
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift);
-	pg_num = sz >> pg_shift;
-
-	/* get chunk of virtually continuos memory.*/
-	if ((va = mmap(NULL, sz, PROT_READ | PROT_WRITE,
-			MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED,
-			-1, 0)) == MAP_FAILED) {
-		RTE_LOG(ERR, USER1, "%s(%s) failed mmap of %zu bytes, "
-			"error code: %d\n",
-			__func__, name, sz, errno);
-		rte_errno = rc;
-		return mp;
-	}
-
-	/* extract physical mappings of the allocated memory. */
-	if ((pa = calloc(pg_num, sizeof (*pa))) != NULL &&
-			(rc = get_phys_map(va, pa, pg_num, pg_sz)) == 0) {
-
-		/*
-		 * Check that allocated size is big enough to hold elt_num
-		 * objects and a calcualte how many bytes are actually required.
-		 */
-
-		if ((usz = rte_mempool_xmem_usage(va, elt_num, total_size, pa,
-				pg_num, pg_shift)) < 0) {
-
-			n = -usz;
-			rc = ENOENT;
-			RTE_LOG(ERR, USER1, "%s(%s) only %u objects from %u "
-				"requested can  be created over "
-				"mmaped region %p of %zu bytes\n",
-				__func__, name, n, elt_num, va, sz);
-		} else {
-
-			/* unmap unused pages if any */
-			if ((size_t)usz < sz) {
-
-				uv = va + usz;
-				usz = sz - usz;
-
-				RTE_LOG(INFO, USER1,
-					"%s(%s): unmap unused %zu of %zu "
-					"mmaped bytes @%p\n",
-					__func__, name, (size_t)usz, sz, uv);
-				munmap(uv, usz);
-				sz -= usz;
-				pg_num = sz >> pg_shift;
-			}
-
-			if ((mp = rte_mempool_xmem_create(name, elt_num,
-					elt_size, cache_size, private_data_size,
-					mp_init, mp_init_arg,
-					obj_init, obj_init_arg,
-					socket_id, flags, va, pa, pg_num,
-					pg_shift)) != NULL)
-
-				RTE_VERIFY(elt_num == mp->size);
-		}
-	}
-
-	if (mp == NULL) {
-		munmap(va, sz);
-		rte_errno = rc;
-	}
-
-	free(pa);
-	return mp;
-}
-
-#else /* RTE_EXEC_ENV_LINUXAPP */
-
-
-struct rte_mempool *
-mempool_anon_create(__rte_unused const char *name,
-	__rte_unused unsigned elt_num, __rte_unused unsigned elt_size,
-	__rte_unused unsigned cache_size,
-	__rte_unused unsigned private_data_size,
-	__rte_unused rte_mempool_ctor_t *mp_init,
-	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_cb_t *obj_init,
-	__rte_unused void *obj_init_arg,
-	__rte_unused int socket_id, __rte_unused unsigned flags)
-{
-	rte_errno = ENOTSUP;
-	return NULL;
-}
-
-#endif /* RTE_EXEC_ENV_LINUXAPP */
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
deleted file mode 100644
index 7ce7297..0000000
--- a/app/test-pmd/mempool_osdep.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _MEMPOOL_OSDEP_H_
-#define _MEMPOOL_OSDEP_H_
-
-#include <rte_mempool.h>
-
-/**
- * @file
- * mempool OS specific header.
- */
-
-/*
- * Create mempool over objects from mmap(..., MAP_ANONYMOUS, ...).
- */
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags);
-
-#endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 26a174c..9d11830 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -77,7 +77,6 @@
 #endif
 
 #include "testpmd.h"
-#include "mempool_osdep.h"
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 
@@ -427,17 +426,26 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf,
 
 	/* if the former XEN allocation failed fall back to normal allocation */
 	if (rte_mp == NULL) {
-		if (mp_anon != 0)
-			rte_mp = mempool_anon_create(pool_name, nb_mbuf,
-					mb_size, (unsigned) mb_mempool_cache,
-					sizeof(struct rte_pktmbuf_pool_private),
-					rte_pktmbuf_pool_init, NULL,
-					rte_pktmbuf_init, NULL,
-					socket_id, 0);
-		else
+		if (mp_anon != 0) {
+			rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf,
+				mb_size, (unsigned) mb_mempool_cache,
+				sizeof(struct rte_pktmbuf_pool_private),
+				socket_id, 0);
+
+			if (rte_mp == NULL ||
+			    rte_mempool_populate_anon(rte_mp) <= 0) {
+				rte_mempool_free(rte_mp);
+				rte_mp = NULL;
+			} else {
+				rte_pktmbuf_pool_init(rte_mp, NULL);
+				rte_mempool_obj_iter(rte_mp,
+					rte_pktmbuf_init, NULL);
+			}
+		} else {
 			/* wrapper to rte_mempool_create() */
 			rte_mp = rte_pktmbuf_pool_create(pool_name, nb_mbuf,
 				mb_mempool_cache, 0, mbuf_seg_size, socket_id);
+		}
 	}
 
 	if (rte_mp == NULL) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 33/36] mem: avoid memzone/mempool/ring name truncation
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (31 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 32/36] test-pmd: remove specific anon mempool code Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 34/36] mempool: new flag when phys contig mem is not needed Olivier Matz
                     ` (4 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Check the return value of snprintf to ensure that the name of
the object is not truncated.

Also update the test so that it no longer triggers this error.
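
The pattern applied at each call site is, as a minimal sketch (with
"name" standing for the fixed-size destination buffer):

  ret = snprintf(name, sizeof(name), "%s", src);
  if (ret < 0 || ret >= (int)sizeof(name))
          return -ENAMETOOLONG;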

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c                    | 12 ++++++++----
 lib/librte_eal/common/eal_common_memzone.c | 10 +++++++++-
 lib/librte_mempool/rte_mempool.c           | 20 ++++++++++++++++----
 lib/librte_ring/rte_ring.c                 | 16 +++++++++++++---
 4 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 2bc3ac0..e0d5c61 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -407,21 +407,25 @@ test_mempool_same_name_twice_creation(void)
 {
 	struct rte_mempool *mp_tc;
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL == mp_tc)
+	if (NULL == mp_tc) {
+		printf("cannot create mempool\n");
 		return -1;
+	}
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL != mp_tc)
+	if (NULL != mp_tc) {
+		printf("should not be able to create mempool\n");
 		return -1;
+	}
 
 	return 0;
 }
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 711c845..774eb5d 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -126,6 +126,7 @@ static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
+	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
@@ -148,6 +149,13 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
+	if (strlen(name) >= sizeof(mz->name) - 1) {
+		RTE_LOG(DEBUG, EAL, "%s(): memzone <%s>: name too long\n",
+			__func__, name);
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	/* if alignment is not a power of two */
 	if (align && !rte_is_power_of_2(align)) {
 		RTE_LOG(ERR, EAL, "%s(): Invalid alignment: %u\n", __func__,
@@ -223,7 +231,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	struct rte_memzone *mz = get_next_free_memzone();
+	mz = get_next_free_memzone();
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4850f5d..1f998ef 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -303,11 +303,14 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
 {
-	int rg_flags = 0;
+	int rg_flags = 0, ret;
 	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_ring *r;
 
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	ret = snprintf(rg_name, sizeof(rg_name),
+		RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	if (ret < 0 || ret >= (int)sizeof(rg_name))
+		return -ENAMETOOLONG;
 
 	/* ring flags */
 	if (mp->flags & MEMPOOL_F_SP_PUT)
@@ -692,6 +695,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -745,7 +749,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	ret = snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 
 	mz = rte_memzone_reserve(mz_name, mempool_size, socket_id, mz_flags);
 	if (mz == NULL)
@@ -754,7 +762,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
-	snprintf(mp->name, sizeof(mp->name), "%s", name);
+	ret = snprintf(mp->name, sizeof(mp->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(mp->name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index d80faf3..ca0a108 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -122,6 +122,8 @@ int
 rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	unsigned flags)
 {
+	int ret;
+
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
@@ -140,7 +142,9 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
-	snprintf(r->name, sizeof(r->name), "%s", name);
+	ret = snprintf(r->name, sizeof(r->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(r->name))
+		return -ENAMETOOLONG;
 	r->flags = flags;
 	r->prod.watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
@@ -165,6 +169,7 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 	ssize_t ring_size;
 	int mz_flags = 0;
 	struct rte_ring_list* ring_list = NULL;
+	int ret;
 
 	ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
 
@@ -174,6 +179,13 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
+	ret = snprintf(mz_name, sizeof(mz_name), "%s%s",
+		RTE_RING_MZ_PREFIX, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	te = rte_zmalloc("RING_TAILQ_ENTRY", sizeof(*te), 0);
 	if (te == NULL) {
 		RTE_LOG(ERR, RING, "Cannot reserve memory for tailq\n");
@@ -181,8 +193,6 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
-	snprintf(mz_name, sizeof(mz_name), "%s%s", RTE_RING_MZ_PREFIX, name);
-
 	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
 
 	/* reserve a memory zone for this ring. If we can't get rte_config or
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 34/36] mempool: new flag when phys contig mem is not needed
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (32 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 33/36] mem: avoid memzone/mempool/ring name truncation Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 35/36] app/test: rework mempool test Olivier Matz
                     ` (3 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Add a new flag to remove the constraint of having physically contiguous
objects inside a mempool.

Set this flag on the log history mempool as a first user; it could be
used for most pools whose objects are not mbufs.
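
A minimal sketch of a pool opting out of the constraint (the arguments
other than the flag are the usual rte_mempool_create() parameters):

  mp = rte_mempool_create(name, n, elt_size, cache_size, priv_size,
          NULL, NULL, NULL, NULL,
          SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);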

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/eal_common_log.c |  2 +-
 lib/librte_mempool/rte_mempool.c       | 23 ++++++++++++++++++++---
 lib/librte_mempool/rte_mempool.h       |  5 +++++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index 1ae8de7..9122b34 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -322,7 +322,7 @@ rte_eal_common_log_init(FILE *default_log)
 				LOG_ELT_SIZE, 0, 0,
 				NULL, NULL,
 				NULL, NULL,
-				SOCKET_ID_ANY, 0);
+				SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);
 
 	if ((log_history_mp == NULL) &&
 	    ((log_history_mp = rte_mempool_lookup(LOG_HISTORY_MP_NAME)) == NULL)){
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 1f998ef..7d4cabe 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -410,7 +410,11 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 
 	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
 		off += mp->header_size;
-		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR)
+			mempool_add_elem(mp, (char *)vaddr + off,
+				RTE_BAD_PHYS_ADDR);
+		else
+			mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
 		off += mp->elt_size + mp->trailer_size;
 		i++;
 	}
@@ -439,6 +443,10 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, vaddr, RTE_BAD_PHYS_ADDR,
+			pg_num * pg_sz, free_cb, opaque);
+
 	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
 
 		/* populate with the largest group of contiguous pages */
@@ -479,6 +487,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
 		return -EINVAL;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, addr, RTE_BAD_PHYS_ADDR,
+			len, free_cb, opaque);
+
 	for (off = 0; off + pg_sz <= len &&
 		     mp->populated_size < mp->size; off += phys_len) {
 
@@ -528,6 +540,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 	size_t size, total_elt_sz, align, pg_sz, pg_shift;
+	phys_addr_t paddr;
 	unsigned mz_id, n;
 	int ret;
 
@@ -567,10 +580,14 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		/* use memzone physical address if it is valid */
+		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+			paddr = RTE_BAD_PHYS_ADDR;
+		else
+			paddr = mz->phys_addr;
+
 		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
-				mz->phys_addr, mz->len,
+				paddr, mz->len,
 				rte_mempool_memchunk_mz_free,
 				(void *)(uintptr_t)mz);
 		else
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index fe4e6fd..e6a257f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -235,6 +235,7 @@ struct rte_mempool {
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
 #define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
+#define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
 
 /**
  * @internal When debug is enabled, store some statistics.
@@ -417,6 +418,8 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
  *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
  *     when using rte_mempool_get() or rte_mempool_get_bulk() is
  *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   - MEMPOOL_F_NO_PHYS_CONTIG: If set, allocated objects won't
+ *     necessarily be contiguous in physical memory.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
  *   with rte_errno set appropriately. Possible rte_errno values include:
@@ -1222,6 +1225,8 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   A pointer (virtual address) to the element of the pool.
  * @return
  *   The physical address of the elt element.
+ *   If the mempool was created with MEMPOOL_F_NO_PHYS_CONTIG, the
+ *   returned value is RTE_BAD_PHYS_ADDR.
  */
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 35/36] app/test: rework mempool test
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (33 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 34/36] mempool: new flag when phys contig mem is not needed Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 10:19   ` [PATCH 36/36] mempool: update copyright Olivier Matz
                     ` (2 subsequent siblings)
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Rework the mempool test to better indicate where it fails and, now
that the feature is available, free the mempools once the tests are
done.
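
A minimal sketch of the error-reporting pattern the reworked test now
uses everywhere:

  #define RET_ERR() do {                                             \
          printf("test failed at %s():%d\n", __func__, __LINE__);    \
          return -1;                                                 \
  } while (0)

  if (rte_mempool_get(mp, &obj) < 0)
          RET_ERR();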

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c | 232 +++++++++++++++++++++++++++---------------------
 1 file changed, 129 insertions(+), 103 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index e0d5c61..c96ed27 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -77,13 +77,13 @@
 #define MAX_KEEP 128
 #define MEMPOOL_SIZE ((rte_lcore_count()*(MAX_KEEP+RTE_MEMPOOL_CACHE_MAX_SIZE))-1)
 
-static struct rte_mempool *mp;
-static struct rte_mempool *mp_cache, *mp_nocache;
+#define RET_ERR() do {							\
+		printf("test failed at %s():%d\n", __func__, __LINE__); \
+		return -1;						\
+	} while (0)
 
 static rte_atomic32_t synchro;
 
-
-
 /*
  * save the object number in the first 4 bytes of object data. All
  * other bytes are set to 0.
@@ -93,13 +93,14 @@ my_obj_init(struct rte_mempool *mp, __attribute__((unused)) void *arg,
 	    void *obj, unsigned i)
 {
 	uint32_t *objnum = obj;
+
 	memset(obj, 0, mp->elt_size);
 	*objnum = i;
 }
 
 /* basic tests (done on one core) */
 static int
-test_mempool_basic(void)
+test_mempool_basic(struct rte_mempool *mp)
 {
 	uint32_t *objnum;
 	void **objtable;
@@ -113,23 +114,23 @@ test_mempool_basic(void)
 
 	printf("get an object\n");
 	if (rte_mempool_get(mp, &obj) < 0)
-		return -1;
+		RET_ERR();
 	rte_mempool_dump(stdout, mp);
 
 	/* tests that improve coverage */
 	printf("get object count\n");
 	if (rte_mempool_count(mp) != MEMPOOL_SIZE - 1)
-		return -1;
+		RET_ERR();
 
 	printf("get private data\n");
 	if (rte_mempool_get_priv(mp) != (char *)mp +
 			MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
-		return -1;
+		RET_ERR();
 
 #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
 	printf("get physical address of an object\n");
 	if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
-		return -1;
+		RET_ERR();
 #endif
 
 	printf("put the object back\n");
@@ -138,10 +139,10 @@ test_mempool_basic(void)
 
 	printf("get 2 objects\n");
 	if (rte_mempool_get(mp, &obj) < 0)
-		return -1;
+		RET_ERR();
 	if (rte_mempool_get(mp, &obj2) < 0) {
 		rte_mempool_put(mp, obj);
-		return -1;
+		RET_ERR();
 	}
 	rte_mempool_dump(stdout, mp);
 
@@ -155,11 +156,10 @@ test_mempool_basic(void)
 	 * on other cores may not be empty.
 	 */
 	objtable = malloc(MEMPOOL_SIZE * sizeof(void *));
-	if (objtable == NULL) {
-		return -1;
-	}
+	if (objtable == NULL)
+		RET_ERR();
 
-	for (i=0; i<MEMPOOL_SIZE; i++) {
+	for (i = 0; i < MEMPOOL_SIZE; i++) {
 		if (rte_mempool_get(mp, &objtable[i]) < 0)
 			break;
 	}
@@ -173,11 +173,11 @@ test_mempool_basic(void)
 		obj_data = obj;
 		objnum = obj;
 		if (*objnum > MEMPOOL_SIZE) {
-			printf("bad object number\n");
+			printf("bad object number(%d)\n", *objnum);
 			ret = -1;
 			break;
 		}
-		for (j=sizeof(*objnum); j<mp->elt_size; j++) {
+		for (j = sizeof(*objnum); j < mp->elt_size; j++) {
 			if (obj_data[j] != 0)
 				ret = -1;
 		}
@@ -196,14 +196,17 @@ static int test_mempool_creation_with_exceeded_cache_size(void)
 {
 	struct rte_mempool *mp_cov;
 
-	mp_cov = rte_mempool_create("test_mempool_creation_with_exceeded_cache_size", MEMPOOL_SIZE,
-					      MEMPOOL_ELT_SIZE,
-					      RTE_MEMPOOL_CACHE_MAX_SIZE + 32, 0,
-					      NULL, NULL,
-					      my_obj_init, NULL,
-					      SOCKET_ID_ANY, 0);
-	if(NULL != mp_cov) {
-		return -1;
+	mp_cov = rte_mempool_create("test_mempool_cache_too_big",
+		MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE,
+		RTE_MEMPOOL_CACHE_MAX_SIZE + 32, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_cov != NULL) {
+		rte_mempool_free(mp_cov);
+		RET_ERR();
 	}
 
 	return 0;
@@ -241,8 +244,8 @@ static int test_mempool_single_producer(void)
 			continue;
 		}
 		if (rte_mempool_from_obj(obj) != mp_spsc) {
-			printf("test_mempool_single_producer there is an obj not owned by this mempool\n");
-			return -1;
+			printf("obj not owned by this mempool\n");
+			RET_ERR();
 		}
 		rte_mempool_sp_put(mp_spsc, obj);
 		rte_spinlock_lock(&scsp_spinlock);
@@ -288,7 +291,8 @@ static int test_mempool_single_consumer(void)
 }
 
 /*
- * test function for mempool test based on singple consumer and single producer, can run on one lcore only
+ * test function for mempool test based on single consumer and single producer,
+ * can run on one lcore only
  */
 static int test_mempool_launch_single_consumer(__attribute__((unused)) void *arg)
 {
@@ -313,33 +317,41 @@ test_mempool_sp_sc(void)
 	unsigned lcore_next;
 
 	/* create a mempool with single producer/consumer ring */
-	if (NULL == mp_spsc) {
+	if (mp_spsc == NULL) {
 		mp_spsc = rte_mempool_create("test_mempool_sp_sc", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						my_mp_init, NULL,
-						my_obj_init, NULL,
-						SOCKET_ID_ANY, MEMPOOL_F_NO_CACHE_ALIGN | MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET);
-		if (NULL == mp_spsc) {
-			return -1;
-		}
+			MEMPOOL_ELT_SIZE, 0, 0,
+			my_mp_init, NULL,
+			my_obj_init, NULL,
+			SOCKET_ID_ANY,
+			MEMPOOL_F_NO_CACHE_ALIGN | MEMPOOL_F_SP_PUT |
+			MEMPOOL_F_SC_GET);
+		if (mp_spsc == NULL)
+			RET_ERR();
 	}
 	if (rte_mempool_lookup("test_mempool_sp_sc") != mp_spsc) {
 		printf("Cannot lookup mempool from its name\n");
-		return -1;
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
 	}
 	lcore_next = rte_get_next_lcore(lcore_id, 0, 1);
-	if (RTE_MAX_LCORE <= lcore_next)
-		return -1;
-	if (rte_eal_lcore_role(lcore_next) != ROLE_RTE)
-		return -1;
+	if (RTE_MAX_LCORE <= lcore_next) {
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
+	}
+	if (rte_eal_lcore_role(lcore_next) != ROLE_RTE) {
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
+	}
 	rte_spinlock_init(&scsp_spinlock);
 	memset(scsp_obj_table, 0, sizeof(scsp_obj_table));
-	rte_eal_remote_launch(test_mempool_launch_single_consumer, NULL, lcore_next);
-	if(test_mempool_single_producer() < 0)
+	rte_eal_remote_launch(test_mempool_launch_single_consumer, NULL,
+		lcore_next);
+	if (test_mempool_single_producer() < 0)
 		ret = -1;
 
-	if(rte_eal_wait_lcore(lcore_next) < 0)
+	if (rte_eal_wait_lcore(lcore_next) < 0)
 		ret = -1;
+	rte_mempool_free(mp_spsc);
 
 	return ret;
 }
@@ -348,7 +360,7 @@ test_mempool_sp_sc(void)
  * it tests some more basic of mempool
  */
 static int
-test_mempool_basic_ex(struct rte_mempool * mp)
+test_mempool_basic_ex(struct rte_mempool *mp)
 {
 	unsigned i;
 	void **obj;
@@ -358,38 +370,41 @@ test_mempool_basic_ex(struct rte_mempool * mp)
 	if (mp == NULL)
 		return ret;
 
-	obj = rte_calloc("test_mempool_basic_ex", MEMPOOL_SIZE , sizeof(void *), 0);
+	obj = rte_calloc("test_mempool_basic_ex", MEMPOOL_SIZE,
+		sizeof(void *), 0);
 	if (obj == NULL) {
 		printf("test_mempool_basic_ex fail to rte_malloc\n");
 		return ret;
 	}
-	printf("test_mempool_basic_ex now mempool (%s) has %u free entries\n", mp->name, rte_mempool_free_count(mp));
+	printf("test_mempool_basic_ex now mempool (%s) has %u free entries\n",
+		mp->name, rte_mempool_free_count(mp));
 	if (rte_mempool_full(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not full but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be full\n");
 		goto fail_mp_basic_ex;
 	}
 
 	for (i = 0; i < MEMPOOL_SIZE; i ++) {
 		if (rte_mempool_mc_get(mp, &obj[i]) < 0) {
-			printf("fail_mp_basic_ex fail to get mempool object for [%u]\n", i);
+			printf("test_mp_basic_ex fail to get object for [%u]\n",
+				i);
 			goto fail_mp_basic_ex;
 		}
 	}
 	if (rte_mempool_mc_get(mp, &err_obj) == 0) {
-		printf("test_mempool_basic_ex get an impossible obj from mempool\n");
+		printf("test_mempool_basic_ex get an impossible obj\n");
 		goto fail_mp_basic_ex;
 	}
 	printf("number: %u\n", i);
 	if (rte_mempool_empty(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not empty but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be empty\n");
 		goto fail_mp_basic_ex;
 	}
 
-	for (i = 0; i < MEMPOOL_SIZE; i ++) {
+	for (i = 0; i < MEMPOOL_SIZE; i++)
 		rte_mempool_mp_put(mp, obj[i]);
-	}
+
 	if (rte_mempool_full(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not full but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be full\n");
 		goto fail_mp_basic_ex;
 	}
 
@@ -405,28 +420,30 @@ fail_mp_basic_ex:
 static int
 test_mempool_same_name_twice_creation(void)
 {
-	struct rte_mempool *mp_tc;
+	struct rte_mempool *mp_tc, *mp_tc2;
 
 	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						NULL, NULL,
-						SOCKET_ID_ANY, 0);
-	if (NULL == mp_tc) {
-		printf("cannot create mempool\n");
-		return -1;
-	}
-
-	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						NULL, NULL,
-						SOCKET_ID_ANY, 0);
-	if (NULL != mp_tc) {
-		printf("should not be able to create mempool\n");
-		return -1;
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		NULL, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_tc == NULL)
+		RET_ERR();
+
+	mp_tc2 = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		NULL, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_tc2 != NULL) {
+		rte_mempool_free(mp_tc);
+		rte_mempool_free(mp_tc2);
+		RET_ERR();
 	}
 
+	rte_mempool_free(mp_tc);
 	return 0;
 }
 
@@ -447,7 +464,7 @@ test_mempool_xmem_misc(void)
 	usz = rte_mempool_xmem_usage(NULL, elt_num, total_size, 0, 1,
 		MEMPOOL_PG_SHIFT_MAX);
 
-	if(sz != (size_t)usz)  {
+	if (sz != (size_t)usz) {
 		printf("failure @ %s: rte_mempool_xmem_usage(%u, %u) "
 			"returns: %#zx, while expected: %#zx;\n",
 			__func__, elt_num, total_size, sz, (size_t)usz);
@@ -460,68 +477,77 @@ test_mempool_xmem_misc(void)
 static int
 test_mempool(void)
 {
+	struct rte_mempool *mp_cache = NULL;
+	struct rte_mempool *mp_nocache = NULL;
+
 	rte_atomic32_init(&synchro);
 
 	/* create a mempool (without cache) */
-	if (mp_nocache == NULL)
-		mp_nocache = rte_mempool_create("test_nocache", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						my_obj_init, NULL,
-						SOCKET_ID_ANY, 0);
-	if (mp_nocache == NULL)
-		return -1;
+	mp_nocache = rte_mempool_create("test_nocache", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_nocache == NULL) {
+		printf("cannot allocate mp_nocache mempool\n");
+		goto err;
+	}
 
 	/* create a mempool (with cache) */
-	if (mp_cache == NULL)
-		mp_cache = rte_mempool_create("test_cache", MEMPOOL_SIZE,
-					      MEMPOOL_ELT_SIZE,
-					      RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
-					      NULL, NULL,
-					      my_obj_init, NULL,
-					      SOCKET_ID_ANY, 0);
-	if (mp_cache == NULL)
-		return -1;
-
+	mp_cache = rte_mempool_create("test_cache", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE,
+		RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_cache == NULL) {
+		printf("cannot allocate mp_cache mempool\n");
+		goto err;
+	}
 
 	/* retrieve the mempool from its name */
 	if (rte_mempool_lookup("test_nocache") != mp_nocache) {
 		printf("Cannot lookup mempool from its name\n");
-		return -1;
+		goto err;
 	}
 
 	rte_mempool_list_dump(stdout);
 
 	/* basic tests without cache */
-	mp = mp_nocache;
-	if (test_mempool_basic() < 0)
-		return -1;
+	if (test_mempool_basic(mp_nocache) < 0)
+		goto err;
 
 	/* basic tests with cache */
-	mp = mp_cache;
-	if (test_mempool_basic() < 0)
-		return -1;
+	if (test_mempool_basic(mp_cache) < 0)
+		goto err;
 
 	/* more basic tests without cache */
 	if (test_mempool_basic_ex(mp_nocache) < 0)
-		return -1;
+		goto err;
 
 	/* mempool operation test based on single producer and single comsumer */
 	if (test_mempool_sp_sc() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_creation_with_exceeded_cache_size() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_same_name_twice_creation() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_xmem_misc() < 0)
-		return -1;
+		goto err;
 
 	rte_mempool_list_dump(stdout);
 
 	return 0;
+
+err:
+	rte_mempool_free(mp_nocache);
+	rte_mempool_free(mp_cache);
+	return -1;
 }
 
 static struct test_command mempool_cmd = {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH 36/36] mempool: update copyright
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (34 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 35/36] app/test: rework mempool test Olivier Matz
@ 2016-04-14 10:19   ` Olivier Matz
  2016-04-14 13:50   ` [PATCH 00/36] mempool: rework memory allocation Wiles, Keith
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
  37 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-14 10:19 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen

Update the copyright of files touched by this patch series.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 1 +
 lib/librte_mempool/rte_mempool.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 7d4cabe..7104a41 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index e6a257f..96bd047 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [PATCH 00/36] mempool: rework memory allocation
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (35 preceding siblings ...)
  2016-04-14 10:19   ` [PATCH 36/36] mempool: update copyright Olivier Matz
@ 2016-04-14 13:50   ` Wiles, Keith
  2016-04-14 14:01     ` Olivier MATZ
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
  37 siblings, 1 reply; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 13:50 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>This series is a rework of mempool. For those who don't want to read
>all the cover letter, here is a sumary:
>
>- it is not possible to allocate large mempools if there is not enough
>  contiguous memory, this series solves this issue
>- introduce new APIs with less arguments: "create, populate, obj_init"
>- allow to free a mempool
>- split code in smaller functions, will ease the introduction of ext_handler
>- remove test-pmd anonymous mempool creation
>- remove most of dom0-specific mempool code
>- opens the door for a eal_memory rework: we probably don't need large
>  contiguous memory area anymore, working with pages would work.
>
>This breaks the ABI as it was indicated in the deprecation for 16.04.
>The API stays almost the same, no modification is needed in examples app
>or in test-pmd. Only kni and mellanox drivers are slightly modified.
>
>This patch applies on top of 16.04 + v5 of Keith's patch:
>"mempool: reduce rte_mempool structure size"

I have not digested this complete patch yet, but this one popped out at me, as External Memory Manager support is sitting in the wings for the 16.07 release. If this causes the EMM patch to be rewritten or updated, that seems like a problem to me. Does this patch add the External Memory Manager support?
http://thread.gmane.org/gmane.comp.networking.dpdk.devel/32015/focus=35107


>
>Changes RFC -> v1:
>
>- remove the rte_deconst macro, and remove some const qualifier in
>  dump/audit functions
>- rework modifications in mellanox drivers to ensure the mempool is
>  virtually contiguous
>- fix mempool memory chunk iteration (bad pointer was used)
>- fix compilation on freebsd: replace MAP_LOCKED flag by mlock()
>- fix compilation on tilera (pointer arithmetics)
>- slightly rework and clean the mempool autotest
>- fix mempool autotest on bsd
>- more validation (especially mellanox drivers and kni that were not
>  tested in RFC)
>- passed autotests (x86_64-native-linuxapp-gcc and x86_64-native-bsdapp-gcc)
>- rebase on head, reorder the patches a bit and fix minor split issues
>
>
>Description of the initial issue
>--------------------------------
>
>The allocation of mbuf pool can fail even if there is enough memory.
>The problem is related to the way the memory is allocated and used in
>dpdk. It is particularly annoying with mbuf pools, but it can also fail
>in other use cases allocating a large amount of memory.
>
>- rte_malloc() allocates physically contiguous memory, which is needed
>  for mempools, but useless most of the time.
>
>  Allocating a large physically contiguous zone is often impossible
>  because the system provide hugepages which may not be contiguous.
>
>- rte_mempool_create() (and therefore rte_pktmbuf_pool_create())
>  requires a physically contiguous zone.
>
>- rte_mempool_xmem_create() does not solve the issue as it still
>  needs the memory to be virtually contiguous, and there is no
>  way in dpdk to allocate a virtually contiguous memory that is
>  not also physically contiguous.
>
>How to reproduce the issue
>--------------------------
>
>- start the dpdk with some 2MB hugepages (it can also occur with 1GB)
>- allocate a large mempool
>- even if there is enough memory, the allocation can fail
>
>Example:
>
>  git clone http://dpdk.org/git/dpdk
>  cd dpdk
>  make config T=x86_64-native-linuxapp-gcc
>  make -j32
>  mkdir -p /mnt/huge
>  mount -t hugetlbfs nodev /mnt/huge
>  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>
>  # we try to allocate a mempool whose size is ~450MB, it fails
>  ./build/app/testpmd -l 2,4 -- --total-num-mbufs=200000 -i
>
>The EAL logs "EAL: Virtual area found at..." shows that there are
>several zones, but all smaller than 450MB.
>
>Workarounds:
>
>- Use 1GB hugepages: it sometimes work, but for very large
>  pools (millions of mbufs) there is the same issue. Moreover,
>  it would consume 1GB memory at least which can be a lot
>  in some cases.
>
>- Reboot the machine or allocate hugepages at boot time: this increases
>  the chances to have more contiguous memory, but does not completely
>  solve the issue
>
>Solutions
>---------
>
>Below is a list of proposed solutions. I implemented a quick and dirty
>PoC of solution 1, but it's not working in all conditions and it's
>really an ugly hack.  This series implement the solution 4 which looks
>the best to me, knowing it does not prevent to do more enhancements
>in dpdk memory in the future (solution 3 for instance).
>
>Solution 1: in application
>--------------------------
>
>- allocate several hugepages using rte_malloc() or rte_memzone_reserve()
>  (only keeping complete hugepages)
>- parse memsegs and /proc/maps to check which files mmaps these pages
>- mmap the files in a contiguous virtual area
>- use rte_mempool_xmem_create()
>
>Cons:
>
>- 1a. parsing the memsegs of rte config in the application does not
>  use a public API, and can be broken if internal dpdk code changes
>- 1b. some memory is lost due to malloc headers. Also, if the memory is
>  very fragmented (ex: all 2MB pages are physically separated), it does
>  not work at all because we cannot get any complete page. It is not
>  possible to use a lower level allocator since commit fafcc11985a.
>- 1c. we cannot use rte_pktmbuf_pool_create(), so we need to use mempool
>  api and do a part of the job manually
>- 1d. it breaks secondary processes as the virtual addresses won't be
>  mmap'd at the same place in secondary process
>- 1e. it only fixes the issue for the mbuf pool of the application,
>  internal pools in dpdk libraries are not modified
>- 1f. this is a pure linux solution (rte_map files)
>- 1g. The application has to be aware of RTE_EAL_SINGLE_SEGMENTS option
>  that changes the way hugepages are mapped. By the way, it's strange
>  to have such a compile-time option, we should probably have only
>  one behavior that works all the time.
>
>Solution 2: in dpdk memory allocator
>------------------------------------
>
>- do the same than solution 1 in a new function rte_malloc_non_contig():
>  allocate several chunks and mmap them in a contiguous virtual memory
>- a flag has to be added in malloc header to do the proper cleanup in
>  rte_free() (free all the chunks, munmap the memory)
>- introduce a new rte_mem_get_physmap(*physmap,addr, len) that returns
>  the virt2phys mapping of a virtual area in dpdk
>- add a mempool flag MEMPOOL_F_NON_PHYS_CONTIG to use
>  rte_malloc_non_contig() to allocate the area storing the objects
>
>Cons:
>
>- 2a. same than 1b: it breaks secondary processes if the mempool flag is
>  used.
>- 2b. same as 1d: some memory is lost due to malloc headers, and it
>  cannot work if memory is too fragmented.
>- 2c. rte_malloc_virt2phy() cannot be used on these zones. It would
>  return the physical address of the first page. It would be better to
>  return an error in this case.
>- 2d. need to check how to implement this on bsd (TBD)
>
>Solution 3: in dpdk eal memory
>------------------------------
>
>- Rework the way hugepages are mmap'd in dpdk: instead of having several
>  rte_map* files, just mmap one file per node. It may drastically
>  simplify EAL memory management in dpdk.
>- An API should be added to retrieve the physical mapping of a virtual
>  area (ex: rte_mem_get_physmap(*physmap, addr, len))
>- rte_malloc() and rte_memzone_reserve() won't allocate physically
>  contiguous memory anymore (TBD)
>- Update mempool to always use the rte_mempool_xmem_create() version
>
>Cons:
>
>- 3a. lot of rework in eal memory, it will induce some behavior changes
>  and maybe api changes
>- 3b. possible conflicts with xen_dom0 mempool
>
>Solution 4: in mempool
>----------------------
>
>- Introduce a new API to fill a mempool with zones that are not
>  virtually contiguous. It requires to add new functions to create and
>  populate a mempool. Example (TBD):
>
>  - rte_mempool_create_empty(name, n, elt_size, cache_size, priv_size)
>  - rte_mempool_populate(mp, addr, len): add virtual memory for objects
>  - rte_mempool_mempool_obj_iter(mp, obj_cb, arg): call a cb for each object
>
>- update rte_mempool_create() to allocate objects in several memory
>  chunks by default if there is no large enough physically contiguous
>  memory.
>
>Tests done
>----------
>
>Compilation
>~~~~~~~~~~~
>
>The following targets:
>
> x86_64-native-linuxapp-gcc
> i686-native-linuxapp-gcc
> x86_x32-native-linuxapp-gcc
> x86_64-native-linuxapp-clang
> x86_64-native-bsdapp-gcc
> ppc_64-power8-linuxapp-gcc
> tile-tilegx-linuxapp-gcc (only the mempool files, the target does not compile)
>
>Libraries with and without debug, in static and shared mode + examples.
>
>autotests
>~~~~~~~~~
>
>Passed all autotests on x86_64-native-linuxapp-gcc (including kni) and
>mempool-related autotests on x86_64-native-bsdapp-gcc.
>
>test-pmd
>~~~~~~~~
>
># now starts fine, was failing before if mempool was too fragmented
>./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -- -i --port-topology=chained
>
># still ok
>./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 -- -i --port-topology=chained --mp-anon
>set fwd txonly
>start
>stop
>
># fail, but was failing before too. The problem is because the physical
># addresses are not properly set when using --no-huge. The mempool phys addr
># are now correct, but the zones allocated through memzone_reserve() are
># still wrong. This could be fixed in a future series.
>./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 --no-huge -- -i ---port-topology=chained
>set fwd txonly
>start
>stop
>
>
>Olivier Matz (36):
>  mempool: fix comments and style
>  mempool: replace elt_size by total_elt_size
>  mempool: uninline function to check cookies
>  mempool: use sizeof to get the size of header and trailer
>  mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t
>  mempool: update library version
>  mempool: list objects when added in the mempool
>  mempool: remove const attribute in mempool_walk
>  mempool: remove const qualifier in dump and audit
>  mempool: use the list to iterate the mempool elements
>  mempool: use the list to audit all elements
>  mempool: use the list to initialize mempool objects
>  mempool: create the internal ring in a specific function
>  mempool: store physaddr in mempool objects
>  mempool: remove MEMPOOL_IS_CONTIG()
>  mempool: store memory chunks in a list
>  mempool: new function to iterate the memory chunks
>  mempool: simplify xmem_usage
>  mempool: introduce a free callback for memory chunks
>  mempool: make page size optional when getting xmem size
>  mempool: default allocation in several memory chunks
>  eal: lock memory when using no-huge
>  mempool: support no-hugepage mode
>  mempool: replace mempool physaddr by a memzone pointer
>  mempool: introduce a function to free a mempool
>  mempool: introduce a function to create an empty mempool
>  eal/xen: return machine address without knowing memseg id
>  mempool: rework support of xen dom0
>  mempool: create the internal ring when populating
>  mempool: populate a mempool with anonymous memory
>  mempool: make mempool populate and free api public
>  test-pmd: remove specific anon mempool code
>  mem: avoid memzone/mempool/ring name truncation
>  mempool: new flag when phys contig mem is not needed
>  app/test: rework mempool test
>  mempool: update copyright
>
> app/test-pmd/Makefile                        |    4 -
> app/test-pmd/mempool_anon.c                  |  201 -----
> app/test-pmd/mempool_osdep.h                 |   54 --
> app/test-pmd/testpmd.c                       |   23 +-
> app/test/test_mempool.c                      |  243 +++---
> doc/guides/rel_notes/release_16_04.rst       |    2 +-
> drivers/net/mlx4/mlx4.c                      |  140 ++--
> drivers/net/mlx5/mlx5_rxtx.c                 |  140 ++--
> drivers/net/mlx5/mlx5_rxtx.h                 |    4 +-
> drivers/net/xenvirt/rte_eth_xenvirt.h        |    2 +-
> drivers/net/xenvirt/rte_mempool_gntalloc.c   |    4 +-
> lib/librte_eal/common/eal_common_log.c       |    2 +-
> lib/librte_eal/common/eal_common_memzone.c   |   10 +-
> lib/librte_eal/common/include/rte_memory.h   |   11 +-
> lib/librte_eal/linuxapp/eal/eal_memory.c     |    2 +-
> lib/librte_eal/linuxapp/eal/eal_xen_memory.c |   17 +-
> lib/librte_kni/rte_kni.c                     |   12 +-
> lib/librte_mempool/Makefile                  |    5 +-
> lib/librte_mempool/rte_dom0_mempool.c        |  133 ----
> lib/librte_mempool/rte_mempool.c             | 1042 +++++++++++++++++---------
> lib/librte_mempool/rte_mempool.h             |  594 +++++++--------
> lib/librte_mempool/rte_mempool_version.map   |   18 +-
> lib/librte_ring/rte_ring.c                   |   16 +-
> 23 files changed, 1377 insertions(+), 1302 deletions(-)
> delete mode 100644 app/test-pmd/mempool_anon.c
> delete mode 100644 app/test-pmd/mempool_osdep.h
> delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c
>
>-- 
>2.1.4
>
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 00/36] mempool: rework memory allocation
  2016-04-14 13:50   ` [PATCH 00/36] mempool: rework memory allocation Wiles, Keith
@ 2016-04-14 14:01     ` Olivier MATZ
  2016-04-14 14:03       ` Wiles, Keith
  2016-04-14 14:20       ` Hunt, David
  0 siblings, 2 replies; 150+ messages in thread
From: Olivier MATZ @ 2016-04-14 14:01 UTC (permalink / raw)
  To: Wiles, Keith, dev; +Cc: Richardson, Bruce, stephen



On 04/14/2016 03:50 PM, Wiles, Keith wrote:
>> This series is a rework of mempool. For those who don't want to read
>> all the cover letter, here is a sumary:
>>
>> - it is not possible to allocate large mempools if there is not enough
>>   contiguous memory, this series solves this issue
>> - introduce new APIs with less arguments: "create, populate, obj_init"
>> - allow to free a mempool
>> - split code in smaller functions, will ease the introduction of ext_handler
>> - remove test-pmd anonymous mempool creation
>> - remove most of dom0-specific mempool code
>> - opens the door for a eal_memory rework: we probably don't need large
>>   contiguous memory area anymore, working with pages would work.
>>
>> This breaks the ABI as it was indicated in the deprecation for 16.04.
>> The API stays almost the same, no modification is needed in examples app
>> or in test-pmd. Only kni and mellanox drivers are slightly modified.
>>
>> This patch applies on top of 16.04 + v5 of Keith's patch:
>> "mempool: reduce rte_mempool structure size"
>
> I have not digested this complete patch yet, but this one popped out at me as the External Memory Manager support is setting in the wings for 16.07 release. If this causes the EMM patch to be rewritten or updated that seems like a problem to me. Does this patch add the External Memory Manager support?
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/32015/focus=35107

I've reworked the series you are referring to, and rebased it on top
of this series. Please see:
http://dpdk.org/ml/archives/dev/2016-April/037509.html

Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 00/36] mempool: rework memory allocation
  2016-04-14 14:01     ` Olivier MATZ
@ 2016-04-14 14:03       ` Wiles, Keith
  2016-04-14 14:20       ` Hunt, David
  1 sibling, 0 replies; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 14:03 UTC (permalink / raw)
  To: Olivier MATZ, dev; +Cc: Richardson, Bruce, stephen

>
>
>On 04/14/2016 03:50 PM, Wiles, Keith wrote:
>>> This series is a rework of mempool. For those who don't want to read
>>> all the cover letter, here is a sumary:
>>>
>>> - it is not possible to allocate large mempools if there is not enough
>>>   contiguous memory, this series solves this issue
>>> - introduce new APIs with less arguments: "create, populate, obj_init"
>>> - allow to free a mempool
>>> - split code in smaller functions, will ease the introduction of ext_handler
>>> - remove test-pmd anonymous mempool creation
>>> - remove most of dom0-specific mempool code
>>> - opens the door for a eal_memory rework: we probably don't need large
>>>   contiguous memory area anymore, working with pages would work.
>>>
>>> This breaks the ABI as it was indicated in the deprecation for 16.04.
>>> The API stays almost the same, no modification is needed in examples app
>>> or in test-pmd. Only kni and mellanox drivers are slightly modified.
>>>
>>> This patch applies on top of 16.04 + v5 of Keith's patch:
>>> "mempool: reduce rte_mempool structure size"
>>
>> I have not digested this complete patch yet, but this one popped out at me as the External Memory Manager support is setting in the wings for 16.07 release. If this causes the EMM patch to be rewritten or updated that seems like a problem to me. Does this patch add the External Memory Manager support?
>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/32015/focus=35107
>
>I've reworked the series you are referring to, and rebased it on top
>of this series. Please see:
>http://dpdk.org/ml/archives/dev/2016-April/037509.html

Thanks, I just saw that update :-)

>
>Regards,
>Olivier
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 01/36] mempool: fix comments and style
  2016-04-14 10:19   ` [PATCH 01/36] mempool: fix comments and style Olivier Matz
@ 2016-04-14 14:15     ` Wiles, Keith
  0 siblings, 0 replies; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 14:15 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>No functional change, just fix some comments and styling issues.
>Also avoid duplicating comments between rte_mempool_create()
>and rte_mempool_xmem_create().
>
>Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Keith Wiles <keith.wiles@intel.com>
>---
> lib/librte_mempool/rte_mempool.c | 17 +++++++++---
> lib/librte_mempool/rte_mempool.h | 59 +++++++++-------------------------------
> 2 files changed, 26 insertions(+), 50 deletions(-)
>
>diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>index 7a0e07e..ce78476 100644
>--- a/lib/librte_mempool/rte_mempool.c
>+++ b/lib/librte_mempool/rte_mempool.c
>@@ -152,6 +152,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
> 	rte_ring_sp_enqueue(mp->ring, obj);
> }
> 
>+/* Iterate through objects at the given address
>+ *
>+ * Given the pointer to the memory, and its topology in physical memory
>+ * (the physical addresses table), iterate through the "elt_num" objects
>+ * of size "total_elt_sz" aligned at "align". For each object in this memory
>+ * chunk, invoke a callback. It returns the effective number of objects
>+ * in this memory. */
> uint32_t
> rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
> 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
>@@ -341,10 +348,8 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
> 	return sz;
> }
> 
>-/*
>- * Calculate how much memory would be actually required with the
>- * given memory footprint to store required number of elements.
>- */
>+/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
>+ * argument to the end of the object. */
> static void
> mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
> 	__rte_unused uint32_t idx)
>@@ -352,6 +357,10 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
> 	*(uintptr_t *)arg = (uintptr_t)end;
> }
> 
>+/*
>+ * Calculate how much memory would be actually required with the
>+ * given memory footprint to store required number of elements.
>+ */
> ssize_t
> rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
> 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
>diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
>index 8595e77..bd78df5 100644
>--- a/lib/librte_mempool/rte_mempool.h
>+++ b/lib/librte_mempool/rte_mempool.h
>@@ -214,7 +214,7 @@ struct rte_mempool {
> 
> }  __rte_cache_aligned;
> 
>-#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread in memory. */
>+#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
> #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
> #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
> #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
>@@ -270,7 +270,8 @@ struct rte_mempool {
> /* return the header of a mempool object (internal) */
> static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
> {
>-	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objhdr));
>+	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj,
>+		sizeof(struct rte_mempool_objhdr));
> }
> 
> /**
>@@ -544,8 +545,9 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
> /**
>  * Create a new mempool named *name* in memory.
>  *
>- * This function uses ``memzone_reserve()`` to allocate memory. The
>- * pool contains n elements of elt_size. Its size is set to n.
>+ * The pool contains n elements of elt_size. Its size is set to n.
>+ * This function uses ``memzone_reserve()`` to allocate the mempool header
>+ * (and the objects if vaddr is NULL).
>  * Depending on the input parameters, mempool elements can be either allocated
>  * together with the mempool header, or an externally provided memory buffer
>  * could be used to store mempool objects. In later case, that external
>@@ -560,18 +562,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
>  * @param elt_size
>  *   The size of each element.
>  * @param cache_size
>- *   If cache_size is non-zero, the rte_mempool library will try to
>- *   limit the accesses to the common lockless pool, by maintaining a
>- *   per-lcore object cache. This argument must be lower or equal to
>- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
>- *   cache_size to have "n modulo cache_size == 0": if this is
>- *   not the case, some elements will always stay in the pool and will
>- *   never be used. The access to the per-lcore table is of course
>- *   faster than the multi-producer/consumer pool. The cache can be
>- *   disabled if the cache_size argument is set to 0; it can be useful to
>- *   avoid losing objects in cache. Note that even if not used, the
>- *   memory space for cache is always reserved in a mempool structure,
>- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
>+ *   Size of the cache. See rte_mempool_create() for details.
>  * @param private_data_size
>  *   The size of the private data appended after the mempool
>  *   structure. This is useful for storing some private data after the
>@@ -585,35 +576,17 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
>  *   An opaque pointer to data that can be used in the mempool
>  *   constructor function.
>  * @param obj_init
>- *   A function pointer that is called for each object at
>- *   initialization of the pool. The user can set some meta data in
>- *   objects if needed. This parameter can be NULL if not needed.
>- *   The obj_init() function takes the mempool pointer, the init_arg,
>- *   the object pointer and the object number as parameters.
>+ *   A function called for each object at initialization of the pool.
>+ *   See rte_mempool_create() for details.
>  * @param obj_init_arg
>- *   An opaque pointer to data that can be used as an argument for
>- *   each call to the object constructor function.
>+ *   An opaque pointer passed to the object constructor function.
>  * @param socket_id
>  *   The *socket_id* argument is the socket identifier in the case of
>  *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
>  *   constraint for the reserved zone.
>  * @param flags
>- *   The *flags* arguments is an OR of following flags:
>- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
>- *     between channels in RAM: the pool allocator will add padding
>- *     between objects depending on the hardware configuration. See
>- *     Memory alignment constraints for details. If this flag is set,
>- *     the allocator will just align them to a cache line.
>- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
>- *     cache-aligned. This flag removes this constraint, and no
>- *     padding will be present between objects. This flag implies
>- *     MEMPOOL_F_NO_SPREAD.
>- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
>- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
>- *     "single-producer". Otherwise, it is "multi-producers".
>- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
>- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
>- *     "single-consumer". Otherwise, it is "multi-consumers".
>+ *   Flags controlling the behavior of the mempool. See
>+ *   rte_mempool_create() for details.
>  * @param vaddr
>  *   Virtual address of the externally allocated memory buffer.
>  *   Will be used to store mempool objects.
>@@ -626,13 +599,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
>  *   LOG2 of the physical pages size.
>  * @return
>  *   The pointer to the new allocated mempool, on success. NULL on error
>- *   with rte_errno set appropriately. Possible rte_errno values include:
>- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
>- *    - E_RTE_SECONDARY - function was called from a secondary process instance
>- *    - EINVAL - cache size provided is too large
>- *    - ENOSPC - the maximum number of memzones has already been allocated
>- *    - EEXIST - a memzone with the same name already exists
>- *    - ENOMEM - no appropriate memory area found in which to create memzone
>+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
>  */
> struct rte_mempool *
> rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
>-- 
>2.1.4
>
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 02/36] mempool: replace elt_size by total_elt_size
  2016-04-14 10:19   ` [PATCH 02/36] mempool: replace elt_size by total_elt_size Olivier Matz
@ 2016-04-14 14:18     ` Wiles, Keith
  0 siblings, 0 replies; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 14:18 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>In some mempool functions, we use the size of the elements as arguments
>or in variables. There is confusion about whether this size includes the
>header and trailer or not.
>
>To avoid this confusion:
>- update the API documentation
>- rename the variables and arguments to "elt_size" when the size does not
>  include the header and trailer, or else to "total_elt_size".
>
>Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

Acked-by: Keith Wiles <keith.wiles@intel.com>
>---
> lib/librte_mempool/rte_mempool.c | 21 +++++++++++----------
> lib/librte_mempool/rte_mempool.h | 19 +++++++++++--------
> 2 files changed, 22 insertions(+), 18 deletions(-)
>
>diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>index ce78476..90b5b1b 100644
>--- a/lib/librte_mempool/rte_mempool.c
>+++ b/lib/librte_mempool/rte_mempool.c
>@@ -156,13 +156,13 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
>  *
>  * Given the pointer to the memory, and its topology in physical memory
>  * (the physical addresses table), iterate through the "elt_num" objects
>- * of size "total_elt_sz" aligned at "align". For each object in this memory
>+ * of size "elt_sz" aligned at "align". For each object in this memory
>  * chunk, invoke a callback. It returns the effective number of objects
>  * in this memory. */
> uint32_t
>-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
>-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
>-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
>+rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
>+	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
>+	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
> {
> 	uint32_t i, j, k;
> 	uint32_t pgn, pgf;
>@@ -178,7 +178,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
> 	while (i != elt_num && j != pg_num) {
> 
> 		start = RTE_ALIGN_CEIL(va, align);
>-		end = start + elt_sz;
>+		end = start + total_elt_sz;
> 
> 		/* index of the first page for the next element. */
> 		pgf = (end >> pg_shift) - (start >> pg_shift);
>@@ -255,6 +255,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
> 		mempool_obj_populate, &arg);
> }
> 
>+/* get the header, trailer and total size of a mempool element. */
> uint32_t
> rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
> 	struct rte_mempool_objsz *sz)
>@@ -332,17 +333,17 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
>  * Calculate maximum amount of memory required to store given number of objects.
>  */
> size_t
>-rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
>+rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
> {
> 	size_t n, pg_num, pg_sz, sz;
> 
> 	pg_sz = (size_t)1 << pg_shift;
> 
>-	if ((n = pg_sz / elt_sz) > 0) {
>+	if ((n = pg_sz / total_elt_sz) > 0) {
> 		pg_num = (elt_num + n - 1) / n;
> 		sz = pg_num << pg_shift;
> 	} else {
>-		sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
>+		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
> 	}
> 
> 	return sz;
>@@ -362,7 +363,7 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
>  * given memory footprint to store required number of elements.
>  */
> ssize_t
>-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
>+rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
> 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
> {
> 	uint32_t n;
>@@ -373,7 +374,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
> 	va = (uintptr_t)vaddr;
> 	uv = va;
> 
>-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, elt_sz, 1,
>+	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
> 			paddr, pg_num, pg_shift, mempool_lelem_iter,
> 			&uv)) != elt_num) {
> 		return -(ssize_t)n;
>diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
>index bd78df5..ca4657f 100644
>--- a/lib/librte_mempool/rte_mempool.h
>+++ b/lib/librte_mempool/rte_mempool.h
>@@ -1289,7 +1289,7 @@ struct rte_mempool *rte_mempool_lookup(const char *name);
>  * calculates header, trailer, body and total sizes of the mempool object.
>  *
>  * @param elt_size
>- *   The size of each element.
>+ *   The size of each element, without header and trailer.
>  * @param flags
>  *   The flags used for the mempool creation.
>  *   Consult rte_mempool_create() for more information about possible values.
>@@ -1315,14 +1315,15 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
>  *
>  * @param elt_num
>  *   Number of elements.
>- * @param elt_sz
>- *   The size of each element.
>+ * @param total_elt_sz
>+ *   The size of each element, including header and trailer, as returned
>+ *   by rte_mempool_calc_obj_size().
>  * @param pg_shift
>  *   LOG2 of the physical pages size.
>  * @return
>  *   Required memory size aligned at page boundary.
>  */
>-size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
>+size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
> 	uint32_t pg_shift);
> 
> /**
>@@ -1336,8 +1337,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
>  *   Will be used to store mempool objects.
>  * @param elt_num
>  *   Number of elements.
>- * @param elt_sz
>- *   The size of each element.
>+ * @param total_elt_sz
>+ *   The size of each element, including header and trailer, as returned
>+ *   by rte_mempool_calc_obj_size().
>  * @param paddr
>  *   Array of physical addresses of the pages that comprises given memory
>  *   buffer.
>@@ -1351,8 +1353,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
>  *   buffer is too small, return a negative value whose absolute value
>  *   is the actual number of elements that can be stored in that buffer.
>  */
>-ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
>-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
>+ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
>+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
>+	uint32_t pg_shift);
> 
> /**
>  * Walk list of all memory pools
>-- 
>2.1.4
>
>
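
As a worked example of the rte_mempool_xmem_size() calculation quoted above (numbers chosen purely for illustration): with 2MB pages, pg_shift = 21, so pg_sz = 2097152 bytes. If total_elt_sz = 2304 bytes, then n = 2097152 / 2304 = 910 objects fit per page. For elt_num = 200000 objects, pg_num = (200000 + 910 - 1) / 910 = 220 pages, and the returned size is 220 << 21, i.e. 440MB of page-aligned memory.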


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 00/36] mempool: rework memory allocation
  2016-04-14 14:01     ` Olivier MATZ
  2016-04-14 14:03       ` Wiles, Keith
@ 2016-04-14 14:20       ` Hunt, David
  1 sibling, 0 replies; 150+ messages in thread
From: Hunt, David @ 2016-04-14 14:20 UTC (permalink / raw)
  To: Olivier MATZ, Wiles, Keith, dev; +Cc: Richardson, Bruce, stephen



On 4/14/2016 3:01 PM, Olivier MATZ wrote:
>
> On 04/14/2016 03:50 PM, Wiles, Keith wrote:
--snip--
>> I have not digested this complete patch set yet, but this one popped
>> out at me, as the External Memory Manager support is sitting in the
>> wings for the 16.07 release. If this causes the EMM patch to be
>> rewritten or updated, that seems like a problem to me. Does this
>> patch add the External Memory Manager support?
>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/32015/focus=35107 
>>
>
> I've reworked the series you are referring to, and rebased it on top
> of this series. Please see:
> http://dpdk.org/ml/archives/dev/2016-April/037509.html
>

Thanks for your help on this, Olivier. Much appreciated.

Regards,
David.

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 10/36] mempool: use the list to iterate the mempool elements
  2016-04-14 10:19   ` [PATCH 10/36] mempool: use the list to iterate the mempool elements Olivier Matz
@ 2016-04-14 15:33     ` Wiles, Keith
  2016-04-15  7:31       ` Olivier Matz
  2016-05-11 10:02     ` [PATCH v2 " Olivier Matz
  1 sibling, 1 reply; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 15:33 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>
> static void
>-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
>-		     uint32_t index __rte_unused)
>+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
>+	__rte_unused uint32_t index)

I have seen this use of __rte_unused, and of attributes attached to variables and structures, in a couple of different ways.

I have seen the attribute placed both after and before the variable. I prefer the attribute to be after, but I could adapt, I hope.
Do we have a rule about where the attribute is put in this case and others? I have seen that attributes for structures are always at the end of the structure, which in some cases may not compile in other places.

I would like to suggest we place the attributes at the end of the structure, e.g. __rte_cache_aligned, and I would like to see __rte_unused after the variable as a style in the code.
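
For illustration, the two parameter placements under discussion look like this (a minimal sketch using a local stand-in macro so it builds without the DPDK headers):

  #include <stdint.h>

  /* local stand-in for __rte_unused, normally from rte_common.h */
  #define my_unused __attribute__((unused))

  /* attribute after the parameter, the style suggested above */
  static void cb_after(uint32_t index my_unused) { }

  /* attribute before the parameter, as in the patch hunk quoted above */
  static void cb_before(my_unused uint32_t index) { }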

Thanks

> {
> 	struct txq_mp2mr_mbuf_check_data *data = arg;
>-	struct rte_mbuf *buf =
>-		(void *)((uintptr_t)start + data->mp->header_size);
>
Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 14/36] mempool: store physaddr in mempool objects
  2016-04-14 10:19   ` [PATCH 14/36] mempool: store physaddr in mempool objects Olivier Matz
@ 2016-04-14 15:40     ` Wiles, Keith
  2016-04-15  7:34       ` Olivier Matz
  0 siblings, 1 reply; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 15:40 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>Store the physical address of the object in its header. It simplifies
>rte_mempool_virt2phy() and prepares the removing of the paddr[] table
>in the mempool header.
>
>Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>---
> lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
> lib/librte_mempool/rte_mempool.h | 11 ++++++-----
> 2 files changed, 17 insertions(+), 11 deletions(-)
>
>diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>index 839b828..b8e46fc 100644
>--- a/lib/librte_mempool/rte_mempool.c
>+++ b/lib/librte_mempool/rte_mempool.c
>@@ -132,19 +132,22 @@ static unsigned optimize_object_size(unsigned obj_size)
> typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
> 	void * /*obj_start*/,
> 	void * /*obj_end*/,
>-	uint32_t /*obj_index */);
>+	uint32_t /*obj_index */,
>+	phys_addr_t /*physaddr*/);

What is the reason to comment out the variable names? If there is no reason, I would suggest we remove the comment marks and keep the variable names.
>

Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 26/36] mempool: introduce a function to create an empty mempool
  2016-04-14 10:19   ` [PATCH 26/36] mempool: introduce a function to create an empty mempool Olivier Matz
@ 2016-04-14 15:57     ` Wiles, Keith
  2016-04-15  7:42       ` Olivier Matz
  0 siblings, 1 reply; 150+ messages in thread
From: Wiles, Keith @ 2016-04-14 15:57 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>Introduce a new function rte_mempool_create_empty()
>that allocates a mempool that is not populated.
>
>The functions rte_mempool_create() and rte_mempool_xmem_create()
>now make use of it, making their code much easier to read.
>Currently, they are the only users of rte_mempool_create_empty()
>but the function will be made public in next commits.
>
>Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>+/* create an empty mempool */
>+static struct rte_mempool *
>+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>+	unsigned cache_size, unsigned private_data_size,
>+	int socket_id, unsigned flags)
> {

When two processes need to use the same mempool, do we have a race condition where one process does rte_mempool_create_empty() and the other process finds that mempool and tries to use it before it has been fully initialized by the first process?

Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 10/36] mempool: use the list to iterate the mempool elements
  2016-04-14 15:33     ` Wiles, Keith
@ 2016-04-15  7:31       ` Olivier Matz
  2016-04-15 13:19         ` Wiles, Keith
  0 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-15  7:31 UTC (permalink / raw)
  To: Wiles, Keith, dev; +Cc: Richardson, Bruce, stephen

Hi,

On 04/14/2016 05:33 PM, Wiles, Keith wrote:
>>
>> static void
>> -txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
>> -		     uint32_t index __rte_unused)
>> +txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
>> +	__rte_unused uint32_t index)
> 
> I have seen this use of __rte_unused, and of attributes attached to variables and structures, in a couple of different ways.
> 
> I have seen the attribute placed both after and before the variable. I prefer the attribute to be after, but I could adapt, I hope.
> Do we have a rule about where the attribute is put in this case and others? I have seen that attributes for structures are always at the end of the structure, which in some cases may not compile in other places.
> 
> I would like to suggest we place the attributes at the end of the structure, e.g. __rte_cache_aligned, and I would like to see __rte_unused after the variable as a style in the code.

I agree the __rte_unused shouldn't have moved in this patch.

About the default placement of attributes, I've seen an example where
putting the attribute at the end didn't do what I expected:

	struct {
		int foo;
	} __rte_cache_aligned;

The expected behavior is to define a structure which is cache aligned.
But if "rte_memory.h" is not included, it will define a structure which
is not cache aligned, and a global variable called __rte_cache_aligned,
without any compiler warning.
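
A minimal sketch of a safer placement, assuming plain GCC attribute syntax (a local macro stands in for the DPDK one): attaching the attribute between the struct keyword and the tag makes a missing definition a hard compile error instead of a silent variable declaration:

	#define my_cache_aligned __attribute__((aligned(64)))

	/* the attribute applies to the type; if my_cache_aligned were
	 * undefined, this line would fail to compile rather than
	 * silently declaring a global variable */
	struct my_cache_aligned foo {
		int bar;
	};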

The gcc doc gives some hints about where to place the attributes:
https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html

Given its length, I'm not sure the dpdk coding rules should contain
anything more than "refer to gcc documentation" ;)

Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 14/36] mempool: store physaddr in mempool objects
  2016-04-14 15:40     ` Wiles, Keith
@ 2016-04-15  7:34       ` Olivier Matz
  0 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-15  7:34 UTC (permalink / raw)
  To: Wiles, Keith, dev; +Cc: Richardson, Bruce, stephen



On 04/14/2016 05:40 PM, Wiles, Keith wrote:
>> Store the physical address of the object in its header. It simplifies
>> rte_mempool_virt2phy() and prepares the removing of the paddr[] table
>> in the mempool header.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> ---
>> lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
>> lib/librte_mempool/rte_mempool.h | 11 ++++++-----
>> 2 files changed, 17 insertions(+), 11 deletions(-)
>>
>> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
>> index 839b828..b8e46fc 100644
>> --- a/lib/librte_mempool/rte_mempool.c
>> +++ b/lib/librte_mempool/rte_mempool.c
>> @@ -132,19 +132,22 @@ static unsigned optimize_object_size(unsigned obj_size)
>> typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
>> 	void * /*obj_start*/,
>> 	void * /*obj_end*/,
>> -	uint32_t /*obj_index */);
>> +	uint32_t /*obj_index */,
>> +	phys_addr_t /*physaddr*/);
> 
> What is the reason to comment out the variable names? If there is no reason, I would suggest we remove the comment marks and keep the variable names.

I just kept the initial style here.
Anyway, this code is removed later in the series. See "mempool: simplify
xmem_usage".


Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 26/36] mempool: introduce a function to create an empty mempool
  2016-04-14 15:57     ` Wiles, Keith
@ 2016-04-15  7:42       ` Olivier Matz
  2016-04-15 13:26         ` Wiles, Keith
  0 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-04-15  7:42 UTC (permalink / raw)
  To: Wiles, Keith, dev; +Cc: Richardson, Bruce, stephen



On 04/14/2016 05:57 PM, Wiles, Keith wrote:
>> Introduce a new function rte_mempool_create_empty()
>> that allocates a mempool that is not populated.
>>
>> The functions rte_mempool_create() and rte_mempool_xmem_create()
>> now make use of it, making their code much easier to read.
>> Currently, they are the only users of rte_mempool_create_empty()
>> but the function will be made public in next commits.
>>
>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>> +/* create an empty mempool */
>> +static struct rte_mempool *
>> +rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>> +	unsigned cache_size, unsigned private_data_size,
>> +	int socket_id, unsigned flags)
>> {
> 
>> When two processes need to use the same mempool, do we have a race condition where one process does rte_mempool_create_empty() and the other process finds that mempool and tries to use it before it has been fully initialized by the first process?
> 

I'm not an expert on the dpdk multiprocess model, but I would
say that there are a lot of possible race conditions like this
(ex: a port is created but not started), and I assume that
multiprocess applications have their own synchronization.

If we really want a solution in mempool, we could:

- remove the TAILQ_INSERT_TAIL() from rte_mempool_create()
- create a new function rte_mempool_share() that adds the
  mempool in the tailq for multiprocess. This function would
  be called at the end of rte_mempool_create(), or by the
  user if using rte_mempool_create_empty().

I may be mistaken, but I don't feel it's really required. Any
comment from a multiprocess expert is welcome, though.
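
To make the rte_mempool_share() idea above concrete, here is a minimal sketch of the publish-after-init pattern, using plain pthread and sys/queue stand-ins rather than the real DPDK tailq code:

	#include <pthread.h>
	#include <sys/queue.h>

	struct pool {
		/* pool contents would go here */
		TAILQ_ENTRY(pool) next;
	};

	static TAILQ_HEAD(, pool) pool_list =
		TAILQ_HEAD_INITIALIZER(pool_list);
	static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

	/* equivalent of the proposed rte_mempool_share(): insert the
	 * pool into the shared list only once it is fully populated,
	 * so a concurrent lookup can never return a half-built pool */
	static void
	pool_share(struct pool *p)
	{
		pthread_mutex_lock(&pool_lock);
		TAILQ_INSERT_TAIL(&pool_list, p, next);
		pthread_mutex_unlock(&pool_lock);
	}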


Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 06/36] mempool: update library version
  2016-04-14 10:19   ` [PATCH 06/36] mempool: update library version Olivier Matz
@ 2016-04-15 12:38     ` Olivier Matz
  0 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-04-15 12:38 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen



On 04/14/2016 12:19 PM, Olivier Matz wrote:
> The next changes of this patch series are too heavy to keep a compat
> layer, so bump the version number of the library.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---
>  doc/guides/rel_notes/release_16_04.rst     | 2 +-
>  lib/librte_mempool/Makefile                | 2 +-
>  lib/librte_mempool/rte_mempool_version.map | 6 ++++++
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
> index d0a09ef..5fe172d 100644
> --- a/doc/guides/rel_notes/release_16_04.rst
> +++ b/doc/guides/rel_notes/release_16_04.rst
> @@ -513,7 +513,7 @@ The libraries prepended with a plus sign were incremented in this version.
>       librte_kvargs.so.1
>       librte_lpm.so.2
>       librte_mbuf.so.2
> -     librte_mempool.so.1
> +   + librte_mempool.so.2
>       librte_meter.so.1
>     + librte_pipeline.so.3
>       librte_pmd_bond.so.1

This will go in doc/guides/rel_notes/release_16_07.rst
(the file did not exist at the time the patch series was created).

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 10/36] mempool: use the list to iterate the mempool elements
  2016-04-15  7:31       ` Olivier Matz
@ 2016-04-15 13:19         ` Wiles, Keith
  0 siblings, 0 replies; 150+ messages in thread
From: Wiles, Keith @ 2016-04-15 13:19 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>Hi,
>
>On 04/14/2016 05:33 PM, Wiles, Keith wrote:
>>>
>>> static void
>>> -txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
>>> -		     uint32_t index __rte_unused)
>>> +txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
>>> +	__rte_unused uint32_t index)
>> 
>> I have seen this use of __rte_unused, and of attributes attached to variables and structures, in a couple of different ways.
>> 
>> I have seen the attribute placed both after and before the variable. I prefer the attribute to be after, but I could adapt, I hope.
>> Do we have a rule about where the attribute is put in this case and others? I have seen that attributes for structures are always at the end of the structure, which in some cases may not compile in other places.
>> 
>> I would like to suggest we place the attributes at the end of the structure, e.g. __rte_cache_aligned, and I would like to see __rte_unused after the variable as a style in the code.
>
>I agree the __rte_unused shouldn't have moved in this patch.
>
>About the default placement of attributes, I've seen an example where
>putting the attribute at the end didn't do what I expected:
>
>	struct {
>		int foo;
>	} __rte_cache_aligned;
>
>The expected behavior is to define a structure which is cache aligned.
>But if "rte_memory.h" is not included, it will define a structure which
>is not cache aligned, and a global variable called __rte_cache_aligned,
>without any compiler warning.
>
>The gcc doc gives some hints about where to place the attributes:
>https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html

Then I suggest we start using the syntax recommended in the above link.

I did a quick read of the text and did not see where the above __rte_unused would be placed. Did someone find it?

>
>Given its length, I'm not sure the dpdk coding rules should contain
>anything more than "refer to gcc documentation" ;)
>
>Regards,
>Olivier
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH 26/36] mempool: introduce a function to create an empty mempool
  2016-04-15  7:42       ` Olivier Matz
@ 2016-04-15 13:26         ` Wiles, Keith
  0 siblings, 0 replies; 150+ messages in thread
From: Wiles, Keith @ 2016-04-15 13:26 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: Richardson, Bruce, stephen

>
>
>On 04/14/2016 05:57 PM, Wiles, Keith wrote:
>>> Introduce a new function rte_mempool_create_empty()
>>> that allocates a mempool that is not populated.
>>>
>>> The functions rte_mempool_create() and rte_mempool_xmem_create()
>>> now make use of it, making their code much easier to read.
>>> Currently, they are the only users of rte_mempool_create_empty()
>>> but the function will be made public in next commits.
>>>
>>> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
>>> +/* create an empty mempool */
>>> +static struct rte_mempool *
>>> +rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>>> +	unsigned cache_size, unsigned private_data_size,
>>> +	int socket_id, unsigned flags)
>>> {
>> 
>> When two processes need to use the same mempool, do we have a race condition where one process does rte_mempool_create_empty() and the other process finds that mempool and tries to use it before it has been fully initialized by the first process?
>> 
>
>I'm not an expert on the dpdk multiprocess model, but I would
>say that there are a lot of possible race conditions like this
>(ex: a port is created but not started), and I assume that
>multiprocess applications have their own synchronization.
>
>If we really want a solution in mempool, we could:
>
>- remove the TAILQ_INSERT_TAIL() from rte_mempool_create()
>- create a new function rte_mempool_share() that adds the
>  mempool in the tailq for multiprocess. This function would
>  be called at the end of rte_mempool_create(), or by the
>  user if using rte_mempool_create_empty().
>
>I may be mistaken, but I don't feel it's really required. Any
>comment from a multiprocess expert is welcome, though.

Yes, I agree we should have the developer handle the multiprocess synchronization. The only thing I can think of for us to do is provide a sync-point API.

Maybe instead of adding a fix in each place in DPDK, we should require the developer to add the sync-up once he has done all of the inits in his code, or we provide one. This may be a minor issue, so we can ignore my comments for now.
>
>
>Regards,
>Olivier
>


Regards,
Keith





^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH v2 10/36] mempool: use the list to iterate the mempool elements
  2016-04-14 10:19   ` [PATCH 10/36] mempool: use the list to iterate the mempool elements Olivier Matz
  2016-04-14 15:33     ` Wiles, Keith
@ 2016-05-11 10:02     ` Olivier Matz
  1 sibling, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-11 10:02 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Now that the mempool objects are chained into a list, we can use that
list to browse them. This implies a rework of the rte_mempool_obj_iter()
API, which no longer needs to take as many arguments as before. The
previous function is kept as a private function, and renamed in this
commit. It will be removed in a later commit of the patch series.

The only internal users of this function are the mellanox drivers. The
code is updated accordingly.

Introducing API compatibility for this function was considered, but it
is not easy to do without keeping the old code, as the previous function
could also be used to browse elements that were not added to a mempool.
Moreover, the API is already broken by other patches in this version.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---

v1 -> v2:
- do not change the place of __rte_unused in txq_mp2mr_mbuf_check(),
  as suggested by Keith.

 drivers/net/mlx4/mlx4.c                    | 51 +++++++---------------
 drivers/net/mlx5/mlx5_rxtx.c               | 51 +++++++---------------
 lib/librte_mempool/rte_mempool.c           | 36 ++++++++++++---
 lib/librte_mempool/rte_mempool.h           | 70 ++++++++----------------------
 lib/librte_mempool/rte_mempool_version.map |  3 +-
 5 files changed, 82 insertions(+), 129 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 41453cb..437eca6 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1320,7 +1320,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -1328,34 +1327,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	uint32_t index __rte_unused)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -1373,24 +1364,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 88226b6..9ec17fc 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -262,7 +262,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -270,34 +269,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	uint32_t index __rte_unused)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -315,24 +306,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index c38eee4..e9fad7f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -126,6 +126,14 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+/**
+ * A mempool object iterator callback function.
+ */
+typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
+	void * /*obj_start*/,
+	void * /*obj_end*/,
+	uint32_t /*obj_index */);
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
@@ -160,8 +168,8 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory. */
-uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+static uint32_t
+rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
 	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
@@ -219,6 +227,24 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	return i;
 }
 
+/* call obj_cb() for each mempool element */
+uint32_t
+rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	struct rte_mempool_objhdr *hdr;
+	void *obj;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->elt_list, next) {
+		obj = (char *)hdr + sizeof(*hdr);
+		obj_cb(mp, obj_cb_arg, obj, n);
+		n++;
+	}
+
+	return n;
+}
+
 /*
  * Populate  mempool with the objects.
  */
@@ -250,7 +276,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 	arg.obj_init = obj_init;
 	arg.obj_init_arg = obj_init_arg;
 
-	mp->size = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_populate, &arg);
@@ -364,7 +390,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
+	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
@@ -792,7 +818,7 @@ mempool_audit_cookies(struct rte_mempool *mp)
 	arg.obj_end = mp->elt_va_start;
 	arg.obj_num = 0;
 
-	num = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		mp->size, elt_sz, 1,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_audit, &arg);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a80335f..f5f6752 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -329,63 +329,13 @@ void __mempool_check_cookies(const struct rte_mempool *mp,
 /**
  * An object callback function for mempool.
  *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
+ * Used by rte_mempool_create() and rte_mempool_obj_iter().
  */
 typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 		void *opaque, void *obj, unsigned obj_idx);
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */);
-
-/**
- * Call a function for each mempool object in a memory chunk
- *
- * Iterate across objects of the given size and alignment in the
- * provided chunk of memory. The given memory buffer can consist of
- * disjointed physical pages.
- *
- * For each object, call the provided callback (if any). This function
- * is used to populate a mempool, or walk through all the elements of a
- * mempool, or estimate how many elements of the given size could be
- * created in the given memory buffer.
- *
- * @param vaddr
- *   Virtual address of the memory buffer.
- * @param elt_num
- *   Maximum number of objects to iterate through.
- * @param elt_sz
- *   Size of each object.
- * @param align
- *   Alignment of each object.
- * @param paddr
- *   Array of physical addresses of the pages that comprises given memory
- *   buffer.
- * @param pg_num
- *   Number of elements in the paddr array.
- * @param pg_shift
- *   LOG2 of the physical pages size.
- * @param obj_iter
- *   Object iterator callback function (could be NULL).
- * @param obj_iter_arg
- *   User defined parameter for the object iterator callback function.
- *
- * @return
- *   Number of objects iterated through.
- */
-uint32_t rte_mempool_obj_iter(void *vaddr,
-	uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -638,6 +588,24 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 
 
 /**
+ * Call a function for each mempool element
+ *
+ * Iterate across all objects attached to a rte_mempool and call the
+ * callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param obj_cb
+ *   A function pointer that is called for each object.
+ * @param obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of objects iterated.
+ */
+uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 8c157d0..4db75ca 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -9,7 +9,6 @@ DPDK_2.0 {
 	rte_mempool_dump;
 	rte_mempool_list_dump;
 	rte_mempool_lookup;
-	rte_mempool_obj_iter;
 	rte_mempool_walk;
 	rte_mempool_xmem_create;
 	rte_mempool_xmem_size;
@@ -21,5 +20,7 @@ DPDK_2.0 {
 DPDK_16.07 {
 	global:
 
+	rte_mempool_obj_iter;
+
 	local: *;
 } DPDK_2.0;
-- 
2.8.0.rc3
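
As a usage note, a minimal sketch of the reworked iterator (the callback signature matches the rte_mempool_obj_cb_t typedef shown in the patch and assumes the rte_mempool.h from this series; the counting callback itself is only an illustration):

	/* illustrative callback: count the objects in a pool */
	static void
	count_obj(struct rte_mempool *mp __rte_unused, void *arg,
		void *obj __rte_unused, unsigned obj_idx __rte_unused)
	{
		unsigned *count = arg;
		(*count)++;
	}

	/* in the caller: */
	unsigned count = 0;
	rte_mempool_obj_iter(mp, count_obj, &count);
	/* count now equals the value returned by the function */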

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 00/35] mempool: rework memory allocation
  2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
                     ` (36 preceding siblings ...)
  2016-04-14 13:50   ` [PATCH 00/36] mempool: rework memory allocation Wiles, Keith
@ 2016-05-18 11:04   ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 01/35] mempool: rework comments and style Olivier Matz
                       ` (35 more replies)
  37 siblings, 36 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This series is a rework of mempool. For those who don't want to read
the whole cover letter, here is a summary:

- it is not possible to allocate large mempools if there is not enough
  contiguous memory; this series solves that issue
- introduce new APIs with fewer arguments: "create, populate, obj_init"
- allow a mempool to be freed
- split the code into smaller functions, which will ease the
  introduction of ext_handler
- remove test-pmd anonymous mempool creation
- remove most of the dom0-specific mempool code
- open the door for an eal_memory rework: we probably don't need large
  contiguous memory areas anymore; working with pages would work.

This breaks the ABI, as indicated in the deprecation notice for 16.04.
The API stays almost the same; no modification is needed in the example
apps or in test-pmd. Only the kni and mellanox drivers are slightly modified.

Changes v2 -> v3:
- fix some checkpatch issues
- rework titles and commit logs
- fix compilation with debug + shared libraries:
  rte_mempool_check_cookies() must be exported
- rebase on head

Changes v1 -> v2:
- do not change the place of __rte_unused in txq_mp2mr_mbuf_check(),
  as suggested by Keith.

Changes RFC -> v1:
- remove the rte_deconst macro, and remove some const qualifier in
  dump/audit functions
- rework modifications in mellanox drivers to ensure the mempool is
  virtually contiguous
- fix mempool memory chunk iteration (bad pointer was used)
- fix compilation on freebsd: replace MAP_LOCKED flag by mlock()
- fix compilation on tilera (pointer arithmetics)
- slightly rework and clean the mempool autotest
- fix mempool autotest on bsd
- more validation (especially mellanox drivers and kni that were not
  tested in RFC)
- passed autotests (x86_64-native-linuxapp-gcc and x86_64-native-bsdapp-gcc)
- rebase on head, reorder the patches a bit and fix minor split issues


Description of the initial issue
--------------------------------

The allocation of an mbuf pool can fail even if there is enough memory.
The problem is related to the way the memory is allocated and used in
dpdk. It is particularly annoying with mbuf pools, but the allocation
can also fail in other use cases that allocate a large amount of memory.

- rte_malloc() allocates physically contiguous memory, which is needed
  for mempools, but useless most of the time.

  Allocating a large physically contiguous zone is often impossible
  because the system provides hugepages, which may not be contiguous.

- rte_mempool_create() (and therefore rte_pktmbuf_pool_create())
  requires a physically contiguous zone.

- rte_mempool_xmem_create() does not solve the issue, as it still
  needs the memory to be virtually contiguous, and there is no
  way in dpdk to allocate virtually contiguous memory that is
  not also physically contiguous.

How to reproduce the issue
--------------------------

- start the dpdk with some 2MB hugepages (it can also occur with 1GB)
- allocate a large mempool
- even if there is enough memory, the allocation can fail

Example:

  git clone http://dpdk.org/git/dpdk
  cd dpdk
  make config T=x86_64-native-linuxapp-gcc
  make -j32
  mkdir -p /mnt/huge
  mount -t hugetlbfs nodev /mnt/huge
  echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

  # we try to allocate a mempool whose size is ~450MB, it fails
  ./build/app/testpmd -l 2,4 -- --total-num-mbufs=200000 -i

The EAL log "EAL: Virtual area found at..." shows that there are
several zones, but all of them are smaller than 450MB.

Workarounds:

- Use 1GB hugepages: it sometimes works, but for very large
  pools (millions of mbufs) there is the same issue. Moreover,
  it would consume at least 1GB of memory, which can be a lot
  in some cases.

- Reboot the machine or allocate hugepages at boot time: this increases
  the chances of having more contiguous memory, but does not completely
  solve the issue.

Solutions
---------

Below is a list of proposed solutions. I implemented a quick and dirty
PoC of solution 1, but it does not work in all conditions and it's
really an ugly hack. This series implements solution 4, which looks
the best to me, knowing it does not prevent further enhancements
to dpdk memory in the future (solution 3, for instance).

Solution 1: in application
--------------------------

- allocate several hugepages using rte_malloc() or rte_memzone_reserve()
  (only keeping complete hugepages)
- parse memsegs and /proc/maps to check which files mmap these pages
- mmap the files in a contiguous virtual area
- use rte_mempool_xmem_create()

Cons:

- 1a. parsing the memsegs of the rte config in the application does not
  use a public API, and can break if internal dpdk code changes
- 1b. some memory is lost due to malloc headers. Also, if the memory is
  very fragmented (ex: all 2MB pages are physically separated), it does
  not work at all because we cannot get any complete page. It is not
  possible to use a lower-level allocator since commit fafcc11985a.
- 1c. we cannot use rte_pktmbuf_pool_create(), so we need to use the
  mempool api and do part of the job manually
- 1d. it breaks secondary processes, as the virtual addresses won't be
  mmap'd at the same place in the secondary process
- 1e. it only fixes the issue for the mbuf pool of the application;
  internal pools in dpdk libraries are not modified
- 1f. this is a pure linux solution (rte_map files)
- 1g. the application has to be aware of the RTE_EAL_SINGLE_SEGMENTS
  option that changes the way hugepages are mapped. By the way, it's
  strange to have such a compile-time option; we should probably have
  only one behavior that works all the time.

Solution 2: in dpdk memory allocator
------------------------------------

- do the same as solution 1 in a new function rte_malloc_non_contig():
  allocate several chunks and mmap them in a contiguous virtual memory
- a flag has to be added in malloc header to do the proper cleanup in
  rte_free() (free all the chunks, munmap the memory)
- introduce a new rte_mem_get_physmap(*physmap, addr, len) that returns
  the virt2phys mapping of a virtual area in dpdk
- add a mempool flag MEMPOOL_F_NON_PHYS_CONTIG to use
  rte_malloc_non_contig() to allocate the area storing the objects

Cons:

- 2a. same as 1d: it breaks secondary processes if the mempool flag is
  used.
- 2b. same as 1b: some memory is lost due to malloc headers, and it
  cannot work if memory is too fragmented.
- 2c. rte_malloc_virt2phy() cannot be used on these zones. It would
  return the physical address of the first page only. It would be better
  to return an error in this case.
- 2d. need to check how to implement this on bsd (TBD)

Solution 3: in dpdk eal memory
------------------------------

- Rework the way hugepages are mmap'd in dpdk: instead of having several
  rte_map* files, just mmap one file per node. It may drastically
  simplify EAL memory management in dpdk.
- An API should be added to retrieve the physical mapping of a virtual
  area (ex: rte_mem_get_physmap(*physmap, addr, len))
- rte_malloc() and rte_memzone_reserve() won't allocate physically
  contiguous memory anymore (TBD)
- Update mempool to always use the rte_mempool_xmem_create() version

Cons:

- 3a. a lot of rework in eal memory; it will induce some behavior
  changes and maybe api changes
- 3b. possible conflicts with xen_dom0 mempool

Solution 4: in mempool
----------------------

- Introduce a new API to fill a mempool with zones that are not
  virtually contiguous. It requires adding new functions to create and
  populate a mempool. Example (TBD, see the sketch after this list):

  - rte_mempool_create_empty(name, n, elt_size, cache_size, priv_size)
  - rte_mempool_populate(mp, addr, len): add virtual memory for objects
  - rte_mempool_obj_iter(mp, obj_cb, arg): call a cb for each object

- update rte_mempool_create() to allocate objects in several memory
  chunks by default if there is no physically contiguous memory that
  is large enough.
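
For illustration, a sketch of how the proposed calls could chain together. All names are the TBD ones listed above, so the final signatures may differ; n, elt_size, cache_size, priv_size, addr, len, my_obj_init and init_arg are assumed to be provided by the caller:

  struct rte_mempool *mp;

  mp = rte_mempool_create_empty("pool", n, elt_size, cache_size, priv_size);
  if (mp == NULL)
          rte_exit(EXIT_FAILURE, "cannot create empty mempool\n");

  /* add one or more memory chunks, not necessarily contiguous */
  if (rte_mempool_populate(mp, addr, len) < 0)
          rte_exit(EXIT_FAILURE, "cannot populate mempool\n");

  /* initialize each object once the pool is populated */
  rte_mempool_obj_iter(mp, my_obj_init, init_arg);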

Tests done
----------

Compilation
~~~~~~~~~~~

The following targets:

 x86_64-native-linuxapp-gcc
 i686-native-linuxapp-gcc
 x86_x32-native-linuxapp-gcc
 x86_64-native-linuxapp-clang
 x86_64-native-bsdapp-gcc
 ppc_64-power8-linuxapp-gcc
 tile-tilegx-linuxapp-gcc (only the mempool files, the target does not compile)

Libraries with and without debug, in static and shared mode + examples.

autotests
~~~~~~~~~

Passed all autotests on x86_64-native-linuxapp-gcc (including kni) and
mempool-related autotests on x86_64-native-bsdapp-gcc.

test-pmd
~~~~~~~~

# now starts fine; it was failing before if the mempool was too fragmented
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -- -i --port-topology=chained

# still ok
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 -- -i --port-topology=chained --mp-anon
set fwd txonly
start
stop

# fails, but was failing before too. The problem is that the physical
# addresses are not properly set when using --no-huge. The mempool phys addrs
# are now correct, but the zones allocated through memzone_reserve() are
# still wrong. This could be fixed in a future series.
./x86_64-native-linuxapp-gcc/app/testpmd -l 0,2,4 -n 4 -m 256 --no-huge -- -i --port-topology=chained
set fwd txonly
start
stop


Olivier Matz (35):
  mempool: rework comments and style
  mempool: rename element size variables
  mempool: uninline function to check cookies
  mempool: use sizeof to get the size of header and trailer
  mempool: rename object constructor typedef
  mempool: list objects when added
  mempool: remove const qualifier when browsing pools
  mempool: remove const qualifier in dump and audit
  mempool: use the list to iterate the elements
  mempool: use the list to audit all elements
  mempool: use the list to initialize objects
  mempool: create internal ring in a specific function
  mempool: store physical address in objects
  mempool: remove macro to check if contiguous
  mempool: store memory chunks in a list
  mempool: add function to iterate the memory chunks
  mempool: simplify the memory usage calculation
  mempool: introduce a free callback for memory chunks
  mempool: get memory size with unspecified page size
  mempool: allocate in several memory chunks by default
  eal: lock memory when not using hugepages
  mempool: support no hugepage mode
  mempool: replace physical address by a memzone pointer
  mempool: introduce a function to free a pool
  mempool: introduce a function to create an empty pool
  eal/xen: return machine address without knowing memseg id
  mempool: rework support of Xen dom0
  mempool: create the internal ring when populating
  mempool: populate with anonymous memory
  mempool: make mempool populate and free api public
  app/testpmd: remove anonymous mempool code
  mem: avoid memzone/mempool/ring name truncation
  mempool: add flag for removing phys contiguous constraint
  app/test: rework mempool test
  doc: update release notes about mempool allocation

 app/test-pmd/Makefile                        |    4 -
 app/test-pmd/mempool_anon.c                  |  201 -----
 app/test-pmd/mempool_osdep.h                 |   54 --
 app/test-pmd/testpmd.c                       |   23 +-
 app/test/test_mempool.c                      |  243 +++---
 doc/guides/rel_notes/deprecation.rst         |    8 -
 doc/guides/rel_notes/release_16_07.rst       |    9 +
 drivers/net/mlx4/mlx4.c                      |  140 ++--
 drivers/net/mlx5/mlx5_rxtx.c                 |  140 ++--
 drivers/net/mlx5/mlx5_rxtx.h                 |    4 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h        |    2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c   |    4 +-
 lib/librte_eal/common/eal_common_log.c       |    2 +-
 lib/librte_eal/common/eal_common_memzone.c   |   10 +-
 lib/librte_eal/common/include/rte_memory.h   |   11 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c     |    2 +-
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c |   18 +-
 lib/librte_kni/rte_kni.c                     |   12 +-
 lib/librte_mempool/Makefile                  |    3 -
 lib/librte_mempool/rte_dom0_mempool.c        |  133 ----
 lib/librte_mempool/rte_mempool.c             | 1061 +++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h             |  597 +++++++--------
 lib/librte_mempool/rte_mempool_version.map   |   19 +-
 lib/librte_ring/rte_ring.c                   |   16 +-
 24 files changed, 1402 insertions(+), 1314 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

-- 
2.8.0.rc3

^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH v3 01/35] mempool: rework comments and style
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 02/35] mempool: rename element size variables Olivier Matz
                       ` (34 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

No functional change, just fix some comments and styling issues.
Also avoid duplicating comments between rte_mempool_create()
and rte_mempool_xmem_create().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
---
 lib/librte_mempool/rte_mempool.c | 17 ++++++++++--
 lib/librte_mempool/rte_mempool.h | 59 +++++++++-------------------------------
 2 files changed, 27 insertions(+), 49 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0724942..daf4d06 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -152,6 +152,14 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
+/* Iterate through objects at the given address
+ *
+ * Given the pointer to the memory, and its topology in physical memory
+ * (the physical addresses table), iterate through the "elt_num" objects
+ * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * chunk, invoke a callback. It returns the effective number of objects
+ * in this memory.
+ */
 uint32_t
 rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
@@ -341,9 +349,8 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/*
- * Calculate how much memory would be actually required with the
- * given memory footprint to store required number of elements.
+/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
+ * argument to the end of the object.
  */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
@@ -352,6 +359,10 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
 
+/*
+ * Calculate how much memory would be actually required with the
+ * given memory footprint to store required number of elements.
+ */
 ssize_t
 rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 0f3ef4a..dd70469 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -214,7 +214,7 @@ struct rte_mempool {
 
 }  __rte_cache_aligned;
 
-#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread in memory. */
+#define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
@@ -272,7 +272,8 @@ struct rte_mempool {
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj, sizeof(struct rte_mempool_objhdr));
+	return (struct rte_mempool_objhdr *)RTE_PTR_SUB(obj,
+		sizeof(struct rte_mempool_objhdr));
 }
 
 /**
@@ -546,8 +547,9 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 /**
  * Create a new mempool named *name* in memory.
  *
- * This function uses ``memzone_reserve()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
+ * The pool contains n elements of elt_size. Its size is set to n.
+ * This function uses ``memzone_reserve()`` to allocate the mempool header
+ * (and the objects if vaddr is NULL).
  * Depending on the input parameters, mempool elements can be either allocated
  * together with the mempool header, or an externally provided memory buffer
  * could be used to store mempool objects. In later case, that external
@@ -562,18 +564,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  * @param elt_size
  *   The size of each element.
  * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
+ *   Size of the cache. See rte_mempool_create() for details.
  * @param private_data_size
  *   The size of the private data appended after the mempool
  *   structure. This is useful for storing some private data after the
@@ -587,35 +578,17 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   An opaque pointer to data that can be used in the mempool
  *   constructor function.
  * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
+ *   A function called for each object at initialization of the pool.
+ *   See rte_mempool_create() for details.
  * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
+ *   An opaque pointer passed to the object constructor function.
  * @param socket_id
  *   The *socket_id* argument is the socket identifier in the case of
  *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
  *   constraint for the reserved zone.
  * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
  * @param vaddr
  *   Virtual address of the externally allocated memory buffer.
  *   Will be used to store mempool objects.
@@ -628,13 +601,7 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
  *   LOG2 of the physical pages size.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
  */
 struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 02/35] mempool: rename element size variables
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 01/35] mempool: rework comments and style Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 03/35] mempool: uninline function to check cookies Olivier Matz
                       ` (33 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This commit replaces elt_size with total_elt_size where appropriate.

In some mempool functions, the size of the elements is used as an
argument or a variable, and it is not always clear whether it includes
the header and trailer or not.

To avoid this confusion:
- update the API documentation
- name the variables and arguments "elt_size" when the size does not
  include the header and trailer, or "total_elt_size" otherwise (the
  relation between the two is sketched below).
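
As a reminder, the relation between the two sizes can be read from
struct rte_mempool_objsz, as filled by rte_mempool_calc_obj_size():

  struct rte_mempool_objsz sz;

  rte_mempool_calc_obj_size(elt_size, flags, &sz);
  /* sz.elt_size:     size of the object body (8-byte aligned at least)
   * sz.header_size:  size of the header placed before the object
   * sz.trailer_size: size of the trailer placed after the object
   * sz.total_size:   sz.header_size + sz.elt_size + sz.trailer_size */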

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
---
 lib/librte_mempool/rte_mempool.c | 21 +++++++++++----------
 lib/librte_mempool/rte_mempool.h | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index daf4d06..fe90ed3 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -156,14 +156,14 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  *
  * Given the pointer to the memory, and its topology in physical memory
  * (the physical addresses table), iterate through the "elt_num" objects
- * of size "total_elt_sz" aligned at "align". For each object in this memory
+ * of size "elt_sz" aligned at "align". For each object in this memory
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory.
  */
 uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
+rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
 	uint32_t i, j, k;
 	uint32_t pgn, pgf;
@@ -179,7 +179,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t elt_sz, size_t align,
 	while (i != elt_num && j != pg_num) {
 
 		start = RTE_ALIGN_CEIL(va, align);
-		end = start + elt_sz;
+		end = start + total_elt_sz;
 
 		/* index of the first page for the next element. */
 		pgf = (end >> pg_shift) - (start >> pg_shift);
@@ -256,6 +256,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 		mempool_obj_populate, &arg);
 }
 
+/* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 	struct rte_mempool_objsz *sz)
@@ -333,17 +334,17 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  * Calculate maximum amount of memory required to store given number of objects.
  */
 size_t
-rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift)
+rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
 	size_t n, pg_num, pg_sz, sz;
 
 	pg_sz = (size_t)1 << pg_shift;
 
-	if ((n = pg_sz / elt_sz) > 0) {
+	if ((n = pg_sz / total_elt_sz) > 0) {
 		pg_num = (elt_num + n - 1) / n;
 		sz = pg_num << pg_shift;
 	} else {
-		sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num;
+		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 	}
 
 	return sz;
@@ -364,7 +365,7 @@ mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
+rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	uint32_t n;
@@ -375,7 +376,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, elt_sz, 1,
+	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index dd70469..640f622 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1291,7 +1291,7 @@ struct rte_mempool *rte_mempool_lookup(const char *name);
  * calculates header, trailer, body and total sizes of the mempool object.
  *
  * @param elt_size
- *   The size of each element.
+ *   The size of each element, without header and trailer.
  * @param flags
  *   The flags used for the mempool creation.
  *   Consult rte_mempool_create() for more information about possible values.
@@ -1317,14 +1317,15 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param pg_shift
  *   LOG2 of the physical pages size.
  * @return
  *   Required memory size aligned at page boundary.
  */
-size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
+size_t rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pg_shift);
 
 /**
@@ -1338,8 +1339,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   Will be used to store mempool objects.
  * @param elt_num
  *   Number of elements.
- * @param elt_sz
- *   The size of each element.
+ * @param total_elt_sz
+ *   The size of each element, including header and trailer, as returned
+ *   by rte_mempool_calc_obj_size().
  * @param paddr
  *   Array of physical addresses of the pages that comprises given memory
  *   buffer.
@@ -1353,8 +1355,9 @@ size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz,
  *   buffer is too small, return a negative value whose absolute value
  *   is the actual number of elements that can be stored in that buffer.
  */
-ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
+ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift);
 
 /**
  * Walk list of all memory pools
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 03/35] mempool: uninline function to check cookies
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 01/35] mempool: rework comments and style Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 02/35] mempool: rename element size variables Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 04/35] mempool: use sizeof to get the size of header and trailer Olivier Matz
                       ` (32 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

There's no reason to keep this function inlined. Move it to
rte_mempool.c. The function needs to be exported so that it still
works when compiling with shared libraries + debug. We also need to
keep the macro, because we don't want to call an empty function when
debug is disabled.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           | 82 ++++++++++++++++++++++++++++--
 lib/librte_mempool/rte_mempool.h           | 78 ++--------------------------
 lib/librte_mempool/rte_mempool_version.map |  8 +++
 3 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index fe90ed3..46a5d59 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -699,8 +699,6 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
 	return count;
 }
 
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-/* check cookies before and after objects */
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
@@ -711,6 +709,80 @@ struct mempool_audit_arg {
 	uint32_t obj_num;
 };
 
+/* check and update cookies or panic (internal) */
+void rte_mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free)
+{
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
+	struct rte_mempool_objhdr *hdr;
+	struct rte_mempool_objtlr *tlr;
+	uint64_t cookie;
+	void *tmp;
+	void *obj;
+	void **obj_table;
+
+	/* Force to drop the "const" attribute. This is done only when
+	 * DEBUG is enabled */
+	tmp = (void *) obj_table_const;
+	obj_table = (void **) tmp;
+
+	while (n--) {
+		obj = obj_table[n];
+
+		if (rte_mempool_from_obj(obj) != mp)
+			rte_panic("MEMPOOL: object is owned by another "
+				  "mempool\n");
+
+		hdr = __mempool_get_header(obj);
+		cookie = hdr->cookie;
+
+		if (free == 0) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (put)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
+		} else if (free == 1) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (get)\n");
+			}
+			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
+		} else if (free == 2) {
+			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
+			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
+				rte_log_set_history(0);
+				RTE_LOG(CRIT, MEMPOOL,
+					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+					obj, (const void *) mp, cookie);
+				rte_panic("MEMPOOL: bad header cookie (audit)\n");
+			}
+		}
+		tlr = __mempool_get_trailer(obj);
+		cookie = tlr->cookie;
+		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
+			rte_log_set_history(0);
+			RTE_LOG(CRIT, MEMPOOL,
+				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
+				obj, (const void *) mp, cookie);
+			rte_panic("MEMPOOL: bad trailer cookie\n");
+		}
+	}
+#else
+	RTE_SET_USED(mp);
+	RTE_SET_USED(obj_table_const);
+	RTE_SET_USED(n);
+	RTE_SET_USED(free);
+#endif
+}
+
+#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 static void
 mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
 {
@@ -753,13 +825,13 @@ mempool_audit_cookies(const struct rte_mempool *mp)
 			arg.obj_num, mp->size);
 	}
 }
+#else
+#define mempool_audit_cookies(mp) do {} while(0)
+#endif
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic error "-Wcast-qual"
 #endif
-#else
-#define mempool_audit_cookies(mp) do {} while(0)
-#endif
 
 /* check cookies before and after objects */
 static void
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 640f622..a6b82cf 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -312,80 +312,12 @@ static inline struct rte_mempool_objtlr *__mempool_get_trailer(void *obj)
  *   - 1: object is supposed to be free, mark it as allocated
  *   - 2: just check that cookie is valid (free or allocated)
  */
+void rte_mempool_check_cookies(const struct rte_mempool *mp,
+	void * const *obj_table_const, unsigned n, int free);
+
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic ignored "-Wcast-qual"
-#endif
-static inline void __mempool_check_cookies(const struct rte_mempool *mp,
-					   void * const *obj_table_const,
-					   unsigned n, int free)
-{
-	struct rte_mempool_objhdr *hdr;
-	struct rte_mempool_objtlr *tlr;
-	uint64_t cookie;
-	void *tmp;
-	void *obj;
-	void **obj_table;
-
-	/* Force to drop the "const" attribute. This is done only when
-	 * DEBUG is enabled */
-	tmp = (void *) obj_table_const;
-	obj_table = (void **) tmp;
-
-	while (n--) {
-		obj = obj_table[n];
-
-		if (rte_mempool_from_obj(obj) != mp)
-			rte_panic("MEMPOOL: object is owned by another "
-				  "mempool\n");
-
-		hdr = __mempool_get_header(obj);
-		cookie = hdr->cookie;
-
-		if (free == 0) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (put)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
-		}
-		else if (free == 1) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (get)\n");
-			}
-			hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE1;
-		}
-		else if (free == 2) {
-			if (cookie != RTE_MEMPOOL_HEADER_COOKIE1 &&
-			    cookie != RTE_MEMPOOL_HEADER_COOKIE2) {
-				rte_log_set_history(0);
-				RTE_LOG(CRIT, MEMPOOL,
-					"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-					obj, (const void *) mp, cookie);
-				rte_panic("MEMPOOL: bad header cookie (audit)\n");
-			}
-		}
-		tlr = __mempool_get_trailer(obj);
-		cookie = tlr->cookie;
-		if (cookie != RTE_MEMPOOL_TRAILER_COOKIE) {
-			rte_log_set_history(0);
-			RTE_LOG(CRIT, MEMPOOL,
-				"obj=%p, mempool=%p, cookie=%" PRIx64 "\n",
-				obj, (const void *) mp, cookie);
-			rte_panic("MEMPOOL: bad trailer cookie\n");
-		}
-	}
-}
-#ifndef __INTEL_COMPILER
-#pragma GCC diagnostic error "-Wcast-qual"
-#endif
+#define __mempool_check_cookies(mp, obj_table_const, n, free) \
+	rte_mempool_check_cookies(mp, obj_table_const, n, free)
 #else
 #define __mempool_check_cookies(mp, obj_table_const, n, free) do {} while(0)
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 17151e0..ff80ac2 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -17,3 +17,11 @@ DPDK_2.0 {
 
 	local: *;
 };
+
+DPDK_16.07 {
+	global:
+
+	rte_mempool_check_cookies;
+
+	local: *;
+} DPDK_2.0;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 04/35] mempool: use sizeof to get the size of header and trailer
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (2 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 03/35] mempool: uninline function to check cookies Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 05/35] mempool: rename object constructor typedef Olivier Matz
                       ` (31 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Since commits d2e0ca22f and 97e7e685b, the mempool object header and
trailer are defined as structures. We can get their size using
sizeof() instead of doing a calculation that would become wrong at the
first structure update.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 46a5d59..992edcd 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -265,24 +265,13 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 
 	sz = (sz != NULL) ? sz : &lsz;
 
-	/*
-	 * In header, we have at least the pointer to the pool, and
-	 * optionaly a 64 bits cookie.
-	 */
-	sz->header_size = 0;
-	sz->header_size += sizeof(struct rte_mempool *); /* ptr to pool */
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->header_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->header_size = sizeof(struct rte_mempool_objhdr);
 	if ((flags & MEMPOOL_F_NO_CACHE_ALIGN) == 0)
 		sz->header_size = RTE_ALIGN_CEIL(sz->header_size,
 			RTE_MEMPOOL_ALIGN);
 
-	/* trailer contains the cookie in debug mode */
-	sz->trailer_size = 0;
-#ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-	sz->trailer_size += sizeof(uint64_t); /* cookie */
-#endif
+	sz->trailer_size = sizeof(struct rte_mempool_objtlr);
+
 	/* element size is 8 bytes-aligned at least */
 	sz->elt_size = RTE_ALIGN_CEIL(elt_size, sizeof(uint64_t));
 
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 05/35] mempool: rename object constructor typedef
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (3 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 04/35] mempool: use sizeof to get the size of header and trailer Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 06/35] mempool: list objects when added Olivier Matz
                       ` (30 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This commit renames rte_mempool_obj_ctor_t to rte_mempool_obj_cb_t.

In the next commits, we will add the ability to populate the mempool
and iterate through objects using the same kind of function. As the
callback is not a constructor anymore, the more generic name
rte_mempool_obj_cb_t is used.

The rte_mempool_obj_iter_t type that was used to iterate over objects
will be removed in the next commits.

No functional change.
In this commit, the API is preserved through a compat typedef.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/mempool_anon.c                |  4 ++--
 app/test-pmd/mempool_osdep.h               |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.h      |  2 +-
 drivers/net/xenvirt/rte_mempool_gntalloc.c |  4 ++--
 lib/librte_mempool/rte_dom0_mempool.c      |  2 +-
 lib/librte_mempool/rte_mempool.c           |  8 ++++----
 lib/librte_mempool/rte_mempool.h           | 27 ++++++++++++++-------------
 7 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
index 4730432..5e23848 100644
--- a/app/test-pmd/mempool_anon.c
+++ b/app/test-pmd/mempool_anon.c
@@ -86,7 +86,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp;
@@ -190,7 +190,7 @@ mempool_anon_create(__rte_unused const char *name,
 	__rte_unused unsigned private_data_size,
 	__rte_unused rte_mempool_ctor_t *mp_init,
 	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_ctor_t *obj_init,
+	__rte_unused rte_mempool_obj_cb_t *obj_init,
 	__rte_unused void *obj_init_arg,
 	__rte_unused int socket_id, __rte_unused unsigned flags)
 {
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
index 6b8df68..7ce7297 100644
--- a/app/test-pmd/mempool_osdep.h
+++ b/app/test-pmd/mempool_osdep.h
@@ -48,7 +48,7 @@ struct rte_mempool *
 mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags);
 
 #endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/drivers/net/xenvirt/rte_eth_xenvirt.h b/drivers/net/xenvirt/rte_eth_xenvirt.h
index fc15a63..4995a9b 100644
--- a/drivers/net/xenvirt/rte_eth_xenvirt.h
+++ b/drivers/net/xenvirt/rte_eth_xenvirt.h
@@ -51,7 +51,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 
diff --git a/drivers/net/xenvirt/rte_mempool_gntalloc.c b/drivers/net/xenvirt/rte_mempool_gntalloc.c
index e5c681e..73e82f8 100644
--- a/drivers/net/xenvirt/rte_mempool_gntalloc.c
+++ b/drivers/net/xenvirt/rte_mempool_gntalloc.c
@@ -78,7 +78,7 @@ static struct _mempool_gntalloc_info
 _create_mempool(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	struct _mempool_gntalloc_info mgi;
@@ -253,7 +253,7 @@ struct rte_mempool *
 rte_mempool_gntalloc_create(const char *name, unsigned elt_num, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags)
 {
 	int rv;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0d6d750..0051bd5 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -83,7 +83,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 	int socket_id, unsigned flags)
 {
 	struct rte_mempool *mp = NULL;
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 992edcd..ca609f0 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -128,7 +128,7 @@ static unsigned optimize_object_size(unsigned obj_size)
 
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -225,7 +225,7 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 
 struct mempool_populate_arg {
 	struct rte_mempool     *mp;
-	rte_mempool_obj_ctor_t *obj_init;
+	rte_mempool_obj_cb_t   *obj_init;
 	void                   *obj_init_arg;
 };
 
@@ -240,7 +240,7 @@ mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
 
 static void
 mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg)
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
 {
 	uint32_t elt_sz;
 	struct mempool_populate_arg arg;
@@ -431,7 +431,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a6b82cf..d35833a 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -323,6 +323,17 @@ void rte_mempool_check_cookies(const struct rte_mempool *mp,
 #endif /* RTE_LIBRTE_MEMPOOL_DEBUG */
 
 /**
+ * An object callback function for mempool.
+ *
+ * Arguments are the mempool, the opaque pointer given by the user in
+ * rte_mempool_create(), the pointer to the element and the index of
+ * the element in the pool.
+ */
+typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
+		void *opaque, void *obj, unsigned obj_idx);
+typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
+
+/**
  * A mempool object iterator callback function.
  */
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
@@ -371,16 +382,6 @@ uint32_t rte_mempool_obj_iter(void *vaddr,
 	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
 
 /**
- * An object constructor callback function for mempool.
- *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
- */
-typedef void (rte_mempool_obj_ctor_t)(struct rte_mempool *, void *,
-				      void *, unsigned);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -473,7 +474,7 @@ struct rte_mempool *
 rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		   unsigned cache_size, unsigned private_data_size,
 		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		   int socket_id, unsigned flags);
 
 /**
@@ -539,7 +540,7 @@ struct rte_mempool *
 rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags, void *vaddr,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
@@ -628,7 +629,7 @@ struct rte_mempool *
 rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 		unsigned cache_size, unsigned private_data_size,
 		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
 		int socket_id, unsigned flags);
 
 
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 06/35] mempool: list objects when added
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (4 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 05/35] mempool: rename object constructor typedef Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 07/35] mempool: remove const qualifier when browsing pools Olivier Matz
                       ` (29 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce a list entry in the object header so that objects can be
listed and browsed. The objective is to provide a simpler way to
browse the elements of a mempool.

The next commits will update rte_mempool_obj_iter() to use this list,
and remove the previous, more complex implementation.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c |  2 ++
 lib/librte_mempool/rte_mempool.h | 15 ++++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ca609f0..c29a4c7 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -138,6 +138,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -587,6 +588,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_size = cache_size;
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
+	STAILQ_INIT(&mp->elt_list);
 
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index d35833a..acc75f8 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -150,11 +150,13 @@ struct rte_mempool_objsz {
  * Mempool object header structure
  *
  * Each object stored in mempools are prefixed by this header structure,
- * it allows to retrieve the mempool pointer from the object. When debug
- * is enabled, a cookie is also added in this structure preventing
- * corruptions and double-frees.
+ * it allows to retrieve the mempool pointer from the object and to
+ * iterate on all objects attached to a mempool. When debug is enabled,
+ * a cookie is also added in this structure preventing corruptions and
+ * double-frees.
  */
 struct rte_mempool_objhdr {
+	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
@@ -162,6 +164,11 @@ struct rte_mempool_objhdr {
 };
 
 /**
+ * A list of object headers type
+ */
+STAILQ_HEAD(rte_mempool_objhdr_list, rte_mempool_objhdr);
+
+/**
  * Mempool object trailer structure
  *
  * In debug mode, each object stored in mempools are suffixed by this
@@ -194,6 +201,8 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 07/35] mempool: remove const qualifier when browsing pools
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (5 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 06/35] mempool: list objects when added Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 08/35] mempool: remove const qualifier in dump and audit Olivier Matz
                       ` (28 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This commit removes the const qualifier from the mempool argument in
the rte_mempool_walk() callback prototype.

Indeed, most operations that can be done on a mempool require a
non-const mempool pointer, except the dump and the audit. Therefore,
rte_mempool_walk() is more useful if the mempool pointer is not const.

This is required by the next commit, where the Mellanox drivers use
rte_mempool_walk() to iterate over the mempools, then
rte_mempool_obj_iter() to iterate over the objects in each mempool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c          | 2 +-
 drivers/net/mlx5/mlx5_rxtx.c     | 2 +-
 drivers/net/mlx5/mlx5_rxtx.h     | 2 +-
 lib/librte_mempool/rte_mempool.c | 2 +-
 lib/librte_mempool/rte_mempool.h | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index c5d8535..2dcb5b4 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1369,7 +1369,7 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 static void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 1832a21..73a9a1c 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -311,7 +311,7 @@ txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
  *   Pointer to TX queue structure.
  */
 void
-txq_mp2mr_iter(const struct rte_mempool *mp, void *arg)
+txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index 0e2b607..db054d6 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -342,7 +342,7 @@ uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
 /* mlx5_rxtx.c */
 
 struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *);
-void txq_mp2mr_iter(const struct rte_mempool *, void *);
+void txq_mp2mr_iter(struct rte_mempool *, void *);
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst_sp(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst(void *, struct rte_mbuf **, uint16_t);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index c29a4c7..8c220ac 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -971,7 +971,7 @@ rte_mempool_lookup(const char *name)
 	return mp;
 }
 
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
 		      void *arg)
 {
 	struct rte_tailq_entry *te = NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index acc75f8..95f7505 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1309,7 +1309,7 @@ ssize_t rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num,
  * @param arg
  *   Argument passed to iterator
  */
-void rte_mempool_walk(void (*func)(const struct rte_mempool *, void *arg),
+void rte_mempool_walk(void (*func)(struct rte_mempool *, void *arg),
 		      void *arg);
 
 #ifdef __cplusplus
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 08/35] mempool: remove const qualifier in dump and audit
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (6 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 07/35] mempool: remove const qualifier when browsing pools Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 09/35] mempool: use the list to iterate the elements Olivier Matz
                       ` (27 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

In the next commits, we will use an iterator to walk through the
objects of a mempool in rte_mempool_audit(). This iterator takes a
"struct rte_mempool *" as a parameter because it is assumed that the
callback function can modify the mempool.

The previous approach was to introduce an RTE_DECONST() macro, but
after discussion it appeared that removing the const qualifier is
better: it avoids fooling the compiler, and these functions are not
used in the datapath anyway (possible compiler optimizations due to
const are not critical).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 8 ++++----
 lib/librte_mempool/rte_mempool.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 8c220ac..ad1895d 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -787,7 +787,7 @@ mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
 }
 
 static void
-mempool_audit_cookies(const struct rte_mempool *mp)
+mempool_audit_cookies(struct rte_mempool *mp)
 {
 	uint32_t elt_sz, num;
 	struct mempool_audit_arg arg;
@@ -845,7 +845,7 @@ mempool_audit_cache(const struct rte_mempool *mp)
 
 /* check the consistency of mempool (size, cookies, ...) */
 void
-rte_mempool_audit(const struct rte_mempool *mp)
+rte_mempool_audit(struct rte_mempool *mp)
 {
 	mempool_audit_cache(mp);
 	mempool_audit_cookies(mp);
@@ -856,7 +856,7 @@ rte_mempool_audit(const struct rte_mempool *mp)
 
 /* dump the status of the mempool on the console */
 void
-rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
+rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 {
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	struct rte_mempool_debug_stats sum;
@@ -927,7 +927,7 @@ rte_mempool_dump(FILE *f, const struct rte_mempool *mp)
 void
 rte_mempool_list_dump(FILE *f)
 {
-	const struct rte_mempool *mp = NULL;
+	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te;
 	struct rte_mempool_list *mempool_list;
 
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 95f7505..c7a2fd6 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -650,7 +650,7 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
  * @param mp
  *   A pointer to the mempool structure.
  */
-void rte_mempool_dump(FILE *f, const struct rte_mempool *mp);
+void rte_mempool_dump(FILE *f, struct rte_mempool *mp);
 
 /**
  * @internal Put several objects back in the mempool; used internally.
@@ -1188,7 +1188,7 @@ rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
  * @param mp
  *   A pointer to the mempool structure.
  */
-void rte_mempool_audit(const struct rte_mempool *mp);
+void rte_mempool_audit(struct rte_mempool *mp);
 
 /**
  * Return a pointer to the private data in an mempool structure.
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 09/35] mempool: use the list to iterate the elements
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (7 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 08/35] mempool: remove const qualifier in dump and audit Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 10/35] mempool: use the list to audit all elements Olivier Matz
                       ` (26 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of the rte_mempool_obj_iter() API,
which no longer needs to take as many arguments as before. The
previous function is kept as a private function and renamed in this
commit; it will be removed in a later commit of the patch series.

The only internal users of this function are the Mellanox drivers. The
code is updated accordingly.

Introducing API compatibility for this function has been considered,
but it is not easy to do without keeping the old code, as the previous
function could also be used to browse elements that were not added to
a mempool. Moreover, the API is already broken by other patches in
this version.

The library version was already updated in
commit 213af31e0960 ("mempool: reduce structure size if no cache needed")
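
For reference, here is a minimal usage sketch of the reworked API (the
callback name and body are illustrative only):

  /* callback matching rte_mempool_obj_cb_t: count the objects */
  static void
  count_obj(struct rte_mempool *mp __rte_unused, void *arg,
          void *obj __rte_unused, unsigned obj_idx __rte_unused)
  {
          unsigned *count = arg;

          (*count)++;
  }

  /* ... */
  unsigned count = 0;
  rte_mempool_obj_iter(mp, count_obj, &count);

Note that rte_mempool_obj_iter() already returns the number of objects
browsed; the counter above only illustrates how the opaque argument is
passed to the callback.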

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c                    | 51 +++++++--------------
 drivers/net/mlx5/mlx5_rxtx.c               | 51 +++++++--------------
 lib/librte_mempool/rte_mempool.c           | 37 +++++++++++++---
 lib/librte_mempool/rte_mempool.h           | 71 +++++++++---------------------
 lib/librte_mempool/rte_mempool_version.map |  2 +-
 5 files changed, 83 insertions(+), 129 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 2dcb5b4..ce518cf 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1320,7 +1320,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -1328,34 +1327,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	uint32_t index __rte_unused)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -1373,24 +1364,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index 73a9a1c..f2fe98b 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -262,7 +262,6 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 }
 
 struct txq_mp2mr_mbuf_check_data {
-	const struct rte_mempool *mp;
 	int ret;
 };
 
@@ -270,34 +269,26 @@ struct txq_mp2mr_mbuf_check_data {
  * Callback function for rte_mempool_obj_iter() to check whether a given
  * mempool object looks like a mbuf.
  *
- * @param[in, out] arg
- *   Context data (struct txq_mp2mr_mbuf_check_data). Contains mempool pointer
- *   and return value.
- * @param[in] start
- *   Object start address.
- * @param[in] end
- *   Object end address.
+ * @param[in] mp
+ *   The mempool pointer
+ * @param[in] arg
+ *   Context data (struct txq_mp2mr_mbuf_check_data). Contains the
+ *   return value.
+ * @param[in] obj
+ *   Object address.
  * @param index
- *   Unused.
- *
- * @return
- *   Nonzero value when object is not a mbuf.
+ *   Object index, unused.
  */
 static void
-txq_mp2mr_mbuf_check(void *arg, void *start, void *end,
-		     uint32_t index __rte_unused)
+txq_mp2mr_mbuf_check(struct rte_mempool *mp, void *arg, void *obj,
+	uint32_t index __rte_unused)
 {
 	struct txq_mp2mr_mbuf_check_data *data = arg;
-	struct rte_mbuf *buf =
-		(void *)((uintptr_t)start + data->mp->header_size);
+	struct rte_mbuf *buf = obj;
 
-	(void)index;
 	/* Check whether mbuf structure fits element size and whether mempool
 	 * pointer is valid. */
-	if (((uintptr_t)end >= (uintptr_t)(buf + 1)) &&
-	    (buf->pool == data->mp))
-		data->ret = 0;
-	else
+	if (sizeof(*buf) > mp->elt_size || buf->pool != mp)
 		data->ret = -1;
 }
 
@@ -315,24 +306,12 @@ txq_mp2mr_iter(struct rte_mempool *mp, void *arg)
 {
 	struct txq *txq = arg;
 	struct txq_mp2mr_mbuf_check_data data = {
-		.mp = mp,
-		.ret = -1,
+		.ret = 0,
 	};
 
-	/* Discard empty mempools. */
-	if (mp->size == 0)
-		return;
 	/* Register mempool only if the first element looks like a mbuf. */
-	rte_mempool_obj_iter((void *)mp->elt_va_start,
-			     1,
-			     mp->header_size + mp->elt_size + mp->trailer_size,
-			     1,
-			     mp->elt_pa,
-			     mp->pg_num,
-			     mp->pg_shift,
-			     txq_mp2mr_mbuf_check,
-			     &data);
-	if (data.ret)
+	if (rte_mempool_obj_iter(mp, txq_mp2mr_mbuf_check, &data) == 0 ||
+			data.ret == -1)
 		return;
 	txq_mp2mr(txq, mp);
 }
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ad1895d..256ec04 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -126,6 +127,14 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
+/**
+ * A mempool object iterator callback function.
+ */
+typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
+	void * /*obj_start*/,
+	void * /*obj_end*/,
+	uint32_t /*obj_index */);
+
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
@@ -161,8 +170,8 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
  * chunk, invoke a callback. It returns the effective number of objects
  * in this memory.
  */
-uint32_t
-rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
+static uint32_t
+rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
 	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
 {
@@ -220,6 +229,24 @@ rte_mempool_obj_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	return i;
 }
 
+/* call obj_cb() for each mempool element */
+uint32_t
+rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg)
+{
+	struct rte_mempool_objhdr *hdr;
+	void *obj;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->elt_list, next) {
+		obj = (char *)hdr + sizeof(*hdr);
+		obj_cb(mp, obj_cb_arg, obj, n);
+		n++;
+	}
+
+	return n;
+}
+
 /*
  * Populate  mempool with the objects.
  */
@@ -251,7 +278,7 @@ mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
 	arg.obj_init = obj_init;
 	arg.obj_init_arg = obj_init_arg;
 
-	mp->size = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_populate, &arg);
@@ -366,7 +393,7 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	va = (uintptr_t)vaddr;
 	uv = va;
 
-	if ((n = rte_mempool_obj_iter(vaddr, elt_num, total_elt_sz, 1,
+	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
 			paddr, pg_num, pg_shift, mempool_lelem_iter,
 			&uv)) != elt_num) {
 		return -(ssize_t)n;
@@ -798,7 +825,7 @@ mempool_audit_cookies(struct rte_mempool *mp)
 	arg.obj_end = mp->elt_va_start;
 	arg.obj_num = 0;
 
-	num = rte_mempool_obj_iter((void *)mp->elt_va_start,
+	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		mp->size, elt_sz, 1,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
 		mempool_obj_audit, &arg);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c7a2fd6..bdb217b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -2,6 +2,7 @@
  *   BSD LICENSE
  *
  *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2016 6WIND S.A.
  *   All rights reserved.
  *
  *   Redistribution and use in source and binary forms, with or without
@@ -334,63 +335,13 @@ void rte_mempool_check_cookies(const struct rte_mempool *mp,
 /**
  * An object callback function for mempool.
  *
- * Arguments are the mempool, the opaque pointer given by the user in
- * rte_mempool_create(), the pointer to the element and the index of
- * the element in the pool.
+ * Used by rte_mempool_create() and rte_mempool_obj_iter().
  */
 typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 		void *opaque, void *obj, unsigned obj_idx);
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */);
-
-/**
- * Call a function for each mempool object in a memory chunk
- *
- * Iterate across objects of the given size and alignment in the
- * provided chunk of memory. The given memory buffer can consist of
- * disjointed physical pages.
- *
- * For each object, call the provided callback (if any). This function
- * is used to populate a mempool, or walk through all the elements of a
- * mempool, or estimate how many elements of the given size could be
- * created in the given memory buffer.
- *
- * @param vaddr
- *   Virtual address of the memory buffer.
- * @param elt_num
- *   Maximum number of objects to iterate through.
- * @param elt_sz
- *   Size of each object.
- * @param align
- *   Alignment of each object.
- * @param paddr
- *   Array of physical addresses of the pages that comprises given memory
- *   buffer.
- * @param pg_num
- *   Number of elements in the paddr array.
- * @param pg_shift
- *   LOG2 of the physical pages size.
- * @param obj_iter
- *   Object iterator callback function (could be NULL).
- * @param obj_iter_arg
- *   User defined parameter for the object iterator callback function.
- *
- * @return
- *   Number of objects iterated through.
- */
-uint32_t rte_mempool_obj_iter(void *vaddr,
-	uint32_t elt_num, size_t elt_sz, size_t align,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
-	rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg);
-
-/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -643,6 +594,24 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
 
 
 /**
+ * Call a function for each mempool element
+ *
+ * Iterate across all objects attached to a rte_mempool and call the
+ * callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param obj_cb
+ *   A function pointer that is called for each object.
+ * @param obj_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of objects iterated.
+ */
+uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
+	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index ff80ac2..72bc967 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -9,7 +9,6 @@ DPDK_2.0 {
 	rte_mempool_dump;
 	rte_mempool_list_dump;
 	rte_mempool_lookup;
-	rte_mempool_obj_iter;
 	rte_mempool_walk;
 	rte_mempool_xmem_create;
 	rte_mempool_xmem_size;
@@ -22,6 +21,7 @@ DPDK_16.07 {
 	global:
 
 	rte_mempool_check_cookies;
+	rte_mempool_obj_iter;
 
 	local: *;
 } DPDK_2.0;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 10/35] mempool: use the list to audit all elements
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (8 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 09/35] mempool: use the list to iterate the elements Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 11/35] mempool: use the list to initialize objects Olivier Matz
                       ` (25 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Use the new rte_mempool_obj_iter() instead of the old
rte_mempool_obj_mem_iter() to iterate over the objects and audit them
(check their cookies).
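
For illustration, here is a minimal sketch of a counting callback
written against the new iterator. Only rte_mempool_obj_iter() and the
rte_mempool_obj_cb_t signature come from this series; the rest is
hypothetical:

  /* Count the elements of a mempool with the per-object iterator.
   * The callback signature matches rte_mempool_obj_cb_t. */
  static void
  count_obj(struct rte_mempool *mp, void *opaque, void *obj, unsigned idx)
  {
          unsigned *count = opaque;

          (void)mp; (void)obj; (void)idx;
          (*count)++;
  }

  /* somewhere with an initialized mempool mp: */
  unsigned count = 0;
  rte_mempool_obj_iter(mp, count_obj, &count);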

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 41 ++++++----------------------------------
 1 file changed, 6 insertions(+), 35 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 256ec04..6847fc4 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -721,12 +721,6 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
 
-struct mempool_audit_arg {
-	const struct rte_mempool *mp;
-	uintptr_t obj_end;
-	uint32_t obj_num;
-};
-
 /* check and update cookies or panic (internal) */
 void rte_mempool_check_cookies(const struct rte_mempool *mp,
 	void * const *obj_table_const, unsigned n, int free)
@@ -802,45 +796,22 @@ void rte_mempool_check_cookies(const struct rte_mempool *mp,
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 static void
-mempool_obj_audit(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_audit(struct rte_mempool *mp, __rte_unused void *opaque,
+	void *obj, __rte_unused unsigned idx)
 {
-	struct mempool_audit_arg *pa = arg;
-	void *obj;
-
-	obj = (char *)start + pa->mp->header_size;
-	pa->obj_end = (uintptr_t)end;
-	pa->obj_num = idx + 1;
-	__mempool_check_cookies(pa->mp, &obj, 1, 2);
+	__mempool_check_cookies(mp, &obj, 1, 2);
 }
 
 static void
 mempool_audit_cookies(struct rte_mempool *mp)
 {
-	uint32_t elt_sz, num;
-	struct mempool_audit_arg arg;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	arg.mp = mp;
-	arg.obj_end = mp->elt_va_start;
-	arg.obj_num = 0;
-
-	num = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		mp->size, elt_sz, 1,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_audit, &arg);
+	unsigned num;
 
+	num = rte_mempool_obj_iter(mp, mempool_obj_audit, NULL);
 	if (num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
+		rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
 			"iterated only over %u elements\n",
 			mp, mp->size, num);
-	} else if (arg.obj_end != mp->elt_va_end || arg.obj_num != mp->size) {
-			rte_panic("rte_mempool_obj_iter(mempool=%p, size=%u) "
-			"last callback va_end: %#tx (%#tx expeceted), "
-			"num of objects: %u (%u expected)\n",
-			mp, mp->size,
-			arg.obj_end, mp->elt_va_end,
-			arg.obj_num, mp->size);
 	}
 }
 #else
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 11/35] mempool: use the list to initialize objects
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (9 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 10/35] mempool: use the list to audit all elements Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 12/35] mempool: create internal ring in a specific function Olivier Matz
                       ` (24 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Before this patch, the mempool elements were initialized at the time
they were added to the mempool. This patch changes that: all objects
are now initialized in a single pass once the mempool is populated,
using the rte_mempool_obj_iter() introduced in the previous commits.

Thanks to this modification, we are getting closer to a new API
that would allow the following sequence (see the sketch after the list):
  mempool_init()
  mempool_populate(mem1)
  mempool_populate(mem2)
  mempool_populate(mem3)
  mempool_init_obj()
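
For now, the commit takes the first step of that decoupling;
simplified from the diff below, rte_mempool_xmem_create() internally
becomes:

  /* populate first, then run the object initializer over the
   * element list in a separate pass */
  mempool_populate(mp, n, 1);
  if (obj_init)
          rte_mempool_obj_iter(mp, obj_init, obj_init_arg);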

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 6847fc4..3c7507f 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -136,8 +136,7 @@ typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	uint32_t /*obj_index */);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_add_elem(struct rte_mempool *mp, void *obj)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
@@ -154,9 +153,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, uint32_t obj_idx,
 	tlr = __mempool_get_trailer(obj);
 	tlr->cookie = RTE_MEMPOOL_TRAILER_COOKIE;
 #endif
-	/* call the initializer */
-	if (obj_init)
-		obj_init(mp, obj_init_arg, obj, obj_idx);
 
 	/* enqueue in ring */
 	rte_ring_sp_enqueue(mp->ring, obj);
@@ -251,37 +247,27 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
  * Populate  mempool with the objects.
  */
 
-struct mempool_populate_arg {
-	struct rte_mempool     *mp;
-	rte_mempool_obj_cb_t   *obj_init;
-	void                   *obj_init_arg;
-};
-
 static void
-mempool_obj_populate(void *arg, void *start, void *end, uint32_t idx)
+mempool_obj_populate(void *arg, void *start, void *end,
+	__rte_unused uint32_t idx)
 {
-	struct mempool_populate_arg *pa = arg;
+	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(pa->mp, start, idx, pa->obj_init, pa->obj_init_arg);
-	pa->mp->elt_va_end = (uintptr_t)end;
+	mempool_add_elem(mp, start);
+	mp->elt_va_end = (uintptr_t)end;
 }
 
 static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg)
+mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
 {
 	uint32_t elt_sz;
-	struct mempool_populate_arg arg;
 
 	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-	arg.mp = mp;
-	arg.obj_init = obj_init;
-	arg.obj_init_arg = obj_init_arg;
 
 	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
 		num, elt_sz, align,
 		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, &arg);
+		mempool_obj_populate, mp);
 }
 
 /* get the header, trailer and total size of a mempool element. */
@@ -651,7 +637,11 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	mempool_populate(mp, n, 1, obj_init, obj_init_arg);
+	mempool_populate(mp, n, 1);
+
+	/* call the initializer */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	te->data = (void *) mp;
 
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 12/35] mempool: create internal ring in a specific function
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (10 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 11/35] mempool: use the list to initialize objects Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 13/35] mempool: store physical address in objects Olivier Matz
                       ` (23 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This makes the code of rte_mempool_create() clearer, and it will ease
the introduction of an external mempool handler (in another patch
series). This new function isolates the ring-specific part of the
creation, which could be replaced by another backend in the future.

This commit also adds a socket_id field in the mempool structure,
used by this new function.
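
As a hedged sketch of where this is heading, a different backend could
later be selected at the very same point; the stack handler and its
flag below are entirely invented for the example:

  /* Illustrative only: rte_mempool_ring_create() is now the single
   * place where the ring backend is instantiated, so a future handler
   * could be selected here. MEMPOOL_F_USE_STACK and stack_create()
   * are hypothetical. */
  static int
  mempool_backend_create(struct rte_mempool *mp)
  {
          if (mp->flags & MEMPOOL_F_USE_STACK)
                  return stack_create(mp);
          return rte_mempool_ring_create(mp);
  }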

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 56 ++++++++++++++++++++++++++--------------
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 3c7507f..61e191e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -434,6 +434,36 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 					       MEMPOOL_PG_SHIFT_MAX);
 }
 
+/* create the internal ring */
+static int
+rte_mempool_ring_create(struct rte_mempool *mp)
+{
+	int rg_flags = 0;
+	char rg_name[RTE_RING_NAMESIZE];
+	struct rte_ring *r;
+
+	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+
+	/* ring flags */
+	if (mp->flags & MEMPOOL_F_SP_PUT)
+		rg_flags |= RING_F_SP_ENQ;
+	if (mp->flags & MEMPOOL_F_SC_GET)
+		rg_flags |= RING_F_SC_DEQ;
+
+	/* Allocate the ring that will be used to store objects.
+	 * Ring functions will return appropriate errors if we are
+	 * running as a secondary process etc., so no checks made
+	 * in this function for that condition.
+	 */
+	r = rte_ring_create(rg_name, rte_align32pow2(mp->size + 1),
+		mp->socket_id, rg_flags);
+	if (r == NULL)
+		return -rte_errno;
+
+	mp->ring = r;
+	return 0;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -450,15 +480,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
-	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
 	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te = NULL;
-	struct rte_ring *r = NULL;
 	const struct rte_memzone *mz;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	int rg_flags = 0;
 	void *obj;
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
@@ -501,12 +528,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
 
-	/* ring flags */
-	if (flags & MEMPOOL_F_SP_PUT)
-		rg_flags |= RING_F_SP_ENQ;
-	if (flags & MEMPOOL_F_SC_GET)
-		rg_flags |= RING_F_SC_DEQ;
-
 	/* calculate mempool object sizes. */
 	if (!rte_mempool_calc_obj_size(elt_size, flags, &objsz)) {
 		rte_errno = EINVAL;
@@ -515,15 +536,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 	rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK);
 
-	/* allocate the ring that will be used to store objects */
-	/* Ring functions will return appropriate errors if we are
-	 * running as a secondary process etc., so no checks made
-	 * in this function for that condition */
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, name);
-	r = rte_ring_create(rg_name, rte_align32pow2(n+1), socket_id, rg_flags);
-	if (r == NULL)
-		goto exit_unlock;
-
 	/*
 	 * reserve a memory zone for this mempool: private data is
 	 * cache-aligned
@@ -592,7 +604,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
-	mp->ring = r;
+	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
@@ -603,6 +615,9 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
 
+	if (rte_mempool_ring_create(mp) < 0)
+		goto exit_unlock;
+
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
@@ -654,7 +669,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	rte_ring_free(r);
+	if (mp != NULL)
+		rte_ring_free(mp->ring);
 	rte_free(te);
 
 	return NULL;
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index bdb217b..12215f6 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -189,6 +189,7 @@ struct rte_mempool {
 	struct rte_ring *ring;           /**< Ring to store objects. */
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
+	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 13/35] mempool: store physical address in objects
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (11 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 12/35] mempool: create internal ring in a specific function Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-25 17:51       ` Jain, Deepak K
  2016-05-18 11:04     ` [PATCH v3 14/35] mempool: remove macro to check if contiguous Olivier Matz
                       ` (22 subsequent siblings)
  35 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Store the physical address of the object in its header. This
simplifies rte_mempool_virt2phy() and prepares for the removal of the
paddr[] table from the mempool header.
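
A minimal usage sketch, assuming an already created mempool mp: with
the address cached in the object header, the translation becomes a
constant-time load instead of a page-table lookup.

  void *obj;
  phys_addr_t pa;

  if (rte_mempool_get(mp, &obj) == 0) {
          pa = rte_mempool_virt2phy(mp, obj); /* reads hdr->physaddr */
          /* ... program a DMA descriptor with pa ... */
          rte_mempool_put(mp, obj);
  }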

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
 lib/librte_mempool/rte_mempool.h | 11 ++++++-----
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 61e191e..ce12db5 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -133,19 +133,22 @@ static unsigned optimize_object_size(unsigned obj_size)
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	void * /*obj_start*/,
 	void * /*obj_end*/,
-	uint32_t /*obj_index */);
+	uint32_t /*obj_index */,
+	phys_addr_t /*physaddr*/);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj)
+mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
 	obj = (char *)obj + mp->header_size;
+	physaddr += mp->header_size;
 
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
@@ -175,6 +178,7 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pgn, pgf;
 	uintptr_t end, start, va;
 	uintptr_t pg_sz;
+	phys_addr_t physaddr;
 
 	pg_sz = (uintptr_t)1 << pg_shift;
 	va = (uintptr_t)vaddr;
@@ -210,9 +214,10 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 		 * otherwise, just skip that chunk unused.
 		 */
 		if (k == pgn) {
+			physaddr = paddr[k] + (start & (pg_sz - 1));
 			if (obj_iter != NULL)
 				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i);
+					(void *)end, i, physaddr);
 			va = end;
 			j += pgf;
 			i++;
@@ -249,11 +254,11 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 
 static void
 mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, phys_addr_t physaddr)
 {
 	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(mp, start);
+	mempool_add_elem(mp, start, physaddr);
 	mp->elt_va_end = (uintptr_t)end;
 }
 
@@ -358,7 +363,7 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
  */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
 {
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 12215f6..4f95bdf 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -159,6 +159,7 @@ struct rte_mempool_objsz {
 struct rte_mempool_objhdr {
 	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
+	phys_addr_t physaddr;            /**< Physical address of the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
 #endif
@@ -1131,13 +1132,13 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   The physical address of the elt element.
  */
 static inline phys_addr_t
-rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
+rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
 	if (rte_eal_has_hugepages()) {
-		uintptr_t off;
-
-		off = (const char *)elt - (const char *)mp->elt_va_start;
-		return mp->elt_pa[off >> mp->pg_shift] + (off & mp->pg_mask);
+		const struct rte_mempool_objhdr *hdr;
+		hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
+			sizeof(*hdr));
+		return hdr->physaddr;
 	} else {
 		/*
 		 * If huge pages are disabled, we cannot assume the
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 14/35] mempool: remove macro to check if contiguous
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (12 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 13/35] mempool: store physical address in objects Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 15/35] mempool: store memory chunks in a list Olivier Matz
                       ` (21 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This commit removes MEMPOOL_IS_CONTIG().

The next commits will change the behavior of the mempool library so
that objects will never be allocated in the same memzone as the
mempool header. There is therefore no reason to keep this macro,
which would always return 0.

This macro was only used in app/test.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          | 7 +++----
 lib/librte_mempool/rte_mempool.h | 7 -------
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 10e1fa4..2f317f2 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -126,12 +126,11 @@ test_mempool_basic(void)
 			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
 		return -1;
 
+#ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
 	printf("get physical address of an object\n");
-	if (MEMPOOL_IS_CONTIG(mp) &&
-			rte_mempool_virt2phy(mp, obj) !=
-			(phys_addr_t) (mp->phys_addr +
-			(phys_addr_t) ((char*) obj - (char*) mp)))
+	if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
 		return -1;
+#endif
 
 	printf("put the object back\n");
 	rte_mempool_put(mp, obj);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 4f95bdf..a65b63f 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -274,13 +274,6 @@ struct rte_mempool {
 	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
-/**
- * Return true if the whole mempool is in contiguous memory.
- */
-#define	MEMPOOL_IS_CONTIG(mp)                      \
-	((mp)->pg_num == MEMPOOL_PG_NUM_DEFAULT && \
-	(mp)->phys_addr == (mp)->elt_pa[0])
-
 /* return the header of a mempool object (internal) */
 static inline struct rte_mempool_objhdr *__mempool_get_header(void *obj)
 {
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 15/35] mempool: store memory chunks in a list
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (13 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 14/35] mempool: remove macro to check if contiguous Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 16/35] mempool: add function to iterate the memory chunks Olivier Matz
                       ` (20 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Stop using the paddr table to store the mempool memory chunks.
This makes it possible to have several chunks with different virtual
addresses.
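
For example, the total memory attached to a pool can now be computed
by walking this list; a minimal sketch assuming a created mempool mp
(the updated dump code in the diff below does the same):

  struct rte_mempool_memhdr *memhdr;
  size_t mem_len = 0;

  STAILQ_FOREACH(memhdr, &mp->mem_list, next)
          mem_len += memhdr->len;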

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c          |   2 +-
 lib/librte_mempool/rte_mempool.c | 207 ++++++++++++++++++++++++++-------------
 lib/librte_mempool/rte_mempool.h |  53 +++++-----
 3 files changed, 167 insertions(+), 95 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 2f317f2..2bc3ac0 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -123,7 +123,7 @@ test_mempool_basic(void)
 
 	printf("get private data\n");
 	if (rte_mempool_get_priv(mp) != (char *)mp +
-			MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size))
+			MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
 		return -1;
 
 #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index ce12db5..9260318 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -142,14 +142,12 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
-	obj = (char *)obj + mp->header_size;
-	physaddr += mp->header_size;
-
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
 	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
+	mp->populated_size++;
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	hdr->cookie = RTE_MEMPOOL_HEADER_COOKIE2;
@@ -248,33 +246,6 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
-/*
- * Populate  mempool with the objects.
- */
-
-static void
-mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx, phys_addr_t physaddr)
-{
-	struct rte_mempool *mp = arg;
-
-	mempool_add_elem(mp, start, physaddr);
-	mp->elt_va_end = (uintptr_t)end;
-}
-
-static void
-mempool_populate(struct rte_mempool *mp, size_t num, size_t align)
-{
-	uint32_t elt_sz;
-
-	elt_sz = mp->elt_size + mp->header_size + mp->trailer_size;
-
-	mp->size = rte_mempool_obj_mem_iter((void *)mp->elt_va_start,
-		num, elt_sz, align,
-		mp->elt_pa, mp->pg_num, mp->pg_shift,
-		mempool_obj_populate, mp);
-}
-
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
@@ -469,6 +440,110 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* Free memory chunks used by a mempool. Objects must be in pool */
+static void
+rte_mempool_free_memchunks(struct rte_mempool *mp)
+{
+	struct rte_mempool_memhdr *memhdr;
+	void *elt;
+
+	while (!STAILQ_EMPTY(&mp->elt_list)) {
+		rte_ring_sc_dequeue(mp->ring, &elt);
+		(void)elt;
+		STAILQ_REMOVE_HEAD(&mp->elt_list, next);
+		mp->populated_size--;
+	}
+
+	while (!STAILQ_EMPTY(&mp->mem_list)) {
+		memhdr = STAILQ_FIRST(&mp->mem_list);
+		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		rte_free(memhdr);
+		mp->nb_mem_chunks--;
+	}
+}
+
+/* Add objects in the pool, using a physically contiguous memory
+ * zone. Return the number of objects added, or a negative value
+ * on error.
+ */
+static int
+rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len)
+{
+	unsigned total_elt_sz;
+	unsigned i = 0;
+	size_t off;
+	struct rte_mempool_memhdr *memhdr;
+
+	/* mempool is already populated */
+	if (mp->populated_size >= mp->size)
+		return -ENOSPC;
+
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+	memhdr = rte_zmalloc("MEMPOOL_MEMHDR", sizeof(*memhdr), 0);
+	if (memhdr == NULL)
+		return -ENOMEM;
+
+	memhdr->mp = mp;
+	memhdr->addr = vaddr;
+	memhdr->phys_addr = paddr;
+	memhdr->len = len;
+
+	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
+		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
+	else
+		off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr;
+
+	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
+		off += mp->header_size;
+		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		off += mp->elt_size + mp->trailer_size;
+		i++;
+	}
+
+	/* not enough room to store one object */
+	if (i == 0)
+		return -EINVAL;
+
+	STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
+	mp->nb_mem_chunks++;
+	return i;
+}
+
+/* Add objects in the pool, using a table of physical pages. Return the
+ * number of objects added, or a negative value on error.
+ */
+static int
+rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	uint32_t i, n;
+	int ret, cnt = 0;
+	size_t pg_sz = (size_t)1 << pg_shift;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
+
+		/* populate with the largest group of contiguous pages */
+		for (n = 1; (i + n) < pg_num &&
+			     paddr[i] + pg_sz == paddr[i+n]; n++)
+			;
+
+		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
+			paddr[i], n * pg_sz);
+		if (ret < 0) {
+			rte_mempool_free_memchunks(mp);
+			return ret;
+		}
+		cnt += ret;
+	}
+	return cnt;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -495,6 +570,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	struct rte_mempool_objsz objsz;
 	void *startaddr;
 	int page_size = getpagesize();
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -524,7 +600,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	}
 
 	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num < RTE_DIM(mp->elt_pa) || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
@@ -571,7 +647,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * store mempool objects. Otherwise reserve a memzone that is large
 	 * enough to hold mempool header and metadata plus mempool objects.
 	 */
-	mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size);
+	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 	if (vaddr == NULL)
@@ -619,6 +695,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
 	mp->private_data_size = private_data_size;
 	STAILQ_INIT(&mp->elt_list);
+	STAILQ_INIT(&mp->mem_list);
 
 	if (rte_mempool_ring_create(mp) < 0)
 		goto exit_unlock;
@@ -628,37 +705,31 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	 * The local_cache points to just past the elt_pa[] array.
 	 */
 	mp->local_cache = (struct rte_mempool_cache *)
-		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, pg_num, 0));
-
-	/* calculate address of the first element for continuous mempool. */
-	obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size) +
-		private_data_size;
-	obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
+		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* populate address translation fields. */
-	mp->pg_num = pg_num;
-	mp->pg_shift = pg_shift;
-	mp->pg_mask = RTE_LEN2MASK(mp->pg_shift, typeof(mp->pg_mask));
+	/* call the initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
 	if (vaddr == NULL) {
-		mp->elt_va_start = (uintptr_t)obj;
-		mp->elt_pa[0] = mp->phys_addr +
-			(mp->elt_va_start - (uintptr_t)mp);
+		/* calculate address of the first elt for continuous mempool. */
+		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
+			private_data_size;
+		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
+
+		ret = rte_mempool_populate_phys(mp, obj,
+			mp->phys_addr + ((char *)obj - (char *)mp),
+			objsz.total_size * n);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	} else {
-		/* mempool elements in a separate chunk of memory. */
-		mp->elt_va_start = (uintptr_t)vaddr;
-		memcpy(mp->elt_pa, paddr, sizeof (mp->elt_pa[0]) * pg_num);
+		ret = rte_mempool_populate_phys_tab(mp, vaddr,
+			paddr, pg_num, pg_shift);
+		if (ret != (int)mp->size)
+			goto exit_unlock;
 	}
 
-	mp->elt_va_end = mp->elt_va_start;
-
-	/* call the initializer */
-	if (mp_init)
-		mp_init(mp, mp_init_arg);
-
-	mempool_populate(mp, n, 1);
-
 	/* call the initializer */
 	if (obj_init)
 		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
@@ -674,8 +745,10 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL)
+	if (mp != NULL) {
+		rte_mempool_free_memchunks(mp);
 		rte_ring_free(mp->ring);
+	}
 	rte_free(te);
 
 	return NULL;
@@ -871,8 +944,10 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	struct rte_mempool_debug_stats sum;
 	unsigned lcore_id;
 #endif
+	struct rte_mempool_memhdr *memhdr;
 	unsigned common_count;
 	unsigned cache_count;
+	size_t mem_len = 0;
 
 	RTE_ASSERT(f != NULL);
 	RTE_ASSERT(mp != NULL);
@@ -881,7 +956,9 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
 	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
+	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
 	fprintf(f, "  header_size=%"PRIu32"\n", mp->header_size);
 	fprintf(f, "  elt_size=%"PRIu32"\n", mp->elt_size);
 	fprintf(f, "  trailer_size=%"PRIu32"\n", mp->trailer_size);
@@ -889,17 +966,13 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	       mp->header_size + mp->elt_size + mp->trailer_size);
 
 	fprintf(f, "  private_data_size=%"PRIu32"\n", mp->private_data_size);
-	fprintf(f, "  pg_num=%"PRIu32"\n", mp->pg_num);
-	fprintf(f, "  pg_shift=%"PRIu32"\n", mp->pg_shift);
-	fprintf(f, "  pg_mask=%#tx\n", mp->pg_mask);
-	fprintf(f, "  elt_va_start=%#tx\n", mp->elt_va_start);
-	fprintf(f, "  elt_va_end=%#tx\n", mp->elt_va_end);
-	fprintf(f, "  elt_pa[0]=0x%" PRIx64 "\n", mp->elt_pa[0]);
-
-	if (mp->size != 0)
+
+	STAILQ_FOREACH(memhdr, &mp->mem_list, next)
+		mem_len += memhdr->len;
+	if (mem_len != 0) {
 		fprintf(f, "  avg bytes/object=%#Lf\n",
-			(long double)(mp->elt_va_end - mp->elt_va_start) /
-			mp->size);
+			(long double)mem_len / mp->size);
+	}
 
 	cache_count = rte_mempool_dump_cache(f, mp);
 	common_count = rte_ring_count(mp->ring);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index a65b63f..690928e 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -183,6 +183,25 @@ struct rte_mempool_objtlr {
 };
 
 /**
+ * A list of memory where objects are stored
+ */
+STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
+
+/**
+ * Mempool objects memory header structure
+ *
+ * The memory chunks where objects are stored. Each chunk is virtually
+ * and physically contiguous.
+ */
+struct rte_mempool_memhdr {
+	STAILQ_ENTRY(rte_mempool_memhdr) next; /**< Next in list. */
+	struct rte_mempool *mp;  /**< The mempool owning the chunk */
+	void *addr;              /**< Virtual address of the chunk */
+	phys_addr_t phys_addr;   /**< Physical address of the chunk */
+	size_t len;              /**< length of the chunk */
+};
+
+/**
  * The RTE mempool structure.
  */
 struct rte_mempool {
@@ -191,7 +210,7 @@ struct rte_mempool {
 	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
-	uint32_t size;                   /**< Size of the mempool. */
+	uint32_t size;                   /**< Max size of the mempool. */
 	uint32_t cache_size;             /**< Size of per-lcore local cache. */
 	uint32_t cache_flushthresh;
 	/**< Threshold before we flush excess elements. */
@@ -204,26 +223,15 @@ struct rte_mempool {
 
 	struct rte_mempool_cache *local_cache; /**< Per-lcore local cache */
 
+	uint32_t populated_size;         /**< Number of populated objects. */
 	struct rte_mempool_objhdr_list elt_list; /**< List of objects in pool */
+	uint32_t nb_mem_chunks;          /**< Number of memory chunks */
+	struct rte_mempool_memhdr_list mem_list; /**< List of memory chunks */
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	/** Per-lcore statistics. */
 	struct rte_mempool_debug_stats stats[RTE_MAX_LCORE];
 #endif
-
-	/* Address translation support, starts from next cache line. */
-
-	/** Number of elements in the elt_pa array. */
-	uint32_t    pg_num __rte_cache_aligned;
-	uint32_t    pg_shift;     /**< LOG2 of the physical pages. */
-	uintptr_t   pg_mask;      /**< physical page mask value. */
-	uintptr_t   elt_va_start;
-	/**< Virtual address of the first mempool object. */
-	uintptr_t   elt_va_end;
-	/**< Virtual address of the <size + 1> mempool object. */
-	phys_addr_t elt_pa[MEMPOOL_PG_NUM_DEFAULT];
-	/**< Array of physical page addresses for the mempool objects buffer. */
-
 }  __rte_cache_aligned;
 
 #define MEMPOOL_F_NO_SPREAD      0x0001 /**< Do not spread among memory channels. */
@@ -254,24 +262,15 @@ struct rte_mempool {
 #endif
 
 /**
- * Size of elt_pa array size based on number of pages. (Internal use)
- */
-#define __PA_SIZE(mp, pgn) \
-	RTE_ALIGN_CEIL((((pgn) - RTE_DIM((mp)->elt_pa)) * \
-	sizeof((mp)->elt_pa[0])), RTE_CACHE_LINE_SIZE)
-
-/**
  * Calculate the size of the mempool header.
  *
  * @param mp
  *   Pointer to the memory pool.
- * @param pgn
- *   Number of pages used to store mempool objects.
  * @param cs
  *   Size of the per-lcore cache.
  */
-#define MEMPOOL_HEADER_SIZE(mp, pgn, cs) \
-	(sizeof(*(mp)) + __PA_SIZE(mp, pgn) + (((cs) == 0) ? 0 : \
+#define MEMPOOL_HEADER_SIZE(mp, cs) \
+	(sizeof(*(mp)) + (((cs) == 0) ? 0 : \
 	(sizeof(struct rte_mempool_cache) * RTE_MAX_LCORE)))
 
 /* return the header of a mempool object (internal) */
@@ -1165,7 +1164,7 @@ void rte_mempool_audit(struct rte_mempool *mp);
 static inline void *rte_mempool_get_priv(struct rte_mempool *mp)
 {
 	return (char *)mp +
-		MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size);
+		MEMPOOL_HEADER_SIZE(mp, mp->cache_size);
 }
 
 /**
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 16/35] mempool: add function to iterate the memory chunks
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (14 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 15/35] mempool: store memory chunks in a list Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 17/35] mempool: simplify the memory usage calculation Olivier Matz
                       ` (19 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Following the same model as rte_mempool_obj_iter(), introduce
rte_mempool_mem_iter() to iterate over the memory chunks attached
to the mempool.
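
A minimal usage sketch, assuming a created mempool mp; dump_chunk()
is a made-up callback matching the rte_mempool_mem_cb_t signature:

  static void
  dump_chunk(struct rte_mempool *mp, void *opaque,
             struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
  {
          (void)mp; (void)opaque;
          printf("chunk %u: addr=%p len=%zu\n",
                 mem_idx, memhdr->addr, memhdr->len);
  }

  /* with a created mempool mp: */
  rte_mempool_mem_iter(mp, dump_chunk, NULL);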

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           | 16 ++++++++++++++++
 lib/librte_mempool/rte_mempool.h           | 27 +++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |  1 +
 3 files changed, 44 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 9260318..cbf5f2b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -246,6 +246,22 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 	return n;
 }
 
+/* call mem_cb() for each mempool memory chunk */
+uint32_t
+rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg)
+{
+	struct rte_mempool_memhdr *hdr;
+	unsigned n = 0;
+
+	STAILQ_FOREACH(hdr, &mp->mem_list, next) {
+		mem_cb(mp, mem_cb_arg, hdr, n);
+		n++;
+	}
+
+	return n;
+}
+
 /* get the header, trailer and total size of a mempool element. */
 uint32_t
 rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 690928e..65455d1 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -336,6 +336,15 @@ typedef void (rte_mempool_obj_cb_t)(struct rte_mempool *mp,
 typedef rte_mempool_obj_cb_t rte_mempool_obj_ctor_t; /* compat */
 
 /**
+ * A memory callback function for mempool.
+ *
+ * Used by rte_mempool_mem_iter().
+ */
+typedef void (rte_mempool_mem_cb_t)(struct rte_mempool *mp,
+		void *opaque, struct rte_mempool_memhdr *memhdr,
+		unsigned mem_idx);
+
+/**
  * A mempool constructor callback function.
  *
  * Arguments are the mempool and the opaque pointer given by the user in
@@ -606,6 +615,24 @@ uint32_t rte_mempool_obj_iter(struct rte_mempool *mp,
 	rte_mempool_obj_cb_t *obj_cb, void *obj_cb_arg);
 
 /**
+ * Call a function for each mempool memory chunk
+ *
+ * Iterate across all memory chunks attached to a rte_mempool and call
+ * the callback function on it.
+ *
+ * @param mp
+ *   A pointer to an initialized mempool.
+ * @param mem_cb
+ *   A function pointer that is called for each memory chunk.
+ * @param mem_cb_arg
+ *   An opaque pointer passed to the callback function.
+ * @return
+ *   Number of memory chunks iterated.
+ */
+uint32_t rte_mempool_mem_iter(struct rte_mempool *mp,
+	rte_mempool_mem_cb_t *mem_cb, void *mem_cb_arg);
+
+/**
  * Dump the status of the mempool to the console.
  *
  * @param f
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 72bc967..7de9f8c 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -22,6 +22,7 @@ DPDK_16.07 {
 
 	rte_mempool_check_cookies;
 	rte_mempool_obj_iter;
+	rte_mempool_mem_iter;
 
 	local: *;
 } DPDK_2.0;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 17/35] mempool: simplify the memory usage calculation
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (15 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 16/35] mempool: add function to iterate the memory chunks Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 18/35] mempool: introduce a free callback for memory chunks Olivier Matz
                       ` (18 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

This commit simplifies rte_mempool_xmem_usage().

Since the previous commit, rte_mempool_xmem_usage() is the last user
of rte_mempool_obj_mem_iter(). That complex code can now be moved
inside the function: the callback disappears and the logic is
simplified to make it more readable.
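
A worked example of the simplified algorithm, with invented addresses
and an element size of 2500 bytes (header and trailer included):

  /* Three 4 KB pages, only the first two physically contiguous. */
  phys_addr_t paddr[] = { 0x100000, 0x101000, 0x200000 };
  ssize_t usage;

  /* Pages 1-2 form one 8 KB run: 3 objects of 2500 bytes fit there
   * (3 * 2500 = 7500); the 4th goes to page 3. All three pages are
   * used, so the call returns 3 << 12 = 12288. */
  usage = rte_mempool_xmem_usage(NULL, 4, 2500, paddr, 3, 12);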

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 140 +++++++++++----------------------------
 1 file changed, 37 insertions(+), 103 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index cbf5f2b..df8a527 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -127,15 +127,6 @@ static unsigned optimize_object_size(unsigned obj_size)
 	return new_obj_size * RTE_MEMPOOL_ALIGN;
 }
 
-/**
- * A mempool object iterator callback function.
- */
-typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
-	void * /*obj_start*/,
-	void * /*obj_end*/,
-	uint32_t /*obj_index */,
-	phys_addr_t /*physaddr*/);
-
 static void
 mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
@@ -159,75 +150,6 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 	rte_ring_sp_enqueue(mp->ring, obj);
 }
 
-/* Iterate through objects at the given address
- *
- * Given the pointer to the memory, and its topology in physical memory
- * (the physical addresses table), iterate through the "elt_num" objects
- * of size "elt_sz" aligned at "align". For each object in this memory
- * chunk, invoke a callback. It returns the effective number of objects
- * in this memory.
- */
-static uint32_t
-rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	size_t align, const phys_addr_t paddr[], uint32_t pg_num,
-	uint32_t pg_shift, rte_mempool_obj_iter_t obj_iter, void *obj_iter_arg)
-{
-	uint32_t i, j, k;
-	uint32_t pgn, pgf;
-	uintptr_t end, start, va;
-	uintptr_t pg_sz;
-	phys_addr_t physaddr;
-
-	pg_sz = (uintptr_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-
-	i = 0;
-	j = 0;
-
-	while (i != elt_num && j != pg_num) {
-
-		start = RTE_ALIGN_CEIL(va, align);
-		end = start + total_elt_sz;
-
-		/* index of the first page for the next element. */
-		pgf = (end >> pg_shift) - (start >> pg_shift);
-
-		/* index of the last page for the current element. */
-		pgn = ((end - 1) >> pg_shift) - (start >> pg_shift);
-		pgn += j;
-
-		/* do we have enough space left for the element. */
-		if (pgn >= pg_num)
-			break;
-
-		for (k = j;
-				k != pgn &&
-				paddr[k] + pg_sz == paddr[k + 1];
-				k++)
-			;
-
-		/*
-		 * if next pgn chunks of memory physically continuous,
-		 * use it to create next element.
-		 * otherwise, just skip that chunk unused.
-		 */
-		if (k == pgn) {
-			physaddr = paddr[k] + (start & (pg_sz - 1));
-			if (obj_iter != NULL)
-				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i, physaddr);
-			va = end;
-			j += pgf;
-			i++;
-		} else {
-			va = RTE_ALIGN_CEIL((va + 1), pg_sz);
-			j++;
-		}
-	}
-
-	return i;
-}
-
 /* call obj_cb() for each mempool element */
 uint32_t
 rte_mempool_obj_iter(struct rte_mempool *mp,
@@ -345,41 +267,53 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 	return sz;
 }
 
-/* Callback used by rte_mempool_xmem_usage(): it sets the opaque
- * argument to the end of the object.
- */
-static void
-mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
-{
-	*(uintptr_t *)arg = (uintptr_t)end;
-}
-
 /*
  * Calculate how much memory would be actually required with the
  * given memory footprint to store required number of elements.
  */
 ssize_t
-rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
+	size_t total_elt_sz, const phys_addr_t paddr[], uint32_t pg_num,
+	uint32_t pg_shift)
 {
-	uint32_t n;
-	uintptr_t va, uv;
-	size_t pg_sz, usz;
+	uint32_t elt_cnt = 0;
+	phys_addr_t start, end;
+	uint32_t paddr_idx;
+	size_t pg_sz = (size_t)1 << pg_shift;
 
-	pg_sz = (size_t)1 << pg_shift;
-	va = (uintptr_t)vaddr;
-	uv = va;
+	/* if paddr is NULL, assume contiguous memory */
+	if (paddr == NULL) {
+		start = 0;
+		end = pg_sz * pg_num;
+		paddr_idx = pg_num;
+	} else {
+		start = paddr[0];
+		end = paddr[0] + pg_sz;
+		paddr_idx = 1;
+	}
+	while (elt_cnt < elt_num) {
+
+		if (end - start >= total_elt_sz) {
+			/* enough contiguous memory, add an object */
+			start += total_elt_sz;
+			elt_cnt++;
+		} else if (paddr_idx < pg_num) {
+			/* no room to store one obj, add a page */
+			if (end == paddr[paddr_idx]) {
+				end += pg_sz;
+			} else {
+				start = paddr[paddr_idx];
+				end = paddr[paddr_idx] + pg_sz;
+			}
+			paddr_idx++;
 
-	if ((n = rte_mempool_obj_mem_iter(vaddr, elt_num, total_elt_sz, 1,
-			paddr, pg_num, pg_shift, mempool_lelem_iter,
-			&uv)) != elt_num) {
-		return -(ssize_t)n;
+		} else {
+			/* no more page, return how many elements fit */
+			return -(size_t)elt_cnt;
+		}
 	}
 
-	uv = RTE_ALIGN_CEIL(uv, pg_sz);
-	usz = uv - va;
-	return usz;
+	return (size_t)paddr_idx << pg_shift;
 }
 
 #ifndef RTE_LIBRTE_XEN_DOM0
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 18/35] mempool: introduce a free callback for memory chunks
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (16 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 17/35] mempool: simplify the memory usage calculation Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 19/35] mempool: get memory size with unspecified page size Olivier Matz
                       ` (17 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce a free callback that is passed to the populate* functions
and used when freeing a mempool. It is unused for now, but since the
next commits will populate the mempool with several chunks of memory,
we need a way to free them properly on error.

Later in the series, we will also introduce a public rte_mempool_free()
and the ability for the user to populate a mempool with its own memory.
For that, we also need a free callback.
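
A minimal sketch of such a callback, matching the
rte_mempool_memchunk_free_cb_t type introduced below; freeing
malloc'd memory is just one possible use:

  static void
  my_chunk_free(struct rte_mempool_memhdr *memhdr, void *opaque)
  {
          (void)memhdr;
          free(opaque); /* opaque was set to the malloc'd pointer */
  }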

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 27 ++++++++++++++++++++++-----
 lib/librte_mempool/rte_mempool.h |  8 ++++++++
 2 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index df8a527..0d18511 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -390,6 +390,15 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 	return 0;
 }
 
+/* free a memchunk allocated with rte_memzone_reserve() */
+__rte_unused static void
+rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	const struct rte_memzone *mz = opaque;
+	rte_memzone_free(mz);
+}
+
 /* Free memory chunks used by a mempool. Objects must be in pool */
 static void
 rte_mempool_free_memchunks(struct rte_mempool *mp)
@@ -407,6 +416,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
 	while (!STAILQ_EMPTY(&mp->mem_list)) {
 		memhdr = STAILQ_FIRST(&mp->mem_list);
 		STAILQ_REMOVE_HEAD(&mp->mem_list, next);
+		if (memhdr->free_cb != NULL)
+			memhdr->free_cb(memhdr, memhdr->opaque);
 		rte_free(memhdr);
 		mp->nb_mem_chunks--;
 	}
@@ -418,7 +429,8 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
  */
 static int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
-	phys_addr_t paddr, size_t len)
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
 {
 	unsigned total_elt_sz;
 	unsigned i = 0;
@@ -439,6 +451,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	memhdr->addr = vaddr;
 	memhdr->phys_addr = paddr;
 	memhdr->len = len;
+	memhdr->free_cb = free_cb;
+	memhdr->opaque = opaque;
 
 	if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		off = RTE_PTR_ALIGN_CEIL(vaddr, 8) - vaddr;
@@ -466,7 +480,8 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
  */
 static int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
-	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
 {
 	uint32_t i, n;
 	int ret, cnt = 0;
@@ -484,11 +499,13 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 			;
 
 		ret = rte_mempool_populate_phys(mp, vaddr + i * pg_sz,
-			paddr[i], n * pg_sz);
+			paddr[i], n * pg_sz, free_cb, opaque);
 		if (ret < 0) {
 			rte_mempool_free_memchunks(mp);
 			return ret;
 		}
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
 		cnt += ret;
 	}
 	return cnt;
@@ -670,12 +687,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 		ret = rte_mempool_populate_phys(mp, obj,
 			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n);
+			objsz.total_size * n, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	} else {
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift);
+			paddr, pg_num, pg_shift, NULL, NULL);
 		if (ret != (int)mp->size)
 			goto exit_unlock;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 65455d1..0f900a1 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -188,6 +188,12 @@ struct rte_mempool_objtlr {
 STAILQ_HEAD(rte_mempool_memhdr_list, rte_mempool_memhdr);
 
 /**
+ * Callback used to free a memory chunk
+ */
+typedef void (rte_mempool_memchunk_free_cb_t)(struct rte_mempool_memhdr *memhdr,
+	void *opaque);
+
+/**
  * Mempool objects memory header structure
  *
  * The memory chunks where objects are stored. Each chunk is virtually
@@ -199,6 +205,8 @@ struct rte_mempool_memhdr {
 	void *addr;              /**< Virtual address of the chunk */
 	phys_addr_t phys_addr;   /**< Physical address of the chunk */
 	size_t len;              /**< length of the chunk */
+	rte_mempool_memchunk_free_cb_t *free_cb; /**< Free callback */
+	void *opaque;            /**< Argument passed to the free callback */
 };
 
 /**
-- 
2.8.0.rc3

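An illustration of the free callback typedef added above: a user-provided
callback only has to match the rte_mempool_memchunk_free_cb_t signature.
A minimal sketch (my_memchunk_free is a hypothetical name; it assumes the
chunk was allocated with rte_malloc() and that the pointer was passed as
the opaque argument):

  /* hypothetical callback: frees a chunk obtained with rte_malloc();
   * the opaque argument is assumed to carry the allocated pointer */
  static void
  my_memchunk_free(struct rte_mempool_memhdr *memhdr, void *opaque)
  {
  	(void)memhdr;
  	rte_free(opaque);
  }
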
^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 19/35] mempool: get memory size with unspecified page size
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (17 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 18/35] mempool: introduce a free callback for memory chunks Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 20/35] mempool: allocate in several memory chunks by default Olivier Matz
                       ` (16 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Update rte_mempool_xmem_size() so that when the page_shift argument is
set to 0, it assumes that the memory is physically contiguous, allowing
page boundaries to be ignored. This will be used in the next commits.

While at it, rename the variable 'n' to 'obj_per_page' and avoid the
assignment inside the if().

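As a rough illustration of the two cases (the numbers below are only an
example, not taken from the patch):

  /* 1000 objects of total_elt_sz = 2176 bytes:
   * - pg_shift == 0: memory assumed physically contiguous,
   *   size = 2176 * 1000 = 2176000 bytes
   * - pg_shift == 21 (2MB pages): 963 objects fit in a page,
   *   so 2 pages are needed: size = 2 << 21 = 4194304 bytes */
  size_t contig = rte_mempool_xmem_size(1000, 2176, 0);
  size_t paged = rte_mempool_xmem_size(1000, 2176, 21);
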
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 18 +++++++++---------
 lib/librte_mempool/rte_mempool.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0d18511..d5278b4 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -253,18 +253,18 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 size_t
 rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
 {
-	size_t n, pg_num, pg_sz, sz;
+	size_t obj_per_page, pg_num, pg_sz;
 
-	pg_sz = (size_t)1 << pg_shift;
+	if (pg_shift == 0)
+		return total_elt_sz * elt_num;
 
-	if ((n = pg_sz / total_elt_sz) > 0) {
-		pg_num = (elt_num + n - 1) / n;
-		sz = pg_num << pg_shift;
-	} else {
-		sz = RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
-	}
+	pg_sz = (size_t)1 << pg_shift;
+	obj_per_page = pg_sz / total_elt_sz;
+	if (obj_per_page == 0)
+		return RTE_ALIGN_CEIL(total_elt_sz, pg_sz) * elt_num;
 
-	return sz;
+	pg_num = (elt_num + obj_per_page - 1) / obj_per_page;
+	return pg_num << pg_shift;
 }
 
 /*
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 0f900a1..53275e4 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1261,7 +1261,7 @@ uint32_t rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
  *   The size of each element, including header and trailer, as returned
  *   by rte_mempool_calc_obj_size().
  * @param pg_shift
- *   LOG2 of the physical pages size.
+ *   LOG2 of the physical pages size. If set to 0, ignore page boundaries.
  * @return
  *   Required memory size aligned at page boundary.
  */
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 20/35] mempool: allocate in several memory chunks by default
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (18 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 19/35] mempool: get memory size with unspecified page size Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-06-01  3:37       ` Ferruh Yigit
  2016-05-18 11:04     ` [PATCH v3 21/35] eal: lock memory when not using hugepages Olivier Matz
                       ` (15 subsequent siblings)
  35 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
much of the specific behavior that was required when hugepages are
not enabled and rte_mempool_xmem_create() is used.

This change requires updating how the kni and mellanox drivers look
up mbuf memory. For now, this only works if there is a single memory
chunk (as is the case today), but rte_mempool_mem_iter() could be
used to support more memory chunks, as sketched below.

We can also remove RTE_MEMPOOL_OBJ_NAME, which is no longer required
for the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (as it was before)
when there is no hugepage support (rte_mempool_xmem_create() has to
be used instead). This is fixed in the next commit.

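A driver wanting to support several chunks could visit them with
rte_mempool_mem_iter(), along these lines (a sketch, not part of this
patch; dump_chunk is a hypothetical name):

  /* callback matching rte_mempool_mem_cb_t */
  static void
  dump_chunk(struct rte_mempool *mp, void *opaque,
  	struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
  {
  	(void)mp;
  	(void)opaque;
  	printf("chunk %u: va=%p len=%zu\n",
  		mem_idx, memhdr->addr, memhdr->len);
  }

  /* visit each memory chunk of the mempool */
  rte_mempool_mem_iter(mp, dump_chunk, NULL);
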
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 drivers/net/mlx4/mlx4.c               |  87 +++++++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.c          |  87 +++++++++++++++++++++---
 drivers/net/mlx5/mlx5_rxtx.h          |   2 +-
 lib/librte_kni/rte_kni.c              |  12 +++-
 lib/librte_mempool/rte_dom0_mempool.c |   2 +-
 lib/librte_mempool/rte_mempool.c      | 120 +++++++++++++++++++---------------
 lib/librte_mempool/rte_mempool.h      |  11 ----
 7 files changed, 234 insertions(+), 87 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index ce518cf..080ab61 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -1198,8 +1198,71 @@ txq_complete(struct txq *txq)
 	return 0;
 }
 
+struct mlx4_check_mempool_data {
+	int ret;
+	char *start;
+	char *end;
+};
+
+/* Called by mlx4_check_mempool() when iterating the memory chunks. */
+static void mlx4_check_mempool_cb(struct rte_mempool *mp,
+	void *opaque, struct rte_mempool_memhdr *memhdr,
+	unsigned mem_idx)
+{
+	struct mlx4_check_mempool_data *data = opaque;
+
+	(void)mp;
+	(void)mem_idx;
+
+	/* It already failed, skip the next chunks. */
+	if (data->ret != 0)
+		return;
+	/* It is the first chunk. */
+	if (data->start == NULL && data->end == NULL) {
+		data->start = memhdr->addr;
+		data->end = data->start + memhdr->len;
+		return;
+	}
+	if (data->end == memhdr->addr) {
+		data->end += memhdr->len;
+		return;
+	}
+	if (data->start == (char *)memhdr->addr + memhdr->len) {
+		data->start -= memhdr->len;
+		return;
+	}
+	/* Error, mempool is not virtually contiguous. */
+	data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * @return
+ *   0 on success (mempool is virtually contiguous), -1 on error.
+ */
+static int mlx4_check_mempool(struct rte_mempool *mp, uintptr_t *start,
+	uintptr_t *end)
+{
+	struct mlx4_check_mempool_data data;
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_mem_iter(mp, mlx4_check_mempool_cb, &data);
+	*start = (uintptr_t)data.start;
+	*end = (uintptr_t)data.end;
+
+	return data.ret;
+}
+
 /* For best performance, this function should not be inlined. */
-static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, const struct rte_mempool *)
+static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, struct rte_mempool *)
 	__attribute__((noinline));
 
 /**
@@ -1214,15 +1277,21 @@ static struct ibv_mr *mlx4_mp2mr(struct ibv_pd *, const struct rte_mempool *)
  *   Memory region pointer, NULL in case of error.
  */
 static struct ibv_mr *
-mlx4_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
+mlx4_mp2mr(struct ibv_pd *pd, struct rte_mempool *mp)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	uintptr_t start = mp->elt_va_start;
-	uintptr_t end = mp->elt_va_end;
+	uintptr_t start;
+	uintptr_t end;
 	unsigned int i;
 
+	if (mlx4_check_mempool(mp, &start, &end) != 0) {
+		ERROR("mempool %p: not virtually contiguous",
+			(void *)mp);
+		return NULL;
+	}
+
 	DEBUG("mempool %p area start=%p end=%p size=%zu",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
 	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
@@ -1236,7 +1305,7 @@ mlx4_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
 			end = RTE_ALIGN_CEIL(end, align);
 	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	return ibv_reg_mr(pd,
 			  (void *)start,
@@ -1276,7 +1345,7 @@ txq_mb2mp(struct rte_mbuf *buf)
  *   mr->lkey on success, (uint32_t)-1 on failure.
  */
 static uint32_t
-txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
+txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
 {
 	unsigned int i;
 	struct ibv_mr *mr;
@@ -1294,7 +1363,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	}
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
-	      (void *)txq, mp->name, (const void *)mp);
+	      (void *)txq, mp->name, (void *)mp);
 	mr = mlx4_mp2mr(txq->priv->pd, mp);
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
@@ -1315,7 +1384,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	txq->mp2mr[i].mr = mr;
 	txq->mp2mr[i].lkey = mr->lkey;
 	DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
-	      (void *)txq, mp->name, (const void *)mp, txq->mp2mr[i].lkey);
+	      (void *)txq, mp->name, (void *)mp, txq->mp2mr[i].lkey);
 	return txq->mp2mr[i].lkey;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index f2fe98b..13c8d71 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -140,8 +140,71 @@ txq_complete(struct txq *txq)
 	return 0;
 }
 
+struct mlx5_check_mempool_data {
+	int ret;
+	char *start;
+	char *end;
+};
+
+/* Called by mlx5_check_mempool() when iterating the memory chunks. */
+static void mlx5_check_mempool_cb(struct rte_mempool *mp,
+	void *opaque, struct rte_mempool_memhdr *memhdr,
+	unsigned mem_idx)
+{
+	struct mlx5_check_mempool_data *data = opaque;
+
+	(void)mp;
+	(void)mem_idx;
+
+	/* It already failed, skip the next chunks. */
+	if (data->ret != 0)
+		return;
+	/* It is the first chunk. */
+	if (data->start == NULL && data->end == NULL) {
+		data->start = memhdr->addr;
+		data->end = data->start + memhdr->len;
+		return;
+	}
+	if (data->end == memhdr->addr) {
+		data->end += memhdr->len;
+		return;
+	}
+	if (data->start == (char *)memhdr->addr + memhdr->len) {
+		data->start -= memhdr->len;
+		return;
+	}
+	/* Error, mempool is not virtually contiguous. */
+	data->ret = -1;
+}
+
+/**
+ * Check if a mempool can be used: it must be virtually contiguous.
+ *
+ * @param[in] mp
+ *   Pointer to memory pool.
+ * @param[out] start
+ *   Pointer to the start address of the mempool virtual memory area
+ * @param[out] end
+ *   Pointer to the end address of the mempool virtual memory area
+ *
+ * @return
+ *   0 on success (mempool is virtually contiguous), -1 on error.
+ */
+static int mlx5_check_mempool(struct rte_mempool *mp, uintptr_t *start,
+	uintptr_t *end)
+{
+	struct mlx5_check_mempool_data data;
+
+	memset(&data, 0, sizeof(data));
+	rte_mempool_mem_iter(mp, mlx5_check_mempool_cb, &data);
+	*start = (uintptr_t)data.start;
+	*end = (uintptr_t)data.end;
+
+	return data.ret;
+}
+
 /* For best performance, this function should not be inlined. */
-struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *)
+struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, struct rte_mempool *)
 	__attribute__((noinline));
 
 /**
@@ -156,15 +219,21 @@ struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *)
  *   Memory region pointer, NULL in case of error.
  */
 struct ibv_mr *
-mlx5_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
+mlx5_mp2mr(struct ibv_pd *pd, struct rte_mempool *mp)
 {
 	const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-	uintptr_t start = mp->elt_va_start;
-	uintptr_t end = mp->elt_va_end;
+	uintptr_t start;
+	uintptr_t end;
 	unsigned int i;
 
+	if (mlx5_check_mempool(mp, &start, &end) != 0) {
+		ERROR("mempool %p: not virtually contiguous",
+			(void *)mp);
+		return NULL;
+	}
+
 	DEBUG("mempool %p area start=%p end=%p size=%zu",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	/* Round start and end to page boundary if found in memory segments. */
 	for (i = 0; (i < RTE_MAX_MEMSEG) && (ms[i].addr != NULL); ++i) {
@@ -178,7 +247,7 @@ mlx5_mp2mr(struct ibv_pd *pd, const struct rte_mempool *mp)
 			end = RTE_ALIGN_CEIL(end, align);
 	}
 	DEBUG("mempool %p using start=%p end=%p size=%zu for MR",
-	      (const void *)mp, (void *)start, (void *)end,
+	      (void *)mp, (void *)start, (void *)end,
 	      (size_t)(end - start));
 	return ibv_reg_mr(pd,
 			  (void *)start,
@@ -218,7 +287,7 @@ txq_mb2mp(struct rte_mbuf *buf)
  *   mr->lkey on success, (uint32_t)-1 on failure.
  */
 static uint32_t
-txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
+txq_mp2mr(struct txq *txq, struct rte_mempool *mp)
 {
 	unsigned int i;
 	struct ibv_mr *mr;
@@ -236,7 +305,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	}
 	/* Add a new entry, register MR first. */
 	DEBUG("%p: discovered new memory pool \"%s\" (%p)",
-	      (void *)txq, mp->name, (const void *)mp);
+	      (void *)txq, mp->name, (void *)mp);
 	mr = mlx5_mp2mr(txq->priv->pd, mp);
 	if (unlikely(mr == NULL)) {
 		DEBUG("%p: unable to configure MR, ibv_reg_mr() failed.",
@@ -257,7 +326,7 @@ txq_mp2mr(struct txq *txq, const struct rte_mempool *mp)
 	txq->mp2mr[i].mr = mr;
 	txq->mp2mr[i].lkey = mr->lkey;
 	DEBUG("%p: new MR lkey for MP \"%s\" (%p): 0x%08" PRIu32,
-	      (void *)txq, mp->name, (const void *)mp, txq->mp2mr[i].lkey);
+	      (void *)txq, mp->name, (void *)mp, txq->mp2mr[i].lkey);
 	return txq->mp2mr[i].lkey;
 }
 
diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
index db054d6..d522f70 100644
--- a/drivers/net/mlx5/mlx5_rxtx.h
+++ b/drivers/net/mlx5/mlx5_rxtx.h
@@ -341,7 +341,7 @@ uint16_t mlx5_tx_burst_secondary_setup(void *dpdk_txq, struct rte_mbuf **pkts,
 
 /* mlx5_rxtx.c */
 
-struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, const struct rte_mempool *);
+struct ibv_mr *mlx5_mp2mr(struct ibv_pd *, struct rte_mempool *);
 void txq_mp2mr_iter(struct rte_mempool *, void *);
 uint16_t mlx5_tx_burst(void *, struct rte_mbuf **, uint16_t);
 uint16_t mlx5_rx_burst_sp(void *, struct rte_mbuf **, uint16_t);
diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
index ea9baf4..3028fd4 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -323,6 +323,7 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 	char intf_name[RTE_KNI_NAMESIZE];
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
+	const struct rte_mempool *mp;
 	struct rte_kni_memzone_slot *slot = NULL;
 
 	if (!pktmbuf_pool || !conf || !conf->name[0])
@@ -415,12 +416,17 @@ rte_kni_alloc(struct rte_mempool *pktmbuf_pool,
 
 
 	/* MBUF mempool */
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME,
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT,
 		pktmbuf_pool->name);
 	mz = rte_memzone_lookup(mz_name);
 	KNI_MEM_CHECK(mz == NULL);
-	dev_info.mbuf_va = mz->addr;
-	dev_info.mbuf_phys = mz->phys_addr;
+	mp = (struct rte_mempool *)mz->addr;
+	/* KNI currently requires a single memory chunk */
+	if (mp->nb_mem_chunks != 1)
+		goto kni_fail;
+
+	dev_info.mbuf_va = STAILQ_FIRST(&mp->mem_list)->addr;
+	dev_info.mbuf_phys = STAILQ_FIRST(&mp->mem_list)->phys_addr;
 	ctx->pktmbuf_pool = pktmbuf_pool;
 	ctx->group_id = conf->group_id;
 	ctx->slot_id = slot->id;
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
index 0051bd5..dad755c 100644
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ b/lib/librte_mempool/rte_dom0_mempool.c
@@ -110,7 +110,7 @@ rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
 	if (pa == NULL)
 		return mp;
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, name);
+	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
 	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
 	if (mz == NULL) {
 		free(pa);
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index d5278b4..c3abf51 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -391,7 +391,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 }
 
 /* free a memchunk allocated with rte_memzone_reserve() */
-__rte_unused static void
+static void
 rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
 	void *opaque)
 {
@@ -511,6 +511,60 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
+/* Default function to populate the mempool: allocate memory in mezones,
+ * and populate them. Return the number of objects added, or a negative
+ * value on error.
+ */
+static int rte_mempool_populate_default(struct rte_mempool *mp)
+{
+	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
+	char mz_name[RTE_MEMZONE_NAMESIZE];
+	const struct rte_memzone *mz;
+	size_t size, total_elt_sz, align;
+	unsigned mz_id, n;
+	int ret;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+
+	align = RTE_CACHE_LINE_SIZE;
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
+		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+
+		ret = snprintf(mz_name, sizeof(mz_name),
+			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
+		if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+			ret = -ENAMETOOLONG;
+			goto fail;
+		}
+
+		mz = rte_memzone_reserve_aligned(mz_name, size,
+			mp->socket_id, mz_flags, align);
+		/* not enough memory, retry with the biggest zone we have */
+		if (mz == NULL)
+			mz = rte_memzone_reserve_aligned(mz_name, 0,
+				mp->socket_id, mz_flags, align);
+		if (mz == NULL) {
+			ret = -rte_errno;
+			goto fail;
+		}
+
+		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
+			mz->len, rte_mempool_memchunk_mz_free,
+			(void *)(uintptr_t)mz);
+		if (ret < 0)
+			goto fail;
+	}
+
+	return mp->size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -530,13 +584,10 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	struct rte_mempool_list *mempool_list;
 	struct rte_mempool *mp = NULL;
 	struct rte_tailq_entry *te = NULL;
-	const struct rte_memzone *mz;
+	const struct rte_memzone *mz = NULL;
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-	void *obj;
 	struct rte_mempool_objsz objsz;
-	void *startaddr;
-	int page_size = getpagesize();
 	int ret;
 
 	/* compilation-time checks */
@@ -591,16 +642,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	private_data_size = (private_data_size +
 			     RTE_MEMPOOL_ALIGN_MASK) & (~RTE_MEMPOOL_ALIGN_MASK);
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * expand private data size to a whole page, so that the
-		 * first pool element will start on a new standard page
-		 */
-		int head = sizeof(struct rte_mempool);
-		int new_size = (private_data_size + head) % page_size;
-		if (new_size)
-			private_data_size += page_size - new_size;
-	}
 
 	/* try to allocate tailq entry */
 	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
@@ -617,17 +658,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
-	if (vaddr == NULL)
-		mempool_size += (size_t)objsz.total_size * n;
-
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * we want the memory pool to start on a page boundary,
-		 * because pool elements crossing page boundaries would
-		 * result in discontiguous physical addresses
-		 */
-		mempool_size += page_size;
-	}
 
 	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
 
@@ -635,20 +665,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	if (mz == NULL)
 		goto exit_unlock;
 
-	if (rte_eal_has_hugepages()) {
-		startaddr = (void*)mz->addr;
-	} else {
-		/* align memory pool start address on a page boundary */
-		unsigned long addr = (unsigned long)mz->addr;
-		if (addr & (page_size - 1)) {
-			addr += page_size;
-			addr &= ~(page_size - 1);
-		}
-		startaddr = (void*)addr;
-	}
-
 	/* init the mempool structure */
-	mp = startaddr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->phys_addr = mz->phys_addr;
@@ -679,22 +696,17 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		mp_init(mp, mp_init_arg);
 
 	/* mempool elements allocated together with mempool */
-	if (vaddr == NULL) {
-		/* calculate address of the first elt for continuous mempool. */
-		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
-			private_data_size;
-		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
-
-		ret = rte_mempool_populate_phys(mp, obj,
-			mp->phys_addr + ((char *)obj - (char *)mp),
-			objsz.total_size * n, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
-	} else {
+	if (vaddr == NULL)
+		ret = rte_mempool_populate_default(mp);
+	else
 		ret = rte_mempool_populate_phys_tab(mp, vaddr,
 			paddr, pg_num, pg_shift, NULL, NULL);
-		if (ret != (int)mp->size)
-			goto exit_unlock;
+	if (ret < 0) {
+		rte_errno = -ret;
+		goto exit_unlock;
+	} else if (ret != (int)mp->size) {
+		rte_errno = EINVAL;
+		goto exit_unlock;
 	}
 
 	/* call the initializer */
@@ -717,6 +729,8 @@ exit_unlock:
 		rte_ring_free(mp->ring);
 	}
 	rte_free(te);
+	if (mz != NULL)
+		rte_memzone_free(mz);
 
 	return NULL;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 53275e4..3e458b8 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -125,17 +125,6 @@ struct rte_mempool_objsz {
 /* "MP_<name>" */
 #define	RTE_MEMPOOL_MZ_FORMAT	RTE_MEMPOOL_MZ_PREFIX "%s"
 
-#ifdef RTE_LIBRTE_XEN_DOM0
-
-/* "<name>_MP_elt" */
-#define	RTE_MEMPOOL_OBJ_NAME	"%s_" RTE_MEMPOOL_MZ_PREFIX "elt"
-
-#else
-
-#define	RTE_MEMPOOL_OBJ_NAME	RTE_MEMPOOL_MZ_FORMAT
-
-#endif /* RTE_LIBRTE_XEN_DOM0 */
-
 #define	MEMPOOL_PG_SHIFT_MAX	(sizeof(uintptr_t) * CHAR_BIT - 1)
 
 /** Mempool over one chunk of physically continuous memory */
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 21/35] eal: lock memory when not using hugepages
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (19 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 20/35] mempool: allocate in several memory chunks by default Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 22/35] mempool: support no hugepage mode Olivier Matz
                       ` (14 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Although the physical address won't be correct in the memory segment,
this at least allows the physical address to be retrieved using
rte_mem_virt2phy(). Indeed, if a page is not locked, it may not be
present in physical memory.

Combined with the next commit, this allows a mempool to have properly
filled physical addresses when using the --no-huge option.

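The effect can be illustrated with a plain mmap() call (a sketch; the
patch itself only adds the MAP_LOCKED flag):

  /* needs <sys/mman.h>; without MAP_LOCKED, anonymous pages may not
   * be faulted in yet, so a later physical address lookup could
   * fail; MAP_LOCKED faults the pages in and pins them in RAM */
  void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
  		MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (addr == MAP_FAILED)
  	return -1; /* e.g. RLIMIT_MEMLOCK too low */
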
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..79d1d2d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1074,7 +1074,7 @@ rte_eal_hugepage_init(void)
 	/* hugetlbfs can be disabled */
 	if (internal_config.no_hugetlbfs) {
 		addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-				MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+			MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
 		if (addr == MAP_FAILED) {
 			RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
 					strerror(errno));
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 22/35] mempool: support no hugepage mode
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (20 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 21/35] eal: lock memory when not using hugepages Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 23/35] mempool: replace physical address by a memzone pointer Olivier Matz
                       ` (13 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce a new function rte_mempool_populate_virt() that is now called
by default when hugepages are not supported. This function populates the
mempool with several physically contiguous chunks whose minimum size is
the system page size (the grouping logic is sketched below).

Thanks to this, rte_mempool_create() will work properly without
hugepages (if the object size is smaller than a page), and two
specific workarounds can be removed:

- trailer_size was artificially extended to a page size
- rte_mempool_virt2phy() did not rely on the object physical address

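The heart of the new function is the detection of physically contiguous
runs of pages. A simplified sketch of that logic (error handling
omitted; addr and len are assumed page-aligned):

  /* count how many pages form the first physically contiguous run */
  size_t pg_sz = getpagesize();
  phys_addr_t first = rte_mem_virt2phy(addr);
  size_t phys_len = pg_sz;

  while (phys_len < len &&
  		rte_mem_virt2phy((char *)addr + phys_len) ==
  			first + phys_len)
  	phys_len += pg_sz;
  /* [addr, addr + phys_len) can then be added as one chunk */
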
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 107 ++++++++++++++++++++++++++++++---------
 lib/librte_mempool/rte_mempool.h |  17 ++-----
 2 files changed, 86 insertions(+), 38 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index c3abf51..88c5780 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -223,23 +223,6 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t flags,
 		sz->trailer_size = new_size - sz->header_size - sz->elt_size;
 	}
 
-	if (! rte_eal_has_hugepages()) {
-		/*
-		 * compute trailer size so that pool elements fit exactly in
-		 * a standard page
-		 */
-		int page_size = getpagesize();
-		int new_size = page_size - sz->header_size - sz->elt_size;
-		if (new_size < 0 || (unsigned int)new_size < sz->trailer_size) {
-			printf("When hugepages are disabled, pool objects "
-			       "can't exceed PAGE_SIZE: %d + %d + %d > %d\n",
-			       sz->header_size, sz->elt_size, sz->trailer_size,
-			       page_size);
-			return 0;
-		}
-		sz->trailer_size = new_size;
-	}
-
 	/* this is the size of an object, including header and trailer */
 	sz->total_size = sz->header_size + sz->elt_size + sz->trailer_size;
 
@@ -511,16 +494,74 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	return cnt;
 }
 
-/* Default function to populate the mempool: allocate memory in mezones,
+/* Populate the mempool with a virtual area. Return the number of
+ * objects added, or a negative value on error.
+ */
+static int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque)
+{
+	phys_addr_t paddr;
+	size_t off, phys_len;
+	int ret, cnt = 0;
+
+	/* mempool must not be populated */
+	if (mp->nb_mem_chunks != 0)
+		return -EEXIST;
+	/* address and len must be page-aligned */
+	if (RTE_PTR_ALIGN_CEIL(addr, pg_sz) != addr)
+		return -EINVAL;
+	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
+		return -EINVAL;
+
+	for (off = 0; off + pg_sz <= len &&
+		     mp->populated_size < mp->size; off += phys_len) {
+
+		paddr = rte_mem_virt2phy(addr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR) {
+			ret = -EINVAL;
+			goto fail;
+		}
+
+		/* populate with the largest group of contiguous pages */
+		for (phys_len = pg_sz; off + phys_len < len; phys_len += pg_sz) {
+			phys_addr_t paddr_tmp;
+
+			paddr_tmp = rte_mem_virt2phy(addr + off + phys_len);
+			paddr_tmp = rte_mem_phy2mch(-1, paddr_tmp);
+
+			if (paddr_tmp != paddr + phys_len)
+				break;
+		}
+
+		ret = rte_mempool_populate_phys(mp, addr + off, paddr,
+			phys_len, free_cb, opaque);
+		if (ret < 0)
+			goto fail;
+		/* no need to call the free callback for next chunks */
+		free_cb = NULL;
+		cnt += ret;
+	}
+
+	return cnt;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return ret;
+}
+
+/* Default function to populate the mempool: allocate memory in memzones,
  * and populate them. Return the number of objects added, or a negative
  * value on error.
  */
-static int rte_mempool_populate_default(struct rte_mempool *mp)
+static int
+rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
-	size_t size, total_elt_sz, align;
+	size_t size, total_elt_sz, align, pg_sz, pg_shift;
 	unsigned mz_id, n;
 	int ret;
 
@@ -528,10 +569,19 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
-	align = RTE_CACHE_LINE_SIZE;
+	if (rte_eal_has_hugepages()) {
+		pg_shift = 0; /* not needed, zone is physically contiguous */
+		pg_sz = 0;
+		align = RTE_CACHE_LINE_SIZE;
+	} else {
+		pg_sz = getpagesize();
+		pg_shift = rte_bsf32(pg_sz);
+		align = pg_sz;
+	}
+
 	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
 	for (mz_id = 0, n = mp->size; n > 0; mz_id++, n -= ret) {
-		size = rte_mempool_xmem_size(n, total_elt_sz, 0);
+		size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift);
 
 		ret = snprintf(mz_name, sizeof(mz_name),
 			RTE_MEMPOOL_MZ_FORMAT "_%d", mp->name, mz_id);
@@ -551,9 +601,16 @@ static int rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		ret = rte_mempool_populate_phys(mp, mz->addr, mz->phys_addr,
-			mz->len, rte_mempool_memchunk_mz_free,
-			(void *)(uintptr_t)mz);
+		if (rte_eal_has_hugepages())
+			ret = rte_mempool_populate_phys(mp, mz->addr,
+				mz->phys_addr, mz->len,
+				rte_mempool_memchunk_mz_free,
+				(void *)(uintptr_t)mz);
+		else
+			ret = rte_mempool_populate_virt(mp, mz->addr,
+				mz->len, pg_sz,
+				rte_mempool_memchunk_mz_free,
+				(void *)(uintptr_t)mz);
 		if (ret < 0)
 			goto fail;
 	}
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 3e458b8..e0aa698 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1150,19 +1150,10 @@ rte_mempool_empty(const struct rte_mempool *mp)
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
-	if (rte_eal_has_hugepages()) {
-		const struct rte_mempool_objhdr *hdr;
-		hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
-			sizeof(*hdr));
-		return hdr->physaddr;
-	} else {
-		/*
-		 * If huge pages are disabled, we cannot assume the
-		 * memory region to be physically contiguous.
-		 * Lookup for each element.
-		 */
-		return rte_mem_virt2phy(elt);
-	}
+	const struct rte_mempool_objhdr *hdr;
+	hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
+		sizeof(*hdr));
+	return hdr->physaddr;
 }
 
 /**
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 23/35] mempool: replace physical address by a memzone pointer
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (21 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 22/35] mempool: support no hugepage mode Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 24/35] mempool: introduce a function to free a pool Olivier Matz
                       ` (12 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Storing a pointer to the memzone instead of the physical address
provides more information than the physical address alone: for
instance, the memzone flags.

Moreover, keeping the memzone pointer will allow us to free the mempool
(this is done later in the series).

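After the change, the physical address is read through the memzone, and
the other memzone fields become reachable as well (illustration, assuming
mp points to a valid mempool):

  /* what used to be mp->phys_addr */
  phys_addr_t pa = mp->mz->phys_addr;
  /* additional information, e.g. the memzone flags */
  uint32_t mz_flags = mp->mz->flags;
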
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 4 ++--
 lib/librte_mempool/rte_mempool.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 88c5780..891458a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -725,7 +725,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
-	mp->phys_addr = mz->phys_addr;
+	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
@@ -993,7 +993,7 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
 	fprintf(f, "mempool <%s>@%p\n", mp->name, mp);
 	fprintf(f, "  flags=%x\n", mp->flags);
 	fprintf(f, "  ring=<%s>@%p\n", mp->ring->name, mp->ring);
-	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->phys_addr);
+	fprintf(f, "  phys_addr=0x%" PRIx64 "\n", mp->mz->phys_addr);
 	fprintf(f, "  nb_mem_chunks=%u\n", mp->nb_mem_chunks);
 	fprintf(f, "  size=%"PRIu32"\n", mp->size);
 	fprintf(f, "  populated_size=%"PRIu32"\n", mp->populated_size);
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index e0aa698..13bd56b 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -204,7 +204,7 @@ struct rte_mempool_memhdr {
 struct rte_mempool {
 	char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
 	struct rte_ring *ring;           /**< Ring to store objects. */
-	phys_addr_t phys_addr;           /**< Phys. addr. of mempool struct. */
+	const struct rte_memzone *mz;    /**< Memzone where pool is allocated */
 	int flags;                       /**< Flags of the mempool. */
 	int socket_id;                   /**< Socket id passed at mempool creation. */
 	uint32_t size;                   /**< Max size of the mempool. */
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 24/35] mempool: introduce a function to free a pool
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (22 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 23/35] mempool: replace physical address by a memzone pointer Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 25/35] mempool: introduce a function to create an empty pool Olivier Matz
                       ` (11 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce rte_mempool_free(), which:

- unlinks the mempool from the global list if it is found
- frees all the memory chunks using their free callbacks
- frees the internal ring
- frees the memzone containing the mempool

Currently this function is only used in error cases when
creating a new mempool, but it will be made public later
in the patch series.

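Once the function is public, the intended usage is symmetric with
creation, e.g. (a sketch; pool parameters are arbitrary):

  struct rte_mempool *mp;

  mp = rte_mempool_create("test_pool", 1024, 2048, 32, 0,
  	NULL, NULL, NULL, NULL, SOCKET_ID_ANY, 0);
  if (mp == NULL)
  	return -1;
  /* ... use the pool ... */
  rte_mempool_free(mp); /* chunks, ring and memzone are released */
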
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 37 ++++++++++++++++++++++++++++++-------
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 891458a..4aff53b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -622,6 +622,35 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* free a mempool */
+static void
+rte_mempool_free(struct rte_mempool *mp)
+{
+	struct rte_mempool_list *mempool_list = NULL;
+	struct rte_tailq_entry *te;
+
+	if (mp == NULL)
+		return;
+
+	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	/* find the tailq entry */
+	TAILQ_FOREACH(te, mempool_list, next) {
+		if (te->data == (void *)mp)
+			break;
+	}
+
+	if (te != NULL) {
+		TAILQ_REMOVE(mempool_list, te, next);
+		rte_free(te);
+	}
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	rte_mempool_free_memchunks(mp);
+	rte_ring_free(mp->ring);
+	rte_memzone_free(mp->mz);
+}
+
 /*
  * Create the mempool over already allocated chunk of memory.
  * That external memory buffer can consists of physically disjoint pages.
@@ -781,13 +810,7 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 
 exit_unlock:
 	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
-	if (mp != NULL) {
-		rte_mempool_free_memchunks(mp);
-		rte_ring_free(mp->ring);
-	}
-	rte_free(te);
-	if (mz != NULL)
-		rte_memzone_free(mz);
+	rte_mempool_free(mp);
 
 	return NULL;
 }
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 25/35] mempool: introduce a function to create an empty pool
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (23 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 24/35] mempool: introduce a function to free a pool Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 26/35] eal/xen: return machine address without knowing memseg id Olivier Matz
                       ` (10 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Introduce a new function rte_mempool_create_empty()
that allocates a mempool that is not populated.

The functions rte_mempool_create() and rte_mempool_xmem_create()
now make use of it, making their code much easier to read.
Currently, they are the only users of rte_mempool_create_empty(),
but the function will be made public in later commits.

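The split turns pool creation into three explicit steps, which
rte_mempool_create() itself now follows (a sketch; my_obj_init is a
hypothetical object initializer, and the create/populate functions are
still static at this point in the series):

  mp = rte_mempool_create_empty("pool", 1024, 2048, 32, 0,
  	SOCKET_ID_ANY, 0);
  if (mp == NULL)
  	return -1;
  if (rte_mempool_populate_default(mp) < 0) {
  	rte_mempool_free(mp);
  	return -1;
  }
  /* initialize each object once memory is in place */
  rte_mempool_obj_iter(mp, my_obj_init, NULL);
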
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 185 ++++++++++++++++++++++-----------------
 1 file changed, 107 insertions(+), 78 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 4aff53b..33af51b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -319,30 +319,6 @@ rte_dom0_mempool_create(const char *name __rte_unused,
 }
 #endif
 
-/* create the mempool */
-struct rte_mempool *
-rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-	else
-		return rte_mempool_xmem_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags,
-					       NULL, NULL, MEMPOOL_PG_NUM_DEFAULT,
-					       MEMPOOL_PG_SHIFT_MAX);
-}
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -651,20 +627,11 @@ rte_mempool_free(struct rte_mempool *mp)
 	rte_memzone_free(mp->mz);
 }
 
-/*
- * Create the mempool over already allocated chunk of memory.
- * That external memory buffer can consists of physically disjoint pages.
- * Setting vaddr to NULL, makes mempool to fallback to original behaviour
- * and allocate space for mempool and it's elements as one big chunk of
- * physically continuos memory.
- * */
-struct rte_mempool *
-rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags, void *vaddr,
-		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+/* create an empty mempool */
+static struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags)
 {
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	struct rte_mempool_list *mempool_list;
@@ -674,7 +641,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
-	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -697,18 +663,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		return NULL;
 	}
 
-	/* check that we have both VA and PA */
-	if (vaddr != NULL && paddr == NULL) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
-	/* Check that pg_num and pg_shift parameters are valid. */
-	if (pg_num == 0 || pg_shift > MEMPOOL_PG_SHIFT_MAX) {
-		rte_errno = EINVAL;
-		return NULL;
-	}
-
 	/* "no cache align" imply "no spread" */
 	if (flags & MEMPOOL_F_NO_CACHE_ALIGN)
 		flags |= MEMPOOL_F_NO_SPREAD;
@@ -736,11 +690,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 	}
 
-	/*
-	 * If user provided an external memory buffer, then use it to
-	 * store mempool objects. Otherwise reserve a memzone that is large
-	 * enough to hold mempool header and metadata plus mempool objects.
-	 */
 	mempool_size = MEMPOOL_HEADER_SIZE(mp, cache_size);
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
@@ -752,12 +701,14 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		goto exit_unlock;
 
 	/* init the mempool structure */
+	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
 	snprintf(mp->name, sizeof(mp->name), "%s", name);
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
 	mp->flags = flags;
 	mp->elt_size = objsz.elt_size;
 	mp->header_size = objsz.header_size;
 	mp->trailer_size = objsz.trailer_size;
@@ -777,41 +728,119 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 	mp->local_cache = (struct rte_mempool_cache *)
 		RTE_PTR_ADD(mp, MEMPOOL_HEADER_SIZE(mp, 0));
 
-	/* call the initializer */
+	te->data = mp;
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_INSERT_TAIL(mempool_list, te, next);
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+
+	return mp;
+
+exit_unlock:
+	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	rte_free(te);
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/* create the mempool */
+struct rte_mempool *
+rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+	int socket_id, unsigned flags)
+{
+	struct rte_mempool *mp;
+
+	if (rte_xen_dom0_supported())
+		return rte_dom0_mempool_create(name, n, elt_size,
+					       cache_size, private_data_size,
+					       mp_init, mp_init_arg,
+					       obj_init, obj_init_arg,
+					       socket_id, flags);
+
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
+
+	/* call the mempool priv initializer */
 	if (mp_init)
 		mp_init(mp, mp_init_arg);
 
-	/* mempool elements allocated together with mempool */
+	if (rte_mempool_populate_default(mp) < 0)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+
+	return mp;
+
+ fail:
+	rte_mempool_free(mp);
+	return NULL;
+}
+
+/*
+ * Create the mempool over an already allocated chunk of memory.
+ * That external memory buffer can consist of physically disjoint pages.
+ * Setting vaddr to NULL makes the mempool fall back to the original
+ * behaviour and allocate space for the mempool and its elements as one
+ * big chunk of physically continuous memory.
+ */
+struct rte_mempool *
+rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
+		unsigned cache_size, unsigned private_data_size,
+		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
+		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
+		int socket_id, unsigned flags, void *vaddr,
+		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift)
+{
+	struct rte_mempool *mp = NULL;
+	int ret;
+
+	/* no virtual address supplied, use rte_mempool_create() */
 	if (vaddr == NULL)
-		ret = rte_mempool_populate_default(mp);
-	else
-		ret = rte_mempool_populate_phys_tab(mp, vaddr,
-			paddr, pg_num, pg_shift, NULL, NULL);
-	if (ret < 0) {
-		rte_errno = -ret;
-		goto exit_unlock;
-	} else if (ret != (int)mp->size) {
+		return rte_mempool_create(name, n, elt_size, cache_size,
+			private_data_size, mp_init, mp_init_arg,
+			obj_init, obj_init_arg, socket_id, flags);
+
+	/* check that we have both VA and PA */
+	if (paddr == NULL) {
 		rte_errno = EINVAL;
-		goto exit_unlock;
+		return NULL;
 	}
 
-	/* call the initializer */
-	if (obj_init)
-		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
+	/* Check that pg_shift parameter is valid. */
+	if (pg_shift > MEMPOOL_PG_SHIFT_MAX) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
 
-	te->data = (void *) mp;
+	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
+		private_data_size, socket_id, flags);
+	if (mp == NULL)
+		return NULL;
 
-	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
-	TAILQ_INSERT_TAIL(mempool_list, te, next);
-	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+	/* call the mempool priv initializer */
+	if (mp_init)
+		mp_init(mp, mp_init_arg);
+
+	ret = rte_mempool_populate_phys_tab(mp, vaddr, paddr, pg_num, pg_shift,
+		NULL, NULL);
+	if (ret < 0 || ret != (int)mp->size)
+		goto fail;
+
+	/* call the object initializers */
+	if (obj_init)
+		rte_mempool_obj_iter(mp, obj_init, obj_init_arg);
 
 	return mp;
 
-exit_unlock:
-	rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK);
+ fail:
 	rte_mempool_free(mp);
-
 	return NULL;
 }
 
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 26/35] eal/xen: return machine address without knowing memseg id
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (24 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 25/35] mempool: introduce a function to create an empty pool Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 27/35] mempool: rework support of Xen dom0 Olivier Matz
                       ` (9 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

The conversion from a guest physical address to a machine physical
address is fast when the caller knows the memseg corresponding to the
gpa.

But if the caller does not know this information, find it by scanning
the segments. This feature will be used by the next commit.

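A caller holding only a physical address can then resolve the machine
address like this (a sketch; paddr is assumed to be a guest physical
address obtained elsewhere):

  /* -1 asks the function to search the memory segments itself */
  phys_addr_t mch = rte_mem_phy2mch(-1, paddr);
  if (mch == RTE_BAD_PHYS_ADDR)
  	return -1; /* address not backed by any known memseg */
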
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/include/rte_memory.h   | 11 ++++++-----
 lib/librte_eal/linuxapp/eal/eal_xen_memory.c | 18 ++++++++++++++++--
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index f8dbece..0661109 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -200,21 +200,22 @@ unsigned rte_memory_get_nrank(void);
 int rte_xen_dom0_supported(void);
 
 /**< Internal use only - phys to virt mapping for xen */
-phys_addr_t rte_xen_mem_phy2mch(uint32_t, const phys_addr_t);
+phys_addr_t rte_xen_mem_phy2mch(int32_t, const phys_addr_t);
 
 /**
  * Return the physical address of elt, which is an element of the pool mp.
  *
  * @param memseg_id
- *   The mempool is from which memory segment.
+ *   Identifier of the memory segment owning the physical address. If
+ *   set to -1, find it automatically.
  * @param phy_addr
  *   physical address of elt.
  *
  * @return
- *   The physical address or error.
+ *   The physical address or RTE_BAD_PHYS_ADDR on error.
  */
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
 	if (rte_xen_dom0_supported())
 		return rte_xen_mem_phy2mch(memseg_id, phy_addr);
@@ -250,7 +251,7 @@ static inline int rte_xen_dom0_supported(void)
 }
 
 static inline phys_addr_t
-rte_mem_phy2mch(uint32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
+rte_mem_phy2mch(int32_t memseg_id __rte_unused, const phys_addr_t phy_addr)
 {
 	return phy_addr;
 }
diff --git a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
index 495eef9..0b612bb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_xen_memory.c
@@ -156,13 +156,27 @@ get_xen_memory_size(void)
  * Based on physical address to caculate MFN in Xen Dom0.
  */
 phys_addr_t
-rte_xen_mem_phy2mch(uint32_t memseg_id, const phys_addr_t phy_addr)
+rte_xen_mem_phy2mch(int32_t memseg_id, const phys_addr_t phy_addr)
 {
-	int mfn_id;
+	int mfn_id, i;
 	uint64_t mfn, mfn_offset;
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct rte_memseg *memseg = mcfg->memseg;
 
+	/* find the memory segment owning the physical address */
+	if (memseg_id == -1) {
+		for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+			if ((phy_addr >= memseg[i].phys_addr) &&
+					(phy_addr < memseg[i].phys_addr +
+						memseg[i].size)) {
+				memseg_id = i;
+				break;
+			}
+		}
+		if (memseg_id == -1)
+			return RTE_BAD_PHYS_ADDR;
+	}
+
 	mfn_id = (phy_addr - memseg[memseg_id].phys_addr) / RTE_PGSIZE_2M;
 
 	/*the MFN is contiguous in 2M */
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 27/35] mempool: rework support of Xen dom0
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (25 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 26/35] eal/xen: return machine address without knowing memseg id Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 28/35] mempool: create the internal ring when populating Olivier Matz
                       ` (8 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Avoid having a specific file for that, and remove the #ifdefs.
Now that we have introduced a function to populate a mempool
with a virtual area, supporting Xen dom0 is much easier.

The only thing we need to do is convert the guest physical
address into the machine physical address using rte_mem_phy2mch().
This function does nothing when not running on Xen.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/Makefile                |   3 -
 lib/librte_mempool/rte_dom0_mempool.c      | 133 -----------------------------
 lib/librte_mempool/rte_mempool.c           |  33 ++-----
 lib/librte_mempool/rte_mempool.h           |  89 -------------------
 lib/librte_mempool/rte_mempool_version.map |   1 -
 5 files changed, 5 insertions(+), 254 deletions(-)
 delete mode 100644 lib/librte_mempool/rte_dom0_mempool.c

diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 706f844..43423e0 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,9 +42,6 @@ LIBABIVER := 2
 
 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
-ifeq ($(CONFIG_RTE_LIBRTE_XEN_DOM0),y)
-SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_dom0_mempool.c
-endif
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h
 
diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c
deleted file mode 100644
index dad755c..0000000
--- a/lib/librte_mempool/rte_dom0_mempool.c
+++ /dev/null
@@ -1,133 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <stdio.h>
-#include <string.h>
-#include <stdint.h>
-#include <unistd.h>
-#include <stdarg.h>
-#include <inttypes.h>
-#include <errno.h>
-#include <sys/queue.h>
-
-#include <rte_common.h>
-#include <rte_log.h>
-#include <rte_debug.h>
-#include <rte_memory.h>
-#include <rte_memzone.h>
-#include <rte_atomic.h>
-#include <rte_launch.h>
-#include <rte_eal.h>
-#include <rte_eal_memconfig.h>
-#include <rte_per_lcore.h>
-#include <rte_lcore.h>
-#include <rte_branch_prediction.h>
-#include <rte_ring.h>
-#include <rte_errno.h>
-#include <rte_string_fns.h>
-#include <rte_spinlock.h>
-
-#include "rte_mempool.h"
-
-static void
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num,
-	uint32_t pg_sz, uint32_t memseg_id)
-{
-	uint32_t i;
-	uint64_t virt_addr, mfn_id;
-	struct rte_mem_config *mcfg;
-	uint32_t page_size = getpagesize();
-
-	/* get pointer to global configuration */
-	mcfg = rte_eal_get_configuration()->mem_config;
-	virt_addr = (uintptr_t) mcfg->memseg[memseg_id].addr;
-
-	for (i = 0; i != pg_num; i++) {
-		mfn_id = ((uintptr_t)va + i * pg_sz - virt_addr) / RTE_PGSIZE_2M;
-		pa[i] = mcfg->memseg[memseg_id].mfn[mfn_id] * page_size;
-	}
-}
-
-/* create the mempool for supporting Dom0 */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp = NULL;
-	phys_addr_t *pa;
-	char *va;
-	size_t sz;
-	uint32_t pg_num, pg_shift, pg_sz, total_size;
-	const struct rte_memzone *mz;
-	char mz_name[RTE_MEMZONE_NAMESIZE];
-	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
-
-	pg_sz = RTE_PGSIZE_2M;
-
-	pg_shift = rte_bsf32(pg_sz);
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift) +
-		RTE_PGSIZE_2M;
-	pg_num = sz >> pg_shift;
-
-	/* extract physical mappings of the allocated memory. */
-	pa = calloc(pg_num, sizeof (*pa));
-	if (pa == NULL)
-		return mp;
-
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT "_elt", name);
-	mz = rte_memzone_reserve(mz_name, sz, socket_id, mz_flags);
-	if (mz == NULL) {
-		free(pa);
-		return mp;
-	}
-
-	va = (char *)RTE_ALIGN_CEIL((uintptr_t)mz->addr, RTE_PGSIZE_2M);
-	/* extract physical mappings of the allocated memory. */
-	get_phys_map(va, pa, pg_num, pg_sz, mz->memseg_id);
-
-	mp = rte_mempool_xmem_create(name, elt_num, elt_size,
-		cache_size, private_data_size,
-		mp_init, mp_init_arg,
-		obj_init, obj_init_arg,
-		socket_id, flags, va, pa, pg_num, pg_shift);
-
-	free(pa);
-
-	return mp;
-}
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 33af51b..f141139 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -299,26 +299,6 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 	return (size_t)paddr_idx << pg_shift;
 }
 
-#ifndef RTE_LIBRTE_XEN_DOM0
-/* stub if DOM0 support not configured */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name __rte_unused,
-			unsigned n __rte_unused,
-			unsigned elt_size __rte_unused,
-			unsigned cache_size __rte_unused,
-			unsigned private_data_size __rte_unused,
-			rte_mempool_ctor_t *mp_init __rte_unused,
-			void *mp_init_arg __rte_unused,
-			rte_mempool_obj_ctor_t *obj_init __rte_unused,
-			void *obj_init_arg __rte_unused,
-			int socket_id __rte_unused,
-			unsigned flags __rte_unused)
-{
-	rte_errno = EINVAL;
-	return NULL;
-}
-#endif
-
 /* create the internal ring */
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
@@ -495,6 +475,9 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 		     mp->populated_size < mp->size; off += phys_len) {
 
 		paddr = rte_mem_virt2phy(addr + off);
+		/* required for xen_dom0 to get the machine address */
+		paddr = rte_mem_phy2mch(-1, paddr);
+
 		if (paddr == RTE_BAD_PHYS_ADDR) {
 			ret = -EINVAL;
 			goto fail;
@@ -577,7 +560,8 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		if (rte_eal_has_hugepages())
+		/* use memzone physical address if it is valid */
+		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
 				mz->phys_addr, mz->len,
 				rte_mempool_memchunk_mz_free,
@@ -753,13 +737,6 @@ rte_mempool_create(const char *name, unsigned n, unsigned elt_size,
 {
 	struct rte_mempool *mp;
 
-	if (rte_xen_dom0_supported())
-		return rte_dom0_mempool_create(name, n, elt_size,
-					       cache_size, private_data_size,
-					       mp_init, mp_init_arg,
-					       obj_init, obj_init_arg,
-					       socket_id, flags);
-
 	mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
 		private_data_size, socket_id, flags);
 	if (mp == NULL)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 13bd56b..8c35b45 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -505,95 +505,6 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
- * Create a new mempool named *name* in memory on Xen Dom0.
- *
- * This function uses ``rte_mempool_xmem_create()`` to allocate memory. The
- * pool contains n elements of elt_size. Its size is set to n.
- * All elements of the mempool are allocated together with the mempool header,
- * and memory buffer can consist of set of disjoint physical pages.
- *
- * @param name
- *   The name of the mempool.
- * @param n
- *   The number of elements in the mempool. The optimum size (in terms of
- *   memory usage) for a mempool is when n is a power of two minus one:
- *   n = (2^q - 1).
- * @param elt_size
- *   The size of each element.
- * @param cache_size
- *   If cache_size is non-zero, the rte_mempool library will try to
- *   limit the accesses to the common lockless pool, by maintaining a
- *   per-lcore object cache. This argument must be lower or equal to
- *   CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE. It is advised to choose
- *   cache_size to have "n modulo cache_size == 0": if this is
- *   not the case, some elements will always stay in the pool and will
- *   never be used. The access to the per-lcore table is of course
- *   faster than the multi-producer/consumer pool. The cache can be
- *   disabled if the cache_size argument is set to 0; it can be useful to
- *   avoid losing objects in cache. Note that even if not used, the
- *   memory space for cache is always reserved in a mempool structure,
- *   except if CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE is set to 0.
- * @param private_data_size
- *   The size of the private data appended after the mempool
- *   structure. This is useful for storing some private data after the
- *   mempool structure, as is done for rte_mbuf_pool for example.
- * @param mp_init
- *   A function pointer that is called for initialization of the pool,
- *   before object initialization. The user can initialize the private
- *   data in this function if needed. This parameter can be NULL if
- *   not needed.
- * @param mp_init_arg
- *   An opaque pointer to data that can be used in the mempool
- *   constructor function.
- * @param obj_init
- *   A function pointer that is called for each object at
- *   initialization of the pool. The user can set some meta data in
- *   objects if needed. This parameter can be NULL if not needed.
- *   The obj_init() function takes the mempool pointer, the init_arg,
- *   the object pointer and the object number as parameters.
- * @param obj_init_arg
- *   An opaque pointer to data that can be used as an argument for
- *   each call to the object constructor function.
- * @param socket_id
- *   The *socket_id* argument is the socket identifier in the case of
- *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
- *   constraint for the reserved zone.
- * @param flags
- *   The *flags* arguments is an OR of following flags:
- *   - MEMPOOL_F_NO_SPREAD: By default, objects addresses are spread
- *     between channels in RAM: the pool allocator will add padding
- *     between objects depending on the hardware configuration. See
- *     Memory alignment constraints for details. If this flag is set,
- *     the allocator will just align them to a cache line.
- *   - MEMPOOL_F_NO_CACHE_ALIGN: By default, the returned objects are
- *     cache-aligned. This flag removes this constraint, and no
- *     padding will be present between objects. This flag implies
- *     MEMPOOL_F_NO_SPREAD.
- *   - MEMPOOL_F_SP_PUT: If this flag is set, the default behavior
- *     when using rte_mempool_put() or rte_mempool_put_bulk() is
- *     "single-producer". Otherwise, it is "multi-producers".
- *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
- *     when using rte_mempool_get() or rte_mempool_get_bulk() is
- *     "single-consumer". Otherwise, it is "multi-consumers".
- * @return
- *   The pointer to the new allocated mempool, on success. NULL on error
- *   with rte_errno set appropriately. Possible rte_errno values include:
- *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
- *    - E_RTE_SECONDARY - function was called from a secondary process instance
- *    - EINVAL - cache size provided is too large
- *    - ENOSPC - the maximum number of memzones has already been allocated
- *    - EEXIST - a memzone with the same name already exists
- *    - ENOMEM - no appropriate memory area found in which to create memzone
- */
-struct rte_mempool *
-rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size,
-		unsigned cache_size, unsigned private_data_size,
-		rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		int socket_id, unsigned flags);
-
-
-/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 7de9f8c..75259d1 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -1,7 +1,6 @@
 DPDK_2.0 {
 	global:
 
-	rte_dom0_mempool_create;
 	rte_mempool_audit;
 	rte_mempool_calc_obj_size;
 	rte_mempool_count;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 28/35] mempool: create the internal ring when populating
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (26 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 27/35] mempool: rework support of Xen dom0 Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 29/35] mempool: populate with anonymous memory Olivier Matz
                       ` (7 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Instead of creating the internal ring at mempool creation time, do
it when the mempool is populated with its first memory chunk. The
objective is to simplify the switch to an external handler when
one is introduced.

For instance, this will be possible:

  mp = rte_mempool_create_empty(...)
  rte_mempool_set_ext_handler(mp, my_handler)
  rte_mempool_populate_default()
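
Sketched more fully below (a minimal, illustrative example: the
rte_mempool_set_ext_handler() call and the my_handler argument are
assumptions about the future external-handler API, not code from this
series):

  struct rte_mempool *mp;

  /* allocate the mempool header only: no ring, no object memory */
  mp = rte_mempool_create_empty("pool", 1024, 2048, 0, 0,
          SOCKET_ID_ANY, 0);
  if (mp == NULL)
          rte_exit(EXIT_FAILURE, "cannot create empty mempool\n");

  /* hypothetical hook: swap the ring-based handler before any memory
   * chunk is added, i.e. before the internal ring gets created */
  rte_mempool_set_ext_handler(mp, my_handler);

  /* the first populate call now creates the internal ring (or skips
   * it entirely with an external handler) */
  if (rte_mempool_populate_default(mp) < 0)
          rte_exit(EXIT_FAILURE, "cannot populate mempool\n");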

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 12 +++++++++---
 lib/librte_mempool/rte_mempool.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index f141139..8be3c74 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -326,6 +326,7 @@ rte_mempool_ring_create(struct rte_mempool *mp)
 		return -rte_errno;
 
 	mp->ring = r;
+	mp->flags |= MEMPOOL_F_RING_CREATED;
 	return 0;
 }
 
@@ -375,6 +376,14 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	unsigned i = 0;
 	size_t off;
 	struct rte_mempool_memhdr *memhdr;
+	int ret;
+
+	/* create the internal ring if not already done */
+	if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
+		ret = rte_mempool_ring_create(mp);
+		if (ret < 0)
+			return ret;
+	}
 
 	/* mempool is already populated */
 	if (mp->populated_size >= mp->size)
@@ -702,9 +711,6 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	STAILQ_INIT(&mp->elt_list);
 	STAILQ_INIT(&mp->mem_list);
 
-	if (rte_mempool_ring_create(mp) < 0)
-		goto exit_unlock;
-
 	/*
 	 * local_cache pointer is set even if cache_size is zero.
 	 * The local_cache points to just past the elt_pa[] array.
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8c35b45..2a40aa7 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -235,6 +235,7 @@ struct rte_mempool {
 #define MEMPOOL_F_NO_CACHE_ALIGN 0x0002 /**< Do not align objs on cache lines.*/
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
+#define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
 
 /**
  * @internal When debug is enabled, store some statistics.
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 29/35] mempool: populate with anonymous memory
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (27 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 28/35] mempool: create the internal ring when populating Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 30/35] mempool: make mempool populate and free api public Olivier Matz
                       ` (6 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Now that a mempool can be populated with any virtual memory,
it is easy to introduce a function that populates a mempool
with memory coming from an anonymous mapping, as is currently
done in test-pmd.

The next commit replaces the test-pmd anonymous mapping code
with this function.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 64 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 8be3c74..0ba6c24 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -40,6 +40,7 @@
 #include <inttypes.h>
 #include <errno.h>
 #include <sys/queue.h>
+#include <sys/mman.h>
 
 #include <rte_common.h>
 #include <rte_log.h>
@@ -591,6 +592,69 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	return ret;
 }
 
+/* return the memory size required for mempool objects in anonymous mem */
+static size_t
+get_anon_size(const struct rte_mempool *mp)
+{
+	size_t size, total_elt_sz, pg_sz, pg_shift;
+
+	pg_sz = getpagesize();
+	pg_shift = rte_bsf32(pg_sz);
+	total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+	size = rte_mempool_xmem_size(mp->size, total_elt_sz, pg_shift);
+
+	return size;
+}
+
+/* unmap a memory zone mapped by rte_mempool_populate_anon() */
+static void
+rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
+	void *opaque)
+{
+	munmap(opaque, get_anon_size(memhdr->mp));
+}
+
+/* populate the mempool with an anonymous mapping */
+__rte_unused static int
+rte_mempool_populate_anon(struct rte_mempool *mp)
+{
+	size_t size;
+	int ret;
+	char *addr;
+
+	/* mempool is already populated, error */
+	if (!STAILQ_EMPTY(&mp->mem_list)) {
+		rte_errno = EINVAL;
+		return 0;
+	}
+
+	/* get chunk of virtually contiguous memory */
+	size = get_anon_size(mp);
+	addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
+		MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+	if (addr == MAP_FAILED) {
+		rte_errno = errno;
+		return 0;
+	}
+	/* can't use MAP_LOCKED, it does not exist on BSD */
+	if (mlock(addr, size) < 0) {
+		rte_errno = errno;
+		munmap(addr, size);
+		return 0;
+	}
+
+	ret = rte_mempool_populate_virt(mp, addr, size, getpagesize(),
+		rte_mempool_memchunk_anon_free, addr);
+	if (ret < 0)
+		goto fail;
+
+	return mp->populated_size;
+
+ fail:
+	rte_mempool_free_memchunks(mp);
+	return 0;
+}
+
 /* free a mempool */
 static void
 rte_mempool_free(struct rte_mempool *mp)
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 30/35] mempool: make mempool populate and free api public
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (28 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 29/35] mempool: populate with anonymous memory Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 31/35] app/testpmd: remove anonymous mempool code Olivier Matz
                       ` (5 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Add the following functions to the public mempool API (a usage sketch
follows the list):

- rte_mempool_create_empty()
- rte_mempool_populate_phys()
- rte_mempool_populate_phys_tab()
- rte_mempool_populate_virt()
- rte_mempool_populate_default()
- rte_mempool_populate_anon()
- rte_mempool_free()
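
Together they split creation from population; a minimal sketch of the
intended sequence (sizes and the my_obj_init callback are illustrative):

  struct rte_mempool *mp;

  mp = rte_mempool_create_empty("pool", 1024, 2048, 0, 0,
          SOCKET_ID_ANY, 0);
  if (mp == NULL)
          return -1;

  /* add memory allocated with rte_memzone_reserve(), the default */
  if (rte_mempool_populate_default(mp) < 0) {
          rte_mempool_free(mp);
          return -1;
  }

  /* initialize each object once its memory is in place */
  rte_mempool_obj_iter(mp, my_obj_init, NULL);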

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c           |  14 +--
 lib/librte_mempool/rte_mempool.h           | 168 +++++++++++++++++++++++++++++
 lib/librte_mempool/rte_mempool_version.map |   9 +-
 3 files changed, 183 insertions(+), 8 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0ba6c24..df2ae0e 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -368,7 +368,7 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
  * zone. Return the number of objects added, or a negative value
  * on error.
  */
-static int
+int
 rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -427,7 +427,7 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 /* Add objects in the pool, using a table of physical pages. Return the
  * number of objects added, or a negative value on error.
  */
-static int
+int
 rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
 	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque)
@@ -463,7 +463,7 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 /* Populate the mempool with a virtual area. Return the number of
  * objects added, or a negative value on error.
  */
-static int
+int
 rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
 	void *opaque)
@@ -524,7 +524,7 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
  * and populate them. Return the number of objects added, or a negative
  * value on error.
  */
-static int
+int
 rte_mempool_populate_default(struct rte_mempool *mp)
 {
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
@@ -615,7 +615,7 @@ rte_mempool_memchunk_anon_free(struct rte_mempool_memhdr *memhdr,
 }
 
 /* populate the mempool with an anonymous mapping */
-__rte_unused static int
+int
 rte_mempool_populate_anon(struct rte_mempool *mp)
 {
 	size_t size;
@@ -656,7 +656,7 @@ rte_mempool_populate_anon(struct rte_mempool *mp)
 }
 
 /* free a mempool */
-static void
+void
 rte_mempool_free(struct rte_mempool *mp)
 {
 	struct rte_mempool_list *mempool_list = NULL;
@@ -685,7 +685,7 @@ rte_mempool_free(struct rte_mempool *mp)
 }
 
 /* create an empty mempool */
-static struct rte_mempool *
+struct rte_mempool *
 rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	unsigned cache_size, unsigned private_data_size,
 	int socket_id, unsigned flags)
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 2a40aa7..05de5f7 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -506,6 +506,174 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
 		const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift);
 
 /**
+ * Create an empty mempool
+ *
+ * The mempool is allocated and initialized, but it is not populated: no
+ * memory is allocated for the mempool elements. The user has to call
+ * rte_mempool_populate_*() to add memory chunks to the pool. Once
+ * populated, the user may also want to initialize each object with
+ * rte_mempool_obj_iter().
+ *
+ * @param name
+ *   The name of the mempool.
+ * @param n
+ *   The maximum number of elements that can be added in the mempool.
+ *   The optimum size (in terms of memory usage) for a mempool is when n
+ *   is a power of two minus one: n = (2^q - 1).
+ * @param elt_size
+ *   The size of each element.
+ * @param cache_size
+ *   Size of the cache. See rte_mempool_create() for details.
+ * @param private_data_size
+ *   The size of the private data appended after the mempool
+ *   structure. This is useful for storing some private data after the
+ *   mempool structure, as is done for rte_mbuf_pool for example.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in the case of
+ *   NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA
+ *   constraint for the reserved zone.
+ * @param flags
+ *   Flags controlling the behavior of the mempool. See
+ *   rte_mempool_create() for details.
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. See rte_mempool_create() for details.
+ */
+struct rte_mempool *
+rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
+	unsigned cache_size, unsigned private_data_size,
+	int socket_id, unsigned flags);
+/**
+ * Free a mempool
+ *
+ * Unlink the mempool from the global list, free the memory chunks, and all
+ * memory referenced by the mempool. The objects must not be used by
+ * other cores as they will be freed.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ */
+void
+rte_mempool_free(struct rte_mempool *mp);
+
+/**
+ * Add physically contiguous memory for objects in the pool at init
+ *
+ * Add a virtually and physically contiguous memory chunk in the pool
+ * where objects can be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   The physical address of the memory chunk.
+ * @param len
+ *   The length of memory in bytes.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
+	phys_addr_t paddr, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add physical memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated. The physical addresses corresponding to the virtual
+ * area are described in paddr[], pg_num, pg_shift.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param vaddr
+ *   The virtual address of memory that should be used to store objects.
+ * @param paddr
+ *   An array of physical addresses of each page composing the virtual
+ *   area.
+ * @param pg_num
+ *   Number of elements in the paddr array.
+ * @param pg_shift
+ *   LOG2 of the physical page size.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunks are not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
+	const phys_addr_t paddr[], uint32_t pg_num, uint32_t pg_shift,
+	rte_mempool_memchunk_free_cb_t *free_cb, void *opaque);
+
+/**
+ * Add virtually contiguous memory for objects in the pool at init
+ *
+ * Add a virtually contiguous memory chunk in the pool where objects can
+ * be instantiated.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param addr
+ *   The virtual address of memory that should be used to store objects.
+ *   Must be page-aligned.
+ * @param len
+ *   The length of memory in bytes. Must be page-aligned.
+ * @param pg_sz
+ *   The size of memory pages in this virtual area.
+ * @param free_cb
+ *   The callback used to free this chunk when destroying the mempool.
+ * @param opaque
+ *   An opaque argument passed to free_cb.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int
+rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
+	size_t len, size_t pg_sz, rte_mempool_memchunk_free_cb_t *free_cb,
+	void *opaque);
+
+/**
+ * Add memory for objects in the pool at init
+ *
+ * This is the default function used by rte_mempool_create() to populate
+ * the mempool. It adds memory allocated using rte_memzone_reserve().
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_default(struct rte_mempool *mp);
+
+/**
+ * Add memory from anonymous mapping for objects in the pool at init
+ *
+ * This function mmaps an anonymous memory zone that is locked in
+ * memory to store the objects of the mempool.
+ *
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @return
+ *   The number of objects added on success.
+ *   On error, the chunk is not added in the memory list of the
+ *   mempool and a negative errno is returned.
+ */
+int rte_mempool_populate_anon(struct rte_mempool *mp);
+
+/**
  * Call a function for each mempool element
  *
  * Iterate across all objects attached to a rte_mempool and call the
diff --git a/lib/librte_mempool/rte_mempool_version.map b/lib/librte_mempool/rte_mempool_version.map
index 75259d1..f63461b 100644
--- a/lib/librte_mempool/rte_mempool_version.map
+++ b/lib/librte_mempool/rte_mempool_version.map
@@ -16,12 +16,19 @@ DPDK_2.0 {
 	local: *;
 };
 
-DPDK_16.07 {
+DPDK_16.07 {
 	global:
 
 	rte_mempool_check_cookies;
 	rte_mempool_obj_iter;
 	rte_mempool_mem_iter;
+	rte_mempool_create_empty;
+	rte_mempool_populate_phys;
+	rte_mempool_populate_phys_tab;
+	rte_mempool_populate_virt;
+	rte_mempool_populate_default;
+	rte_mempool_populate_anon;
+	rte_mempool_free;
 
 	local: *;
 } DPDK_2.0;
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 31/35] app/testpmd: remove anonymous mempool code
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (29 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 30/35] mempool: make mempool populate and free api public Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 32/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
                       ` (4 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Now that the mempool library provides functions to populate a mempool
with anonymous mmap'd memory, this specific code can be removed from
test-pmd.
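
In test-pmd, the replacement boils down to this sequence (a sketch;
the diff below has the exact arguments and error handling):

  rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf, mb_size,
          mb_mempool_cache, sizeof(struct rte_pktmbuf_pool_private),
          socket_id, 0);
  if (rte_mp != NULL && rte_mempool_populate_anon(rte_mp) > 0) {
          rte_pktmbuf_pool_init(rte_mp, NULL);
          rte_mempool_obj_iter(rte_mp, rte_pktmbuf_init, NULL);
  }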

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test-pmd/Makefile        |   4 -
 app/test-pmd/mempool_anon.c  | 201 -------------------------------------------
 app/test-pmd/mempool_osdep.h |  54 ------------
 app/test-pmd/testpmd.c       |  24 +++--
 4 files changed, 15 insertions(+), 268 deletions(-)
 delete mode 100644 app/test-pmd/mempool_anon.c
 delete mode 100644 app/test-pmd/mempool_osdep.h

diff --git a/app/test-pmd/Makefile b/app/test-pmd/Makefile
index 72426f3..40039a1 100644
--- a/app/test-pmd/Makefile
+++ b/app/test-pmd/Makefile
@@ -58,11 +58,7 @@ SRCS-y += txonly.c
 SRCS-y += csumonly.c
 SRCS-y += icmpecho.c
 SRCS-$(CONFIG_RTE_LIBRTE_IEEE1588) += ieee1588fwd.c
-SRCS-y += mempool_anon.c
 
-ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
-CFLAGS_mempool_anon.o := -D_GNU_SOURCE
-endif
 CFLAGS_cmdline.o := -D_GNU_SOURCE
 
 # this application needs libraries first
diff --git a/app/test-pmd/mempool_anon.c b/app/test-pmd/mempool_anon.c
deleted file mode 100644
index 5e23848..0000000
--- a/app/test-pmd/mempool_anon.c
+++ /dev/null
@@ -1,201 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include <sys/types.h>
-#include <sys/stat.h>
-#include "mempool_osdep.h"
-#include <rte_errno.h>
-
-#ifdef RTE_EXEC_ENV_LINUXAPP
-
-#include <fcntl.h>
-#include <unistd.h>
-#include <sys/mman.h>
-
-
-#define	PAGEMAP_FNAME		"/proc/self/pagemap"
-
-/*
- * the pfn (page frame number) are bits 0-54 (see pagemap.txt in linux
- * Documentation).
- */
-#define	PAGEMAP_PFN_BITS	54
-#define	PAGEMAP_PFN_MASK	RTE_LEN2MASK(PAGEMAP_PFN_BITS, phys_addr_t)
-
-
-static int
-get_phys_map(void *va, phys_addr_t pa[], uint32_t pg_num, uint32_t pg_sz)
-{
-	int32_t fd, rc;
-	uint32_t i, nb;
-	off_t ofs;
-
-	ofs = (uintptr_t)va / pg_sz * sizeof(*pa);
-	nb = pg_num * sizeof(*pa);
-
-	if ((fd = open(PAGEMAP_FNAME, O_RDONLY)) < 0)
-		return ENOENT;
-
-	if ((rc = pread(fd, pa, nb, ofs)) < 0 || (rc -= nb) != 0) {
-
-		RTE_LOG(ERR, USER1, "failed read of %u bytes from \'%s\' "
-			"at offset %zu, error code: %d\n",
-			nb, PAGEMAP_FNAME, (size_t)ofs, errno);
-		rc = ENOENT;
-	}
-
-	close(fd);
-
-	for (i = 0; i != pg_num; i++)
-		pa[i] = (pa[i] & PAGEMAP_PFN_MASK) * pg_sz;
-
-	return rc;
-}
-
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned elt_num, unsigned elt_size,
-		   unsigned cache_size, unsigned private_data_size,
-		   rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-		   rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-		   int socket_id, unsigned flags)
-{
-	struct rte_mempool *mp;
-	phys_addr_t *pa;
-	char *va, *uv;
-	uint32_t n, pg_num, pg_shift, pg_sz, total_size;
-	size_t sz;
-	ssize_t usz;
-	int32_t rc;
-
-	rc = ENOMEM;
-	mp = NULL;
-
-	pg_sz = getpagesize();
-	if (rte_is_power_of_2(pg_sz) == 0) {
-		rte_errno = EINVAL;
-		return mp;
-	}
-
-	pg_shift = rte_bsf32(pg_sz);
-
-	total_size = rte_mempool_calc_obj_size(elt_size, flags, NULL);
-
-	/* calc max memory size and max number of pages needed. */
-	sz = rte_mempool_xmem_size(elt_num, total_size, pg_shift);
-	pg_num = sz >> pg_shift;
-
-	/* get chunk of virtually continuos memory.*/
-	if ((va = mmap(NULL, sz, PROT_READ | PROT_WRITE,
-			MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED,
-			-1, 0)) == MAP_FAILED) {
-		RTE_LOG(ERR, USER1, "%s(%s) failed mmap of %zu bytes, "
-			"error code: %d\n",
-			__func__, name, sz, errno);
-		rte_errno = rc;
-		return mp;
-	}
-
-	/* extract physical mappings of the allocated memory. */
-	if ((pa = calloc(pg_num, sizeof (*pa))) != NULL &&
-			(rc = get_phys_map(va, pa, pg_num, pg_sz)) == 0) {
-
-		/*
-		 * Check that allocated size is big enough to hold elt_num
-		 * objects and a calcualte how many bytes are actually required.
-		 */
-
-		if ((usz = rte_mempool_xmem_usage(va, elt_num, total_size, pa,
-				pg_num, pg_shift)) < 0) {
-
-			n = -usz;
-			rc = ENOENT;
-			RTE_LOG(ERR, USER1, "%s(%s) only %u objects from %u "
-				"requested can  be created over "
-				"mmaped region %p of %zu bytes\n",
-				__func__, name, n, elt_num, va, sz);
-		} else {
-
-			/* unmap unused pages if any */
-			if ((size_t)usz < sz) {
-
-				uv = va + usz;
-				usz = sz - usz;
-
-				RTE_LOG(INFO, USER1,
-					"%s(%s): unmap unused %zu of %zu "
-					"mmaped bytes @%p\n",
-					__func__, name, (size_t)usz, sz, uv);
-				munmap(uv, usz);
-				sz -= usz;
-				pg_num = sz >> pg_shift;
-			}
-
-			if ((mp = rte_mempool_xmem_create(name, elt_num,
-					elt_size, cache_size, private_data_size,
-					mp_init, mp_init_arg,
-					obj_init, obj_init_arg,
-					socket_id, flags, va, pa, pg_num,
-					pg_shift)) != NULL)
-
-				RTE_VERIFY(elt_num == mp->size);
-		}
-	}
-
-	if (mp == NULL) {
-		munmap(va, sz);
-		rte_errno = rc;
-	}
-
-	free(pa);
-	return mp;
-}
-
-#else /* RTE_EXEC_ENV_LINUXAPP */
-
-
-struct rte_mempool *
-mempool_anon_create(__rte_unused const char *name,
-	__rte_unused unsigned elt_num, __rte_unused unsigned elt_size,
-	__rte_unused unsigned cache_size,
-	__rte_unused unsigned private_data_size,
-	__rte_unused rte_mempool_ctor_t *mp_init,
-	__rte_unused void *mp_init_arg,
-	__rte_unused rte_mempool_obj_cb_t *obj_init,
-	__rte_unused void *obj_init_arg,
-	__rte_unused int socket_id, __rte_unused unsigned flags)
-{
-	rte_errno = ENOTSUP;
-	return NULL;
-}
-
-#endif /* RTE_EXEC_ENV_LINUXAPP */
diff --git a/app/test-pmd/mempool_osdep.h b/app/test-pmd/mempool_osdep.h
deleted file mode 100644
index 7ce7297..0000000
--- a/app/test-pmd/mempool_osdep.h
+++ /dev/null
@@ -1,54 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- *     * Redistributions of source code must retain the above copyright
- *       notice, this list of conditions and the following disclaimer.
- *     * Redistributions in binary form must reproduce the above copyright
- *       notice, this list of conditions and the following disclaimer in
- *       the documentation and/or other materials provided with the
- *       distribution.
- *     * Neither the name of Intel Corporation nor the names of its
- *       contributors may be used to endorse or promote products derived
- *       from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _MEMPOOL_OSDEP_H_
-#define _MEMPOOL_OSDEP_H_
-
-#include <rte_mempool.h>
-
-/**
- * @file
- * mempool OS specific header.
- */
-
-/*
- * Create mempool over objects from mmap(..., MAP_ANONYMOUS, ...).
- */
-struct rte_mempool *
-mempool_anon_create(const char *name, unsigned n, unsigned elt_size,
-	unsigned cache_size, unsigned private_data_size,
-	rte_mempool_ctor_t *mp_init, void *mp_init_arg,
-	rte_mempool_obj_cb_t *obj_init, void *obj_init_arg,
-	int socket_id, unsigned flags);
-
-#endif /*_RTE_MEMPOOL_OSDEP_H_ */
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 26a174c..9d11830 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -77,7 +77,6 @@
 #endif
 
 #include "testpmd.h"
-#include "mempool_osdep.h"
 
 uint16_t verbose_level = 0; /**< Silent by default. */
 
@@ -427,17 +426,24 @@ mbuf_pool_create(uint16_t mbuf_seg_size, unsigned nb_mbuf,
 
 	/* if the former XEN allocation failed fall back to normal allocation */
 	if (rte_mp == NULL) {
-		if (mp_anon != 0)
-			rte_mp = mempool_anon_create(pool_name, nb_mbuf,
-					mb_size, (unsigned) mb_mempool_cache,
-					sizeof(struct rte_pktmbuf_pool_private),
-					rte_pktmbuf_pool_init, NULL,
-					rte_pktmbuf_init, NULL,
-					socket_id, 0);
-		else
+		if (mp_anon != 0) {
+			rte_mp = rte_mempool_create_empty(pool_name, nb_mbuf,
+				mb_size, (unsigned) mb_mempool_cache,
+				sizeof(struct rte_pktmbuf_pool_private),
+				socket_id, 0);
+			if (rte_mp == NULL ||
+			    rte_mempool_populate_anon(rte_mp) == 0) {
+				rte_mempool_free(rte_mp);
+				rte_mp = NULL;
+			} else {
+				rte_pktmbuf_pool_init(rte_mp, NULL);
+				rte_mempool_obj_iter(rte_mp, rte_pktmbuf_init, NULL);
+			}
+		} else {
 			/* wrapper to rte_mempool_create() */
 			rte_mp = rte_pktmbuf_pool_create(pool_name, nb_mbuf,
 				mb_mempool_cache, 0, mbuf_seg_size, socket_id);
+		}
 	}
 
 	if (rte_mp == NULL) {
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 32/35] mem: avoid memzone/mempool/ring name truncation
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (30 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 31/35] app/testpmd: remove anonymous mempool code Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 33/35] mempool: add flag for removing phys contiguous constraint Olivier Matz
                       ` (3 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Check the return value of snprintf() to ensure that the name of
the object is not truncated.

Also update the test so that it does not trigger an error in
that case.
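
The pattern applied in each spot is the usual snprintf() truncation
check; a minimal sketch (the buffer and format are illustrative):

  char name[RTE_MEMZONE_NAMESIZE];
  int ret;

  ret = snprintf(name, sizeof(name), "MP_%s", user_name);
  if (ret < 0 || ret >= (int)sizeof(name))
          return -ENAMETOOLONG; /* output would have been truncated */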

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c                    | 12 ++++++++----
 lib/librte_eal/common/eal_common_memzone.c | 10 +++++++++-
 lib/librte_mempool/rte_mempool.c           | 20 ++++++++++++++++----
 lib/librte_ring/rte_ring.c                 | 16 +++++++++++++---
 4 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 2bc3ac0..7af708b 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -407,21 +407,25 @@ test_mempool_same_name_twice_creation(void)
 {
 	struct rte_mempool *mp_tc;
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL == mp_tc)
+	if (mp_tc == NULL) {
+		printf("cannot create mempool\n");
 		return -1;
+	}
 
-	mp_tc = rte_mempool_create("test_mempool_same_name_twice_creation", MEMPOOL_SIZE,
+	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
 						MEMPOOL_ELT_SIZE, 0, 0,
 						NULL, NULL,
 						NULL, NULL,
 						SOCKET_ID_ANY, 0);
-	if (NULL != mp_tc)
+	if (mp_tc != NULL) {
+		printf("should not be able to create mempool\n");
 		return -1;
+	}
 
 	return 0;
 }
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index a8f804c..452679e 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -126,6 +126,7 @@ static const struct rte_memzone *
 memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		int socket_id, unsigned flags, unsigned align, unsigned bound)
 {
+	struct rte_memzone *mz;
 	struct rte_mem_config *mcfg;
 	size_t requested_len;
 	int socket, i;
@@ -148,6 +149,13 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		return NULL;
 	}
 
+	if (strlen(name) > sizeof(mz->name) - 1) {
+		RTE_LOG(DEBUG, EAL, "%s(): memzone <%s>: name too long\n",
+			__func__, name);
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	/* if alignment is not a power of two */
 	if (align && !rte_is_power_of_2(align)) {
 		RTE_LOG(ERR, EAL, "%s(): Invalid alignment: %u\n", __func__,
@@ -223,7 +231,7 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 	const struct malloc_elem *elem = malloc_elem_from_data(mz_addr);
 
 	/* fill the zone in config */
-	struct rte_memzone *mz = get_next_free_memzone();
+	mz = get_next_free_memzone();
 
 	if (mz == NULL) {
 		RTE_LOG(ERR, EAL, "%s(): Cannot find free memzone but there is room "
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index df2ae0e..a694a0b 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -304,11 +304,14 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t elt_num,
 static int
 rte_mempool_ring_create(struct rte_mempool *mp)
 {
-	int rg_flags = 0;
+	int rg_flags = 0, ret;
 	char rg_name[RTE_RING_NAMESIZE];
 	struct rte_ring *r;
 
-	snprintf(rg_name, sizeof(rg_name), RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	ret = snprintf(rg_name, sizeof(rg_name),
+		RTE_MEMPOOL_MZ_FORMAT, mp->name);
+	if (ret < 0 || ret >= (int)sizeof(rg_name))
+		return -ENAMETOOLONG;
 
 	/* ring flags */
 	if (mp->flags & MEMPOOL_F_SP_PUT)
@@ -698,6 +701,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	size_t mempool_size;
 	int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
 	struct rte_mempool_objsz objsz;
+	int ret;
 
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) &
@@ -751,7 +755,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	mempool_size += private_data_size;
 	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
 
-	snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	ret = snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 
 	mz = rte_memzone_reserve(mz_name, mempool_size, socket_id, mz_flags);
 	if (mz == NULL)
@@ -760,7 +768,11 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
 	/* init the mempool structure */
 	mp = mz->addr;
 	memset(mp, 0, sizeof(*mp));
-	snprintf(mp->name, sizeof(mp->name), "%s", name);
+	ret = snprintf(mp->name, sizeof(mp->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(mp->name)) {
+		rte_errno = ENAMETOOLONG;
+		goto exit_unlock;
+	}
 	mp->mz = mz;
 	mp->socket_id = socket_id;
 	mp->size = n;
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index d80faf3..ca0a108 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -122,6 +122,8 @@ int
 rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 	unsigned flags)
 {
+	int ret;
+
 	/* compilation-time checks */
 	RTE_BUILD_BUG_ON((sizeof(struct rte_ring) &
 			  RTE_CACHE_LINE_MASK) != 0);
@@ -140,7 +142,9 @@ rte_ring_init(struct rte_ring *r, const char *name, unsigned count,
 
 	/* init the ring structure */
 	memset(r, 0, sizeof(*r));
-	snprintf(r->name, sizeof(r->name), "%s", name);
+	ret = snprintf(r->name, sizeof(r->name), "%s", name);
+	if (ret < 0 || ret >= (int)sizeof(r->name))
+		return -ENAMETOOLONG;
 	r->flags = flags;
 	r->prod.watermark = count;
 	r->prod.sp_enqueue = !!(flags & RING_F_SP_ENQ);
@@ -165,6 +169,7 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 	ssize_t ring_size;
 	int mz_flags = 0;
 	struct rte_ring_list* ring_list = NULL;
+	int ret;
 
 	ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
 
@@ -174,6 +179,13 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
+	ret = snprintf(mz_name, sizeof(mz_name), "%s%s",
+		RTE_RING_MZ_PREFIX, name);
+	if (ret < 0 || ret >= (int)sizeof(mz_name)) {
+		rte_errno = ENAMETOOLONG;
+		return NULL;
+	}
+
 	te = rte_zmalloc("RING_TAILQ_ENTRY", sizeof(*te), 0);
 	if (te == NULL) {
 		RTE_LOG(ERR, RING, "Cannot reserve memory for tailq\n");
@@ -181,8 +193,6 @@ rte_ring_create(const char *name, unsigned count, int socket_id,
 		return NULL;
 	}
 
-	snprintf(mz_name, sizeof(mz_name), "%s%s", RTE_RING_MZ_PREFIX, name);
-
 	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
 
 	/* reserve a memory zone for this ring. If we can't get rte_config or
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 33/35] mempool: add flag for removing phys contiguous constraint
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (31 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 32/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 34/35] app/test: rework mempool test Olivier Matz
                       ` (2 subsequent siblings)
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Add a new flag to remove the constraint of having physically contiguous
objects inside a mempool.

Set this flag on the log history mempool as a first user; it could be
set in most cases where objects are not mbufs.
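
For a pool whose objects are only ever accessed by the CPU, the flag
is simply passed at creation time (a sketch, sizes illustrative):

  mp = rte_mempool_create("ctrl_pool", 1024, 256, 0, 0,
          NULL, NULL, NULL, NULL,
          SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);

  /* rte_mempool_virt2phy() then returns RTE_BAD_PHYS_ADDR */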

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_eal/common/eal_common_log.c |  2 +-
 lib/librte_mempool/rte_mempool.c       | 23 ++++++++++++++++++++---
 lib/librte_mempool/rte_mempool.h       |  5 +++++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_log.c b/lib/librte_eal/common/eal_common_log.c
index 64aa79f..0daa067 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -322,7 +322,7 @@ rte_eal_common_log_init(FILE *default_log)
 				LOG_ELT_SIZE, 0, 0,
 				NULL, NULL,
 				NULL, NULL,
-				SOCKET_ID_ANY, 0);
+				SOCKET_ID_ANY, MEMPOOL_F_NO_PHYS_CONTIG);
 
 	if ((log_history_mp == NULL) &&
 	    ((log_history_mp = rte_mempool_lookup(LOG_HISTORY_MP_NAME)) == NULL)){
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index a694a0b..1ab6701 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -413,7 +413,11 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char *vaddr,
 
 	while (off + total_elt_sz <= len && mp->populated_size < mp->size) {
 		off += mp->header_size;
-		mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
+		if (paddr == RTE_BAD_PHYS_ADDR)
+			mempool_add_elem(mp, (char *)vaddr + off,
+				RTE_BAD_PHYS_ADDR);
+		else
+			mempool_add_elem(mp, (char *)vaddr + off, paddr + off);
 		off += mp->elt_size + mp->trailer_size;
 		i++;
 	}
@@ -443,6 +447,10 @@ rte_mempool_populate_phys_tab(struct rte_mempool *mp, char *vaddr,
 	if (mp->nb_mem_chunks != 0)
 		return -EEXIST;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, vaddr, RTE_BAD_PHYS_ADDR,
+			pg_num * pg_sz, free_cb, opaque);
+
 	for (i = 0; i < pg_num && mp->populated_size < mp->size; i += n) {
 
 		/* populate with the largest group of contiguous pages */
@@ -484,6 +492,10 @@ rte_mempool_populate_virt(struct rte_mempool *mp, char *addr,
 	if (RTE_ALIGN_CEIL(len, pg_sz) != len)
 		return -EINVAL;
 
+	if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+		return rte_mempool_populate_phys(mp, addr, RTE_BAD_PHYS_ADDR,
+			len, free_cb, opaque);
+
 	for (off = 0; off + pg_sz <= len &&
 		     mp->populated_size < mp->size; off += phys_len) {
 
@@ -534,6 +546,7 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 	char mz_name[RTE_MEMZONE_NAMESIZE];
 	const struct rte_memzone *mz;
 	size_t size, total_elt_sz, align, pg_sz, pg_shift;
+	phys_addr_t paddr;
 	unsigned mz_id, n;
 	int ret;
 
@@ -573,10 +586,14 @@ rte_mempool_populate_default(struct rte_mempool *mp)
 			goto fail;
 		}
 
-		/* use memzone physical address if it is valid */
+		if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
+			paddr = RTE_BAD_PHYS_ADDR;
+		else
+			paddr = mz->phys_addr;
+
 		if (rte_eal_has_hugepages() && !rte_xen_dom0_supported())
 			ret = rte_mempool_populate_phys(mp, mz->addr,
-				mz->phys_addr, mz->len,
+				paddr, mz->len,
 				rte_mempool_memchunk_mz_free,
 				(void *)(uintptr_t)mz);
 		else
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 05de5f7..60339bd 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -236,6 +236,7 @@ struct rte_mempool {
 #define MEMPOOL_F_SP_PUT         0x0004 /**< Default put is "single-producer".*/
 #define MEMPOOL_F_SC_GET         0x0008 /**< Default get is "single-consumer".*/
 #define MEMPOOL_F_RING_CREATED   0x0010 /**< Internal: ring is created */
+#define MEMPOOL_F_NO_PHYS_CONTIG 0x0020 /**< Don't need physically contiguous objs. */
 
 /**
  * @internal When debug is enabled, store some statistics.
@@ -421,6 +422,8 @@ typedef void (rte_mempool_ctor_t)(struct rte_mempool *, void *);
  *   - MEMPOOL_F_SC_GET: If this flag is set, the default behavior
  *     when using rte_mempool_get() or rte_mempool_get_bulk() is
  *     "single-consumer". Otherwise, it is "multi-consumers".
+ *   - MEMPOOL_F_NO_PHYS_CONTIG: If set, allocated objects won't
+ *     necessarily be contiguous in physical memory.
  * @return
  *   The pointer to the new allocated mempool, on success. NULL on error
  *   with rte_errno set appropriately. Possible rte_errno values include:
@@ -1226,6 +1229,8 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   A pointer (virtual address) to the element of the pool.
  * @return
  *   The physical address of the elt element.
+ *   If the mempool was created with MEMPOOL_F_NO_PHYS_CONTIG, the
+ *   returned value is RTE_BAD_PHYS_ADDR.
  */
 static inline phys_addr_t
 rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 34/35] app/test: rework mempool test
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (32 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 33/35] mempool: add flag for removing phys contiguous constraint Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-18 11:04     ` [PATCH v3 35/35] doc: update release notes about mempool allocation Olivier Matz
  2016-05-19 12:47     ` [PATCH v3 00/35] mempool: rework memory allocation Thomas Monjalon
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Rework the mempool test to better indicate where a failure occurs,
and, now that the feature is available, free the mempools once the
tests are done.
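
The failure paths now go through a small RET_ERR() macro (see the
diff) that prints the failing function and line, e.g.:

  if (rte_mempool_get(mp, &obj) < 0)
          RET_ERR(); /* prints "test failed at test_mempool_basic():N" */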

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 app/test/test_mempool.c | 232 +++++++++++++++++++++++++++---------------------
 1 file changed, 129 insertions(+), 103 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 7af708b..9f02758 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -77,13 +77,13 @@
 #define MAX_KEEP 128
 #define MEMPOOL_SIZE ((rte_lcore_count()*(MAX_KEEP+RTE_MEMPOOL_CACHE_MAX_SIZE))-1)
 
-static struct rte_mempool *mp;
-static struct rte_mempool *mp_cache, *mp_nocache;
+#define RET_ERR() do {							\
+		printf("test failed at %s():%d\n", __func__, __LINE__); \
+		return -1;						\
+	} while (0)
 
 static rte_atomic32_t synchro;
 
-
-
 /*
  * save the object number in the first 4 bytes of object data. All
  * other bytes are set to 0.
@@ -93,13 +93,14 @@ my_obj_init(struct rte_mempool *mp, __attribute__((unused)) void *arg,
 	    void *obj, unsigned i)
 {
 	uint32_t *objnum = obj;
+
 	memset(obj, 0, mp->elt_size);
 	*objnum = i;
 }
 
 /* basic tests (done on one core) */
 static int
-test_mempool_basic(void)
+test_mempool_basic(struct rte_mempool *mp)
 {
 	uint32_t *objnum;
 	void **objtable;
@@ -113,23 +114,23 @@ test_mempool_basic(void)
 
 	printf("get an object\n");
 	if (rte_mempool_get(mp, &obj) < 0)
-		return -1;
+		RET_ERR();
 	rte_mempool_dump(stdout, mp);
 
 	/* tests that improve coverage */
 	printf("get object count\n");
 	if (rte_mempool_count(mp) != MEMPOOL_SIZE - 1)
-		return -1;
+		RET_ERR();
 
 	printf("get private data\n");
 	if (rte_mempool_get_priv(mp) != (char *)mp +
 			MEMPOOL_HEADER_SIZE(mp, mp->cache_size))
-		return -1;
+		RET_ERR();
 
 #ifndef RTE_EXEC_ENV_BSDAPP /* rte_mem_virt2phy() not supported on bsd */
 	printf("get physical address of an object\n");
 	if (rte_mempool_virt2phy(mp, obj) != rte_mem_virt2phy(obj))
-		return -1;
+		RET_ERR();
 #endif
 
 	printf("put the object back\n");
@@ -138,10 +139,10 @@ test_mempool_basic(void)
 
 	printf("get 2 objects\n");
 	if (rte_mempool_get(mp, &obj) < 0)
-		return -1;
+		RET_ERR();
 	if (rte_mempool_get(mp, &obj2) < 0) {
 		rte_mempool_put(mp, obj);
-		return -1;
+		RET_ERR();
 	}
 	rte_mempool_dump(stdout, mp);
 
@@ -155,11 +156,10 @@ test_mempool_basic(void)
 	 * on other cores may not be empty.
 	 */
 	objtable = malloc(MEMPOOL_SIZE * sizeof(void *));
-	if (objtable == NULL) {
-		return -1;
-	}
+	if (objtable == NULL)
+		RET_ERR();
 
-	for (i=0; i<MEMPOOL_SIZE; i++) {
+	for (i = 0; i < MEMPOOL_SIZE; i++) {
 		if (rte_mempool_get(mp, &objtable[i]) < 0)
 			break;
 	}
@@ -173,11 +173,11 @@ test_mempool_basic(void)
 		obj_data = obj;
 		objnum = obj;
 		if (*objnum > MEMPOOL_SIZE) {
-			printf("bad object number\n");
+			printf("bad object number (%u)\n", *objnum);
 			ret = -1;
 			break;
 		}
-		for (j=sizeof(*objnum); j<mp->elt_size; j++) {
+		for (j = sizeof(*objnum); j < mp->elt_size; j++) {
 			if (obj_data[j] != 0)
 				ret = -1;
 		}
@@ -196,14 +196,17 @@ static int test_mempool_creation_with_exceeded_cache_size(void)
 {
 	struct rte_mempool *mp_cov;
 
-	mp_cov = rte_mempool_create("test_mempool_creation_with_exceeded_cache_size", MEMPOOL_SIZE,
-					      MEMPOOL_ELT_SIZE,
-					      RTE_MEMPOOL_CACHE_MAX_SIZE + 32, 0,
-					      NULL, NULL,
-					      my_obj_init, NULL,
-					      SOCKET_ID_ANY, 0);
-	if(NULL != mp_cov) {
-		return -1;
+	mp_cov = rte_mempool_create("test_mempool_cache_too_big",
+		MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE,
+		RTE_MEMPOOL_CACHE_MAX_SIZE + 32, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_cov != NULL) {
+		rte_mempool_free(mp_cov);
+		RET_ERR();
 	}
 
 	return 0;
@@ -241,8 +244,8 @@ static int test_mempool_single_producer(void)
 			continue;
 		}
 		if (rte_mempool_from_obj(obj) != mp_spsc) {
-			printf("test_mempool_single_producer there is an obj not owned by this mempool\n");
-			return -1;
+			printf("obj not owned by this mempool\n");
+			RET_ERR();
 		}
 		rte_mempool_sp_put(mp_spsc, obj);
 		rte_spinlock_lock(&scsp_spinlock);
@@ -288,7 +291,8 @@ static int test_mempool_single_consumer(void)
 }
 
 /*
- * test function for mempool test based on singple consumer and single producer, can run on one lcore only
+ * test function for mempool test based on single consumer and single producer,
+ * can run on one lcore only
  */
 static int test_mempool_launch_single_consumer(__attribute__((unused)) void *arg)
 {
@@ -313,33 +317,41 @@ test_mempool_sp_sc(void)
 	unsigned lcore_next;
 
 	/* create a mempool with single producer/consumer ring */
-	if (NULL == mp_spsc) {
+	if (mp_spsc == NULL) {
 		mp_spsc = rte_mempool_create("test_mempool_sp_sc", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						my_mp_init, NULL,
-						my_obj_init, NULL,
-						SOCKET_ID_ANY, MEMPOOL_F_NO_CACHE_ALIGN | MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET);
-		if (NULL == mp_spsc) {
-			return -1;
-		}
+			MEMPOOL_ELT_SIZE, 0, 0,
+			my_mp_init, NULL,
+			my_obj_init, NULL,
+			SOCKET_ID_ANY,
+			MEMPOOL_F_NO_CACHE_ALIGN | MEMPOOL_F_SP_PUT |
+			MEMPOOL_F_SC_GET);
+		if (mp_spsc == NULL)
+			RET_ERR();
 	}
 	if (rte_mempool_lookup("test_mempool_sp_sc") != mp_spsc) {
 		printf("Cannot lookup mempool from its name\n");
-		return -1;
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
 	}
 	lcore_next = rte_get_next_lcore(lcore_id, 0, 1);
-	if (RTE_MAX_LCORE <= lcore_next)
-		return -1;
-	if (rte_eal_lcore_role(lcore_next) != ROLE_RTE)
-		return -1;
+	if (lcore_next >= RTE_MAX_LCORE) {
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
+	}
+	if (rte_eal_lcore_role(lcore_next) != ROLE_RTE) {
+		rte_mempool_free(mp_spsc);
+		RET_ERR();
+	}
 	rte_spinlock_init(&scsp_spinlock);
 	memset(scsp_obj_table, 0, sizeof(scsp_obj_table));
-	rte_eal_remote_launch(test_mempool_launch_single_consumer, NULL, lcore_next);
-	if(test_mempool_single_producer() < 0)
+	rte_eal_remote_launch(test_mempool_launch_single_consumer, NULL,
+		lcore_next);
+	if (test_mempool_single_producer() < 0)
 		ret = -1;
 
-	if(rte_eal_wait_lcore(lcore_next) < 0)
+	if (rte_eal_wait_lcore(lcore_next) < 0)
 		ret = -1;
+	rte_mempool_free(mp_spsc);
 
 	return ret;
 }
@@ -348,7 +360,7 @@ test_mempool_sp_sc(void)
  * it tests some more basic of mempool
  */
 static int
-test_mempool_basic_ex(struct rte_mempool * mp)
+test_mempool_basic_ex(struct rte_mempool *mp)
 {
 	unsigned i;
 	void **obj;
@@ -358,38 +370,41 @@ test_mempool_basic_ex(struct rte_mempool * mp)
 	if (mp == NULL)
 		return ret;
 
-	obj = rte_calloc("test_mempool_basic_ex", MEMPOOL_SIZE , sizeof(void *), 0);
+	obj = rte_calloc("test_mempool_basic_ex", MEMPOOL_SIZE,
+		sizeof(void *), 0);
 	if (obj == NULL) {
 		printf("test_mempool_basic_ex fail to rte_malloc\n");
 		return ret;
 	}
-	printf("test_mempool_basic_ex now mempool (%s) has %u free entries\n", mp->name, rte_mempool_free_count(mp));
+	printf("test_mempool_basic_ex now mempool (%s) has %u free entries\n",
+		mp->name, rte_mempool_free_count(mp));
 	if (rte_mempool_full(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not full but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be full\n");
 		goto fail_mp_basic_ex;
 	}
 
 	for (i = 0; i < MEMPOOL_SIZE; i ++) {
 		if (rte_mempool_mc_get(mp, &obj[i]) < 0) {
-			printf("fail_mp_basic_ex fail to get mempool object for [%u]\n", i);
+			printf("test_mp_basic_ex fail to get object for [%u]\n",
+				i);
 			goto fail_mp_basic_ex;
 		}
 	}
 	if (rte_mempool_mc_get(mp, &err_obj) == 0) {
-		printf("test_mempool_basic_ex get an impossible obj from mempool\n");
+		printf("test_mempool_basic_ex get an impossible obj\n");
 		goto fail_mp_basic_ex;
 	}
 	printf("number: %u\n", i);
 	if (rte_mempool_empty(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not empty but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be empty\n");
 		goto fail_mp_basic_ex;
 	}
 
-	for (i = 0; i < MEMPOOL_SIZE; i ++) {
+	for (i = 0; i < MEMPOOL_SIZE; i++)
 		rte_mempool_mp_put(mp, obj[i]);
-	}
+
 	if (rte_mempool_full(mp) != 1) {
-		printf("test_mempool_basic_ex the mempool is not full but it should be\n");
+		printf("test_mempool_basic_ex the mempool should be full\n");
 		goto fail_mp_basic_ex;
 	}
 
@@ -405,28 +420,30 @@ fail_mp_basic_ex:
 static int
 test_mempool_same_name_twice_creation(void)
 {
-	struct rte_mempool *mp_tc;
+	struct rte_mempool *mp_tc, *mp_tc2;
 
 	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						NULL, NULL,
-						SOCKET_ID_ANY, 0);
-	if (mp_tc == NULL) {
-		printf("cannot create mempool\n");
-		return -1;
-	}
-
-	mp_tc = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						NULL, NULL,
-						SOCKET_ID_ANY, 0);
-	if (mp_tc != NULL) {
-		printf("should not be able to create mempool\n");
-		return -1;
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		NULL, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_tc == NULL)
+		RET_ERR();
+
+	mp_tc2 = rte_mempool_create("test_mempool_same_name", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		NULL, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_tc2 != NULL) {
+		rte_mempool_free(mp_tc);
+		rte_mempool_free(mp_tc2);
+		RET_ERR();
 	}
 
+	rte_mempool_free(mp_tc);
 	return 0;
 }
 
@@ -447,7 +464,7 @@ test_mempool_xmem_misc(void)
 	usz = rte_mempool_xmem_usage(NULL, elt_num, total_size, 0, 1,
 		MEMPOOL_PG_SHIFT_MAX);
 
-	if(sz != (size_t)usz)  {
+	if (sz != (size_t)usz) {
 		printf("failure @ %s: rte_mempool_xmem_usage(%u, %u) "
 			"returns: %#zx, while expected: %#zx;\n",
 			__func__, elt_num, total_size, sz, (size_t)usz);
@@ -460,68 +477,77 @@ test_mempool_xmem_misc(void)
 static int
 test_mempool(void)
 {
+	struct rte_mempool *mp_cache = NULL;
+	struct rte_mempool *mp_nocache = NULL;
+
 	rte_atomic32_init(&synchro);
 
 	/* create a mempool (without cache) */
-	if (mp_nocache == NULL)
-		mp_nocache = rte_mempool_create("test_nocache", MEMPOOL_SIZE,
-						MEMPOOL_ELT_SIZE, 0, 0,
-						NULL, NULL,
-						my_obj_init, NULL,
-						SOCKET_ID_ANY, 0);
-	if (mp_nocache == NULL)
-		return -1;
+	mp_nocache = rte_mempool_create("test_nocache", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE, 0, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_nocache == NULL) {
+		printf("cannot allocate mp_nocache mempool\n");
+		goto err;
+	}
 
 	/* create a mempool (with cache) */
-	if (mp_cache == NULL)
-		mp_cache = rte_mempool_create("test_cache", MEMPOOL_SIZE,
-					      MEMPOOL_ELT_SIZE,
-					      RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
-					      NULL, NULL,
-					      my_obj_init, NULL,
-					      SOCKET_ID_ANY, 0);
-	if (mp_cache == NULL)
-		return -1;
-
+	mp_cache = rte_mempool_create("test_cache", MEMPOOL_SIZE,
+		MEMPOOL_ELT_SIZE,
+		RTE_MEMPOOL_CACHE_MAX_SIZE, 0,
+		NULL, NULL,
+		my_obj_init, NULL,
+		SOCKET_ID_ANY, 0);
+
+	if (mp_cache == NULL) {
+		printf("cannot allocate mp_cache mempool\n");
+		goto err;
+	}
 
 	/* retrieve the mempool from its name */
 	if (rte_mempool_lookup("test_nocache") != mp_nocache) {
 		printf("Cannot lookup mempool from its name\n");
-		return -1;
+		goto err;
 	}
 
 	rte_mempool_list_dump(stdout);
 
 	/* basic tests without cache */
-	mp = mp_nocache;
-	if (test_mempool_basic() < 0)
-		return -1;
+	if (test_mempool_basic(mp_nocache) < 0)
+		goto err;
 
 	/* basic tests with cache */
-	mp = mp_cache;
-	if (test_mempool_basic() < 0)
-		return -1;
+	if (test_mempool_basic(mp_cache) < 0)
+		goto err;
 
 	/* more basic tests without cache */
 	if (test_mempool_basic_ex(mp_nocache) < 0)
-		return -1;
+		goto err;
 
 	/* mempool operation test based on single producer and single comsumer */
 	if (test_mempool_sp_sc() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_creation_with_exceeded_cache_size() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_same_name_twice_creation() < 0)
-		return -1;
+		goto err;
 
 	if (test_mempool_xmem_misc() < 0)
-		return -1;
+		goto err;
 
 	rte_mempool_list_dump(stdout);
 
 	return 0;
+
+err:
+	rte_mempool_free(mp_nocache);
+	rte_mempool_free(mp_cache);
+	return -1;
 }
 
 static struct test_command mempool_cmd = {
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* [PATCH v3 35/35] doc: update release notes about mempool allocation
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (33 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 34/35] app/test: rework mempool test Olivier Matz
@ 2016-05-18 11:04     ` Olivier Matz
  2016-05-19 12:47     ` [PATCH v3 00/35] mempool: rework memory allocation Thomas Monjalon
  35 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-18 11:04 UTC (permalink / raw)
  To: dev; +Cc: bruce.richardson, stephen, keith.wiles

Remove the deprecation notice and add an entry in the release notes
for the changes in mempool allocation.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 doc/guides/rel_notes/deprecation.rst   | 8 --------
 doc/guides/rel_notes/release_16_07.rst | 9 +++++++++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 7d94ba5..ad05eba 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -51,14 +51,6 @@ Deprecation Notices
   functions added to facilitate the creation of mempools using an external
   handler. The 16.07 release will contain these changes.
 
-* The rte_mempool allocation will be changed in 16.07:
-  allocation of large mempool in several virtual memory chunks, new API
-  to populate a mempool, new API to free a mempool, allocation in
-  anonymous mapping, drop of specific dom0 code. These changes will
-  induce a modification of the rte_mempool structure, plus a
-  modification of the API of rte_mempool_obj_iter(), implying a breakage
-  of the ABI.
-
 * A librte_vhost public structures refactor is planned for DPDK 16.07
   that requires both ABI and API change.
   The proposed refactor would expose DPDK vhost dev to applications as
diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index 58c8ef9..6cb5304 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -38,6 +38,15 @@ New Features
 
   The size of the mempool structure is reduced if the per-lcore cache is disabled.
 
+* **Changed the memory allocation in mempool library.**
+
+  * Added ability to allocate a large mempool in virtually fragmented memory.
+  * Added new APIs to populate a mempool with memory.
+  * Added an API to free a mempool.
+  * Modified the API of the rte_mempool_obj_iter() function.
+  * Dropped specific Xen Dom0 code.
+  * Dropped specific anonymous mempool code in testpmd.
+
 
 Resolved Issues
 ---------------
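
To make the new flow concrete, here is a rough sketch of the reworked
allocation steps behind these bullets (illustrative only; my_obj_init
stands in for any per-object initializer, error handling is shortened):

  #include <string.h>
  #include <rte_mempool.h>

  static void
  my_obj_init(struct rte_mempool *mp, void *arg, void *obj, unsigned idx)
  {
      memset(obj, 0, mp->elt_size);   /* example initializer */
  }

  static struct rte_mempool *
  create_pool_sketch(void)
  {
      struct rte_mempool *mp;

      /* create an empty pool: 8192 objects of 2048 bytes, cache 256 */
      mp = rte_mempool_create_empty("pool", 8192, 2048, 256, 0,
          SOCKET_ID_ANY, 0);
      if (mp == NULL)
          return NULL;

      /* populate: may now use several virtually contiguous chunks */
      if (rte_mempool_populate_default(mp) < 0) {
          rte_mempool_free(mp);   /* pools can now be freed */
          return NULL;
      }

      /* initialize each object in place */
      rte_mempool_obj_iter(mp, my_obj_init, NULL);
      return mp;
  }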
-- 
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 00/35] mempool: rework memory allocation
  2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
                       ` (34 preceding siblings ...)
  2016-05-18 11:04     ` [PATCH v3 35/35] doc: update release notes about mempool allocation Olivier Matz
@ 2016-05-19 12:47     ` Thomas Monjalon
  2016-05-20  8:42       ` Panu Matilainen
  35 siblings, 1 reply; 150+ messages in thread
From: Thomas Monjalon @ 2016-05-19 12:47 UTC (permalink / raw)
  To: Olivier Matz; +Cc: dev, bruce.richardson, stephen, keith.wiles

2016-05-18 13:04, Olivier Matz:
> This series is a rework of mempool. For those who don't want to read
> all the cover letter, here is a sumary:
> 
> - it is not possible to allocate large mempools if there is not enough
>   contiguous memory, this series solves this issue
> - introduce new APIs with less arguments: "create, populate, obj_init"
> - allow to free a mempool
> - split code in smaller functions, will ease the introduction of ext_handler
> - remove test-pmd anonymous mempool creation
> - remove most of dom0-specific mempool code
> - opens the door for a eal_memory rework: we probably don't need large
>   contiguous memory area anymore, working with pages would work.
> 
> This breaks the ABI as it was indicated in the deprecation for 16.04.
> The API stays almost the same, no modification is needed in examples app
> or in test-pmd. Only kni and mellanox drivers are slightly modified.

Applied with a small change you sent me to fix the mlx build in the middle
of the patchset, and with the MAINTAINERS entries updated for the removed
Xen files.

Thanks for the big rework!

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 00/35] mempool: rework memory allocation
  2016-05-19 12:47     ` [PATCH v3 00/35] mempool: rework memory allocation Thomas Monjalon
@ 2016-05-20  8:42       ` Panu Matilainen
  2016-05-20  9:09         ` Thomas Monjalon
  0 siblings, 1 reply; 150+ messages in thread
From: Panu Matilainen @ 2016-05-20  8:42 UTC (permalink / raw)
  To: Thomas Monjalon, Olivier Matz; +Cc: dev, bruce.richardson, stephen, keith.wiles

On 05/19/2016 03:47 PM, Thomas Monjalon wrote:
> 2016-05-18 13:04, Olivier Matz:
>> This series is a rework of mempool. For those who don't want to read
>> all the cover letter, here is a sumary:
>>
>> - it is not possible to allocate large mempools if there is not enough
>>   contiguous memory, this series solves this issue
>> - introduce new APIs with less arguments: "create, populate, obj_init"
>> - allow to free a mempool
>> - split code in smaller functions, will ease the introduction of ext_handler
>> - remove test-pmd anonymous mempool creation
>> - remove most of dom0-specific mempool code
>> - opens the door for a eal_memory rework: we probably don't need large
>>   contiguous memory area anymore, working with pages would work.
>>
>> This breaks the ABI as it was indicated in the deprecation for 16.04.
>> The API stays almost the same, no modification is needed in examples app
>> or in test-pmd. Only kni and mellanox drivers are slightly modified.
>
> Applied with a small change you sent me to fix mlx build in the middle of the patchset
> and update the removed Xen files in MAINTAINERS file.
>
> Thanks for the big rework!
>

Just noticed this series "breaks" --no-huge as a regular user, commit 
593a084afc2b to be exact:

mmap(NULL, 4194304, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, 0, 0) = -1 EAGAIN (Resource 
temporarily unavailable)
write(1, "EAL: rte_eal_hugepage_init: mmap"..., 76EAL: 
rte_eal_hugepage_init: mmap() failed: Resource temporarily unavailable

"Breaks" in quotes because I guess it always was broken (as the 
non-locked pages might not be in physical memory) and because its
possible to adjust resourse limits to allow the operation to succeed.
If you're root, that is.
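
For illustration, the resource-limit workaround looks roughly like this
(values are examples; the memlock limit can also be raised persistently
via /etc/security/limits.conf):

  # show the locked-memory limit (RLIMIT_MEMLOCK), often just 64 kB
  ulimit -l
  # as root: raise it for this shell, then run the tests
  ulimit -l unlimited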

I was just looking into making the test-suite runnable by a regular user
with no special privileges, primarily to make it possible to run the
testsuite as part of rpm package builds (in %check), where no special
setup or extra privileges can be assumed. Such tests are of course of
limited coverage, but still better than nothing, and --no-huge was my
ticket there. Talk about bad timing :)

It'd be fine to have a limited subset of tests to run when non-privileged,
but since this one lives inside rte_eal_init() it practically prevents
everything, unless I'm missing some other magic switch or such. Thoughts?
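
For concreteness, the kind of unprivileged run I have in mind would be
something like (hypothetical, assuming the mmap issue above is solved):

  # no hugepages, no root, small memory footprint
  ./build/app/test --no-huge -m 256
  RTE>>mempool_autotest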

	- Panu -

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 00/35] mempool: rework memory allocation
  2016-05-20  8:42       ` Panu Matilainen
@ 2016-05-20  9:09         ` Thomas Monjalon
  2016-05-23  7:43           ` Olivier Matz
  0 siblings, 1 reply; 150+ messages in thread
From: Thomas Monjalon @ 2016-05-20  9:09 UTC (permalink / raw)
  To: Panu Matilainen; +Cc: Olivier Matz, dev

2016-05-20 11:42, Panu Matilainen:
> Just noticed this series "breaks" --no-huge as a regular user, commit 
> 593a084afc2b to be exact:
> 
> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, 
> MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, 0, 0) = -1 EAGAIN (Resource 
> temporarily unavailable)
> write(1, "EAL: rte_eal_hugepage_init: mmap"..., 76EAL: 
> rte_eal_hugepage_init: mmap() failed: Resource temporarily unavailable
> 
> "Breaks" in quotes because I guess it always was broken (as the 
> non-locked pages might not be in physical memory) and because it's
> possible to adjust resource limits to allow the operation to succeed.
> If you're root, that is.
> 
> I was just looking into making the test-suite runnable by a regular user 
> with no special privileges,

I have the same dream, to make sure every developer can run the unit tests
easily and quickly.

> primarily to make it possible to run the 
> testsuite as part of rpm package builds (in %check), and no special 
> setup or extra privileges can be assumed there. Such tests are of course 
> of limited coverage but still better than nothing, and --no-huge was my 
> ticket there. Talk about bad timing :)
> 
> It'd be fine to have limited subset of tests to run when non-privileged 
> but since this one lives inside rte_eal_init() it practically prevents 
> everything, unless I'm missing some other magic switch or such. Thoughts?

This change was done for mbuf allocation because mbufs are passed to the
hardware. We should not have any hardware constraint in the unit tests,
so I'd say it is a requirement for the memory rework: we must be capable
of allocating some locked pages when required, and some standard pages
for other usages.
Please jump in this thread:
	http://dpdk.org/ml/archives/dev/2016-April/037444.html
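
Roughly, the distinction I mean, as a minimal sketch (Linux-only, error
handling omitted, function names are illustrative):

  #include <sys/mman.h>
  #include <stddef.h>

  /* DMA-visible memory: must stay resident; counts against
   * RLIMIT_MEMLOCK, hence the EAGAIN above for regular users. */
  static void *alloc_locked(size_t len)
  {
      return mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
  }

  /* Ordinary working memory: no special privileges needed. */
  static void *alloc_standard(size_t len)
  {
      return mmap(NULL, len, PROT_READ | PROT_WRITE,
          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }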

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 00/35] mempool: rework memory allocation
  2016-05-20  9:09         ` Thomas Monjalon
@ 2016-05-23  7:43           ` Olivier Matz
  2016-06-13 10:27             ` Olivier Matz
  0 siblings, 1 reply; 150+ messages in thread
From: Olivier Matz @ 2016-05-23  7:43 UTC (permalink / raw)
  To: Thomas Monjalon, Panu Matilainen; +Cc: dev

Hi Panu, Thomas,

On 05/20/2016 11:09 AM, Thomas Monjalon wrote:
> 2016-05-20 11:42, Panu Matilainen:
>> Just noticed this series "breaks" --no-huge as a regular user, commit 
>> 593a084afc2b to be exact:
>>
>> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, 
>> MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, 0, 0) = -1 EAGAIN (Resource 
>> temporarily unavailable)
>> write(1, "EAL: rte_eal_hugepage_init: mmap"..., 76EAL: 
>> rte_eal_hugepage_init: mmap() failed: Resource temporarily unavailable
>>
>> "Breaks" in quotes because I guess it always was broken (as the 
>> non-locked pages might not be in physical memory) and because it's
>> possible to adjust resource limits to allow the operation to succeed.
>> If you're root, that is.
>>
>> I was just looking into making the test-suite runnable by a regular user 
>> with no special privileges,
> 
> I have the same dream, to make sure every developer can run the unit tests
> easily and quickly.

Thanks Panu for the feedback on this; I hadn't noticed this regression
for a regular user.

The goal of this commit was to take a step toward a working --no-huge:
locking the pages in physical memory is mandatory for most physical
drivers. But as described at the end of
http://dpdk.org/ml/archives/dev/2016-May/039229.html, the
--no-huge option is still not working because the physical addresses
are not correct.
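
For reference, without hugepages the physical address must be resolved
through /proc/self/pagemap; here is a simplified sketch in the spirit of
what the EAL does (Linux-only; if the page is not locked it may not be
resident, the PFN reads as zero, and the resulting address is wrong):

  #include <fcntl.h>
  #include <stdint.h>
  #include <unistd.h>

  static uint64_t
  virt2phy_sketch(const void *virt)
  {
      long pgsz = sysconf(_SC_PAGESIZE);
      uint64_t entry = 0;
      uint64_t ofs = ((uintptr_t)virt / pgsz) * sizeof(entry);
      int fd = open("/proc/self/pagemap", O_RDONLY);

      if (fd < 0)
          return 0;
      if (pread(fd, &entry, sizeof(entry), ofs) != sizeof(entry))
          entry = 0;
      close(fd);
      if (!(entry & (1ULL << 63)))    /* bit 63: page present */
          return 0;
      /* bits 0-54: page frame number */
      return (entry & ((1ULL << 55) - 1)) * pgsz
          + (uintptr_t)virt % pgsz;
  }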

So I think it wouldn't be a problem to revert this commit if it breaks
something.

>> primarily to make it possible to run the 
>> testsuite as part of rpm package builds (in %check), and no special 
>> setup or extra privileges can be assumed there. Such tests are of course 
>> of limited coverage but still better than nothing, and --no-huge was my 
>> ticket there. Talk about bad timing :)
>>
>> It'd be fine to have limited subset of tests to run when non-privileged 
>> but since this one lives inside rte_eal_init() it practically prevents 
>> everything, unless I'm missing some other magic switch or such. Thoughts?
> 
> This change was done for mbuf allocation because they are passed to the
> hardware. We should not have any hardware constraint in the unit tests.
> So I'd say it is a requirement for the memory rework. We must be capable
> to allocate some locked pages if required, and some standard pages for
> other usages.
> Please jump in this thread:
> 	http://dpdk.org/ml/archives/dev/2016-April/037444.html

Yes, I agree, this is something that could be managed in the
memory rework task.


Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 13/35] mempool: store physical address in objects
  2016-05-18 11:04     ` [PATCH v3 13/35] mempool: store physical address in objects Olivier Matz
@ 2016-05-25 17:51       ` Jain, Deepak K
  2016-05-25 19:41         ` Olivier Matz
  0 siblings, 1 reply; 150+ messages in thread
From: Jain, Deepak K @ 2016-05-25 17:51 UTC (permalink / raw)
  To: Olivier Matz, dev
  Cc: Richardson, Bruce, stephen, Wiles, Keith, Griffin, John, Kusztal,
	ArkadiuszX, Trahe, Fiona, Mcnamara, John

Hi,

While running the QAT PMD tests, a system hang is observed when this commit is used.

rte_mempool_virt2phy is used in qat_crypto.c.

regards,
Deepak


-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
Sent: Wednesday, May 18, 2016 12:05 PM
To: dev@dpdk.org
Cc: Richardson, Bruce <bruce.richardson@intel.com>; stephen@networkplumber.org; Wiles, Keith <keith.wiles@intel.com>
Subject: [dpdk-dev] [PATCH v3 13/35] mempool: store physical address in objects

Store the physical address of the object in its header. It simplifies
rte_mempool_virt2phy() and prepares for the removal of the paddr[] table
in the mempool header.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
---
 lib/librte_mempool/rte_mempool.c | 17 +++++++++++------
 lib/librte_mempool/rte_mempool.h | 11 ++++++-----
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 61e191e..ce12db5 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -133,19 +133,22 @@ static unsigned optimize_object_size(unsigned obj_size)
 typedef void (*rte_mempool_obj_iter_t)(void * /*obj_iter_arg*/,
 	void * /*obj_start*/,
 	void * /*obj_end*/,
-	uint32_t /*obj_index */);
+	uint32_t /*obj_index */,
+	phys_addr_t /*physaddr*/);
 
 static void
-mempool_add_elem(struct rte_mempool *mp, void *obj)
+mempool_add_elem(struct rte_mempool *mp, void *obj, phys_addr_t physaddr)
 {
 	struct rte_mempool_objhdr *hdr;
 	struct rte_mempool_objtlr *tlr __rte_unused;
 
 	obj = (char *)obj + mp->header_size;
+	physaddr += mp->header_size;
 
 	/* set mempool ptr in header */
 	hdr = RTE_PTR_SUB(obj, sizeof(*hdr));
 	hdr->mp = mp;
+	hdr->physaddr = physaddr;
 	STAILQ_INSERT_TAIL(&mp->elt_list, hdr, next);
 
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
@@ -175,6 +178,7 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 	uint32_t pgn, pgf;
 	uintptr_t end, start, va;
 	uintptr_t pg_sz;
+	phys_addr_t physaddr;
 
 	pg_sz = (uintptr_t)1 << pg_shift;
 	va = (uintptr_t)vaddr;
@@ -210,9 +214,10 @@ rte_mempool_obj_mem_iter(void *vaddr, uint32_t elt_num, size_t total_elt_sz,
 		 * otherwise, just skip that chunk unused.
 		 */
 		if (k == pgn) {
+			physaddr = paddr[k] + (start & (pg_sz - 1));
 			if (obj_iter != NULL)
 				obj_iter(obj_iter_arg, (void *)start,
-					(void *)end, i);
+					(void *)end, i, physaddr);
 			va = end;
 			j += pgf;
 			i++;
@@ -249,11 +254,11 @@ rte_mempool_obj_iter(struct rte_mempool *mp,
 
 static void
 mempool_obj_populate(void *arg, void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, phys_addr_t physaddr)
 {
 	struct rte_mempool *mp = arg;
 
-	mempool_add_elem(mp, start);
+	mempool_add_elem(mp, start, physaddr);
 	mp->elt_va_end = (uintptr_t)end;
 }
 
@@ -358,7 +363,7 @@ rte_mempool_xmem_size(uint32_t elt_num, size_t total_elt_sz, uint32_t pg_shift)
  */
 static void
 mempool_lelem_iter(void *arg, __rte_unused void *start, void *end,
-	__rte_unused uint32_t idx)
+	__rte_unused uint32_t idx, __rte_unused phys_addr_t physaddr)
 {
 	*(uintptr_t *)arg = (uintptr_t)end;
 }
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 12215f6..4f95bdf 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -159,6 +159,7 @@ struct rte_mempool_objsz {
 struct rte_mempool_objhdr {
 	STAILQ_ENTRY(rte_mempool_objhdr) next; /**< Next in list. */
 	struct rte_mempool *mp;          /**< The mempool owning the object. */
+	phys_addr_t physaddr;            /**< Physical address of the object. */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
 	uint64_t cookie;                 /**< Debug cookie. */
 #endif
@@ -1131,13 +1132,13 @@ rte_mempool_empty(const struct rte_mempool *mp)
  *   The physical address of the elt element.
  */
 static inline phys_addr_t
-rte_mempool_virt2phy(const struct rte_mempool *mp, const void *elt)
+rte_mempool_virt2phy(__rte_unused const struct rte_mempool *mp, const void *elt)
 {
 	if (rte_eal_has_hugepages()) {
-		uintptr_t off;
-
-		off = (const char *)elt - (const char *)mp->elt_va_start;
-		return mp->elt_pa[off >> mp->pg_shift] + (off & mp->pg_mask);
+		const struct rte_mempool_objhdr *hdr;
+		hdr = (const struct rte_mempool_objhdr *)RTE_PTR_SUB(elt,
+			sizeof(*hdr));
+		return hdr->physaddr;
 	} else {
 		/*
 		 * If huge pages are disabled, we cannot assume the
--
2.8.0.rc3

^ permalink raw reply related	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 13/35] mempool: store physical address in objects
  2016-05-25 17:51       ` Jain, Deepak K
@ 2016-05-25 19:41         ` Olivier Matz
  0 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-05-25 19:41 UTC (permalink / raw)
  To: Jain, Deepak K, dev
  Cc: Richardson, Bruce, stephen, Wiles, Keith, Griffin, John, Kusztal,
	ArkadiuszX, Trahe, Fiona, Mcnamara, John

Hi Deepak,

On 05/25/2016 07:51 PM, Jain, Deepak K wrote:
> Hi,
> 
> While running the QAT PMD tests, a system hang is observed when this commit is used.
> 
> rte_mempool_virt2phy is used in qat_crypto.c.

From what I see in the code, the second argument of the function
rte_mempool_virt2phy(mp, elt) is not a pointer to an element of
the mempool.

This should be the case according to the API (even before my patchset):

  * @param elt
  *   A pointer (virtual address) to the element of the pool.


Could you try to replace:

  s->cd_paddr = rte_mempool_virt2phy(mp, &s->cd)

with something like:

  s->cd_paddr = rte_mempool_virt2phy(mp, s) +
    offsetof(struct qat_session, cd)

i.e. take the physical address of the element itself, then add the
offset of the cd field within it.



Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 20/35] mempool: allocate in several memory chunks by default
  2016-05-18 11:04     ` [PATCH v3 20/35] mempool: allocate in several memory chunks by default Olivier Matz
@ 2016-06-01  3:37       ` Ferruh Yigit
  0 siblings, 0 replies; 150+ messages in thread
From: Ferruh Yigit @ 2016-06-01  3:37 UTC (permalink / raw)
  To: Olivier Matz, dev; +Cc: bruce.richardson, stephen, keith.wiles

On 5/18/2016 12:04 PM, Olivier Matz wrote:
> Introduce rte_mempool_populate_default() which allocates
> mempool objects in several memzones.
> 
> The mempool header is now always allocated in a specific memzone
> (not with its objects). Thanks to this modification, we can remove
> many specific behavior that was required when hugepages are not
> enabled in case we are using rte_mempool_xmem_create().
> 
> This change requires to update how kni and mellanox drivers lookup for
> mbuf memory. For now, this will only work if there is only one memory
> chunk (like today), but we could make use of rte_mempool_mem_iter() to
> support more memory chunks.
> 
> We can also remove RTE_MEMPOOL_OBJ_NAME that is not required anymore for
> the lookup, as memory chunks are referenced by the mempool.
> 
> Note that rte_mempool_create() is still broken (it was the case before)
> when there is no hugepages support (rte_mempool_create_xmem() has to be
> used). This is fixed in next commit.
> 
> Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
> ---

<snip>

> -	if (vaddr == NULL) {
> -		/* calculate address of the first elt for continuous mempool. */
> -		obj = (char *)mp + MEMPOOL_HEADER_SIZE(mp, cache_size) +
> -			private_data_size;
> -		obj = RTE_PTR_ALIGN_CEIL(obj, RTE_MEMPOOL_ALIGN);
> -
> -		ret = rte_mempool_populate_phys(mp, obj,
> -			mp->phys_addr + ((char *)obj - (char *)mp),
> -			objsz.total_size * n, NULL, NULL);
> -		if (ret != (int)mp->size)
> -			goto exit_unlock;
> -	} else {
> +	if (vaddr == NULL)
> +		ret = rte_mempool_populate_default(mp);

This breaks the current ivshmem code, since the mempool now has multiple
memzones. I will send a patch for ivshmem.
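
A possible direction for multi-chunk support, using the chunk iterator
this series adds (hypothetical sketch; chunk_cb and ctx are illustrative):

  #include <rte_mempool.h>

  static void
  chunk_cb(struct rte_mempool *mp, void *opaque,
      struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
  {
      /* each chunk exposes memhdr->addr, memhdr->phys_addr and
       * memhdr->len; record them instead of assuming one memzone */
      (void)mp; (void)opaque; (void)mem_idx;
  }

  /* usage: rte_mempool_mem_iter(mp, chunk_cb, &ctx); */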

<snip>

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH v3 00/35] mempool: rework memory allocation
  2016-05-23  7:43           ` Olivier Matz
@ 2016-06-13 10:27             ` Olivier Matz
  0 siblings, 0 replies; 150+ messages in thread
From: Olivier Matz @ 2016-06-13 10:27 UTC (permalink / raw)
  To: Thomas Monjalon, Panu Matilainen; +Cc: dev

Hi,

On 05/23/2016 09:43 AM, Olivier Matz wrote:
> Hi Panu, Thomas,
> 
> On 05/20/2016 11:09 AM, Thomas Monjalon wrote:
>> 2016-05-20 11:42, Panu Matilainen:
>>> Just noticed this series "breaks" --no-huge as a regular user, commit 
>>> 593a084afc2b to be exact:
>>>
>>> mmap(NULL, 4194304, PROT_READ|PROT_WRITE, 
>>> MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, 0, 0) = -1 EAGAIN (Resource 
>>> temporarily unavailable)
>>> write(1, "EAL: rte_eal_hugepage_init: mmap"..., 76EAL: 
>>> rte_eal_hugepage_init: mmap() failed: Resource temporarily unavailable
>>>
>>> "Breaks" in quotes because I guess it always was broken (as the 
>>> non-locked pages might not be in physical memory) and because it's
>>> possible to adjust resource limits to allow the operation to succeed.
>>> If you're root, that is.
>>>
>>> I was just looking into making the test-suite runnable by a regular user 
>>> with no special privileges,
>>
>> I have the same dream, to make sure every developer can run the unit tests
>> easily and quickly.
> 
> Thanks Panu for the feedback on this, I didn't notice this regression
> for a regular user.
> 
> The goal of this commit was to do a step forward in the direction
> of a working --no-huge: locking the pages in physical memory is
> mandatory for most physical drivers. But as described at the end
> of http://dpdk.org/ml/archives/dev/2016-May/039229.html , the
> --no-huge option is still not working because the physical addresses
> are not correct.
> 
> So I think it wouldn't be a problem to revert this commit if it breaks
> something.

I've just sent a patch to fix that. Feel free to comment.
See http://www.dpdk.org/ml/archives/dev/2016-June/041051.html

Regards,
Olivier

^ permalink raw reply	[flat|nested] 150+ messages in thread

end of thread, other threads:[~2016-06-13 10:27 UTC | newest]

Thread overview: 150+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-09 16:19 [RFC 00/35] mempool: rework memory allocation Olivier Matz
2016-03-09 16:19 ` [RFC 01/35] mempool: fix comments and style Olivier Matz
2016-03-09 16:19 ` [RFC 02/35] mempool: replace elt_size by total_elt_size Olivier Matz
2016-03-09 16:19 ` [RFC 03/35] mempool: uninline function to check cookies Olivier Matz
2016-03-09 16:19 ` [RFC 04/35] mempool: use sizeof to get the size of header and trailer Olivier Matz
2016-03-09 16:19 ` [RFC 05/35] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t Olivier Matz
2016-03-09 16:19 ` [RFC 06/35] mempool: update library version Olivier Matz
2016-03-09 16:19 ` [RFC 07/35] mempool: list objects when added in the mempool Olivier Matz
2016-03-09 16:19 ` [RFC 08/35] mempool: remove const attribute in mempool_walk Olivier Matz
2016-03-09 16:19 ` [RFC 09/35] mempool: use the list to iterate the mempool elements Olivier Matz
2016-03-09 16:19 ` [RFC 10/35] eal: introduce RTE_DECONST macro Olivier Matz
2016-03-09 18:53   ` Stephen Hemminger
2016-03-09 20:47     ` Olivier MATZ
2016-03-09 21:01       ` Stephen Hemminger
2016-03-10  8:11         ` Olivier MATZ
2016-03-11 21:47           ` Stephen Hemminger
2016-03-09 21:22       ` Bruce Richardson
2016-03-10  8:29         ` Olivier MATZ
2016-03-10  9:26           ` Bruce Richardson
2016-03-10 10:05             ` Olivier MATZ
2016-03-09 16:19 ` [RFC 11/35] mempool: use the list to audit all elements Olivier Matz
2016-03-09 16:19 ` [RFC 12/35] mempool: use the list to initialize mempool objects Olivier Matz
2016-03-09 16:19 ` [RFC 13/35] mempool: create the internal ring in a specific function Olivier Matz
2016-03-09 16:19 ` [RFC 14/35] mempool: store physaddr in mempool objects Olivier Matz
2016-03-09 16:19 ` [RFC 15/35] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
2016-03-09 16:19 ` [RFC 16/35] mempool: store memory chunks in a list Olivier Matz
2016-03-09 16:19 ` [RFC 17/35] mempool: new function to iterate the memory chunks Olivier Matz
2016-03-09 16:19 ` [RFC 18/35] mempool: simplify xmem_usage Olivier Matz
2016-03-09 16:19 ` [RFC 19/35] mempool: introduce a free callback for memory chunks Olivier Matz
2016-03-09 16:19 ` [RFC 20/35] mempool: make page size optional when getting xmem size Olivier Matz
2016-03-09 16:19 ` [RFC 21/35] mempool: default allocation in several memory chunks Olivier Matz
2016-03-09 16:19 ` [RFC 22/35] eal: lock memory when using no-huge Olivier Matz
2016-03-09 16:19 ` [RFC 23/35] mempool: support no-hugepage mode Olivier Matz
2016-03-09 16:19 ` [RFC 24/35] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
2016-03-09 16:19 ` [RFC 25/35] mempool: introduce a function to free a mempool Olivier Matz
2016-03-09 16:19 ` [RFC 26/35] mempool: introduce a function to create an empty mempool Olivier Matz
2016-03-09 16:19 ` [RFC 27/35] eal/xen: return machine address without knowing memseg id Olivier Matz
2016-03-09 16:19 ` [RFC 28/35] mempool: rework support of xen dom0 Olivier Matz
2016-03-09 16:19 ` [RFC 29/35] mempool: create the internal ring when populating Olivier Matz
2016-03-09 16:19 ` [RFC 30/35] mempool: populate a mempool with anonymous memory Olivier Matz
2016-03-09 16:19 ` [RFC 31/35] test-pmd: remove specific anon mempool code Olivier Matz
2016-03-09 16:19 ` [RFC 32/35] mempool: make mempool populate and free api public Olivier Matz
2016-03-09 16:19 ` [RFC 33/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
2016-03-09 16:19 ` [RFC 34/35] mempool: new flag when phys contig mem is not needed Olivier Matz
2016-03-09 16:19 ` [RFC 35/35] mempool: update copyright Olivier Matz
2016-03-09 18:52   ` Stephen Hemminger
2016-03-10 14:57     ` Panu Matilainen
2016-03-09 16:44 ` [RFC 00/35] mempool: rework memory allocation Olivier MATZ
2016-03-17  9:05 ` [PATCH] doc: mempool ABI deprecation notice for 16.07 Olivier Matz
2016-04-04 14:38   ` Thomas Monjalon
2016-04-05  9:27     ` Hunt, David
2016-04-05 14:08       ` Wiles, Keith
2016-04-05 15:17         ` Thomas Monjalon
2016-04-14 10:19 ` [PATCH 00/36] mempool: rework memory allocation Olivier Matz
2016-04-14 10:19   ` [PATCH 01/36] mempool: fix comments and style Olivier Matz
2016-04-14 14:15     ` Wiles, Keith
2016-04-14 10:19   ` [PATCH 02/36] mempool: replace elt_size by total_elt_size Olivier Matz
2016-04-14 14:18     ` Wiles, Keith
2016-04-14 10:19   ` [PATCH 03/36] mempool: uninline function to check cookies Olivier Matz
2016-04-14 10:19   ` [PATCH 04/36] mempool: use sizeof to get the size of header and trailer Olivier Matz
2016-04-14 10:19   ` [PATCH 05/36] mempool: rename mempool_obj_ctor_t as mempool_obj_cb_t Olivier Matz
2016-04-14 10:19   ` [PATCH 06/36] mempool: update library version Olivier Matz
2016-04-15 12:38     ` Olivier Matz
2016-04-14 10:19   ` [PATCH 07/36] mempool: list objects when added in the mempool Olivier Matz
2016-04-14 10:19   ` [PATCH 08/36] mempool: remove const attribute in mempool_walk Olivier Matz
2016-04-14 10:19   ` [PATCH 09/36] mempool: remove const qualifier in dump and audit Olivier Matz
2016-04-14 10:19   ` [PATCH 10/36] mempool: use the list to iterate the mempool elements Olivier Matz
2016-04-14 15:33     ` Wiles, Keith
2016-04-15  7:31       ` Olivier Matz
2016-04-15 13:19         ` Wiles, Keith
2016-05-11 10:02     ` [PATCH v2 " Olivier Matz
2016-04-14 10:19   ` [PATCH 11/36] mempool: use the list to audit all elements Olivier Matz
2016-04-14 10:19   ` [PATCH 12/36] mempool: use the list to initialize mempool objects Olivier Matz
2016-04-14 10:19   ` [PATCH 13/36] mempool: create the internal ring in a specific function Olivier Matz
2016-04-14 10:19   ` [PATCH 14/36] mempool: store physaddr in mempool objects Olivier Matz
2016-04-14 15:40     ` Wiles, Keith
2016-04-15  7:34       ` Olivier Matz
2016-04-14 10:19   ` [PATCH 15/36] mempool: remove MEMPOOL_IS_CONTIG() Olivier Matz
2016-04-14 10:19   ` [PATCH 16/36] mempool: store memory chunks in a list Olivier Matz
2016-04-14 10:19   ` [PATCH 17/36] mempool: new function to iterate the memory chunks Olivier Matz
2016-04-14 10:19   ` [PATCH 18/36] mempool: simplify xmem_usage Olivier Matz
2016-04-14 10:19   ` [PATCH 19/36] mempool: introduce a free callback for memory chunks Olivier Matz
2016-04-14 10:19   ` [PATCH 20/36] mempool: make page size optional when getting xmem size Olivier Matz
2016-04-14 10:19   ` [PATCH 21/36] mempool: default allocation in several memory chunks Olivier Matz
2016-04-14 10:19   ` [PATCH 22/36] eal: lock memory when using no-huge Olivier Matz
2016-04-14 10:19   ` [PATCH 23/36] mempool: support no-hugepage mode Olivier Matz
2016-04-14 10:19   ` [PATCH 24/36] mempool: replace mempool physaddr by a memzone pointer Olivier Matz
2016-04-14 10:19   ` [PATCH 25/36] mempool: introduce a function to free a mempool Olivier Matz
2016-04-14 10:19   ` [PATCH 26/36] mempool: introduce a function to create an empty mempool Olivier Matz
2016-04-14 15:57     ` Wiles, Keith
2016-04-15  7:42       ` Olivier Matz
2016-04-15 13:26         ` Wiles, Keith
2016-04-14 10:19   ` [PATCH 27/36] eal/xen: return machine address without knowing memseg id Olivier Matz
2016-04-14 10:19   ` [PATCH 28/36] mempool: rework support of xen dom0 Olivier Matz
2016-04-14 10:19   ` [PATCH 29/36] mempool: create the internal ring when populating Olivier Matz
2016-04-14 10:19   ` [PATCH 30/36] mempool: populate a mempool with anonymous memory Olivier Matz
2016-04-14 10:19   ` [PATCH 31/36] mempool: make mempool populate and free api public Olivier Matz
2016-04-14 10:19   ` [PATCH 32/36] test-pmd: remove specific anon mempool code Olivier Matz
2016-04-14 10:19   ` [PATCH 33/36] mem: avoid memzone/mempool/ring name truncation Olivier Matz
2016-04-14 10:19   ` [PATCH 34/36] mempool: new flag when phys contig mem is not needed Olivier Matz
2016-04-14 10:19   ` [PATCH 35/36] app/test: rework mempool test Olivier Matz
2016-04-14 10:19   ` [PATCH 36/36] mempool: update copyright Olivier Matz
2016-04-14 13:50   ` [PATCH 00/36] mempool: rework memory allocation Wiles, Keith
2016-04-14 14:01     ` Olivier MATZ
2016-04-14 14:03       ` Wiles, Keith
2016-04-14 14:20       ` Hunt, David
2016-05-18 11:04   ` [PATCH v3 00/35] " Olivier Matz
2016-05-18 11:04     ` [PATCH v3 01/35] mempool: rework comments and style Olivier Matz
2016-05-18 11:04     ` [PATCH v3 02/35] mempool: rename element size variables Olivier Matz
2016-05-18 11:04     ` [PATCH v3 03/35] mempool: uninline function to check cookies Olivier Matz
2016-05-18 11:04     ` [PATCH v3 04/35] mempool: use sizeof to get the size of header and trailer Olivier Matz
2016-05-18 11:04     ` [PATCH v3 05/35] mempool: rename object constructor typedef Olivier Matz
2016-05-18 11:04     ` [PATCH v3 06/35] mempool: list objects when added Olivier Matz
2016-05-18 11:04     ` [PATCH v3 07/35] mempool: remove const qualifier when browsing pools Olivier Matz
2016-05-18 11:04     ` [PATCH v3 08/35] mempool: remove const qualifier in dump and audit Olivier Matz
2016-05-18 11:04     ` [PATCH v3 09/35] mempool: use the list to iterate the elements Olivier Matz
2016-05-18 11:04     ` [PATCH v3 10/35] mempool: use the list to audit all elements Olivier Matz
2016-05-18 11:04     ` [PATCH v3 11/35] mempool: use the list to initialize objects Olivier Matz
2016-05-18 11:04     ` [PATCH v3 12/35] mempool: create internal ring in a specific function Olivier Matz
2016-05-18 11:04     ` [PATCH v3 13/35] mempool: store physical address in objects Olivier Matz
2016-05-25 17:51       ` Jain, Deepak K
2016-05-25 19:41         ` Olivier Matz
2016-05-18 11:04     ` [PATCH v3 14/35] mempool: remove macro to check if contiguous Olivier Matz
2016-05-18 11:04     ` [PATCH v3 15/35] mempool: store memory chunks in a list Olivier Matz
2016-05-18 11:04     ` [PATCH v3 16/35] mempool: add function to iterate the memory chunks Olivier Matz
2016-05-18 11:04     ` [PATCH v3 17/35] mempool: simplify the memory usage calculation Olivier Matz
2016-05-18 11:04     ` [PATCH v3 18/35] mempool: introduce a free callback for memory chunks Olivier Matz
2016-05-18 11:04     ` [PATCH v3 19/35] mempool: get memory size with unspecified page size Olivier Matz
2016-05-18 11:04     ` [PATCH v3 20/35] mempool: allocate in several memory chunks by default Olivier Matz
2016-06-01  3:37       ` Ferruh Yigit
2016-05-18 11:04     ` [PATCH v3 21/35] eal: lock memory when not using hugepages Olivier Matz
2016-05-18 11:04     ` [PATCH v3 22/35] mempool: support no hugepage mode Olivier Matz
2016-05-18 11:04     ` [PATCH v3 23/35] mempool: replace physical address by a memzone pointer Olivier Matz
2016-05-18 11:04     ` [PATCH v3 24/35] mempool: introduce a function to free a pool Olivier Matz
2016-05-18 11:04     ` [PATCH v3 25/35] mempool: introduce a function to create an empty pool Olivier Matz
2016-05-18 11:04     ` [PATCH v3 26/35] eal/xen: return machine address without knowing memseg id Olivier Matz
2016-05-18 11:04     ` [PATCH v3 27/35] mempool: rework support of Xen dom0 Olivier Matz
2016-05-18 11:04     ` [PATCH v3 28/35] mempool: create the internal ring when populating Olivier Matz
2016-05-18 11:04     ` [PATCH v3 29/35] mempool: populate with anonymous memory Olivier Matz
2016-05-18 11:04     ` [PATCH v3 30/35] mempool: make mempool populate and free api public Olivier Matz
2016-05-18 11:04     ` [PATCH v3 31/35] app/testpmd: remove anonymous mempool code Olivier Matz
2016-05-18 11:04     ` [PATCH v3 32/35] mem: avoid memzone/mempool/ring name truncation Olivier Matz
2016-05-18 11:04     ` [PATCH v3 33/35] mempool: add flag for removing phys contiguous constraint Olivier Matz
2016-05-18 11:04     ` [PATCH v3 34/35] app/test: rework mempool test Olivier Matz
2016-05-18 11:04     ` [PATCH v3 35/35] doc: update release notes about mempool allocation Olivier Matz
2016-05-19 12:47     ` [PATCH v3 00/35] mempool: rework memory allocation Thomas Monjalon
2016-05-20  8:42       ` Panu Matilainen
2016-05-20  9:09         ` Thomas Monjalon
2016-05-23  7:43           ` Olivier Matz
2016-06-13 10:27             ` Olivier Matz
