All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration
@ 2018-02-02 15:16 Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
of generating them at run time due to security concerns [1] led to a few
issues:

- They must be present on the file system before running DPDK.
- Their location must be known to the dynamic linker.
- Their names overlap and ABI compatibility is not guaranteed, which may
  lead to crashes.

This series addresses the above by adding version information to plug-ins
and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
system.

[1] http://dpdk.org/ml/archives/dev/2018-January/089617.html

Adrien Mazarguil (4):
  net/mlx: add debug checks to glue structure
  net/mlx: fix missing includes for rdma-core glue
  net/mlx: version rdma-core glue libraries
  net/mlx: make rdma-core glue path configurable

 doc/guides/nics/mlx4.rst     | 17 ++++++++++++
 doc/guides/nics/mlx5.rst     | 14 ++++++++++
 drivers/net/mlx4/Makefile    |  8 ++++--
 drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_glue.c |  4 +++
 drivers/net/mlx4/mlx4_glue.h |  9 +++++++
 drivers/net/mlx5/Makefile    |  8 ++++--
 drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_glue.c |  1 +
 drivers/net/mlx5/mlx5_glue.h |  7 +++++
 10 files changed, 176 insertions(+), 6 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v1 1/4] net/mlx: add debug checks to glue structure
  2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
@ 2018-02-02 15:16 ` Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

This code should catch mistakes early if a glue structure member is added
without a corresponding implementation in the library.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/mlx4.c | 9 +++++++++
 drivers/net/mlx5/mlx5.c | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 50a55ee52..201d39b6e 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -799,6 +799,15 @@ rte_mlx4_pmd_init(void)
 		return;
 	assert(mlx4_glue);
 #endif
+#ifndef NDEBUG
+	/* Glue structure must not contain any NULL pointers. */
+	{
+		unsigned int i;
+
+		for (i = 0; i != sizeof(*mlx4_glue) / sizeof(void *); ++i)
+			assert(((const void *const *)mlx4_glue)[i]);
+	}
+#endif
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 544599b01..050cfac0d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1142,6 +1142,15 @@ rte_mlx5_pmd_init(void)
 		return;
 	assert(mlx5_glue);
 #endif
+#ifndef NDEBUG
+	/* Glue structure must not contain any NULL pointers. */
+	{
+		unsigned int i;
+
+		for (i = 0; i != sizeof(*mlx5_glue) / sizeof(void *); ++i)
+			assert(((const void *const *)mlx5_glue)[i]);
+	}
+#endif
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 2/4] net/mlx: fix missing includes for rdma-core glue
  2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
@ 2018-02-02 15:16 ` Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

For consistency since these includes are already pulled by others.

Fixes: 6aca97d310 ("net/mlx4: move rdma-core calls to separate file")
Fixes: 7202118686 ("net/mlx5: move rdma-core calls to separate file")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/mlx4_glue.c | 3 +++
 drivers/net/mlx4/mlx4_glue.h | 3 +++
 drivers/net/mlx5/mlx5_glue.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 30797bd2b..47ae7ad0f 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -3,6 +3,9 @@
  * Copyright 2018 Mellanox
  */
 
+#include <stddef.h>
+#include <stdint.h>
+
 /* Verbs headers do not support -pedantic. */
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index 0623511f2..de251c622 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -6,6 +6,9 @@
 #ifndef MLX4_GLUE_H_
 #define MLX4_GLUE_H_
 
+#include <stddef.h>
+#include <stdint.h>
+
 /* Verbs headers do not support -pedantic. */
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 6afb629ff..7fed302ba 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -6,6 +6,7 @@
 #ifndef MLX5_GLUE_H_
 #define MLX5_GLUE_H_
 
+#include <stddef.h>
 #include <stdint.h>
 
 /* Verbs headers do not support -pedantic. */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 3/4] net/mlx: version rdma-core glue libraries
  2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
@ 2018-02-02 15:16 ` Adrien Mazarguil
  2018-02-02 15:16 ` [PATCH v1 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  4 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/Makefile    | 8 ++++++--
 drivers/net/mlx4/mlx4.c      | 5 +++++
 drivers/net/mlx4/mlx4_glue.c | 1 +
 drivers/net/mlx4/mlx4_glue.h | 6 ++++++
 drivers/net/mlx5/Makefile    | 8 ++++++--
 drivers/net/mlx5/mlx5.c      | 5 +++++
 drivers/net/mlx5/mlx5_glue.c | 1 +
 drivers/net/mlx5/mlx5_glue.h | 6 ++++++
 8 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index c004ac71c..cc9db9977 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx4.a
-LIB_GLUE = librte_pmd_mlx4_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
@@ -64,6 +66,7 @@ CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 CFLAGS += -DMLX4_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX4_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx4_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -131,6 +134,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx4_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx4
 
 mlx4_glue.o: mlx4_autoconf.h
@@ -139,6 +143,6 @@ endif
 
 clean_mlx4: FORCE
 	$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
-	$Q rm -f -- mlx4_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx4_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 201d39b6e..61a852fb9 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -808,6 +808,11 @@ rte_mlx4_pmd_init(void)
 			assert(((const void *const *)mlx4_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx4_glue->version, MLX4_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx4_glue->version, MLX4_GLUE_VERSION);
+		return;
+	}
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 47ae7ad0f..3b79d320e 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -240,6 +240,7 @@ mlx4_glue_dv_set_context_attr(struct ibv_context *context,
 }
 
 const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
+	.version = MLX4_GLUE_VERSION,
 	.fork_init = mlx4_glue_fork_init,
 	.get_async_event = mlx4_glue_get_async_event,
 	.ack_async_event = mlx4_glue_ack_async_event,
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index de251c622..368f906bf 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -19,7 +19,13 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX4_GLUE_VERSION
+#define MLX4_GLUE_VERSION ""
+#endif
+
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx4_glue {
+	const char *version;
 	int (*fork_init)(void);
 	int (*get_async_event)(struct ibv_context *context,
 			       struct ibv_async_event *event);
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4b20d718b..4086f2039 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx5.a
-LIB_GLUE = librte_pmd_mlx5_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
@@ -74,6 +76,7 @@ CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
 CFLAGS += -DMLX5_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX5_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx5_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -180,6 +183,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx5_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx5
 
 mlx5_glue.o: mlx5_autoconf.h
@@ -188,6 +192,6 @@ endif
 
 clean_mlx5: FORCE
 	$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
-	$Q rm -f -- mlx5_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx5_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 050cfac0d..341230d2b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1151,6 +1151,11 @@ rte_mlx5_pmd_init(void)
 			assert(((const void *const *)mlx5_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx5_glue->version, MLX5_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx5_glue->version, MLX5_GLUE_VERSION);
+		return;
+	}
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 8f500be6e..1c4396ada 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -308,6 +308,7 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 }
 
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
+	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
 	.alloc_pd = mlx5_glue_alloc_pd,
 	.dealloc_pd = mlx5_glue_dealloc_pd,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 7fed302ba..b5efee3b6 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -19,6 +19,10 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX5_GLUE_VERSION
+#define MLX5_GLUE_VERSION ""
+#endif
+
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_counter_set;
 struct ibv_counter_set_data;
@@ -27,7 +31,9 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
+	const char *version;
 	int (*fork_init)(void);
 	struct ibv_pd *(*alloc_pd)(struct ibv_context *context);
 	int (*dealloc_pd)(struct ibv_pd *pd);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 4/4] net/mlx: make rdma-core glue path configurable
  2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
                   ` (2 preceding siblings ...)
  2018-02-02 15:16 ` [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
@ 2018-02-02 15:16 ` Adrien Mazarguil
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  4 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 15:16 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

Since rdma-core glue libraries are intrinsically tied to their respective
PMDs and used as internal plug-ins, their presence in the default search
path among other system libraries for the dynamic linker is not necessarily
desired.

This commit enables their installation and subsequent look-up at run time
in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also
be overridden by environment variables MLX[45]_GLUE_PATH.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/nics/mlx4.rst | 17 +++++++++++++++++
 doc/guides/nics/mlx5.rst | 14 ++++++++++++++
 drivers/net/mlx4/mlx4.c  | 43 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5.c  | 43 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 88161781c..9e4fbf692 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -97,6 +97,11 @@ These options can be modified in the ``.config`` file.
   ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
   missing with ``ldd(1)``.
 
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if
+  set, or in a standard location for the dynamic linker (e.g. ``/lib``) if
+  left to the default empty string (``""``).
+
   This option has no performance impact.
 
 - ``CONFIG_RTE_LIBRTE_MLX4_DEBUG`` (default **n**)
@@ -113,6 +118,18 @@ These options can be modified in the ``.config`` file.
 
   This value is always 1 for RX queues since they use a single MP.
 
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
+
+- ``MLX4_GLUE_PATH``
+
+  A list of directories in which to search for the rdma-core "glue" plug-in,
+  separated by colons or semi-colons.
+
+  Only matters when compiled with ``CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS``
+  enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+  since ``LD_LIBRARY_PATH`` has no effect in this case.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index a9e4bf51a..1635dff2b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -170,6 +170,11 @@ These options can be modified in the ``.config`` file.
   ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
   missing with ``ldd(1)``.
 
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if
+  set, or in a standard location for the dynamic linker (e.g. ``/lib``) if
+  left to the default empty string (``""``).
+
   This option has no performance impact.
 
 - ``CONFIG_RTE_LIBRTE_MLX5_DEBUG`` (default **n**)
@@ -189,6 +194,15 @@ These options can be modified in the ``.config`` file.
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~
 
+- ``MLX5_GLUE_PATH``
+
+  A list of directories in which to search for the rdma-core "glue" plug-in,
+  separated by colons or semi-colons.
+
+  Only matters when compiled with ``CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS``
+  enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+  since ``LD_LIBRARY_PATH`` has no effect in this case.
+
 - ``MLX5_PMD_ENABLE_PADDING``
 
   Enables HW packet padding in PCI bus transactions.
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 61a852fb9..4266cb1bb 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -741,11 +741,52 @@ static struct rte_pci_driver mlx4_driver = {
 static int
 mlx4_glue_init(void)
 {
+	const char *path[] = {
+		/*
+		 * A basic security check is necessary before trusting
+		 * MLX4_GLUE_PATH, which may override RTE_EAL_PMD_PATH.
+		 */
+		(geteuid() == getuid() && getegid() == getgid() ?
+		 getenv("MLX4_GLUE_PATH") : NULL),
+		RTE_EAL_PMD_PATH,
+	};
+	unsigned int i = 0;
 	void *handle = NULL;
 	void **sym;
 	const char *dlmsg;
 
-	handle = dlopen(MLX4_GLUE, RTLD_LAZY);
+	while (!handle && i != RTE_DIM(path)) {
+		const char *end;
+		size_t len;
+		int ret;
+
+		if (!path[i]) {
+			++i;
+			continue;
+		}
+		end = strpbrk(path[i], ":;");
+		if (!end)
+			end = path[i] + strlen(path[i]);
+		len = end - path[i];
+		ret = 0;
+		do {
+			char name[ret + 1];
+
+			ret = snprintf(name, ret, "%.*s%s" MLX4_GLUE "\n",
+				       (int)len, path[i],
+				       (!len || *(end - 1) == '/') ? "" : "/");
+			if (ret == -1)
+				break;
+			if (sizeof(name) != (size_t)ret + 1)
+				continue;
+			DEBUG("looking for rdma-core glue as \"%s\"", name);
+			handle = dlopen(name, RTLD_LAZY);
+			break;
+		} while (1);
+		path[i] = end + 1;
+		if (!*end)
+			++i;
+	}
 	if (!handle) {
 		rte_errno = EINVAL;
 		dlmsg = dlerror();
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 341230d2b..caa60339e 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1085,11 +1085,52 @@ static struct rte_pci_driver mlx5_driver = {
 static int
 mlx5_glue_init(void)
 {
+	const char *path[] = {
+		/*
+		 * A basic security check is necessary before trusting
+		 * MLX5_GLUE_PATH, which may override RTE_EAL_PMD_PATH.
+		 */
+		(geteuid() == getuid() && getegid() == getgid() ?
+		 getenv("MLX5_GLUE_PATH") : NULL),
+		RTE_EAL_PMD_PATH,
+	};
+	unsigned int i = 0;
 	void *handle = NULL;
 	void **sym;
 	const char *dlmsg;
 
-	handle = dlopen(MLX5_GLUE, RTLD_LAZY);
+	while (!handle && i != RTE_DIM(path)) {
+		const char *end;
+		size_t len;
+		int ret;
+
+		if (!path[i]) {
+			++i;
+			continue;
+		}
+		end = strpbrk(path[i], ":;");
+		if (!end)
+			end = path[i] + strlen(path[i]);
+		len = end - path[i];
+		ret = 0;
+		do {
+			char name[ret + 1];
+
+			ret = snprintf(name, ret, "%.*s%s" MLX5_GLUE "\n",
+				       (int)len, path[i],
+				       (!len || *(end - 1) == '/') ? "" : "/");
+			if (ret == -1)
+				break;
+			if (sizeof(name) != (size_t)ret + 1)
+				continue;
+			DEBUG("looking for rdma-core glue as \"%s\"", name);
+			handle = dlopen(name, RTLD_LAZY);
+			break;
+		} while (1);
+		path[i] = end + 1;
+		if (!*end)
+			++i;
+	}
 	if (!handle) {
 		rte_errno = EINVAL;
 		dlmsg = dlerror();
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
                   ` (3 preceding siblings ...)
  2018-02-02 15:16 ` [PATCH v1 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
@ 2018-02-02 16:46 ` Adrien Mazarguil
  2018-02-02 16:46   ` [PATCH v2 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
                     ` (5 more replies)
  4 siblings, 6 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
of generating them at run time due to security concerns [1] led to a few
issues:

- They must be present on the file system before running DPDK.
- Their location must be known to the dynamic linker.
- Their names overlap and ABI compatibility is not guaranteed, which may
  lead to crashes.

This series addresses the above by adding version information to plug-ins
and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
system.

[1] http://dpdk.org/ml/archives/dev/2018-January/089617.html

v2 changes:

- Fixed extra "\n" in glue file name generation (although it didn't break
  functionality).

Adrien Mazarguil (4):
  net/mlx: add debug checks to glue structure
  net/mlx: fix missing includes for rdma-core glue
  net/mlx: version rdma-core glue libraries
  net/mlx: make rdma-core glue path configurable

 doc/guides/nics/mlx4.rst     | 17 ++++++++++++
 doc/guides/nics/mlx5.rst     | 14 ++++++++++
 drivers/net/mlx4/Makefile    |  8 ++++--
 drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx4/mlx4_glue.c |  4 +++
 drivers/net/mlx4/mlx4_glue.h |  9 +++++++
 drivers/net/mlx5/Makefile    |  8 ++++--
 drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5_glue.c |  1 +
 drivers/net/mlx5/mlx5_glue.h |  7 +++++
 10 files changed, 176 insertions(+), 6 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 1/4] net/mlx: add debug checks to glue structure
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
@ 2018-02-02 16:46   ` Adrien Mazarguil
  2018-02-05 12:27     ` Marcelo Ricardo Leitner
  2018-02-02 16:46   ` [PATCH v2 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

This code should catch mistakes early if a glue structure member is added
without a corresponding implementation in the library.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/mlx4.c | 9 +++++++++
 drivers/net/mlx5/mlx5.c | 9 +++++++++
 2 files changed, 18 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 50a55ee52..201d39b6e 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -799,6 +799,15 @@ rte_mlx4_pmd_init(void)
 		return;
 	assert(mlx4_glue);
 #endif
+#ifndef NDEBUG
+	/* Glue structure must not contain any NULL pointers. */
+	{
+		unsigned int i;
+
+		for (i = 0; i != sizeof(*mlx4_glue) / sizeof(void *); ++i)
+			assert(((const void *const *)mlx4_glue)[i]);
+	}
+#endif
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 544599b01..050cfac0d 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1142,6 +1142,15 @@ rte_mlx5_pmd_init(void)
 		return;
 	assert(mlx5_glue);
 #endif
+#ifndef NDEBUG
+	/* Glue structure must not contain any NULL pointers. */
+	{
+		unsigned int i;
+
+		for (i = 0; i != sizeof(*mlx5_glue) / sizeof(void *); ++i)
+			assert(((const void *const *)mlx5_glue)[i]);
+	}
+#endif
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 2/4] net/mlx: fix missing includes for rdma-core glue
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 16:46   ` [PATCH v2 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
@ 2018-02-02 16:46   ` Adrien Mazarguil
  2018-02-02 16:46   ` [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

For consistency since these includes are already pulled by others.

Fixes: 6aca97d310 ("net/mlx4: move rdma-core calls to separate file")
Fixes: 7202118686 ("net/mlx5: move rdma-core calls to separate file")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/mlx4_glue.c | 3 +++
 drivers/net/mlx4/mlx4_glue.h | 3 +++
 drivers/net/mlx5/mlx5_glue.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 30797bd2b..47ae7ad0f 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -3,6 +3,9 @@
  * Copyright 2018 Mellanox
  */
 
+#include <stddef.h>
+#include <stdint.h>
+
 /* Verbs headers do not support -pedantic. */
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index 0623511f2..de251c622 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -6,6 +6,9 @@
 #ifndef MLX4_GLUE_H_
 #define MLX4_GLUE_H_
 
+#include <stddef.h>
+#include <stdint.h>
+
 /* Verbs headers do not support -pedantic. */
 #ifdef PEDANTIC
 #pragma GCC diagnostic ignored "-Wpedantic"
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 6afb629ff..7fed302ba 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -6,6 +6,7 @@
 #ifndef MLX5_GLUE_H_
 #define MLX5_GLUE_H_
 
+#include <stddef.h>
 #include <stdint.h>
 
 /* Verbs headers do not support -pedantic. */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
  2018-02-02 16:46   ` [PATCH v2 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
  2018-02-02 16:46   ` [PATCH v2 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
@ 2018-02-02 16:46   ` Adrien Mazarguil
  2018-02-04 14:29     ` Thomas Monjalon
  2018-02-02 16:46   ` [PATCH v2 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 drivers/net/mlx4/Makefile    | 8 ++++++--
 drivers/net/mlx4/mlx4.c      | 5 +++++
 drivers/net/mlx4/mlx4_glue.c | 1 +
 drivers/net/mlx4/mlx4_glue.h | 6 ++++++
 drivers/net/mlx5/Makefile    | 8 ++++++--
 drivers/net/mlx5/mlx5.c      | 5 +++++
 drivers/net/mlx5/mlx5_glue.c | 1 +
 drivers/net/mlx5/mlx5_glue.h | 6 ++++++
 8 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index c004ac71c..cc9db9977 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx4.a
-LIB_GLUE = librte_pmd_mlx4_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
@@ -64,6 +66,7 @@ CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 ifeq ($(CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS),y)
 CFLAGS += -DMLX4_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX4_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx4_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -131,6 +134,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx4_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx4
 
 mlx4_glue.o: mlx4_autoconf.h
@@ -139,6 +143,6 @@ endif
 
 clean_mlx4: FORCE
 	$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
-	$Q rm -f -- mlx4_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx4_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 201d39b6e..61a852fb9 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -808,6 +808,11 @@ rte_mlx4_pmd_init(void)
 			assert(((const void *const *)mlx4_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx4_glue->version, MLX4_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx4_glue->version, MLX4_GLUE_VERSION);
+		return;
+	}
 	mlx4_glue->fork_init();
 	rte_pci_register(&mlx4_driver);
 }
diff --git a/drivers/net/mlx4/mlx4_glue.c b/drivers/net/mlx4/mlx4_glue.c
index 47ae7ad0f..3b79d320e 100644
--- a/drivers/net/mlx4/mlx4_glue.c
+++ b/drivers/net/mlx4/mlx4_glue.c
@@ -240,6 +240,7 @@ mlx4_glue_dv_set_context_attr(struct ibv_context *context,
 }
 
 const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
+	.version = MLX4_GLUE_VERSION,
 	.fork_init = mlx4_glue_fork_init,
 	.get_async_event = mlx4_glue_get_async_event,
 	.ack_async_event = mlx4_glue_ack_async_event,
diff --git a/drivers/net/mlx4/mlx4_glue.h b/drivers/net/mlx4/mlx4_glue.h
index de251c622..368f906bf 100644
--- a/drivers/net/mlx4/mlx4_glue.h
+++ b/drivers/net/mlx4/mlx4_glue.h
@@ -19,7 +19,13 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX4_GLUE_VERSION
+#define MLX4_GLUE_VERSION ""
+#endif
+
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx4_glue {
+	const char *version;
 	int (*fork_init)(void);
 	int (*get_async_event)(struct ibv_context *context,
 			       struct ibv_async_event *event);
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4b20d718b..4086f2039 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 # Library name.
 LIB = librte_pmd_mlx5.a
-LIB_GLUE = librte_pmd_mlx5_glue.so
+LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
+LIB_GLUE_BASE = librte_pmd_mlx5_glue.so
+LIB_GLUE_VERSION = 18.02.1
 
 # Sources.
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5.c
@@ -74,6 +76,7 @@ CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
 ifeq ($(CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS),y)
 CFLAGS += -DMLX5_GLUE='"$(LIB_GLUE)"'
+CFLAGS += -DMLX5_GLUE_VERSION='"$(LIB_GLUE_VERSION)"'
 CFLAGS_mlx5_glue.o += -fPIC
 LDLIBS += -ldl
 else
@@ -180,6 +183,7 @@ $(LIB): $(LIB_GLUE)
 
 $(LIB_GLUE): mlx5_glue.o
 	$Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+		-Wl,-h,$(LIB_GLUE) \
 		-s -shared -o $@ $< -libverbs -lmlx5
 
 mlx5_glue.o: mlx5_autoconf.h
@@ -188,6 +192,6 @@ endif
 
 clean_mlx5: FORCE
 	$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
-	$Q rm -f -- mlx5_glue.o $(LIB_GLUE)
+	$Q rm -f -- mlx5_glue.o $(LIB_GLUE_BASE)*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 050cfac0d..341230d2b 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1151,6 +1151,11 @@ rte_mlx5_pmd_init(void)
 			assert(((const void *const *)mlx5_glue)[i]);
 	}
 #endif
+	if (strcmp(mlx5_glue->version, MLX5_GLUE_VERSION)) {
+		ERROR("rdma-core glue \"%s\" mismatch: \"%s\" is required",
+		      mlx5_glue->version, MLX5_GLUE_VERSION);
+		return;
+	}
 	mlx5_glue->fork_init();
 	rte_pci_register(&mlx5_driver);
 }
diff --git a/drivers/net/mlx5/mlx5_glue.c b/drivers/net/mlx5/mlx5_glue.c
index 8f500be6e..1c4396ada 100644
--- a/drivers/net/mlx5/mlx5_glue.c
+++ b/drivers/net/mlx5/mlx5_glue.c
@@ -308,6 +308,7 @@ mlx5_glue_dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
 }
 
 const struct mlx5_glue *mlx5_glue = &(const struct mlx5_glue){
+	.version = MLX5_GLUE_VERSION,
 	.fork_init = mlx5_glue_fork_init,
 	.alloc_pd = mlx5_glue_alloc_pd,
 	.dealloc_pd = mlx5_glue_dealloc_pd,
diff --git a/drivers/net/mlx5/mlx5_glue.h b/drivers/net/mlx5/mlx5_glue.h
index 7fed302ba..b5efee3b6 100644
--- a/drivers/net/mlx5/mlx5_glue.h
+++ b/drivers/net/mlx5/mlx5_glue.h
@@ -19,6 +19,10 @@
 #pragma GCC diagnostic error "-Wpedantic"
 #endif
 
+#ifndef MLX5_GLUE_VERSION
+#define MLX5_GLUE_VERSION ""
+#endif
+
 #ifndef HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT
 struct ibv_counter_set;
 struct ibv_counter_set_data;
@@ -27,7 +31,9 @@ struct ibv_counter_set_init_attr;
 struct ibv_query_counter_set_attr;
 #endif
 
+/* LIB_GLUE_VERSION must be updated every time this structure is modified. */
 struct mlx5_glue {
+	const char *version;
 	int (*fork_init)(void);
 	struct ibv_pd *(*alloc_pd)(struct ibv_context *context);
 	int (*dealloc_pd)(struct ibv_pd *pd);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 4/4] net/mlx: make rdma-core glue path configurable
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
                     ` (2 preceding siblings ...)
  2018-02-02 16:46   ` [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
@ 2018-02-02 16:46   ` Adrien Mazarguil
  2018-02-02 16:52   ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
  2018-02-06 11:31   ` Shahaf Shuler
  5 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-02 16:46 UTC (permalink / raw)
  To: Shahaf Shuler; +Cc: Nelio Laranjeiro, dev, Marcelo Ricardo Leitner

Since rdma-core glue libraries are intrinsically tied to their respective
PMDs and used as internal plug-ins, their presence in the default search
path among other system libraries for the dynamic linker is not necessarily
desired.

This commit enables their installation and subsequent look-up at run time
in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also
be overridden by environment variables MLX[45]_GLUE_PATH.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
 doc/guides/nics/mlx4.rst | 17 +++++++++++++++++
 doc/guides/nics/mlx5.rst | 14 ++++++++++++++
 drivers/net/mlx4/mlx4.c  | 43 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/net/mlx5/mlx5.c  | 43 ++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 115 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 88161781c..9e4fbf692 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -97,6 +97,11 @@ These options can be modified in the ``.config`` file.
   ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
   missing with ``ldd(1)``.
 
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if
+  set, or in a standard location for the dynamic linker (e.g. ``/lib``) if
+  left to the default empty string (``""``).
+
   This option has no performance impact.
 
 - ``CONFIG_RTE_LIBRTE_MLX4_DEBUG`` (default **n**)
@@ -113,6 +118,18 @@ These options can be modified in the ``.config`` file.
 
   This value is always 1 for RX queues since they use a single MP.
 
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
+
+- ``MLX4_GLUE_PATH``
+
+  A list of directories in which to search for the rdma-core "glue" plug-in,
+  separated by colons or semi-colons.
+
+  Only matters when compiled with ``CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS``
+  enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+  since ``LD_LIBRARY_PATH`` has no effect in this case.
+
 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index a9e4bf51a..1635dff2b 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -170,6 +170,11 @@ These options can be modified in the ``.config`` file.
   ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
   missing with ``ldd(1)``.
 
+  It works by moving these dependencies to a purpose-built rdma-core "glue"
+  plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if
+  set, or in a standard location for the dynamic linker (e.g. ``/lib``) if
+  left to the default empty string (``""``).
+
   This option has no performance impact.
 
 - ``CONFIG_RTE_LIBRTE_MLX5_DEBUG`` (default **n**)
@@ -189,6 +194,15 @@ These options can be modified in the ``.config`` file.
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~
 
+- ``MLX5_GLUE_PATH``
+
+  A list of directories in which to search for the rdma-core "glue" plug-in,
+  separated by colons or semi-colons.
+
+  Only matters when compiled with ``CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS``
+  enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+  since ``LD_LIBRARY_PATH`` has no effect in this case.
+
 - ``MLX5_PMD_ENABLE_PADDING``
 
   Enables HW packet padding in PCI bus transactions.
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 61a852fb9..4016ddb7b 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -741,11 +741,52 @@ static struct rte_pci_driver mlx4_driver = {
 static int
 mlx4_glue_init(void)
 {
+	const char *path[] = {
+		/*
+		 * A basic security check is necessary before trusting
+		 * MLX4_GLUE_PATH, which may override RTE_EAL_PMD_PATH.
+		 */
+		(geteuid() == getuid() && getegid() == getgid() ?
+		 getenv("MLX4_GLUE_PATH") : NULL),
+		RTE_EAL_PMD_PATH,
+	};
+	unsigned int i = 0;
 	void *handle = NULL;
 	void **sym;
 	const char *dlmsg;
 
-	handle = dlopen(MLX4_GLUE, RTLD_LAZY);
+	while (!handle && i != RTE_DIM(path)) {
+		const char *end;
+		size_t len;
+		int ret;
+
+		if (!path[i]) {
+			++i;
+			continue;
+		}
+		end = strpbrk(path[i], ":;");
+		if (!end)
+			end = path[i] + strlen(path[i]);
+		len = end - path[i];
+		ret = 0;
+		do {
+			char name[ret + 1];
+
+			ret = snprintf(name, sizeof(name), "%.*s%s" MLX4_GLUE,
+				       (int)len, path[i],
+				       (!len || *(end - 1) == '/') ? "" : "/");
+			if (ret == -1)
+				break;
+			if (sizeof(name) != (size_t)ret + 1)
+				continue;
+			DEBUG("looking for rdma-core glue as \"%s\"", name);
+			handle = dlopen(name, RTLD_LAZY);
+			break;
+		} while (1);
+		path[i] = end + 1;
+		if (!*end)
+			++i;
+	}
 	if (!handle) {
 		rte_errno = EINVAL;
 		dlmsg = dlerror();
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 341230d2b..d1441dcba 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1085,11 +1085,52 @@ static struct rte_pci_driver mlx5_driver = {
 static int
 mlx5_glue_init(void)
 {
+	const char *path[] = {
+		/*
+		 * A basic security check is necessary before trusting
+		 * MLX5_GLUE_PATH, which may override RTE_EAL_PMD_PATH.
+		 */
+		(geteuid() == getuid() && getegid() == getgid() ?
+		 getenv("MLX5_GLUE_PATH") : NULL),
+		RTE_EAL_PMD_PATH,
+	};
+	unsigned int i = 0;
 	void *handle = NULL;
 	void **sym;
 	const char *dlmsg;
 
-	handle = dlopen(MLX5_GLUE, RTLD_LAZY);
+	while (!handle && i != RTE_DIM(path)) {
+		const char *end;
+		size_t len;
+		int ret;
+
+		if (!path[i]) {
+			++i;
+			continue;
+		}
+		end = strpbrk(path[i], ":;");
+		if (!end)
+			end = path[i] + strlen(path[i]);
+		len = end - path[i];
+		ret = 0;
+		do {
+			char name[ret + 1];
+
+			ret = snprintf(name, sizeof(name), "%.*s%s" MLX5_GLUE,
+				       (int)len, path[i],
+				       (!len || *(end - 1) == '/') ? "" : "/");
+			if (ret == -1)
+				break;
+			if (sizeof(name) != (size_t)ret + 1)
+				continue;
+			DEBUG("looking for rdma-core glue as \"%s\"", name);
+			handle = dlopen(name, RTLD_LAZY);
+			break;
+		} while (1);
+		path[i] = end + 1;
+		if (!*end)
+			++i;
+	}
 	if (!handle) {
 		rte_errno = EINVAL;
 		dlmsg = dlerror();
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
                     ` (3 preceding siblings ...)
  2018-02-02 16:46   ` [PATCH v2 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
@ 2018-02-02 16:52   ` Nélio Laranjeiro
  2018-02-06 11:31   ` Shahaf Shuler
  5 siblings, 0 replies; 29+ messages in thread
From: Nélio Laranjeiro @ 2018-02-02 16:52 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Shahaf Shuler, dev, Marcelo Ricardo Leitner

On Fri, Feb 02, 2018 at 05:46:10PM +0100, Adrien Mazarguil wrote:
> The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
> of generating them at run time due to security concerns [1] led to a few
> issues:
> 
> - They must be present on the file system before running DPDK.
> - Their location must be known to the dynamic linker.
> - Their names overlap and ABI compatibility is not guaranteed, which may
>   lead to crashes.
> 
> This series addresses the above by adding version information to plug-ins
> and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the file
> system.
> 
> [1] http://dpdk.org/ml/archives/dev/2018-January/089617.html
> 
> v2 changes:
> 
> - Fixed extra "\n" in glue file name generation (although it didn't break
>   functionality).
> 
> Adrien Mazarguil (4):
>   net/mlx: add debug checks to glue structure
>   net/mlx: fix missing includes for rdma-core glue
>   net/mlx: version rdma-core glue libraries
>   net/mlx: make rdma-core glue path configurable
> 
>  doc/guides/nics/mlx4.rst     | 17 ++++++++++++
>  doc/guides/nics/mlx5.rst     | 14 ++++++++++
>  drivers/net/mlx4/Makefile    |  8 ++++--
>  drivers/net/mlx4/mlx4.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx4/mlx4_glue.c |  4 +++
>  drivers/net/mlx4/mlx4_glue.h |  9 +++++++
>  drivers/net/mlx5/Makefile    |  8 ++++--
>  drivers/net/mlx5/mlx5.c      | 57 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_glue.c |  1 +
>  drivers/net/mlx5/mlx5_glue.h |  7 +++++
>  10 files changed, 176 insertions(+), 6 deletions(-)
> 
> -- 
> 2.11.0

For the series,

Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>

-- 
Nélio Laranjeiro
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-02 16:46   ` [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
@ 2018-02-04 14:29     ` Thomas Monjalon
  2018-02-05 11:24       ` Adrien Mazarguil
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Monjalon @ 2018-02-04 14:29 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: dev, Shahaf Shuler, Nelio Laranjeiro, Marcelo Ricardo Leitner

02/02/2018 17:46, Adrien Mazarguil:
> --- a/drivers/net/mlx4/Makefile
> +++ b/drivers/net/mlx4/Makefile
> @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
>  
>  # Library name.
>  LIB = librte_pmd_mlx4.a
> -LIB_GLUE = librte_pmd_mlx4_glue.so
> +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> +LIB_GLUE_VERSION = 18.02.1

You should use the version number of the release, i.e. 18.02.0
Ideally, you should retrieve it from rte_version.h.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-04 14:29     ` Thomas Monjalon
@ 2018-02-05 11:24       ` Adrien Mazarguil
  2018-02-05 12:13         ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 11:24 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Shahaf Shuler, Nelio Laranjeiro, Marcelo Ricardo Leitner

On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> 02/02/2018 17:46, Adrien Mazarguil:
> > --- a/drivers/net/mlx4/Makefile
> > +++ b/drivers/net/mlx4/Makefile
> > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> >  
> >  # Library name.
> >  LIB = librte_pmd_mlx4.a
> > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > +LIB_GLUE_VERSION = 18.02.1
> 
> You should use the version number of the release, i.e. 18.02.0
> Ideally, you should retrieve it from rte_version.h.

Keep in mind this only needs to be updated when the glue API gets modified,
and this "18.02.1" string may remain unmodified for subsequent DPDK
releases, probably as long as the PMD doesn't use any new rdma-core calls.

We've already backported this patch to 17.02 and 17.11, both requiring
different sets of Verbs calls and thus a different version, hence the added
"18.02" as a starting point. The last digit may have to be modified possibly
several times between official DPDK releases while work is being done on the
PMD (i.e. per commit).

In short it's not meant to follow DPDK's public versioning scheme. If you
really think it should, doing so will make things more complex in the
Makefile, which will have to parse rte_version.h. What's your opinion?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 11:24       ` Adrien Mazarguil
@ 2018-02-05 12:13         ` Marcelo Ricardo Leitner
  2018-02-05 12:24           ` Van Haaren, Harry
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 12:13 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > 02/02/2018 17:46, Adrien Mazarguil:
> > > --- a/drivers/net/mlx4/Makefile
> > > +++ b/drivers/net/mlx4/Makefile
> > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > >  
> > >  # Library name.
> > >  LIB = librte_pmd_mlx4.a
> > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > +LIB_GLUE_VERSION = 18.02.1
> > 
> > You should use the version number of the release, i.e. 18.02.0
> > Ideally, you should retrieve it from rte_version.h.
> 
> Keep in mind this only needs to be updated when the glue API gets modified,
> and this "18.02.1" string may remain unmodified for subsequent DPDK
> releases, probably as long as the PMD doesn't use any new rdma-core calls.
> 
> We've already backported this patch to 17.02 and 17.11, both requiring
> different sets of Verbs calls and thus a different version, hence the added
> "18.02" as a starting point. The last digit may have to be modified possibly
> several times between official DPDK releases while work is being done on the
> PMD (i.e. per commit).
> 
> In short it's not meant to follow DPDK's public versioning scheme. If you
> really think it should, doing so will make things more complex in the
> Makefile, which will have to parse rte_version.h. What's your opinion?

What about appending date +%s output to it? It would be stricter and
automated.

  Marcelo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 12:13         ` Marcelo Ricardo Leitner
@ 2018-02-05 12:24           ` Van Haaren, Harry
  2018-02-05 12:43             ` Marcelo Ricardo Leitner
  2018-02-05 12:58             ` Marcelo Ricardo Leitner
  0 siblings, 2 replies; 29+ messages in thread
From: Van Haaren, Harry @ 2018-02-05 12:24 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, Adrien Mazarguil
  Cc: Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> Sent: Monday, February 5, 2018 12:14 PM
> To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> libraries
> 
> On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > --- a/drivers/net/mlx4/Makefile
> > > > +++ b/drivers/net/mlx4/Makefile
> > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > >
> > > >  # Library name.
> > > >  LIB = librte_pmd_mlx4.a
> > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > +LIB_GLUE_VERSION = 18.02.1
> > >
> > > You should use the version number of the release, i.e. 18.02.0
> > > Ideally, you should retrieve it from rte_version.h.
> >
> > Keep in mind this only needs to be updated when the glue API gets
> modified,
> > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> >
> > We've already backported this patch to 17.02 and 17.11, both requiring
> > different sets of Verbs calls and thus a different version, hence the
> added
> > "18.02" as a starting point. The last digit may have to be modified
> possibly
> > several times between official DPDK releases while work is being done on
> the
> > PMD (i.e. per commit).
> >
> > In short it's not meant to follow DPDK's public versioning scheme. If you
> > really think it should, doing so will make things more complex in the
> > Makefile, which will have to parse rte_version.h. What's your opinion?
> 
> What about appending date +%s output to it? It would be stricter and
> automated.

Adding current timestamp or date into a build breaks reproducibility of builds, so is
generally not recommended.

No opinion on string/version naming here.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/4] net/mlx: add debug checks to glue structure
  2018-02-02 16:46   ` [PATCH v2 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
@ 2018-02-05 12:27     ` Marcelo Ricardo Leitner
  2018-02-05 13:31       ` Adrien Mazarguil
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 12:27 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Shahaf Shuler, Nelio Laranjeiro, dev

On Fri, Feb 02, 2018 at 05:46:12PM +0100, Adrien Mazarguil wrote:
> This code should catch mistakes early if a glue structure member is added
> without a corresponding implementation in the library.
> 
> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> ---
>  drivers/net/mlx4/mlx4.c | 9 +++++++++
>  drivers/net/mlx5/mlx5.c | 9 +++++++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index 50a55ee52..201d39b6e 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -799,6 +799,15 @@ rte_mlx4_pmd_init(void)
>  		return;
>  	assert(mlx4_glue);
>  #endif
> +#ifndef NDEBUG
> +	/* Glue structure must not contain any NULL pointers. */
> +	{
> +		unsigned int i;
> +
> +		for (i = 0; i != sizeof(*mlx4_glue) / sizeof(void *); ++i)
> +			assert(((const void *const *)mlx4_glue)[i]);
> +	}

This code will not catch the case on which mlx4_glue on PMD is smaller
than the one on the glue lib. Although this would be safe, as the PMD
won't place calls for functions that it is not aware of, I guess we
don't want to allow such situation anyhow.

One way to handle this is to add a size field and let the PMD check if
the size is the same. As this code is walking through it as an array
of pointers, an union around it to keep the alignment may be welcomed.

Also, this code would do read beyond buffer in the opposite case, when
mlx4_glue for the PMD is larger than the one on the glue lib.

> +#endif
>  	mlx4_glue->fork_init();
>  	rte_pci_register(&mlx4_driver);
>  }
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index 544599b01..050cfac0d 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -1142,6 +1142,15 @@ rte_mlx5_pmd_init(void)
>  		return;
>  	assert(mlx5_glue);
>  #endif
> +#ifndef NDEBUG
> +	/* Glue structure must not contain any NULL pointers. */
> +	{
> +		unsigned int i;
> +
> +		for (i = 0; i != sizeof(*mlx5_glue) / sizeof(void *); ++i)
> +			assert(((const void *const *)mlx5_glue)[i]);
> +	}
> +#endif
>  	mlx5_glue->fork_init();
>  	rte_pci_register(&mlx5_driver);
>  }
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 12:24           ` Van Haaren, Harry
@ 2018-02-05 12:43             ` Marcelo Ricardo Leitner
  2018-02-05 12:58             ` Marcelo Ricardo Leitner
  1 sibling, 0 replies; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 12:43 UTC (permalink / raw)
  To: Van Haaren, Harry
  Cc: Adrien Mazarguil, Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > Sent: Monday, February 5, 2018 12:14 PM
> > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > libraries
> > 
> > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > --- a/drivers/net/mlx4/Makefile
> > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > >
> > > > >  # Library name.
> > > > >  LIB = librte_pmd_mlx4.a
> > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE_VERSION = 18.02.1
> > > >
> > > > You should use the version number of the release, i.e. 18.02.0
> > > > Ideally, you should retrieve it from rte_version.h.
> > >
> > > Keep in mind this only needs to be updated when the glue API gets
> > modified,
> > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > >
> > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > different sets of Verbs calls and thus a different version, hence the
> > added
> > > "18.02" as a starting point. The last digit may have to be modified
> > possibly
> > > several times between official DPDK releases while work is being done on
> > the
> > > PMD (i.e. per commit).
> > >
> > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > really think it should, doing so will make things more complex in the
> > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > 
> > What about appending date +%s output to it? It would be stricter and
> > automated.
> 
> Adding current timestamp or date into a build breaks reproducibility of builds, so is
> generally not recommended.

Good point.

> 
> No opinion on string/version naming here.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 12:24           ` Van Haaren, Harry
  2018-02-05 12:43             ` Marcelo Ricardo Leitner
@ 2018-02-05 12:58             ` Marcelo Ricardo Leitner
  2018-02-05 13:44               ` Adrien Mazarguil
  1 sibling, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 12:58 UTC (permalink / raw)
  To: Van Haaren, Harry
  Cc: Adrien Mazarguil, Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > Sent: Monday, February 5, 2018 12:14 PM
> > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > libraries
> > 
> > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > --- a/drivers/net/mlx4/Makefile
> > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > >
> > > > >  # Library name.
> > > > >  LIB = librte_pmd_mlx4.a
> > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > +LIB_GLUE_VERSION = 18.02.1
> > > >
> > > > You should use the version number of the release, i.e. 18.02.0
> > > > Ideally, you should retrieve it from rte_version.h.
> > >
> > > Keep in mind this only needs to be updated when the glue API gets
> > modified,
> > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > >
> > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > different sets of Verbs calls and thus a different version, hence the
> > added
> > > "18.02" as a starting point. The last digit may have to be modified
> > possibly
> > > several times between official DPDK releases while work is being done on
> > the
> > > PMD (i.e. per commit).
> > >
> > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > really think it should, doing so will make things more complex in the
> > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > 
> > What about appending date +%s output to it? It would be stricter and
> > automated.
> 
> Adding current timestamp or date into a build breaks reproducibility of builds, so is
> generally not recommended.

Then the sha1sum of mlx4_glue.h.
With this the size check I mentioned on the other patch would become
redundant and unnecessary.

> 
> No opinion on string/version naming here.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 1/4] net/mlx: add debug checks to glue structure
  2018-02-05 12:27     ` Marcelo Ricardo Leitner
@ 2018-02-05 13:31       ` Adrien Mazarguil
  0 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 13:31 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner; +Cc: Shahaf Shuler, Nelio Laranjeiro, dev

On Mon, Feb 05, 2018 at 10:27:10AM -0200, Marcelo Ricardo Leitner wrote:
> On Fri, Feb 02, 2018 at 05:46:12PM +0100, Adrien Mazarguil wrote:
> > This code should catch mistakes early if a glue structure member is added
> > without a corresponding implementation in the library.
> > 
> > Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > ---
> >  drivers/net/mlx4/mlx4.c | 9 +++++++++
> >  drivers/net/mlx5/mlx5.c | 9 +++++++++
> >  2 files changed, 18 insertions(+)
> > 
> > diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> > index 50a55ee52..201d39b6e 100644
> > --- a/drivers/net/mlx4/mlx4.c
> > +++ b/drivers/net/mlx4/mlx4.c
> > @@ -799,6 +799,15 @@ rte_mlx4_pmd_init(void)
> >  		return;
> >  	assert(mlx4_glue);
> >  #endif
> > +#ifndef NDEBUG
> > +	/* Glue structure must not contain any NULL pointers. */
> > +	{
> > +		unsigned int i;
> > +
> > +		for (i = 0; i != sizeof(*mlx4_glue) / sizeof(void *); ++i)
> > +			assert(((const void *const *)mlx4_glue)[i]);
> > +	}
> 
> This code will not catch the case on which mlx4_glue on PMD is smaller
> than the one on the glue lib. Although this would be safe, as the PMD
> won't place calls for functions that it is not aware of, I guess we
> don't want to allow such situation anyhow.
> 
> One way to handle this is to add a size field and let the PMD check if
> the size is the same. As this code is walking through it as an array
> of pointers, an union around it to keep the alignment may be welcomed.
> 
> Also, this code would do read beyond buffer in the opposite case, when
> mlx4_glue for the PMD is larger than the one on the glue lib.

While I generally agree, this block is pure debugging code not compiled in
by default and meant for PMD developers, not end users. It happened to me
while moving all these calls to the glue structure where I missed a couple
of functions and is a way to ensure such mistakes would be caught early on.

The version check that comes afterward is actually is the safe check you're
thinking about. Based on that, the PMD can be confident the symbol in
question has the expected properties. No need to check for NULLs.

If you think this code doesn't have its place in a PMD, I can remove it; the
version check should be enough. In my opinion it's better to keep it for
safety though, please confirm.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 12:58             ` Marcelo Ricardo Leitner
@ 2018-02-05 13:44               ` Adrien Mazarguil
  2018-02-05 14:16                 ` Thomas Monjalon
  0 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 13:44 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Van Haaren, Harry, Thomas Monjalon, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > Sent: Monday, February 5, 2018 12:14 PM
> > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > libraries
> > > 
> > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > >
> > > > > >  # Library name.
> > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > >
> > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > Ideally, you should retrieve it from rte_version.h.
> > > >
> > > > Keep in mind this only needs to be updated when the glue API gets
> > > modified,
> > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > >
> > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > different sets of Verbs calls and thus a different version, hence the
> > > added
> > > > "18.02" as a starting point. The last digit may have to be modified
> > > possibly
> > > > several times between official DPDK releases while work is being done on
> > > the
> > > > PMD (i.e. per commit).
> > > >
> > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > really think it should, doing so will make things more complex in the
> > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > 
> > > What about appending date +%s output to it? It would be stricter and
> > > automated.
> > 
> > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > generally not recommended.
> 
> Then the sha1sum of mlx4_glue.h.
> With this the size check I mentioned on the other patch would become
> redundant and unnecessary.

Using a strong hash algorithm to version a library/symbol, while possible,
seems a bit overkill and results in ugliness:

 librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Using a weak one like CRC32 for a shorter name poses a risk of
collision. Moreover the next time someone decides to update all version
notices or modify a comment will impact that hash. We'd need to isolate the
symbol definition itself, ignore parameter names in function prototypes and
only then we may get a somewhat meaningful hash describing a given ABI.

Given the added complexity, is there really a problem with simple version
numbers we increment every time something gets modified? (Note this is
already how our .map files work, they're not generated automatically)

How about keeping things as is?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 13:44               ` Adrien Mazarguil
@ 2018-02-05 14:16                 ` Thomas Monjalon
  2018-02-05 14:33                   ` Adrien Mazarguil
  2018-02-05 14:37                   ` Marcelo Ricardo Leitner
  0 siblings, 2 replies; 29+ messages in thread
From: Thomas Monjalon @ 2018-02-05 14:16 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Marcelo Ricardo Leitner, Van Haaren, Harry, dev, Shahaf Shuler,
	Nelio Laranjeiro

05/02/2018 14:44, Adrien Mazarguil:
> On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > libraries
> > > > 
> > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > >
> > > > > > >  # Library name.
> > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > >
> > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > >
> > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > modified,
> > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > >
> > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > different sets of Verbs calls and thus a different version, hence the
> > > > added
> > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > possibly
> > > > > several times between official DPDK releases while work is being done on
> > > > the
> > > > > PMD (i.e. per commit).
> > > > >
> > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > really think it should, doing so will make things more complex in the
> > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > 
> > > > What about appending date +%s output to it? It would be stricter and
> > > > automated.
> > > 
> > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > generally not recommended.
> > 
> > Then the sha1sum of mlx4_glue.h.
> > With this the size check I mentioned on the other patch would become
> > redundant and unnecessary.
> 
> Using a strong hash algorithm to version a library/symbol, while possible,
> seems a bit overkill and results in ugliness:
> 
>  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Using a weak one like CRC32 for a shorter name poses a risk of
> collision. Moreover the next time someone decides to update all version
> notices or modify a comment will impact that hash. We'd need to isolate the
> symbol definition itself, ignore parameter names in function prototypes and
> only then we may get a somewhat meaningful hash describing a given ABI.
> 
> Given the added complexity, is there really a problem with simple version
> numbers we increment every time something gets modified? (Note this is
> already how our .map files work, they're not generated automatically)

Our map files show the major version where a symbol was introduced.
It is simple because no symbol can be introduced in a minor version.

> How about keeping things as is?

You are using 18.02.1 while it is introduced in 18.02.0.
If you don't want to correlate the .so version number with DPDK version
number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:16                 ` Thomas Monjalon
@ 2018-02-05 14:33                   ` Adrien Mazarguil
  2018-02-05 14:37                   ` Marcelo Ricardo Leitner
  1 sibling, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 14:33 UTC (permalink / raw)
  To: Thomas Monjalon, Shahaf Shuler
  Cc: Marcelo Ricardo Leitner, Van Haaren, Harry, dev, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?
> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

I don't really care as long as there's no confusion with their backported
counterparts (namely 17.11 and 17.02). I understand the possible confusion
for someone who'd grep the sources though.

If 18.02.0 is OK in everyone's opinion, let's use that. It satisfies the
uniqueness requirement. We'll add a digit or find some other versioning
scheme later if necessary.

Shahaf, can you make a minor adjustment while applying this series?

Both drivers/net/mlx4/Makefile and drivers/net/mlx5/Makefile need to be
modified as follows in patch 3/4:

 -LIB_GLUE_VERSION = 18.02.1
 +LIB_GLUE_VERSION = 18.02.0

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:16                 ` Thomas Monjalon
  2018-02-05 14:33                   ` Adrien Mazarguil
@ 2018-02-05 14:37                   ` Marcelo Ricardo Leitner
  2018-02-05 14:59                     ` Adrien Mazarguil
  1 sibling, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 14:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Adrien Mazarguil, Van Haaren, Harry, dev, Shahaf Shuler,
	Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> 05/02/2018 14:44, Adrien Mazarguil:
> > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > libraries
> > > > > 
> > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > >
> > > > > > > >  # Library name.
> > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > >
> > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > >
> > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > modified,
> > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > >
> > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > added
> > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > possibly
> > > > > > several times between official DPDK releases while work is being done on
> > > > > the
> > > > > > PMD (i.e. per commit).
> > > > > >
> > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > really think it should, doing so will make things more complex in the
> > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > 
> > > > > What about appending date +%s output to it? It would be stricter and
> > > > > automated.
> > > > 
> > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > generally not recommended.
> > > 
> > > Then the sha1sum of mlx4_glue.h.
> > > With this the size check I mentioned on the other patch would become
> > > redundant and unnecessary.
> > 
> > Using a strong hash algorithm to version a library/symbol, while possible,
> > seems a bit overkill and results in ugliness:
> > 
> >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3

Ugh yes, but it wouldn't need to be that visible. A pointer on
mlx*_glue and a define on PMD would be enough already. As in, an
extended check to the versioning.

> > 
> > Using a weak one like CRC32 for a shorter name poses a risk of
> > collision. Moreover the next time someone decides to update all version
> > notices or modify a comment will impact that hash. We'd need to isolate the
> > symbol definition itself, ignore parameter names in function prototypes and
> > only then we may get a somewhat meaningful hash describing a given ABI.

That's what I meant with stricter. Yes it would catch such
situations, but you tell me on how much we want to protect/restrict
here.  Do you see a reason for building only the dpdk/pmd side and not
the glue library at a time?

> > 
> > Given the added complexity, is there really a problem with simple version
> > numbers we increment every time something gets modified? (Note this is
> > already how our .map files work, they're not generated automatically)
> 
> Our map files show the major version where a symbol was introduced.
> It is simple because no symbol can be introduced in a minor version.
> 
> > How about keeping things as is?

I don't really see the need of unique filenames. The next patch is
already leveraging RTE_EAL_PMD_PATH, which if versioned should be
enough for this, no?

> 
> You are using 18.02.1 while it is introduced in 18.02.0.
> If you don't want to correlate the .so version number with DPDK version
> number, maybe that 1, 2, 3 would be a simpler choice (less confusing).

+1

  Marcelo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:37                   ` Marcelo Ricardo Leitner
@ 2018-02-05 14:59                     ` Adrien Mazarguil
  2018-02-05 15:29                       ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 14:59 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > 05/02/2018 14:44, Adrien Mazarguil:
> > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > libraries
> > > > > > 
> > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > >
> > > > > > > > >  # Library name.
> > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > >
> > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > >
> > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > modified,
> > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > >
> > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > added
> > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > possibly
> > > > > > > several times between official DPDK releases while work is being done on
> > > > > > the
> > > > > > > PMD (i.e. per commit).
> > > > > > >
> > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > 
> > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > automated.
> > > > > 
> > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > generally not recommended.
> > > > 
> > > > Then the sha1sum of mlx4_glue.h.
> > > > With this the size check I mentioned on the other patch would become
> > > > redundant and unnecessary.
> > > 
> > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > seems a bit overkill and results in ugliness:
> > > 
> > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> 
> Ugh yes, but it wouldn't need to be that visible. A pointer on
> mlx*_glue and a define on PMD would be enough already. As in, an
> extended check to the versioning.

I thought you suggested this as a replacement. I'm not sure we need or want
to go this far. The current string comparison is really not worse than
standard symbol versioning, which doesn't check symbol properties besides
whether they are functions or other objects. We could have used C++ with
automatically mangled symbol names for that, however that again would make
things way more complex than necessary.

> > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > collision. Moreover the next time someone decides to update all version
> > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > symbol definition itself, ignore parameter names in function prototypes and
> > > only then we may get a somewhat meaningful hash describing a given ABI.
> 
> That's what I meant with stricter. Yes it would catch such
> situations, but you tell me on how much we want to protect/restrict
> here.  Do you see a reason for building only the dpdk/pmd side and not
> the glue library at a time?

No, they're always built together. We're only adding this versioning to
avoid issues when users somehow end up with several DPDK versions installed
on their system, or with leftovers of previous releases lying around. That's
all we need to solve here. dlopen()'ing the proper file takes care of that,
the symbol version number check afterward is performed just in case.

> > > Given the added complexity, is there really a problem with simple version
> > > numbers we increment every time something gets modified? (Note this is
> > > already how our .map files work, they're not generated automatically)
> > 
> > Our map files show the major version where a symbol was introduced.
> > It is simple because no symbol can be introduced in a minor version.
> > 
> > > How about keeping things as is?
> 
> I don't really see the need of unique filenames. The next patch is
> already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> enough for this, no?

As you said, "if" versioned. As an undocumented empty string by default,
there's no way to be sure. Leaving the PMD version its internal but
(unfortunately) exposed bits will certainly prevent mistakes.

> > You are using 18.02.1 while it is introduced in 18.02.0.
> > If you don't want to correlate the .so version number with DPDK version
> > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> 
> +1

Then are you fine with the "18.02.0" suffix?

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 14:59                     ` Adrien Mazarguil
@ 2018-02-05 15:29                       ` Marcelo Ricardo Leitner
  2018-02-05 15:54                         ` Adrien Mazarguil
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 15:29 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > libraries
> > > > > > > 
> > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > >
> > > > > > > > > >  # Library name.
> > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > >
> > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > >
> > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > modified,
> > > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > > >
> > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > added
> > > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > > possibly
> > > > > > > > several times between official DPDK releases while work is being done on
> > > > > > > the
> > > > > > > > PMD (i.e. per commit).
> > > > > > > >
> > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > 
> > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > automated.
> > > > > > 
> > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > generally not recommended.
> > > > > 
> > > > > Then the sha1sum of mlx4_glue.h.
> > > > > With this the size check I mentioned on the other patch would become
> > > > > redundant and unnecessary.
> > > > 
> > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > seems a bit overkill and results in ugliness:
> > > > 
> > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > 
> > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > mlx*_glue and a define on PMD would be enough already. As in, an
> > extended check to the versioning.
> 
> I thought you suggested this as a replacement. I'm not sure we need or want
> to go this far. The current string comparison is really not worse than
> standard symbol versioning, which doesn't check symbol properties besides
> whether they are functions or other objects. We could have used C++ with
> automatically mangled symbol names for that, however that again would make
> things way more complex than necessary.
> 
> > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > collision. Moreover the next time someone decides to update all version
> > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > 
> > That's what I meant with stricter. Yes it would catch such
> > situations, but you tell me on how much we want to protect/restrict
> > here.  Do you see a reason for building only the dpdk/pmd side and not
> > the glue library at a time?
> 
> No, they're always built together. We're only adding this versioning to
> avoid issues when users somehow end up with several DPDK versions installed
> on their system, or with leftovers of previous releases lying around. That's
> all we need to solve here. dlopen()'ing the proper file takes care of that,
> the symbol version number check afterward is performed just in case.

Interesting. These leftovers probably wouldn't be there if it wasn't
versioned in the first place. :-)

> 
> > > > Given the added complexity, is there really a problem with simple version
> > > > numbers we increment every time something gets modified? (Note this is
> > > > already how our .map files work, they're not generated automatically)
> > > 
> > > Our map files show the major version where a symbol was introduced.
> > > It is simple because no symbol can be introduced in a minor version.
> > > 
> > > > How about keeping things as is?
> > 
> > I don't really see the need of unique filenames. The next patch is
> > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > enough for this, no?
> 
> As you said, "if" versioned. As an undocumented empty string by default,
> there's no way to be sure. Leaving the PMD version its internal but
> (unfortunately) exposed bits will certainly prevent mistakes.
> 
> > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > If you don't want to correlate the .so version number with DPDK version
> > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > 
> > +1
> 
> Then are you fine with the "18.02.0" suffix?

Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
to dpdk version.

With the latest replies, I don't think the reasoning is enough to
justify these extra checks, but I won't oppose to including it.

  Marcelo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 15:29                       ` Marcelo Ricardo Leitner
@ 2018-02-05 15:54                         ` Adrien Mazarguil
  2018-02-05 17:06                           ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-05 15:54 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > 05/02/2018 14:44, Adrien Mazarguil:
> > > > > On Mon, Feb 05, 2018 at 10:58:06AM -0200, Marcelo Ricardo Leitner wrote:
> > > > > > On Mon, Feb 05, 2018 at 12:24:23PM +0000, Van Haaren, Harry wrote:
> > > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Marcelo Ricardo Leitner
> > > > > > > > Sent: Monday, February 5, 2018 12:14 PM
> > > > > > > > To: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> > > > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Shahaf Shuler
> > > > > > > > <shahafs@mellanox.com>; Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
> > > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 3/4] net/mlx: version rdma-core glue
> > > > > > > > libraries
> > > > > > > > 
> > > > > > > > On Mon, Feb 05, 2018 at 12:24:02PM +0100, Adrien Mazarguil wrote:
> > > > > > > > > On Sun, Feb 04, 2018 at 03:29:38PM +0100, Thomas Monjalon wrote:
> > > > > > > > > > 02/02/2018 17:46, Adrien Mazarguil:
> > > > > > > > > > > --- a/drivers/net/mlx4/Makefile
> > > > > > > > > > > +++ b/drivers/net/mlx4/Makefile
> > > > > > > > > > > @@ -33,7 +33,9 @@ include $(RTE_SDK)/mk/rte.vars.mk
> > > > > > > > > > >
> > > > > > > > > > >  # Library name.
> > > > > > > > > > >  LIB = librte_pmd_mlx4.a
> > > > > > > > > > > -LIB_GLUE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)
> > > > > > > > > > > +LIB_GLUE_BASE = librte_pmd_mlx4_glue.so
> > > > > > > > > > > +LIB_GLUE_VERSION = 18.02.1
> > > > > > > > > >
> > > > > > > > > > You should use the version number of the release, i.e. 18.02.0
> > > > > > > > > > Ideally, you should retrieve it from rte_version.h.
> > > > > > > > >
> > > > > > > > > Keep in mind this only needs to be updated when the glue API gets
> > > > > > > > modified,
> > > > > > > > > and this "18.02.1" string may remain unmodified for subsequent DPDK
> > > > > > > > > releases, probably as long as the PMD doesn't use any new rdma-core calls.
> > > > > > > > >
> > > > > > > > > We've already backported this patch to 17.02 and 17.11, both requiring
> > > > > > > > > different sets of Verbs calls and thus a different version, hence the
> > > > > > > > added
> > > > > > > > > "18.02" as a starting point. The last digit may have to be modified
> > > > > > > > possibly
> > > > > > > > > several times between official DPDK releases while work is being done on
> > > > > > > > the
> > > > > > > > > PMD (i.e. per commit).
> > > > > > > > >
> > > > > > > > > In short it's not meant to follow DPDK's public versioning scheme. If you
> > > > > > > > > really think it should, doing so will make things more complex in the
> > > > > > > > > Makefile, which will have to parse rte_version.h. What's your opinion?
> > > > > > > > 
> > > > > > > > What about appending date +%s output to it? It would be stricter and
> > > > > > > > automated.
> > > > > > > 
> > > > > > > Adding current timestamp or date into a build breaks reproducibility of builds, so is
> > > > > > > generally not recommended.
> > > > > > 
> > > > > > Then the sha1sum of mlx4_glue.h.
> > > > > > With this the size check I mentioned on the other patch would become
> > > > > > redundant and unnecessary.
> > > > > 
> > > > > Using a strong hash algorithm to version a library/symbol, while possible,
> > > > > seems a bit overkill and results in ugliness:
> > > > > 
> > > > >  librte_pmd_mlx4.so.c4ca4eaf2fe975ead83453458f4f56db49e724f3
> > > 
> > > Ugh yes, but it wouldn't need to be that visible. A pointer on
> > > mlx*_glue and a define on PMD would be enough already. As in, an
> > > extended check to the versioning.
> > 
> > I thought you suggested this as a replacement. I'm not sure we need or want
> > to go this far. The current string comparison is really not worse than
> > standard symbol versioning, which doesn't check symbol properties besides
> > whether they are functions or other objects. We could have used C++ with
> > automatically mangled symbol names for that, however that again would make
> > things way more complex than necessary.
> > 
> > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > collision. Moreover the next time someone decides to update all version
> > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > 
> > > That's what I meant with stricter. Yes it would catch such
> > > situations, but you tell me on how much we want to protect/restrict
> > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > the glue library at a time?
> > 
> > No, they're always built together. We're only adding this versioning to
> > avoid issues when users somehow end up with several DPDK versions installed
> > on their system, or with leftovers of previous releases lying around. That's
> > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > the symbol version number check afterward is performed just in case.
> 
> Interesting. These leftovers probably wouldn't be there if it wasn't
> versioned in the first place. :-)

Seriously, we can't assume users will do everything using neat packages and
may run an unfortunate "make install" from the DPDK source tree without
noticing they wrecked their system. Someone will have to mop the ensuing but
preventable bug reports.

> > > > > Given the added complexity, is there really a problem with simple version
> > > > > numbers we increment every time something gets modified? (Note this is
> > > > > already how our .map files work, they're not generated automatically)
> > > > 
> > > > Our map files show the major version where a symbol was introduced.
> > > > It is simple because no symbol can be introduced in a minor version.
> > > > 
> > > > > How about keeping things as is?
> > > 
> > > I don't really see the need of unique filenames. The next patch is
> > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > enough for this, no?
> > 
> > As you said, "if" versioned. As an undocumented empty string by default,
> > there's no way to be sure. Leaving the PMD version its internal but
> > (unfortunately) exposed bits will certainly prevent mistakes.
> > 
> > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > If you don't want to correlate the .so version number with DPDK version
> > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > 
> > > +1
> > 
> > Then are you fine with the "18.02.0" suffix?
> 
> Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> to dpdk version.
> 
> With the latest replies, I don't think the reasoning is enough to
> justify these extra checks, but I won't oppose to including it.

18.02.0 makes it tied to the current release number, so I guess we agree.
The idea for now is this part remains tied to the DPDK release.

If a new ABI version is needed in a subsequent commit, the initial part gets
bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
intermediate commits break the glue ABI, a fourth digit is added
(e.g. 42.02.0.1).

This role is currently held by the third digit but since there's a confusion
with DPDK revisions, it won't be used internally by the PMD. Hopefully this
fourth digit will remain unused (otherwise I can add as many digits as
necessary to make it acceptable, I'll then re-consider the SHA1 idea :)

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 15:54                         ` Adrien Mazarguil
@ 2018-02-05 17:06                           ` Marcelo Ricardo Leitner
  2018-02-06 11:06                             ` Adrien Mazarguil
  0 siblings, 1 reply; 29+ messages in thread
From: Marcelo Ricardo Leitner @ 2018-02-05 17:06 UTC (permalink / raw)
  To: Adrien Mazarguil
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > 05/02/2018 14:44, Adrien Mazarguil:
...
> > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > 
> > > > That's what I meant with stricter. Yes it would catch such
> > > > situations, but you tell me on how much we want to protect/restrict
> > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > the glue library at a time?
> > > 
> > > No, they're always built together. We're only adding this versioning to
> > > avoid issues when users somehow end up with several DPDK versions installed
> > > on their system, or with leftovers of previous releases lying around. That's
> > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > the symbol version number check afterward is performed just in case.
> > 
> > Interesting. These leftovers probably wouldn't be there if it wasn't
> > versioned in the first place. :-)
> 
> Seriously, we can't assume users will do everything using neat packages and
> may run an unfortunate "make install" from the DPDK source tree without
> noticing they wrecked their system. Someone will have to mop the ensuing but
> preventable bug reports.
> 
> > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > already how our .map files work, they're not generated automatically)
> > > > > 
> > > > > Our map files show the major version where a symbol was introduced.
> > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > 
> > > > > > How about keeping things as is?
> > > > 
> > > > I don't really see the need of unique filenames. The next patch is
> > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > enough for this, no?
> > > 
> > > As you said, "if" versioned. As an undocumented empty string by default,
> > > there's no way to be sure. Leaving the PMD version its internal but
> > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > 
> > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > 
> > > > +1
> > > 
> > > Then are you fine with the "18.02.0" suffix?
> > 
> > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > to dpdk version.
> > 
> > With the latest replies, I don't think the reasoning is enough to
> > justify these extra checks, but I won't oppose to including it.
> 
> 18.02.0 makes it tied to the current release number, so I guess we agree.

It makes them equal, but not tied. If nobody patches it, when 18.02.1
is out, the glue lib will still be 18.02.0.

> The idea for now is this part remains tied to the DPDK release.
> 
> If a new ABI version is needed in a subsequent commit, the initial part gets
> bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> intermediate commits break the glue ABI, a fourth digit is added
> (e.g. 42.02.0.1).

I'll defer this to other project developers. This is more about a
project standard than anything here. I could even argue that this glue
should be named after the pmd lib, such as
   ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
The fact of not providing the _glue.so symlink is enough to avoid
others from linking against it. But it's more of a project standard
than a technical decision, I guess, weather this lib is seen as a
plugin or as a (private) library.

Considering the versioning used for the PMD libs, such easy versioning
is my preferred choice, FWIW.

> 
> This role is currently held by the third digit but since there's a confusion
> with DPDK revisions, it won't be used internally by the PMD. Hopefully this
> fourth digit will remain unused (otherwise I can add as many digits as
> necessary to make it acceptable, I'll then re-consider the SHA1 idea :)

hehe :-)

  Marcelo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 3/4] net/mlx: version rdma-core glue libraries
  2018-02-05 17:06                           ` Marcelo Ricardo Leitner
@ 2018-02-06 11:06                             ` Adrien Mazarguil
  0 siblings, 0 replies; 29+ messages in thread
From: Adrien Mazarguil @ 2018-02-06 11:06 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: Thomas Monjalon, Van Haaren, Harry, dev, Shahaf Shuler, Nelio Laranjeiro

On Mon, Feb 05, 2018 at 03:06:19PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Feb 05, 2018 at 04:54:55PM +0100, Adrien Mazarguil wrote:
> > On Mon, Feb 05, 2018 at 01:29:42PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Mon, Feb 05, 2018 at 03:59:18PM +0100, Adrien Mazarguil wrote:
> > > > On Mon, Feb 05, 2018 at 12:37:34PM -0200, Marcelo Ricardo Leitner wrote:
> > > > > On Mon, Feb 05, 2018 at 03:16:21PM +0100, Thomas Monjalon wrote:
> > > > > > 05/02/2018 14:44, Adrien Mazarguil:
> ...
> > > > > > > Using a weak one like CRC32 for a shorter name poses a risk of
> > > > > > > collision. Moreover the next time someone decides to update all version
> > > > > > > notices or modify a comment will impact that hash. We'd need to isolate the
> > > > > > > symbol definition itself, ignore parameter names in function prototypes and
> > > > > > > only then we may get a somewhat meaningful hash describing a given ABI.
> > > > > 
> > > > > That's what I meant with stricter. Yes it would catch such
> > > > > situations, but you tell me on how much we want to protect/restrict
> > > > > here.  Do you see a reason for building only the dpdk/pmd side and not
> > > > > the glue library at a time?
> > > > 
> > > > No, they're always built together. We're only adding this versioning to
> > > > avoid issues when users somehow end up with several DPDK versions installed
> > > > on their system, or with leftovers of previous releases lying around. That's
> > > > all we need to solve here. dlopen()'ing the proper file takes care of that,
> > > > the symbol version number check afterward is performed just in case.
> > > 
> > > Interesting. These leftovers probably wouldn't be there if it wasn't
> > > versioned in the first place. :-)
> > 
> > Seriously, we can't assume users will do everything using neat packages and
> > may run an unfortunate "make install" from the DPDK source tree without
> > noticing they wrecked their system. Someone will have to mop the ensuing but
> > preventable bug reports.
> > 
> > > > > > > Given the added complexity, is there really a problem with simple version
> > > > > > > numbers we increment every time something gets modified? (Note this is
> > > > > > > already how our .map files work, they're not generated automatically)
> > > > > > 
> > > > > > Our map files show the major version where a symbol was introduced.
> > > > > > It is simple because no symbol can be introduced in a minor version.
> > > > > > 
> > > > > > > How about keeping things as is?
> > > > > 
> > > > > I don't really see the need of unique filenames. The next patch is
> > > > > already leveraging RTE_EAL_PMD_PATH, which if versioned should be
> > > > > enough for this, no?
> > > > 
> > > > As you said, "if" versioned. As an undocumented empty string by default,
> > > > there's no way to be sure. Leaving the PMD version its internal but
> > > > (unfortunately) exposed bits will certainly prevent mistakes.
> > > > 
> > > > > > You are using 18.02.1 while it is introduced in 18.02.0.
> > > > > > If you don't want to correlate the .so version number with DPDK version
> > > > > > number, maybe that 1, 2, 3 would be a simpler choice (less confusing).
> > > > > 
> > > > > +1
> > > > 
> > > > Then are you fine with the "18.02.0" suffix?
> > > 
> > > Not really, sorry. It was more for the "1, 2, 3" sequence or tying it
> > > to dpdk version.
> > > 
> > > With the latest replies, I don't think the reasoning is enough to
> > > justify these extra checks, but I won't oppose to including it.
> > 
> > 18.02.0 makes it tied to the current release number, so I guess we agree.
> 
> It makes them equal, but not tied. If nobody patches it, when 18.02.1
> is out, the glue lib will still be 18.02.0.

Well this must be understood as "this plug-in implements 18.02.0's mlx4 glue
ABI", which remains true (and compatible) with subsequent DPDK releases as
long as the glue code is not updated.

Note this is no different from a single-digit suffix, which wouldn't be
updated either if the ABI isn't. Again, these initial digits are needed
because otherwise there is already a confusion with stable branches that
implement different ABIs and are therefore incompatible:

 librte_pmd_mlx4_glue.so.17.02.1
 librte_pmd_mlx4_glue.so.17.11.1
 librte_pmd_mlx4_glue.so.18.02.0

With a single digit, all of them would be named "librte_pmd_mlx4_glue.so.1",
rendering versioning basically useless.

> > The idea for now is this part remains tied to the DPDK release.
> > 
> > If a new ABI version is needed in a subsequent commit, the initial part gets
> > bumped to the current WIP DPDK release (say, 42.02.0). If subsequent
> > intermediate commits break the glue ABI, a fourth digit is added
> > (e.g. 42.02.0.1).
> 
> I'll defer this to other project developers. This is more about a
> project standard than anything here. I could even argue that this glue
> should be named after the pmd lib, such as
>    ./usr/local/lib/librte_pmd_mlx4_glue.so.1.1
> The fact of not providing the _glue.so symlink is enough to avoid
> others from linking against it. But it's more of a project standard
> than a technical decision, I guess, weather this lib is seen as a
> plugin or as a (private) library.

I think you nailed it, I call it a "plug-in" because dlopen() is manually
performed on it, however it's in fact a private library whose API is not
exposed and no application is supposed to use directly.

For this reason, while up to package maintainers, my suggestion is to not
install it in a public location like "/usr/local/lib" but configure
RTE_EAL_PMD_PATH to some DPDK-specific path, e.g. "/usr/share/dpdk/pmd",
which is possible since patch 4/4 of this series.

> Considering the versioning used for the PMD libs, such easy versioning
> is my preferred choice, FWIW.

Problem remains that the DPDK projects manages its own backports/stable
releases system instead of relying on package maintainers for that, so
properly versioning things from the beginning to avoid collisions is really
always a concern. Had backports not been a requirement in the first place,
I agree a single digit would have been enough.

My suggestion of using 18.02.0 (instead of 18.02.1) stands. It addresses
Thomas' concern by properly matching the DPDK release the ABI was last
updated for and mine for the backports issues mentioned above. Let's go with
that and move on.

-- 
Adrien Mazarguil
6WIND

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration
  2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
                     ` (4 preceding siblings ...)
  2018-02-02 16:52   ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
@ 2018-02-06 11:31   ` Shahaf Shuler
  5 siblings, 0 replies; 29+ messages in thread
From: Shahaf Shuler @ 2018-02-06 11:31 UTC (permalink / raw)
  To: Adrien Mazarguil; +Cc: Nélio Laranjeiro, dev, Marcelo Ricardo Leitner

Friday, February 2, 2018 6:46 PM, Adrien Mazarguil:
> The decision to deliver mlx4/mlx5 rdma-core glue plug-ins separately instead
> of generating them at run time due to security concerns [1] led to a few
> issues:
> 
> - They must be present on the file system before running DPDK.
> - Their location must be known to the dynamic linker.
> - Their names overlap and ABI compatibility is not guaranteed, which may
>   lead to crashes.
> 
> This series addresses the above by adding version information to plug-ins
> and taking CONFIG_RTE_EAL_PMD_PATH into account to locate them on the
> file system.

Series applied to next-net-mlx, with the following diff in patch 3/4:
diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index cc9db9977..cc800493b 100644                                 
--- a/drivers/net/mlx4/Makefile                                   
+++ b/drivers/net/mlx4/Makefile                                   
@@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk               
 LIB = librte_pmd_mlx4.a                                          
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)                  
 LIB_GLUE_BASE = librte_pmd_mlx4_glue.so                          
-LIB_GLUE_VERSION = 18.02.1                                       
+LIB_GLUE_VERSION = 18.02.0                                       
                                                                  
 # Sources.                                                       
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c                     
diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index 4086f2039..3bc9736c9 100644                                 
--- a/drivers/net/mlx5/Makefile                                   
+++ b/drivers/net/mlx5/Makefile                                   
@@ -35,7 +35,7 @@ include $(RTE_SDK)/mk/rte.vars.mk               
 LIB = librte_pmd_mlx5.a                                          
 LIB_GLUE = $(LIB_GLUE_BASE).$(LIB_GLUE_VERSION)                  
 LIB_GLUE_BASE = librte_pmd_mlx5_glue.so                          
-LIB_GLUE_VERSION = 18.02.1                                       
+LIB_GLUE_VERSION = 18.02.0             

Thanks. 


> 
> [1]
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdpd
> k.org%2Fml%2Farchives%2Fdev%2F2018-
> January%2F089617.html&data=02%7C01%7Cshahafs%40mellanox.com%7C6d
> 6d87b37a574c15f41808d56a5c7eae%7Ca652971c7d2e4d9ba6a4d149256f461b
> %7C0%7C0%7C636531867854846685&sdata=9Bc7lEnU%2Fq4E5PxgOkEvgwDN
> zc46%2BZ1B5boHyxg1Cuo%3D&reserved=0
> 
> v2 changes:
> 
> - Fixed extra "\n" in glue file name generation (although it didn't break
>   functionality).
> 
> Adrien Mazarguil (4):
>   net/mlx: add debug checks to glue structure
>   net/mlx: fix missing includes for rdma-core glue
>   net/mlx: version rdma-core glue libraries
>   net/mlx: make rdma-core glue path configurable
> 
>  doc/guides/nics/mlx4.rst     | 17 ++++++++++++
>  doc/guides/nics/mlx5.rst     | 14 ++++++++++
>  drivers/net/mlx4/Makefile    |  8 ++++--
>  drivers/net/mlx4/mlx4.c      | 57
> ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx4/mlx4_glue.c |  4 +++
>  drivers/net/mlx4/mlx4_glue.h |  9 +++++++
>  drivers/net/mlx5/Makefile    |  8 ++++--
>  drivers/net/mlx5/mlx5.c      | 57
> ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_glue.c |  1 +
>  drivers/net/mlx5/mlx5_glue.h |  7 +++++
>  10 files changed, 176 insertions(+), 6 deletions(-)
> 
> --
> 2.11.0

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2018-02-06 11:31 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-02 15:16 [PATCH v1 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
2018-02-02 15:16 ` [PATCH v1 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
2018-02-02 15:16 ` [PATCH v1 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
2018-02-02 15:16 ` [PATCH v1 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
2018-02-02 15:16 ` [PATCH v1 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
2018-02-02 16:46 ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Adrien Mazarguil
2018-02-02 16:46   ` [PATCH v2 1/4] net/mlx: add debug checks to glue structure Adrien Mazarguil
2018-02-05 12:27     ` Marcelo Ricardo Leitner
2018-02-05 13:31       ` Adrien Mazarguil
2018-02-02 16:46   ` [PATCH v2 2/4] net/mlx: fix missing includes for rdma-core glue Adrien Mazarguil
2018-02-02 16:46   ` [PATCH v2 3/4] net/mlx: version rdma-core glue libraries Adrien Mazarguil
2018-02-04 14:29     ` Thomas Monjalon
2018-02-05 11:24       ` Adrien Mazarguil
2018-02-05 12:13         ` Marcelo Ricardo Leitner
2018-02-05 12:24           ` Van Haaren, Harry
2018-02-05 12:43             ` Marcelo Ricardo Leitner
2018-02-05 12:58             ` Marcelo Ricardo Leitner
2018-02-05 13:44               ` Adrien Mazarguil
2018-02-05 14:16                 ` Thomas Monjalon
2018-02-05 14:33                   ` Adrien Mazarguil
2018-02-05 14:37                   ` Marcelo Ricardo Leitner
2018-02-05 14:59                     ` Adrien Mazarguil
2018-02-05 15:29                       ` Marcelo Ricardo Leitner
2018-02-05 15:54                         ` Adrien Mazarguil
2018-02-05 17:06                           ` Marcelo Ricardo Leitner
2018-02-06 11:06                             ` Adrien Mazarguil
2018-02-02 16:46   ` [PATCH v2 4/4] net/mlx: make rdma-core glue path configurable Adrien Mazarguil
2018-02-02 16:52   ` [PATCH v2 0/4] net/mlx: enhance rdma-core glue configuration Nélio Laranjeiro
2018-02-06 11:31   ` Shahaf Shuler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.