* [PATCH v3 0/2] lib/rib: Add Routing Information Base library
@ 2018-02-22 22:50 Medvedkin Vladimir
  2018-02-22 22:50 ` [PATCH v3 1/2] Add RIB library Medvedkin Vladimir
  2018-02-22 22:50 ` [PATCH v3 2/2] Add autotests for RIB library Medvedkin Vladimir
  0 siblings, 2 replies; 7+ messages in thread
From: Medvedkin Vladimir @ 2018-02-22 22:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, Medvedkin Vladimir

This patch series introduces a new library, librte_rib, which could
potentially replace librte_lpm.
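
A minimal usage sketch (illustrative only: error handling is omitted
and the 4-byte next hop dir24_8 variant is chosen arbitrarily):

    struct rte_rib_conf conf = {
        .type = RTE_RIB_DIR24_8_4B,
        .alloc_type = RTE_RIB_MEMPOOL,
        .max_nodes = 1 << 16,
        .node_sz = sizeof(struct rte_rib_node),
        .def_nh = 0,
    };
    struct rte_rib *rib = rte_rib_create("example", SOCKET_ID_ANY, &conf);
    uint32_t ip = IPv4(192, 0, 2, 1);
    uint64_t nh;

    rte_rib_add(rib, IPv4(192, 0, 2, 0), 24, 5);
    rte_rib_fib_lookup_bulk(rib, &ip, &nh, 1);  /* nh == 5 */
    rte_rib_delete(rib, IPv4(192, 0, 2, 0), 24);
    rte_rib_free(rib);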

Medvedkin Vladimir (2):
  Add RIB library
  Add autotests for RIB library

 config/common_base                 |   6 +
 doc/api/doxy-api.conf              |   1 +
 lib/Makefile                       |   2 +
 lib/librte_rib/Makefile            |  23 ++
 lib/librte_rib/rte_dir24_8.c       | 482 +++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_dir24_8.h       | 115 ++++++++
 lib/librte_rib/rte_rib.c           | 526 +++++++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
 lib/librte_rib/rte_rib_version.map |  18 ++
 mk/rte.app.mk                      |   1 +
 test/test/Makefile                 |   5 +
 test/test/test_rib.c               | 290 ++++++++++++++++++++
 test/test/test_rib_generate_rt.c   | 297 +++++++++++++++++++++
 test/test/test_rib_generate_rt.h   |  38 +++
 test/test/test_rib_lpm_comp.c      | 187 +++++++++++++
 test/test/test_rib_perf.c          | 163 ++++++++++++
 16 files changed, 2476 insertions(+)
 create mode 100644 lib/librte_rib/Makefile
 create mode 100644 lib/librte_rib/rte_dir24_8.c
 create mode 100644 lib/librte_rib/rte_dir24_8.h
 create mode 100644 lib/librte_rib/rte_rib.c
 create mode 100644 lib/librte_rib/rte_rib.h
 create mode 100644 lib/librte_rib/rte_rib_version.map
 create mode 100644 test/test/test_rib.c
 create mode 100644 test/test/test_rib_generate_rt.c
 create mode 100644 test/test/test_rib_generate_rt.h
 create mode 100644 test/test/test_rib_lpm_comp.c
 create mode 100644 test/test/test_rib_perf.c

-- 
1.9.1


* [PATCH v3 1/2] Add RIB library
  2018-02-22 22:50 [PATCH v3 0/2] lib/rib: Add Routing Information Base library Medvedkin Vladimir
@ 2018-02-22 22:50 ` Medvedkin Vladimir
  2018-03-14 11:58   ` Bruce Richardson
  2018-03-15 14:27   ` Bruce Richardson
  2018-02-22 22:50 ` [PATCH v3 2/2] Add autotests for RIB library Medvedkin Vladimir
  1 sibling, 2 replies; 7+ messages in thread
From: Medvedkin Vladimir @ 2018-02-22 22:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, Medvedkin Vladimir

RIB is an alternative to the current LPM library.
It addresses the following problems:
 - Increases the speed of control plane operations compared to LPM,
   such as adding/deleting routes
 - Adds an abstraction over dataplane algorithms, so that different IP
   route lookup algorithms such as DXR/poptrie/lpc-trie/etc. can be
   added alongside the current dir24_8
 - User-defined, application-specific information can be kept in
   struct rte_rib_node, which represents a route entry.
   This can be a next hop or a set of next hops (i.e. active and
   feasible), pointers linking rte_rib_nodes by some criterion
   (e.g. next_hop), or any other control plane information.
 - The dir24_8 implementation can drop the rte_lpm_tbl_entry.depth
   field, which saves 6 bits per entry.
 - The new dir24_8 implementation also supports different next_hop
   sizes (1/2/4/8 bytes per next hop)
 - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate the
   ternary operator on the lookup path; instead, lookup returns a
   special default value if there is no route (see the sketch below).
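
To illustrate the resulting entry encoding on the lookup path
(simplified from the lookup code in this patch): the least significant
bit of a table entry is the valid/extended flag, and the remaining
bits hold either a next hop or a tbl8 group index:

    entry = tbl24[ip >> 8];
    if (entry & RTE_DIR24_8_VALID_EXT_ENT) /* extended: points to tbl8 */
        entry = tbl8[(entry >> 1) * 256 + (uint8_t)ip];
    *next_hop = entry >> 1;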

Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
---
 config/common_base                 |   6 +
 doc/api/doxy-api.conf              |   1 +
 lib/Makefile                       |   2 +
 lib/librte_rib/Makefile            |  23 ++
 lib/librte_rib/rte_dir24_8.c       | 482 +++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_dir24_8.h       | 115 ++++++++
 lib/librte_rib/rte_rib.c           | 526 +++++++++++++++++++++++++++++++++++++
 lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
 lib/librte_rib/rte_rib_version.map |  18 ++
 mk/rte.app.mk                      |   1 +
 10 files changed, 1496 insertions(+)
 create mode 100644 lib/librte_rib/Makefile
 create mode 100644 lib/librte_rib/rte_dir24_8.c
 create mode 100644 lib/librte_rib/rte_dir24_8.h
 create mode 100644 lib/librte_rib/rte_rib.c
 create mode 100644 lib/librte_rib/rte_rib.h
 create mode 100644 lib/librte_rib/rte_rib_version.map

diff --git a/config/common_base b/config/common_base
index ad03cf4..aceeff5 100644
--- a/config/common_base
+++ b/config/common_base
@@ -679,6 +679,12 @@ CONFIG_RTE_LIBRTE_LPM=y
 CONFIG_RTE_LIBRTE_LPM_DEBUG=n
 
 #
+# Compile librte_rib
+#
+CONFIG_RTE_LIBRTE_RIB=y
+CONFIG_RTE_LIBRTE_RIB_DEBUG=n
+
+#
 # Compile librte_acl
 #
 CONFIG_RTE_LIBRTE_ACL=y
diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
index cda52fd..8e4f969 100644
--- a/doc/api/doxy-api.conf
+++ b/doc/api/doxy-api.conf
@@ -60,6 +60,7 @@ INPUT                   = doc/api/doxy-api-index.md \
                           lib/librte_kvargs \
                           lib/librte_latencystats \
                           lib/librte_lpm \
+                          lib/librte_rib \
                           lib/librte_mbuf \
                           lib/librte_member \
                           lib/librte_mempool \
diff --git a/lib/Makefile b/lib/Makefile
index ec965a6..e4faf10 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -43,6 +43,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
 DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DEPDIRS-librte_lpm := librte_eal
+DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
+DEPDIRS-librte_rib := librte_eal librte_mempool
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
 DEPDIRS-librte_acl := librte_eal
 DIRS-$(CONFIG_RTE_LIBRTE_MEMBER) += librte_member
diff --git a/lib/librte_rib/Makefile b/lib/librte_rib/Makefile
new file mode 100644
index 0000000..f6431b6
--- /dev/null
+++ b/lib/librte_rib/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_rib.a
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
+LDLIBS += -lrte_eal -lrte_mempool
+
+EXPORT_MAP := rte_rib_version.map
+
+LIBABIVER := 1
+
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_RIB) := rte_rib.c rte_dir24_8.c
+
+# install this header file
+SYMLINK-$(CONFIG_RTE_LIBRTE_RIB)-include := rte_rib.h rte_dir24_8.h
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
new file mode 100644
index 0000000..2fc55fe
--- /dev/null
+++ b/lib/librte_rib/rte_dir24_8.c
@@ -0,0 +1,482 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <rte_debug.h>
+#include <rte_malloc.h>
+#include <rte_prefetch.h>
+#include <rte_errno.h>
+
+#include <inttypes.h>
+
+#include <rte_memory.h>
+#include <rte_branch_prediction.h>
+
+#include <rte_rib.h>
+#include <rte_dir24_8.h>
+
+#define BITMAP_SLAB_BIT_SIZE_LOG2	6
+#define BITMAP_SLAB_BIT_SIZE		(1 << BITMAP_SLAB_BIT_SIZE_LOG2)
+#define BITMAP_SLAB_BITMASK		(BITMAP_SLAB_BIT_SIZE - 1)
+
+#define ROUNDUP(x, y)	 RTE_ALIGN_CEIL(x, (1 << (32 - y)))
+
+static __rte_always_inline __attribute__((pure)) void *
+get_tbl24_p(struct rte_dir24_8_tbl *fib, uint32_t ip)
+{
+	return (void *)&((uint8_t *)fib->tbl24)[(ip &
+		RTE_DIR24_8_TBL24_MASK) >> (8 - fib->nh_sz)];
+}
+
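+/*
+ * Generate a bulk lookup function for a given next hop entry type.
+ * The main loop is software pipelined: the tbl24 entry for
+ * ips[i + prefetch_offset] is prefetched while ips[i] is resolved,
+ * and entries with the valid/extended bit set fall through to a
+ * second read from the tbl8 table.
+ */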
+#define LOOKUP_FUNC(suffix, type, bulk_prefetch)			\
+int rte_dir24_8_lookup_bulk_##suffix(void *fib_p, const uint32_t *ips,	\
+	uint64_t *next_hops, const unsigned int n)				\
+{									\
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;	\
+	uint64_t tmp;							\
+	uint32_t i;							\
+	uint32_t prefetch_offset = RTE_MIN((unsigned int)bulk_prefetch, n);	\
+									\
+	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (ips == NULL) ||	\
+		(next_hops == NULL)), -EINVAL);				\
+									\
+	for (i = 0; i < prefetch_offset; i++)				\
+		rte_prefetch0(get_tbl24_p(fib, ips[i]));		\
+	for (i = 0; i < (n - prefetch_offset); i++) {			\
+		rte_prefetch0(get_tbl24_p(fib, ips[i + prefetch_offset])); \
+		tmp = ((type *)fib->tbl24)[ips[i] >> 8];		\
+		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==	\
+			RTE_DIR24_8_VALID_EXT_ENT)) {			\
+			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +	\
+				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
+		}							\
+		next_hops[i] = tmp >> 1;				\
+	}								\
+	for (; i < n; i++) {						\
+		tmp = ((type *)fib->tbl24)[ips[i] >> 8];		\
+		if (unlikely((tmp & RTE_DIR24_8_VALID_EXT_ENT) ==	\
+			RTE_DIR24_8_VALID_EXT_ENT)) {			\
+			tmp = ((type *)fib->tbl8)[(uint8_t)ips[i] +	\
+				((tmp >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT)]; \
+		}							\
+		next_hops[i] = tmp >> 1;				\
+	}								\
+	return 0;							\
+}									\
+
+static void
+write_to_fib(void *ptr, uint64_t val, enum rte_dir24_8_nh_sz size, int n)
+{
+	int i;
+	uint8_t *ptr8 = (uint8_t *)ptr;
+	uint16_t *ptr16 = (uint16_t *)ptr;
+	uint32_t *ptr32 = (uint32_t *)ptr;
+	uint64_t *ptr64 = (uint64_t *)ptr;
+
+	switch (size) {
+	case RTE_DIR24_8_1B:
+		for (i = 0; i < n; i++)
+			ptr8[i] = (uint8_t)val;
+		break;
+	case RTE_DIR24_8_2B:
+		for (i = 0; i < n; i++)
+			ptr16[i] = (uint16_t)val;
+		break;
+	case RTE_DIR24_8_4B:
+		for (i = 0; i < n; i++)
+			ptr32[i] = (uint32_t)val;
+		break;
+	case RTE_DIR24_8_8B:
+		for (i = 0; i < n; i++)
+			ptr64[i] = (uint64_t)val;
+		break;
+	}
+}
+
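+/*
+ * Allocate a tbl8 group index from the bitmap: find the first slab
+ * with a free (zero) bit, claim that bit and return its global
+ * position, or -ENOSPC if every group is in use.
+ */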
+static int
+tbl8_get_idx(struct rte_dir24_8_tbl *fib)
+{
+	uint32_t i;
+	int bit_idx;
+
+	for (i = 0; (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) &&
+		(fib->tbl8_idxes[i] == UINT64_MAX); i++)
+		;
+	if (i < (fib->number_tbl8s >> BITMAP_SLAB_BIT_SIZE_LOG2)) {
+		bit_idx = __builtin_ctzll(~fib->tbl8_idxes[i]);
+		fib->tbl8_idxes[i] |= (1ULL << bit_idx);
+		return (i << BITMAP_SLAB_BIT_SIZE_LOG2) + bit_idx;
+	}
+	return -ENOSPC;
+}
+
+static inline void
+tbl8_free_idx(struct rte_dir24_8_tbl *fib, int idx)
+{
+	fib->tbl8_idxes[idx >> BITMAP_SLAB_BIT_SIZE_LOG2] &=
+		~(1ULL << (idx & BITMAP_SLAB_BITMASK));
+}
+
+static int
+tbl8_alloc(struct rte_dir24_8_tbl *fib, uint64_t nh)
+{
+	int	tbl8_idx;
+	uint8_t	*tbl8_ptr;
+
+	tbl8_idx = tbl8_get_idx(fib);
+	if (tbl8_idx < 0)
+		return tbl8_idx;
+	tbl8_ptr = (uint8_t *)fib->tbl8 +
+		((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+		fib->nh_sz);
+	/*Init tbl8 entries with nexthop from tbl24*/
+	write_to_fib((void *)tbl8_ptr, nh|
+		RTE_DIR24_8_VALID_EXT_ENT, fib->nh_sz,
+		RTE_DIR24_8_TBL8_GRP_NUM_ENT);
+	return tbl8_idx;
+}
+
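+/*
+ * If all entries of a tbl8 group hold the same next hop, collapse the
+ * group back into a single tbl24 entry and return the group index to
+ * the free bitmap.
+ */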
+static void
+tbl8_recycle(struct rte_dir24_8_tbl *fib, uint32_t ip, uint64_t tbl8_idx)
+{
+	int i;
+	uint64_t nh;
+	uint8_t *ptr8;
+	uint16_t *ptr16;
+	uint32_t *ptr32;
+	uint64_t *ptr64;
+
+	switch (fib->nh_sz) {
+	case RTE_DIR24_8_1B:
+		ptr8 = &((uint8_t *)fib->tbl8)[tbl8_idx *
+				RTE_DIR24_8_TBL8_GRP_NUM_ENT];
+		nh = *ptr8;
+		for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
+			if (nh != ptr8[i])
+				return;
+		}
+		((uint8_t *)fib->tbl24)[ip >> 8] =
+			nh & ~RTE_DIR24_8_VALID_EXT_ENT;
+		for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
+			ptr8[i] = 0;
+		break;
+	case RTE_DIR24_8_2B:
+		ptr16 = &((uint16_t *)fib->tbl8)[tbl8_idx *
+				RTE_DIR24_8_TBL8_GRP_NUM_ENT];
+		nh = *ptr16;
+		for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
+			if (nh != ptr16[i])
+				return;
+		}
+		((uint16_t *)fib->tbl24)[ip >> 8] =
+			nh & ~RTE_DIR24_8_VALID_EXT_ENT;
+		for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
+			ptr16[i] = 0;
+		break;
+	case RTE_DIR24_8_4B:
+		ptr32 = &((uint32_t *)fib->tbl8)[tbl8_idx *
+				RTE_DIR24_8_TBL8_GRP_NUM_ENT];
+		nh = *ptr32;
+		for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
+			if (nh != ptr32[i])
+				return;
+		}
+		((uint32_t *)fib->tbl24)[ip >> 8] =
+			nh & ~RTE_DIR24_8_VALID_EXT_ENT;
+		for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
+			ptr32[i] = 0;
+		break;
+	case RTE_DIR24_8_8B:
+		ptr64 = &((uint64_t *)fib->tbl8)[tbl8_idx *
+				RTE_DIR24_8_TBL8_GRP_NUM_ENT];
+		nh = *ptr64;
+		for (i = 1; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++) {
+			if (nh != ptr64[i])
+				return;
+		}
+		((uint64_t *)fib->tbl24)[ip >> 8] =
+			nh & ~RTE_DIR24_8_VALID_EXT_ENT;
+		for (i = 0; i < RTE_DIR24_8_TBL8_GRP_NUM_ENT; i++)
+			ptr64[i] = 0;
+		break;
+	}
+	tbl8_free_idx(fib, tbl8_idx);
+}
+
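+/*
+ * Write next_hop into the dataplane tables for the half-open address
+ * range [ledge, redge): full /24 spans go straight into tbl24, while
+ * the unaligned edges are expanded into tbl8 groups.
+ */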
+static int
+install_to_fib(struct rte_dir24_8_tbl *fib, uint32_t ledge, uint32_t redge,
+	uint64_t next_hop)
+{
+	uint64_t	tbl24_tmp;
+	int	tbl8_idx;
+	int tmp_tbl8_idx;
+	uint8_t	*tbl8_ptr;
+
+	/*case for 0.0.0.0/0*/
+	if (unlikely((ledge == 0) && (redge == 0))) {
+		write_to_fib(fib->tbl24, next_hop << 1, fib->nh_sz, 1 << 24);
+		return 0;
+	}
+	if (ROUNDUP(ledge, 24) <= redge) {
+		if (ledge < ROUNDUP(ledge, 24)) {
+			tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
+			if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+				RTE_DIR24_8_VALID_EXT_ENT) {
+				tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
+				tmp_tbl8_idx = tbl8_get_idx(fib);
+				if ((tbl8_idx < 0) || (tmp_tbl8_idx < 0))
+					return -ENOSPC;
+				tbl8_free_idx(fib, tmp_tbl8_idx);
+				/*update dir24 entry with tbl8 index*/
+				write_to_fib(get_tbl24_p(fib, ledge),
+					(tbl8_idx << 1)|
+					RTE_DIR24_8_VALID_EXT_ENT,
+					fib->nh_sz, 1);
+			} else
+				tbl8_idx = tbl24_tmp >> 1;
+			tbl8_ptr = (uint8_t *)fib->tbl8 +
+				(((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
+				(ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
+				fib->nh_sz);
+			/*update tbl8 with new next hop*/
+			write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
+				RTE_DIR24_8_VALID_EXT_ENT,
+				fib->nh_sz, ROUNDUP(ledge, 24) - ledge);
+			tbl8_recycle(fib, ledge, tbl8_idx);
+		}
+		if (ROUNDUP(ledge, 24) < (redge & RTE_DIR24_8_TBL24_MASK)) {
+			write_to_fib(get_tbl24_p(fib, ROUNDUP(ledge, 24)),
+				next_hop << 1, fib->nh_sz,
+				((redge & RTE_DIR24_8_TBL24_MASK) -
+				ROUNDUP(ledge, 24)) >> 8);
+		}
+		if (redge & ~RTE_DIR24_8_TBL24_MASK) {
+			tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, redge);
+			if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+					RTE_DIR24_8_VALID_EXT_ENT) {
+				tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
+				if (tbl8_idx < 0)
+					return -ENOSPC;
+				/*update dir24 entry with tbl8 index*/
+				write_to_fib(get_tbl24_p(fib, redge),
+					(tbl8_idx << 1)|
+					RTE_DIR24_8_VALID_EXT_ENT,
+					fib->nh_sz, 1);
+			} else
+				tbl8_idx = tbl24_tmp >> 1;
+			tbl8_ptr = (uint8_t *)fib->tbl8 +
+				((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) <<
+				fib->nh_sz);
+			/*update tbl8 with new next hop*/
+			write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
+				RTE_DIR24_8_VALID_EXT_ENT,
+				fib->nh_sz, redge & ~RTE_DIR24_8_TBL24_MASK);
+			tbl8_recycle(fib, redge, tbl8_idx);
+		}
+	} else {
+		tbl24_tmp = RTE_DIR24_8_GET_TBL24(fib, ledge);
+		if ((tbl24_tmp & RTE_DIR24_8_VALID_EXT_ENT) !=
+			RTE_DIR24_8_VALID_EXT_ENT) {
+			tbl8_idx = tbl8_alloc(fib, tbl24_tmp);
+			if (tbl8_idx < 0)
+				return -ENOSPC;
+			/*update dir24 entry with tbl8 index*/
+			write_to_fib(get_tbl24_p(fib, ledge),
+				(tbl8_idx << 1)|
+				RTE_DIR24_8_VALID_EXT_ENT,
+				fib->nh_sz, 1);
+		} else
+			tbl8_idx = tbl24_tmp >> 1;
+		tbl8_ptr = (uint8_t *)fib->tbl8 +
+			(((tbl8_idx * RTE_DIR24_8_TBL8_GRP_NUM_ENT) +
+			(ledge & ~RTE_DIR24_8_TBL24_MASK)) <<
+			fib->nh_sz);
+		/*update tbl8 with new next hop*/
+		write_to_fib((void *)tbl8_ptr, (next_hop << 1)|
+			RTE_DIR24_8_VALID_EXT_ENT,
+			fib->nh_sz, redge - ledge);
+		tbl8_recycle(fib, ledge, tbl8_idx);
+	}
+	return 0;
+}
+
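+/*
+ * Propagate a route change for ip/depth to the dataplane: walk the
+ * more specific prefixes covered by ip/depth and install next_hop
+ * only into the address gaps between them, leaving existing more
+ * specific routes intact.
+ */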
+static int
+modify_fib(struct rte_rib *rib, uint32_t ip, uint8_t depth,
+	uint64_t next_hop)
+{
+	struct rte_rib_node *tmp = NULL;
+	struct rte_dir24_8_tbl *fib;
+	uint32_t ledge, redge;
+	int ret;
+
+	fib = rib->fib;
+
+	if (next_hop > DIR24_8_MAX_NH(fib))
+		return -EINVAL;
+
+	ledge = ip;
+	do {
+		tmp = rte_rib_tree_get_nxt(rib, ip, depth, tmp,
+			RTE_RIB_GET_NXT_COVER);
+		if (tmp != NULL) {
+			if (tmp->depth == depth)
+				continue;
+			redge = tmp->key;
+			if (ledge == redge) {
+				ledge = redge +
+					(uint32_t)(1ULL << (32 - tmp->depth));
+				continue;
+			}
+			ret = install_to_fib(fib, ledge, redge,
+				next_hop);
+			if (ret != 0)
+				return ret;
+			ledge = redge +
+				(uint32_t)(1ULL << (32 - tmp->depth));
+		} else {
+			redge = ip + (uint32_t)(1ULL << (32 - depth));
+			ret = install_to_fib(fib, ledge, redge,
+				next_hop);
+			if (ret != 0)
+				return ret;
+		}
+	} while (tmp);
+
+	return 0;
+}
+
+int
+rte_dir24_8_modify(struct rte_rib *rib, uint32_t ip, uint8_t depth,
+	uint64_t next_hop, enum rte_rib_op op)
+{
+	struct rte_dir24_8_tbl *fib;
+	struct rte_rib_node *tmp = NULL;
+	struct rte_rib_node *node;
+	struct rte_rib_node *parent;
+	int ret = 0;
+
+	if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
+		return -EINVAL;
+
+	fib = rib->fib;
+	RTE_ASSERT(fib);
+
+	ip &= (uint32_t)(UINT64_MAX << (32 - depth));
+
+	node = rte_rib_tree_lookup_exact(rib, ip, depth);
+	switch (op) {
+	case RTE_RIB_ADD:
+		if (node != NULL) {
+			if (node->nh == next_hop)
+				return 0;
+			ret = modify_fib(rib, ip, depth, next_hop);
+			if (ret == 0)
+				node->nh = next_hop;
+			return ret;
+		}
+		if (depth > 24) {
+			tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL,
+				RTE_RIB_GET_NXT_COVER);
+			if ((tmp == NULL) &&
+				(fib->cur_tbl8s >= fib->number_tbl8s))
+				return -ENOSPC;
+
+		}
+		node = rte_rib_tree_insert(rib, ip, depth);
+		if (node == NULL)
+			return -rte_errno;
+		node->nh = next_hop;
+		parent = rte_rib_tree_lookup_parent(node);
+		if ((parent != NULL) && (parent->nh == next_hop))
+			return 0;
+		ret = modify_fib(rib, ip, depth, next_hop);
+		if (ret) {
+			rte_rib_tree_remove(rib, ip, depth);
+			return ret;
+		}
+		if ((depth > 24) && (tmp == NULL))
+			fib->cur_tbl8s++;
+		return 0;
+	case RTE_RIB_DEL:
+		if (node == NULL)
+			return -ENOENT;
+
+		parent = rte_rib_tree_lookup_parent(node);
+		if (parent != NULL) {
+			if (parent->nh != node->nh)
+				ret = modify_fib(rib, ip, depth, parent->nh);
+		} else
+			ret = modify_fib(rib, ip, depth, fib->def_nh);
+		if (ret == 0) {
+			rte_rib_tree_remove(rib, ip, depth);
+			if (depth > 24) {
+				tmp = rte_rib_tree_get_nxt(rib, ip, 24, NULL,
+					RTE_RIB_GET_NXT_COVER);
+				if (tmp == NULL)
+					fib->cur_tbl8s--;
+			}
+		}
+		return ret;
+	default:
+		break;
+	}
+	return -EINVAL;
+}
+
+struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
+	enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh)
+{
+	char mem_name[RTE_RIB_NAMESIZE];
+	struct rte_dir24_8_tbl *fib;
+
+	snprintf(mem_name, sizeof(mem_name), "FIB_%s", name);
+	fib = rte_zmalloc_socket(mem_name, sizeof(struct rte_dir24_8_tbl) +
+		RTE_DIR24_8_TBL24_NUM_ENT * (1 << nh_sz), RTE_CACHE_LINE_SIZE,
+		socket_id);
+	if (fib == NULL)
+		return fib;
+
+	snprintf(mem_name, sizeof(mem_name), "TBL8_%s", name);
+	fib->tbl8 = rte_zmalloc_socket(mem_name, RTE_DIR24_8_TBL8_GRP_NUM_ENT *
+			(1 << nh_sz) * RTE_DIR24_8_TBL8_NUM_GROUPS,
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (fib->tbl8 == NULL) {
+		rte_free(fib);
+		return NULL;
+	}
+	fib->def_nh = def_nh;
+	fib->nh_sz = nh_sz;
+	fib->number_tbl8s = RTE_MIN((uint32_t)RTE_DIR24_8_TBL8_NUM_GROUPS,
+				DIR24_8_MAX_NH(fib));
+
+	snprintf(mem_name, sizeof(mem_name), "TBL8_idxes_%s", name);
+	fib->tbl8_idxes = rte_zmalloc_socket(mem_name,
+			RTE_ALIGN_CEIL(fib->number_tbl8s, 64) >> 3,
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (fib->tbl8_idxes == NULL) {
+		rte_free(fib->tbl8);
+		rte_free(fib);
+		return NULL;
+	}
+
+	return fib;
+}
+
+void
+rte_dir24_8_free(void *fib_p)
+{
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+
+	rte_free(fib->tbl8_idxes);
+	rte_free(fib->tbl8);
+	rte_free(fib);
+}
+
+LOOKUP_FUNC(1b, uint8_t, 5)
+LOOKUP_FUNC(2b, uint16_t, 6)
+LOOKUP_FUNC(4b, uint32_t, 15)
+LOOKUP_FUNC(8b, uint64_t, 12)
diff --git a/lib/librte_rib/rte_dir24_8.h b/lib/librte_rib/rte_dir24_8.h
new file mode 100644
index 0000000..a4a865d
--- /dev/null
+++ b/lib/librte_rib/rte_dir24_8.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#ifndef _RTE_DIR24_8_H_
+#define _RTE_DIR24_8_H_
+
+/**
+ * @file
+ * RTE DIR24-8 Longest Prefix Match lookup table
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/** @internal Total number of tbl24 entries. */
+#define RTE_DIR24_8_TBL24_NUM_ENT	(1 << 24)
+
+/** Maximum depth value possible for IPv4 LPM. */
+#define RTE_DIR24_8_MAX_DEPTH		32
+
+/** @internal Number of entries in a tbl8 group. */
+#define RTE_DIR24_8_TBL8_GRP_NUM_ENT	256
+
+/** @internal Total number of tbl8 groups in the tbl8. */
+#define RTE_DIR24_8_TBL8_NUM_GROUPS	65536
+
+/** @internal bitmask with valid and valid_group fields set */
+#define RTE_DIR24_8_VALID_EXT_ENT	0x01
+
+#define RTE_DIR24_8_TBL24_MASK		0xffffff00
+
+/** Size of a next hop entry is (1 << nh_sz) bytes */
+enum rte_dir24_8_nh_sz {
+	RTE_DIR24_8_1B,
+	RTE_DIR24_8_2B,
+	RTE_DIR24_8_4B,
+	RTE_DIR24_8_8B
+};
+
+
+#define DIR24_8_BITS_IN_NH(fib)		(8 * (1 << fib->nh_sz))
+#define DIR24_8_MAX_NH(fib)	((1ULL << (DIR24_8_BITS_IN_NH(fib) - 1)) - 1)
+
+#define DIR24_8_TBL_IDX(a, fib)		((a) >> (3 - fib->nh_sz))
+#define DIR24_8_PSD_IDX(a, fib)		((a) & ((1 << (3 - fib->nh_sz)) - 1))
+
+#define DIR24_8_TBL24_VAL(ip)	(ip >> 8)
+#define DIR24_8_TBL8_VAL(res, ip)					\
+	((res >> 1) * RTE_DIR24_8_TBL8_GRP_NUM_ENT + (uint8_t)ip)	\
+
+#define DIR24_8_LOOKUP_MSK						\
+	(((1ULL << ((1 << (fib->nh_sz + 3)) - 1)) << 1) - 1)		\
+
+#define RTE_DIR24_8_GET_TBL24(fib, ip)					\
+	((fib->tbl24[DIR24_8_TBL_IDX(DIR24_8_TBL24_VAL(ip), fib)] >>	\
+	(DIR24_8_PSD_IDX(DIR24_8_TBL24_VAL(ip), fib) *			\
+	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK)			\
+
+#define RTE_DIR24_8_GET_TBL8(fib, res, ip)				\
+	((fib->tbl8[DIR24_8_TBL_IDX(DIR24_8_TBL8_VAL(res, ip), fib)] >>	\
+	(DIR24_8_PSD_IDX(DIR24_8_TBL8_VAL(res, ip), fib) *		\
+	DIR24_8_BITS_IN_NH(fib))) & DIR24_8_LOOKUP_MSK)			\
+
+
+struct rte_dir24_8_tbl {
+	uint32_t	number_tbl8s;	/**< Total number of tbl8s. */
+	uint32_t	cur_tbl8s;	/**< Current number of tbl8s. */
+	uint64_t	def_nh;
+	enum rte_dir24_8_nh_sz	nh_sz;	/**< Size of nexthop entry */
+	uint64_t	*tbl8;		/**< LPM tbl8 table. */
+	uint64_t	*tbl8_idxes;
+	uint64_t	tbl24[0] __rte_cache_aligned; /**< LPM tbl24 table. */
+};
+
+struct rte_dir24_8_tbl *rte_dir24_8_create(const char *name, int socket_id,
+	enum rte_dir24_8_nh_sz nh_sz, uint64_t def_nh);
+void rte_dir24_8_free(void *fib_p);
+int rte_dir24_8_modify(struct rte_rib *rib, uint32_t key,
+	uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
+int rte_dir24_8_lookup_bulk_1b(void *fib_p, const uint32_t *ips,
+	uint64_t *next_hops, const unsigned int n);
+int rte_dir24_8_lookup_bulk_2b(void *fib_p, const uint32_t *ips,
+	uint64_t *next_hops, const unsigned int n);
+int rte_dir24_8_lookup_bulk_4b(void *fib_p, const uint32_t *ips,
+	uint64_t *next_hops, const unsigned int n);
+int rte_dir24_8_lookup_bulk_8b(void *fib_p, const uint32_t *ips,
+	uint64_t *next_hops, const unsigned int n);
+
+
+static inline int
+rte_dir24_8_lookup(void *fib_p, uint32_t ip, uint64_t *next_hop)
+{
+	uint64_t res;
+	struct rte_dir24_8_tbl *fib = (struct rte_dir24_8_tbl *)fib_p;
+
+	RTE_RIB_RETURN_IF_TRUE(((fib == NULL) || (next_hop == NULL)), -EINVAL);
+
+	res = RTE_DIR24_8_GET_TBL24(fib, ip);
+	if (unlikely((res & RTE_DIR24_8_VALID_EXT_ENT) ==
+		RTE_DIR24_8_VALID_EXT_ENT)) {
+		res = RTE_DIR24_8_GET_TBL8(fib, res, ip);
+	}
+	*next_hop = res >> 1;
+	return 0;
+}
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_DIR24_8_H_ */
+
diff --git a/lib/librte_rib/rte_rib.c b/lib/librte_rib/rte_rib.c
new file mode 100644
index 0000000..7783b23
--- /dev/null
+++ b/lib/librte_rib/rte_rib.c
@@ -0,0 +1,526 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/queue.h>
+
+#include <rte_eal.h>
+#include <rte_eal_memconfig.h>
+#include <rte_common.h>
+#include <rte_tailq.h>
+#include <rte_errno.h>
+#include <rte_rwlock.h>
+#include <rte_memory.h>
+#include <rte_memzone.h>
+#include <rte_mempool.h>
+#include <rte_malloc.h>
+#include <rte_log.h>
+
+#include <rte_rib.h>
+#include <rte_dir24_8.h>
+
+TAILQ_HEAD(rte_rib_list, rte_tailq_entry);
+static struct rte_tailq_elem rte_rib_tailq = {
+	.name = "RTE_RIB",
+};
+EAL_REGISTER_TAILQ(rte_rib_tailq)
+
+static struct rte_rib_node *
+new_node_malloc(struct rte_rib *rib)
+{
+	struct rte_rib_node *ent;
+
+	ent =  malloc(rib->node_sz);
+	if (unlikely(ent == NULL))
+		return NULL;
+	++rib->cur_nodes;
+	return ent;
+}
+
+static void
+free_node_malloc(struct rte_rib *rib, struct rte_rib_node *ent)
+{
+	--rib->cur_nodes;
+	free(ent);
+}
+
+static struct rte_rib_node *
+new_node_mempool(struct rte_rib *rib)
+{
+	struct rte_rib_node *ent;
+	int ret;
+
+	ret = rte_mempool_get(rib->node_pool, (void *)&ent);
+	if (unlikely(ret != 0))
+		return NULL;
+	++rib->cur_nodes;
+	return ent;
+}
+
+static void
+free_node_mempool(struct rte_rib *rib, struct rte_rib_node *ent)
+{
+	--rib->cur_nodes;
+	rte_mempool_put(rib->node_pool, ent);
+}
+
+struct rte_rib_node *
+rte_rib_tree_lookup(struct rte_rib *rib, uint32_t key)
+{
+	struct rte_rib_node *cur = rib->trie;
+	struct rte_rib_node *prev = NULL;
+
+	while ((cur != NULL) && (((cur->key ^ key) &
+		(uint32_t)(UINT64_MAX << (32 - cur->depth))) == 0)) {
+		if ((cur->flag & RTE_RIB_VALID_NODE) == RTE_RIB_VALID_NODE)
+			prev = cur;
+		cur = RTE_RIB_GET_NXT_NODE(cur, key);
+	}
+	return prev;
+}
+
+struct rte_rib_node *
+rte_rib_tree_lookup_parent(struct rte_rib_node *ent)
+{
+	struct rte_rib_node *tmp;
+
+	if (ent == NULL)
+		return NULL;
+	tmp = ent->parent;
+	while ((tmp != NULL) &&
+		(tmp->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
+		tmp = tmp->parent;
+	}
+	return tmp;
+}
+
+struct rte_rib_node *
+rte_rib_tree_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t depth)
+{
+	struct rte_rib_node *cur = rib->trie;
+
+	key &= (uint32_t)(UINT64_MAX << (32 - depth));
+	while (cur != NULL) {
+		if ((cur->key == key) && (cur->depth == depth) &&
+				(cur->flag & RTE_RIB_VALID_NODE))
+			return cur;
+		if ((cur->depth > depth) ||
+				(((uint64_t)key >> (32 - cur->depth)) !=
+				((uint64_t)cur->key >> (32 - cur->depth))))
+			break;
+		cur = RTE_RIB_GET_NXT_NODE(cur, key);
+	}
+	return NULL;
+}
+
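+/*
+ * Iterate over the more specific prefixes covered by key/depth. When
+ * cur is NULL the walk starts from the trie root, otherwise it
+ * resumes after the previously returned node. With
+ * RTE_RIB_GET_NXT_COVER the walk does not descend below a prefix it
+ * has already returned.
+ */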
+struct rte_rib_node *
+rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key,
+	uint8_t depth, struct rte_rib_node *cur, int flag)
+{
+	struct rte_rib_node *tmp, *prev = NULL;
+
+	if (cur == NULL) {
+		tmp = rib->trie;
+		while ((tmp) && (tmp->depth < depth))
+			tmp = RTE_RIB_GET_NXT_NODE(tmp, key);
+	} else {
+		tmp = cur;
+		while ((tmp->parent != NULL) && (RTE_RIB_IS_RIGHT_NODE(tmp) ||
+				(tmp->parent->right == NULL))) {
+			tmp = tmp->parent;
+			if ((tmp->flag & RTE_RIB_VALID_NODE) &&
+				(RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
+					key, depth)))
+				return tmp;
+		}
+		tmp = (tmp->parent) ? tmp->parent->right : NULL;
+	}
+	while (tmp) {
+		if ((tmp->flag & RTE_RIB_VALID_NODE) &&
+			(RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
+				key, depth))) {
+			prev = tmp;
+			if (flag == RTE_RIB_GET_NXT_COVER)
+				return prev;
+		}
+		tmp = (tmp->left) ? tmp->left : tmp->right;
+	}
+	return prev;
+}
+
+void
+rte_rib_tree_remove(struct rte_rib *rib, uint32_t key, uint8_t depth)
+{
+	struct rte_rib_node *cur, *prev, *child;
+
+	cur = rte_rib_tree_lookup_exact(rib, key, depth);
+	if (cur == NULL)
+		return;
+
+	--rib->cur_routes;
+	cur->flag &= ~RTE_RIB_VALID_NODE;
+	while ((cur->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
+		if ((cur->left != NULL) && (cur->right != NULL))
+			return;
+		child = (cur->left == NULL) ? cur->right : cur->left;
+		if (child != NULL)
+			child->parent = cur->parent;
+		if (cur->parent == NULL) {
+			rib->trie = child;
+			rib->free_node(rib, cur);
+			return;
+		}
+		if (cur->parent->left == cur)
+			cur->parent->left = child;
+		else
+			cur->parent->right = child;
+		prev = cur;
+		cur = cur->parent;
+		rib->free_node(rib, prev);
+	}
+}
+
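+/*
+ * Insert into the path-compressed binary trie: descend while the new
+ * key shares the prefix of the current node; on divergence either
+ * splice the new node in directly or create an intermediate
+ * (non-route) node holding the common prefix of the two keys.
+ */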
+struct rte_rib_node *
+rte_rib_tree_insert(struct rte_rib *rib, uint32_t key, uint8_t depth)
+{
+	struct rte_rib_node **tmp = &rib->trie;
+	struct rte_rib_node *prev = NULL;
+	struct rte_rib_node *new_node = NULL;
+	struct rte_rib_node *common_node = NULL;
+	int i = 0;
+	uint32_t common_prefix;
+	uint8_t common_depth;
+
+	if (depth > 32) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	key &= (uint32_t)(UINT64_MAX << (32 - depth));
+	new_node = rte_rib_tree_lookup_exact(rib, key, depth);
+	if (new_node != NULL) {
+		rte_errno = EEXIST;
+		return NULL;
+	}
+
+	new_node = rib->alloc_node(rib);
+	if (new_node == NULL) {
+		rte_errno = ENOMEM;
+		return NULL;
+	}
+	new_node->left = NULL;
+	new_node->right = NULL;
+	new_node->parent = NULL;
+	new_node->key = key;
+	new_node->depth = depth;
+	new_node->flag = RTE_RIB_VALID_NODE;
+
+	while (1) {
+		if (*tmp == NULL) {
+			*tmp = new_node;
+			new_node->parent = prev;
+		}
+		if ((key == (*tmp)->key) && (depth == (*tmp)->depth)) {
+			if (new_node != *tmp) {
+				rib->free_node(rib, new_node);
+				(*tmp)->flag |= RTE_RIB_VALID_NODE;
+			}
+			++rib->cur_routes;
+			return *tmp;
+		}
+		i = (*tmp)->depth;
+		if ((i >= depth) || (((uint64_t)key >> (32 - i)) !=
+				((uint64_t)(*tmp)->key >> (32 - i))))
+			break;
+		prev = *tmp;
+		tmp = (key & (1 << (31 - i))) ? &(*tmp)->right : &(*tmp)->left;
+	}
+	common_depth = RTE_MIN(depth, (*tmp)->depth);
+	common_prefix = key ^ (*tmp)->key;
+	i = __builtin_clz(common_prefix);
+
+	common_depth = RTE_MIN(i, common_depth);
+	common_prefix = key & (uint32_t)(UINT64_MAX << (32 - common_depth));
+	if ((common_prefix == key) && (common_depth == depth)) {
+		if ((*tmp)->key & (1 << (31 - depth)))
+			new_node->right = *tmp;
+		else
+			new_node->left = *tmp;
+		new_node->parent = (*tmp)->parent;
+		(*tmp)->parent = new_node;
+		*tmp = new_node;
+	} else {
+		common_node = rib->alloc_node(rib);
+		if (common_node == NULL) {
+			rib->free_node(rib, new_node);
+			rte_errno = ENOMEM;
+			return NULL;
+		}
+		common_node->key = common_prefix;
+		common_node->depth = common_depth;
+		common_node->flag = 0;
+		common_node->parent = (*tmp)->parent;
+		new_node->parent = common_node;
+		(*tmp)->parent = common_node;
+		if ((new_node->key & (1 << (31 - common_depth))) == 0) {
+			common_node->left = new_node;
+			common_node->right = *tmp;
+		} else {
+			common_node->left = *tmp;
+			common_node->right = new_node;
+		}
+		*tmp = common_node;
+	}
+	++rib->cur_routes;
+	return new_node;
+}
+
+struct rte_rib *
+rte_rib_create(const char *name, int socket_id, struct rte_rib_conf *conf)
+{
+	char mem_name[RTE_RIB_NAMESIZE];
+	struct rte_rib *rib = NULL;
+	struct rte_tailq_entry *te;
+	struct rte_rib_list *rib_list;
+	struct rte_mempool *node_pool = NULL;
+	enum rte_dir24_8_nh_sz dir24_8_nh_size;
+
+	/* Check user arguments. */
+	if ((name == NULL) || (conf == NULL) || (socket_id < -1) ||
+			(conf->type >= RTE_RIB_TYPE_MAX) ||
+			(conf->alloc_type >= RTE_RIB_ALLOC_MAX) ||
+			(conf->max_nodes == 0) ||
+			(conf->node_sz < sizeof(struct rte_rib_node))) {
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	if (conf->alloc_type == RTE_RIB_MEMPOOL) {
+		snprintf(mem_name, sizeof(mem_name), "MP_%s", name);
+		node_pool = rte_mempool_create(mem_name, conf->max_nodes,
+			conf->node_sz, 0, 0, NULL, NULL, NULL, NULL,
+			socket_id, 0);
+
+		if (node_pool == NULL) {
+			RTE_LOG(ERR, LPM,
+				"Can not allocate mempool for RIB %s\n", name);
+			rte_errno = ENOMEM;
+			return NULL;
+		}
+
+	}
+
+	snprintf(mem_name, sizeof(mem_name), "RIB_%s", name);
+
+	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
+
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+	/* guarantee there's no existing entry with the same name */
+	TAILQ_FOREACH(te, rib_list, next) {
+		rib = (struct rte_rib *)te->data;
+		if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
+			break;
+	}
+	rib = NULL;
+	if (te != NULL) {
+		rte_errno = EEXIST;
+		goto exit;
+	}
+
+	/* allocate tailq entry */
+	te = rte_zmalloc("RIB_TAILQ_ENTRY", sizeof(*te), 0);
+	if (te == NULL) {
+		RTE_LOG(ERR, LPM,
+			"Can not allocate tailq entry for RIB %s\n", name);
+		rte_errno = ENOMEM;
+		goto exit;
+	}
+
+	/* Allocate memory to store the RIB data structures. */
+	rib = (struct rte_rib *)rte_zmalloc_socket(mem_name,
+		sizeof(struct rte_rib),	RTE_CACHE_LINE_SIZE, socket_id);
+	if (rib == NULL) {
+		RTE_LOG(ERR, LPM, "RIB %s memory allocation failed\n", name);
+		rte_errno = ENOMEM;
+		goto free_te;
+	}
+	snprintf(rib->name, sizeof(rib->name), "%s", name);
+	rib->trie = NULL;
+	rib->max_nodes = conf->max_nodes;
+	rib->node_sz = conf->node_sz;
+	rib->type = conf->type;
+	rib->alloc_type = conf->alloc_type;
+
+	if (conf->type <= RTE_RIB_DIR24_8_8B) {
+		switch (conf->type) {
+		case RTE_RIB_DIR24_8_1B:
+			dir24_8_nh_size = RTE_DIR24_8_1B;
+			rib->lookup = rte_dir24_8_lookup_bulk_1b;
+			break;
+		case RTE_RIB_DIR24_8_2B:
+			dir24_8_nh_size = RTE_DIR24_8_2B;
+			rib->lookup = rte_dir24_8_lookup_bulk_2b;
+			break;
+		case RTE_RIB_DIR24_8_4B:
+			dir24_8_nh_size = RTE_DIR24_8_4B;
+			rib->lookup = rte_dir24_8_lookup_bulk_4b;
+			break;
+		case RTE_RIB_DIR24_8_8B:
+			dir24_8_nh_size = RTE_DIR24_8_8B;
+			rib->lookup = rte_dir24_8_lookup_bulk_8b;
+			break;
+		case RTE_RIB_TYPE_MAX:
+		default:
+			RTE_LOG(ERR, LPM, "Bad RIB %s type\n", name);
+			rte_errno = EINVAL;
+			goto free_rib;
+		}
+		rib->fib = (void *)rte_dir24_8_create(name, socket_id,
+				dir24_8_nh_size, conf->def_nh);
+		if (rib->fib == NULL) {
+			RTE_LOG(ERR, LPM, "Failed to allocate FIB %s\n", name);
+			rte_errno = ENOMEM;
+			goto free_rib;
+		}
+		rib->modify = rte_dir24_8_modify;
+	}
+
+	switch (conf->alloc_type) {
+	case RTE_RIB_MALLOC:
+		rib->alloc_node = new_node_malloc;
+		rib->free_node = free_node_malloc;
+		break;
+	case RTE_RIB_MEMPOOL:
+		rib->node_pool = node_pool;
+		rib->alloc_node = new_node_mempool;
+		rib->free_node = free_node_mempool;
+		break;
+	case RTE_RIB_ALLOC_MAX:
+	default:
+		RTE_LOG(ERR, LPM, "Bad RIB %s alloc type\n", name);
+		rte_errno = EINVAL;
+		goto free_fib;
+	}
+
+	te->data = (void *)rib;
+	TAILQ_INSERT_TAIL(rib_list, te, next);
+
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	return rib;
+
+free_fib:
+	switch (conf->type) {
+	case RTE_RIB_DIR24_8_1B:
+	case RTE_RIB_DIR24_8_2B:
+	case RTE_RIB_DIR24_8_4B:
+	case RTE_RIB_DIR24_8_8B:
+		rte_dir24_8_free(rib->fib);
+		break;
+	default:
+		break;
+	}
+free_rib:
+	rte_free(rib);
+free_te:
+	rte_free(te);
+exit:
+	if (conf->alloc_type == RTE_RIB_MEMPOOL)
+		rte_mempool_free(node_pool);
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	return NULL;
+}
+
+struct rte_rib *
+rte_rib_find_existing(const char *name)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_tailq_entry *te;
+	struct rte_rib_list *rib_list;
+
+	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
+
+	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
+	TAILQ_FOREACH(te, rib_list, next) {
+		rib = (struct rte_rib *) te->data;
+		if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
+			break;
+	}
+	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	if (te == NULL) {
+		rte_errno = ENOENT;
+		return NULL;
+	}
+
+	return rib;
+}
+
+void
+rte_rib_free(struct rte_rib *rib)
+{
+	struct rte_tailq_entry *te;
+	struct rte_rib_list *rib_list;
+	struct rte_rib_node *tmp = NULL;
+
+	if (rib == NULL)
+		return;
+
+	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
+
+	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
+
+	/* find our tailq entry */
+	TAILQ_FOREACH(te, rib_list, next) {
+		if (te->data == (void *)rib)
+			break;
+	}
+	if (te != NULL)
+		TAILQ_REMOVE(rib_list, te, next);
+
+	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+
+	while ((tmp = rte_rib_tree_get_nxt(rib, 0, 0, tmp,
+		RTE_RIB_GET_NXT_ALL)) != NULL)
+		rte_rib_tree_remove(rib, tmp->key, tmp->depth);
+
+	if (rib->alloc_type == RTE_RIB_MEMPOOL)
+		rte_mempool_free(rib->node_pool);
+
+	switch (rib->type) {
+	case RTE_RIB_DIR24_8_1B:
+	case RTE_RIB_DIR24_8_2B:
+	case RTE_RIB_DIR24_8_4B:
+	case RTE_RIB_DIR24_8_8B:
+		rte_dir24_8_free(rib->fib);
+		break;
+	default:
+		break;
+	}
+
+	rte_free(rib);
+	rte_free(te);
+}
+
+int
+rte_rib_add(struct rte_rib *rib, uint32_t ip, uint8_t depth, uint64_t next_hop)
+{
+	if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
+		return -EINVAL;
+
+	return rib->modify(rib, ip, depth, next_hop, RTE_RIB_ADD);
+}
+
+int
+rte_rib_delete(struct rte_rib *rib, uint32_t ip, uint8_t depth)
+{
+	if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
+		return -EINVAL;
+
+	return rib->modify(rib, ip, depth, 0, RTE_RIB_DEL);
+}
diff --git a/lib/librte_rib/rte_rib.h b/lib/librte_rib/rte_rib.h
new file mode 100644
index 0000000..9ca1591
--- /dev/null
+++ b/lib/librte_rib/rte_rib.h
@@ -0,0 +1,322 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#ifndef _RTE_RIB_H_
+#define _RTE_RIB_H_
+
+/**
+ * @file
+ * Compressed trie implementation for Longest Prefix Match
+ */
+
+/** @internal Macro to enable/disable run-time checks. */
+#if defined(RTE_LIBRTE_RIB_DEBUG)
+#define RTE_RIB_RETURN_IF_TRUE(cond, retval) do {	\
+	if (cond)					\
+		return retval;				\
+} while (0)
+#else
+#define RTE_RIB_RETURN_IF_TRUE(cond, retval)
+#endif
+
+#define RTE_RIB_VALID_NODE	1
+#define RTE_RIB_GET_NXT_ALL	0
+#define RTE_RIB_GET_NXT_COVER	1
+
+#define RTE_RIB_INVALID_ROUTE	0
+#define RTE_RIB_VALID_ROUTE	1
+
+/** Max number of characters in RIB name. */
+#define RTE_RIB_NAMESIZE	64
+
+/** Maximum depth value possible for IPv4 RIB. */
+#define RTE_RIB_MAXDEPTH	32
+
+/**
+ * Macro to check if prefix1 {key1/depth1}
+ * is covered by prefix2 {key2/depth2}
+ */
+#define RTE_RIB_IS_COVERED(key1, depth1, key2, depth2)			\
+	((((key1 ^ key2) & (uint32_t)(UINT64_MAX << (32 - depth2))) == 0)\
+		&& (depth1 > depth2))
+
+/** @internal Macro to get next node in tree*/
+#define RTE_RIB_GET_NXT_NODE(node, key)					\
+	((key & (1 << (31 - node->depth))) ? node->right : node->left)
+/** @internal Macro to check if node is right child*/
+#define RTE_RIB_IS_RIGHT_NODE(node)	(node->parent->right == node)
+
+
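+/**
+ * Trie node. It is a route entry when RTE_RIB_VALID_NODE is set in
+ * flag, otherwise an internal path-compression node. User data may be
+ * stored in the variable-sized ext[] tail (sized via
+ * rte_rib_conf.node_sz).
+ */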
+struct rte_rib_node {
+	struct rte_rib_node *left;
+	struct rte_rib_node *right;
+	struct rte_rib_node *parent;
+	uint32_t	key;
+	uint8_t		depth;
+	uint8_t		flag;
+	uint64_t	nh;
+	uint64_t	ext[0];
+};
+
+struct rte_rib;
+
+/** Type of FIB struct*/
+enum rte_rib_type {
+	RTE_RIB_DIR24_8_1B,
+	RTE_RIB_DIR24_8_2B,
+	RTE_RIB_DIR24_8_4B,
+	RTE_RIB_DIR24_8_8B,
+	RTE_RIB_TYPE_MAX
+};
+
+enum rte_rib_op {
+	RTE_RIB_ADD,
+	RTE_RIB_DEL
+};
+
+/** RIB nodes allocation type */
+enum rte_rib_alloc_type {
+	RTE_RIB_MALLOC,
+	RTE_RIB_MEMPOOL,
+	RTE_RIB_ALLOC_MAX
+};
+
+typedef int (*rte_rib_modify_fn_t)(struct rte_rib *rib, uint32_t key,
+	uint8_t depth, uint64_t next_hop, enum rte_rib_op op);
+typedef int (*rte_rib_tree_lookup_fn_t)(void *fib, const uint32_t *ips,
+	uint64_t *next_hops, const unsigned int n);
+typedef struct rte_rib_node *(*rte_rib_alloc_node_fn_t)(struct rte_rib *rib);
+typedef void (*rte_rib_free_node_fn_t)(struct rte_rib *rib,
+	struct rte_rib_node *node);
+
+struct rte_rib {
+	char name[RTE_RIB_NAMESIZE];
+	/*pointer to rib trie*/
+	struct rte_rib_node	*trie;
+	/*pointer to dataplane struct*/
+	void	*fib;
+	/*prefix modification*/
+	rte_rib_modify_fn_t	modify;
+	/* Bulk lookup fn*/
+	rte_rib_tree_lookup_fn_t	lookup;
+	/*alloc trie element*/
+	rte_rib_alloc_node_fn_t	alloc_node;
+	/*free trie element*/
+	rte_rib_free_node_fn_t	free_node;
+	struct rte_mempool	*node_pool;
+	uint32_t		cur_nodes;
+	uint32_t		cur_routes;
+	int			max_nodes;
+	int			node_sz;
+	enum rte_rib_type	type;
+	enum rte_rib_alloc_type	alloc_type;
+};
+
+/** RIB configuration structure */
+struct rte_rib_conf {
+	enum rte_rib_type	type;
+	enum rte_rib_alloc_type	alloc_type;
+	int	max_nodes;
+	size_t	node_sz;
+	uint64_t def_nh;
+};
+
+/**
+ * Lookup an IP into the RIB structure
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  IP to be looked up in the RIB
+ * @return
+ *  pointer to struct rte_rib_node on success,
+ *  NULL otherwise
+ */
+struct rte_rib_node *
+rte_rib_tree_lookup(struct rte_rib *rib, uint32_t key);
+
+/**
+ * Lookup less specific route into the RIB structure
+ *
+ * @param ent
+ *  Pointer to struct rte_rib_node that represents target route
+ * @return
+ *  pointer to struct rte_rib_node that represents
+ *  less specific route on success,
+ *  NULL otherwise
+ */
+struct rte_rib_node *
+rte_rib_tree_lookup_parent(struct rte_rib_node *ent);
+
+/**
+ * Lookup prefix into the RIB structure
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net to be looked up in the RIB
+ * @param depth
+ *  prefix length
+ * @return
+ *  pointer to struct rte_rib_node on success,
+ *  NULL otherwise
+ */
+struct rte_rib_node *
+rte_rib_tree_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t depth);
+
+/**
+ * Retrieve next more specific prefix from the RIB
+ * that is covered by key/depth supernet
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net address of supernet prefix that covers returned more specific prefixes
+ * @param depth
+ *  supernet prefix length
+ * @param cur
+ *   pointer to the last returned prefix, to get the next prefix, or
+ *   NULL to get the first more specific prefix
+ * @param flag
+ *  - RTE_RIB_GET_NXT_ALL
+ *    get all prefixes from the subtrie
+ *  - RTE_RIB_GET_NXT_COVER
+ *    get only the first more specific prefix, even if it has more
+ *    specific prefixes under it
+ * @return
+ *  pointer to the next more specific prefix,
+ *  or NULL if there are no prefixes left
+ */
+struct rte_rib_node *
+rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key, uint8_t depth,
+	struct rte_rib_node *cur, int flag);
+
+/**
+ * Remove prefix from the RIB
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net to be removed from the RIB
+ * @param depth
+ *  prefix length
+ */
+void
+rte_rib_tree_remove(struct rte_rib *rib, uint32_t key, uint8_t depth);
+
+/**
+ * Insert prefix into the RIB
+ *
+ * @param rib
+ *  RIB object handle
+ * @param key
+ *  net to be inserted to the RIB
+ * @param depth
+ *  prefix length
+ * @return
+ *  pointer to new rte_rib_node on success
+ *  NULL otherwise
+ */
+struct rte_rib_node *
+rte_rib_tree_insert(struct rte_rib *rib, uint32_t key, uint8_t depth);
+
+/**
+ * Create RIB
+ *
+ * @param name
+ *  RIB name
+ * @param socket_id
+ *  NUMA socket ID for RIB table memory allocation
+ * @param conf
+ *  Structure containing the configuration
+ * @return
+ *  Handle to RIB object on success
+ *  NULL otherwise, with rte_errno set to an appropriate value.
+ */
+struct rte_rib *
+rte_rib_create(const char *name, int socket_id, struct rte_rib_conf *conf);
+
+/**
+ * Find an existing RIB object and return a pointer to it.
+ *
+ * @param name
+ *  Name of the rib object as passed to rte_rib_create()
+ * @return
+ *  Pointer to rib object or NULL if object not found with rte_errno
+ *  set appropriately. Possible rte_errno values include:
+ *   - ENOENT - required entry not available to return.
+ */
+struct rte_rib *
+rte_rib_find_existing(const char *name);
+
+/**
+ * Free a RIB object.
+ *
+ * @param rib
+ *   RIB object handle
+ * @return
+ *   None
+ */
+void
+rte_rib_free(struct rte_rib *rib);
+
+/**
+ * Add a rule to the RIB.
+ *
+ * @param rib
+ *   RIB object handle
+ * @param ip
+ *   IP of the rule to be added to the RIB
+ * @param depth
+ *   Depth of the rule to be added to the RIB
+ * @param next_hop
+ *   Next hop of the rule to be added to the RIB
+ * @return
+ *   0 on success, negative value otherwise
+ */
+int
+rte_rib_add(struct rte_rib *rib, uint32_t ip, uint8_t depth, uint64_t next_hop);
+
+/**
+ * Delete a rule from the RIB.
+ *
+ * @param rib
+ *   RIB object handle
+ * @param ip
+ *   IP of the rule to be deleted from the RIB
+ * @param depth
+ *   Depth of the rule to be deleted from the RIB
+ * @return
+ *   0 on success, negative value otherwise
+ */
+int
+rte_rib_delete(struct rte_rib *rib, uint32_t ip, uint8_t depth);
+
+/**
+ * Lookup multiple IP addresses in the FIB. This is implemented as a
+ * macro, so the address of the function should not be used.
+ *
+ * @param rib
+ *   RIB object handle
+ * @param ips
+ *   Array of IPs to be looked up in the FIB
+ * @param next_hops
+ *   Next hop of the most specific rule found for each IP.
+ *   This is an array of eight byte values.
+ *   If the lookup for a given IP fails, the corresponding element
+ *   contains the default value (def_nh given at FIB creation).
+ * @param n
+ *   Number of elements in the ips (and next_hops) arrays to look up.
+ * @return
+ *   -EINVAL for incorrect arguments, otherwise 0
+ */
+#define rte_rib_fib_lookup_bulk(rib, ips, next_hops, n)	\
+	rib->lookup(rib->fib, ips, next_hops, n)
+
+#endif /* _RTE_RIB_H_ */
diff --git a/lib/librte_rib/rte_rib_version.map b/lib/librte_rib/rte_rib_version.map
new file mode 100644
index 0000000..b193d6c
--- /dev/null
+++ b/lib/librte_rib/rte_rib_version.map
@@ -0,0 +1,18 @@
+DPDK_18.05 {
+	global:
+
+	rte_rib_create;
+	rte_rib_free;
+	rte_rib_tree_lookup;
+	rte_rib_tree_lookup_parent;
+	rte_rib_tree_lookup_exact;
+	rte_rib_tree_get_nxt;
+	rte_rib_tree_remove;
+	rte_rib_tree_insert;
+	rte_rib_find_existing;
+	rte_rib_add;
+	rte_rib_delete;
+
+	local: *;
+};
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 3eb41d1..4708aa4 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -70,6 +70,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_GRO)            += -lrte_gro
 _LDLIBS-$(CONFIG_RTE_LIBRTE_GSO)            += -lrte_gso
 _LDLIBS-$(CONFIG_RTE_LIBRTE_METER)          += -lrte_meter
 _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)            += -lrte_lpm
+_LDLIBS-$(CONFIG_RTE_LIBRTE_RIB)            += -lrte_rib
 # librte_acl needs --whole-archive because of weak functions
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)            += --whole-archive
 _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)            += -lrte_acl
-- 
1.9.1


* [PATCH v3 2/2] Add autotests for RIB library
  2018-02-22 22:50 [PATCH v3 0/2] lib/rib: Add Routing Information Base library Medvedkin Vladimir
  2018-02-22 22:50 ` [PATCH v3 1/2] Add RIB library Medvedkin Vladimir
@ 2018-02-22 22:50 ` Medvedkin Vladimir
  1 sibling, 0 replies; 7+ messages in thread
From: Medvedkin Vladimir @ 2018-02-22 22:50 UTC (permalink / raw)
  To: dev; +Cc: thomas, bruce.richardson, Medvedkin Vladimir

Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
---
 test/test/Makefile               |   5 +
 test/test/test_rib.c             | 290 ++++++++++++++++++++++++++++++++++++++
 test/test/test_rib_generate_rt.c | 297 +++++++++++++++++++++++++++++++++++++++
 test/test/test_rib_generate_rt.h |  38 +++++
 test/test/test_rib_lpm_comp.c    | 187 ++++++++++++++++++++++++
 test/test/test_rib_perf.c        | 163 +++++++++++++++++++++
 6 files changed, 980 insertions(+)
 create mode 100644 test/test/test_rib.c
 create mode 100644 test/test/test_rib_generate_rt.c
 create mode 100644 test/test/test_rib_generate_rt.h
 create mode 100644 test/test/test_rib_lpm_comp.c
 create mode 100644 test/test/test_rib_perf.c

diff --git a/test/test/Makefile b/test/test/Makefile
index a88cc38..7644f6d 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -119,6 +119,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm6.c
 SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_lpm6_perf.c
 
+SRCS-$(CONFIG_RTE_LIBRTE_RIB) += test_rib.c
+SRCS-$(CONFIG_RTE_LIBRTE_RIB) += test_rib_generate_rt.c
+SRCS-$(CONFIG_RTE_LIBRTE_RIB) += test_rib_perf.c
+SRCS-$(CONFIG_RTE_LIBRTE_LPM) += test_rib_lpm_comp.c
+
 SRCS-y += test_debug.c
 SRCS-y += test_errno.c
 SRCS-y += test_tailq.c
diff --git a/test/test/test_rib.c b/test/test/test_rib.c
new file mode 100644
index 0000000..197d412
--- /dev/null
+++ b/test/test/test_rib.c
@@ -0,0 +1,290 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_ip.h>
+#include <rte_rib.h>
+
+#include "test.h"
+#include "test_xmmt_ops.h"
+#include <rte_dir24_8.h>
+
+
+#define TEST_RIB_ASSERT(cond) do {				\
+	if (!(cond)) {						\
+		printf("Error at line %d:\n", __LINE__);	\
+		return -1;					\
+	}							\
+} while (0)
+
+typedef int32_t (*rte_rib_test)(void);
+
+static int32_t test0(void);
+static int32_t test1(void);
+static int32_t test2(void);
+static int32_t test3(void);
+static int32_t test4(void);
+static int32_t test5(void);
+
+static rte_rib_test tests[] = {
+/* Test Cases */
+	test0,
+	test1,
+	test2,
+	test3,
+	test4,
+	test5
+};
+
+#define NUM_RIB_TESTS (sizeof(tests)/sizeof(tests[0]))
+#define MAX_DEPTH 32
+#define MAX_RULES (1 << 22)
+#define NUMBER_TBL8S 4096
+#define PASS 0
+
+/*
+ * Check that rte_rib_create fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test0(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+
+	config.type = RTE_RIB_DIR24_8_1B;
+	config.alloc_type = RTE_RIB_MALLOC;
+	config.max_nodes = MAX_RULES;
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	/* rte_rib_create: rib name == NULL */
+	rib = rte_rib_create(NULL, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+
+	/* rte_rib_create: config == NULL */
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, NULL);
+	TEST_RIB_ASSERT(rib == NULL);
+
+	/* socket_id < -1 is invalid */
+	rib = rte_rib_create(__func__, -2, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+
+	/* rte_rib_create: max_nodes = 0 */
+	config.max_nodes = 0;
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+	config.max_nodes = MAX_RULES;
+
+	/* rte_rib_create: node_sz = 0 */
+	config.node_sz = 0;
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	/* rte_rib_create: invalid alloc type */
+	config.alloc_type = RTE_RIB_ALLOC_MAX;
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+	config.alloc_type = RTE_RIB_MALLOC;
+
+	/* rte_rib_create: invalid type */
+	config.type = RTE_RIB_TYPE_MAX;
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib == NULL);
+
+	return PASS;
+}
+
+/*
+ * Create a rib table, then free it, twice for every rib type and
+ * node alloc type.
+ * Use a slightly different max_nodes value each time.
+ */
+int32_t
+test1(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+	int32_t i, j, k;
+
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	for (k = RTE_RIB_MALLOC; k < RTE_RIB_ALLOC_MAX; k++) {
+		config.alloc_type = k;
+		for (j = RTE_RIB_DIR24_8_1B; j < RTE_RIB_TYPE_MAX; j++) {
+			config.type = j;
+			/* vary max_nodes slightly on each iteration */
+			for (i = 0; i < 2; i++) {
+				config.max_nodes = MAX_RULES - i;
+				rib = rte_rib_create(__func__, SOCKET_ID_ANY,
+					&config);
+				TEST_RIB_ASSERT(rib != NULL);
+				rte_rib_free(rib);
+			}
+		}
+	}
+	/* rte_rib_free() returns nothing, so there is no status to check */
+	return PASS;
+}
+
+/*
+ * Call rte_rib_free for a NULL pointer. Note: free has no return
+ * value, so failure cannot be checked directly; this test exists to
+ * increase function coverage and to validate that freeing NULL does
+ * not crash.
+ */
+int32_t
+test2(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+
+	config.type = RTE_RIB_DIR24_8_1B;
+	config.alloc_type = RTE_RIB_MALLOC;
+	config.max_nodes = MAX_RULES;
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	rte_rib_free(rib);
+	rte_rib_free(NULL);
+	return PASS;
+}
+
+/*
+ * Check that rte_rib_add fails gracefully for incorrect user input arguments
+ */
+int32_t
+test3(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+	uint32_t ip = IPv4(0, 0, 0, 0);
+	uint64_t next_hop = 100;
+	uint8_t depth = 24;
+	int32_t status = 0;
+
+	config.type = RTE_RIB_DIR24_8_1B;
+	config.alloc_type = RTE_RIB_MALLOC;
+	config.max_nodes = MAX_RULES;
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	/* rte_rib_add: rib == NULL */
+	status = rte_rib_add(NULL, ip, depth, next_hop);
+	TEST_RIB_ASSERT(status < 0);
+
+	/*Create valid rib to use in rest of test. */
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	/* rte_rib_add: depth > MAX_DEPTH */
+	status = rte_rib_add(rib, ip, (MAX_DEPTH + 1), next_hop);
+	TEST_RIB_ASSERT(status < 0);
+
+	rte_rib_free(rib);
+
+	return PASS;
+}
+
+/*
+ * Check that rte_rib_delete fails gracefully for incorrect user input
+ * arguments
+ */
+int32_t
+test4(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+	uint32_t ip = IPv4(0, 0, 0, 0);
+	uint8_t depth = 24;
+	int32_t status = 0;
+
+	config.type = RTE_RIB_DIR24_8_1B;
+	config.alloc_type = RTE_RIB_MALLOC;
+	config.max_nodes = MAX_RULES;
+	config.node_sz = sizeof(struct rte_rib_node);
+
+	/* rte_rib_delete: rib == NULL */
+	status = rte_rib_delete(NULL, ip, depth);
+	TEST_RIB_ASSERT(status < 0);
+
+	/*Create valid rib to use in rest of test. */
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	/* rte_rib_delete: depth > MAX_DEPTH */
+	status = rte_rib_delete(rib, ip, (MAX_DEPTH + 1));
+	TEST_RIB_ASSERT(status < 0);
+
+	rte_rib_free(rib);
+
+	return PASS;
+}
+
+/*
+ * Call add, lookup and delete for a single rule with depth <= 24
+ */
+int32_t
+test5(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf config;
+
+	uint32_t ip = IPv4(190, 2, 0, 0);
+	uint64_t next_hop_add = 10;
+	uint64_t next_hop_return = 20;
+	uint64_t next_hop_default = 14;
+	uint8_t depth = 24;
+	int32_t status = 0;
+
+	config.type = RTE_RIB_DIR24_8_1B;
+	config.alloc_type = RTE_RIB_MALLOC;
+	config.max_nodes = MAX_RULES;
+	config.node_sz = sizeof(struct rte_rib_node);
+	config.def_nh = next_hop_default;
+
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	status = rte_rib_add(rib, ip, depth, next_hop_add);
+	TEST_RIB_ASSERT(status == 0);
+
+	status = rte_rib_fib_lookup_bulk(rib, &ip, &next_hop_return, 1);
+	TEST_RIB_ASSERT((status == 0) && (next_hop_return == next_hop_add));
+
+	status = rte_rib_delete(rib, ip, depth);
+	TEST_RIB_ASSERT(status == 0);
+	status = rte_rib_fib_lookup_bulk(rib, &ip, &next_hop_return, 1);
+	TEST_RIB_ASSERT(next_hop_return == next_hop_default);
+
+	rte_rib_free(rib);
+
+	return PASS;
+}
+
+/*
+ * Do all unit tests.
+ */
+static int
+test_rib(void)
+{
+	unsigned int i;
+	int status, global_status = 0;
+
+	for (i = 0; i < NUM_RIB_TESTS; i++) {
+		status = tests[i]();
+		if (status < 0) {
+			printf("ERROR: RIB Test %u: FAIL\n", i);
+			global_status = status;
+		}
+	}
+
+	return global_status;
+}
+
+REGISTER_TEST_COMMAND(rib_autotest, test_rib);
diff --git a/test/test/test_rib_generate_rt.c b/test/test/test_rib_generate_rt.c
new file mode 100644
index 0000000..36834ed
--- /dev/null
+++ b/test/test/test_rib_generate_rt.c
@@ -0,0 +1,297 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <math.h>
+
+#include <rte_random.h>
+#include <rte_cycles.h>
+#include <rte_ip.h>
+
+#include "test_rib_generate_rt.h"
+
+static uint32_t max_route_entries;
+static uint32_t num_route_entries;
+
+/* The following per-depth rule counts for each common IP class were
+ * taken from the previous large constant table in
+ * app/test/test_rib_routes.h. To give comparable performance, they
+ * keep the same depth and IP address coverage as that table. The
+ * numbers do not include any private local IP addresses. As the
+ * table was dumped from a real router, there are no rules in class
+ * D or E.
+ */
+static struct route_rule_count rule_count = {
+	.a = { /* IP class A in which the most significant bit is 0 */
+		    0, /* depth =  1 */
+		    0, /* depth =  2 */
+		    1, /* depth =  3 */
+		    0, /* depth =  4 */
+		    2, /* depth =  5 */
+		    1, /* depth =  6 */
+		    3, /* depth =  7 */
+		  185, /* depth =  8 */
+		   26, /* depth =  9 */
+		   16, /* depth = 10 */
+		   39, /* depth = 11 */
+		  144, /* depth = 12 */
+		  233, /* depth = 13 */
+		  528, /* depth = 14 */
+		  866, /* depth = 15 */
+		 3856, /* depth = 16 */
+		 3268, /* depth = 17 */
+		 5662, /* depth = 18 */
+		17301, /* depth = 19 */
+		22226, /* depth = 20 */
+		11147, /* depth = 21 */
+		16746, /* depth = 22 */
+		17120, /* depth = 23 */
+		77578, /* depth = 24 */
+		  401, /* depth = 25 */
+		  656, /* depth = 26 */
+		 1107, /* depth = 27 */
+		 1121, /* depth = 28 */
+		 2316, /* depth = 29 */
+		  717, /* depth = 30 */
+		   10, /* depth = 31 */
+		   66  /* depth = 32 */
+	},
+	.b = { /* IP class B in which the 2 most significant bits are 10 */
+		    0, /* depth =  1 */
+		    0, /* depth =  2 */
+		    0, /* depth =  3 */
+		    0, /* depth =  4 */
+		    1, /* depth =  5 */
+		    1, /* depth =  6 */
+		    1, /* depth =  7 */
+		    3, /* depth =  8 */
+		    3, /* depth =  9 */
+		   30, /* depth = 10 */
+		   25, /* depth = 11 */
+		  168, /* depth = 12 */
+		  305, /* depth = 13 */
+		  569, /* depth = 14 */
+		 1129, /* depth = 15 */
+		50800, /* depth = 16 */
+		 1645, /* depth = 17 */
+		 1820, /* depth = 18 */
+		 3506, /* depth = 19 */
+		 3258, /* depth = 20 */
+		 3424, /* depth = 21 */
+		 4971, /* depth = 22 */
+		 6885, /* depth = 23 */
+		39771, /* depth = 24 */
+		  424, /* depth = 25 */
+		  170, /* depth = 26 */
+		  443, /* depth = 27 */
+		   92, /* depth = 28 */
+		  366, /* depth = 29 */
+		  377, /* depth = 30 */
+		    2, /* depth = 31 */
+		  200  /* depth = 32 */
+	},
+	.c = { /* IP class C in which the 3 most significant bits are 110 */
+		     0, /* depth =  1 */
+		     0, /* depth =  2 */
+		     0, /* depth =  3 */
+		     0, /* depth =  4 */
+		     0, /* depth =  5 */
+		     0, /* depth =  6 */
+		     0, /* depth =  7 */
+		    12, /* depth =  8 */
+		     8, /* depth =  9 */
+		     9, /* depth = 10 */
+		    33, /* depth = 11 */
+		    69, /* depth = 12 */
+		   237, /* depth = 13 */
+		  1007, /* depth = 14 */
+		  1717, /* depth = 15 */
+		 14663, /* depth = 16 */
+		  8070, /* depth = 17 */
+		 16185, /* depth = 18 */
+		 48261, /* depth = 19 */
+		 36870, /* depth = 20 */
+		 33960, /* depth = 21 */
+		 50638, /* depth = 22 */
+		 61422, /* depth = 23 */
+		466549, /* depth = 24 */
+		  1829, /* depth = 25 */
+		  4824, /* depth = 26 */
+		  4927, /* depth = 27 */
+		  5914, /* depth = 28 */
+		 10254, /* depth = 29 */
+		  4905, /* depth = 30 */
+		     1, /* depth = 31 */
+		   716  /* depth = 32 */
+	}
+};
+
+static void generate_random_rule_prefix(struct route_rule *rt,
+	uint32_t ip_class, uint8_t depth)
+{
+/* IP address class A, the most significant bit is 0 */
+#define IP_HEAD_MASK_A			0x00000000
+#define IP_HEAD_BIT_NUM_A		1
+
+/* IP address class B, the most significant 2 bits are 10 */
+#define IP_HEAD_MASK_B			0x80000000
+#define IP_HEAD_BIT_NUM_B		2
+
+/* IP address class C, the most significant 3 bits are 110 */
+#define IP_HEAD_MASK_C			0xC0000000
+#define IP_HEAD_BIT_NUM_C		3
+
+	uint32_t class_depth;
+	uint32_t range;
+	uint32_t mask;
+	uint32_t step;
+	uint32_t start;
+	uint32_t fixed_bit_num;
+	uint32_t ip_head_mask;
+	uint32_t rule_num;
+	uint32_t k;
+	struct route_rule *ptr_rule;
+
+	if (ip_class == IP_CLASS_A) {        /* IP Address class A */
+		fixed_bit_num = IP_HEAD_BIT_NUM_A;
+		ip_head_mask = IP_HEAD_MASK_A;
+		rule_num = rule_count.a[depth - 1];
+	} else if (ip_class == IP_CLASS_B) { /* IP Address class B */
+		fixed_bit_num = IP_HEAD_BIT_NUM_B;
+		ip_head_mask = IP_HEAD_MASK_B;
+		rule_num = rule_count.b[depth - 1];
+	} else {                             /* IP Address class C */
+		fixed_bit_num = IP_HEAD_BIT_NUM_C;
+		ip_head_mask = IP_HEAD_MASK_C;
+		rule_num = rule_count.c[depth - 1];
+	}
+
+	if ((rule_num == 0) || ((num_route_entries + rule_num) >=
+		max_route_entries))
+		return;
+
+	/* class_depth is the number of remaining bits, i.e. the depth
+	 * minus the most significant fixed bits of this IP address class
+	 */
+	class_depth = depth - fixed_bit_num;
+
+	/* range is the maximum number of rules for this depth and
+	 * this IP address class
+	 */
+	range = 1 << class_depth;
+
+	/* the mask covers only the generated bits, excluding the fixed
+	 * most significant bits of the IP address class
+	 */
+	mask = range - 1;
+
+	/* Widen coverage of IP address in generated rules */
+	if (range <= rule_num)
+		step = 1;
+	else
+		step = round((double)range / rule_num);
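+	/* e.g. for class A and depth = 10: class_depth = 9, range = 512
+	 * and rule_num = 16 (from rule_count above), giving step = 32, so
+	 * the generated prefixes are spread evenly across the class
+	 */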
+
+	/* Generate only the variable bits; the fixed most significant
+	 * bits of the IP address class are ORed in below
+	 */
+	start = rte_rand() & mask;
+	ptr_rule = &rt[num_route_entries];
+	for (k = 0; k < rule_num; k++) {
+		ptr_rule->ip = (start << (RTE_RIB_MAXDEPTH - depth))
+			| ip_head_mask;
+		ptr_rule->depth = depth;
+		ptr_rule++;
+		start = (start + step) & mask;
+	}
+	num_route_entries += rule_num;
+}
+
+static void insert_rule_in_random_pos(struct route_rule *rt,
+	uint32_t ip, uint8_t depth)
+{
+	uint32_t pos;
+	int try_count = 0;
+	struct route_rule tmp;
+
+	do {
+		pos = rte_rand();
+		try_count++;
+	} while ((try_count < 10) && (pos > num_route_entries));
+
+	if ((pos > num_route_entries) || (pos > max_route_entries))
+		pos = num_route_entries >> 1;
+
+	tmp = rt[pos];
+	rt[pos].ip = ip;
+	rt[pos].depth = depth;
+	if (num_route_entries < max_route_entries)
+		rt[num_route_entries++] = tmp;
+}
+
+uint32_t
+generate_large_route_rule_table(uint32_t num_routes, struct route_rule *rt)
+{
+	uint32_t ip_class;
+	uint8_t  depth;
+
+	rte_srand(rte_rdtsc());
+	num_route_entries = 0;
+	max_route_entries = num_routes;
+	for (ip_class = IP_CLASS_A; ip_class <= IP_CLASS_C; ip_class++) {
+		for (depth = 1; depth <= RTE_RIB_MAXDEPTH; depth++)
+			generate_random_rule_prefix(rt, ip_class, depth);
+	}
+	/* Add the following rules to match the previous large constant
+	 * table: 4 rules with private local IP addresses and 1 all-zeros
+	 * prefix with depth = 8.
+	 */
+	insert_rule_in_random_pos(rt, IPv4(0, 0, 0, 0), 8);
+	insert_rule_in_random_pos(rt, IPv4(10, 2, 23, 147), 32);
+	insert_rule_in_random_pos(rt, IPv4(192, 168, 100, 10), 24);
+	insert_rule_in_random_pos(rt, IPv4(192, 168, 25, 100), 24);
+	insert_rule_in_random_pos(rt, IPv4(192, 168, 129, 124), 32);
+
+	return num_route_entries;
+}
+
+void
+print_route_distribution(const struct route_rule *table, uint32_t n)
+{
+	unsigned int i, j;
+
+	printf("Route distribution per prefix width:\n");
+	printf("DEPTH    QUANTITY (PERCENT)\n");
+	printf("---------------------------\n");
+
+	/* Count depths. */
+	for (i = 1; i <= 32; i++) {
+		unsigned int depth_counter = 0;
+		double percent_hits;
+
+		for (j = 0; j < n; j++)
+			if (table[j].depth == (uint8_t) i)
+				depth_counter++;
+
+		percent_hits = ((double)depth_counter)/((double)n) * 100;
+		printf("%.2u%15u (%.2f)\n", i, depth_counter, percent_hits);
+	}
+	printf("\n");
+}
+
+void
+shuffle_rt(struct route_rule *rt, uint32_t n)
+{
+	uint32_t pos;
+	struct route_rule tmp;
+	uint32_t i;
+
+	for (i = 0; i < n; i++) {
+		pos = rte_rand() % n;
+		tmp = rt[pos];
+		rt[pos] = rt[i];
+		rt[i] = tmp;
+	}
+}
diff --git a/test/test/test_rib_generate_rt.h b/test/test/test_rib_generate_rt.h
new file mode 100644
index 0000000..90573c7
--- /dev/null
+++ b/test/test/test_rib_generate_rt.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#ifndef _TEST_RIB_GENERATE_RT_H_
+#define _TEST_RIB_GENERATE_RT_H_
+
+#define RTE_RIB_MAXDEPTH	32
+
+struct route_rule {
+	uint64_t nh;
+	uint32_t ip;
+	uint8_t depth;
+};
+
+enum {
+	IP_CLASS_A,
+	IP_CLASS_B,
+	IP_CLASS_C
+};
+
+/* struct route_rule_count defines the total number of rules in a/b/c;
+ * each item in a[]/b[]/c[] is the number of rules of that depth in common
+ * IP address class A/B/C, not including the ones for private local networks.
+ */
+struct route_rule_count {
+	uint32_t a[RTE_RIB_MAXDEPTH];
+	uint32_t b[RTE_RIB_MAXDEPTH];
+	uint32_t c[RTE_RIB_MAXDEPTH];
+};
+
+
+uint32_t generate_large_route_rule_table(uint32_t num_routes,
+	struct route_rule *rt);
+void print_route_distribution(const struct route_rule *table, uint32_t n);
+void shuffle_rt(struct route_rule *rt, uint32_t n);
+
+#endif /* _TEST_RIB_GENERATE_RT_H_ */
diff --git a/test/test/test_rib_lpm_comp.c b/test/test/test_rib_lpm_comp.c
new file mode 100644
index 0000000..7a3eea9
--- /dev/null
+++ b/test/test/test_rib_lpm_comp.c
@@ -0,0 +1,187 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_random.h>
+#include <rte_cycles.h>
+#include <rte_branch_prediction.h>
+#include <rte_ip.h>
+#include <rte_malloc.h>
+#include <rte_lpm.h>
+#include <rte_rib.h>
+
+#include "test.h"
+#include "test_xmmt_ops.h"
+#include "test_rib_generate_rt.h"
+
+#define TEST_RIB_ASSERT(cond) do {				\
+	if (!(cond)) {						\
+		printf("Error at line %d:\n", __LINE__);	\
+		return -1;					\
+	}							\
+} while (0)
+
+#define ITERATIONS (1 << 15)
+#define BATCH_SIZE (1 << 7)
+#define BULK_SIZE 32
+#define LPM_NH_MASK	((1 << 24) - 1)
+
+static const uint64_t default_nh;
+
+static int
+test_lookup(struct rte_rib *rib, struct rte_lpm *lpm)
+{
+	static uint32_t ip_batch[BATCH_SIZE];
+	uint64_t rib_next_hops[BULK_SIZE];
+	uint32_t lpm_next_hops[BULK_SIZE];
+	int i, j, k;
+
+	for (i = 0; i < ITERATIONS; i++) {
+		for (j = 0; j < BATCH_SIZE; j++)
+			ip_batch[j] = rte_rand();
+
+		for (j = 0; j < BATCH_SIZE; j += BULK_SIZE) {
+			rte_rib_fib_lookup_bulk(rib, &ip_batch[j],
+				rib_next_hops, BULK_SIZE);
+			rte_lpm_lookup_bulk(lpm, &ip_batch[j],
+				lpm_next_hops, BULK_SIZE);
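+			/* rte_lpm flags hits with RTE_LPM_LOOKUP_SUCCESS and
+			 * the RIB returns def_nh on a miss, so normalize LPM
+			 * misses to default_nh before comparing the results
+			 */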
+			for (k = 0; k < BULK_SIZE; k++) {
+				if (likely(lpm_next_hops[k] &
+					RTE_LPM_LOOKUP_SUCCESS))
+					lpm_next_hops[k] &= LPM_NH_MASK;
+				else
+					lpm_next_hops[k] = default_nh;
+			}
+			for (k = 0; k < BULK_SIZE; k++)
+				TEST_RIB_ASSERT(rib_next_hops[k] ==
+						lpm_next_hops[k]);
+		}
+	}
+	return 0;
+}
+
+static int
+test_rib_lpm_comp(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_lpm *lpm = NULL;
+	struct route_rule *rt = NULL;
+	unsigned int i;
+	int rib_add = 0, lpm_add = 0;
+	int ret, nh_bits, nr_tbl8;
+	uint32_t num_routes;
+	struct rte_rib_conf conf;
+	struct rte_lpm_config config;
+
+	rte_srand(rte_rdtsc());
+
+	conf.max_nodes = 3000000;
+	conf.node_sz = sizeof(struct rte_rib_node);
+	conf.type = RTE_RIB_DIR24_8_8B;
+	conf.alloc_type = RTE_RIB_MEMPOOL;
+	conf.def_nh = default_nh;
+
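+	/* usable next-hop bits for the chosen dir24_8 entry size
+	 * (8 << type bits per entry, minus what looks like 2 reserved
+	 * bits), capped at 24 to fit rte_lpm's 24-bit next hop
+	 */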
+	nh_bits = RTE_MIN(((1 << (3 + conf.type)) - 2), 24);
+	nr_tbl8 = RTE_MIN((1 << nh_bits), 65536);
+	config.number_tbl8s = nr_tbl8;
+	config.max_rules = 2000000;
+	config.flags = 0;
+
+	num_routes = 1200000;
+
+	rt = rte_zmalloc("struct route_rule *", sizeof(struct route_rule) *
+		(num_routes + 5), 0);
+	TEST_RIB_ASSERT(rt != NULL);
+
+	num_routes = generate_large_route_rule_table(num_routes, rt);
+	TEST_RIB_ASSERT(num_routes != 0);
+	printf("No. routes = %u\n", (unsigned int) num_routes);
+
+	shuffle_rt(rt, num_routes);
+
+	print_route_distribution(rt, (uint32_t) num_routes);
+
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &conf);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	lpm = rte_lpm_create(__func__, SOCKET_ID_ANY, &config);
+	TEST_RIB_ASSERT(lpm != NULL);
+
+	for (i = 0; i < num_routes; i++)
+		rt[i].nh = rte_rand() & ((1ULL << nh_bits) - 1);
+
+	for (i = 0; i < num_routes; i++) {
+		ret = rte_rib_add(rib, rt[i].ip, rt[i].depth, rt[i].nh);
+		if (ret == 0)
+			rib_add++;
+		else
+			continue;
+
+		ret = rte_lpm_add(lpm, rt[i].ip, rt[i].depth, rt[i].nh);
+		if (ret == 0)
+			lpm_add++;
+		else {
+			rte_rib_delete(rib, rt[i].ip, rt[i].depth);
+			rib_add--;
+		}
+	}
+	TEST_RIB_ASSERT(rib_add == lpm_add);
+
+	ret = test_lookup(rib, lpm);
+	if (ret != 0)
+		return ret;
+
+	for (i = 0; i < num_routes; i++) {
+		if ((i % 3) == 0) {
+			ret = rte_rib_delete(rib, rt[i].ip, rt[i].depth);
+			if (ret == 0)
+				rib_add--;
+			else
+				continue;
+
+			ret = rte_lpm_delete(lpm, rt[i].ip, rt[i].depth);
+			if (ret == 0)
+				lpm_add--;
+		}
+	}
+	TEST_RIB_ASSERT(rib_add == lpm_add);
+
+	ret = test_lookup(rib, lpm);
+	if (ret != 0)
+		return ret;
+
+	for (i = 0; i < num_routes; i++) {
+		if ((i % 6) == 0) {
+			ret = rte_rib_add(rib, rt[i].ip, rt[i].depth, rt[i].nh);
+			if (ret == 0)
+				rib_add++;
+			else
+				continue;
+
+			ret = rte_lpm_add(lpm, rt[i].ip, rt[i].depth, rt[i].nh);
+			if (ret == 0)
+				lpm_add++;
+			else {
+				rte_rib_delete(rib, rt[i].ip, rt[i].depth);
+				rib_add--;
+			}
+		}
+	}
+	TEST_RIB_ASSERT(rib_add == lpm_add);
+
+	ret = test_lookup(rib, lpm);
+	if (ret != 0)
+		return ret;
+
+	rte_rib_free(rib);
+	rte_lpm_free(lpm);
+	rte_free(rt);
+
+	return 0;
+}
+
+REGISTER_TEST_COMMAND(rib_lpm_comp_autotest, test_rib_lpm_comp);
diff --git a/test/test/test_rib_perf.c b/test/test/test_rib_perf.c
new file mode 100644
index 0000000..f98cf06
--- /dev/null
+++ b/test/test/test_rib_perf.c
@@ -0,0 +1,163 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
+ */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#include <rte_cycles.h>
+#include <rte_random.h>
+#include <rte_branch_prediction.h>
+#include <rte_ip.h>
+#include <rte_malloc.h>
+#include <rte_rib.h>
+#include <rte_dir24_8.h>
+
+#include "test.h"
+#include "test_xmmt_ops.h"
+#include "test_rib_generate_rt.h"
+
+#define TEST_RIB_ASSERT(cond) do {				\
+	if (!(cond)) {						\
+		printf("Error at line %d:\n", __LINE__);	\
+		return -1;					\
+	}							\
+} while (0)
+
+#define ITERATIONS (1 << 15)
+#define BATCH_SIZE (1 << 12)
+#define BULK_SIZE 32
+
+static int
+test_rib_perf(void)
+{
+	struct rte_rib *rib = NULL;
+	struct rte_rib_conf conf;
+	struct route_rule *rt;
+	uint64_t begin, total_time;
+	uint64_t next_hop_add;
+	uint64_t default_nh = 0;
+	int64_t count = 0;
+	unsigned int i, j;
+	int status = 0;
+	int ret;
+	uint32_t num_routes;
+
+	conf.max_nodes = 3000000;
+	conf.node_sz = sizeof(struct rte_rib_node);
+	conf.type = RTE_RIB_DIR24_8_4B;
+	conf.alloc_type = RTE_RIB_MEMPOOL;
+	conf.def_nh = default_nh;
+	rte_srand(rte_rdtsc());
+
+	num_routes = 1200000;
+
+	rt = rte_zmalloc("struct route_rule *", sizeof(struct route_rule) *
+		num_routes, 0);
+	TEST_RIB_ASSERT(rt != NULL);
+
+	num_routes = generate_large_route_rule_table(num_routes, rt);
+	TEST_RIB_ASSERT(num_routes != 0);
+
+	printf("No. routes = %u\n", (unsigned int) num_routes);
+
+	shuffle_rt(rt, num_routes);
+
+	print_route_distribution(rt, (uint32_t) num_routes);
+
+	rib = rte_rib_create(__func__, SOCKET_ID_ANY, &conf);
+	TEST_RIB_ASSERT(rib != NULL);
+
+	/* Measure add. */
+	begin = rte_rdtsc();
+
+	for (i = 0; i < num_routes; i++) {
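+		/* pick a random next hop, skipping the default value so
+		 * that lookup misses remain distinguishable from hits
+		 */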
+		do {
+			next_hop_add = rte_rand() & ((1ULL << 30) - 1);
+		} while (next_hop_add == default_nh);
+
+		ret = rte_rib_add(rib, rt[i].ip, rt[i].depth, next_hop_add);
+		if (ret == 0)
+			status++;
+	}
+
+	total_time = rte_rdtsc() - begin;
+
+	printf("Unique added entries = %d\n", status);
+	printf("Average RIB Add: %g cycles\n",
+			(double)total_time / num_routes);
+
+	/* Measure Lookup */
+	total_time = 0;
+	count = 0;
+	for (i = 0; i < ITERATIONS; i++) {
+		static uint32_t ip_batch[BATCH_SIZE];
+		uint64_t nh_64 = 0;
+		/* Create array of random IP addresses */
+		for (j = 0; j < BATCH_SIZE; j++)
+			ip_batch[j] = rte_rand();
+
+		/* Lookup per batch */
+		begin = rte_rdtsc();
+		for (j = 0; j < BATCH_SIZE; j++) {
+			ret = rte_dir24_8_lookup(rib->fib, ip_batch[j], &nh_64);
+			if (unlikely(nh_64 == default_nh))
+				count++;
+		}
+		total_time += rte_rdtsc() - begin;
+	}
+
+	printf("RIB Lookup: %.1f cycles (fails = %.1f%%)\n",
+			(double)total_time / ((double)ITERATIONS * BATCH_SIZE),
+			(count * 100.0) / (double)(ITERATIONS * BATCH_SIZE));
+
+	/* Measure bulk Lookup */
+	total_time = 0;
+	count = 0;
+	for (i = 0; i < ITERATIONS; i++) {
+		static uint32_t ip_batch[BATCH_SIZE];
+		uint64_t next_hops[BULK_SIZE];
+
+		/* Create array of random IP addresses */
+		for (j = 0; j < BATCH_SIZE; j++)
+			ip_batch[j] = rte_rand();
+
+		/* Lookup per batch */
+		begin = rte_rdtsc();
+		for (j = 0; j < BATCH_SIZE; j += BULK_SIZE)
+			rte_rib_fib_lookup_bulk(rib, &ip_batch[j], next_hops,
+				BULK_SIZE);
+
+		total_time += rte_rdtsc() - begin;
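+		/* next_hops holds only the last bulk of the batch, so the
+		 * failure rate is sampled from that chunk
+		 */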
+		for (j = 0; j < BULK_SIZE; j++) {
+			if (next_hops[j] == default_nh)
+				count++;
+		}
+	}
+	printf("BULK RIB Lookup: %.1f cycles (fails = %.1f%%)\n",
+			(double)total_time / ((double)ITERATIONS * BATCH_SIZE),
+			(count * 100.0) / (double)(ITERATIONS * BATCH_SIZE));
+
+	/* Delete */
+	status = 0;
+	begin = rte_rdtsc();
+
+	for (i = 0; i < num_routes; i++) {
+		ret = rte_rib_delete(rib, rt[i].ip, rt[i].depth);
+		if (ret == 0)
+			status++;
+	}
+
+	total_time = rte_rdtsc() - begin;
+
+	printf("Average RIB Delete: %g cycles\n",
+			(double)total_time / num_routes);
+
+	rte_rib_free(rib);
+	rte_free(rt);
+
+	return 0;
+}
+
+REGISTER_TEST_COMMAND(rib_perf_autotest, test_rib_perf);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] Add RIB library
  2018-02-22 22:50 ` [PATCH v3 1/2] Add RIB library Medvedkin Vladimir
@ 2018-03-14 11:58   ` Bruce Richardson
  2018-03-15 14:27   ` Bruce Richardson
  1 sibling, 0 replies; 7+ messages in thread
From: Bruce Richardson @ 2018-03-14 11:58 UTC (permalink / raw)
  To: Medvedkin Vladimir; +Cc: dev, thomas

On Thu, Feb 22, 2018 at 10:50:55PM +0000, Medvedkin Vladimir wrote:
> RIB is an alternative to current LPM library.
> It solves the following problems
>  - Increases the speed of control plane operations against lpm such as
>    adding/deleting routes
>  - Adds abstraction from dataplane algorithms, so it is possible to add
>    different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc
>    in addition to current dir24_8
>  - It is possible to keep user defined application specific additional
>    information in struct rte_rib_node which represents route entry.
>    It can be next hop/set of next hops (i.e. active and feasible),
>    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
>    plenty of additional control plane information.
>  - For dir24_8 implementation it is possible to remove
>    rte_lpm_tbl_entry.depth field that helps to save 6 bits.
>  - Also new dir24_8 implementation supports different next_hop sizes
>    (1/2/4/8 bytes per next hop)
>  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate
>    ternary operator.
>    Instead it returns special default value if there is no route.
> 
> Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
> ---
>  config/common_base                 |   6 +
>  doc/api/doxy-api.conf              |   1 +
>  lib/Makefile                       |   2 +
>  lib/librte_rib/Makefile            |  23 ++
>  lib/librte_rib/rte_dir24_8.c       | 482 +++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_dir24_8.h       | 115 ++++++++
>  lib/librte_rib/rte_rib.c           | 526 +++++++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
>  lib/librte_rib/rte_rib_version.map |  18 ++
>  mk/rte.app.mk                      |   1 +
>  10 files changed, 1496 insertions(+)
>  create mode 100644 lib/librte_rib/Makefile
>  create mode 100644 lib/librte_rib/rte_dir24_8.c
>  create mode 100644 lib/librte_rib/rte_dir24_8.h
>  create mode 100644 lib/librte_rib/rte_rib.c
>  create mode 100644 lib/librte_rib/rte_rib.h
>  create mode 100644 lib/librte_rib/rte_rib_version.map
> 
Sorry, I didn't see there was a V3, so I made comments on V2. Hopefully
the comments all still apply.
For future versions, please include a diff log below the cut-line so that
we can see what changes between each version.
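
For example, something like this below the "---" (contents illustrative):

    ---
    v3: fixed review comments on naming, reworked tests
    v2: split the autotests into a separate patch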

Thanks,
/Bruce

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] Add RIB library
  2018-02-22 22:50 ` [PATCH v3 1/2] Add RIB library Medvedkin Vladimir
  2018-03-14 11:58   ` Bruce Richardson
@ 2018-03-15 14:27   ` Bruce Richardson
  2018-03-25 18:35     ` Vladimir Medvedkin
  1 sibling, 1 reply; 7+ messages in thread
From: Bruce Richardson @ 2018-03-15 14:27 UTC (permalink / raw)
  To: Medvedkin Vladimir; +Cc: dev, thomas

On Thu, Feb 22, 2018 at 10:50:55PM +0000, Medvedkin Vladimir wrote:
> RIB is an alternative to current LPM library.
> It solves the following problems
>  - Increases the speed of control plane operations against lpm such as
>    adding/deleting routes
>  - Adds abstraction from dataplane algorithms, so it is possible to add
>    different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc
>    in addition to current dir24_8
>  - It is possible to keep user defined application specific additional
>    information in struct rte_rib_node which represents route entry.
>    It can be next hop/set of next hops (i.e. active and feasible),
>    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
>    plenty of additional control plane information.
>  - For dir24_8 implementation it is possible to remove
>    rte_lpm_tbl_entry.depth field that helps to save 6 bits.
>  - Also new dir24_8 implementation supports different next_hop sizes
>    (1/2/4/8 bytes per next hop)
>  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate
>    ternary operator.
>    Instead it returns special default value if there is no route.
> 
> Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>

More comments inline below. Mostly for rte_rib.c file.

/Bruce

> ---
>  config/common_base                 |   6 +
>  doc/api/doxy-api.conf              |   1 +
>  lib/Makefile                       |   2 +
>  lib/librte_rib/Makefile            |  23 ++

Don't forget meson.build file too, to build with meson and ninja. [Strongly
recommend it for your day-to-day development work too, incremental builds
are much, much faster using ninja].
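
Something along these lines should be enough (a sketch; check the other
libraries for the exact conventions):

	version = 1
	sources = files('rte_rib.c', 'rte_dir24_8.c')
	headers = files('rte_rib.h', 'rte_dir24_8.h')
	deps += ['mempool']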

>  lib/librte_rib/rte_dir24_8.c       | 482 +++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_dir24_8.h       | 115 ++++++++
>  lib/librte_rib/rte_rib.c           | 526 +++++++++++++++++++++++++++++++++++++
>  lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
>  lib/librte_rib/rte_rib_version.map |  18 ++
>  mk/rte.app.mk                      |   1 +
>  10 files changed, 1496 insertions(+)
>  create mode 100644 lib/librte_rib/Makefile
>  create mode 100644 lib/librte_rib/rte_dir24_8.c
>  create mode 100644 lib/librte_rib/rte_dir24_8.h
>  create mode 100644 lib/librte_rib/rte_rib.c
>  create mode 100644 lib/librte_rib/rte_rib.h
>  create mode 100644 lib/librte_rib/rte_rib_version.map
> 
> diff --git a/config/common_base b/config/common_base
> index ad03cf4..aceeff5 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -679,6 +679,12 @@ CONFIG_RTE_LIBRTE_LPM=y
>  CONFIG_RTE_LIBRTE_LPM_DEBUG=n
>  
>  #
> +# Compile librte_rib
> +#
> +CONFIG_RTE_LIBRTE_RIB=y
> +CONFIG_RTE_LIBRTE_RIB_DEBUG=n
> +
> +#
>  # Compile librte_acl
>  #
>  CONFIG_RTE_LIBRTE_ACL=y
> diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
> index cda52fd..8e4f969 100644
> --- a/doc/api/doxy-api.conf
> +++ b/doc/api/doxy-api.conf
> @@ -60,6 +60,7 @@ INPUT                   = doc/api/doxy-api-index.md \
>                            lib/librte_kvargs \
>                            lib/librte_latencystats \
>                            lib/librte_lpm \
> +                          lib/librte_rib \
>                            lib/librte_mbuf \
>                            lib/librte_member \
>                            lib/librte_mempool \
> diff --git a/lib/Makefile b/lib/Makefile
> index ec965a6..e4faf10 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -43,6 +43,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
>  DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
>  DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
>  DEPDIRS-librte_lpm := librte_eal
> +DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> +DEPDIRS-librte_rib := librte_eal librte_mempool
>  DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
>  DEPDIRS-librte_acl := librte_eal
>  DIRS-$(CONFIG_RTE_LIBRTE_MEMBER) += librte_member
> diff --git a/lib/librte_rib/Makefile b/lib/librte_rib/Makefile
> new file mode 100644
> index 0000000..f6431b6
> --- /dev/null
> +++ b/lib/librte_rib/Makefile
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_rib.a
> +
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> +LDLIBS += -lrte_eal -lrte_mempool
> +
> +EXPORT_MAP := rte_rib_version.map
> +
> +LIBABIVER := 1
> +
> +# all source are stored in SRCS-y
> +SRCS-$(CONFIG_RTE_LIBRTE_RIB) := rte_rib.c rte_dir24_8.c
> +
> +# install this header file
> +SYMLINK-$(CONFIG_RTE_LIBRTE_RIB)-include := rte_rib.h rte_dir24_8.h
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
> new file mode 100644
> index 0000000..2fc55fe
> --- /dev/null
> +++ b/lib/librte_rib/rte_dir24_8.c

For future patches, it would be good if you can split the dir24_8 code into
a separate patch from the main rib code. The more you can split it into
separate patch blocks, the easier it is to review.

> @@ -0,0 +1,482 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> + */

<snip>

> diff --git a/lib/librte_rib/rte_rib.c b/lib/librte_rib/rte_rib.c
> new file mode 100644
> index 0000000..7783b23
> --- /dev/null
> +++ b/lib/librte_rib/rte_rib.c
> @@ -0,0 +1,526 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> + */
> +
> +#include <stdint.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <sys/queue.h>
> +
> +#include <rte_eal.h>
> +#include <rte_eal_memconfig.h>
> +#include <rte_common.h>
> +#include <rte_tailq.h>
> +#include <rte_errno.h>
> +#include <rte_rwlock.h>
> +#include <rte_memory.h>
> +#include <rte_memzone.h>
> +#include <rte_mempool.h>
> +#include <rte_malloc.h>
> +#include <rte_log.h>
> +
> +#include <rte_rib.h>
> +#include <rte_dir24_8.h>
> +
> +TAILQ_HEAD(rte_rib_list, rte_tailq_entry);
> +static struct rte_tailq_elem rte_rib_tailq = {
> +	.name = "RTE_RIB",
> +};
> +EAL_REGISTER_TAILQ(rte_rib_tailq)
> +
> +static struct rte_rib_node *
> +new_node_malloc(struct rte_rib *rib)
> +{
> +	struct rte_rib_node *ent;
> +
> +	ent =  malloc(rib->node_sz);
> +	if (unlikely(ent == NULL))
> +		return NULL;
> +	++rib->cur_nodes;
> +	return ent;
> +}
> +
> +static void
> +free_node_malloc(__rte_unused struct rte_rib *rib, struct rte_rib_node *ent)
> +{
> +	--rib->cur_nodes;
> +	free(ent);
> +}
> +
> +static struct rte_rib_node *
> +new_node_mempool(struct rte_rib *rib)
> +{
> +	struct rte_rib_node *ent;
> +	int ret;
> +
> +	ret = rte_mempool_get(rib->node_pool, (void *)&ent);
> +	if (unlikely(ret != 0))
> +		return NULL;
> +	++rib->cur_nodes;
> +	return ent;
> +}
> +
> +static void
> +free_node_mempool(struct rte_rib *rib, struct rte_rib_node *ent)
> +{
> +	--rib->cur_nodes;
> +	rte_mempool_put(rib->node_pool, ent);
> +}
> +
> +struct rte_rib_node *
> +rte_rib_tree_lookup(struct rte_rib *rib, uint32_t key)
> +{
> +	struct rte_rib_node *cur = rib->trie;
> +	struct rte_rib_node *prev = NULL;
> +
> +	while ((cur != NULL) && (((cur->key ^ key) &
> +		(uint32_t)(UINT64_MAX << (32 - cur->depth))) == 0)) {
> +		if ((cur->flag & RTE_RIB_VALID_NODE) == RTE_RIB_VALID_NODE)
> +			prev = cur;
> +		cur = RTE_RIB_GET_NXT_NODE(cur, key);
> +	}
> +	return prev;
> +}
> +
> +struct rte_rib_node *
> +rte_rib_tree_lookup_parent(struct rte_rib_node *ent)
> +{
> +	struct rte_rib_node *tmp;
> +
> +	if (ent == NULL)
> +		return NULL;
> +	tmp = ent->parent;
> +	while ((tmp != NULL) &&
> +		(tmp->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
> +		tmp = tmp->parent;
> +}
Watch indentation here. The close brace obviously needs an indent, and the
line continuation of while should have different indent to body. Coding
standards suggest double indent on the continuation, ie. two tabs before
"(tmp->flag".

> +	return tmp;
> +}
> +
> +struct rte_rib_node *
> +rte_rib_tree_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t depth)
> +{
> +	struct rte_rib_node *cur = rib->trie;
> +
> +	key &= (uint32_t)(UINT64_MAX << (32 - depth));
This mask generation seems a common idiom. Maybe make an inline fn or macro
out of it.
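
For example, something along these lines (just a sketch, name invented):

	static inline uint32_t
	rte_rib_depth_to_mask(uint8_t depth)
	{
		/* depth == 0 must yield an all-zero mask, hence the
		 * 64-bit shift before truncating to 32 bits
		 */
		return (uint32_t)(UINT64_MAX << (32 - depth));
	}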

> +	while (cur != NULL) {
> +		if ((cur->key == key) && (cur->depth == depth) &&
> +				(cur->flag & RTE_RIB_VALID_NODE))
> +			return cur;
> +		if ((cur->depth > depth) ||
> +				(((uint64_t)key >> (32 - cur->depth)) !=
> +				((uint64_t)cur->key >> (32 - cur->depth))))
> +			break;
> +		cur = RTE_RIB_GET_NXT_NODE(cur, key);
> +	}
> +	return NULL;
> +}
> +
> +struct rte_rib_node *
> +rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key,
> +	uint8_t depth, struct rte_rib_node *cur, int flag)
> +{
> +	struct rte_rib_node *tmp, *prev = NULL;
> +
> +	if (cur == NULL) {
> +		tmp = rib->trie;
> +		while ((tmp) && (tmp->depth < depth))
> +			tmp = RTE_RIB_GET_NXT_NODE(tmp, key);
> +	} else {
> +		tmp = cur;
> +		while ((tmp->parent != NULL) && (RTE_RIB_IS_RIGHT_NODE(tmp) ||
> +				(tmp->parent->right == NULL))) {
> +			tmp = tmp->parent;
> +			if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> +				(RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> +					key, depth)))
> +				return tmp;
> +		}
> +		tmp = (tmp->parent) ? tmp->parent->right : NULL;
> +	}
> +	while (tmp) {
> +		if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> +			(RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> +				key, depth))) {
> +			prev = tmp;
> +			if (flag == RTE_RIB_GET_NXT_COVER)
> +				return prev;
> +		}
> +		tmp = (tmp->left) ? tmp->left : tmp->right;
> +	}
> +	return prev;
> +}

I think this function could do with some comments explaining the logic
behind it.

> +
> +void
> +rte_rib_tree_remove(struct rte_rib *rib, uint32_t key, uint8_t depth)
> +{
> +	struct rte_rib_node *cur, *prev, *child;
> +
> +	cur = rte_rib_tree_lookup_exact(rib, key, depth);
> +	if (cur == NULL)
> +		return;
> +
> +	--rib->cur_routes;
> +	cur->flag &= ~RTE_RIB_VALID_NODE;
> +	while ((cur->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
> +		if ((cur->left != NULL) && (cur->right != NULL))
> +			return;
> +		child = (cur->left == NULL) ? cur->right : cur->left;
> +		if (child != NULL)
> +			child->parent = cur->parent;
> +		if (cur->parent == NULL) {
> +			rib->trie = child;
> +			rib->free_node(rib, cur);
> +			return;
> +		}
> +		if (cur->parent->left == cur)
> +			cur->parent->left = child;
> +		else
> +			cur->parent->right = child;
> +		prev = cur;
> +		cur = cur->parent;
> +		rib->free_node(rib, prev);
> +	}
> +}
> +
> +struct rte_rib_node *
> +rte_rib_tree_insert(struct rte_rib *rib, uint32_t key, uint8_t depth)
> +{
> +	struct rte_rib_node **tmp = &rib->trie;
> +	struct rte_rib_node *prev = NULL;
> +	struct rte_rib_node *new_node = NULL;
> +	struct rte_rib_node *common_node = NULL;
> +	int i = 0;
> +	uint32_t common_prefix;
> +	uint8_t common_depth;
> +
> +	if (depth > 32) {
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +
> +	key &= (uint32_t)(UINT64_MAX << (32 - depth));
> +	new_node = rte_rib_tree_lookup_exact(rib, key, depth);
> +	if (new_node != NULL) {
> +		rte_errno = EEXIST;
> +		return NULL;
> +	}
> +
> +	new_node = rib->alloc_node(rib);
> +	if (new_node == NULL) {
> +		rte_errno = ENOMEM;
> +		return NULL;
> +	}
> +	new_node->left = NULL;
> +	new_node->right = NULL;
> +	new_node->parent = NULL;
> +	new_node->key = key;
> +	new_node->depth = depth;
> +	new_node->flag = RTE_RIB_VALID_NODE;
> +
> +	while (1) {
> +		if (*tmp == NULL) {
> +			*tmp = new_node;
> +			new_node->parent = prev;
> +		}

I think it would be clearer to have a return in this block, rather than
having fallthrough to the next one.

> +		if ((key == (*tmp)->key) && (depth == (*tmp)->depth)) {
> +			if (new_node != *tmp) {
> +				rib->free_node(rib, new_node);
> +				(*tmp)->flag |= RTE_RIB_VALID_NODE;
> +			}
> +			++rib->cur_routes;
> +			return *tmp;
> +		}

Comment in the above block please. My understanding is that the previous
search just confirmed that there wasn't already a valid entry present for
this new item, but did not report if there was already an invalid entry,
which is what this block is matching.

> +		i = (*tmp)->depth;

I think "d" might be a better choice of name here, rather than "i", which
you expect to be a loop variable.

> +		if ((i >= depth) || (((uint64_t)key >> (32 - i)) !=
> +				((uint64_t)(*tmp)->key >> (32 - i))))
> +			break;
> +		prev = *tmp;
> +		tmp = (key & (1 << (31 - i))) ? &(*tmp)->right : &(*tmp)->left;
> +	}
> +	common_depth = RTE_MIN(depth, (*tmp)->depth);
> +	common_prefix = key ^ (*tmp)->key;
> +	i = __builtin_clz(common_prefix);
> +
> +	common_depth = RTE_MIN(i, common_depth);
> +	common_prefix = key & (uint32_t)(UINT64_MAX << (32 - common_depth));
> +	if ((common_prefix == key) && (common_depth == depth)) {
> +		if ((*tmp)->key & (1 << (31 - depth)))
> +			new_node->right = *tmp;
> +		else
> +			new_node->left = *tmp;
> +		new_node->parent = (*tmp)->parent;
> +		(*tmp)->parent = new_node;
> +		*tmp = new_node;
> +	} else {
> +		common_node = rib->alloc_node(rib);
> +		if (common_node == NULL) {
> +			rib->free_node(rib, new_node);
> +			rte_errno = ENOMEM;
> +			return NULL;
> +		}
> +		common_node->key = common_prefix;
> +		common_node->depth = common_depth;
> +		common_node->flag = 0;
> +		common_node->parent = (*tmp)->parent;
> +		new_node->parent = common_node;
> +		(*tmp)->parent = common_node;
> +		if ((new_node->key & (1 << (31 - common_depth))) == 0) {
> +			common_node->left = new_node;
> +			common_node->right = *tmp;
> +		} else {
> +			common_node->left = *tmp;
> +			common_node->right = new_node;
> +		}
> +		*tmp = common_node;
> +	}

Again some commenting of the logic here would help the reader.

> +	++rib->cur_routes;
> +	return new_node;
> +}
> +
> +struct rte_rib *
> +rte_rib_create(const char *name, int socket_id, struct rte_rib_conf *conf)
> +{
> +	char mem_name[RTE_RIB_NAMESIZE];
> +	struct rte_rib *rib = NULL;
> +	struct rte_tailq_entry *te;
> +	struct rte_rib_list *rib_list;
> +	struct rte_mempool *node_pool = NULL;
> +	enum rte_dir24_8_nh_sz dir24_8_nh_size;
> +
> +	/* Check user arguments. */
> +	if ((name == NULL) || (conf == NULL) || (socket_id < -1) ||
> +			(conf->type >= RTE_RIB_TYPE_MAX) ||
> +			(conf->alloc_type >= RTE_RIB_ALLOC_MAX) ||
> +			(conf->max_nodes == 0) ||
> +			(conf->node_sz < sizeof(struct rte_rib_node))) {
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +
> +	if (conf->alloc_type == RTE_RIB_MEMPOOL) {

Since you are forcing the user always to specify the max number of nodes,
why not always use mempool allocation type? What is the use-case for
malloc-based allocation instead?

> +		snprintf(mem_name, sizeof(mem_name), "MP_%s", name);
> +		node_pool = rte_mempool_create(mem_name, conf->max_nodes,
> +			conf->node_sz, 0, 0, NULL, NULL, NULL, NULL,
> +			socket_id, 0);
> +
> +		if (node_pool == NULL) {
> +			RTE_LOG(ERR, LPM,
> +				"Can not allocate mempool for RIB %s\n", name);
> +			rte_errno = ENOMEM;
> +			return NULL;
> +		}
> +
> +	}
> +
> +	snprintf(mem_name, sizeof(mem_name), "RIB_%s", name);
> +
> +	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> +
> +	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> +	/* guarantee there's no existing */
> +	TAILQ_FOREACH(te, rib_list, next) {
> +		rib = (struct rte_rib *)te->data;
> +		if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
> +			break;
> +	}
> +	rib = NULL;
> +	if (te != NULL) {
> +		rte_errno = EEXIST;
> +		goto exit;
> +	}
> +
> +	/* allocate tailq entry */
> +	te = rte_zmalloc("RIB_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL) {
> +		RTE_LOG(ERR, LPM,
> +			"Can not allocate tailq entry for RIB %s\n", name);
> +		rte_errno = ENOMEM;
> +		goto exit;
> +	}
> +
> +	/* Allocate memory to store the LPM data structures. */
> +	rib = (struct rte_rib *)rte_zmalloc_socket(mem_name,
> +		sizeof(struct rte_rib),	RTE_CACHE_LINE_SIZE, socket_id);
> +	if (rib == NULL) {
> +		RTE_LOG(ERR, LPM, "RIB %s memory allocation failed\n", name);
> +		rte_errno = ENOMEM;
> +		goto free_te;
> +	}
> +	snprintf(rib->name, sizeof(rib->name), "%s", name);
> +	rib->trie = NULL;
> +	rib->max_nodes = conf->max_nodes;
> +	rib->node_sz = conf->node_sz;
> +	rib->type = conf->type;
> +	rib->alloc_type = conf->alloc_type;
> +
> +	if (conf->type <= RTE_RIB_DIR24_8_8B) {
> +		switch (conf->type) {
> +		case RTE_RIB_DIR24_8_1B:
> +			dir24_8_nh_size = RTE_DIR24_8_1B;
> +			rib->lookup = rte_dir24_8_lookup_bulk_1b;
> +			break;
> +		case RTE_RIB_DIR24_8_2B:
> +			dir24_8_nh_size = RTE_DIR24_8_2B;
> +			rib->lookup = rte_dir24_8_lookup_bulk_2b;
> +			break;
> +		case RTE_RIB_DIR24_8_4B:
> +			dir24_8_nh_size = RTE_DIR24_8_4B;
> +			rib->lookup = rte_dir24_8_lookup_bulk_4b;
> +			break;
> +		case RTE_RIB_DIR24_8_8B:
> +			dir24_8_nh_size = RTE_DIR24_8_8B;
> +			rib->lookup = rte_dir24_8_lookup_bulk_8b;
> +			break;
> +		case RTE_RIB_TYPE_MAX:
> +		default:
> +			RTE_LOG(ERR, LPM, "Bad RIB %s type\n", name);
> +			rte_errno = EINVAL;
> +			goto free_rib;
> +		}
> +		rib->fib = (void *)rte_dir24_8_create(name, socket_id,
> +				dir24_8_nh_size, conf->def_nh);
> +		if (rib->fib == NULL) {
> +			RTE_LOG(ERR, LPM, "Failed to allocate FIB %s\n", name);
> +			rte_errno = ENOMEM;
> +			goto free_rib;
> +		}
> +		rib->modify = rte_dir24_8_modify;
> +	}
> +
> +	switch (conf->alloc_type) {
> +	case RTE_RIB_MALLOC:
> +		rib->alloc_node = new_node_malloc;
> +		rib->free_node = free_node_malloc;
> +		break;
> +	case RTE_RIB_MEMPOOL:
> +		rib->node_pool = node_pool;
> +		rib->alloc_node = new_node_mempool;
> +		rib->free_node = free_node_mempool;
> +		break;
> +	case RTE_RIB_ALLOC_MAX:
> +	default:
> +		RTE_LOG(ERR, LPM, "Bad RIB %s alloc type\n", name);
> +		rte_errno = EINVAL;
> +		goto free_fib;
> +	}
> +
> +	te->data = (void *)rib;
> +	TAILQ_INSERT_TAIL(rib_list, te, next);
> +
> +	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> +
> +	return rib;
> +
> +free_fib:
> +	switch (conf->type) {
> +	case RTE_RIB_DIR24_8_1B:
> +	case RTE_RIB_DIR24_8_2B:
> +	case RTE_RIB_DIR24_8_4B:
> +	case RTE_RIB_DIR24_8_8B:
> +		rte_dir24_8_free(rib->fib);
> +		break;
> +	default:
> +		break;
> +	}
> +free_rib:
> +	rte_free(rib);
> +free_te:
> +	rte_free(te);
> +exit:
> +	if (conf->alloc_type == RTE_RIB_MEMPOOL)
> +		rte_mempool_free(node_pool);
> +	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> +
> +	return NULL;
> +}
> +
> +struct rte_rib *
> +rte_rib_find_existing(const char *name)
> +{
> +	struct rte_rib *rib = NULL;
> +	struct rte_tailq_entry *te;
> +	struct rte_rib_list *rib_list;
> +
> +	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> +
> +	rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
> +	TAILQ_FOREACH(te, rib_list, next) {
> +		rib = (struct rte_rib *) te->data;
> +		if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
> +			break;
> +	}
> +	rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
> +
> +	if (te == NULL) {
> +		rte_errno = ENOENT;
> +		return NULL;
> +	}
> +
> +	return rib;
> +}
> +
> +void
> +rte_rib_free(struct rte_rib *rib)
> +{
> +	struct rte_tailq_entry *te;
> +	struct rte_rib_list *rib_list;
> +	struct rte_rib_node *tmp = NULL;
> +
> +	if (rib == NULL)
> +		return;
> +
> +	rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> +
> +	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> +
> +	/* find our tailq entry */
> +	TAILQ_FOREACH(te, rib_list, next) {
> +		if (te->data == (void *)rib)
> +			break;
> +	}
> +	if (te != NULL)
> +		TAILQ_REMOVE(rib_list, te, next);
> +
> +	rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> +
> +	while ((tmp = rte_rib_tree_get_nxt(rib, 0, 0, tmp,
> +		RTE_RIB_GET_NXT_ALL)) != NULL)
> +		rte_rib_tree_remove(rib, tmp->key, tmp->depth);
> +
> +	if (rib->alloc_type == RTE_RIB_MEMPOOL)
> +		rte_mempool_free(rib->node_pool);
> +
> +	switch (rib->type) {
> +	case RTE_RIB_DIR24_8_1B:
> +	case RTE_RIB_DIR24_8_2B:
> +	case RTE_RIB_DIR24_8_4B:
> +	case RTE_RIB_DIR24_8_8B:
> +		rte_dir24_8_free(rib->fib);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	rte_free(rib);
> +	rte_free(te);
> +}
> +
> +int
> +rte_rib_add(struct rte_rib *rib, uint32_t ip, uint8_t depth, uint64_t next_hop)
> +{
> +	if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> +		return -EINVAL;
> +
> +	return rib->modify(rib, ip, depth, next_hop, RTE_RIB_ADD);
> +}
> +
> +int
> +rte_rib_delete(struct rte_rib *rib, uint32_t ip, uint8_t depth)
> +{
> +	if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> +		return -EINVAL;
> +
> +	return rib->modify(rib, ip, depth, 0, RTE_RIB_DEL);
> +}
<snip>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] Add RIB library
  2018-03-15 14:27   ` Bruce Richardson
@ 2018-03-25 18:35     ` Vladimir Medvedkin
  2018-03-26  9:55       ` Bruce Richardson
  0 siblings, 1 reply; 7+ messages in thread
From: Vladimir Medvedkin @ 2018-03-25 18:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, thomas

2018-03-15 17:27 GMT+03:00 Bruce Richardson <bruce.richardson@intel.com>:

> On Thu, Feb 22, 2018 at 10:50:55PM +0000, Medvedkin Vladimir wrote:
> > RIB is an alternative to current LPM library.
> > It solves the following problems
> >  - Increases the speed of control plane operations against lpm such as
> >    adding/deleting routes
> >  - Adds abstraction from dataplane algorithms, so it is possible to add
> >    different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc
> >    in addition to current dir24_8
> >  - It is possible to keep user defined application specific additional
> >    information in struct rte_rib_node which represents route entry.
> >    It can be next hop/set of next hops (i.e. active and feasible),
> >    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
> >    plenty of additional control plane information.
> >  - For dir24_8 implementation it is possible to remove
> >    rte_lpm_tbl_entry.depth field that helps to save 6 bits.
> >  - Also new dir24_8 implementation supports different next_hop sizes
> >    (1/2/4/8 bytes per next hop)
> >  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate
> >    ternary operator.
> >    Instead it returns special default value if there is no route.
> >
> > Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
>
> More comments inline below. Mostly for rte_rib.c file.
>
> /Bruce
>
> > ---
> >  config/common_base                 |   6 +
> >  doc/api/doxy-api.conf              |   1 +
> >  lib/Makefile                       |   2 +
> >  lib/librte_rib/Makefile            |  23 ++
>
> Don't forget meson.build file too, to build with meson and ninja. [Strongly
> recommend it for your day-to-day development work too, incremental builds
> are much, much faster using ninja].
>
Will add

>
> >  lib/librte_rib/rte_dir24_8.c       | 482 ++++++++++++++++++++++++++++++
> +++
> >  lib/librte_rib/rte_dir24_8.h       | 115 ++++++++
> >  lib/librte_rib/rte_rib.c           | 526 ++++++++++++++++++++++++++++++
> +++++++
> >  lib/librte_rib/rte_rib.h           | 322 +++++++++++++++++++++++
> >  lib/librte_rib/rte_rib_version.map |  18 ++
> >  mk/rte.app.mk                      |   1 +
> >  10 files changed, 1496 insertions(+)
> >  create mode 100644 lib/librte_rib/Makefile
> >  create mode 100644 lib/librte_rib/rte_dir24_8.c
> >  create mode 100644 lib/librte_rib/rte_dir24_8.h
> >  create mode 100644 lib/librte_rib/rte_rib.c
> >  create mode 100644 lib/librte_rib/rte_rib.h
> >  create mode 100644 lib/librte_rib/rte_rib_version.map
> >
> > diff --git a/config/common_base b/config/common_base
> > index ad03cf4..aceeff5 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -679,6 +679,12 @@ CONFIG_RTE_LIBRTE_LPM=y
> >  CONFIG_RTE_LIBRTE_LPM_DEBUG=n
> >
> >  #
> > +# Compile librte_rib
> > +#
> > +CONFIG_RTE_LIBRTE_RIB=y
> > +CONFIG_RTE_LIBRTE_RIB_DEBUG=n
> > +
> > +#
> >  # Compile librte_acl
> >  #
> >  CONFIG_RTE_LIBRTE_ACL=y
> > diff --git a/doc/api/doxy-api.conf b/doc/api/doxy-api.conf
> > index cda52fd..8e4f969 100644
> > --- a/doc/api/doxy-api.conf
> > +++ b/doc/api/doxy-api.conf
> > @@ -60,6 +60,7 @@ INPUT                   = doc/api/doxy-api-index.md \
> >                            lib/librte_kvargs \
> >                            lib/librte_latencystats \
> >                            lib/librte_lpm \
> > +                          lib/librte_rib \
> >                            lib/librte_mbuf \
> >                            lib/librte_member \
> >                            lib/librte_mempool \
> > diff --git a/lib/Makefile b/lib/Makefile
> > index ec965a6..e4faf10 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -43,6 +43,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_EFD) += librte_efd
> >  DEPDIRS-librte_efd := librte_eal librte_ring librte_hash
> >  DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
> >  DEPDIRS-librte_lpm := librte_eal
> > +DIRS-$(CONFIG_RTE_LIBRTE_RIB) += librte_rib
> > +DEPDIRS-librte_rib := librte_eal librte_mempool
> >  DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
> >  DEPDIRS-librte_acl := librte_eal
> >  DIRS-$(CONFIG_RTE_LIBRTE_MEMBER) += librte_member
> > diff --git a/lib/librte_rib/Makefile b/lib/librte_rib/Makefile
> > new file mode 100644
> > index 0000000..f6431b6
> > --- /dev/null
> > +++ b/lib/librte_rib/Makefile
> > @@ -0,0 +1,23 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_rib.a
> > +
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
> > +LDLIBS += -lrte_eal -lrte_mempool
> > +
> > +EXPORT_MAP := rte_rib_version.map
> > +
> > +LIBABIVER := 1
> > +
> > +# all source are stored in SRCS-y
> > +SRCS-$(CONFIG_RTE_LIBRTE_RIB) := rte_rib.c rte_dir24_8.c
> > +
> > +# install this header file
> > +SYMLINK-$(CONFIG_RTE_LIBRTE_RIB)-include := rte_rib.h rte_dir24_8.h
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/lib/librte_rib/rte_dir24_8.c b/lib/librte_rib/rte_dir24_8.c
> > new file mode 100644
> > index 0000000..2fc55fe
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_dir24_8.c
>
> For future patches, it would be good if you can split the dir24_8 code into
> a separate patch from the main rib code. The more you can split it into
> separate patch blocks, the easier it is to review.
>
Ok

>
> > @@ -0,0 +1,482 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> > + */
>
> <snip>
>
> > diff --git a/lib/librte_rib/rte_rib.c b/lib/librte_rib/rte_rib.c
> > new file mode 100644
> > index 0000000..7783b23
> > --- /dev/null
> > +++ b/lib/librte_rib/rte_rib.c
> > @@ -0,0 +1,526 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2018 Vladimir Medvedkin <medvedkinv@gmail.com>
> > + */
> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <sys/queue.h>
> > +
> > +#include <rte_eal.h>
> > +#include <rte_eal_memconfig.h>
> > +#include <rte_common.h>
> > +#include <rte_tailq.h>
> > +#include <rte_errno.h>
> > +#include <rte_rwlock.h>
> > +#include <rte_memory.h>
> > +#include <rte_memzone.h>
> > +#include <rte_mempool.h>
> > +#include <rte_malloc.h>
> > +#include <rte_log.h>
> > +
> > +#include <rte_rib.h>
> > +#include <rte_dir24_8.h>
> > +
> > +TAILQ_HEAD(rte_rib_list, rte_tailq_entry);
> > +static struct rte_tailq_elem rte_rib_tailq = {
> > +     .name = "RTE_RIB",
> > +};
> > +EAL_REGISTER_TAILQ(rte_rib_tailq)
> > +
> > +static struct rte_rib_node *
> > +new_node_malloc(struct rte_rib *rib)
> > +{
> > +     struct rte_rib_node *ent;
> > +
> > +     ent =  malloc(rib->node_sz);
> > +     if (unlikely(ent == NULL))
> > +             return NULL;
> > +     ++rib->cur_nodes;
> > +     return ent;
> > +}
> > +
> > +static void
> > +free_node_malloc(__rte_unused struct rte_rib *rib, struct rte_rib_node
> *ent)
> > +{
> > +     --rib->cur_nodes;
> > +     free(ent);
> > +}
> > +
> > +static struct rte_rib_node *
> > +new_node_mempool(struct rte_rib *rib)
> > +{
> > +     struct rte_rib_node *ent;
> > +     int ret;
> > +
> > +     ret = rte_mempool_get(rib->node_pool, (void *)&ent);
> > +     if (unlikely(ret != 0))
> > +             return NULL;
> > +     ++rib->cur_nodes;
> > +     return ent;
> > +}
> > +
> > +static void
> > +free_node_mempool(struct rte_rib *rib, struct rte_rib_node *ent)
> > +{
> > +     --rib->cur_nodes;
> > +     rte_mempool_put(rib->node_pool, ent);
> > +}
> > +
> > +struct rte_rib_node *
> > +rte_rib_tree_lookup(struct rte_rib *rib, uint32_t key)
> > +{
> > +     struct rte_rib_node *cur = rib->trie;
> > +     struct rte_rib_node *prev = NULL;
> > +
> > +     while ((cur != NULL) && (((cur->key ^ key) &
> > +             (uint32_t)(UINT64_MAX << (32 - cur->depth))) == 0)) {
> > +             if ((cur->flag & RTE_RIB_VALID_NODE) == RTE_RIB_VALID_NODE)
> > +                     prev = cur;
> > +             cur = RTE_RIB_GET_NXT_NODE(cur, key);
> > +     }
> > +     return prev;
> > +}
> > +
> > +struct rte_rib_node *
> > +rte_rib_tree_lookup_parent(struct rte_rib_node *ent)
> > +{
> > +     struct rte_rib_node *tmp;
> > +
> > +     if (ent == NULL)
> > +             return NULL;
> > +     tmp = ent->parent;
> > +     while ((tmp != NULL) &&
> > +             (tmp->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
> > +             tmp = tmp->parent;
> > +}
> Watch indentation here. The close brace obviously needs an indent, and the
> line continuation of while should have different indent to body. Coding
> standards suggest double indent on the continuation, ie. two tabs before
> "(tmp->flag".
>
 Ah, I missed it, thanks


>
> > +     return tmp;
> > +}
> > +
> > +struct rte_rib_node *
> > +rte_rib_tree_lookup_exact(struct rte_rib *rib, uint32_t key, uint8_t
> depth)
> > +{
> > +     struct rte_rib_node *cur = rib->trie;
> > +
> > +     key &= (uint32_t)(UINT64_MAX << (32 - depth));
> This mask generation seems a common idiom. Maybe make an inline fn or macro
> out of it.
>
Got it


>
> > +     while (cur != NULL) {
> > +             if ((cur->key == key) && (cur->depth == depth) &&
> > +                             (cur->flag & RTE_RIB_VALID_NODE))
> > +                     return cur;
> > +             if ((cur->depth > depth) ||
> > +                             (((uint64_t)key >> (32 - cur->depth)) !=
> > +                             ((uint64_t)cur->key >> (32 - cur->depth))))
> > +                     break;
> > +             cur = RTE_RIB_GET_NXT_NODE(cur, key);
> > +     }
> > +     return NULL;
> > +}
> > +
> > +struct rte_rib_node *
> > +rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key,
> > +     uint8_t depth, struct rte_rib_node *cur, int flag)
> > +{
> > +     struct rte_rib_node *tmp, *prev = NULL;
> > +
> > +     if (cur == NULL) {
> > +             tmp = rib->trie;
> > +             while ((tmp) && (tmp->depth < depth))
> > +                     tmp = RTE_RIB_GET_NXT_NODE(tmp, key);
> > +     } else {
> > +             tmp = cur;
> > +             while ((tmp->parent != NULL) &&
> (RTE_RIB_IS_RIGHT_NODE(tmp) ||
> > +                             (tmp->parent->right == NULL))) {
> > +                     tmp = tmp->parent;
> > +                     if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> > +                             (RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> > +                                     key, depth)))
> > +                             return tmp;
> > +             }
> > +             tmp = (tmp->parent) ? tmp->parent->right : NULL;
> > +     }
> > +     while (tmp) {
> > +             if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> > +                     (RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> > +                             key, depth))) {
> > +                     prev = tmp;
> > +                     if (flag == RTE_RIB_GET_NXT_COVER)
> > +                             return prev;
> > +             }
> > +             tmp = (tmp->left) ? tmp->left : tmp->right;
> > +     }
> > +     return prev;
> > +}
>
> I think this function could do with some comments explaining the logic
> behind it.
>
This function traverses a subtree and retrieves the routes that are more
specific than the key/depth prefix given in the arguments (treat that
prefix as the top of the subtree). It traverses without recursion, using
the parent pointers as an implicit stack, and uses the *cur* argument as a
pointer to the last returned node so that retrieval resumes after it.
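
For a v4, a comment above the function could read roughly like this
(a sketch of the wording):

	/*
	 * Return the next node that is covered by (more specific than)
	 * key/depth, resuming after *cur*, or start from the top of the
	 * subtree when cur == NULL. The walk is iterative: climb via the
	 * parent pointers past already-visited branches, then descend
	 * into the next one. With flag == RTE_RIB_GET_NXT_COVER the
	 * first covered valid node is returned as soon as it is found.
	 */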


> > +
> > +void
> > +rte_rib_tree_remove(struct rte_rib *rib, uint32_t key, uint8_t depth)
> > +{
> > +     struct rte_rib_node *cur, *prev, *child;
> > +
> > +     cur = rte_rib_tree_lookup_exact(rib, key, depth);
> > +     if (cur == NULL)
> > +             return;
> > +
> > +     --rib->cur_routes;
> > +     cur->flag &= ~RTE_RIB_VALID_NODE;
> > +     while ((cur->flag & RTE_RIB_VALID_NODE) != RTE_RIB_VALID_NODE) {
> > +             if ((cur->left != NULL) && (cur->right != NULL))
> > +                     return;
> > +             child = (cur->left == NULL) ? cur->right : cur->left;
> > +             if (child != NULL)
> > +                     child->parent = cur->parent;
> > +             if (cur->parent == NULL) {
> > +                     rib->trie = child;
> > +                     rib->free_node(rib, cur);
> > +                     return;
> > +             }
> > +             if (cur->parent->left == cur)
> > +                     cur->parent->left = child;
> > +             else
> > +                     cur->parent->right = child;
> > +             prev = cur;
> > +             cur = cur->parent;
> > +             rib->free_node(rib, prev);
> > +     }
> > +}
> > +
> > +struct rte_rib_node *
> > +rte_rib_tree_insert(struct rte_rib *rib, uint32_t key, uint8_t depth)
> > +{
> > +     struct rte_rib_node **tmp = &rib->trie;
> > +     struct rte_rib_node *prev = NULL;
> > +     struct rte_rib_node *new_node = NULL;
> > +     struct rte_rib_node *common_node = NULL;
> > +     int i = 0;
> > +     uint32_t common_prefix;
> > +     uint8_t common_depth;
> > +
> > +     if (depth > 32) {
> > +             rte_errno = EINVAL;
> > +             return NULL;
> > +     }
> > +
> > +     key &= (uint32_t)(UINT64_MAX << (32 - depth));
> > +     new_node = rte_rib_tree_lookup_exact(rib, key, depth);
> > +     if (new_node != NULL) {
> > +             rte_errno = EEXIST;
> > +             return NULL;
> > +     }
> > +
> > +     new_node = rib->alloc_node(rib);
> > +     if (new_node == NULL) {
> > +             rte_errno = ENOMEM;
> > +             return NULL;
> > +     }
> > +     new_node->left = NULL;
> > +     new_node->right = NULL;
> > +     new_node->parent = NULL;
> > +     new_node->key = key;
> > +     new_node->depth = depth;
> > +     new_node->flag = RTE_RIB_VALID_NODE;
> > +
> > +     while (1) {
> > +             if (*tmp == NULL) {
> > +                     *tmp = new_node;
> > +                     new_node->parent = prev;
> > +             }
>
> I think it would be clearer to have a return in this block, rather than
> having fallthrough to the next one.
>
Ok


>
> > +             if ((key == (*tmp)->key) && (depth == (*tmp)->depth)) {
> > +                     if (new_node != *tmp) {
> > +                             rib->free_node(rib, new_node);
> > +                             (*tmp)->flag |= RTE_RIB_VALID_NODE;
> > +                     }
> > +                     ++rib->cur_routes;
> > +                     return *tmp;
> > +             }
>
> Comment in the above block please. My understanding is that the previous
> search just confirmed that there wasn't already a valid entry present for
> this new item, but did not report if there was already an invalid entry,
> which is what this block is matching.
>
Right.
In this while loop it traverses down the tree to find the matching node
(or the closest match). In the block above, if a node matches the search
criteria ("if ((key == (*tmp)->key) && (depth == (*tmp)->depth))"), it
means a node with the right key/depth was already inserted as an
intermediate node, but without the RTE_RIB_VALID_NODE flag set (that is
why rte_rib_tree_lookup_exact returned NULL). So it frees the newly
allocated node (new_node), sets the valid flag and so on. The only reason
I check for equality between *tmp and new_node ("if (new_node != *tmp)")
is the fallthrough from the case above ("if (*tmp == NULL)"). With a
return there, as you suggested earlier, the "if (new_node != *tmp)" check
can be eliminated.
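
With the early return, the loop could then look roughly like this
(a sketch, with "i" renamed to "d" as suggested):

	while (1) {
		if (*tmp == NULL) {
			*tmp = new_node;
			new_node->parent = prev;
			++rib->cur_routes;
			return new_node;
		}
		if ((key == (*tmp)->key) && (depth == (*tmp)->depth)) {
			/* an invalid intermediate node already exists
			 * here, so revalidate it instead
			 */
			rib->free_node(rib, new_node);
			(*tmp)->flag |= RTE_RIB_VALID_NODE;
			++rib->cur_routes;
			return *tmp;
		}
		d = (*tmp)->depth;
		if ((d >= depth) || (((uint64_t)key >> (32 - d)) !=
				((uint64_t)(*tmp)->key >> (32 - d))))
			break;
		prev = *tmp;
		tmp = (key & (1 << (31 - d))) ? &(*tmp)->right : &(*tmp)->left;
	}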


>
> > +             i = (*tmp)->depth;
>
> I think "d" might be a better choice of name here, rather than "i", which
> you expect to be a loop variable.
>
got it

>
> > +             if ((i >= depth) || (((uint64_t)key >> (32 - i)) !=
> > +                             ((uint64_t)(*tmp)->key >> (32 - i))))
> > +                     break;
> > +             prev = *tmp;
> > +             tmp = (key & (1 << (31 - i))) ? &(*tmp)->right : &(*tmp)->left;
> > +     }
> > +     common_depth = RTE_MIN(depth, (*tmp)->depth);
> > +     common_prefix = key ^ (*tmp)->key;
> > +     i = __builtin_clz(common_prefix);
> > +
> > +     common_depth = RTE_MIN(i, common_depth);
> > +             common_prefix = key & (uint32_t)(UINT64_MAX << (32 - common_depth));
> > +     if ((common_prefix == key) && (common_depth == depth)) {
> > +             if ((*tmp)->key & (1 << (31 - depth)))
> > +                     new_node->right = *tmp;
> > +             else
> > +                     new_node->left = *tmp;
> > +             new_node->parent = (*tmp)->parent;
> > +             (*tmp)->parent = new_node;
> > +             *tmp = new_node;
> > +     } else {
> > +             common_node = rib->alloc_node(rib);
> > +             if (common_node == NULL) {
> > +                     rib->free_node(rib, new_node);
> > +                     rte_errno = ENOMEM;
> > +                     return NULL;
> > +             }
> > +             common_node->key = common_prefix;
> > +             common_node->depth = common_depth;
> > +             common_node->flag = 0;
> > +             common_node->parent = (*tmp)->parent;
> > +             new_node->parent = common_node;
> > +             (*tmp)->parent = common_node;
> > +             if ((new_node->key & (1 << (31 - common_depth))) == 0) {
> > +                     common_node->left = new_node;
> > +                     common_node->right = *tmp;
> > +             } else {
> > +                     common_node->left = *tmp;
> > +                     common_node->right = new_node;
> > +             }
> > +             *tmp = common_node;
> > +     }
>
> Again some commenting of the logic here would help the reader.

After the closest matching prefix is found in the while loop, this code
computes the common key and prefix length for the given key/depth and the
node that was found.
Then it either inserts the new node as the parent of the found prefix (if
the statement "((common_prefix == key) && (common_depth == depth))" is
true) or allocates an intermediate common node and links both nodes as
its children otherwise.
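
For illustration, suppose 192.168.128.0/17 (0xc0a88000) is being inserted
and the closest node found by the loop is 192.168.64.0/18 (0xc0a84000):

	common_prefix = 0xc0a88000 ^ 0xc0a84000 = 0x0000c000
	i             = __builtin_clz(0x0000c000) = 16
	common_depth  = RTE_MIN(16, RTE_MIN(17, 18)) = 16
	common_prefix = 0xc0a88000 & 0xffff0000 = 0xc0a80000

Since 0xc0a80000 (192.168.0.0) differs from the inserted key, the else
branch allocates an intermediate node for 192.168.0.0/16 and links the
two prefixes as its children.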


> > +     ++rib->cur_routes;
> > +     return new_node;
> > +}
> > +
> > +struct rte_rib *
> > +rte_rib_create(const char *name, int socket_id, struct rte_rib_conf *conf)
> > +{
> > +     char mem_name[RTE_RIB_NAMESIZE];
> > +     struct rte_rib *rib = NULL;
> > +     struct rte_tailq_entry *te;
> > +     struct rte_rib_list *rib_list;
> > +     struct rte_mempool *node_pool = NULL;
> > +     enum rte_dir24_8_nh_sz dir24_8_nh_size;
> > +
> > +     /* Check user arguments. */
> > +     if ((name == NULL) || (conf == NULL) || (socket_id < -1) ||
> > +                     (conf->type >= RTE_RIB_TYPE_MAX) ||
> > +                     (conf->alloc_type >= RTE_RIB_ALLOC_MAX) ||
> > +                     (conf->max_nodes == 0) ||
> > +                     (conf->node_sz < sizeof(struct rte_rib_node))) {
> > +             rte_errno = EINVAL;
> > +             return NULL;
> > +     }
> > +
> > +     if (conf->alloc_type == RTE_RIB_MEMPOOL) {
>
> Since you are forcing the user always to specify the max number of nodes,
> why not always use mempool allocation type? What is the use-case for
> malloc-based allocation instead?
>
Malloc-based allocation was used in the first incarnation. As I wrote
earlier, I am thinking of removing malloc. In performance tests that
add/delete a huge number of routes, malloc is slower.
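
If malloc goes away, creation would always go through a mempool; a
minimal usage sketch, assuming the conf fields from this patch (the
values here are only for illustration):

	struct rte_rib_conf conf = {
		.max_nodes = 1 << 20,
		.node_sz = sizeof(struct rte_rib_node),
		.type = RTE_RIB_DIR24_8_4B,
		.alloc_type = RTE_RIB_MEMPOOL,
		.def_nh = 0,
	};
	struct rte_rib *rib = rte_rib_create("rib0", SOCKET_ID_ANY, &conf);

	if (rib == NULL)
		printf("create failed: %d\n", rte_errno);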


> > +             snprintf(mem_name, sizeof(mem_name), "MP_%s", name);
> > +             node_pool = rte_mempool_create(mem_name, conf->max_nodes,
> > +                     conf->node_sz, 0, 0, NULL, NULL, NULL, NULL,
> > +                     socket_id, 0);
> > +
> > +             if (node_pool == NULL) {
> > +                     RTE_LOG(ERR, LPM,
> > +                             "Can not allocate mempool for RIB %s\n", name);
> > +                     rte_errno = ENOMEM;
> > +                     return NULL;
> > +             }
> > +
> > +     }
> > +
> > +     snprintf(mem_name, sizeof(mem_name), "RIB_%s", name);
> > +
> > +     rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> > +
> > +     rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> > +     /* guarantee there's no existing */
> > +     TAILQ_FOREACH(te, rib_list, next) {
> > +             rib = (struct rte_rib *)te->data;
> > +             if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
> > +                     break;
> > +     }
> > +     rib = NULL;
> > +     if (te != NULL) {
> > +             rte_errno = EEXIST;
> > +             goto exit;
> > +     }
> > +
> > +     /* allocate tailq entry */
> > +     te = rte_zmalloc("RIB_TAILQ_ENTRY", sizeof(*te), 0);
> > +     if (te == NULL) {
> > +             RTE_LOG(ERR, LPM,
> > +                     "Can not allocate tailq entry for RIB %s\n", name);
> > +             rte_errno = ENOMEM;
> > +             goto exit;
> > +     }
> > +
> > +     /* Allocate memory to store the LPM data structures. */
> > +     rib = (struct rte_rib *)rte_zmalloc_socket(mem_name,
> > +             sizeof(struct rte_rib), RTE_CACHE_LINE_SIZE, socket_id);
> > +     if (rib == NULL) {
> > +             RTE_LOG(ERR, LPM, "RIB %s memory allocation failed\n", name);
> > +             rte_errno = ENOMEM;
> > +             goto free_te;
> > +     }
> > +     snprintf(rib->name, sizeof(rib->name), "%s", name);
> > +     rib->trie = NULL;
> > +     rib->max_nodes = conf->max_nodes;
> > +     rib->node_sz = conf->node_sz;
> > +     rib->type = conf->type;
> > +     rib->alloc_type = conf->alloc_type;
> > +
> > +     if (conf->type <= RTE_RIB_DIR24_8_8B) {
> > +             switch (conf->type) {
> > +             case RTE_RIB_DIR24_8_1B:
> > +                     dir24_8_nh_size = RTE_DIR24_8_1B;
> > +                     rib->lookup = rte_dir24_8_lookup_bulk_1b;
> > +                     break;
> > +             case RTE_RIB_DIR24_8_2B:
> > +                     dir24_8_nh_size = RTE_DIR24_8_2B;
> > +                     rib->lookup = rte_dir24_8_lookup_bulk_2b;
> > +                     break;
> > +             case RTE_RIB_DIR24_8_4B:
> > +                     dir24_8_nh_size = RTE_DIR24_8_4B;
> > +                     rib->lookup = rte_dir24_8_lookup_bulk_4b;
> > +                     break;
> > +             case RTE_RIB_DIR24_8_8B:
> > +                     dir24_8_nh_size = RTE_DIR24_8_8B;
> > +                     rib->lookup = rte_dir24_8_lookup_bulk_8b;
> > +                     break;
> > +             case RTE_RIB_TYPE_MAX:
> > +             default:
> > +                     RTE_LOG(ERR, LPM, "Bad RIB %s type\n", name);
> > +                     rte_errno = EINVAL;
> > +                     goto free_rib;
> > +             }
> > +             rib->fib = (void *)rte_dir24_8_create(name, socket_id,
> > +                             dir24_8_nh_size, conf->def_nh);
> > +             if (rib->fib == NULL) {
> > +                     RTE_LOG(ERR, LPM, "Failed to allocate FIB %s\n", name);
> > +                     rte_errno = ENOMEM;
> > +                     goto free_rib;
> > +             }
> > +             rib->modify = rte_dir24_8_modify;
> > +     }
> > +
> > +     switch (conf->alloc_type) {
> > +     case RTE_RIB_MALLOC:
> > +             rib->alloc_node = new_node_malloc;
> > +             rib->free_node = free_node_malloc;
> > +             break;
> > +     case RTE_RIB_MEMPOOL:
> > +             rib->node_pool = node_pool;
> > +             rib->alloc_node = new_node_mempool;
> > +             rib->free_node = free_node_mempool;
> > +             break;
> > +     case RTE_RIB_ALLOC_MAX:
> > +     default:
> > +             RTE_LOG(ERR, LPM, "Bad RIB %s alloc type\n", name);
> > +             rte_errno = EINVAL;
> > +             goto free_fib;
> > +     }
> > +
> > +     te->data = (void *)rib;
> > +     TAILQ_INSERT_TAIL(rib_list, te, next);
> > +
> > +     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> > +
> > +     return rib;
> > +
> > +free_fib:
> > +     switch (conf->type) {
> > +     case RTE_RIB_DIR24_8_1B:
> > +     case RTE_RIB_DIR24_8_2B:
> > +     case RTE_RIB_DIR24_8_4B:
> > +     case RTE_RIB_DIR24_8_8B:
> > +             rte_dir24_8_free(rib->fib);
> > +             break;
> > +     default:
> > +             break;
> > +     }
> > +free_rib:
> > +     rte_free(rib);
> > +free_te:
> > +     rte_free(te);
> > +exit:
> > +     if (conf->alloc_type == RTE_RIB_MEMPOOL)
> > +             rte_mempool_free(node_pool);
> > +     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> > +
> > +     return NULL;
> > +}
> > +
> > +struct rte_rib *
> > +rte_rib_find_existing(const char *name)
> > +{
> > +     struct rte_rib *rib = NULL;
> > +     struct rte_tailq_entry *te;
> > +     struct rte_rib_list *rib_list;
> > +
> > +     rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> > +
> > +     rte_rwlock_read_lock(RTE_EAL_TAILQ_RWLOCK);
> > +     TAILQ_FOREACH(te, rib_list, next) {
> > +             rib = (struct rte_rib *) te->data;
> > +             if (strncmp(name, rib->name, RTE_RIB_NAMESIZE) == 0)
> > +                     break;
> > +     }
> > +     rte_rwlock_read_unlock(RTE_EAL_TAILQ_RWLOCK);
> > +
> > +     if (te == NULL) {
> > +             rte_errno = ENOENT;
> > +             return NULL;
> > +     }
> > +
> > +     return rib;
> > +}
> > +
> > +void
> > +rte_rib_free(struct rte_rib *rib)
> > +{
> > +     struct rte_tailq_entry *te;
> > +     struct rte_rib_list *rib_list;
> > +     struct rte_rib_node *tmp = NULL;
> > +
> > +     if (rib == NULL)
> > +             return;
> > +
> > +     rib_list = RTE_TAILQ_CAST(rte_rib_tailq.head, rte_rib_list);
> > +
> > +     rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> > +
> > +     /* find our tailq entry */
> > +     TAILQ_FOREACH(te, rib_list, next) {
> > +             if (te->data == (void *)rib)
> > +                     break;
> > +     }
> > +     if (te != NULL)
> > +             TAILQ_REMOVE(rib_list, te, next);
> > +
> > +     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
> > +
> > +     while ((tmp = rte_rib_tree_get_nxt(rib, 0, 0, tmp,
> > +             RTE_RIB_GET_NXT_ALL)) != NULL)
> > +             rte_rib_tree_remove(rib, tmp->key, tmp->depth);
> > +
> > +     if (rib->alloc_type == RTE_RIB_MEMPOOL)
> > +             rte_mempool_free(rib->node_pool);
> > +
> > +     switch (rib->type) {
> > +     case RTE_RIB_DIR24_8_1B:
> > +     case RTE_RIB_DIR24_8_2B:
> > +     case RTE_RIB_DIR24_8_4B:
> > +     case RTE_RIB_DIR24_8_8B:
> > +             rte_dir24_8_free(rib->fib);
> > +             break;
> > +     default:
> > +             break;
> > +     }
> > +
> > +     rte_free(rib);
> > +     rte_free(te);
> > +}
> > +
> > +int
> > +rte_rib_add(struct rte_rib *rib, uint32_t ip, uint8_t depth, uint64_t
> next_hop)
> > +{
> > +     if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> > +             return -EINVAL;
> > +
> > +     return rib->modify(rib, ip, depth, next_hop, RTE_RIB_ADD);
> > +}
> > +
> > +int
> > +rte_rib_delete(struct rte_rib *rib, uint32_t ip, uint8_t depth)
> > +{
> > +     if ((rib == NULL) || (depth > RTE_RIB_MAXDEPTH))
> > +             return -EINVAL;
> > +
> > +     return rib->modify(rib, ip, depth, 0, RTE_RIB_DEL);
> > +}
> <snip>
>



-- 
Regards,
Vladimir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3 1/2] Add RIB library
  2018-03-25 18:35     ` Vladimir Medvedkin
@ 2018-03-26  9:55       ` Bruce Richardson
  0 siblings, 0 replies; 7+ messages in thread
From: Bruce Richardson @ 2018-03-26  9:55 UTC (permalink / raw)
  To: Vladimir Medvedkin; +Cc: dev, thomas

On Sun, Mar 25, 2018 at 09:35:35PM +0300, Vladimir Medvedkin wrote:
> 2018-03-15 17:27 GMT+03:00 Bruce Richardson <bruce.richardson@intel.com>:
> 
> > On Thu, Feb 22, 2018 at 10:50:55PM +0000, Medvedkin Vladimir wrote:
> > > RIB is an alternative to current LPM library.
> > > It solves the following problems
> > >  - Increases the speed of control plane operations against lpm such as
> > >    adding/deleting routes
> > >  - Adds abstraction from dataplane algorithms, so it is possible to add
> > >    different IP route lookup algorithms such as DXR/poptrie/lpc-trie/etc
> > >    in addition to current dir24_8
> > >  - It is possible to keep user defined application specific additional
> > >    information in struct rte_rib_node which represents route entry.
> > >    It can be next hop/set of next hops (i.e. active and feasible),
> > >    pointers to link rte_rib_node based on some criteria (i.e. next_hop),
> > >    plenty of additional control plane information.
> > >  - For dir24_8 implementation it is possible to remove
> > >    rte_lpm_tbl_entry.depth field that helps to save 6 bits.
> > >  - Also new dir24_8 implementation supports different next_hop sizes
> > >    (1/2/4/8 bytes per next hop)
> > >  - Removed RTE_LPM_LOOKUP_SUCCESS to save 1 bit and to eliminate
> > >    ternary operator.
> > >    Instead it returns special default value if there is no route.
> > >
> > > Signed-off-by: Medvedkin Vladimir <medvedkinv@gmail.com>
> >
> > More comments inline below. Mostly for rte_rib.c file.
> >
> > /Bruce
> >
<snip>
> >
> > > +     while (cur != NULL) {
> > > +             if ((cur->key == key) && (cur->depth == depth) &&
> > > +                             (cur->flag & RTE_RIB_VALID_NODE))
> > > +                     return cur;
> > > +             if ((cur->depth > depth) ||
> > > +                             (((uint64_t)key >> (32 - cur->depth)) !=
> > > +                             ((uint64_t)cur->key >> (32 - cur->depth))))
> > > +                     break;
> > > +             cur = RTE_RIB_GET_NXT_NODE(cur, key);
> > > +     }
> > > +     return NULL;
> > > +}
> > > +
> > > +struct rte_rib_node *
> > > +rte_rib_tree_get_nxt(struct rte_rib *rib, uint32_t key,
> > > +     uint8_t depth, struct rte_rib_node *cur, int flag)
> > > +{
> > > +     struct rte_rib_node *tmp, *prev = NULL;
> > > +
> > > +     if (cur == NULL) {
> > > +             tmp = rib->trie;
> > > +             while ((tmp) && (tmp->depth < depth))
> > > +                     tmp = RTE_RIB_GET_NXT_NODE(tmp, key);
> > > +     } else {
> > > +             tmp = cur;
> > > +             while ((tmp->parent != NULL) && (RTE_RIB_IS_RIGHT_NODE(tmp) ||
> > > +                             (tmp->parent->right == NULL))) {
> > > +                     tmp = tmp->parent;
> > > +                     if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> > > +                             (RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> > > +                                     key, depth)))
> > > +                             return tmp;
> > > +             }
> > > +             tmp = (tmp->parent) ? tmp->parent->right : NULL;
> > > +     }
> > > +     while (tmp) {
> > > +             if ((tmp->flag & RTE_RIB_VALID_NODE) &&
> > > +                     (RTE_RIB_IS_COVERED(tmp->key, tmp->depth,
> > > +                             key, depth))) {
> > > +                     prev = tmp;
> > > +                     if (flag == RTE_RIB_GET_NXT_COVER)
> > > +                             return prev;
> > > +             }
> > > +             tmp = (tmp->left) ? tmp->left : tmp->right;
> > > +     }
> > > +     return prev;
> > > +}
> >
> > I think this function could do with some comments explaining the logic
> > behind it.
> >
> This function traverses a subtree and retrieves the more specific routes
> under the key/depth prefix given in the arguments (treat it as the top of
> the subtree).
> The traversal avoids recursion by using the parent pointers as an
> implicit stack. The *cur argument points to the last returned node, so
> retrieval can resume after that node.
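> 
> E.g. a rough usage sketch, mirroring the iteration in rte_rib_free (the
> prefix value is only for illustration): to visit every valid route under
> 10.0.0.0/16
> 
> 	struct rte_rib_node *node = NULL;
> 
> 	while ((node = rte_rib_tree_get_nxt(rib, 0x0a000000, 16, node,
> 			RTE_RIB_GET_NXT_ALL)) != NULL) {
> 		/* node->key/node->depth is a route under 10.0.0.0/16 */
> 	}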
> 
Yes. Please add such an explanation to the code in future patch revisions.
[The same goes for the other additional explanations in your reply email.]

Thanks,
/Bruce

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-03-26  9:55 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-22 22:50 [PATCH v3 0/2] lib/rib: Add Routing Information Base library Medvedkin Vladimir
2018-02-22 22:50 ` [PATCH v3 1/2] Add RIB library Medvedkin Vladimir
2018-03-14 11:58   ` Bruce Richardson
2018-03-15 14:27   ` Bruce Richardson
2018-03-25 18:35     ` Vladimir Medvedkin
2018-03-26  9:55       ` Bruce Richardson
2018-02-22 22:50 ` [PATCH v3 2/2] Add autotests for " Medvedkin Vladimir
