[PATCH v4 0/3] bpf: add longest prefix match map

* [PATCH v4 0/3] bpf: add longest prefix match map
@ 2017-01-21 16:26 Daniel Mack
  2017-01-21 16:26 ` [PATCH v4 1/3] bpf: add a longest prefix match trie map implementation Daniel Mack
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Daniel Mack @ 2017-01-21 16:26 UTC (permalink / raw)
  To: ast; +Cc: dh.herrmann, daniel, netdev, davem, Daniel Mack

This patch set adds a longest prefix match algorithm that can be used
to match IP addresses to a stored set of ranges. It is exposed as a
bpf map type.

Internally, data is stored in an unbalanced tree of nodes that has a
maximum height of n, where n is the prefixlen the trie was created
with.

Note that this has nothing to do with fib or fib6 and is in no way meant
to replace or share code with it. It's rather a much simpler
implementation that is specifically written with bpf maps in mind.

Patch 1/2 adds the implementation, 2/2 an extensive test suite and 3/3
has benchmarking code for the new trie type.

Feedback is much appreciated.

Thanks,
Daniel

Changelog:

v3 -> v4:
	* David added a 3rd patch that augments map_perf_test for
	  LPM trie benchmarks
	* Limit allocation of maps of this new type to CAP_SYS_ADMIN
	  for now, as requested by Alexei
	* Add a stub .map_delete_elem so the core does not stumble
	  over a NULL pointer when the syscall is invoked
	* Tests for non-power-of-2 prefix lengths were added
	* More comment style fixes

v2 -> v3:
	* Store both the key match data and the caller provided
	  value in the same byte array attached to a node. This
	  avoids double allocations
	* Bring back node->flags to distinguish between 'real'
	  and intermediate nodes
	* Fix comment style and some typos

v1 -> v2:
	* Turn spin lock into raw spinlock
	* Lock with irqsave options during trie_update_elem()
	* Return -ENOMEM properly from trie_alloc()
	* Force attr->flags == BPF_F_NO_PREALLOC during creation
	* Set trie->map.pages after creation to account for map memory
	* Allow arbitrary value sizes
	* Removed node->flags and denode intermediate nodes through
	  node->value == NULL instead

rfc -> v1:
	* Add __rcu pointer annotations to make sparse happy
	* Fold _lpm_trie_find_target_node() into its only caller
	* Fix some minor documentation issues

Daniel Mack (1):
  bpf: add a longest prefix match trie map implementation

David Herrmann (2):
  bpf: Add tests for the lpm trie map
  samples/bpf: add lpm-trie benchmark

 include/uapi/linux/bpf.h                   |   7 +
 kernel/bpf/Makefile                        |   2 +-
 kernel/bpf/lpm_trie.c                      | 503 +++++++++++++++++++++++++++++
 samples/bpf/map_perf_test_kern.c           |  30 ++
 samples/bpf/map_perf_test_user.c           |  49 +++
 tools/testing/selftests/bpf/.gitignore     |   1 +
 tools/testing/selftests/bpf/Makefile       |   4 +-
 tools/testing/selftests/bpf/test_lpm_map.c | 358 ++++++++++++++++++++
 8 files changed, 951 insertions(+), 3 deletions(-)
 create mode 100644 kernel/bpf/lpm_trie.c
 create mode 100644 tools/testing/selftests/bpf/test_lpm_map.c

-- 
2.9.3

^ permalink raw reply	[flat|nested] 10+ messages in thread