* [PATCH bpf 0/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
@ 2023-09-29 15:07 Martynas Pumputis
2023-09-29 15:07 ` [PATCH bpf 1/2] " Martynas Pumputis
2023-09-29 15:07 ` [PATCH bpf 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis
0 siblings, 2 replies; 5+ messages in thread
From: Martynas Pumputis @ 2023-09-29 15:07 UTC (permalink / raw)
To: bpf
Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov,
Martynas Pumputis
The patchset fixes the limitation of bpf_*_fib_lookup() helper, which
prevents it from being used in BPF dataplanes with network interfaces
which have more than one IP addr. See the first patch for more details.
Thanks!
Martynas Pumputis (2):
bpf: Derive source IP addr via bpf_*_fib_lookup()
selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests
include/uapi/linux/bpf.h | 9 +++
net/core/filter.c | 13 +++-
tools/include/uapi/linux/bpf.h | 10 +++
.../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++--
4 files changed, 101 insertions(+), 7 deletions(-)
--
2.42.0
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH bpf 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
2023-09-29 15:07 [PATCH bpf 0/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis
@ 2023-09-29 15:07 ` Martynas Pumputis
2023-09-30 15:09 ` kernel test robot
2023-10-01 10:06 ` kernel test robot
2023-09-29 15:07 ` [PATCH bpf 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis
1 sibling, 2 replies; 5+ messages in thread
From: Martynas Pumputis @ 2023-09-29 15:07 UTC (permalink / raw)
To: bpf
Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov,
Martynas Pumputis
Extend the bpf_fib_lookup() helper by making it to return the source
IPv4/IPv6 address if the BPF_FIB_LOOKUP_SET_SRC flag is set.
For example, the following snippet can be used to derive the desired
source IP address:
struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr };
ret = bpf_skb_fib_lookup(skb, p, sizeof(p),
BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH);
if (ret != BPF_FIB_LKUP_RET_SUCCESS)
return TC_ACT_SHOT;
/* the p.ipv4_src now contains the source address */
The inability to derive the proper source address may cause malfunctions
in BPF-based dataplanes for hosts containing netdevs with more than one
routable IP address or for multi-homed hosts.
For example, Cilium implements packet masquerading in BPF. If an
egressing netdev to which the Cilium's BPF prog is attached has
multiple IP addresses, then only one [hardcoded] IP address can be used for
masquerading. This breaks connectivity if any other IP address should have
been selected instead, for example, when a public and private addresses
are attached to the same egress interface.
The change was tested with Cilium [1].
Nikolay Aleksandrov helped to figure out the IPv6 addr selection.
[1]: https://github.com/cilium/cilium/pull/28283
Signed-off-by: Martynas Pumputis <m@lambda.lt>
---
include/uapi/linux/bpf.h | 9 +++++++++
net/core/filter.c | 13 ++++++++++++-
tools/include/uapi/linux/bpf.h | 10 ++++++++++
3 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0448700890f7..a6bf686eecbc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3257,6 +3257,10 @@ union bpf_attr {
* and *params*->smac will not be set as output. A common
* use case is to call **bpf_redirect_neigh**\ () after
* doing **bpf_fib_lookup**\ ().
+ * **BPF_FIB_LOOKUP_SET_SRC**
+ * Derive and set source IP addr in *params*->ipv{4,6}_src
+ * for the nexthop. If the src addr cannot be derived,
+ * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned.
*
* *ctx* is either **struct xdp_md** for XDP programs or
* **struct sk_buff** tc cls_act programs.
@@ -6953,6 +6957,7 @@ enum {
BPF_FIB_LOOKUP_OUTPUT = (1U << 1),
BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2),
BPF_FIB_LOOKUP_TBID = (1U << 3),
+ BPF_FIB_LOOKUP_SET_SRC = (1U << 4),
};
enum {
@@ -6965,6 +6970,7 @@ enum {
BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */
BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */
BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */
+ BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */
};
struct bpf_fib_lookup {
@@ -6999,6 +7005,9 @@ struct bpf_fib_lookup {
__u32 rt_metric;
};
+ /* input: source address to consider for lookup
+ * output: source address result from lookup
+ */
union {
__be32 ipv4_src;
__u32 ipv6_src[4]; /* in6_addr; network order */
diff --git a/net/core/filter.c b/net/core/filter.c
index a094694899c9..e9cdcf49df62 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5850,6 +5850,9 @@ static int bpf_ipv4_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
params->rt_metric = res.fi->fib_priority;
params->ifindex = dev->ifindex;
+ if (flags & BPF_FIB_LOOKUP_SET_SRC)
+ params->ipv4_src = fib_result_prefsrc(net, &res);
+
/* xdp and cls_bpf programs are run in RCU-bh so
* rcu_read_lock_bh is not needed here
*/
@@ -5992,6 +5995,13 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
params->rt_metric = res.f6i->fib6_metric;
params->ifindex = dev->ifindex;
+ if (flags & BPF_FIB_LOOKUP_SET_SRC) {
+ err = ip6_route_get_saddr(net, res.f6i, &fl6.daddr, 0,
+ (struct in6_addr *)¶ms->ipv6_src);
+ if (err)
+ return BPF_FIB_LKUP_RET_NO_SRC_ADDR;
+ }
+
if (flags & BPF_FIB_LOOKUP_SKIP_NEIGH)
goto set_fwd_params;
@@ -6010,7 +6020,8 @@ static int bpf_ipv6_fib_lookup(struct net *net, struct bpf_fib_lookup *params,
#endif
#define BPF_FIB_LOOKUP_MASK (BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_OUTPUT | \
- BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID)
+ BPF_FIB_LOOKUP_SKIP_NEIGH | BPF_FIB_LOOKUP_TBID | \
+ BPF_FIB_LOOKUP_SET_SRC)
BPF_CALL_4(bpf_xdp_fib_lookup, struct xdp_buff *, ctx,
struct bpf_fib_lookup *, params, int, plen, u32, flags)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0448700890f7..72cd0ca97689 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3257,6 +3257,10 @@ union bpf_attr {
* and *params*->smac will not be set as output. A common
* use case is to call **bpf_redirect_neigh**\ () after
* doing **bpf_fib_lookup**\ ().
+ * **BPF_FIB_LOOKUP_SET_SRC**
+ * Derive and set source IP addr in *params*->ipv{4,6}_src
+ * for the nexthop. If the src addr cannot be derived,
+ * **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned.
*
* *ctx* is either **struct xdp_md** for XDP programs or
* **struct sk_buff** tc cls_act programs.
@@ -6953,6 +6957,7 @@ enum {
BPF_FIB_LOOKUP_OUTPUT = (1U << 1),
BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2),
BPF_FIB_LOOKUP_TBID = (1U << 3),
+ BPF_FIB_LOOKUP_SET_SRC = (1U << 4),
};
enum {
@@ -6965,6 +6970,7 @@ enum {
BPF_FIB_LKUP_RET_UNSUPP_LWT, /* fwd requires encapsulation */
BPF_FIB_LKUP_RET_NO_NEIGH, /* no neighbor entry for nh */
BPF_FIB_LKUP_RET_FRAG_NEEDED, /* fragmentation required to fwd */
+ BPF_FIB_LKUP_RET_NO_SRC_ADDR, /* failed to derive IP src addr */
};
struct bpf_fib_lookup {
@@ -6999,6 +7005,10 @@ struct bpf_fib_lookup {
__u32 rt_metric;
};
+
+ /* input: source address to consider for lookup
+ * output: source address result from lookup
+ */
union {
__be32 ipv4_src;
__u32 ipv6_src[4]; /* in6_addr; network order */
--
2.42.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH bpf 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests
2023-09-29 15:07 [PATCH bpf 0/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis
2023-09-29 15:07 ` [PATCH bpf 1/2] " Martynas Pumputis
@ 2023-09-29 15:07 ` Martynas Pumputis
1 sibling, 0 replies; 5+ messages in thread
From: Martynas Pumputis @ 2023-09-29 15:07 UTC (permalink / raw)
To: bpf
Cc: Daniel Borkmann, netdev, Martin KaFai Lau, Nikolay Aleksandrov,
Martynas Pumputis
This patch extends the existing fib_lookup test suite by adding two test
cases (for each IP family):
* Test source IP selection when default route is used.
* Test source IP selection when an IP route has a preferred src IP addr.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
---
.../selftests/bpf/prog_tests/fib_lookup.c | 76 +++++++++++++++++--
1 file changed, 70 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c
index 2fd05649bad1..1b0ab1dbd4f1 100644
--- a/tools/testing/selftests/bpf/prog_tests/fib_lookup.c
+++ b/tools/testing/selftests/bpf/prog_tests/fib_lookup.c
@@ -11,9 +11,13 @@
#define NS_TEST "fib_lookup_ns"
#define IPV6_IFACE_ADDR "face::face"
+#define IPV6_IFACE_ADDR_SEC "cafe::cafe"
+#define IPV6_ADDR_DST "face::3"
#define IPV6_NUD_FAILED_ADDR "face::1"
#define IPV6_NUD_STALE_ADDR "face::2"
#define IPV4_IFACE_ADDR "10.0.0.254"
+#define IPV4_IFACE_ADDR_SEC "10.1.0.254"
+#define IPV4_ADDR_DST "10.2.0.254"
#define IPV4_NUD_FAILED_ADDR "10.0.0.1"
#define IPV4_NUD_STALE_ADDR "10.0.0.2"
#define IPV4_TBID_ADDR "172.0.0.254"
@@ -31,6 +35,8 @@ struct fib_lookup_test {
const char *desc;
const char *daddr;
int expected_ret;
+ const char *expected_ipv4_src;
+ const char *expected_ipv6_src;
int lookup_flags;
__u32 tbid;
__u8 dmac[6];
@@ -69,6 +75,22 @@ static const struct fib_lookup_test tests[] = {
.daddr = IPV6_TBID_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS,
.lookup_flags = BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID, .tbid = 100,
.dmac = DMAC_INIT2, },
+ { .desc = "IPv4 set src addr",
+ .daddr = IPV4_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS,
+ .expected_ipv4_src = IPV4_IFACE_ADDR,
+ .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, },
+ { .desc = "IPv6 set src addr",
+ .daddr = IPV6_NUD_FAILED_ADDR, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS,
+ .expected_ipv6_src = IPV6_IFACE_ADDR,
+ .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, },
+ { .desc = "IPv4 set prefsrc addr from route",
+ .daddr = IPV4_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS,
+ .expected_ipv4_src = IPV4_IFACE_ADDR_SEC,
+ .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, },
+ { .desc = "IPv6 set prefsrc addr route",
+ .daddr = IPV6_ADDR_DST, .expected_ret = BPF_FIB_LKUP_RET_SUCCESS,
+ .expected_ipv6_src = IPV6_IFACE_ADDR_SEC,
+ .lookup_flags = BPF_FIB_LOOKUP_SET_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH, },
};
static int ifindex;
@@ -97,6 +119,13 @@ static int setup_netns(void)
SYS(fail, "ip neigh add %s dev veth1 nud failed", IPV4_NUD_FAILED_ADDR);
SYS(fail, "ip neigh add %s dev veth1 lladdr %s nud stale", IPV4_NUD_STALE_ADDR, DMAC);
+ /* Setup for prefsrc IP addr selection */
+ SYS(fail, "ip addr add %s/24 dev veth1", IPV4_IFACE_ADDR_SEC);
+ SYS(fail, "ip route add %s/32 dev veth1 src %s", IPV4_ADDR_DST, IPV4_IFACE_ADDR_SEC);
+
+ SYS(fail, "ip addr add %s/64 dev veth1 nodad", IPV6_IFACE_ADDR_SEC);
+ SYS(fail, "ip route add %s/128 dev veth1 src %s", IPV6_ADDR_DST, IPV6_IFACE_ADDR_SEC);
+
/* Setup for tbid lookup tests */
SYS(fail, "ip addr add %s/24 dev veth2", IPV4_TBID_ADDR);
SYS(fail, "ip route del %s/24 dev veth2", IPV4_TBID_NET);
@@ -133,9 +162,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo
if (inet_pton(AF_INET6, test->daddr, params->ipv6_dst) == 1) {
params->family = AF_INET6;
- ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src);
- if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)"))
- return -1;
+ if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) {
+ ret = inet_pton(AF_INET6, IPV6_IFACE_ADDR, params->ipv6_src);
+ if (!ASSERT_EQ(ret, 1, "inet_pton(IPV6_IFACE_ADDR)"))
+ return -1;
+ }
+
return 0;
}
@@ -143,9 +175,12 @@ static int set_lookup_params(struct bpf_fib_lookup *params, const struct fib_loo
if (!ASSERT_EQ(ret, 1, "convert IP[46] address"))
return -1;
params->family = AF_INET;
- ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src);
- if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)"))
- return -1;
+
+ if (!(test->lookup_flags & BPF_FIB_LOOKUP_SET_SRC)) {
+ ret = inet_pton(AF_INET, IPV4_IFACE_ADDR, ¶ms->ipv4_src);
+ if (!ASSERT_EQ(ret, 1, "inet_pton(IPV4_IFACE_ADDR)"))
+ return -1;
+ }
return 0;
}
@@ -207,6 +242,35 @@ void test_fib_lookup(void)
ASSERT_EQ(skel->bss->fib_lookup_ret, tests[i].expected_ret,
"fib_lookup_ret");
+ if (tests[i].expected_ipv4_src) {
+ __be32 expected_ipv4_src;
+
+ ret = inet_pton(AF_INET, tests[i].expected_ipv4_src,
+ &expected_ipv4_src);
+ ASSERT_EQ(ret, 1, "inet_pton(expected_ipv4_src)");
+
+ ASSERT_EQ(fib_params->ipv4_src, expected_ipv4_src,
+ "fib_lookup ipv4 src");
+ }
+ if (tests[i].expected_ipv6_src) {
+ __u32 expected_ipv6_src[4];
+
+ ret = inet_pton(AF_INET6, tests[i].expected_ipv6_src,
+ expected_ipv6_src);
+ ASSERT_EQ(ret, 1, "inet_pton(expected_ipv6_src)");
+
+ ret = memcmp(expected_ipv6_src, fib_params->ipv6_src,
+ sizeof(fib_params->ipv6_src));
+ if (!ASSERT_EQ(ret, 0, "fib_lookup ipv6 src")) {
+ char src_ip6[64];
+
+ inet_ntop(AF_INET6, fib_params->ipv6_src, src_ip6,
+ sizeof(src_ip6));
+ printf("ipv6 expected %s actual %s ",
+ tests[i].expected_ipv6_src, src_ip6);
+ }
+ }
+
ret = memcmp(tests[i].dmac, fib_params->dmac, sizeof(tests[i].dmac));
if (!ASSERT_EQ(ret, 0, "dmac not match")) {
char expected[18], actual[18];
--
2.42.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
2023-09-29 15:07 ` [PATCH bpf 1/2] " Martynas Pumputis
@ 2023-09-30 15:09 ` kernel test robot
2023-10-01 10:06 ` kernel test robot
1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-09-30 15:09 UTC (permalink / raw)
To: Martynas Pumputis, bpf
Cc: oe-kbuild-all, Daniel Borkmann, netdev, Martin KaFai Lau,
Nikolay Aleksandrov, Martynas Pumputis
Hi Martynas,
kernel test robot noticed the following build errors:
[auto build test ERROR on bpf/master]
url: https://github.com/intel-lab-lkp/linux/commits/Martynas-Pumputis/bpf-Derive-source-IP-addr-via-bpf_-_fib_lookup/20230929-231536
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
patch link: https://lore.kernel.org/r/20230929150717.120463-2-m%40lambda.lt
patch subject: [PATCH bpf 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
config: i386-buildonly-randconfig-006-20230930 (https://download.01.org/0day-ci/archive/20230930/202309302249.FeC1Gbp9-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230930/202309302249.FeC1Gbp9-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309302249.FeC1Gbp9-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: net/core/filter.o: in function `ip6_route_get_saddr.constprop.0':
>> filter.c:(.text+0xaebe): undefined reference to `ipv6_dev_get_saddr'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH bpf 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
2023-09-29 15:07 ` [PATCH bpf 1/2] " Martynas Pumputis
2023-09-30 15:09 ` kernel test robot
@ 2023-10-01 10:06 ` kernel test robot
1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-10-01 10:06 UTC (permalink / raw)
To: Martynas Pumputis, bpf
Cc: oe-kbuild-all, Daniel Borkmann, netdev, Martin KaFai Lau,
Nikolay Aleksandrov, Martynas Pumputis
Hi Martynas,
kernel test robot noticed the following build errors:
[auto build test ERROR on bpf/master]
url: https://github.com/intel-lab-lkp/linux/commits/Martynas-Pumputis/bpf-Derive-source-IP-addr-via-bpf_-_fib_lookup/20230929-231536
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
patch link: https://lore.kernel.org/r/20230929150717.120463-2-m%40lambda.lt
patch subject: [PATCH bpf 1/2] bpf: Derive source IP addr via bpf_*_fib_lookup()
config: csky-allmodconfig (https://download.01.org/0day-ci/archive/20231001/202310011747.2WjYkVa8-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231001/202310011747.2WjYkVa8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310011747.2WjYkVa8-lkp@intel.com/
All errors (new ones prefixed by >>):
csky-linux-ld: net/core/filter.o: in function `ip6_route_get_saddr.constprop.0':
filter.c:(.text+0x7594): undefined reference to `ipv6_dev_get_saddr'
>> csky-linux-ld: filter.c:(.text+0x75c0): undefined reference to `ipv6_dev_get_saddr'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-10-01 10:08 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-29 15:07 [PATCH bpf 0/2] bpf: Derive source IP addr via bpf_*_fib_lookup() Martynas Pumputis
2023-09-29 15:07 ` [PATCH bpf 1/2] " Martynas Pumputis
2023-09-30 15:09 ` kernel test robot
2023-10-01 10:06 ` kernel test robot
2023-09-29 15:07 ` [PATCH bpf 2/2] selftests/bpf: Add BPF_FIB_LOOKUP_SET_SRC tests Martynas Pumputis
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).