All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH iproute2] ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped
@ 2021-06-22 14:44 Alexander Mikhalitsyn
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Mikhalitsyn @ 2021-06-22 14:44 UTC (permalink / raw)
  To: netdev
  Cc: Alexander Mikhalitsyn, David Ahern, Stephen Hemminger,
	Andrei Vagin, Alexander Mikhalitsyn

We started to use in-kernel filtering feature which allows to get only needed
tables (see iproute_dump_filter()). From the kernel side it's implemented in
net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib).
The problem here is that behaviour of "ip route save" was changed after
c7e6371bc ("ip route: Add protocol, table id and device to dump request").
If filters are used, then kernel returns ENOENT error if requested table is absent,
but in newly created net namespace even RT_TABLE_MAIN table doesn't exist.
It is really allocated, for instance, after issuing "ip l set lo up".

Reproducer is fairly simple:
Error: ipv4: FIB table does not exist.
Dump terminated

Expected result here is to get empty dump file (as it was before this change).

This affects on CRIU [1] because we use ip route save in dump process, to workaround
problem in tests we just put loopback interface up in each net namespace.

Links:
[1] https://github.com/checkpoint-restore/criu/issues/747
[2] https://www.spinics.net/lists/netdev/msg559739.html

Fixes: c7e6371bc ("ip route: Add protocol, table id and device to dump request")

Cc: David Ahern <dsahern@kernel.org>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
---
 ip/iproute.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ip/iproute.c b/ip/iproute.c
index 5853f026..b70acc00 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -1734,6 +1734,7 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 	char *od = NULL;
 	unsigned int mark = 0;
 	rtnl_filter_t filter_fn;
+	int ret;
 
 	if (action == IPROUTE_SAVE) {
 		if (save_route_prep())
@@ -1939,7 +1940,11 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 
 	new_json_obj(json);
 
-	if (rtnl_dump_filter(&rth, filter_fn, stdout) < 0) {
+	ret = rtnl_dump_filter(&rth, filter_fn, stdout);
+
+	/* Let's ignore ENOENT error if we want to dump RT_TABLE_MAIN table */
+	if (ret < 0 &&
+	    !(errno == ENOENT && filter.tb == RT_TABLE_MAIN)) {
 		fprintf(stderr, "Dump terminated\n");
 		return -2;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH iproute2] ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped
  2021-06-23 15:36 ` David Ahern
@ 2021-06-23 16:11   ` Alexander Mikhalitsyn
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Mikhalitsyn @ 2021-06-23 16:11 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, David Ahern, Stephen Hemminger, Andrei Vagin,
	Alexander Mikhalitsyn

On Wed, 23 Jun 2021 09:36:29 -0600
David Ahern <dsahern@gmail.com> wrote:

> On 6/22/21 9:03 AM, Alexander Mikhalitsyn wrote:
> > We started to use in-kernel filtering feature which allows to get only needed
> > tables (see iproute_dump_filter()). From the kernel side it's implemented in
> > net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib).
> > The problem here is that behaviour of "ip route save" was changed after
> > c7e6371bc ("ip route: Add protocol, table id and device to dump request").
> > If filters are used, then kernel returns ENOENT error if requested table is absent,
> > but in newly created net namespace even RT_TABLE_MAIN table doesn't exist.
> > It is really allocated, for instance, after issuing "ip l set lo up".
> > 
> > Reproducer is fairly simple:
> > $ unshare -n ip route save > dump
> > Error: ipv4: FIB table does not exist.
> > Dump terminated
> 
> The above command on 5.4 kernel with corresponding iproute2 does not
> show that error. Is your kernel compiled with CONFIG_IP_MULTIPLE_TABLES
> enabled?
> 
Yes it is.
$ grep CONFIG_IP_MULTIPLE_TABLES .config
CONFIG_IP_MULTIPLE_TABLES=y

> > 
> > Expected result here is to get empty dump file (as it was before this change).
> > 
> > This affects on CRIU [1] because we use ip route save in dump process, to workaround
> > problem in tests we just put loopback interface up in each net namespace.
> > Other users also met this problem [2].
> > 
> > Links:
> > [1] https://github.com/checkpoint-restore/criu/issues/747
> > [2] https://www.spinics.net/lists/netdev/msg559739.html
> > 
> > Fixes: c7e6371bc ("ip route: Add protocol, table id and device to dump request")
> > 
> > Cc: David Ahern <dsahern@kernel.org>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>
> > Cc: Andrei Vagin <avagin@gmail.com>
> > Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
> > Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
> > ---
> >  ip/iproute.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/ip/iproute.c b/ip/iproute.c
> > index 5853f026..b70acc00 100644
> > --- a/ip/iproute.c
> > +++ b/ip/iproute.c
> > @@ -1734,6 +1734,7 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
> >  	char *od = NULL;
> >  	unsigned int mark = 0;
> >  	rtnl_filter_t filter_fn;
> > +	int ret;
> >  
> >  	if (action == IPROUTE_SAVE) {
> >  		if (save_route_prep())
> > @@ -1939,7 +1940,11 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
> >  
> >  	new_json_obj(json);
> >  
> > -	if (rtnl_dump_filter(&rth, filter_fn, stdout) < 0) {
> > +	ret = rtnl_dump_filter(&rth, filter_fn, stdout);
> > +
> > +	/* Let's ignore ENOENT error if we want to dump RT_TABLE_MAIN table */
> > +	if (ret < 0 &&
> 
> ret temp variable is not needed; just add the extra checks.

Sure, thanks!
I will send v2 if all fine in general with this approach to fix the problem.

> 
> > +	    !(errno == ENOENT && filter.tb == RT_TABLE_MAIN)) {
> >  		fprintf(stderr, "Dump terminated\n");
> >  		return -2;
> >  	}
> > 
> 
> This looks fine to me, but I want clarification on the kernel config. As
> I recall with multiple tables and fib rules tables are created when net
> namespace is created.
I've traced how fib tables allocated (fib_new_table function) during
$ ip l set lo up
and stack looks like that:
ip   740 [003] 99894.075766: probe:fib_new_table: (ffffffffb08ff9a0)
        ffffffffb08ff9a1 fib_new_table+0x1 ([kernel.kallsyms])
        ffffffffb09001d1 fib_magic.isra.24+0xc1 ([kernel.kallsyms])
        ffffffffb0901b3d fib_add_ifaddr+0x16d ([kernel.kallsyms])
        ffffffffb0901be5 fib_netdev_event+0x95 ([kernel.kallsyms])
        ffffffffafed5457 notifier_call_chain+0x47 ([kernel.kallsyms])
        ffffffffb07f249b __dev_notify_flags+0x5b ([kernel.kallsyms])
        ffffffffb07f2c48 dev_change_flags+0x48 ([kernel.kallsyms])
        ffffffffb0805d34 do_setlink+0x314 ([kernel.kallsyms])
        ffffffffb080ad9d __rtnl_newlink+0x53d ([kernel.kallsyms])
        ffffffffb080b143 rtnl_newlink+0x43 ([kernel.kallsyms])
        ffffffffb08048ba rtnetlink_rcv_msg+0x22a ([kernel.kallsyms])
        ffffffffb0848dfc netlink_rcv_skb+0x4c ([kernel.kallsyms])
        ffffffffb084868d netlink_unicast+0x21d ([kernel.kallsyms])
        ffffffffb08488fe netlink_sendmsg+0x22e ([kernel.kallsyms])
        ffffffffb07cad9c sock_sendmsg+0x4c ([kernel.kallsyms])
        ffffffffb07cb0bb ____sys_sendmsg+0x1eb ([kernel.kallsyms])
        ffffffffb07cc74c ___sys_sendmsg+0x7c ([kernel.kallsyms])
        ffffffffb07cc817 __sys_sendmsg+0x57 ([kernel.kallsyms])
        ffffffffafe0279b do_syscall_64+0x5b ([kernel.kallsyms])
        ffffffffb0c000ad entry_SYSCALL_64_after_hwframe+0x65 ([kernel.kallsyms])
            7f556c716d78 __libc_sendmsg+0x18 (/usr/lib64/libc-2.28.so)

During trace of:
$ unshare -n
I see no fib_new_table() calls.

Thank you very much for your attention to the patch.

Regards,
Alex

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH iproute2] ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped
  2021-06-22 15:03 Alexander Mikhalitsyn
@ 2021-06-23 15:36 ` David Ahern
  2021-06-23 16:11   ` Alexander Mikhalitsyn
  0 siblings, 1 reply; 4+ messages in thread
From: David Ahern @ 2021-06-23 15:36 UTC (permalink / raw)
  To: Alexander Mikhalitsyn, netdev
  Cc: David Ahern, Stephen Hemminger, Andrei Vagin, Alexander Mikhalitsyn

On 6/22/21 9:03 AM, Alexander Mikhalitsyn wrote:
> We started to use in-kernel filtering feature which allows to get only needed
> tables (see iproute_dump_filter()). From the kernel side it's implemented in
> net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib).
> The problem here is that behaviour of "ip route save" was changed after
> c7e6371bc ("ip route: Add protocol, table id and device to dump request").
> If filters are used, then kernel returns ENOENT error if requested table is absent,
> but in newly created net namespace even RT_TABLE_MAIN table doesn't exist.
> It is really allocated, for instance, after issuing "ip l set lo up".
> 
> Reproducer is fairly simple:
> $ unshare -n ip route save > dump
> Error: ipv4: FIB table does not exist.
> Dump terminated

The above command on 5.4 kernel with corresponding iproute2 does not
show that error. Is your kernel compiled with CONFIG_IP_MULTIPLE_TABLES
enabled?

> 
> Expected result here is to get empty dump file (as it was before this change).
> 
> This affects on CRIU [1] because we use ip route save in dump process, to workaround
> problem in tests we just put loopback interface up in each net namespace.
> Other users also met this problem [2].
> 
> Links:
> [1] https://github.com/checkpoint-restore/criu/issues/747
> [2] https://www.spinics.net/lists/netdev/msg559739.html
> 
> Fixes: c7e6371bc ("ip route: Add protocol, table id and device to dump request")
> 
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Andrei Vagin <avagin@gmail.com>
> Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
> ---
>  ip/iproute.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/ip/iproute.c b/ip/iproute.c
> index 5853f026..b70acc00 100644
> --- a/ip/iproute.c
> +++ b/ip/iproute.c
> @@ -1734,6 +1734,7 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
>  	char *od = NULL;
>  	unsigned int mark = 0;
>  	rtnl_filter_t filter_fn;
> +	int ret;
>  
>  	if (action == IPROUTE_SAVE) {
>  		if (save_route_prep())
> @@ -1939,7 +1940,11 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
>  
>  	new_json_obj(json);
>  
> -	if (rtnl_dump_filter(&rth, filter_fn, stdout) < 0) {
> +	ret = rtnl_dump_filter(&rth, filter_fn, stdout);
> +
> +	/* Let's ignore ENOENT error if we want to dump RT_TABLE_MAIN table */
> +	if (ret < 0 &&

ret temp variable is not needed; just add the extra checks.

> +	    !(errno == ENOENT && filter.tb == RT_TABLE_MAIN)) {
>  		fprintf(stderr, "Dump terminated\n");
>  		return -2;
>  	}
> 

This looks fine to me, but I want clarification on the kernel config. As
I recall with multiple tables and fib rules tables are created when net
namespace is created.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH iproute2] ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped
@ 2021-06-22 15:03 Alexander Mikhalitsyn
  2021-06-23 15:36 ` David Ahern
  0 siblings, 1 reply; 4+ messages in thread
From: Alexander Mikhalitsyn @ 2021-06-22 15:03 UTC (permalink / raw)
  To: netdev
  Cc: Alexander Mikhalitsyn, David Ahern, Stephen Hemminger,
	Andrei Vagin, Alexander Mikhalitsyn

We started to use in-kernel filtering feature which allows to get only needed
tables (see iproute_dump_filter()). From the kernel side it's implemented in
net/ipv4/fib_frontend.c (inet_dump_fib), net/ipv6/ip6_fib.c (inet6_dump_fib).
The problem here is that behaviour of "ip route save" was changed after
c7e6371bc ("ip route: Add protocol, table id and device to dump request").
If filters are used, then kernel returns ENOENT error if requested table is absent,
but in newly created net namespace even RT_TABLE_MAIN table doesn't exist.
It is really allocated, for instance, after issuing "ip l set lo up".

Reproducer is fairly simple:
$ unshare -n ip route save > dump
Error: ipv4: FIB table does not exist.
Dump terminated

Expected result here is to get empty dump file (as it was before this change).

This affects on CRIU [1] because we use ip route save in dump process, to workaround
problem in tests we just put loopback interface up in each net namespace.
Other users also met this problem [2].

Links:
[1] https://github.com/checkpoint-restore/criu/issues/747
[2] https://www.spinics.net/lists/netdev/msg559739.html

Fixes: c7e6371bc ("ip route: Add protocol, table id and device to dump request")

Cc: David Ahern <dsahern@kernel.org>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
---
 ip/iproute.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ip/iproute.c b/ip/iproute.c
index 5853f026..b70acc00 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -1734,6 +1734,7 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 	char *od = NULL;
 	unsigned int mark = 0;
 	rtnl_filter_t filter_fn;
+	int ret;
 
 	if (action == IPROUTE_SAVE) {
 		if (save_route_prep())
@@ -1939,7 +1940,11 @@ static int iproute_list_flush_or_save(int argc, char **argv, int action)
 
 	new_json_obj(json);
 
-	if (rtnl_dump_filter(&rth, filter_fn, stdout) < 0) {
+	ret = rtnl_dump_filter(&rth, filter_fn, stdout);
+
+	/* Let's ignore ENOENT error if we want to dump RT_TABLE_MAIN table */
+	if (ret < 0 &&
+	    !(errno == ENOENT && filter.tb == RT_TABLE_MAIN)) {
 		fprintf(stderr, "Dump terminated\n");
 		return -2;
 	}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-06-23 16:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-22 14:44 [PATCH iproute2] ip route: ignore ENOENT during save if RT_TABLE_MAIN is being dumped Alexander Mikhalitsyn
2021-06-22 15:03 Alexander Mikhalitsyn
2021-06-23 15:36 ` David Ahern
2021-06-23 16:11   ` Alexander Mikhalitsyn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.