* PROBLEM: Old content of /proc/net after switching network namespace
@ 2019-01-04 8:21 Mateusz Stępień
2019-01-04 11:09 ` Alexey Dobriyan
From: Mateusz Stępień @ 2019-01-04 8:21 UTC
To: adobriyan; +Cc: linux-fsdevel, linux-kernel
Hello everyone,
After changing the network namespace using setns(), the content of /proc/net
still represents the original namespace.
It looks like procfs dentries are not properly invalidated in the dcache
after the namespace switch.
This happens only when you read the content of /proc/net before changing
the namespace.
The problem is reproducible in 4.19.13 but not in 4.14.X.
Bisecting the stable kernel tree shows that the commit
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1da4d377f943fe4194ffb9fb9c26cc58fad4dd24
introduced the problem.
Reverting that commit resolves it.
MCVE (slightly modified example from [man 2 setns]):
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

/* Print /proc/net/dev as seen by the calling process. */
void print_dev(void)
{
    char buf[2048] = {0};
    int fd2;

    fd2 = open("/proc/net/dev", O_RDONLY);
    if (fd2 == -1)
        errExit("open /proc/net/dev");

    /* Read at most sizeof(buf) - 1 bytes so buf stays NUL-terminated. */
    if (read(fd2, buf, sizeof(buf) - 1) == -1)
        errExit("read");

    printf("%s", buf);
    close(fd2);
}

int
main(int argc, char *argv[])
{
    int fd;

    if (argc < 2) {
        fprintf(stderr, "%s /proc/PID/ns/net\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    printf("before namespace switch =========\n");
    print_dev();

    fd = open(argv[1], O_RDONLY);   /* Get file descriptor for namespace */
    if (fd == -1)
        errExit("open");

    if (setns(fd, 0) == -1)         /* Join that namespace */
        errExit("setns");

    printf("after namespace switch ++++++++++\n");
    print_dev();

    return 0;
}
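To rule out the possibility that setns() itself did not take effect, the
namespace identity can be compared directly: two /proc/PID/ns/net paths refer
to the same network namespace when stat() reports the same device ID and inode
number for both. The following is only a sketch; the helper name same_ns() is
made up for illustration and is not part of the MCVE.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (name invented for this sketch).
 * Returns 1 if both paths refer to the same object, 0 if they differ,
 * and -1 if stat() fails on either path. For /proc/PID/ns/net files,
 * matching st_dev and st_ino means the same network namespace.
 */
int same_ns(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;

    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Calling same_ns("/proc/self/ns/net", argv[1]) right after setns() in the MCVE
would be expected to return 1 if the switch succeeded, which helps separate a
failed switch from stale /proc/net content.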
Steps to reproduce (assuming we have an interface named enp0s9):
ip netns add test
ip link set dev enp0s9 netns test
ip netns exec test sleep 30 &
gcc -o mcve mcve.c # mcve.c contains above C code
./mcve /proc/$(pidof sleep)/ns/net
# before namespace switch =========
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 149625 1117 0 0 0 0 0 1 61664 485 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
# after namespace switch ++++++++++
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# br0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s3: 150813 1135 0 0 0 0 0 1 64348 503 0 0 0 0 0 0
# docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s8: 17086 60 0 0 0 0 0 29 1006 13 0 0 0 0 0 0
ip netns exec test cat /proc/net/dev
## Should display
# Inter-| Receive | Transmit
# face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
# lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# enp0s9: 10438 33 0 0 0 0 0 17 936 12 0 0 0 0 0 0
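The comparison above can also be done in code rather than by eye: after
setns(), the text read from /proc/net/dev should contain a line for enp0s9.
Below is a minimal sketch of such a check. The helper name has_iface() is
hypothetical, and it assumes the usual /proc/net/dev layout in which each
interface line starts with optional padding followed by "name:".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (name invented for this sketch).
 * Returns true if the /proc/net/dev-style text contains a line
 * for the interface called 'name'.
 */
bool has_iface(const char *text, const char *name)
{
    size_t len = strlen(name);
    const char *line = text;

    while (line && *line) {
        /* Interface names are right-aligned, so skip the padding. */
        const char *p = line + strspn(line, " \t");

        /* Require the colon so that "enp0s" does not match "enp0s3". */
        if (strncmp(p, name, len) == 0 && p[len] == ':')
            return true;

        line = strchr(line, '\n');
        if (line)
            line++;     /* step past the newline */
    }
    return false;
}
```

If print_dev() in the MCVE were changed to pass its buffer to this helper,
has_iface(buf, "enp0s9") after the switch would be expected to be true on an
unaffected kernel and false on an affected one.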
Output from awk -f scripts/ver_linux:
Linux test-agent 4.19.13 #1 SMP PREEMPT Thu Jan 3 12:03:20 UTC 2019 x86_64 GNU/Linux
Util-linux 2.29.2
Mount 2.29.2
Module-init-tools 23
E2fsprogs 1.43.4
Linux C Library 2.24
Dynamic linker (ldd) 2.24
Linux C++ Library 6.0.22
Procps 3.3.12
Net-tools 2.10
Sh-utils 8.26
Udev 232
Modules Loaded ahci ata_generic ata_piix crc32c_intel e1000 ehci_hcd ehci_pci i2c_core i2c_piix4 libahci serio_raw usb_common usbcore
* Re: PROBLEM: Old content of /proc/net after switching network namespace
From: Alexey Dobriyan @ 2019-01-04 11:09 UTC
To: Mateusz Stępień; +Cc: linux-fsdevel, linux-kernel
On Fri, Jan 04, 2019 at 09:21:23AM +0100, Mateusz Stępień wrote:
> After changing the network namespace using setns(), the content of /proc/net
> still represents the original namespace.
> It looks like procfs dentries are not properly invalidated in the dcache
> after the namespace switch.
> This happens only when you read the content of /proc/net before changing
> the namespace.
> The problem is reproducible in 4.19.13 but not in 4.14.X.
> Bisecting the stable kernel tree shows that the commit
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=1da4d377f943fe4194ffb9fb9c26cc58fad4dd24
> introduced the problem.
> Reverting that commit resolves it.
>
> MCVE (slightly modified example from [man 2 setns]):
open /proc/net/dev
read
close
setns
open /proc/net/dev
read
close
This bug was discussed recently and Al even posted how to fix it
properly.
Try this horror patch meanwhile:
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -219,19 +219,30 @@ void proc_free_inum(unsigned int inum)
 	ida_simple_remove(&proc_inum_ida, inum - PROC_DYNAMIC_FIRST);
 }
 
+static bool net_root(const struct proc_dir_entry *pde)
+{
+	return pde->parent == &proc_root && pde->mode == 0;
+}
+
 static int proc_misc_d_revalidate(struct dentry *dentry, unsigned int flags)
 {
+	const struct proc_dir_entry *pde;
+
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	if (atomic_read(&PDE(d_inode(dentry))->in_use) < 0)
+	pde = PDE(d_inode(dentry));
+
+	if (atomic_read(&pde->in_use) < 0 || net_root(pde))
 		return 0; /* revalidate */
 	return 1;
 }
 
 static int proc_misc_d_delete(const struct dentry *dentry)
 {
-	return atomic_read(&PDE(d_inode(dentry))->in_use) < 0;
+	const struct proc_dir_entry *pde = PDE(d_inode(dentry));
+
+	return atomic_read(&pde->in_use) < 0 || net_root(pde);
 }
 
 static const struct dentry_operations proc_misc_dentry_ops = {
* Re: PROBLEM: Old content of /proc/net after switching network namespace
From: Mateusz Stępień @ 2019-01-07 8:54 UTC
To: Alexey Dobriyan; +Cc: linux-fsdevel, linux-kernel
Thank you for the quick response. Unfortunately, the patch does not work;
the problem still persists.
I saw the other mail thread, and it looks like the proper solution
described by Al will take some time. I don't know if we can afford to
write a proper patch ourselves right now. My understanding is that this
is a regression, so my question is: can we expect a proper patch in the
coming weeks? If not, will reverting the patch at fault have any other
negative side effects? Of course it will bring back the behaviour it was
supposed to fix, but are there any other issues?
Thanks,
Mateusz
* Re: PROBLEM: Old content of /proc/net after switching network namespace
From: Alexey Dobriyan @ 2019-01-07 12:39 UTC
To: Mateusz Stępień; +Cc: linux-fsdevel, linux-kernel
On Mon, Jan 07, 2019 at 09:54:26AM +0100, Mateusz Stępień wrote:
> Thank you for the quick response. Unfortunately, the patch does not work;
> the problem still persists.
OK, I reproduced the bug.
> I saw the other mail thread, and it looks like the proper solution
> described by Al will take some time. I don't know if we can afford to
> write a proper patch ourselves right now. My understanding is that this
> is a regression, so my question is: can we expect a proper patch in the
> coming weeks? If not, will reverting the patch at fault have any other
> negative side effects? Of course it will bring back the behaviour it was
> supposed to fix, but are there any other issues?
Other issues, no.