From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kirill Tkhai
Subject: Re: [PATCH] iptables: Per-net ns lock
Date: Mon, 23 Apr 2018 12:03:15 +0300
Message-ID: <36671bf4-a32e-e7c0-cc26-d3926f9d443e@virtuozzo.com>
References: <152423174378.4473.8708420767261754117.stgit@localhost.localdomain>
 <20180420230620.GA23540@outlook.office365.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Cc: fw@strlen.de, netdev@vger.kernel.org, pablo@netfilter.org,
 rstoyanov1@gmail.com, ptikhomirov@virtuozzo.com
To: Andrei Vagin
Received: from mail-eopbgr10094.outbound.protection.outlook.com ([40.107.1.94]:34562
 "EHLO EUR02-HE1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL)
 by vger.kernel.org with ESMTP id S1753109AbeDWJDX (ORCPT );
 Mon, 23 Apr 2018 05:03:23 -0400
In-Reply-To: <20180420230620.GA23540@outlook.office365.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org

On 21.04.2018 02:06, Andrei Vagin wrote:
> On Fri, Apr 20, 2018 at 04:42:47PM +0300, Kirill Tkhai wrote:
>> Containers want to restore their own net ns,
>> while they may have no their own mnt ns.
>> This case they share host's /run/xtables.lock
>> file, but they may not have permission to open
>> it.
>>
>> Patch makes /run/xtables.lock to be per-namespace,
>> i.e., to refer to the caller task's net ns.
>>
>> Signed-off-by: Kirill Tkhai
>> ---
>>  iptables/xshared.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/iptables/xshared.c b/iptables/xshared.c
>> index 06db72d4..b6dbe4e7 100644
>> --- a/iptables/xshared.c
>> +++ b/iptables/xshared.c
>> @@ -254,7 +254,12 @@ static int xtables_lock(int wait, struct timeval *wait_interval)
>>  	time_left.tv_sec = wait;
>>  	time_left.tv_usec = 0;
>>
>> -	fd = open(XT_LOCK_NAME, O_CREAT, 0600);
>> +	if (symlink("/proc/self/ns/net", XT_LOCK_NAME) != 0 &&
>
> Any user can open this file and take the lock.
> Before this patch, the lock file could be opened only by the root
> user. It means that any user will be able to block all iptables
> operations. Do I miss something?

Yes, this is the idea. It looks like the only way to stay compatible
with old iptables while allowing rules to be set from nested net
namespaces. It also allows synchronization with containers that have
their own mount namespace.

There is a comparable case among the existing kernel interfaces: an
ordinary user can open a file read-only on a partition, and this
prevents root from unmounting it. This is never considered a problem,
and nobody makes partitions accessible only to root in 0600 mode to
prevent it. There is lsof, and it's easy to find the lock owner.

The same applies to iptables. The lock is not a critical protection;
it is just an attempt to let different users synchronize with each
other. The real protection happens in the setsockopt() path.

> [root@fc24 ~]# ln -s /proc/self/ns/net /run/xtables.lock2
> [root@fc24 ~]# ls -l /run/xtables.lock2
> lrwxrwxrwx 1 root root 17 Apr 21 01:52 /run/xtables.lock2 -> /proc/self/ns/net
> [root@fc24 ~]# ls -l /proc/self/ns/net
> lrwxrwxrwx 1 root root 0 Apr 21 01:52 /proc/self/ns/net -> net:[4026531993]
>
> Thanks,
> Andrei
>
>> +	    errno != EEXIST) {
>> +		fprintf(stderr, "Fatal: can't create lock file\n");
>
> 	fprintf(stderr, "Fatal: can't create lock file %s: %s\n",
> 		XT_LOCK_NAME, strerror(errno));
>
>> +		return XT_LOCK_FAILED;
>> +	}
>> +	fd = open(XT_LOCK_NAME, O_RDONLY);
>>  	if (fd < 0) {
>>  		fprintf(stderr, "Fatal: can't open lock file %s: %s\n",
>>  			XT_LOCK_NAME, strerror(errno));

Kirill
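
The locking flow discussed in this thread can be sketched as a small
standalone helper. This is only an illustration, not the iptables code:
lock_via_symlink() and its two parameters are hypothetical names, and
the sketch assumes the lock itself is taken with flock() once the file
is open. It creates the symlink (tolerating an already existing lock
file), opens it read-only, and makes one non-blocking attempt at an
exclusive lock.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: make lock_path a symlink to target, open it
 * read-only and take an exclusive flock() on the resulting fd.
 * Returns the locked fd, or -1 on failure.
 */
static int lock_via_symlink(const char *target, const char *lock_path)
{
	int fd;

	assert(target != NULL && lock_path != NULL);

	/* A lock file that already exists (regular file or symlink) is fine. */
	if (symlink(target, lock_path) != 0 && errno != EEXIST) {
		fprintf(stderr, "can't create lock file %s: %s\n",
			lock_path, strerror(errno));
		return -1;
	}

	/* open() follows the symlink, so the fd refers to the target. */
	fd = open(lock_path, O_RDONLY);
	if (fd < 0) {
		fprintf(stderr, "can't open lock file %s: %s\n",
			lock_path, strerror(errno));
		return -1;
	}

	/* Single non-blocking attempt; a real caller would retry until a timeout. */
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		fprintf(stderr, "can't lock %s: %s\n",
			lock_path, strerror(errno));
		close(fd);
		return -1;
	}

	return fd;
}
```

In the scenario from the patch, target would be "/proc/self/ns/net" and
lock_path would be XT_LOCK_NAME, so each net namespace resolves to its
own lock object. Since the fd only needs O_RDONLY, any user who can
resolve the symlink can take the lock, which is exactly the trade-off
raised in the review.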