From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752974Ab1LSMYV (ORCPT ); Mon, 19 Dec 2011 07:24:21 -0500
Received: from mailhub.sw.ru ([195.214.232.25]:14979 "EHLO relay.sw.ru"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752530Ab1LSMYU (ORCPT ); Mon, 19 Dec 2011 07:24:20 -0500
Message-ID: <4EEF2C9A.8000403@parallels.com>
Date: Mon, 19 Dec 2011 16:22:50 +0400
From: Stanislav Kinsbursky
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23)
        Gecko/20110922 Lightning/1.0b2 Thunderbird/3.1.15
MIME-Version: 1.0
To: "Eric W. Biederman"
CC: "Trond.Myklebust@netapp.com", "linux-nfs@vger.kernel.org",
        Pavel Emelianov, "neilb@suse.de", "netdev@vger.kernel.org",
        "linux-kernel@vger.kernel.org", James Bottomley,
        "bfields@fieldses.org", "davem@davemloft.net", "devel@openvz.org"
Subject: Re: [PATCH 01/11] SYSCTL: export root and set handling routines
References: <20111214103602.3991.20990.stgit@localhost6.localdomain6>
        <20111214104449.3991.61989.stgit@localhost6.localdomain6>
        <4EEEFC54.10700@parallels.com>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

19.12.2011 14:15, Eric W. Biederman wrote:
> Stanislav Kinsbursky writes:
>
>> 18.12.2011 02:25, Eric W. Biederman wrote:
>>> Stanislav Kinsbursky writes:
>>>
>>>> These routines are required for making SUNRPC sysctls per network namespace
>>>> context.
>>>
>>> Why does sunrpc require its own sysctl root?  You should be able to use
>>> the generic per network namespace root and call it good.
>>>
>>> What makes register_net_sysctl_table and register_net_sysctl_ro_table
>>> unsuitable for sunrpc?  I skimmed through your patches and I haven't
>>> seen anything obvious.
>>>
>>> Eric
>>>
>>
>> Hello, Eric. Sorry for the lack of information.
>> I was considering two ways to make these sysctls per net ns:
>>
>> 1) Use register_net_sysctl_table and register_net_sysctl_ro_table as you
>> mentioned. This was easy and cheap, but it also means that all user-space
>> programs tuning SUNRPC will be broken (since all sysctls are currently located
>> in "/proc/sys/sunrpc/").
>
> Nope.  That is a misunderstanding.  register_net_sysctl_table works for
> anything under /proc/sys.
>
>> 2) Export sysctl root creation routines and make a per-net SUNRPC sysctl
>> root. This approach allows making any part of the sysctl tree per namespace
>> context and thus leaves user-space stuff unchanged.
>>
>> BTW, NFS and LockD also have their own sysctls ("/proc/sys/fs/nfs/").
>> Also because of them I decided that it would be better to export SYSCTL
>> root creation routines instead of breaking compatibility for all NFS layers by
>> moving all sysctls under the /proc/sys/net/ directory.
>>
>> Do you feel that it was a bad decision?
>
> I think it was a misinformed decision.
>
> I fully support not breaking userspace by moving where the sysctl files
> are.  If something sounds like I am suggesting moving sysctl files there
> is a miscommunication somewhere.
>
> The concept of a sysctl root as I had envisioned it and essentially as it
> is implemented was a per namespace sysctl tree.  Those sysctl trees are
> then unioned together when presented to user space.  There should only
> be one root per namespace.
>
> In practice what this means is that register_net_sysctl_table should
> work for any sysctl file anywhere under /proc/sys.  I think
> register_net_sysctl_table is the right solution for your problem.  The
> only possible caveat I can think of is you might hit Al's performance
> optimizations and need to create a common empty directory first with
> register_sysctl_paths.
>

Sorry, but I forgot to mention one more important goal I would like to achieve:
I want to manage sysctl variables in the context of the mount owner, not the
viewer.
IOW, imagine that we have two network namespaces: "A" and "B". Each of them
has its own net sysctl root. And we have a per-net sysctl "/proc/sys/var".
For ns "A" the variable was set to 0, and for "B" to 1.
B's "/proc/sys/var" is also accessible from the "A" namespace
("/chroot_path/proc/sys/var", for example).
With this configuration I want to read "1" from both namespaces: the owner "B"
(/proc/sys/var) and "A" ("/chroot_path/proc/sys/var").
It looks like simply using register_net_sysctl_table doesn't allow this,
because the current net ns is used. To achieve this goal I need my own sysctl
set for SUNRPC, like it was done for network namespaces.

> ....
> That said since I am in the process of rewriting things some of this
> may change a little bit, but hopefully not in ways that immediately
> affect the users of register_sysctl_table.
>
> Don't use register_net_sysctl_ro_table.  I think what the implementors
> actually wanted was register_net_sysctl_table(&init_net, ...) and didn't
> know it.
>
> Don't put subdirectories in your sysctl tables.  Use a ctl_path to
> specify the entire directory where the files should show up.  Generally
> the code is easier to read in that form, and the code is simpler to deal
> with if we don't have to worry about directories.
>
> Don't play with the sysctl roots.  It is my intention to completely kill
> them off and replace them by moving the per net sysctl tree under
> /proc//sys/.  Leaving behind symlinks in /proc/sys/net and I guess
> ultimately in /proc/sys/sunrpc/ and /proc/sys/fs/nfs...  Which actually
> seems to better describe your mental model.
>

I'm afraid that this approach will not allow me to achieve the goal mentioned
above, because current->nsproxy->net_ns will be used during lookup.
Or am I misunderstanding something here?

> Thank you for mentioning /proc/sys/fs/nfs.  That is a case I hadn't
> thought about.  In thinking about it I see some deficiencies in my
> rewrite that I need to correct before I push that code.
>

Was glad to be useful.

> Eric

--
Best regards,
Stanislav Kinsbursky
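
P.S. To make the "A"/"B" example above concrete, here is a toy userspace
model in Python. It is only a sketch and not kernel code: NetNamespace,
ProcMount, read_by_viewer and read_by_owner are names made up for this
illustration, and the "viewer" policy is just my reading of what happens
when current->nsproxy->net_ns is used during lookup.

```python
from dataclasses import dataclass, field


@dataclass
class NetNamespace:
    """Hypothetical stand-in for a network namespace with its own sysctl values."""
    name: str
    sysctls: dict = field(default_factory=dict)


@dataclass
class ProcMount:
    """Hypothetical stand-in for a mounted /proc; owner is the ns that mounted it."""
    owner: NetNamespace


def read_by_viewer(mount, viewer, key):
    # Viewer-based policy (assumed model of a lookup through the reader's
    # current net ns): the mount used to reach the file makes no difference.
    return viewer.sysctls[key]


def read_by_owner(mount, viewer, key):
    # Owner-based policy (the behavior asked for in the message): the value
    # comes from the namespace that owns the mount, whoever is reading.
    return mount.owner.sysctls[key]


ns_a = NetNamespace("A", {"var": 0})
ns_b = NetNamespace("B", {"var": 1})
proc_of_b = ProcMount(owner=ns_b)  # visible in "A" as /chroot_path/proc

# A task in "A" reading B's mount under each policy, then a task in "B".
print(read_by_viewer(proc_of_b, ns_a, "var"))
print(read_by_owner(proc_of_b, ns_a, "var"))
print(read_by_owner(proc_of_b, ns_b, "var"))
```

Under the viewer-based policy the reader in "A" gets A's value even through
B's mount; under the owner-based policy both readers get B's value, which is
the result I am after.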