linux-kernel.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: "Trond.Myklebust\@netapp.com" <Trond.Myklebust@netapp.com>,
	"linux-nfs\@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	Pavel Emelianov <xemul@parallels.com>,
	"neilb\@suse.de" <neilb@suse.de>,
	"netdev\@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	James Bottomley <jbottomley@parallels.com>,
	"bfields\@fieldses.org" <bfields@fieldses.org>,
	"davem\@davemloft.net" <davem@davemloft.net>,
	"devel\@openvz.org" <devel@openvz.org>
Subject: Re: [PATCH 01/11] SYSCTL: export root and set handling routines
Date: Wed, 11 Jan 2012 11:36:04 -0800
Message-ID: <m18vle2frv.fsf@fess.ebiederm.org>
In-Reply-To: <4F0DCEA8.7040205@parallels.com> (Stanislav Kinsbursky's message of "Wed, 11 Jan 2012 22:02:16 +0400")

Stanislav Kinsbursky <skinsbursky@parallels.com> writes:

> 11.01.2012 21:21, Eric W. Biederman wrote:
>>>>>> Especially what drives that desire not to have it have a /proc/<pid>/sys
>>>>>> directory that reflects the sysctls for a given process.
>>>>>>
>>>>>
>>>>> Where the sysctls are accessed is not so important to me. But I'm worried
>>>>> about backward compatibility. IOW, I'm afraid of changing the path
>>>>> "/proc/sys/sunrpc/*" to "/proc/<pid>/sys/sunrpc". This would break a lot of
>>>>> user-space programs.
>>>>
>>>> The part that keeps it all working is by adding a symlink from /proc/sys
>>>> to /proc/self/sys.  That technique has worked well for /proc/net, and I
>>>> don't expect there will be any problems with /proc/sys either.  It is
>>>> possible but is very rare for the introduction of a symlink in a path
>>>> to cause problems.
>>>>
>>>
>>> Perhaps I don't understand you, but as I see it now, a symlink to "/proc/self/"
>>> is unacceptable because of the following:
>>> 1) the current context (whatever it is) will be used instead of the desired one
>> (Using the current context is the desirable outcome for existing tools).
>>> 2) if the CT has a different pid namespace, then we just have a broken link.
>>
>> Assuming the process in question is not in the pid namespace available
>> to proc then yes you will indeed have a broken link.  But a broken
>> link is only a problem for new applications that are doing something strange.
>>
>
> I believe a container is assumed to run in its own network and pid
> namespaces.
> With your approach, if I'm not mistaken, the container's /proc/net and /proc/sys
> tunables will be inaccessible from the parent environment. Or am I wrong here?

Wrong.

>> I am proposing treating /proc/sys like /proc/net has already been
>> treated.  That is, have the version of /proc/sys that is relative to a
>> process be visible at /proc/<pid>/sys, with a compat symlink
>> from /proc/sys -> /proc/self/sys.
>>
>> Just like has already been done with /proc/net.
>>
>
> 1) On one hand it looks logical that any nested dentries in /proc are tied to
> the pid namespace. But on the other hand we have a lot of tunables in /proc/net,
> /proc/sys, etc. which have nothing to do with processes or anything similar.

Please stop and take a look at /proc/net.  If your /proc/net is not a
symlink please look at a modern kernel.

/proc/<pid>/net reflects the network namespace of the task in question.
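
A quick way to check the two points above on a given machine is sketched
below. This is only an illustration, not part of any patch: the helper names
proc_symlink_target and per_task_path are made up for this example, and the
proc_root parameter exists only so the functions can be pointed at a scratch
directory instead of the real /proc.

```python
import os


def proc_symlink_target(name, proc_root="/proc"):
    """Return the symlink target of <proc_root>/<name>, or None if it is not a symlink."""
    path = os.path.join(proc_root, name)
    if os.path.islink(path):
        return os.readlink(path)
    return None


def per_task_path(pid, name, proc_root="/proc"):
    """Build the per-task path for an entry, e.g. /proc/1234/net."""
    return os.path.join(proc_root, str(pid), name)


if __name__ == "__main__":
    # On a kernel with per-task /proc/net this is expected to print a target
    # that goes through "self"; None means /proc/net is a plain directory.
    print("/proc/net ->", proc_symlink_target("net"))
    print("per-task view of this process:", per_task_path(os.getpid(), "net"))
```

Reading files under the per-task path of some other pid would then show that
task's network namespace rather than the caller's.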

> 2) Currently the /proc process directories (i.e. /proc/1/, etc.) depend on the
> context of whoever mounted /proc. But /proc/sys and /proc/net don't. This looks
> weird and inconsistent, from my POV. What do you think about it?

Yep.  Sysfs is weird.  Ideally sysfs would display all devices all of
the time but unfortunately that breaks backwards compatibility.

In proc we have the opportunity to display nearly everything all of the
time and I think that opportunity is worth seizing.

Having to mount a filesystem simply because the designers of the
filesystem were not creative enough to figure out how to display
all of the information the filesystem is responsible for displaying
without having namespace conflicts is unfortunate.

> And what do you think about "containerization" of /proc contents in the way
> "sysfs" was done?

I think the way sysfs is done is a pain in the neck to use.  Especially
in the context of commands like "ip netns exec".  With the sysfs model
there is a lot of extra state to manage.

I totally agree that the way sysfs is done is much better than the way
/proc/sys is done today.  Looking at current can be limiting in the
general case.

My current preference is the way /proc/net was done.
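
To make the /proc/net-style layout concrete for /proc/sys, here is a small
simulation in a scratch directory. It is a sketch under stated assumptions,
not kernel behaviour: the pid 1234 and the tunable name
sunrpc/example_tunable are hypothetical, and the fake "self" link is fixed to
one pid, whereas the real /proc/self resolves to whichever task does the
lookup.

```python
import os
import tempfile


def build_fake_proc(root, pid, sysctl, value):
    """Create <root>/<pid>/sys/<sysctl> plus the 'self' and compat 'sys' symlinks."""
    target = os.path.join(root, str(pid), "sys", sysctl)
    os.makedirs(os.path.dirname(target))
    with open(target, "w") as f:
        f.write(value)
    # Stand-in for the magic /proc/self link (fixed to one pid here).
    os.symlink(str(pid), os.path.join(root, "self"))
    # The compat symlink under discussion: sys -> self/sys.
    os.symlink("self/sys", os.path.join(root, "sys"))


def read_sysctl(root, sysctl):
    """Read a tunable through the old-style <root>/sys/<sysctl> path."""
    with open(os.path.join(root, "sys", sysctl)) as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_fake_proc(root, 1234, "sunrpc/example_tunable", "16\n")
        # The legacy path still works; it is resolved via self -> 1234.
        print(read_sysctl(root, "sunrpc/example_tunable"), end="")
```

An existing tool that opens the old path keeps working unchanged, while a
tool that wants another task's view can open that task's directory directly.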

> Implementing /proc "containerization" in this way can give us great flexibility.
> For example, /proc/net (and /proc/sys/sunrpc) would depend on the mount owner's net
> namespace, /proc/sysvipc would depend on the mount owner's ipc namespace, etc.
> And this approach doesn't break backward compatibility either.

The thing is /proc/net is already done.

All I see with making things like /proc/net depend on the context of the
process that called mount is a need to call mount much more often.

Eric
