linux-kernel.vger.kernel.org archive mirror
From: ebiederm@xmission.com (Eric W. Biederman)
To: Pavel Emelyanov <xemul@parallels.com>
Cc: hadi@cyberus.ca, linux-kernel@vger.kernel.org,
	Linux Containers <containers@lists.osdl.org>,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Daniel Lezcano <daniel.lezcano@free.fr>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Ulrich Drepper <drepper@gmail.com>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	David Miller <davem@davemloft.net>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Ben Greear <greearb@candelatech.com>,
	Matt Helsley <matthltc@us.ibm.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>,
	Jan Engelhardt <jengelh@medozas.de>,
	Patrick McHardy <kaber@trash.net>
Subject: Re: [PATCH 8/8] net: Implement socketat.
Date: Thu, 23 Sep 2010 08:00:33 -0700
Message-ID: <m1y6asld2m.fsf@fess.ebiederm.org>
In-Reply-To: <4C9B495D.70200@parallels.com> (Pavel Emelyanov's message of "Thu, 23 Sep 2010 16:34:37 +0400")

Pavel Emelyanov <xemul@parallels.com> writes:

> On 09/23/2010 04:11 PM, jamal wrote:
>> On Thu, 2010-09-23 at 15:53 +0400, Pavel Emelyanov wrote:
>> 
>>> Why does it matter? You told, that the usage scenario was to
>>> add routes to container. If I do 2 syscalls instead of 1, is
>>> it THAT worse?
>>>
>> 
>> Anything to do with socket IO that requires namespace awareness
>> applies for usage; it could be tcp/udp/etc socket. If it doesnt
>> make any difference performance wise using one scheme vs other
>> to write/read heavy messages then i dont see an issue and socketat
>> is redundant.
>
> That's what my point is about - unless we know why would we need it
> we don't need it.
>
> Eric, please clarify, what is the need in creating a socket in foreign
> net namespace?

Strictly speaking, you can implement this functionality in userspace
with setns(), e.g.:

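/* default_nsfd: an fd for the caller's original network namespace,
 * opened beforehand. */
int socketat(int nsfd, int domain, int type, int protocol)
{
        int sk;

        setns(0, nsfd);                 /* enter the target namespace */
        sk = socket(domain, type, protocol);
        setns(0, default_nsfd);         /* return to the original one */

        return sk;
}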

The major difference is that a userspace socketat() is racy: a signal
handler, for example, can run while the process is temporarily in the
other namespace.
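
One way to narrow the signal race in a userspace version is to block
signals on the calling thread for the duration of the namespace switch.
The following is only a sketch under several assumptions that are not
part of the original message: it uses the setns(fd, nstype) argument
order documented in setns(2), which differs from the setns(nstype, fd)
order in the snippet above; it finds the caller's own namespace through
/proc/thread-self/ns/net; and it treats a failure to switch back as
fatal. It does not address other threads or any other source of races.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Userspace socketat() sketch: create a socket in the network namespace
 * referred to by nsfd, then return to the caller's own namespace.
 * Returns the new socket, or -1 with errno set.
 */
int socketat(int nsfd, int domain, int type, int protocol)
{
        sigset_t all, old;
        int home_fd, sk = -1, saved_errno;

        /* Remember where we came from (path is an assumption). */
        home_fd = open("/proc/thread-self/ns/net", O_RDONLY | O_CLOEXEC);
        if (home_fd < 0)
                return -1;

        /*
         * Block signals so no handler runs in the other namespace.
         * On Linux sigprocmask() affects only the calling thread; a
         * threaded program would normally use pthread_sigmask().
         */
        sigfillset(&all);
        sigprocmask(SIG_SETMASK, &all, &old);

        if (setns(nsfd, CLONE_NEWNET) == 0) {
                sk = socket(domain, type, protocol);
                saved_errno = errno;
                /* Stuck in the wrong namespace: nothing sane to do. */
                if (setns(home_fd, CLONE_NEWNET) != 0)
                        abort();
        } else {
                saved_errno = errno;
        }

        sigprocmask(SIG_SETMASK, &old, NULL);
        close(home_fd);
        errno = saved_errno;
        return sk;
}
```

A caller would open the namespace file once, for instance
open("/var/run/netns/<netnsname>/ns", O_RDONLY), and pass that
descriptor as nsfd. Switching network namespaces this way normally
requires CAP_SYS_ADMIN, so an unprivileged caller should expect -1 with
errno set to EPERM.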

The use case is the handful of networking applications that find it
makes sense to listen on sockets in multiple network namespaces at
once.  Say a home machine has a vpn into your office network, and the
vpn runs in a different network namespace so that you don't have to
worry about address conflicts between the two networks or the chance of
accidentally bridging between them, and so that you can use different
dns resolvers for the different networks.

In that scenario it would be nice if I could run some services on both
networks.  Starting two+ copies of the daemons just so they can live
in all of the networks is ok, but in the fullness of time I expect that
there will be daemons that want to optimize things and have sockets in
all of the network namespaces you are connected to.

When a multiple network namespace aware application goes to open a
socket it will want to specify which network namespace the socket is
in.  If it is a general listener it will probably be listening for
events in /proc/mounts, waiting for extra namespaces to be mounted
under a standard location, say: /var/run/netns/<netnsname>/ns.

Once the application receives the event for a new network namespace
showing up, it will want to create a new socket listening for
connections in the new network namespace.

In that scenario none of those network namespaces are foreign, but one
network namespace will be the default and the rest will be non-default
network namespaces.

To support a multiple network namespace aware daemon I need to implement
socketat() somewhere.  So I figured I would see if anyone minded a
trivial in-kernel, race-free implementation.  To me the lack of one is
a wart in the API, and I am busily removing warts in the API.

I don't know of any scenarios with other namespaces where there would be
applications that would be native in multiple namespaces.  So I haven't
done any work in that direction.

Eric

