* Routing / forwarding in user space?
@ 2020-12-31 14:01 Marc Roos
  2020-12-31 19:49 ` Grant Taylor
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Marc Roos @ 2020-12-31 14:01 UTC
  To: lartc


I have a VM that does some Linux routing / forwarding / NAT. I think a 
full VM is a bit overkill for this and would like to convert it to a 
container. However, I am not sure which Linux capabilities, if any, I 
need to grant to enable e.g. forwarding. From a security perspective it 
would be nicer to keep this isolated from the host.

sysctl -w net.ipv4.ip_forward=1
Generates 
sysctl: error setting key 'net.ipv4.ip_forward': Read-only file system

Nice would be to have something running maybe in user space that is 
similar to:

/sbin/iptables -A FORWARD -o $EXT -s 192.168.122.74/32 -m state --state 
NEW,ESTABLISHED,RELATED -j ACCEPT
/sbin/iptables -A FORWARD -i $EXT -d 192.168.122.74/32 -m state --state 
NEW,ESTABLISHED,RELATED -j ACCEPT

# meet frontend
/sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p tcp -m tcp --dport 
444 -j DNAT --to-destination 192.168.122.74
/sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p tcp -m tcp --dport 
4443 -j DNAT --to-destination 192.168.122.74
/sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p udp -m udp --dport 
10000:20000 -j DNAT --to-destination 192.168.122.74

/sbin/iptables -t nat -A POSTROUTING -o $EXT -s 192.168.122.74 -j SNAT 
--to-source x.x.x.x

Is there anything that can do routing / NAT between interfaces but runs 
in user space?


* Re: Routing / forwarding in user space?
  2020-12-31 14:01 Routing / forwarding in user space? Marc Roos
@ 2020-12-31 19:49 ` Grant Taylor
  2020-12-31 20:39 ` Grant Taylor
  2020-12-31 20:49 ` Grant Taylor
  2 siblings, 0 replies; 4+ messages in thread
From: Grant Taylor @ 2020-12-31 19:49 UTC
  To: lartc

On 12/31/20 7:01 AM, Marc Roos wrote:
> I have a vm that uses some linux routing / forwarding / nat. I thought 
> maybe this is a bit overkill and convert this to a container. However 
> I am not sure if and what linux capabilities I need to grant to 
> enable eg.  forwarding. I think from security perspective it would 
> be nicer to keep this isolated from the host.

The (full / fat) VM will have a different kernel completely separate 
from the host.

The container (as I understand them) will /not/ have a different kernel, 
thus /not/ completely separate from the host.

Aside:  My understanding of ""containers is that they are a collection 
of namespaces (in one combination or another).  As such -- IMHO -- they 
are /really/ part of / controlled by the host kernel.

I typically loosely describe it to people as it's the same kernel code 
working with different sets of data.  One set of data / configuration / 
IPs / routes / etc. is the host, another is a ""container, a 3rd set is 
another container, etc.

> sysctl -w net.ipv4.ip_forward=1
> Generates
> sysctl: error setting key 'net.ipv4.ip_forward': Read-only file system

That seems like a permissions issue.

I've done what you're wanting to do with network namespaces via ip netns 
as well as network namespaces + other namespaces via unshare & nsenter. 
(I've also gotten the two types to work together.)
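
The "Read-only file system" error usually means that /proc/sys is mounted 
read-only inside the container, which is what Docker-style runtimes tend 
to do for unprivileged containers (an assumption about the setup in 
question). A minimal sketch to check for that; proc_sys_is_readonly is a 
hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Report whether /proc/sys is mounted read-only, by scanning a mounts table.
# Usage: proc_sys_is_readonly [mounts-file]   (defaults to /proc/mounts)
proc_sys_is_readonly() {
    # Fields in /proc/mounts: device mountpoint fstype options dump pass
    awk '$2 == "/proc/sys" {
             ro = 0
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
         }
         END { exit (ro ? 0 : 1) }' "${1:-/proc/mounts}" 2>/dev/null
}

if proc_sys_is_readonly; then
    echo "/proc/sys is read-only here: set the sysctl from outside the container"
else
    echo "/proc/sys is not separately mounted read-only"
fi
```

net.ipv4.ip_forward is a per-network-namespace setting, so it can be set 
from the host side instead, e.g. "ip netns exec lab1 sysctl -w 
net.ipv4.ip_forward=1" for an "ip netns" namespace named lab1, or 
"docker run --sysctl net.ipv4.ip_forward=1 ..." when Docker creates the 
namespace.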

> Nice would be to have something running maybe in user space that is 
> similar to:

Note:  In typical containers, this is *not* /user/ space.  This *is* 
/kernel/ space, just in a different namespace with a different set of data.

> /sbin/iptables -A FORWARD -o $EXT -s 192.168.122.74/32 -m state --state
> NEW,ESTABLISHED,RELATED -j ACCEPT
> /sbin/iptables -A FORWARD -i $EXT -d 192.168.122.74/32 -m state --state
> NEW,ESTABLISHED,RELATED -j ACCEPT
> 
> # meet frontend
> /sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p tcp -m tcp --dport
> 444 -j DNAT --to-destination 192.168.122.74
> /sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p tcp -m tcp --dport
> 4443 -j DNAT --to-destination 192.168.122.74
> /sbin/iptables -t nat -A PREROUTING -d x.x.x.x/32 -p udp -m udp --dport
> 10000:20000 -j DNAT --to-destination 192.168.122.74
> 
> /sbin/iptables -t nat -A POSTROUTING -o $EXT -s 192.168.122.74 -j SNAT
> --to-source x.x.x.x

I've done many different variations on what you're wanting to do using 
""containers.  (See above.)

Docker, Podman, et al. have their own implications ~> limitations that 
make this type of thing more difficult than it should be.

It is entirely possible to do what you want to do in ""containers.
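
As a rough sketch of how the quoted rules could be applied to a network 
namespace rather than to the host: netfilter tables are per network 
namespace, so each rule can be wrapped in "ip netns exec". The script 
below only prints the commands. NS, EXT and PUBLIC are placeholders that 
I made up (x.x.x.x stays elided as in the original post), and the 
obsolete "-m state --state" match is swapped for its current equivalent 
"-m conntrack --ctstate".

```shell
#!/bin/sh
# Print the original poster's rules, each wrapped in "ip netns exec" so that
# they land in the namespace's own netfilter tables instead of the host's.
# NS, EXT and PUBLIC are placeholders; x.x.x.x is left elided as in the post.
NS=${NS:-lab1}
EXT=${EXT:-eth0}
PUBLIC=${PUBLIC:-x.x.x.x}
GUEST=192.168.122.74

in_ns() { echo "ip netns exec $NS iptables $*"; }

router_rules() {
    in_ns -A FORWARD -o "$EXT" -s "$GUEST/32" -m conntrack --ctstate NEW,ESTABLISHED,RELATED -j ACCEPT
    in_ns -A FORWARD -i "$EXT" -d "$GUEST/32" -m conntrack --ctstate NEW,ESTABLISHED,RELATED -j ACCEPT
    for spec in "tcp 444" "tcp 4443" "udp 10000:20000"; do
        set -- $spec   # split into protocol ($1) and port or port range ($2)
        in_ns -t nat -A PREROUTING -d "$PUBLIC/32" -p "$1" -m "$1" --dport "$2" -j DNAT --to-destination "$GUEST"
    done
    in_ns -t nat -A POSTROUTING -o "$EXT" -s "$GUEST" -j SNAT --to-source "$PUBLIC"
}

router_rules
```

Review the printed commands, substitute the real interface and address, 
and only then pipe them to "sudo sh".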

> Is there anything that can do routing/nat between interfaces but runs 
> in users space???

Absolutely.  I've got nine of these ""containers running on the system 
that I'm typing this reply on.

To me, the biggest question is what type of interfaces you are using. 
Are you moving a physical interface from the host into the network 
namespace / container?  Or are you using a logical interface from the 
network namespace / container and possibly extending it to a physical in 
the host via something like bridging.  (MACVLAN and IPVLAN play in this 
area.)


-- 
Grant. . . .
unix || die



* Re: Routing / forwarding in user space?
  2020-12-31 14:01 Routing / forwarding in user space? Marc Roos
  2020-12-31 19:49 ` Grant Taylor
@ 2020-12-31 20:39 ` Grant Taylor
  2020-12-31 20:49 ` Grant Taylor
  2 siblings, 0 replies; 4+ messages in thread
From: Grant Taylor @ 2020-12-31 20:39 UTC
  To: lartc

Pre-Script:  I need to give some history of time keeping and clock 
making before I tell you what time it is.

On 12/31/20 12:49 PM, Grant Taylor wrote:
> Absolutely.  I've got nine of these ""containers running on the system 
> that I'm typing this reply on.

Here are some more details on what I'm doing in case you want to try 
something similar.

I have allocated the RFC 6598 Shared Address Space[1] to my workstation 
test VLANs, currently on my workstation.  My home network has routes to 
100.64.0.0/10 via my workstation's LAN IP.

100.64.0.0/24 is the core / backbone / area 0 of these lab VLANs.

Each lab VLAN has a separate /24 therein.
    Lab 1 = 100.64.1.0/24
    Lab 2 = 100.64.2.0/24
    Lab 3 = 100.64.3.0/24
    ...

My workstation has routes to the lab subnets via each ""container 
(network namespace) that is doing the very type of routing that I think 
you're asking about.

    100.64.1.0/24 via 100.64.0.1
    100.64.2.0/24 via 100.64.0.2
    100.64.3.0/24 via 100.64.0.3
    ...

I am using logical (vEth) interfaces between all the network namespaces 
/ ""containers.  --  I do tuck most of them away in another network 
namespace / ""container so that I don't see a bunch of ... unsightly 
interfaces when running "ip" / "ifconfig" / et al. in my host / root / 
unnamed network namespace.

I have a vEth from the host into what I call lab0.  Each of the other 
routing network namespaces / ""containers has a vEth to lab0 and to the 
host.  lab0 bridges all of the vEths therein to create one broadcast 
domain that connects the host and all of the lab network namespaces / 
""containers.

This means that each network namespace / ""container can route between 
its vEth that connects to the bridge and the vEth that connects back to 
the host.

    lab1 routes between 100.64.0.1/24 and 100.64.1.254/24
    lab2 routes between 100.64.0.2/24 and 100.64.2.254/24
    lab3 routes between 100.64.0.3/24 and 100.64.3.254/24
    ...

The purpose for these routing network namespace / ""containers is so 
that I can mess around with various things in VirtualBox (et al.) on the 
host and have access to 11 different networks (home LAN, virtual 
backbone, and each lab network).  This enables me to play with various 
things using network namespaces / ""containers as routers.

I have had as many as 100 of these running on my system at one time with 
no ill effect.  (Obviously the VMs connected to them have an effect. 
But that's not the network namespaces / ""containers.)

/*
** What time is it?
*/

I create all of this with a 25 line shell script.

1)  I create the directories (transient b/c of tmpfs) that are needed.
     A)  "ip netns" uses /run/netns so I create it and mountns & utsns 
following suit.
         # sudo mkdir -p /run/{mount,net,uts}ns
     B)  Network namespaces / ""containers use their own mount point.
         # sudo touch /run/{mount,net,uts}ns/lab0
2)  I create / instantiate the first network namespace / ""container.
         # sudo unshare --mount=/run/mountns/lab0 --net=/run/netns/lab0 --uts=/run/utsns/lab0 /bin/hostname lab0

Aside:  unshare creates / instantiates the network namespace / 
""container to run the /bin/hostname command.  It does not destroy the 
namespace / ""container -- which is default -- because of the 
mountpoints.  See the man page for more details.
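
Because the namespaces stay pinned to those files, they can be re-entered 
later with nsenter. A sketch that builds the command line for one lab; 
lab_enter_cmd is a hypothetical helper name and the paths follow the 
/run/{mount,net,uts}ns layout from step 1:

```shell
#!/bin/sh
# Build the nsenter command line that joins all three persisted namespaces
# of one lab. Paths follow the /run/{mount,net,uts}ns layout used above.
lab_enter_cmd() {
    echo "nsenter --mount=/run/mountns/$1 --net=/run/netns/$1 --uts=/run/utsns/$1 -- ${2:-/bin/sh}"
}

lab_enter_cmd lab0            # command for an interactive shell inside lab0
lab_enter_cmd lab0 hostname   # command to query the hostname inside lab0
```

The printed command has to be run as root. "ip netns exec lab0 ..." joins 
only the network namespace, which is enough for the "ip" commands in the 
later steps.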

3)  I create the vEth pair to connect from the host to lab0
         # sudo ip link add lab0 type veth peer name $HOSTNAME netns lab0
         # sudo ip link set lab0 up
         # sudo ip netns exec lab0 ip link set lo up
         # sudo ip netns exec lab0 ip link add bri0 type bridge
         # sudo ip netns exec lab0 ip link set bri0 up
         # sudo ip netns exec lab0 ip link set $HOSTNAME master bri0
         # sudo ip netns exec lab0 ip link set $HOSTNAME up

Steps 1-3 create the central netns.

4)  I create / instantiate and configure the network on the other lab 
network namespaces / ""containers all at the same time via a loop.
         # for l in {1..9}; do
         #    sudo touch /run/{mount,net,uts}ns/lab${l}
         #    sudo unshare --mount=/run/mountns/lab${l} --net=/run/netns/lab${l} --uts=/run/utsns/lab${l} /bin/hostname lab${l}
         #    sudo ip link add lab${l} type veth peer name lab${l}i netns lab${l}
         #    sudo ip link set lab${l} up
         #    sudo sysctl -q net.ipv6.conf.lab${l}.disable_ipv6=1 > /dev/null
         #    sudo ip netns exec lab${l} ip link set lo up
         #    sudo ip netns exec lab${l} ip link set lab${l}o up
         #    sudo ip netns exec lab${l} ip addr add 100.64.0.${l}/24 dev lab${l}o
         #    sudo ip netns exec lab${l} ip link set lab${l}i up
         #    sudo ip netns exec lab${l} ip addr add 100.64.${l}.254/24 dev lab${l}i
         # done

Note:  I manually retyped this, so there may be typos.
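
The retyped loop brings up lab${l}o without showing where that interface 
is created, and it does not set IPv4 forwarding explicitly. One plausible 
way to fill both gaps is sketched below; this is a guess, not the 
original script. The lab0-side name "to-lab${n}" and the helper name 
lab_uplink_cmds are made up for illustration. Whether a new network 
namespace starts with forwarding enabled depends on kernel settings, so 
setting it explicitly should be harmless.

```shell
#!/bin/sh
# Print (do not run) the extra commands for lab number $1:
#  - a veth pair joining the lab to bri0 inside lab0; "lab${n}o" matches the
#    listing above, while the lab0-side name "to-lab${n}" is made up here
#  - explicit IPv4 forwarding inside the lab's network namespace
lab_uplink_cmds() {
    n=$1
    echo "ip link add lab${n}o netns lab${n} type veth peer name to-lab${n} netns lab0"
    echo "ip netns exec lab0 ip link set to-lab${n} master bri0"
    echo "ip netns exec lab0 ip link set to-lab${n} up"
    echo "ip netns exec lab${n} sysctl -q -w net.ipv4.ip_forward=1"
}

lab_uplink_cmds 1   # review the output, then pipe it to "sudo sh" if it fits
```

In the loop, these commands would have to run after the unshare line and 
before lab${l}o is brought up and given its address.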

Aside:  I've not yet configured IPv6 in the labs, so I disable it.  (My 
home LAN is IPv6 enabled.)

This provides nine L2 lab# interfaces on the host so that I can connect 
VMs to them.  The host does /not/ have IP addresses in these lab VLANs. 
The host must route through the lab# network namespaces / ""containers 
to get to attached VMs.  Said VMs must do similar to access the host and 
the Internet.

I believe these network namespaces / ""containers are exactly what 
you're wanting to do; e.g. routing between networks inside of a network 
namespace / ""container.

[1] Yes, I know the danger of conflict with ISPs that do Carrier Grade 
NAT.  Mine does not.  So I choose to use this space to avoid typical RFC 
1918 Address Space.



-- 
Grant. . . .
unix || die



* Re: Routing / forwarding in user space?
  2020-12-31 14:01 Routing / forwarding in user space? Marc Roos
  2020-12-31 19:49 ` Grant Taylor
  2020-12-31 20:39 ` Grant Taylor
@ 2020-12-31 20:49 ` Grant Taylor
  2 siblings, 0 replies; 4+ messages in thread
From: Grant Taylor @ 2020-12-31 20:49 UTC
  To: lartc

On 12/31/20 12:49 PM, Grant Taylor wrote:
> To me, the biggest question is what type of interfaces you are using. 
> Are you moving a physical interface from the host into the network 
> namespace / container?  Or are you using a logical interface from the 
> network namespace / container and possibly extending it to a physical in 
> the host via something like bridging.  (MACVLAN and IPVLAN play in this 
> area.)

My network namespaces / ""containers use vEth links to interconnect 
things.  But I could also move physical NICs from the host network 
namespace into the guest (?) network namespace / ""container.

I could create logical NICs; (802.1Q) VLAN / MACVLAN / IPVLAN / etc. and 
move them into the network namespace / ""container.  --  I have done 
exactly this at work.
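
A sketch of what creating a logical NIC and moving it into a network 
namespace can look like. It prints the commands rather than running 
them. The helper name logical_nic_cmds, the parent interface eth0, and 
the interface and namespace names are all placeholders, and the macvlan / 
ipvlan modes shown are just one reasonable choice:

```shell
#!/bin/sh
# Print the commands that create a logical NIC on top of a physical parent
# and move it into a network namespace. All names are placeholders.
# Usage: logical_nic_cmds KIND PARENT NAME NETNS [VLAN_ID]
logical_nic_cmds() {
    case $1 in
        vlan)    echo "ip link add link $2 name $3 type vlan id ${5:?vlan id required}" ;;
        macvlan) echo "ip link add link $2 name $3 type macvlan mode bridge" ;;
        ipvlan)  echo "ip link add link $2 name $3 type ipvlan mode l2" ;;
        *)       echo "unknown kind: $1" >&2; return 1 ;;
    esac
    echo "ip link set $3 netns $4"
    echo "ip netns exec $4 ip link set $3 up"
}

logical_nic_cmds vlan eth0 eth0.10 lab1 10
logical_nic_cmds macvlan eth0 mv0 lab2
```

Addresses are then assigned inside the namespace, the same way as for the 
vEth interfaces.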

I think that I can also create tunnel interfaces and move them into the 
network namespace / ""container.  --  I have not tried this.  The tunnel 
may need to be created inside the network namespace / ""container.

Deciding how to connect the network namespace / ""container to the 
outside world is extremely important.  You need to have a good 
understanding of what you are wanting to do and how to achieve your goal.

This is where I start to see things like Docker fall down.  --  Maybe 
it's my limited understanding of Docker / Podman / et al.  --  My 
understanding is that many traditional container systems tend to use 
independent networks, routing, and NATing.  This works for some things. 
But it does not work for everything.  Especially when you want L2 
connectivity, like when you want to use a ""container as a router for 
other LAN things.

I think that some container orchestration systems do provide a way to 
get a layer 2 connection into the container.  However, doing so is an 
exception and against their design methodology, thus you start at a 
disadvantage.



-- 
Grant. . . .
unix || die

