* [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2)
@ 2013-08-21 10:31 Pavel Emelyanov
  2013-08-21 10:31 ` [PATCH 1/4] tun: Add ability to create tun device with given index Pavel Emelyanov
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-21 10:31 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

Hi,

After taking a closer look at tun checkpoint-restore, I've found several
issues with tun's API that make it impossible to dump and restore
the state of a tun device and its attached tun-files.

The proposed API changes all extend the existing ioctl-based
interface. The patches apply to today's net-next.

This v2 addresses David's comments on patch #1. The rest is unchanged.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>


* [PATCH 1/4] tun: Add ability to create tun device with given index
  2013-08-21 10:31 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) Pavel Emelyanov
@ 2013-08-21 10:31 ` Pavel Emelyanov
  2013-08-21 10:32 ` [PATCH 2/4] tun: Report whether the queue is attached or not Pavel Emelyanov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-21 10:31 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

Tun devices cannot be created with the ifindex the user wants, but
this is required by the checkpoint-restore project.

Long ago this ability was implemented for the rtnl_ops-based
interface for creating links (9c7dafbf net: Allow to create links
with given ifindex), but the only API for creating and managing
tuntap devices is ioctl-based, and it keeps evolving by adding new
ioctls (cde8b15f tuntap: add ioctl to attach or detach a file form
tuntap device).

Following that trend, here is a new ioctl that sets the ifindex
for the device that _will_ be created by the TUNSETIFF ioctl.
Those who want a tuntap device with ifindex N should open
the tun device, call ioctl(fd, TUNSETIFINDEX, &N), then call TUNSETIFF.
If index N is busy, register_netdev will find this out
and the ioctl will fail with -EBUSY.

If TUNSETIFINDEX is not called, the ifindex is generated as before.
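
For illustration only, the user-space sequence described above might look
roughly like the sketch below. The helper name tun_create_with_ifindex() and
its -errno return convention are assumptions made for this example, not part
of the patch; the fallback #define mirrors the value added to if_tun.h here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Fallback for older headers; value taken from this patch. */
#ifndef TUNSETIFINDEX
#define TUNSETIFINDEX _IOW('T', 218, unsigned int)
#endif

/*
 * Hypothetical helper: ask for a specific ifindex on an already-open
 * /dev/net/tun descriptor, then create the device with TUNSETIFF.
 * Returns 0 on success or a negative errno value.
 */
static int tun_create_with_ifindex(int fd, const char *name,
				   short flags, unsigned int ifindex)
{
	struct ifreq ifr;

	assert(name != NULL);
	if (strlen(name) >= IFNAMSIZ)
		return -ENAMETOOLONG;

	/* Must happen before TUNSETIFF: the index applies to the device
	 * that will be created, not to an existing one. */
	if (ioctl(fd, TUNSETIFINDEX, &ifindex) < 0)
		return -errno;

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = flags;
	memcpy(ifr.ifr_name, name, strlen(name) + 1);

	/* Per the description above, a busy index is reported here. */
	if (ioctl(fd, TUNSETIFF, &ifr) < 0)
		return -errno;

	return 0;
}
```

A caller would open("/dev/net/tun", O_RDWR) and then call, for example,
tun_create_with_ifindex(fd, "tun0", IFF_TUN, 42), treating -EBUSY as "index
already taken".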

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 drivers/net/tun.c           |   21 ++++++++++++++++++++-
 include/uapi/linux/if_tun.h |    1 +
 2 files changed, 21 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7ed13cc..4b65fbc 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -138,7 +138,10 @@ struct tun_file {
 	struct fasync_struct *fasync;
 	/* only used for fasnyc */
 	unsigned int flags;
-	u16 queue_index;
+	union {
+		u16 queue_index;
+		unsigned int ifindex;
+	};
 	struct list_head next;
 	struct tun_struct *detached;
 };
@@ -1601,6 +1604,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 
 		dev_net_set(dev, net);
 		dev->rtnl_link_ops = &tun_link_ops;
+		dev->ifindex = tfile->ifindex;
 
 		tun = netdev_priv(dev);
 		tun->dev = dev;
@@ -1817,6 +1821,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	kgid_t group;
 	int sndbuf;
 	int vnet_hdr_sz;
+	unsigned int ifindex;
 	int ret;
 
 	if (cmd == TUNSETIFF || cmd == TUNSETQUEUE || _IOC_TYPE(cmd) == 0x89) {
@@ -1851,6 +1856,19 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 			ret = -EFAULT;
 		goto unlock;
 	}
+	if (cmd == TUNSETIFINDEX) {
+		ret = -EPERM;
+		if (tun)
+			goto unlock;
+
+		ret = -EFAULT;
+		if (copy_from_user(&ifindex, argp, sizeof(ifindex)))
+			goto unlock;
+
+		ret = 0;
+		tfile->ifindex = ifindex;
+		goto unlock;
+	}
 
 	ret = -EBADFD;
 	if (!tun)
@@ -2099,6 +2117,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
 	rcu_assign_pointer(tfile->tun, NULL);
 	tfile->net = get_net(current->nsproxy->net_ns);
 	tfile->flags = 0;
+	tfile->ifindex = 0;
 
 	rcu_assign_pointer(tfile->socket.wq, &tfile->wq);
 	init_waitqueue_head(&tfile->wq.wait);
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 1870ee2..c58d023 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -56,6 +56,7 @@
 #define TUNGETVNETHDRSZ _IOR('T', 215, int)
 #define TUNSETVNETHDRSZ _IOW('T', 216, int)
 #define TUNSETQUEUE  _IOW('T', 217, int)
+#define TUNSETIFINDEX	_IOW('T', 218, unsigned int)
 
 /* TUNSETIFF ifr flags */
 #define IFF_TUN		0x0001
-- 
1.7.6.5


* [PATCH 2/4] tun: Report whether the queue is attached or not
  2013-08-21 10:31 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) Pavel Emelyanov
  2013-08-21 10:31 ` [PATCH 1/4] tun: Add ability to create tun device with given index Pavel Emelyanov
@ 2013-08-21 10:32 ` Pavel Emelyanov
  2013-08-21 10:32 ` [PATCH 3/4] tun: Allow to skip filter on attach Pavel Emelyanov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-21 10:32 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

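Multiqueue tun devices allow a file to be attached to and detached
from a queue while the interface itself stays set on the file.

Knowing whether the queue is currently attached is critical for the
checkpoint part of the criu project, so TUNGETIFF now reports
IFF_DETACH_QUEUE for a detached queue.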
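
As a rough sketch of how a checkpointing tool could consume this, the code
below reads the flags with TUNGETIFF and tests IFF_DETACH_QUEUE. Both helper
names are made up for this example, and the three-way return value
(1, 0, or negative errno) is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Fallback for older headers; value as listed in if_tun.h. */
#ifndef IFF_DETACH_QUEUE
#define IFF_DETACH_QUEUE 0x0400
#endif

static_assert(IFF_DETACH_QUEUE == 0x0400, "unexpected IFF_DETACH_QUEUE value");

/* Hypothetical helper: interpret the flags returned by TUNGETIFF. */
static int tun_flags_detached(short flags)
{
	return (flags & IFF_DETACH_QUEUE) != 0;
}

/*
 * Hypothetical helper: returns 1 if the queue behind fd is detached,
 * 0 if it is attached, or a negative errno value if TUNGETIFF fails.
 */
static int tun_queue_is_detached(int fd)
{
	struct ifreq ifr;

	memset(&ifr, 0, sizeof(ifr));
	if (ioctl(fd, TUNGETIFF, &ifr) < 0)
		return -errno;

	return tun_flags_detached(ifr.ifr_flags);
}
```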

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 drivers/net/tun.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4b65fbc..db43a24 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1881,6 +1881,9 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	case TUNGETIFF:
 		tun_get_iff(current->nsproxy->net_ns, tun, &ifr);
 
+		if (tfile->detached)
+			ifr.ifr_flags |= IFF_DETACH_QUEUE;
+
 		if (copy_to_user(argp, &ifr, ifreq_len))
 			ret = -EFAULT;
 		break;
-- 
1.7.6.5


* [PATCH 3/4] tun: Allow to skip filter on attach
  2013-08-21 10:31 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) Pavel Emelyanov
  2013-08-21 10:31 ` [PATCH 1/4] tun: Add ability to create tun device with given index Pavel Emelyanov
  2013-08-21 10:32 ` [PATCH 2/4] tun: Report whether the queue is attached or not Pavel Emelyanov
@ 2013-08-21 10:32 ` Pavel Emelyanov
  2013-08-21 10:32 ` [PATCH 4/4] tun: Get skfilter layout Pavel Emelyanov
  2013-08-21 19:22 ` [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) David Miller
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-21 10:32 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

There's a small problem with sk-filters on tun devices. Consider
an application doing this sequence of steps:

fd = open("/dev/net/tun");
ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" });
ioctl(fd, TUNATTACHFILTER, &my_filter);
ioctl(fd, TUNSETPERSIST, 1);
close(fd);

At that point tun0 remains in the system and remembers that a socket
filter should exist at address '&my_filter'.

If after that we do

fd = open("/dev/net/tun");
ioctl(fd, TUNSETIFF, { .ifr_name = "tun0" });

we will most likely get -EFAULT, since tun_attach() will try to
re-attach the filter. But (!) if a filter happens to be present at
address &my_filter, the attach to tun0 will succeed and that "new"
filter will be attached, possibly without the application knowing.

This may cause problems for anyone using tun devices, but it is a
critical problem for c/r: if we meet a persistent tun device that
remembers a filter, we cannot attach to it to dump its state
(flags, owner, address, vnethdr size, etc.).

The proposal is to allow attaching to a tun device (with TUNSETIFF)
without attaching the filter to the tun-file's socket. After such an
attach, the application may e.g. clean the device by dropping the
filter if it doesn't want one, or (in the case of c/r) get
information about the device with tun ioctls.
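
A minimal sketch of the proposed attach, assuming a persistent device
already exists under the given name. The helper tun_attach_nofilter() is
hypothetical; the fallback #define uses the value this patch adds. One caveat
worth checking on the target kernel: if_tun.h appears to define IFF_NO_PI
with the same 0x1000 value, so the two flags may not be distinguishable in
ifr_flags.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Fallback for older headers; value taken from this patch. */
#ifndef IFF_NOFILTER
#define IFF_NOFILTER 0x1000
#endif

/*
 * Hypothetical helper: attach fd to an existing tun/tap device without
 * re-attaching the remembered socket filter.
 * type_flags is IFF_TUN or IFF_TAP (plus whatever else the caller needs).
 * Returns 0 on success or a negative errno value.
 */
static int tun_attach_nofilter(int fd, const char *name, short type_flags)
{
	struct ifreq ifr;

	assert(name != NULL);
	if (strlen(name) >= IFNAMSIZ)
		return -ENAMETOOLONG;

	memset(&ifr, 0, sizeof(ifr));
	ifr.ifr_flags = type_flags | IFF_NOFILTER;
	memcpy(ifr.ifr_name, name, strlen(name) + 1);

	if (ioctl(fd, TUNSETIFF, &ifr) < 0)
		return -errno;

	return 0;
}
```

After a successful call the descriptor can be used with the other tun ioctls
to query the device, as the description above suggests.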

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 drivers/net/tun.c           |   12 +++++++-----
 include/uapi/linux/if_tun.h |    1 +
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index db43a24..6acbdbc 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -501,7 +501,7 @@ static void tun_detach_all(struct net_device *dev)
 		module_put(THIS_MODULE);
 }
 
-static int tun_attach(struct tun_struct *tun, struct file *file)
+static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filter)
 {
 	struct tun_file *tfile = file->private_data;
 	int err;
@@ -526,7 +526,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file)
 	err = 0;
 
 	/* Re-attach the filter to presist device */
-	if (tun->filter_attached == true) {
+	if (!skip_filter && (tun->filter_attached == true)) {
 		err = sk_attach_filter(&tun->fprog, tfile->socket.sk);
 		if (!err)
 			goto out;
@@ -1557,7 +1557,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		if (err < 0)
 			return err;
 
-		err = tun_attach(tun, file);
+		err = tun_attach(tun, file, ifr->ifr_flags & IFF_NOFILTER);
 		if (err < 0)
 			return err;
 
@@ -1631,7 +1631,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 		dev->vlan_features = dev->features;
 
 		INIT_LIST_HEAD(&tun->disabled);
-		err = tun_attach(tun, file);
+		err = tun_attach(tun, file, false);
 		if (err < 0)
 			goto err_free_dev;
 
@@ -1795,7 +1795,7 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
 		ret = security_tun_dev_attach_queue(tun->security);
 		if (ret < 0)
 			goto unlock;
-		ret = tun_attach(tun, file);
+		ret = tun_attach(tun, file, false);
 	} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
 		tun = rtnl_dereference(tfile->tun);
 		if (!tun || !(tun->flags & TUN_TAP_MQ) || tfile->detached)
@@ -1883,6 +1883,8 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 
 		if (tfile->detached)
 			ifr.ifr_flags |= IFF_DETACH_QUEUE;
+		if (!tfile->socket.sk->sk_filter)
+			ifr.ifr_flags |= IFF_NOFILTER;
 
 		if (copy_to_user(argp, &ifr, ifreq_len))
 			ret = -EFAULT;
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index c58d023..cc127b2 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -71,6 +71,7 @@
 #define IFF_DETACH_QUEUE 0x0400
 /* read-only flag */
 #define IFF_PERSIST	0x0800
+#define IFF_NOFILTER	0x1000
 
 /* Socket options */
 #define TUN_TX_TIMESTAMP 1
-- 
1.7.6.5


* [PATCH 4/4] tun: Get skfilter layout
  2013-08-21 10:31 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) Pavel Emelyanov
                   ` (2 preceding siblings ...)
  2013-08-21 10:32 ` [PATCH 3/4] tun: Allow to skip filter on attach Pavel Emelyanov
@ 2013-08-21 10:32 ` Pavel Emelyanov
  2013-08-21 19:22 ` [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) David Miller
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-21 10:32 UTC (permalink / raw)
  To: David Miller, Linux Netdev List

The only thing we can get from the tun device is the fprog, which
contains the number of filter elements and a pointer to the
(user-space) memory where the elements are. The program itself may
not be available if the device is persistent and detached.

Add a TUNGETFILTER ioctl that copies this fprog to user space.
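
For illustration, reading the layout from user space could look like the
sketch below. The helper name tun_get_filter_layout() is an assumption for
this example; the fallback #define uses the value added by this patch. Note
that the returned pointer is only an address in the process that attached
the filter, so the sketch does not dereference it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/filter.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Fallback for older headers; value taken from this patch. */
#ifndef TUNGETFILTER
#define TUNGETFILTER _IOR('T', 219, struct sock_fprog)
#endif

/*
 * Hypothetical helper: fetch the remembered filter layout of the device
 * attached to fd. On success fprog->len holds the number of filter
 * elements and fprog->filter the user-space address that was given at
 * attach time (not necessarily valid in the calling process).
 * Returns 0 on success or a negative errno value; per the diff, -EINVAL
 * is expected for devices that are not of the TAP type.
 */
static int tun_get_filter_layout(int fd, struct sock_fprog *fprog)
{
	assert(fprog != NULL);

	memset(fprog, 0, sizeof(*fprog));
	if (ioctl(fd, TUNGETFILTER, fprog) < 0)
		return -errno;

	return 0;
}
```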

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 drivers/net/tun.c           |   10 ++++++++++
 include/uapi/linux/if_tun.h |    1 +
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 6acbdbc..60a1e93 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2042,6 +2042,16 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		tun_detach_filter(tun, tun->numqueues);
 		break;
 
+	case TUNGETFILTER:
+		ret = -EINVAL;
+		if ((tun->flags & TUN_TYPE_MASK) != TUN_TAP_DEV)
+			break;
+		ret = -EFAULT;
+		if (copy_to_user(argp, &tun->fprog, sizeof(tun->fprog)))
+			break;
+		ret = 0;
+		break;
+
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index cc127b2..e9502dd 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -57,6 +57,7 @@
 #define TUNSETVNETHDRSZ _IOW('T', 216, int)
 #define TUNSETQUEUE  _IOW('T', 217, int)
 #define TUNSETIFINDEX	_IOW('T', 218, unsigned int)
+#define TUNGETFILTER _IOR('T', 219, struct sock_fprog)
 
 /* TUNSETIFF ifr flags */
 #define IFF_TUN		0x0001
-- 
1.7.6.5


* Re: [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2)
  2013-08-21 10:31 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore (v2) Pavel Emelyanov
                   ` (3 preceding siblings ...)
  2013-08-21 10:32 ` [PATCH 4/4] tun: Get skfilter layout Pavel Emelyanov
@ 2013-08-21 19:22 ` David Miller
  4 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2013-08-21 19:22 UTC (permalink / raw)
  To: xemul; +Cc: netdev

From: Pavel Emelyanov <xemul@parallels.com>
Date: Wed, 21 Aug 2013 14:31:11 +0400

> After taking a closer look on tun checkpoint-restore I've found several
> issues with the tun's API that make it impossible to dump and restore
> the state of tun device and attached tun-files.
> 
> The proposed API changes are all about extending the existing ioctl-based
> stuff. Patches fit today's net-next.
> 
> This v2 has David's comments about patch #1 fixed. All the rest is the same.
> 
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

Series applied, thanks.


* [PATCH 2/4] tun: Report whether the queue is attached or not
  2013-08-19 15:09 [PATCH net-next 0/4] tun: Some bits required for tun's checkpoint-restore Pavel Emelyanov
@ 2013-08-19 15:09 ` Pavel Emelyanov
  0 siblings, 0 replies; 7+ messages in thread
From: Pavel Emelyanov @ 2013-08-19 15:09 UTC (permalink / raw)
  To: Linux Netdev List, David Miller

Multiqueue tun devices allow a file to be attached to and detached
from a queue while the interface itself stays set on the file.

Knowing whether the queue is currently attached is critical for the
checkpoint part of the criu project.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 drivers/net/tun.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a12450b..167222f 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1881,6 +1881,9 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	case TUNGETIFF:
 		tun_get_iff(current->nsproxy->net_ns, tun, &ifr);
 
+		if (tfile->detached)
+			ifr.ifr_flags |= IFF_DETACH_QUEUE;
+
 		if (copy_to_user(argp, &ifr, ifreq_len))
 			ret = -EFAULT;
 		break;
-- 
1.7.6.5
