All of lore.kernel.org
 help / color / mirror / Atom feed
* [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
@ 2011-09-06 22:35 Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 1/3 RFC] macvlan: Add support for unicast filtering in macvlan Roopa Prabhu
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-06 22:35 UTC (permalink / raw)
  To: netdev; +Cc: dragos.tatulea, arnd, mst, dwang2, benve, kaber, sri

This patch is an attempt at providing address filtering support for macvtap 
devices in PASSTHRU mode. Its still a work in progress.
Briefly tested for basic functionality. Wanted to get some feedback on the 
direction before proceeding.

I have hopefully CC'ed all concerned people.

PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
there is a 1-1 mapping between macvtap device and physical nic or VF. And all
filtering is done in lowerdev hw. The lowerdev does not need to be in 
promiscous mode as long as the guest filters are passed down to the lowerdev. 
This patch tries to remove the need for putting the lowerdev in promiscous mode. 
I have also referred to the thread below where TUNSETTXFILTER was mentioned in 
this context: 
 http://patchwork.ozlabs.org/patch/69297/

This patch basically passes the addresses got by TUNSETTXFILTER to macvlan 
lowerdev.

I have looked at previous work and discussions on this for qemu-kvm 
by Michael Tsirkin, Alex Williamson and Dragos Tatulea
http://patchwork.ozlabs.org/patch/78595/
http://patchwork.ozlabs.org/patch/47160/
https://patchwork.kernel.org/patch/474481/

Redhat bugzilla by Michael Tsirkin:
https://bugzilla.redhat.com/show_bug.cgi?id=655013

I used Michael's qemu-kvm patch for testing the changes with KVM 

I would like to cover both MAC and vlan filtering in this work.

Open Questions/Issues:
- There is a need for vlan filtering to complete the patch. It will require 
  a new tap ioctl cmd for vlans. 
  Some ideas on this are: 

  a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter 
	(similar to tun_filter for addresses). Passing the vlan id's to lower
	device will mean going thru the whole list of vlans every time.

  OR

  b) TUNSETVLAN with vlan id and flag to set/unset

  Does option 'b' sound ok ?

- In this implementation we make the macvlan address list same as the address 
  list that came in the filter with TUNSETTXFILTER. This will not cover cases 
  where the macvlan device needs to have other addresses that are not 
  necessarily in the filter. Is this a problem ?

- The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST 
filter flags to lowerdev

This patch series implements the following 
01/3 - macvlan: Add support for unicast filtering in macvlan 
02/3 - macvlan: Add function to set addr filter on lower device in passthru mode
03/3 - macvtap: Add support for TUNSETTXFILTER

Please comment. Thanks.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [net-next-2.6 PATCH 1/3 RFC] macvlan: Add support for unicast filtering in macvlan
  2011-09-06 22:35 [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Roopa Prabhu
@ 2011-09-06 22:35 ` Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 2/3 RFC] macvlan: Add function to set addr filters for device in passthru mode Roopa Prabhu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-06 22:35 UTC (permalink / raw)
  To: netdev; +Cc: dragos.tatulea, arnd, mst, dwang2, benve, kaber, sri

From: Roopa Prabhu <roprabhu@cisco.com>

This patch adds support for ndo_set_rx_mode and sets IFF_UNICAST_FLT to enable
unicast filtering in macvlan. This is to support unicast and multicast
address filtering when this device is used in PASSTHRU mode

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/macvlan.c |   20 ++++++++++++++++++--
 1 files changed, 18 insertions(+), 2 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 836e13f..528924f 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -316,11 +316,19 @@ static int macvlan_open(struct net_device *dev)
 		if (err < 0)
 			goto del_unicast;
 	}
+	if (dev->flags & IFF_PROMISC) {
+		err = dev_set_promiscuity(lowerdev, 1);
+		if (err < 0)
+			goto unset_allmulti;
+	}
 
 hash_add:
 	macvlan_hash_add(vlan);
 	return 0;
 
+unset_allmulti:
+	dev_set_allmulti(lowerdev, -1);
+
 del_unicast:
 	dev_uc_del(lowerdev, dev->dev_addr);
 out:
@@ -337,9 +345,12 @@ static int macvlan_stop(struct net_device *dev)
 		goto hash_del;
 	}
 
+	dev_uc_unsync(lowerdev, dev);
 	dev_mc_unsync(lowerdev, dev);
 	if (dev->flags & IFF_ALLMULTI)
 		dev_set_allmulti(lowerdev, -1);
+	if (dev->flags & IFF_PROMISC)
+		dev_set_promiscuity(lowerdev, -1);
 
 	dev_uc_del(lowerdev, dev->dev_addr);
 
@@ -384,12 +395,16 @@ static void macvlan_change_rx_flags(struct net_device *dev, int change)
 
 	if (change & IFF_ALLMULTI)
 		dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1);
+	if (change & IFF_PROMISC)
+		dev_set_promiscuity(lowerdev,
+			dev->flags & IFF_PROMISC ? 1 : -1);
 }
 
-static void macvlan_set_multicast_list(struct net_device *dev)
+static void macvlan_set_rx_mode(struct net_device *dev)
 {
 	struct macvlan_dev *vlan = netdev_priv(dev);
 
+	dev_uc_sync(vlan->lowerdev, dev);
 	dev_mc_sync(vlan->lowerdev, dev);
 }
 
@@ -561,7 +576,7 @@ static const struct net_device_ops macvlan_netdev_ops = {
 	.ndo_change_mtu		= macvlan_change_mtu,
 	.ndo_change_rx_flags	= macvlan_change_rx_flags,
 	.ndo_set_mac_address	= macvlan_set_mac_address,
-	.ndo_set_rx_mode	= macvlan_set_multicast_list,
+	.ndo_set_rx_mode	= macvlan_set_rx_mode,
 	.ndo_get_stats64	= macvlan_dev_get_stats64,
 	.ndo_validate_addr	= eth_validate_addr,
 	.ndo_vlan_rx_add_vid	= macvlan_vlan_rx_add_vid,
@@ -573,6 +588,7 @@ void macvlan_common_setup(struct net_device *dev)
 	ether_setup(dev);
 
 	dev->priv_flags	       &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
+	dev->priv_flags	       |= IFF_UNICAST_FLT;
 	dev->netdev_ops		= &macvlan_netdev_ops;
 	dev->destructor		= free_netdev;
 	dev->header_ops		= &macvlan_hard_header_ops,

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [net-next-2.6 PATCH 2/3 RFC] macvlan: Add function to set addr filters for device in passthru mode
  2011-09-06 22:35 [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 1/3 RFC] macvlan: Add support for unicast filtering in macvlan Roopa Prabhu
@ 2011-09-06 22:35 ` Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER Roopa Prabhu
  2011-09-07 12:34 ` [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Michael S. Tsirkin
  3 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-06 22:35 UTC (permalink / raw)
  To: netdev; +Cc: dragos.tatulea, arnd, mst, dwang2, benve, kaber, sri

From: Roopa Prabhu <roprabhu@cisco.com>

This patch introduces a function that accepts address list and filter flags
and sets them on the macvlan netdev. Currently it only supports PASSTHRU mode.
It also removes the code that puts the device in promiscous mode for PASSTHRU

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/macvlan.c      |  130 ++++++++++++++++++++++++++++++++++++++------
 include/linux/if_macvlan.h |    3 +
 2 files changed, 114 insertions(+), 19 deletions(-)


diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index 528924f..ba8b5a6 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -299,18 +299,16 @@ static int macvlan_open(struct net_device *dev)
 	struct net_device *lowerdev = vlan->lowerdev;
 	int err;
 
-	if (vlan->port->passthru) {
-		dev_set_promiscuity(lowerdev, 1);
-		goto hash_add;
-	}
+	if (!vlan->port->passthru) {
+		err = -EBUSY;
+		if (macvlan_addr_busy(vlan->port, dev->dev_addr))
+			goto out;
 
-	err = -EBUSY;
-	if (macvlan_addr_busy(vlan->port, dev->dev_addr))
-		goto out;
+		err = dev_uc_add(lowerdev, dev->dev_addr);
+		if (err < 0)
+			goto out;
+	}
 
-	err = dev_uc_add(lowerdev, dev->dev_addr);
-	if (err < 0)
-		goto out;
 	if (dev->flags & IFF_ALLMULTI) {
 		err = dev_set_allmulti(lowerdev, 1);
 		if (err < 0)
@@ -322,7 +320,6 @@ static int macvlan_open(struct net_device *dev)
 			goto unset_allmulti;
 	}
 
-hash_add:
 	macvlan_hash_add(vlan);
 	return 0;
 
@@ -330,7 +327,8 @@ unset_allmulti:
 	dev_set_allmulti(lowerdev, -1);
 
 del_unicast:
-	dev_uc_del(lowerdev, dev->dev_addr);
+	if (!vlan->port->passthru)
+		dev_uc_del(lowerdev, dev->dev_addr);
 out:
 	return err;
 }
@@ -340,11 +338,6 @@ static int macvlan_stop(struct net_device *dev)
 	struct macvlan_dev *vlan = netdev_priv(dev);
 	struct net_device *lowerdev = vlan->lowerdev;
 
-	if (vlan->port->passthru) {
-		dev_set_promiscuity(lowerdev, -1);
-		goto hash_del;
-	}
-
 	dev_uc_unsync(lowerdev, dev);
 	dev_mc_unsync(lowerdev, dev);
 	if (dev->flags & IFF_ALLMULTI)
@@ -352,9 +345,9 @@ static int macvlan_stop(struct net_device *dev)
 	if (dev->flags & IFF_PROMISC)
 		dev_set_promiscuity(lowerdev, -1);
 
-	dev_uc_del(lowerdev, dev->dev_addr);
+	if (!vlan->port->passthru)
+		dev_uc_del(lowerdev, dev->dev_addr);
 
-hash_del:
 	macvlan_hash_del(vlan, !dev->dismantle);
 	return 0;
 }
@@ -854,6 +847,105 @@ static int macvlan_device_event(struct notifier_block *unused,
 	return NOTIFY_DONE;
 }
 
+static int macvlan_set_filter_passthru(struct net_device *dev,
+					unsigned int flags, int count,
+					u8 *addrlist)
+{
+	struct netdev_hw_addr *ha;
+	u8 *addr;
+	int i, found;
+	unsigned int flags_changed;
+
+	rtnl_lock();
+
+	/*
+	 *	Only look for changes in IFF_PROMISC and IFF_ALLMULTI for now.
+	 *	Ideally this list for flags to look must come from the caller
+	 */
+	flags_changed = (dev->flags ^ flags) & (IFF_PROMISC | IFF_ALLMULTI);
+	if (flags_changed)
+		dev_change_flags(dev, dev->flags ^ flags_changed);
+
+	if (!count) {
+		rtnl_unlock();
+		return 0;
+	}
+
+	/* Delete all multicast addresses not in use */
+	netdev_for_each_mc_addr(ha, dev) {
+		for (i = 0, addr = addrlist; i < count; i++) {
+			if (is_multicast_ether_addr(addr)
+				&& !compare_ether_addr(addr,
+				ha->addr))
+				break;
+			addr += ETH_ALEN;
+		}
+		if (i == count)
+			dev_mc_del(dev, ha->addr);
+	}
+
+	/* Delete all unicast addresses not in use */
+	netdev_for_each_uc_addr(ha, dev) {
+		for (i = 0, addr = addrlist; i < count; i++) {
+			if (!is_multicast_ether_addr(addr)
+				&& !compare_ether_addr(addr,
+				ha->addr))
+				break;
+			addr += ETH_ALEN;
+		}
+		if (i == count)
+			dev_uc_del(dev, ha->addr);
+	}
+
+	/* Add new addresses */
+	for (i = 0, addr = addrlist; i < count; i++) {
+		found = 0;
+		if (is_multicast_ether_addr(addr)) {
+			netdev_for_each_mc_addr(ha, dev) {
+				if (compare_ether_addr(addr,
+					ha->addr) == 0) {
+					found = 1;
+					break;
+				}
+			}
+			if (!found)
+				dev_mc_add(dev, addr);
+		} else {
+			netdev_for_each_uc_addr(ha, dev) {
+				if (compare_ether_addr(addr,
+					ha->addr) == 0) {
+					found = 1;
+					break;
+				}
+			}
+			if (!found)
+				dev_uc_add(dev, addr);
+		}
+		addr += ETH_ALEN;
+	}
+	rtnl_unlock();
+
+	return count;
+}
+
+int macvlan_set_filter(struct net_device *dev, unsigned int flags,
+			int count, u8 *addrlist)
+{
+	struct macvlan_dev *vlan = netdev_priv(dev);
+
+	if (!vlan)
+		return -EINVAL;
+
+	switch (vlan->mode) {
+	case MACVLAN_MODE_PASSTHRU:
+		return macvlan_set_filter_passthru(dev, flags, count,
+							addrlist);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+EXPORT_SYMBOL_GPL(macvlan_set_filter);
+
 static struct notifier_block macvlan_notifier_block __read_mostly = {
 	.notifier_call	= macvlan_device_event,
 };
diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h
index e28b2e4..d55d4e6 100644
--- a/include/linux/if_macvlan.h
+++ b/include/linux/if_macvlan.h
@@ -104,4 +104,7 @@ extern int macvlan_link_register(struct rtnl_link_ops *ops);
 extern netdev_tx_t macvlan_start_xmit(struct sk_buff *skb,
 				      struct net_device *dev);
 
+extern int macvlan_set_filter(struct net_device *dev, unsigned int flags,
+			      int count, u8 *addrlist);
+
 #endif /* _LINUX_IF_MACVLAN_H */

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER
  2011-09-06 22:35 [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 1/3 RFC] macvlan: Add support for unicast filtering in macvlan Roopa Prabhu
  2011-09-06 22:35 ` [net-next-2.6 PATCH 2/3 RFC] macvlan: Add function to set addr filters for device in passthru mode Roopa Prabhu
@ 2011-09-06 22:35 ` Roopa Prabhu
  2011-09-08 16:25   ` Arnd Bergmann
  2011-09-07 12:34 ` [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Michael S. Tsirkin
  3 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-06 22:35 UTC (permalink / raw)
  To: netdev; +Cc: dragos.tatulea, arnd, mst, dwang2, benve, kaber, sri

From: Roopa Prabhu <roprabhu@cisco.com>

This patch adds support for TUNSETTXFILTER. Calls macvlan set filter function
with address list and flags received via TUNSETTXFILTER.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
---
 drivers/net/macvtap.c |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 49 insertions(+), 0 deletions(-)


diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index ab96c31..9943683 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -825,6 +825,10 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 	int __user *sp = argp;
 	int s;
 	int ret;
+	u8 *addrs;
+	struct tun_filter tf;
+	unsigned int flags = 0;
+	int alen;
 
 	switch (cmd) {
 	case TUNSETIFF:
@@ -896,6 +900,51 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
 			return  -EINVAL;
 		return 0;
 
+	case TUNSETTXFILTER:
+		rcu_read_lock_bh();
+		vlan = rcu_dereference_bh(q->vlan);
+		if (vlan)
+			dev_hold(vlan->dev);
+		rcu_read_unlock_bh();
+
+		if (!vlan)
+			return -ENOLINK;
+
+		if (copy_from_user(&tf, argp, sizeof(tf))) {
+			dev_put(vlan->dev);
+			return -EFAULT;
+		}
+
+		/* XXX: If broadcast address present, set IFF_BROADCAST */
+		/* XXX: If multicast address present, set IFF_MULTICAST */
+		flags |= (tf.flags & TUN_FLT_ALLMULTI ? IFF_ALLMULTI : 0) |
+			 (!tf.count ? IFF_PROMISC : 0);
+		ret = 0;
+		if (tf.count > 0) {
+			alen = ETH_ALEN * tf.count;
+			addrs = kmalloc(alen, GFP_KERNEL);
+			if (!addrs) {
+				dev_put(vlan->dev);
+				return -ENOMEM;
+			}
+
+			if (copy_from_user(addrs, argp + sizeof(tf), alen)) {
+				kfree(addrs);
+				dev_put(vlan->dev);
+				return -EFAULT;
+			}
+		}
+
+		if (tf.count > 0 || flags)
+			ret = macvlan_set_filter(vlan->dev, flags,
+				tf.count, addrs);
+		dev_put(vlan->dev);
+
+		if (tf.count > 0)
+			kfree(addrs);
+
+		return ret;
+
 	default:
 		return -EINVAL;
 	}

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-06 22:35 [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Roopa Prabhu
                   ` (2 preceding siblings ...)
  2011-09-06 22:35 ` [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER Roopa Prabhu
@ 2011-09-07 12:34 ` Michael S. Tsirkin
  2011-09-08  5:20   ` Roopa Prabhu
  3 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-07 12:34 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri

On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
> This patch is an attempt at providing address filtering support for macvtap 
> devices in PASSTHRU mode. Its still a work in progress.
> Briefly tested for basic functionality. Wanted to get some feedback on the 
> direction before proceeding.
> 

Good work, thanks.

> I have hopefully CC'ed all concerned people.

kvm crowd might also be interested.
Try using ./scripts/get_maintainer.pl as well.

> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
> there is a 1-1 mapping between macvtap device and physical nic or VF. And all
> filtering is done in lowerdev hw. The lowerdev does not need to be in 
> promiscous mode as long as the guest filters are passed down to the lowerdev. 
> This patch tries to remove the need for putting the lowerdev in promiscous mode. 
> I have also referred to the thread below where TUNSETTXFILTER was mentioned in 
> this context: 
>  http://patchwork.ozlabs.org/patch/69297/
> 
> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan 
> lowerdev.
> 
> I have looked at previous work and discussions on this for qemu-kvm 
> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
> http://patchwork.ozlabs.org/patch/78595/
> http://patchwork.ozlabs.org/patch/47160/
> https://patchwork.kernel.org/patch/474481/
> 
> Redhat bugzilla by Michael Tsirkin:
> https://bugzilla.redhat.com/show_bug.cgi?id=655013
> 
> I used Michael's qemu-kvm patch for testing the changes with KVM 
> 
> I would like to cover both MAC and vlan filtering in this work.
> 
> Open Questions/Issues:
> - There is a need for vlan filtering to complete the patch. It will require 
>   a new tap ioctl cmd for vlans. 
>   Some ideas on this are: 
> 
>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter 
> 	(similar to tun_filter for addresses). Passing the vlan id's to lower
> 	device will mean going thru the whole list of vlans every time.
> 
>   OR
> 
>   b) TUNSETVLAN with vlan id and flag to set/unset
> 
>   Does option 'b' sound ok ?
> 
> - In this implementation we make the macvlan address list same as the address 
>   list that came in the filter with TUNSETTXFILTER. This will not cover cases 
>   where the macvlan device needs to have other addresses that are not 
>   necessarily in the filter. Is this a problem ?

What cases do you have in mind?

> - The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST 
> filter flags to lowerdev
> 
> This patch series implements the following 
> 01/3 - macvlan: Add support for unicast filtering in macvlan 
> 02/3 - macvlan: Add function to set addr filter on lower device in passthru mode
> 03/3 - macvtap: Add support for TUNSETTXFILTER
> 
> Please comment. Thanks.
> 
> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> Signed-off-by: Christian Benvenuti <benve@cisco.com>
> Signed-off-by: David Wang <dwang2@cisco.com>

The security isn't lower than with promisc, so I don't see
a problem with this as such.

There are more features we'll want down the road though,
so let's see whether the interface will be able to
satisfy them in a backwards compatible way before we
set it in stone. Here's what I came up with:

How will the filtering table be partitioned within guests?

A way to limit what the guest can do would also be useful.
How can this be done? selinux?

Any thoughts on spoofing filtering?

Would it be possible to make the filtering programmable
using netlink, e.g. ethtool, ip, or some such?
That would make this useful for bridged setups besides
macvtap/virtualization.

Thanks,

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-07 12:34 ` [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Michael S. Tsirkin
@ 2011-09-08  5:20   ` Roopa Prabhu
  2011-09-08 11:08     ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-08  5:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
>> This patch is an attempt at providing address filtering support for macvtap
>> devices in PASSTHRU mode. Its still a work in progress.
>> Briefly tested for basic functionality. Wanted to get some feedback on the
>> direction before proceeding.
>> 
> 
> Good work, thanks.
> 

Thanks.

>> I have hopefully CC'ed all concerned people.
> 
> kvm crowd might also be interested.
> Try using ./scripts/get_maintainer.pl as well.
> 
Thanks for the tip. Expanded CC list a bit more.

>> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
>> there is a 1-1 mapping between macvtap device and physical nic or VF. And all
>> filtering is done in lowerdev hw. The lowerdev does not need to be in
>> promiscous mode as long as the guest filters are passed down to the lowerdev.
>> This patch tries to remove the need for putting the lowerdev in promiscous
>> mode. 
>> I have also referred to the thread below where TUNSETTXFILTER was mentioned
>> in 
>> this context: 
>>  http://patchwork.ozlabs.org/patch/69297/
>> 
>> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
>> lowerdev.
>> 
>> I have looked at previous work and discussions on this for qemu-kvm
>> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
>> http://patchwork.ozlabs.org/patch/78595/
>> http://patchwork.ozlabs.org/patch/47160/
>> https://patchwork.kernel.org/patch/474481/
>> 
>> Redhat bugzilla by Michael Tsirkin:
>> https://bugzilla.redhat.com/show_bug.cgi?id=655013
>> 
>> I used Michael's qemu-kvm patch for testing the changes with KVM
>> 
>> I would like to cover both MAC and vlan filtering in this work.
>> 
>> Open Questions/Issues:
>> - There is a need for vlan filtering to complete the patch. It will require
>>   a new tap ioctl cmd for vlans.
>>   Some ideas on this are:
>> 
>>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter
>> (similar to tun_filter for addresses). Passing the vlan id's to lower
>> device will mean going thru the whole list of vlans every time.
>> 
>>   OR
>> 
>>   b) TUNSETVLAN with vlan id and flag to set/unset
>> 
>>   Does option 'b' sound ok ?
>> 
>> - In this implementation we make the macvlan address list same as the address
>>   list that came in the filter with TUNSETTXFILTER. This will not cover cases
>>   where the macvlan device needs to have other addresses that are not
>>   necessarily in the filter. Is this a problem ?
> 
> What cases do you have in mind?
> 
This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
see a problem with uc/mc address list being the same in all the stacked
netdevs in the path. I called that out above to make sure I was not missing
any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
a problem in the simple PASSTHRU use case this patch supports.

>> - The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST
>> filter flags to lowerdev
>> 
>> This patch series implements the following
>> 01/3 - macvlan: Add support for unicast filtering in macvlan
>> 02/3 - macvlan: Add function to set addr filter on lower device in passthru
>> mode
>> 03/3 - macvtap: Add support for TUNSETTXFILTER
>> 
>> Please comment. Thanks.
>> 
>> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
>> Signed-off-by: Christian Benvenuti <benve@cisco.com>
>> Signed-off-by: David Wang <dwang2@cisco.com>
> 
> The security isn't lower than with promisc, so I don't see
> a problem with this as such.
> 
> There are more features we'll want down the road though,
> so let's see whether the interface will be able to
> satisfy them in a backwards compatible way before we
> set it in stone. Here's what I came up with:
> 
> How will the filtering table be partitioned within guests?

Since this patch supports macvlan PASSTHRU mode only, in which the lower
device has 1-1 mapping to the guest nic, it does not require any
partitioning of filtering table within guests. Unless I missed understanding
something. 

If the lower device were being shared by multiple guest network interfaces
(non PASSTHRU mode), only then we will need to maintain separate filter
tables for each guest network interface in macvlan and forward the pkt to
respective guest interface after a filter lookup. This could affect
performance too I think.

I chose to support PASSTHRU Mode only at first because its simpler and all
code additions are in control path only.

> 
> A way to limit what the guest can do would also be useful.
> How can this be done? selinux?

I vaguely remember a thread on the same context.. had a suggestion to
maintain pre-approved address lists and allow guest filter registration of
only those addresses for security. This seemed reasonable. Plus the ability
to support additional address registration from guest could be made
configurable (One of your ideas again from prior work).

I am not an selinux expert, but I am thinking we can use it to only allow or
disallow access or operations to the macvtap device. (?). I will check more
on this.

> 
> Any thoughts on spoofing filtering?

I can only think of checking addresses against an allowed address list.
Don't know of any other ways. Any hints ?

In any case I am assuming all the protection/security measures should be
taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
use case its libvirt or qemu-kvm. No ?

> 
> Would it be possible to make the filtering programmable
> using netlink, e.g. ethtool, ip, or some such?

Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
thinking of macvlan having a netlink interface to set filter and not ioctl
?. Sure. But I was thinking the point of implementing TUNSETTXFILTER was to
maintain compatibility with the generic tap interface that does the same
thing. 
And having both the netlink op and ioctl interface might not be clean ?.
Sorry if I misunderstood your question.

> That would make this useful for bridged setups besides
> macvtap/virtualization.
> 

Thanks for the comments. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08  5:20   ` Roopa Prabhu
@ 2011-09-08 11:08     ` Michael S. Tsirkin
  2011-09-08 16:19       ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-08 11:08 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
> On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
> >> This patch is an attempt at providing address filtering support for macvtap
> >> devices in PASSTHRU mode. Its still a work in progress.
> >> Briefly tested for basic functionality. Wanted to get some feedback on the
> >> direction before proceeding.
> >> 
> > 
> > Good work, thanks.
> > 
> 
> Thanks.
> 
> >> I have hopefully CC'ed all concerned people.
> > 
> > kvm crowd might also be interested.
> > Try using ./scripts/get_maintainer.pl as well.
> > 
> Thanks for the tip. Expanded CC list a bit more.
> 
> >> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
> >> there is a 1-1 mapping between macvtap device and physical nic or VF. And all
> >> filtering is done in lowerdev hw. The lowerdev does not need to be in
> >> promiscous mode as long as the guest filters are passed down to the lowerdev.
> >> This patch tries to remove the need for putting the lowerdev in promiscous
> >> mode. 
> >> I have also referred to the thread below where TUNSETTXFILTER was mentioned
> >> in 
> >> this context: 
> >>  http://patchwork.ozlabs.org/patch/69297/
> >> 
> >> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
> >> lowerdev.
> >> 
> >> I have looked at previous work and discussions on this for qemu-kvm
> >> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
> >> http://patchwork.ozlabs.org/patch/78595/
> >> http://patchwork.ozlabs.org/patch/47160/
> >> https://patchwork.kernel.org/patch/474481/
> >> 
> >> Redhat bugzilla by Michael Tsirkin:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=655013
> >> 
> >> I used Michael's qemu-kvm patch for testing the changes with KVM
> >> 
> >> I would like to cover both MAC and vlan filtering in this work.
> >> 
> >> Open Questions/Issues:
> >> - There is a need for vlan filtering to complete the patch. It will require
> >>   a new tap ioctl cmd for vlans.
> >>   Some ideas on this are:
> >> 
> >>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap filter
> >> (similar to tun_filter for addresses). Passing the vlan id's to lower
> >> device will mean going thru the whole list of vlans every time.
> >> 
> >>   OR
> >> 
> >>   b) TUNSETVLAN with vlan id and flag to set/unset
> >> 
> >>   Does option 'b' sound ok ?
> >> 
> >> - In this implementation we make the macvlan address list same as the address
> >>   list that came in the filter with TUNSETTXFILTER. This will not cover cases
> >>   where the macvlan device needs to have other addresses that are not
> >>   necessarily in the filter. Is this a problem ?
> > 
> > What cases do you have in mind?
> > 
> This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
> see a problem with uc/mc address list being the same in all the stacked
> netdevs in the path. I called that out above to make sure I was not missing
> any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
> a problem in the simple PASSTHRU use case this patch supports.
> 
> >> - The patch currently only supports passing of IFF_PROMISC and IFF_MULTICAST
> >> filter flags to lowerdev
> >> 
> >> This patch series implements the following
> >> 01/3 - macvlan: Add support for unicast filtering in macvlan
> >> 02/3 - macvlan: Add function to set addr filter on lower device in passthru
> >> mode
> >> 03/3 - macvtap: Add support for TUNSETTXFILTER
> >> 
> >> Please comment. Thanks.
> >> 
> >> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> >> Signed-off-by: Christian Benvenuti <benve@cisco.com>
> >> Signed-off-by: David Wang <dwang2@cisco.com>
> > 
> > The security isn't lower than with promisc, so I don't see
> > a problem with this as such.
> > 
> > There are more features we'll want down the road though,
> > so let's see whether the interface will be able to
> > satisfy them in a backwards compatible way before we
> > set it in stone. Here's what I came up with:
> > 
> > How will the filtering table be partitioned within guests?
> 
> Since this patch supports macvlan PASSTHRU mode only, in which the lower
> device has 1-1 mapping to the guest nic, it does not require any
> partitioning of filtering table within guests. Unless I missed understanding
> something. 
> If the lower device were being shared by multiple guest network interfaces
> (non PASSTHRU mode), only then we will need to maintain separate filter
> tables for each guest network interface in macvlan and forward the pkt to
> respective guest interface after a filter lookup. This could affect
> performance too I think.

Not with hardware filtering support. Which is where we'd need to
partition the host nic mac table between guests.

> I chose to support PASSTHRU Mode only at first because its simpler and all
> code additions are in control path only.

I agree. It would be a bit silly to have a dedicated interface
for passthough and a completely separate one for
non passthrough.

> > 
> > A way to limit what the guest can do would also be useful.
> > How can this be done? selinux?
> 
> I vaguely remember a thread on the same context.. had a suggestion to
> maintain pre-approved address lists and allow guest filter registration of
> only those addresses for security. This seemed reasonable. Plus the ability
> to support additional address registration from guest could be made
> configurable (One of your ideas again from prior work).
> 
> I am not an selinux expert, but I am thinking we can use it to only allow or
> disallow access or operations to the macvtap device. (?). I will check more
> on this.

We'd have to have a way to revoke that as well.

> > 
> > Any thoughts on spoofing filtering?
> 
> I can only think of checking addresses against an allowed address list.
> Don't know of any other ways. Any hints ?

Hardware (esp SRIOV) often has ways to do this check, too.

> 
> In any case I am assuming all the protection/security measures should be
> taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
> use case its libvirt or qemu-kvm. No ?

Ideally we'd have a way to separate these capabilities, so that libvirt
can override qemu.

> > 
> > Would it be possible to make the filtering programmable
> > using netlink, e.g. ethtool, ip, or some such?
> 
> Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
> thinking of macvlan having a netlink interface to set filter and not ioctl
> ?. Sure.

Yes.

> But I was thinking the point of implementing TUNSETTXFILTER was to
> maintain compatibility with the generic tap interface that does the same
> thing. 

Yes. OTOH I don't think anyone uses that ATM so it might not
be important if it's not a good fit.
E.g. we could notify libvirt and have it use netlink for us
if we like that better.

> And having both the netlink op and ioctl interface might not be clean ?.

No idea.

> Sorry if I misunderstood your question.
> 
> > That would make this useful for bridged setups besides
> > macvtap/virtualization.
> > 
> 
> Thanks for the comments. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 11:08     ` Michael S. Tsirkin
@ 2011-09-08 16:19       ` Roopa Prabhu
  2011-09-08 17:42         ` Sridhar Samudrala
  2011-09-08 19:11         ` Michael S. Tsirkin
  0 siblings, 2 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-08 16:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm




On 9/8/11 4:08 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
>> On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
>>>> This patch is an attempt at providing address filtering support for macvtap
>>>> devices in PASSTHRU mode. Its still a work in progress.
>>>> Briefly tested for basic functionality. Wanted to get some feedback on the
>>>> direction before proceeding.
>>>> 
>>> 
>>> Good work, thanks.
>>> 
>> 
>> Thanks.
>> 
>>>> I have hopefully CC'ed all concerned people.
>>> 
>>> kvm crowd might also be interested.
>>> Try using ./scripts/get_maintainer.pl as well.
>>> 
>> Thanks for the tip. Expanded CC list a bit more.
>> 
>>>> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
>>>> there is a 1-1 mapping between macvtap device and physical nic or VF. And
>>>> all
>>>> filtering is done in lowerdev hw. The lowerdev does not need to be in
>>>> promiscous mode as long as the guest filters are passed down to the
>>>> lowerdev.
>>>> This patch tries to remove the need for putting the lowerdev in promiscous
>>>> mode. 
>>>> I have also referred to the thread below where TUNSETTXFILTER was mentioned
>>>> in 
>>>> this context: 
>>>>  http://patchwork.ozlabs.org/patch/69297/
>>>> 
>>>> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
>>>> lowerdev.
>>>> 
>>>> I have looked at previous work and discussions on this for qemu-kvm
>>>> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
>>>> http://patchwork.ozlabs.org/patch/78595/
>>>> http://patchwork.ozlabs.org/patch/47160/
>>>> https://patchwork.kernel.org/patch/474481/
>>>> 
>>>> Redhat bugzilla by Michael Tsirkin:
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=655013
>>>> 
>>>> I used Michael's qemu-kvm patch for testing the changes with KVM
>>>> 
>>>> I would like to cover both MAC and vlan filtering in this work.
>>>> 
>>>> Open Questions/Issues:
>>>> - There is a need for vlan filtering to complete the patch. It will require
>>>>   a new tap ioctl cmd for vlans.
>>>>   Some ideas on this are:
>>>> 
>>>>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap
>>>> filter
>>>> (similar to tun_filter for addresses). Passing the vlan id's to lower
>>>> device will mean going thru the whole list of vlans every time.
>>>> 
>>>>   OR
>>>> 
>>>>   b) TUNSETVLAN with vlan id and flag to set/unset
>>>> 
>>>>   Does option 'b' sound ok ?
>>>> 
>>>> - In this implementation we make the macvlan address list same as the
>>>> address
>>>>   list that came in the filter with TUNSETTXFILTER. This will not cover
>>>> cases
>>>>   where the macvlan device needs to have other addresses that are not
>>>>   necessarily in the filter. Is this a problem ?
>>> 
>>> What cases do you have in mind?
>>> 
>> This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
>> see a problem with uc/mc address list being the same in all the stacked
>> netdevs in the path. I called that out above to make sure I was not missing
>> any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
>> a problem in the simple PASSTHRU use case this patch supports.
>> 
>>>> - The patch currently only supports passing of IFF_PROMISC and
>>>> IFF_MULTICAST
>>>> filter flags to lowerdev
>>>> 
>>>> This patch series implements the following
>>>> 01/3 - macvlan: Add support for unicast filtering in macvlan
>>>> 02/3 - macvlan: Add function to set addr filter on lower device in passthru
>>>> mode
>>>> 03/3 - macvtap: Add support for TUNSETTXFILTER
>>>> 
>>>> Please comment. Thanks.
>>>> 
>>>> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
>>>> Signed-off-by: Christian Benvenuti <benve@cisco.com>
>>>> Signed-off-by: David Wang <dwang2@cisco.com>
>>> 
>>> The security isn't lower than with promisc, so I don't see
>>> a problem with this as such.
>>> 
>>> There are more features we'll want down the road though,
>>> so let's see whether the interface will be able to
>>> satisfy them in a backwards compatible way before we
>>> set it in stone. Here's what I came up with:
>>> 
>>> How will the filtering table be partitioned within guests?
>> 
>> Since this patch supports macvlan PASSTHRU mode only, in which the lower
>> device has 1-1 mapping to the guest nic, it does not require any
>> partitioning of filtering table within guests. Unless I missed understanding
>> something. 
>> If the lower device were being shared by multiple guest network interfaces
>> (non PASSTHRU mode), only then we will need to maintain separate filter
>> tables for each guest network interface in macvlan and forward the pkt to
>> respective guest interface after a filter lookup. This could affect
>> performance too I think.
> 
> Not with hardware filtering support. Which is where we'd need to
> partition the host nic mac table between guests.
> 
I need to understand this more. In non passthru case when a VF or physical
nic is shared between guests, the nic does not really know about the guests,
so I was thinking we do the same thing as we do for the passthru case (ie
send all the address filters from macvlan to the physical nic). So at the
hardware, filtering is done for all guests sharing the nic. But if we want
each virtio-net nic or guest to get exactly what it asked for
macvlan/macvtap needs to maintain a copy of each guest filter and do a
lookup and send only the requested traffic to the guest. Here is the
performance hit that I was seeing. Please see my next comment for further
details. 


>> I chose to support PASSTHRU Mode only at first because its simpler and all
>> code additions are in control path only.
> 
> I agree. It would be a bit silly to have a dedicated interface
> for passthough and a completely separate one for
> non passthrough.
>
Agree. The reason I did not focus on non-passthru case in the initial
version was because I was thinking things to do in the non-passthru case
will be just add-ons to the passthru case. But true Better to flush out the
non-pasthru case details.

After dwelling on this a bit more how about the below:

Phase 1: Goal: Enable hardware filtering for all macvlan modes
    - In macvlan passthru mode the single guest virtio-nic connected will
receive traffic that he requested for
    - In macvlan non-passthru mode all guest virtio-nics sharing the
      physical nic will see all other guest traffic
      but the filtering at guest virtio-nic will make sure each guest
      eventually sees traffic he asked for. This is still better than
putting the physical nic in promiscuous mode.

(This is mainly what my patch does...but will need to remove the passthru
check and see if there are any thing else needed for non-passthru case)


Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
receives only what he requested for.
    - In this case, in addition to pushing the filters down to the physical
nic we will have to maintain the same filter in macvlan and do a filter
lookup before forwarding the traffic to a virtio-nic.

But I am thinking phase 2 might be redundant given virtio-nic already does
filtering for the guest. In which case we might not need phase 2 at all. I
might have been over complicating things.

Please comment. And please correct if I missed something.
 
 
>>> 
>>> A way to limit what the guest can do would also be useful.
>>> How can this be done? selinux?
>> 
>> I vaguely remember a thread on the same context.. had a suggestion to
>> maintain pre-approved address lists and allow guest filter registration of
>> only those addresses for security. This seemed reasonable. Plus the ability
>> to support additional address registration from guest could be made
>> configurable (One of your ideas again from prior work).
>> 
>> I am not an selinux expert, but I am thinking we can use it to only allow or
>> disallow access or operations to the macvtap device. (?). I will check more
>> on this.
> 
> We'd have to have a way to revoke that as well.
> 
Yes true.


>>> 
>>> Any thoughts on spoofing filtering?
>> 
>> I can only think of checking addresses against an allowed address list.
>> Don't know of any other ways. Any hints ?
> 
> Hardware (esp SRIOV) often has ways to do this check, too.
> 
Yes correct. Hw sriov and even switch in 802.1Qbh has anti-spoofing feature.
In which case I am thinking having It at the macvtap layer is not an
absolute must (?).


>> 
>> In any case I am assuming all the protection/security measures should be
>> taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
>> use case its libvirt or qemu-kvm. No ?
> 
> Ideally we'd have a way to separate these capabilities, so that libvirt
> can override qemu.
> 
>>> 
>>> Would it be possible to make the filtering programmable
>>> using netlink, e.g. ethtool, ip, or some such?
>> 
>> Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
>> thinking of macvlan having a netlink interface to set filter and not ioctl
>> ?. Sure.
> 
> Yes.
> 
>> But I was thinking the point of implementing TUNSETTXFILTER was to
>> maintain compatibility with the generic tap interface that does the same
>> thing. 
> 
> Yes. OTOH I don't think anyone uses that ATM so it might not
> be important if it's not a good fit.
> E.g. we could notify libvirt and have it use netlink for us
> if we like that better.
> 
Ok thanks for clarifying that. One more reason to use TUNSETTXFILTER
interface was for qemu-kvm who uses the same tap interface for macvtap and
regular tap. So if we use netlink we have to do different things for macvtap
and tap filters in qemu. And qemu-kvm does not distinguish between macvtap
and tap as far as I know. No ?


Thanks you for your review and comments.


>> And having both the netlink op and ioctl interface might not be clean ?.
> 
> No idea.
> 
>> Sorry if I misunderstood your question.
>> 
>>> That would make this useful for bridged setups besides
>>> macvtap/virtualization.
>>> 
>> 
>> Thanks for the comments. 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER
  2011-09-06 22:35 ` [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER Roopa Prabhu
@ 2011-09-08 16:25   ` Arnd Bergmann
  2011-09-08 19:06     ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2011-09-08 16:25 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, dragos.tatulea, mst, dwang2, benve, kaber, sri

On Wednesday 07 September 2011, Roopa Prabhu wrote:
> From: Roopa Prabhu <roprabhu@cisco.com>
> 
> This patch adds support for TUNSETTXFILTER. Calls macvlan set filter function
> with address list and flags received via TUNSETTXFILTER.
> 
> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> Signed-off-by: Christian Benvenuti <benve@cisco.com>
> Signed-off-by: David Wang <dwang2@cisco.com>

Looks ok to me in principle, but

> +               /* XXX: If broadcast address present, set IFF_BROADCAST */
> +               /* XXX: If multicast address present, set IFF_MULTICAST */
> +               flags |= (tf.flags & TUN_FLT_ALLMULTI ? IFF_ALLMULTI : 0) |
> +                        (!tf.count ? IFF_PROMISC : 0);
> +               ret = 0;
> +               if (tf.count > 0) {
> +                       alen = ETH_ALEN * tf.count;
> +                       addrs = kmalloc(alen, GFP_KERNEL);
> +                       if (!addrs) {
> +                               dev_put(vlan->dev);
> +                               return -ENOMEM;
> +                       }

I think you need to check tf.count for a maximum value. In theory, a user
could pass a rather large number (65536) which is not good.

Also the TUNSETTXFILTER code looks sufficiently large that it would be
better to put it into a separate function. Use "goto" statements in
order to do the error handling in there, instead of repeating
lots of kfree and dev_put calls in each error case.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 16:19       ` Roopa Prabhu
@ 2011-09-08 17:42         ` Sridhar Samudrala
  2011-09-08 19:23           ` Roopa Prabhu
  2011-09-08 19:11         ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Sridhar Samudrala @ 2011-09-08 17:42 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On Thu, 2011-09-08 at 09:19 -0700, Roopa Prabhu wrote:
> 
> 
> On 9/8/11 4:08 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
> >> On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> 
> >>> On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
> >>>> This patch is an attempt at providing address filtering support for macvtap
> >>>> devices in PASSTHRU mode. Its still a work in progress.
> >>>> Briefly tested for basic functionality. Wanted to get some feedback on the
> >>>> direction before proceeding.
> >>>> 
> >>> 
> >>> Good work, thanks.
> >>> 
> >> 
> >> Thanks.
> >> 
> >>>> I have hopefully CC'ed all concerned people.
> >>> 
> >>> kvm crowd might also be interested.
> >>> Try using ./scripts/get_maintainer.pl as well.
> >>> 
> >> Thanks for the tip. Expanded CC list a bit more.
> >> 
> >>>> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU mode
> >>>> there is a 1-1 mapping between macvtap device and physical nic or VF. And
> >>>> all
> >>>> filtering is done in lowerdev hw. The lowerdev does not need to be in
> >>>> promiscous mode as long as the guest filters are passed down to the
> >>>> lowerdev.
> >>>> This patch tries to remove the need for putting the lowerdev in promiscous
> >>>> mode. 
> >>>> I have also referred to the thread below where TUNSETTXFILTER was mentioned
> >>>> in 
> >>>> this context: 
> >>>>  http://patchwork.ozlabs.org/patch/69297/
> >>>> 
> >>>> This patch basically passes the addresses got by TUNSETTXFILTER to macvlan
> >>>> lowerdev.
> >>>> 
> >>>> I have looked at previous work and discussions on this for qemu-kvm
> >>>> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
> >>>> http://patchwork.ozlabs.org/patch/78595/
> >>>> http://patchwork.ozlabs.org/patch/47160/
> >>>> https://patchwork.kernel.org/patch/474481/
> >>>> 
> >>>> Redhat bugzilla by Michael Tsirkin:
> >>>> https://bugzilla.redhat.com/show_bug.cgi?id=655013
> >>>> 
> >>>> I used Michael's qemu-kvm patch for testing the changes with KVM
> >>>> 
> >>>> I would like to cover both MAC and vlan filtering in this work.
> >>>> 
> >>>> Open Questions/Issues:
> >>>> - There is a need for vlan filtering to complete the patch. It will require
> >>>>   a new tap ioctl cmd for vlans.
> >>>>   Some ideas on this are:
> >>>> 
> >>>>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap
> >>>> filter
> >>>> (similar to tun_filter for addresses). Passing the vlan id's to lower
> >>>> device will mean going thru the whole list of vlans every time.
> >>>> 
> >>>>   OR
> >>>> 
> >>>>   b) TUNSETVLAN with vlan id and flag to set/unset
> >>>> 
> >>>>   Does option 'b' sound ok ?
> >>>> 
> >>>> - In this implementation we make the macvlan address list same as the
> >>>> address
> >>>>   list that came in the filter with TUNSETTXFILTER. This will not cover
> >>>> cases
> >>>>   where the macvlan device needs to have other addresses that are not
> >>>>   necessarily in the filter. Is this a problem ?
> >>> 
> >>> What cases do you have in mind?
> >>> 
> >> This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
> >> see a problem with uc/mc address list being the same in all the stacked
> >> netdevs in the path. I called that out above to make sure I was not missing
> >> any case in PASSTHRU mode where this might be invalid. Otherwise I don't see
> >> a problem in the simple PASSTHRU use case this patch supports.
> >> 
> >>>> - The patch currently only supports passing of IFF_PROMISC and
> >>>> IFF_MULTICAST
> >>>> filter flags to lowerdev
> >>>> 
> >>>> This patch series implements the following
> >>>> 01/3 - macvlan: Add support for unicast filtering in macvlan
> >>>> 02/3 - macvlan: Add function to set addr filter on lower device in passthru
> >>>> mode
> >>>> 03/3 - macvtap: Add support for TUNSETTXFILTER
> >>>> 
> >>>> Please comment. Thanks.
> >>>> 
> >>>> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> >>>> Signed-off-by: Christian Benvenuti <benve@cisco.com>
> >>>> Signed-off-by: David Wang <dwang2@cisco.com>
> >>> 
> >>> The security isn't lower than with promisc, so I don't see
> >>> a problem with this as such.
> >>> 
> >>> There are more features we'll want down the road though,
> >>> so let's see whether the interface will be able to
> >>> satisfy them in a backwards compatible way before we
> >>> set it in stone. Here's what I came up with:
> >>> 
> >>> How will the filtering table be partitioned within guests?
> >> 
> >> Since this patch supports macvlan PASSTHRU mode only, in which the lower
> >> device has 1-1 mapping to the guest nic, it does not require any
> >> partitioning of filtering table within guests. Unless I missed understanding
> >> something. 
> >> If the lower device were being shared by multiple guest network interfaces
> >> (non PASSTHRU mode), only then we will need to maintain separate filter
> >> tables for each guest network interface in macvlan and forward the pkt to
> >> respective guest interface after a filter lookup. This could affect
> >> performance too I think.
> > 
> > Not with hardware filtering support. Which is where we'd need to
> > partition the host nic mac table between guests.
> > 
> I need to understand this more. In non passthru case when a VF or physical
> nic is shared between guests, the nic does not really know about the guests,
> so I was thinking we do the same thing as we do for the passthru case (ie
> send all the address filters from macvlan to the physical nic). So at the
> hardware, filtering is done for all guests sharing the nic. But if we want
> each virtio-net nic or guest to get exactly what it asked for
> macvlan/macvtap needs to maintain a copy of each guest filter and do a
> lookup and send only the requested traffic to the guest. Here is the
> performance hit that I was seeing. Please see my next comment for further
> details. 
> 
> 
> >> I chose to support PASSTHRU Mode only at first because its simpler and all
> >> code additions are in control path only.
> > 
> > I agree. It would be a bit silly to have a dedicated interface
> > for passthough and a completely separate one for
> > non passthrough.
> >
> Agree. The reason I did not focus on non-passthru case in the initial
> version was because I was thinking things to do in the non-passthru case
> will be just add-ons to the passthru case. But true Better to flush out the
> non-pasthru case details.
> 
> After dwelling on this a bit more how about the below:
> 
> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>     - In macvlan passthru mode the single guest virtio-nic connected will
> receive traffic that he requested for
Currently the guest receives all the packets seen on the interface as it
is put in promiscuous mode. With your patch it only sees the packets
that he requested for. Have you tried creating a macvlan interface on
top of the guest virtio-net interface? Is the new mac address propagated
all the way to the host nic?

I think the main usecase for passthru mode is to assign a SR-IOV VF to
a single guest.

>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>       physical nic will see all other guest traffic
>       but the filtering at guest virtio-nic will make sure each guest
>       eventually sees traffic he asked for. This is still better than
> putting the physical nic in promiscuous mode.

With the default macvlan mode (vepa), i think each guest will only see
its own traffic. But currently adding a secondary mac address on a guest
will not work as it is not propagated all the way down to the host.


> (This is mainly what my patch does...but will need to remove the passthru
> check and see if there are any thing else needed for non-passthru case)
> 
> 
> Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
> receives only what he requested for.
>     - In this case, in addition to pushing the filters down to the physical
> nic we will have to maintain the same filter in macvlan and do a filter
> lookup before forwarding the traffic to a virtio-nic.
> 
> But I am thinking phase 2 might be redundant given virtio-nic already does
> filtering for the guest. In which case we might not need phase 2 at all. I
> might have been over complicating things.

I think filtering at macvlan will be more efficient than replicating all
the packets to all the guest virtio-nics.

Thanks
Sridhar


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER
  2011-09-08 16:25   ` Arnd Bergmann
@ 2011-09-08 19:06     ` Roopa Prabhu
  0 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-08 19:06 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: netdev, dragos.tatulea, mst, dwang2, benve, kaber, sri

On 9/8/11 9:25 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Wednesday 07 September 2011, Roopa Prabhu wrote:
>> From: Roopa Prabhu <roprabhu@cisco.com>
>> 
>> This patch adds support for TUNSETTXFILTER. Calls macvlan set filter function
>> with address list and flags received via TUNSETTXFILTER.
>> 
>> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
>> Signed-off-by: Christian Benvenuti <benve@cisco.com>
>> Signed-off-by: David Wang <dwang2@cisco.com>
> 
> Looks ok to me in principle, but
> 
>> +               /* XXX: If broadcast address present, set IFF_BROADCAST */
>> +               /* XXX: If multicast address present, set IFF_MULTICAST */
>> +               flags |= (tf.flags & TUN_FLT_ALLMULTI ? IFF_ALLMULTI : 0) |
>> +                        (!tf.count ? IFF_PROMISC : 0);
>> +               ret = 0;
>> +               if (tf.count > 0) {
>> +                       alen = ETH_ALEN * tf.count;
>> +                       addrs = kmalloc(alen, GFP_KERNEL);
>> +                       if (!addrs) {
>> +                               dev_put(vlan->dev);
>> +                               return -ENOMEM;
>> +                       }
> 
> I think you need to check tf.count for a maximum value. In theory, a user
> could pass a rather large number (65536) which is not good.

Good point. Will fix it.

> 
> Also the TUNSETTXFILTER code looks sufficiently large that it would be
> better to put it into a separate function. Use "goto" statements in
> order to do the error handling in there, instead of repeating
> lots of kfree and dev_put calls in each error case.

Ok sounds good. Will fix this when I respin the patches.

Thanks for the comments.
Roopa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 16:19       ` Roopa Prabhu
  2011-09-08 17:42         ` Sridhar Samudrala
@ 2011-09-08 19:11         ` Michael S. Tsirkin
  2011-09-09  2:53           ` Roopa Prabhu
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-08 19:11 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On Thu, Sep 08, 2011 at 09:19:32AM -0700, Roopa Prabhu wrote:
> >>> There are more features we'll want down the road though,
> >>> so let's see whether the interface will be able to
> >>> satisfy them in a backwards compatible way before we
> >>> set it in stone. Here's what I came up with:
> >>> 
> >>> How will the filtering table be partitioned within guests?
> >> 
> >> Since this patch supports macvlan PASSTHRU mode only, in which the lower
> >> device has 1-1 mapping to the guest nic, it does not require any
> >> partitioning of filtering table within guests. Unless I missed understanding
> >> something. 
> >> If the lower device were being shared by multiple guest network interfaces
> >> (non PASSTHRU mode), only then we will need to maintain separate filter
> >> tables for each guest network interface in macvlan and forward the pkt to
> >> respective guest interface after a filter lookup. This could affect
> >> performance too I think.
> > 
> > Not with hardware filtering support. Which is where we'd need to
> > partition the host nic mac table between guests.
> > 
> I need to understand this more. In non passthru case when a VF or physical
> nic is shared between guests,

For example, consider a VF given to each guest. Hardware supports a fixed
total number of filters, which can be partitioned between VFs.

> the nic does not really know about the guests,
> so I was thinking we do the same thing as we do for the passthru case (ie
> send all the address filters from macvlan to the physical nic). So at the
> hardware, filtering is done for all guests sharing the nic. But if we want
> each virtio-net nic or guest to get exactly what it asked for
> macvlan/macvtap needs to maintain a copy of each guest filter and do a
> lookup and send only the requested traffic to the guest. Here is the
> performance hit that I was seeing. Please see my next comment for further
> details. 

It won't be any slower than attaching a non-passthrough macvlan
to a device, will it?

> 
> >> I chose to support PASSTHRU Mode only at first because its simpler and all
> >> code additions are in control path only.
> > 
> > I agree. It would be a bit silly to have a dedicated interface
> > for passthough and a completely separate one for
> > non passthrough.
> >
> Agree. The reason I did not focus on non-passthru case in the initial
> version was because I was thinking things to do in the non-passthru case
> will be just add-ons to the passthru case. But true Better to flush out the
> non-pasthru case details.
> 
> After dwelling on this a bit more how about the below:
> 
> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>     - In macvlan passthru mode the single guest virtio-nic connected will
>       receive traffic that he requested for
>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>       physical nic will see all other guest traffic
>       but the filtering at guest virtio-nic

I don't think guests currently filter anything.

>       will make sure each guest
>       eventually sees traffic he asked for. This is still better than
>       putting the physical nic in promiscuous mode.
> 
> (This is mainly what my patch does...but will need to remove the passthru
> check and see if there are any thing else needed for non-passthru case)

I'm fine with sticking with passthrough, make non passthrough
a separate phase.

> 
> Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
> receives only what he requested for.
>     - In this case, in addition to pushing the filters down to the physical
>       nic we will have to maintain the same filter in macvlan and do a filter
>       lookup before forwarding the traffic to a virtio-nic.
> 
> But I am thinking phase 2 might be redundant given virtio-nic already does
> filtering for the guest.

It does? Do you mean the filter that qemu does in userspace?

> In which case we might not need phase 2 at all. I
> might have been over complicating things.
> 
> Please comment. And please correct if I missed something.
>  
>  
> >>> 
> >>> A way to limit what the guest can do would also be useful.
> >>> How can this be done? selinux?
> >> 
> >> I vaguely remember a thread on the same context.. had a suggestion to
> >> maintain pre-approved address lists and allow guest filter registration of
> >> only those addresses for security. This seemed reasonable. Plus the ability
> >> to support additional address registration from guest could be made
> >> configurable (One of your ideas again from prior work).
> >> 
> >> I am not an selinux expert, but I am thinking we can use it to only allow or
> >> disallow access or operations to the macvtap device. (?). I will check more
> >> on this.
> > 
> > We'd have to have a way to revoke that as well.
> > 
> Yes true.
> 
> 
> >>> 
> >>> Any thoughts on spoofing filtering?
> >> 
> >> I can only think of checking addresses against an allowed address list.
> >> Don't know of any other ways. Any hints ?
> > 
> > Hardware (esp SRIOV) often has ways to do this check, too.
> > 
> Yes correct. Hw sriov and even switch in 802.1Qbh has anti-spoofing feature.
> In which case I am thinking having It at the macvtap layer is not an
> absolute must (?).

Exactly. But let's figure out *how* it will be programmed.
If anti-spoofing is programmed with netlink, maybe that's
a better interface for rx filter too, for consistency.

> >> 
> >> In any case I am assuming all the protection/security measures should be
> >> taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
> >> use case its libvirt or qemu-kvm. No ?
> > 
> > Ideally we'd have a way to separate these capabilities, so that libvirt
> > can override qemu.
> > 
> >>> 
> >>> Would it be possible to make the filtering programmable
> >>> using netlink, e.g. ethtool, ip, or some such?
> >> 
> >> Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
> >> thinking of macvlan having a netlink interface to set filter and not ioctl
> >> ?. Sure.
> > 
> > Yes.
> > 
> >> But I was thinking the point of implementing TUNSETTXFILTER was to
> >> maintain compatibility with the generic tap interface that does the same
> >> thing. 
> > 
> > Yes. OTOH I don't think anyone uses that ATM so it might not
> > be important if it's not a good fit.
> > E.g. we could notify libvirt and have it use netlink for us
> > if we like that better.
> > 
> Ok thanks for clarifying that. One more reason to use TUNSETTXFILTER
> interface was for qemu-kvm who uses the same tap interface for macvtap and
> regular tap. So if we use netlink we have to do different things for macvtap
> and tap filters in qemu. And qemu-kvm does not distinguish between macvtap
> and tap as far as I know. No ?

It's not a question of simplifying qemu as much as trying to
make the kernel interface abstract device differences
away from users. Using same interface for tun and macvtap
gave us some confidence that the interface is a good one.

But this does not seem to have worked with TUNSETTXFILTER -
at least qemu doesn't use it yet, and it's been upstream
a while. So there's no proof it's a good interface.

So if we decide netlink is a better interface we can add it for tun too.
We need to be backwards compatible and figure out what happens if someone
tries to use both methods: probably apply both or ignore TUNSETTXFILTER ...

> 
> Thanks you for your review and comments.
> 
> 
> >> And having both the netlink op and ioctl interface might not be clean ?.
> > 
> > No idea.
> > 
> >> Sorry if I misunderstood your question.
> >> 
> >>> That would make this useful for bridged setups besides
> >>> macvtap/virtualization.
> >>> 
> >> 
> >> Thanks for the comments. 

Overall good progress, and don't let the interface discussions
block you. You want to push in two directions - stabilize code in one
branch, and play with interfaces in another one. By the time there's a
concensus on the interfaces you have the main logic all ready,
then you merge.

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 17:42         ` Sridhar Samudrala
@ 2011-09-08 19:23           ` Roopa Prabhu
  2011-09-08 19:33             ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-08 19:23 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/8/11 10:42 AM, "Sridhar Samudrala" <sri@us.ibm.com> wrote:

> On Thu, 2011-09-08 at 09:19 -0700, Roopa Prabhu wrote:
>> 
>> 
>> On 9/8/11 4:08 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> On Wed, Sep 07, 2011 at 10:20:28PM -0700, Roopa Prabhu wrote:
>>>> On 9/7/11 5:34 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>> 
>>>>> On Tue, Sep 06, 2011 at 03:35:40PM -0700, Roopa Prabhu wrote:
>>>>>> This patch is an attempt at providing address filtering support for
>>>>>> macvtap
>>>>>> devices in PASSTHRU mode. Its still a work in progress.
>>>>>> Briefly tested for basic functionality. Wanted to get some feedback on
>>>>>> the
>>>>>> direction before proceeding.
>>>>>> 
>>>>> 
>>>>> Good work, thanks.
>>>>> 
>>>> 
>>>> Thanks.
>>>> 
>>>>>> I have hopefully CC'ed all concerned people.
>>>>> 
>>>>> kvm crowd might also be interested.
>>>>> Try using ./scripts/get_maintainer.pl as well.
>>>>> 
>>>> Thanks for the tip. Expanded CC list a bit more.
>>>> 
>>>>>> PASSTHRU mode today sets the lowerdev in promiscous mode. In PASSTHRU
>>>>>> mode
>>>>>> there is a 1-1 mapping between macvtap device and physical nic or VF. And
>>>>>> all
>>>>>> filtering is done in lowerdev hw. The lowerdev does not need to be in
>>>>>> promiscous mode as long as the guest filters are passed down to the
>>>>>> lowerdev.
>>>>>> This patch tries to remove the need for putting the lowerdev in
>>>>>> promiscous
>>>>>> mode. 
>>>>>> I have also referred to the thread below where TUNSETTXFILTER was
>>>>>> mentioned
>>>>>> in 
>>>>>> this context:
>>>>>>  http://patchwork.ozlabs.org/patch/69297/
>>>>>> 
>>>>>> This patch basically passes the addresses got by TUNSETTXFILTER to
>>>>>> macvlan
>>>>>> lowerdev.
>>>>>> 
>>>>>> I have looked at previous work and discussions on this for qemu-kvm
>>>>>> by Michael Tsirkin, Alex Williamson and Dragos Tatulea
>>>>>> http://patchwork.ozlabs.org/patch/78595/
>>>>>> http://patchwork.ozlabs.org/patch/47160/
>>>>>> https://patchwork.kernel.org/patch/474481/
>>>>>> 
>>>>>> Redhat bugzilla by Michael Tsirkin:
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=655013
>>>>>> 
>>>>>> I used Michael's qemu-kvm patch for testing the changes with KVM
>>>>>> 
>>>>>> I would like to cover both MAC and vlan filtering in this work.
>>>>>> 
>>>>>> Open Questions/Issues:
>>>>>> - There is a need for vlan filtering to complete the patch. It will
>>>>>> require
>>>>>>   a new tap ioctl cmd for vlans.
>>>>>>   Some ideas on this are:
>>>>>> 
>>>>>>   a) TUNSETVLANFILTER: This will entail we send the whole vlan bitmap
>>>>>> filter
>>>>>> (similar to tun_filter for addresses). Passing the vlan id's to lower
>>>>>> device will mean going thru the whole list of vlans every time.
>>>>>> 
>>>>>>   OR
>>>>>> 
>>>>>>   b) TUNSETVLAN with vlan id and flag to set/unset
>>>>>> 
>>>>>>   Does option 'b' sound ok ?
>>>>>> 
>>>>>> - In this implementation we make the macvlan address list same as the
>>>>>> address
>>>>>>   list that came in the filter with TUNSETTXFILTER. This will not cover
>>>>>> cases
>>>>>>   where the macvlan device needs to have other addresses that are not
>>>>>>   necessarily in the filter. Is this a problem ?
>>>>> 
>>>>> What cases do you have in mind?
>>>>> 
>>>> This patch targets only macvlan PASSTHRU mode and for PASSTHRU mode I don't
>>>> see a problem with uc/mc address list being the same in all the stacked
>>>> netdevs in the path. I called that out above to make sure I was not missing
>>>> any case in PASSTHRU mode where this might be invalid. Otherwise I don't
>>>> see
>>>> a problem in the simple PASSTHRU use case this patch supports.
>>>> 
>>>>>> - The patch currently only supports passing of IFF_PROMISC and
>>>>>> IFF_MULTICAST
>>>>>> filter flags to lowerdev
>>>>>> 
>>>>>> This patch series implements the following
>>>>>> 01/3 - macvlan: Add support for unicast filtering in macvlan
>>>>>> 02/3 - macvlan: Add function to set addr filter on lower device in
>>>>>> passthru
>>>>>> mode
>>>>>> 03/3 - macvtap: Add support for TUNSETTXFILTER
>>>>>> 
>>>>>> Please comment. Thanks.
>>>>>> 
>>>>>> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
>>>>>> Signed-off-by: Christian Benvenuti <benve@cisco.com>
>>>>>> Signed-off-by: David Wang <dwang2@cisco.com>
>>>>> 
>>>>> The security isn't lower than with promisc, so I don't see
>>>>> a problem with this as such.
>>>>> 
>>>>> There are more features we'll want down the road though,
>>>>> so let's see whether the interface will be able to
>>>>> satisfy them in a backwards compatible way before we
>>>>> set it in stone. Here's what I came up with:
>>>>> 
>>>>> How will the filtering table be partitioned within guests?
>>>> 
>>>> Since this patch supports macvlan PASSTHRU mode only, in which the lower
>>>> device has 1-1 mapping to the guest nic, it does not require any
>>>> partitioning of filtering table within guests. Unless I missed
>>>> understanding
>>>> something. 
>>>> If the lower device were being shared by multiple guest network interfaces
>>>> (non PASSTHRU mode), only then we will need to maintain separate filter
>>>> tables for each guest network interface in macvlan and forward the pkt to
>>>> respective guest interface after a filter lookup. This could affect
>>>> performance too I think.
>>> 
>>> Not with hardware filtering support. Which is where we'd need to
>>> partition the host nic mac table between guests.
>>> 
>> I need to understand this more. In non passthru case when a VF or physical
>> nic is shared between guests, the nic does not really know about the guests,
>> so I was thinking we do the same thing as we do for the passthru case (ie
>> send all the address filters from macvlan to the physical nic). So at the
>> hardware, filtering is done for all guests sharing the nic. But if we want
>> each virtio-net nic or guest to get exactly what it asked for
>> macvlan/macvtap needs to maintain a copy of each guest filter and do a
>> lookup and send only the requested traffic to the guest. Here is the
>> performance hit that I was seeing. Please see my next comment for further
>> details. 
>> 
>> 
>>>> I chose to support PASSTHRU Mode only at first because its simpler and all
>>>> code additions are in control path only.
>>> 
>>> I agree. It would be a bit silly to have a dedicated interface
>>> for passthough and a completely separate one for
>>> non passthrough.
>>> 
>> Agree. The reason I did not focus on non-passthru case in the initial
>> version was because I was thinking things to do in the non-passthru case
>> will be just add-ons to the passthru case. But true Better to flush out the
>> non-pasthru case details.
>> 
>> After dwelling on this a bit more how about the below:
>> 
>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>>     - In macvlan passthru mode the single guest virtio-nic connected will
>> receive traffic that he requested for
> Currently the guest receives all the packets seen on the interface as it
> is put in promiscuous mode. With your patch it only sees the packets
> that he requested for. Have you tried creating a macvlan interface on
> top of the guest virtio-net interface? Is the new mac address propagated
> all the way to the host nic?

Yes I have tried this and it works. The mac address gets propagated to the
physical nic.

> 
> I think the main usecase for passthru mode is to assign a SR-IOV VF to
> a single guest.
> 
Yes and for the passthru usecase this patch should be enough to enable
filtering in hw (eventually like I indicated before I need to fix vlan
filtering too).

>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>>       physical nic will see all other guest traffic
>>       but the filtering at guest virtio-nic will make sure each guest
>>       eventually sees traffic he asked for. This is still better than
>> putting the physical nic in promiscuous mode.
> 
> With the default macvlan mode (vepa), i think each guest will only see
> its own traffic. But currently adding a secondary mac address on a guest
> will not work as it is not propagated all the way down to the host.
> 
> 
>> (This is mainly what my patch does...but will need to remove the passthru
>> check and see if there are any thing else needed for non-passthru case)
>> 
>> 
>> Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
>> receives only what he requested for.
>>     - In this case, in addition to pushing the filters down to the physical
>> nic we will have to maintain the same filter in macvlan and do a filter
>> lookup before forwarding the traffic to a virtio-nic.
>> 
>> But I am thinking phase 2 might be redundant given virtio-nic already does
>> filtering for the guest. In which case we might not need phase 2 at all. I
>> might have been over complicating things.
> 
> I think filtering at macvlan will be more efficient than replicating all
> the packets to all the guest virtio-nics.

True. This usecase is for non-passthru and I think it will require some
performance testing for all modes too. Could be a phase II patch with the
current patch only enabling filtering in hw.

Thanks for the comments,
Roopa



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 19:23           ` Roopa Prabhu
@ 2011-09-08 19:33             ` Michael S. Tsirkin
  2011-09-09  3:00               ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-08 19:33 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
> > 
> > I think the main usecase for passthru mode is to assign a SR-IOV VF to
> > a single guest.
> > 
> Yes and for the passthru usecase this patch should be enough to enable
> filtering in hw (eventually like I indicated before I need to fix vlan
> filtering too).

So with filtering in hw, and in sriov VF case, VFs
actually share a filtering table. How will that
be partitioned?

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 19:11         ` Michael S. Tsirkin
@ 2011-09-09  2:53           ` Roopa Prabhu
  2011-09-09  5:55             ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-09  2:53 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm




On 9/8/11 12:11 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Sep 08, 2011 at 09:19:32AM -0700, Roopa Prabhu wrote:
>>>>> There are more features we'll want down the road though,
>>>>> so let's see whether the interface will be able to
>>>>> satisfy them in a backwards compatible way before we
>>>>> set it in stone. Here's what I came up with:
>>>>> 
>>>>> How will the filtering table be partitioned within guests?
>>>> 
>>>> Since this patch supports macvlan PASSTHRU mode only, in which the lower
>>>> device has 1-1 mapping to the guest nic, it does not require any
>>>> partitioning of filtering table within guests. Unless I missed
>>>> understanding
>>>> something. 
>>>> If the lower device were being shared by multiple guest network interfaces
>>>> (non PASSTHRU mode), only then we will need to maintain separate filter
>>>> tables for each guest network interface in macvlan and forward the pkt to
>>>> respective guest interface after a filter lookup. This could affect
>>>> performance too I think.
>>> 
>>> Not with hardware filtering support. Which is where we'd need to
>>> partition the host nic mac table between guests.
>>> 
>> I need to understand this more. In non passthru case when a VF or physical
>> nic is shared between guests,
> 
> For example, consider a VF given to each guest. Hardware supports a fixed
> total number of filters, which can be partitioned between VFs.
> 
O ok. But hw maintains VF filters separately for every VF as far as I know.
Filters received on a VF are programmed for that VF only. Am assuming all
hardware do this. Atleast our hardware does this.
What I was referring to was a single VF shared between guests using macvtap
(could be bridge mode for example). All guests sharing the VF will register
 filters with the VF via macvlan. Hw makes sure what ever the VF asked for
is received at the VF. VF in hw does not know that it is shared by guests.
Only at macvlan we might need to re-filter the pkts received on the VF and
steer pkts to the individual guests based on what they asked for.


>> the nic does not really know about the guests,
>> so I was thinking we do the same thing as we do for the passthru case (ie
>> send all the address filters from macvlan to the physical nic). So at the
>> hardware, filtering is done for all guests sharing the nic. But if we want
>> each virtio-net nic or guest to get exactly what it asked for
>> macvlan/macvtap needs to maintain a copy of each guest filter and do a
>> lookup and send only the requested traffic to the guest. Here is the
>> performance hit that I was seeing. Please see my next comment for further
>> details. 
> 
> It won't be any slower than attaching a non-passthrough macvlan
> to a device, will it?
> 
Am not sure. The filter lookup in macvlan is the one I am concerned about.
Will need to try it out.

>> 
>>>> I chose to support PASSTHRU Mode only at first because its simpler and all
>>>> code additions are in control path only.
>>> 
>>> I agree. It would be a bit silly to have a dedicated interface
>>> for passthough and a completely separate one for
>>> non passthrough.
>>> 
>> Agree. The reason I did not focus on non-passthru case in the initial
>> version was because I was thinking things to do in the non-passthru case
>> will be just add-ons to the passthru case. But true Better to flush out the
>> non-pasthru case details.
>> 
>> After dwelling on this a bit more how about the below:
>> 
>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>>     - In macvlan passthru mode the single guest virtio-nic connected will
>>       receive traffic that he requested for
>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>>       physical nic will see all other guest traffic
>>       but the filtering at guest virtio-nic
> 
> I don't think guests currently filter anything.
> 
I was referring to Qemu-kvm virtio-net in
virtion_net_receive->receive_filter. I think It only passes pkts that the
guest OS is interested. It uses the filter table that I am passing to
macvtap in this patch.


>>       will make sure each guest
>>       eventually sees traffic he asked for. This is still better than
>>       putting the physical nic in promiscuous mode.
>> 
>> (This is mainly what my patch does...but will need to remove the passthru
>> check and see if there are any thing else needed for non-passthru case)
> 
> I'm fine with sticking with passthrough, make non passthrough
> a separate phase.
> 
Ok.

>> 
>> Phase 2: Goal: Enable filtering at macvlan so that each guest virtio-nic
>> receives only what he requested for.
>>     - In this case, in addition to pushing the filters down to the physical
>>       nic we will have to maintain the same filter in macvlan and do a filter
>>       lookup before forwarding the traffic to a virtio-nic.
>> 
>> But I am thinking phase 2 might be redundant given virtio-nic already does
>> filtering for the guest.
> 
> It does? Do you mean the filter that qemu does in userspace?
> 
Yes I meant the filter that qemu does in userspace qemu-kvm/hw/virtio-net.c
receive_filter(). 

>> In which case we might not need phase 2 at all. I
>> might have been over complicating things.
>> 
>> Please comment. And please correct if I missed something.
>>  
>>  
>>>>> 
>>>>> A way to limit what the guest can do would also be useful.
>>>>> How can this be done? selinux?
>>>> 
>>>> I vaguely remember a thread on the same context.. had a suggestion to
>>>> maintain pre-approved address lists and allow guest filter registration of
>>>> only those addresses for security. This seemed reasonable. Plus the ability
>>>> to support additional address registration from guest could be made
>>>> configurable (One of your ideas again from prior work).
>>>> 
>>>> I am not an selinux expert, but I am thinking we can use it to only allow
>>>> or
>>>> disallow access or operations to the macvtap device. (?). I will check more
>>>> on this.
>>> 
>>> We'd have to have a way to revoke that as well.
>>> 
>> Yes true.
>> 
>> 
>>>>> 
>>>>> Any thoughts on spoofing filtering?
>>>> 
>>>> I can only think of checking addresses against an allowed address list.
>>>> Don't know of any other ways. Any hints ?
>>> 
>>> Hardware (esp SRIOV) often has ways to do this check, too.
>>> 
>> Yes correct. Hw sriov and even switch in 802.1Qbh has anti-spoofing feature.
>> In which case I am thinking having It at the macvtap layer is not an
>> absolute must (?).
> 
> Exactly. But let's figure out *how* it will be programmed.
> If anti-spoofing is programmed with netlink, maybe that's
> a better interface for rx filter too, for consistency.
> 
I will check sriov vfs. But I know our hw does not have an interface exposed
from the driver. We do it through the management s/w at the switch port
(this is 802.1Qbh). I will check other sriov nics. I don't think intel has a
netlink interface either. But I will double check.


>>>> 
>>>> In any case I am assuming all the protection/security measures should be
>>>> taken at the layer calling the TUNSETTXFILTER ie..In macvtap virtualization
>>>> use case its libvirt or qemu-kvm. No ?
>>> 
>>> Ideally we'd have a way to separate these capabilities, so that libvirt
>>> can override qemu.
>>> 
>>>>> 
>>>>> Would it be possible to make the filtering programmable
>>>>> using netlink, e.g. ethtool, ip, or some such?
>>>> 
>>>> Should be possible via ethtool or ip calling ioctl TUNSETTXFILTER. Are you
>>>> thinking of macvlan having a netlink interface to set filter and not ioctl
>>>> ?. Sure.
>>> 
>>> Yes.
>>> 
>>>> But I was thinking the point of implementing TUNSETTXFILTER was to
>>>> maintain compatibility with the generic tap interface that does the same
>>>> thing. 
>>> 
>>> Yes. OTOH I don't think anyone uses that ATM so it might not
>>> be important if it's not a good fit.
>>> E.g. we could notify libvirt and have it use netlink for us
>>> if we like that better.
>>> 
>> Ok thanks for clarifying that. One more reason to use TUNSETTXFILTER
>> interface was for qemu-kvm who uses the same tap interface for macvtap and
>> regular tap. So if we use netlink we have to do different things for macvtap
>> and tap filters in qemu. And qemu-kvm does not distinguish between macvtap
>> and tap as far as I know. No ?
> 
> It's not a question of simplifying qemu as much as trying to
> make the kernel interface abstract device differences
> away from users. Using same interface for tun and macvtap
> gave us some confidence that the interface is a good one.
> 
> But this does not seem to have worked with TUNSETTXFILTER -
> at least qemu doesn't use it yet, and it's been upstream
> a while. So there's no proof it's a good interface.

All qemu-kvm patches (not upstream yet) that have been floating around and
which I have been testing with use TUNSETTXFILTER. And tun kernel driver
does support filtering using TUNSETTXFILTER. It has code that does filtering
based on filter received using TUNSETTXFILTER.


But true we don't have to stick with TUNSETTXFILTER. I will explore the
netlink interface and research pros and cons and sketch the interface and
get back. 

> 
> So if we decide netlink is a better interface we can add it for tun too.
> We need to be backwards compatible and figure out what happens if someone
> tries to use both methods: probably apply both or ignore TUNSETTXFILTER ...
> 
>> 
>> Thanks you for your review and comments.
>> 
>> 
>>>> And having both the netlink op and ioctl interface might not be clean ?.
>>> 
>>> No idea.
>>> 
>>>> Sorry if I misunderstood your question.
>>>> 
>>>>> That would make this useful for bridged setups besides
>>>>> macvtap/virtualization.
>>>>> 
>>>> 
>>>> Thanks for the comments.
> 
> Overall good progress, and don't let the interface discussions
> block you. You want to push in two directions - stabilize code in one
> branch, and play with interfaces in another one. By the time there's a
> concensus on the interfaces you have the main logic all ready,
> then you merge. 

ok thanks michael! 

- Roopa



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-08 19:33             ` Michael S. Tsirkin
@ 2011-09-09  3:00               ` Roopa Prabhu
  2011-09-09  4:25                 ` Sridhar Samudrala
  2011-09-11  9:44                 ` Michael S. Tsirkin
  0 siblings, 2 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-09  3:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/8/11 12:33 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
>>> 
>>> I think the main usecase for passthru mode is to assign a SR-IOV VF to
>>> a single guest.
>>> 
>> Yes and for the passthru usecase this patch should be enough to enable
>> filtering in hw (eventually like I indicated before I need to fix vlan
>> filtering too).
> 
> So with filtering in hw, and in sriov VF case, VFs
> actually share a filtering table. How will that
> be partitioned?

AFAIK, though it might maintain a single filter table space in hw, hw does
know which filter belongs to which VF. And the OS driver does not need to do
anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
registered with a VF netdev are registered with the hw by the driver. And hw
will filter and send only pkts that the VF has expressed interest in.

No special filter partitioning in hw is required.

Thanks,
Roopa


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09  3:00               ` Roopa Prabhu
@ 2011-09-09  4:25                 ` Sridhar Samudrala
  2011-09-09 16:21                   ` Roopa Prabhu
  2011-09-11  9:44                 ` Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Sridhar Samudrala @ 2011-09-09  4:25 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On 9/8/2011 8:00 PM, Roopa Prabhu wrote:
>
>
> On 9/8/11 12:33 PM, "Michael S. Tsirkin"<mst@redhat.com>  wrote:
>
>> On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
>>>> I think the main usecase for passthru mode is to assign a SR-IOV VF to
>>>> a single guest.
>>>>
>>> Yes and for the passthru usecase this patch should be enough to enable
>>> filtering in hw (eventually like I indicated before I need to fix vlan
>>> filtering too).
>> So with filtering in hw, and in sriov VF case, VFs
>> actually share a filtering table. How will that
>> be partitioned?
> AFAIK, though it might maintain a single filter table space in hw, hw does
> know which filter belongs to which VF. And the OS driver does not need to do
> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
> registered with a VF netdev are registered with the hw by the driver. And hw
> will filter and send only pkts that the VF has expressed interest in.
Does your NIC & driver support adding multiple mac addresses to a VF?
I have tried a few other SR-IOV NICs sometime back and they didn't 
support this feature.

Currently, we don't have an interface to add multiple mac addresses to a 
netdev other than an
indirect way of creating a macvlan /if on top of it.

Thanks
Sridhar

>
> No special filter partitioning in hw is required.
>
> Thanks,
> Roopa
>



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09  2:53           ` Roopa Prabhu
@ 2011-09-09  5:55             ` Michael S. Tsirkin
  2011-09-09 16:33               ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-09  5:55 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
> >> Phase 1: Goal: Enable hardware filtering for all macvlan modes
> >>     - In macvlan passthru mode the single guest virtio-nic connected will
> >>       receive traffic that he requested for
> >>     - In macvlan non-passthru mode all guest virtio-nics sharing the
> >>       physical nic will see all other guest traffic
> >>       but the filtering at guest virtio-nic
> > 
> > I don't think guests currently filter anything.
> > 
> I was referring to Qemu-kvm virtio-net in
> virtion_net_receive->receive_filter. I think It only passes pkts that the
> guest OS is interested. It uses the filter table that I am passing to
> macvtap in this patch.

This happens after userspace thread gets woken up and data
is copied there. So relying on filtering at that level is
going to be very inefficient on a system with
multiple active guests. Further, and for that reason, vhost-net
doesn't do filtering at all, relying on the backends
to pass it correct packets.

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09  4:25                 ` Sridhar Samudrala
@ 2011-09-09 16:21                   ` Roopa Prabhu
  0 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-09 16:21 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/8/11 9:25 PM, "Sridhar Samudrala" <sri@us.ibm.com> wrote:

> On 9/8/2011 8:00 PM, Roopa Prabhu wrote:
>> 
>> 
>> On 9/8/11 12:33 PM, "Michael S. Tsirkin"<mst@redhat.com>  wrote:
>> 
>>> On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
>>>>> I think the main usecase for passthru mode is to assign a SR-IOV VF to
>>>>> a single guest.
>>>>> 
>>>> Yes and for the passthru usecase this patch should be enough to enable
>>>> filtering in hw (eventually like I indicated before I need to fix vlan
>>>> filtering too).
>>> So with filtering in hw, and in sriov VF case, VFs
>>> actually share a filtering table. How will that
>>> be partitioned?
>> AFAIK, though it might maintain a single filter table space in hw, hw does
>> know which filter belongs to which VF. And the OS driver does not need to do
>> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
>> registered with a VF netdev are registered with the hw by the driver. And hw
>> will filter and send only pkts that the VF has expressed interest in.
> Does your NIC & driver support adding multiple mac addresses to a VF?
> I have tried a few other SR-IOV NICs sometime back and they didn't
> support this feature.

Yes our nic does. I thought Intel's also does (see ixgbevf_set_rx_mode).
Though I have not really tried using it on an Intel card. I think most cards
should at the least support multicast filters.

If the lower dev does not support unicast filtering, dev_uc_add(lowerdev,..)
puts the lower dev in promiscous mode. Though..i think I can chcek this
before hand in macvlan_open and put the lowerdev in promiscuous mode if it
does not support filtering.

> 
> Currently, we don't have an interface to add multiple mac addresses to a
> netdev other than an
> indirect way of creating a macvlan /if on top of it.

Yes I think so. I have been using only macvlan to test.

Thanks,
Roopa


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09  5:55             ` Michael S. Tsirkin
@ 2011-09-09 16:33               ` Roopa Prabhu
  2011-09-11  9:38                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-09 16:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm




On 9/8/11 10:55 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
>>>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>>>>     - In macvlan passthru mode the single guest virtio-nic connected will
>>>>       receive traffic that he requested for
>>>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>>>>       physical nic will see all other guest traffic
>>>>       but the filtering at guest virtio-nic
>>> 
>>> I don't think guests currently filter anything.
>>> 
>> I was referring to Qemu-kvm virtio-net in
>> virtion_net_receive->receive_filter. I think It only passes pkts that the
>> guest OS is interested. It uses the filter table that I am passing to
>> macvtap in this patch.
> 
> This happens after userspace thread gets woken up and data
> is copied there. So relying on filtering at that level is
> going to be very inefficient on a system with
> multiple active guests. Further, and for that reason, vhost-net
> doesn't do filtering at all, relying on the backends
> to pass it correct packets.

Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode
and for non-PASSTHRU when there is a single guest connected to a VF.
For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely
better than putting the VF in promiscuous mode.
But to address the concern you mention above, in phase 2 when we have more
than one guest sharing the VF, we will have to add filter lookup in macvlan
to filter pkts for each guest. This will need some performance tests too.

Will start investigating the netlink interface comments for phase 1 first.

Thanks!
-Roopa


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09 16:33               ` Roopa Prabhu
@ 2011-09-11  9:38                 ` Michael S. Tsirkin
  2011-09-11 13:18                   ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-11  9:38 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:
> 
> 
> 
> On 9/8/11 10:55 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
> >>>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
> >>>>     - In macvlan passthru mode the single guest virtio-nic connected will
> >>>>       receive traffic that he requested for
> >>>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
> >>>>       physical nic will see all other guest traffic
> >>>>       but the filtering at guest virtio-nic
> >>> 
> >>> I don't think guests currently filter anything.
> >>> 
> >> I was referring to Qemu-kvm virtio-net in
> >> virtion_net_receive->receive_filter. I think It only passes pkts that the
> >> guest OS is interested. It uses the filter table that I am passing to
> >> macvtap in this patch.
> > 
> > This happens after userspace thread gets woken up and data
> > is copied there. So relying on filtering at that level is
> > going to be very inefficient on a system with
> > multiple active guests. Further, and for that reason, vhost-net
> > doesn't do filtering at all, relying on the backends
> > to pass it correct packets.
> 
> Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode
> and for non-PASSTHRU when there is a single guest connected to a VF.
> For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely
> better than putting the VF in promiscuous mode.
> But to address the concern you mention above, in phase 2 when we have more
> than one guest sharing the VF,

It's probably more interesting for a card without SRIOV support.

> we will have to add filter lookup in macvlan
> to filter pkts for each guest.

Any chance to enable hardware filters for that?

> This will need some performance tests too.
> 
> Will start investigating the netlink interface comments for phase 1 first.
> 
> Thanks!
> -Roopa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-09  3:00               ` Roopa Prabhu
  2011-09-09  4:25                 ` Sridhar Samudrala
@ 2011-09-11  9:44                 ` Michael S. Tsirkin
  2011-09-11 13:18                   ` Roopa Prabhu
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-11  9:44 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On Thu, Sep 08, 2011 at 08:00:53PM -0700, Roopa Prabhu wrote:
> 
> 
> 
> On 9/8/11 12:33 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Sep 08, 2011 at 12:23:56PM -0700, Roopa Prabhu wrote:
> >>> 
> >>> I think the main usecase for passthru mode is to assign a SR-IOV VF to
> >>> a single guest.
> >>> 
> >> Yes and for the passthru usecase this patch should be enough to enable
> >> filtering in hw (eventually like I indicated before I need to fix vlan
> >> filtering too).
> > 
> > So with filtering in hw, and in sriov VF case, VFs
> > actually share a filtering table. How will that
> > be partitioned?
> 
> AFAIK, though it might maintain a single filter table space in hw, hw does
> know which filter belongs to which VF. And the OS driver does not need to do
> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
> registered with a VF netdev are registered with the hw by the driver. And hw
> will filter and send only pkts that the VF has expressed interest in.
> 
> No special filter partitioning in hw is required.
> 
> Thanks,
> Roopa

Yes, but what I mean is, if the size of the single filter table
is limited, we need to decide how many addresses is
each guest allowed. If we let one guest ask for
as many as it wants, it can lock others out.

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11  9:44                 ` Michael S. Tsirkin
@ 2011-09-11 13:18                   ` Roopa Prabhu
  2011-09-11 19:03                     ` Michael S. Tsirkin
  2011-09-12  4:30                     ` Sridhar Samudrala
  0 siblings, 2 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-11 13:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/11/11 2:44 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

>> 
>> AFAIK, though it might maintain a single filter table space in hw, hw does
>> know which filter belongs to which VF. And the OS driver does not need to do
>> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
>> registered with a VF netdev are registered with the hw by the driver. And hw
>> will filter and send only pkts that the VF has expressed interest in.
>> 
>> No special filter partitioning in hw is required.
>> 
>> Thanks,
>> Roopa
> 
> Yes, but what I mean is, if the size of the single filter table
> is limited, we need to decide how many addresses is
> each guest allowed. If we let one guest ask for
> as many as it wants, it can lock others out.

Yes true. In these cases ie when the number of unicast addresses being
registered is more than it can handle, The VF driver will put the VF  in
promiscuous mode (Or at least its supposed to do. I think all drivers do
that).


Thanks,
Roopa



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11  9:38                 ` Michael S. Tsirkin
@ 2011-09-11 13:18                   ` Roopa Prabhu
  2011-09-11 18:52                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-11 13:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm




On 9/11/11 2:38 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:
>> 
>> 
>> 
>> On 9/8/11 10:55 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
>>>>>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
>>>>>>     - In macvlan passthru mode the single guest virtio-nic connected will
>>>>>>       receive traffic that he requested for
>>>>>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
>>>>>>       physical nic will see all other guest traffic
>>>>>>       but the filtering at guest virtio-nic
>>>>> 
>>>>> I don't think guests currently filter anything.
>>>>> 
>>>> I was referring to Qemu-kvm virtio-net in
>>>> virtion_net_receive->receive_filter. I think It only passes pkts that the
>>>> guest OS is interested. It uses the filter table that I am passing to
>>>> macvtap in this patch.
>>> 
>>> This happens after userspace thread gets woken up and data
>>> is copied there. So relying on filtering at that level is
>>> going to be very inefficient on a system with
>>> multiple active guests. Further, and for that reason, vhost-net
>>> doesn't do filtering at all, relying on the backends
>>> to pass it correct packets.
>> 
>> Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode
>> and for non-PASSTHRU when there is a single guest connected to a VF.
>> For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely
>> better than putting the VF in promiscuous mode.
>> But to address the concern you mention above, in phase 2 when we have more
>> than one guest sharing the VF,
> 
> It's probably more interesting for a card without SRIOV support.
> 
If its an SRIOV card I am assuming people likely using PASSTHRU mode.
Non-SRIOV cards will use any of the non-PASSTHRU mode.


>> we will have to add filter lookup in macvlan
>> to filter pkts for each guest.
> 
> Any chance to enable hardware filters for that?
> 
NAFAIK. Am not sure how you would do it too. Its still a single device from
where the host receives traffic from.

Thanks,
Roopa
 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11 13:18                   ` Roopa Prabhu
@ 2011-09-11 18:52                     ` Michael S. Tsirkin
  2011-09-12 13:38                       ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-11 18:52 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm

On Sun, Sep 11, 2011 at 06:18:02AM -0700, Roopa Prabhu wrote:
> 
> 
> 
> On 9/11/11 2:38 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:
> >> 
> >> 
> >> 
> >> On 9/8/11 10:55 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> 
> >>> On Thu, Sep 08, 2011 at 07:53:11PM -0700, Roopa Prabhu wrote:
> >>>>>> Phase 1: Goal: Enable hardware filtering for all macvlan modes
> >>>>>>     - In macvlan passthru mode the single guest virtio-nic connected will
> >>>>>>       receive traffic that he requested for
> >>>>>>     - In macvlan non-passthru mode all guest virtio-nics sharing the
> >>>>>>       physical nic will see all other guest traffic
> >>>>>>       but the filtering at guest virtio-nic
> >>>>> 
> >>>>> I don't think guests currently filter anything.
> >>>>> 
> >>>> I was referring to Qemu-kvm virtio-net in
> >>>> virtion_net_receive->receive_filter. I think It only passes pkts that the
> >>>> guest OS is interested. It uses the filter table that I am passing to
> >>>> macvtap in this patch.
> >>> 
> >>> This happens after userspace thread gets woken up and data
> >>> is copied there. So relying on filtering at that level is
> >>> going to be very inefficient on a system with
> >>> multiple active guests. Further, and for that reason, vhost-net
> >>> doesn't do filtering at all, relying on the backends
> >>> to pass it correct packets.
> >> 
> >> Ok thanks for the info. So in which case, phase 1 is best for PASSTHRU mode
> >> and for non-PASSTHRU when there is a single guest connected to a VF.
> >> For non-PASSTHRU multi guest sharing the same VF, Phase 1 is definitely
> >> better than putting the VF in promiscuous mode.
> >> But to address the concern you mention above, in phase 2 when we have more
> >> than one guest sharing the VF,
> > 
> > It's probably more interesting for a card without SRIOV support.
> > 
> If its an SRIOV card I am assuming people likely using PASSTHRU mode.
> Non-SRIOV cards will use any of the non-PASSTHRU mode.
> 
> 
> >> we will have to add filter lookup in macvlan
> >> to filter pkts for each guest.
> > 
> > Any chance to enable hardware filters for that?
> > 
> NAFAIK. Am not sure how you would do it too. Its still a single device from
> where the host receives traffic from.
> 
> Thanks,
> Roopa

VMDQ cards might let you program mac addresses for individula rings.


-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11 13:18                   ` Roopa Prabhu
@ 2011-09-11 19:03                     ` Michael S. Tsirkin
  2011-09-12 17:02                       ` Roopa Prabhu
  2011-09-12  4:30                     ` Sridhar Samudrala
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2011-09-11 19:03 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
> 
> 
> 
> On 9/11/11 2:44 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> >> 
> >> AFAIK, though it might maintain a single filter table space in hw, hw does
> >> know which filter belongs to which VF. And the OS driver does not need to do
> >> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
> >> registered with a VF netdev are registered with the hw by the driver. And hw
> >> will filter and send only pkts that the VF has expressed interest in.
> >> 
> >> No special filter partitioning in hw is required.
> >> 
> >> Thanks,
> >> Roopa
> > 
> > Yes, but what I mean is, if the size of the single filter table
> > is limited, we need to decide how many addresses is
> > each guest allowed. If we let one guest ask for
> > as many as it wants, it can lock others out.
> 
> Yes true. In these cases ie when the number of unicast addresses being
> registered is more than it can handle, The VF driver will put the VF  in
> promiscuous mode (Or at least its supposed to do. I think all drivers do
> that).
> 
> 
> Thanks,
> Roopa

Right, so that works at least but likely performs worse
than a hardware filter. So we better allocate it in
some fair way, as a minimum. Maybe a way for
the admin to control that allocation is useful.

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11 13:18                   ` Roopa Prabhu
  2011-09-11 19:03                     ` Michael S. Tsirkin
@ 2011-09-12  4:30                     ` Sridhar Samudrala
  2011-09-12 17:23                       ` Roopa Prabhu
  1 sibling, 1 reply; 32+ messages in thread
From: Sridhar Samudrala @ 2011-09-12  4:30 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm

On 9/11/2011 6:18 AM, Roopa Prabhu wrote:
>
>
> On 9/11/11 2:44 AM, "Michael S. Tsirkin"<mst@redhat.com>  wrote:
>
>>> AFAIK, though it might maintain a single filter table space in hw, hw does
>>> know which filter belongs to which VF. And the OS driver does not need to do
>>> anything special. The VF driver exposes a VF netdev. And any uc/mc addresses
>>> registered with a VF netdev are registered with the hw by the driver. And hw
>>> will filter and send only pkts that the VF has expressed interest in.
>>>
>>> No special filter partitioning in hw is required.
>>>
>>> Thanks,
>>> Roopa
>> Yes, but what I mean is, if the size of the single filter table
>> is limited, we need to decide how many addresses is
>> each guest allowed. If we let one guest ask for
>> as many as it wants, it can lock others out.
> Yes true. In these cases ie when the number of unicast addresses being
> registered is more than it can handle, The VF driver will put the VF  in
> promiscuous mode (Or at least its supposed to do. I think all drivers do
> that).
>
What does putting VF in promiscuous mode mean?  How can the NIC decide 
which set
of mac addresses are passed to the VF? Does it mean VF sees all the 
packets received
by the NIC including packets destined for other VFs/PF?

Thanks
Sridhar


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11 18:52                     ` Michael S. Tsirkin
@ 2011-09-12 13:38                       ` Roopa Prabhu
  0 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-12 13:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, dragos.tatulea, arnd, dwang2, benve, kaber, sri, davem,
	eric.dumazet, mchan, kvm




On 9/11/11 11:52 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sun, Sep 11, 2011 at 06:18:02AM -0700, Roopa Prabhu wrote:
>> 
>> 
>> 
>> On 9/11/11 2:38 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> On Fri, Sep 09, 2011 at 09:33:33AM -0700, Roopa Prabhu wrote:

>>> 
>>> It's probably more interesting for a card without SRIOV support.
>>> 
>> If its an SRIOV card I am assuming people likely using PASSTHRU mode.
>> Non-SRIOV cards will use any of the non-PASSTHRU mode.
>> 
>> 
>>>> we will have to add filter lookup in macvlan
>>>> to filter pkts for each guest.
>>> 
>>> Any chance to enable hardware filters for that?
>>> 
>> NAFAIK. Am not sure how you would do it too. Its still a single device from
>> where the host receives traffic from.
>> 
>> Thanks,
>> Roopa
> 
> VMDQ cards might let you program mac addresses for individula rings.
> 
I tried to lookup more information on this. I dint find any concrete
information. I am not sure if individual rings show up as separate netdev.
Any more info on how a VMDQ nic is used with macvlan ?.

I came across this 
http://www.linux-kvm.org/wiki/images/6/6a/KvmForum2008$kdf2008_7.pdf

Thanks,
Roopa
 
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-11 19:03                     ` Michael S. Tsirkin
@ 2011-09-12 17:02                       ` Roopa Prabhu
  2011-09-15 13:46                         ` Roopa Prabhu
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-12 17:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/11/11 12:03 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
>> 
>> 
>> 
>> On 9/11/11 2:44 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> 
>>> Yes, but what I mean is, if the size of the single filter table
>>> is limited, we need to decide how many addresses is
>>> each guest allowed. If we let one guest ask for
>>> as many as it wants, it can lock others out.
>> 
>> Yes true. In these cases ie when the number of unicast addresses being
>> registered is more than it can handle, The VF driver will put the VF  in
>> promiscuous mode (Or at least its supposed to do. I think all drivers do
>> that).
>> 
>> 
>> Thanks,
>> Roopa
> 
> Right, so that works at least but likely performs worse
> than a hardware filter. So we better allocate it in
> some fair way, as a minimum. Maybe a way for
> the admin to control that allocation is useful.

Yes I think we will have to do something like that. There is a maximum that
hw can support. Might need to consider that too. But there is no interface
to get that today. I think the virtualization case gets a little trickier.
Virtio-net allows upto 64 unicast addresses. But the lowerdev may allow only
upto say 10 unicast addresses (I think intel supports 10 unicast addresses
on the VF). Am not sure if there is a good way to notify the guest of
blocked addresses. Maybe putting the lower dev in promiscuous mode could be
a policy decision too in this case.

One other thing, I had indicated that I will look up details on opening my
patch for non-passthru to enable hw filtering (without adding filtering
support in macvlan right away. Ie phase1). Turns out in current code in
macvlan_handle_frame, for non-passthru case, it does not fwd unicast pkts
destined to macs other than the ones in macvlan hash. So a filter or hash
lookup there for additional unicast addresses needs to be definitely added
for non-passthru.

Thanks,
Roopa


 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-12  4:30                     ` Sridhar Samudrala
@ 2011-09-12 17:23                       ` Roopa Prabhu
  0 siblings, 0 replies; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-12 17:23 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Michael S. Tsirkin, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm




On 9/11/11 9:30 PM, "Sridhar Samudrala" <sri@us.ibm.com> wrote:

> On 9/11/2011 6:18 AM, Roopa Prabhu wrote:
>> 
>> 
>> On 9/11/11 2:44 AM, "Michael S. Tsirkin"<mst@redhat.com>  wrote:
>> 
>>>> AFAIK, though it might maintain a single filter table space in hw, hw does
>>>> know which filter belongs to which VF. And the OS driver does not need to
>>>> do
>>>> anything special. The VF driver exposes a VF netdev. And any uc/mc
>>>> addresses
>>>> registered with a VF netdev are registered with the hw by the driver. And
>>>> hw
>>>> will filter and send only pkts that the VF has expressed interest in.
>>>> 
>>>> No special filter partitioning in hw is required.
>>>> 
>>>> Thanks,
>>>> Roopa
>>> Yes, but what I mean is, if the size of the single filter table
>>> is limited, we need to decide how many addresses is
>>> each guest allowed. If we let one guest ask for
>>> as many as it wants, it can lock others out.
>> Yes true. In these cases ie when the number of unicast addresses being
>> registered is more than it can handle, The VF driver will put the VF  in
>> promiscuous mode (Or at least its supposed to do. I think all drivers do
>> that).
>> 
> What does putting VF in promiscuous mode mean?  How can the NIC decide
> which set
> of mac addresses are passed to the VF? Does it mean VF sees all the
> packets received
> by the NIC including packets destined for other VFs/PF?
> 
Yes I think so. After your question I looked at 2 other  VF drivers and
looks like they return error if num unicast addresses exceeds the number
supported by hw and don't put the VF in promiscuous mode. But one could put
the VF in promiscuous mode by changing IFF_FLAGS I think.

The original in-kernel passthru mode code puts the VF in promiscuous mode by
default. Am assuming that works well with other sriov cards you got a chance
to try out with.

Thanks,
Roopa

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-12 17:02                       ` Roopa Prabhu
@ 2011-09-15 13:46                         ` Roopa Prabhu
  2011-09-26 23:06                           ` Christian Benvenuti (benve)
  0 siblings, 1 reply; 32+ messages in thread
From: Roopa Prabhu @ 2011-09-15 13:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd, dwang2, benve,
	kaber, davem, eric.dumazet, mchan, kvm



The netlink patch is still in the works. I will post the patches after I
clean it up a bit and also accommodate or find answers to most questions
discussed for non-passthru case. Thought I will post the netlink interface
here to see if anyone has any early comments. I have a
rtnl_link_ops->set_rx_filter defined.

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }    
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

Some open questions:
    - The VLAN filter above shows a VLAN list. It could also be a bitmap or
the interface could provide both a bitmap and VLAN list for more flexibility
. Like the below  

[IFLA_RX_FILTER] = {
    [IFLA_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }    
    [IFLA_VLAN_FILTER] = {
        [IFLA_VLAN_BITMAP]
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    }
}

    - Do you see any advantage in keeping Unicast and multicast address list
separate ? Something like the below :
    [IFLA_RX_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_UC_ADDRESS_FILTER] = {
            [IFLA_ADDRESS_LIST] = {
                [IFLA_ADDRESS_LIST_ENTRY]
            }
        }
        [IFLA_MC_ADDRESS_FILTER] = {
            [IFLA_ADDRESS_LIST] = {
                [IFLA_ADDRESS_LIST_ENTRY]
            }
        }
        [IFLA_VLAN_FILTER] = {
            [IFLA_VLAN_LIST] = {
                [IFLA_VLAN]
            }
        }
    } 

    - Is there any need to keep address and vlan filters separate. And have
two rtnl_link_ops, set_rx_address_filter, set_rx_vlan_filter ?. I don't see
one .

    [IFLA_RX_ADDRESS_FILTER] = {
        [IFLA_ADDRESS_FILTER_FLAGS]
        [IFLA_ADDRESS_LIST] = {
            [IFLA_ADDRESS_LIST_ENTRY]
        }
    }
    [IFLA_RX_VLAN_FILTER] = {
        [IFLA_VLAN_LIST] = {
            [IFLA_VLAN]
        }
    } 


Thanks,
Roopa



On 9/12/11 10:02 AM, "Roopa Prabhu" <roprabhu@cisco.com> wrote:

> 
> 
> 
> On 9/11/11 12:03 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
>>> 
>>> 
>>> 
>>> On 9/11/11 2:44 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>> 
>>>> 
>>>> Yes, but what I mean is, if the size of the single filter table
>>>> is limited, we need to decide how many addresses is
>>>> each guest allowed. If we let one guest ask for
>>>> as many as it wants, it can lock others out.
>>> 
>>> Yes true. In these cases ie when the number of unicast addresses being
>>> registered is more than it can handle, The VF driver will put the VF  in
>>> promiscuous mode (Or at least its supposed to do. I think all drivers do
>>> that).
>>> 
>>> 
>>> Thanks,
>>> Roopa
>> 
>> Right, so that works at least but likely performs worse
>> than a hardware filter. So we better allocate it in
>> some fair way, as a minimum. Maybe a way for
>> the admin to control that allocation is useful.
> 
> Yes I think we will have to do something like that. There is a maximum that hw
> can support. Might need to consider that too. But there is no interface to get
> that today. I think the virtualization case gets a little trickier. Virtio-net
> allows upto 64 unicast addresses. But the lowerdev may allow only upto say 10
> unicast addresses (I think intel supports 10 unicast addresses on the VF). Am
> not sure if there is a good way to notify the guest of blocked addresses.
> Maybe putting the lower dev in promiscuous mode could be a policy decision too
> in this case. 
> 
> One other thing, I had indicated that I will look up details on opening my
> patch for non-passthru to enable hw filtering (without adding filtering
> support in macvlan right away. Ie phase1). Turns out in current code in
> macvlan_handle_frame, for non-passthru case, it does not fwd unicast pkts
> destined to macs other than the ones in macvlan hash. So a filter or hash
> lookup there for additional unicast addresses needs to be definitely added for
> non-passthru.
> 
> Thanks,
> Roopa
> 
> 
>  


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode
  2011-09-15 13:46                         ` Roopa Prabhu
@ 2011-09-26 23:06                           ` Christian Benvenuti (benve)
  0 siblings, 0 replies; 32+ messages in thread
From: Christian Benvenuti (benve) @ 2011-09-26 23:06 UTC (permalink / raw)
  To: Roopa Prabhu (roprabhu), Michael S. Tsirkin
  Cc: Sridhar Samudrala, netdev, dragos.tatulea, arnd,
	David Wang (dwang2),
	kaber, davem, eric.dumazet, mchan, kvm

> -----Original Message-----
> From: Roopa Prabhu (roprabhu)
> Sent: Thursday, September 15, 2011 6:47 AM
> To: Michael S. Tsirkin
> Cc: Sridhar Samudrala; netdev@vger.kernel.org;
> dragos.tatulea@gmail.com; arnd@arndb.de; David Wang (dwang2);
Christian
> Benvenuti (benve); kaber@trash.net; davem@davemloft.net;
> eric.dumazet@gmail.com; mchan@broadcom.com; kvm@vger.kernel.org
> Subject: Re: [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address
> filtering support for passthru mode
> 
> 
> 
> The netlink patch is still in the works. I will post the patches after
> I
> clean it up a bit and also accommodate or find answers to most
> questions
> discussed for non-passthru case. Thought I will post the netlink
> interface
> here to see if anyone has any early comments. I have a
> rtnl_link_ops->set_rx_filter defined.
> 
> [IFLA_RX_FILTER] = {
>     [IFLA_ADDRESS_FILTER] = {
>         [IFLA_ADDRESS_FILTER_FLAGS]
>         [IFLA_ADDRESS_LIST] = {
>             [IFLA_ADDRESS_LIST_ENTRY]
>         }
>     }
>     [IFLA_VLAN_FILTER] = {
>         [IFLA_VLAN_LIST] = {
>             [IFLA_VLAN]
>         }
>     }
> }
> 
> Some open questions:
>     - The VLAN filter above shows a VLAN list. It could also be a
> bitmap or
> the interface could provide both a bitmap and VLAN list for more
> flexibility
> . Like the below
> 
> [IFLA_RX_FILTER] = {
>     [IFLA_ADDRESS_FILTER] = {
>         [IFLA_ADDRESS_FILTER_FLAGS]
>         [IFLA_ADDRESS_LIST] = {
>             [IFLA_ADDRESS_LIST_ENTRY]
>         }
>     }
>     [IFLA_VLAN_FILTER] = {
>         [IFLA_VLAN_BITMAP]
>         [IFLA_VLAN_LIST] = {
>             [IFLA_VLAN]
>         }
>     }
> }

The simplest interface probably is to stick to a bitmap (knowing that in
the worst
case it will take 256 bytes, but we can compress it ...), because
sending
a vlan list may end up requiring much more than that (on interfaces
configured as trunks).
This regardless of whether the most common use case is that of a server
configured
with just few vlans or that of a switch configured with few trunks.

Another option would be a list of ranges, but that one would not work
well
in those cases where trunks are configured, for example, to carry big
numbers
of odd or even vlan IDs or other groups of vlans IDs that cannot be
grouped
into ranges. Probably an acceptable compromise, if we care about the
size
of this attribute, would be:
- to use a list of IDs for less than 256 vlans (or a list of ranges)
- to use a bitmap for more than 256 vlan.

I would recommend the two attributes (IFLA_VLAN_BITMAP and
IFLA_VLAN_LIST)
to be mutually exclusive to reduce the complexity of the merging and
error/misconfig detection code.

>     - Do you see any advantage in keeping Unicast and multicast
address
> list
> separate ? Something like the below :
>     [IFLA_RX_FILTER] = {
>         [IFLA_ADDRESS_FILTER_FLAGS]
>         [IFLA_UC_ADDRESS_FILTER] = {
>             [IFLA_ADDRESS_LIST] = {
>                 [IFLA_ADDRESS_LIST_ENTRY]
>             }
>         }
>         [IFLA_MC_ADDRESS_FILTER] = {
>             [IFLA_ADDRESS_LIST] = {
>                 [IFLA_ADDRESS_LIST_ENTRY]
>             }
>         }
>         [IFLA_VLAN_FILTER] = {
>             [IFLA_VLAN_LIST] = {
>                 [IFLA_VLAN]
>             }
>         }
>     }

I personally like the idea of grouping UC and MC addresses into two
distinct
attributes/groups.
The receiver (the kernel) would have to check them anyway (I suppose),
but
I like the idea of having the kernel being able to detect the case
where, for
example, the user configures a MC address thinking he is actually
configuring
a UC address.

Most probably the iproute2 commands used to configure/add/del UC and MC
address
will be assigned two different keywords.
BTW, once this code will be in, it will be possible for "ip link show"
to show
all UC/MC MAC addresses; right now "ip link" only shows dev->dev_addr.
This output is useful for debugging.

The output from "ip maddr" only shows the MC list and anyway I think the
best
place for the list of MAC addresses is "ip link show".

Would "ip link show" also show the list of vlans?
Probably, best would be to add new flags (to ask for the extended
output) or
simply add the extra output (uc/mc/vlan lists) under the already
existent "-s" flag ?

>     - Is there any need to keep address and vlan filters separate. And
> have
> two rtnl_link_ops, set_rx_address_filter, set_rx_vlan_filter ?. I
don't
> see
> one .
> 
>     [IFLA_RX_ADDRESS_FILTER] = {
>         [IFLA_ADDRESS_FILTER_FLAGS]
>         [IFLA_ADDRESS_LIST] = {
>             [IFLA_ADDRESS_LIST_ENTRY]
>         }
>     }
>     [IFLA_RX_VLAN_FILTER] = {
>         [IFLA_VLAN_LIST] = {
>             [IFLA_VLAN]
>         }
>     }

I think both approaches are good.
Anyway, given that you can have/configure nested vlans, having
IFLA_RX_VLAN_FILTER
inside IFLA_RX_FILTER would be syntactically correct too.

/Chris

 
> Thanks,
> Roopa
> 
> 
> 
> On 9/12/11 10:02 AM, "Roopa Prabhu" <roprabhu@cisco.com> wrote:
> 
> >
> >
> >
> > On 9/11/11 12:03 PM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >
> >> On Sun, Sep 11, 2011 at 06:18:01AM -0700, Roopa Prabhu wrote:
> >>>
> >>>
> >>>
> >>> On 9/11/11 2:44 AM, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>>
> >>>>
> >>>> Yes, but what I mean is, if the size of the single filter table
> >>>> is limited, we need to decide how many addresses is
> >>>> each guest allowed. If we let one guest ask for
> >>>> as many as it wants, it can lock others out.
> >>>
> >>> Yes true. In these cases ie when the number of unicast addresses
> being
> >>> registered is more than it can handle, The VF driver will put the
> VF  in
> >>> promiscuous mode (Or at least its supposed to do. I think all
> drivers do
> >>> that).
> >>>
> >>>
> >>> Thanks,
> >>> Roopa
> >>
> >> Right, so that works at least but likely performs worse
> >> than a hardware filter. So we better allocate it in
> >> some fair way, as a minimum. Maybe a way for
> >> the admin to control that allocation is useful.
> >
> > Yes I think we will have to do something like that. There is a
> maximum that hw
> > can support. Might need to consider that too. But there is no
> interface to get
> > that today. I think the virtualization case gets a little trickier.
> Virtio-net
> > allows upto 64 unicast addresses. But the lowerdev may allow only
> upto say 10
> > unicast addresses (I think intel supports 10 unicast addresses on
the
> VF). Am
> > not sure if there is a good way to notify the guest of blocked
> addresses.
> > Maybe putting the lower dev in promiscuous mode could be a policy
> decision too
> > in this case.
> >
> > One other thing, I had indicated that I will look up details on
> opening my
> > patch for non-passthru to enable hw filtering (without adding
> filtering
> > support in macvlan right away. Ie phase1). Turns out in current code
> in
> > macvlan_handle_frame, for non-passthru case, it does not fwd unicast
> pkts
> > destined to macs other than the ones in macvlan hash. So a filter or
> hash
> > lookup there for additional unicast addresses needs to be definitely
> added for
> > non-passthru.
> >
> > Thanks,
> > Roopa
> >
> >
> >

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2011-09-26 23:06 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-06 22:35 [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Roopa Prabhu
2011-09-06 22:35 ` [net-next-2.6 PATCH 1/3 RFC] macvlan: Add support for unicast filtering in macvlan Roopa Prabhu
2011-09-06 22:35 ` [net-next-2.6 PATCH 2/3 RFC] macvlan: Add function to set addr filters for device in passthru mode Roopa Prabhu
2011-09-06 22:35 ` [net-next-2.6 PATCH 3/3 RFC] macvtap: Add support for TUNSETTXFILTER Roopa Prabhu
2011-09-08 16:25   ` Arnd Bergmann
2011-09-08 19:06     ` Roopa Prabhu
2011-09-07 12:34 ` [net-next-2.6 PATCH 0/3 RFC] macvlan: MAC Address filtering support for passthru mode Michael S. Tsirkin
2011-09-08  5:20   ` Roopa Prabhu
2011-09-08 11:08     ` Michael S. Tsirkin
2011-09-08 16:19       ` Roopa Prabhu
2011-09-08 17:42         ` Sridhar Samudrala
2011-09-08 19:23           ` Roopa Prabhu
2011-09-08 19:33             ` Michael S. Tsirkin
2011-09-09  3:00               ` Roopa Prabhu
2011-09-09  4:25                 ` Sridhar Samudrala
2011-09-09 16:21                   ` Roopa Prabhu
2011-09-11  9:44                 ` Michael S. Tsirkin
2011-09-11 13:18                   ` Roopa Prabhu
2011-09-11 19:03                     ` Michael S. Tsirkin
2011-09-12 17:02                       ` Roopa Prabhu
2011-09-15 13:46                         ` Roopa Prabhu
2011-09-26 23:06                           ` Christian Benvenuti (benve)
2011-09-12  4:30                     ` Sridhar Samudrala
2011-09-12 17:23                       ` Roopa Prabhu
2011-09-08 19:11         ` Michael S. Tsirkin
2011-09-09  2:53           ` Roopa Prabhu
2011-09-09  5:55             ` Michael S. Tsirkin
2011-09-09 16:33               ` Roopa Prabhu
2011-09-11  9:38                 ` Michael S. Tsirkin
2011-09-11 13:18                   ` Roopa Prabhu
2011-09-11 18:52                     ` Michael S. Tsirkin
2011-09-12 13:38                       ` Roopa Prabhu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.