All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-10 21:28 ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

This series adds infrastructure and support for using the filtering
support built into recent Linux kernels.  The tap device currently
provides a small array for exact MAC address matches, plus a hash
for inexact multicast addresses.  By programing this filter, we can
reduce the number of extraneous packets qemu needs to handle.

I've restricted the configuration where this can be used to only
the case of having two devices on a qemu vlan.  In this case, we
have a clear consumer/produce model for the backend NIC to request
filtering from the TAP source.  To handle hotplug, I've added
notifiers that allow vlan clients to be notified when someone gets
added or removed so they can return to a non-filtered (of self-
filtered) mode.

This series also includes code for both virtio-net and e1000 to
make use of the filtering.  The tap devices provides a new
rxfilter=off option to disable exporting this new feature to
the backends.  The primary reason for this is that the current
Linux tun driver has a bug that if the exact match array
overflows, there's a possiblity of dropping unicast packets.
There are currently 8 exact matches, so the guest needs to be
trying to filter more than 8 to hit this bug.  I've already
submitted a patch for 2.6.29 and stable to fix this.

Please comment, thanks,

Alex

---

Alex Williamson (4):
      qemu:e1000: Add support for qemu_vlan_rxfilter
      qemu:virtio-net: Add support for qemu_vlan_rxfilter
      qemu:net: Add TAP support for RX filtering on Linux
      qemu:net: Add infrastructure for setting an RX filter through the vlan


 hw/e1000.c      |   60 +++++++++++++++++++++++++++++++++
 hw/virtio-net.c |   56 ++++++++++++++++++++++++++++++-
 net.c           |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 net.h           |   14 ++++++++
 qemu-doc.texi   |    6 ++-
 vl.c            |    3 +-
 6 files changed, 226 insertions(+), 12 deletions(-)

-- 
Alex Williamson

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-10 21:28 ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

This series adds infrastructure and support for using the filtering
support built into recent Linux kernels.  The tap device currently
provides a small array for exact MAC address matches, plus a hash
for inexact multicast addresses.  By programing this filter, we can
reduce the number of extraneous packets qemu needs to handle.

I've restricted the configuration where this can be used to only
the case of having two devices on a qemu vlan.  In this case, we
have a clear consumer/produce model for the backend NIC to request
filtering from the TAP source.  To handle hotplug, I've added
notifiers that allow vlan clients to be notified when someone gets
added or removed so they can return to a non-filtered (of self-
filtered) mode.

This series also includes code for both virtio-net and e1000 to
make use of the filtering.  The tap devices provides a new
rxfilter=off option to disable exporting this new feature to
the backends.  The primary reason for this is that the current
Linux tun driver has a bug that if the exact match array
overflows, there's a possiblity of dropping unicast packets.
There are currently 8 exact matches, so the guest needs to be
trying to filter more than 8 to hit this bug.  I've already
submitted a patch for 2.6.29 and stable to fix this.

Please comment, thanks,

Alex

---

Alex Williamson (4):
      qemu:e1000: Add support for qemu_vlan_rxfilter
      qemu:virtio-net: Add support for qemu_vlan_rxfilter
      qemu:net: Add TAP support for RX filtering on Linux
      qemu:net: Add infrastructure for setting an RX filter through the vlan


 hw/e1000.c      |   60 +++++++++++++++++++++++++++++++++
 hw/virtio-net.c |   56 ++++++++++++++++++++++++++++++-
 net.c           |   99 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 net.h           |   14 ++++++++
 qemu-doc.texi   |    6 ++-
 vl.c            |    3 +-
 6 files changed, 226 insertions(+), 12 deletions(-)

-- 
Alex Williamson

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/4] qemu:net: Add infrastructure for setting an RX filter through the vlan
  2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
@ 2009-02-10 21:28   ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

Some interfaces, such as tap, can set a filter to prevent unwanted
packets from interrupting qemu.  This patch adds infrastructure for
vlan clients to request a set of filtering parameters.  The main
client interface for this is the new qemu_vlan_rxfilter() function.
For simplicity, we restrict the topology that can make use of a filter
to exactly two clients on a vlan.  To handle hotplug devices, we add
vlan_client_added/removed notifiers which can automatically revert
back to a shared media protocol.  Devices providing filtering can do
so via the new fd_rxfilter() interface.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 net.c |   43 ++++++++++++++++++++++++++++++++++++++++++-
 net.h |   14 ++++++++++++++
 2 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/net.c b/net.c
index e7c097e..e68cb40 100644
--- a/net.c
+++ b/net.c
@@ -345,8 +345,11 @@ VLANClientState *qemu_new_vlan_client(VLANState *vlan,
 
     vc->next = NULL;
     pvc = &vlan->first_client;
-    while (*pvc != NULL)
+    while (*pvc != NULL) {
+        if ((*pvc)->vlan_client_added)
+            (*pvc)->vlan_client_added((*pvc)->opaque);
         pvc = &(*pvc)->next;
+    }
     *pvc = vc;
     return vc;
 }
@@ -364,6 +367,12 @@ void qemu_del_vlan_client(VLANClientState *vc)
             break;
         } else
             pvc = &(*pvc)->next;
+
+    /* Notify other clients after removal for dependencies on new topology */
+    pvc = &vc->vlan->first_client;
+    while (*pvc != NULL)
+        if ((*pvc)->vlan_client_removed)
+            (*pvc)->vlan_client_removed((*pvc)->opaque);
 }
 
 int qemu_can_send_packet(VLANClientState *vc1)
@@ -458,6 +467,38 @@ ssize_t qemu_sendv_packet(VLANClientState *vc1, const struct iovec *iov,
     return max_len;
 }
 
+/*
+ * This function allows a qemu vlan client to request filtering on
+ * it's receiving interface.  We only use this for the typical case
+ * of having exactly two clients on the vlan.  Clients making use
+ * of this function should also implement a vlan_client_added()
+ * function to be notified when they need to fall back to their own
+ * filtering or disable filtering setup for a single consumer.
+ *
+ * A return value of 1 indicates the entire requested filter set
+ * is handled by the provided filter.
+ */
+int qemu_vlan_rxfilter(VLANClientState *vc1, int flags, int count,
+                       uint8_t *buf)
+{
+    VLANState *vlan = vc1->vlan;
+    VLANClientState *vc;
+
+    if (vlan->first_client == NULL || vlan->first_client->next == NULL ||
+        vlan->first_client->next->next != NULL)
+        return 0;
+
+    if (vlan->first_client == vc1)
+        vc = vc1->next;
+    else
+        vc = vlan->first_client;
+
+    if (vc->fd_rxfilter)
+        return (vc->fd_rxfilter(vc->opaque, flags, count, buf) == count);
+
+    return 0;
+}
+
 #if defined(CONFIG_SLIRP)
 
 /* slirp network adapter */
diff --git a/net.h b/net.h
index 291807a..2090111 100644
--- a/net.h
+++ b/net.h
@@ -11,13 +11,22 @@ typedef struct VLANClientState VLANClientState;
 
 typedef void (LinkStatusChanged)(VLANClientState *);
 
+typedef int (IORXFilter)(void *, unsigned int , int , uint8_t *);
+
+typedef void (VLANClientAdded)(void *);
+
+typedef void (VLANClientRemoved)(void *);
+
 struct VLANClientState {
     IOReadHandler *fd_read;
     IOReadvHandler *fd_readv;
     /* Packets may still be sent if this returns zero.  It's used to
        rate-limit the slirp code.  */
     IOCanRWHandler *fd_can_read;
+    IORXFilter *fd_rxfilter;
     LinkStatusChanged *link_status_changed;
+    VLANClientAdded *vlan_client_added;
+    VLANClientRemoved *vlan_client_removed;
     int link_down;
     void *opaque;
     struct VLANClientState *next;
@@ -51,10 +60,15 @@ void qemu_check_nic_model(NICInfo *nd, const char *model);
 void qemu_check_nic_model_list(NICInfo *nd, const char * const *models,
                                const char *default_model);
 void qemu_handler_true(void *opaque);
+int qemu_vlan_rxfilter(VLANClientState *vc1, int flags, int count,
+                       uint8_t *buf);
 
 void do_info_network(void);
 int do_set_link(const char *name, const char *up_or_down);
 
+#define QEMU_NET_PROMISC     1
+#define QEMU_NET_ALLMULTI    2
+
 /* NIC info */
 
 #define MAX_NICS 8


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 1/4] qemu:net: Add infrastructure for setting an RX filter through the vlan
@ 2009-02-10 21:28   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

Some interfaces, such as tap, can set a filter to prevent unwanted
packets from interrupting qemu.  This patch adds infrastructure for
vlan clients to request a set of filtering parameters.  The main
client interface for this is the new qemu_vlan_rxfilter() function.
For simplicity, we restrict the topology that can make use of a filter
to exactly two clients on a vlan.  To handle hotplug devices, we add
vlan_client_added/removed notifiers which can automatically revert
back to a shared media protocol.  Devices providing filtering can do
so via the new fd_rxfilter() interface.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 net.c |   43 ++++++++++++++++++++++++++++++++++++++++++-
 net.h |   14 ++++++++++++++
 2 files changed, 56 insertions(+), 1 deletions(-)

diff --git a/net.c b/net.c
index e7c097e..e68cb40 100644
--- a/net.c
+++ b/net.c
@@ -345,8 +345,11 @@ VLANClientState *qemu_new_vlan_client(VLANState *vlan,
 
     vc->next = NULL;
     pvc = &vlan->first_client;
-    while (*pvc != NULL)
+    while (*pvc != NULL) {
+        if ((*pvc)->vlan_client_added)
+            (*pvc)->vlan_client_added((*pvc)->opaque);
         pvc = &(*pvc)->next;
+    }
     *pvc = vc;
     return vc;
 }
@@ -364,6 +367,12 @@ void qemu_del_vlan_client(VLANClientState *vc)
             break;
         } else
             pvc = &(*pvc)->next;
+
+    /* Notify other clients after removal for dependencies on new topology */
+    pvc = &vc->vlan->first_client;
+    while (*pvc != NULL)
+        if ((*pvc)->vlan_client_removed)
+            (*pvc)->vlan_client_removed((*pvc)->opaque);
 }
 
 int qemu_can_send_packet(VLANClientState *vc1)
@@ -458,6 +467,38 @@ ssize_t qemu_sendv_packet(VLANClientState *vc1, const struct iovec *iov,
     return max_len;
 }
 
+/*
+ * This function allows a qemu vlan client to request filtering on
+ * it's receiving interface.  We only use this for the typical case
+ * of having exactly two clients on the vlan.  Clients making use
+ * of this function should also implement a vlan_client_added()
+ * function to be notified when they need to fall back to their own
+ * filtering or disable filtering setup for a single consumer.
+ *
+ * A return value of 1 indicates the entire requested filter set
+ * is handled by the provided filter.
+ */
+int qemu_vlan_rxfilter(VLANClientState *vc1, int flags, int count,
+                       uint8_t *buf)
+{
+    VLANState *vlan = vc1->vlan;
+    VLANClientState *vc;
+
+    if (vlan->first_client == NULL || vlan->first_client->next == NULL ||
+        vlan->first_client->next->next != NULL)
+        return 0;
+
+    if (vlan->first_client == vc1)
+        vc = vc1->next;
+    else
+        vc = vlan->first_client;
+
+    if (vc->fd_rxfilter)
+        return (vc->fd_rxfilter(vc->opaque, flags, count, buf) == count);
+
+    return 0;
+}
+
 #if defined(CONFIG_SLIRP)
 
 /* slirp network adapter */
diff --git a/net.h b/net.h
index 291807a..2090111 100644
--- a/net.h
+++ b/net.h
@@ -11,13 +11,22 @@ typedef struct VLANClientState VLANClientState;
 
 typedef void (LinkStatusChanged)(VLANClientState *);
 
+typedef int (IORXFilter)(void *, unsigned int , int , uint8_t *);
+
+typedef void (VLANClientAdded)(void *);
+
+typedef void (VLANClientRemoved)(void *);
+
 struct VLANClientState {
     IOReadHandler *fd_read;
     IOReadvHandler *fd_readv;
     /* Packets may still be sent if this returns zero.  It's used to
        rate-limit the slirp code.  */
     IOCanRWHandler *fd_can_read;
+    IORXFilter *fd_rxfilter;
     LinkStatusChanged *link_status_changed;
+    VLANClientAdded *vlan_client_added;
+    VLANClientRemoved *vlan_client_removed;
     int link_down;
     void *opaque;
     struct VLANClientState *next;
@@ -51,10 +60,15 @@ void qemu_check_nic_model(NICInfo *nd, const char *model);
 void qemu_check_nic_model_list(NICInfo *nd, const char * const *models,
                                const char *default_model);
 void qemu_handler_true(void *opaque);
+int qemu_vlan_rxfilter(VLANClientState *vc1, int flags, int count,
+                       uint8_t *buf);
 
 void do_info_network(void);
 int do_set_link(const char *name, const char *up_or_down);
 
+#define QEMU_NET_PROMISC     1
+#define QEMU_NET_ALLMULTI    2
+
 /* NIC info */
 
 #define MAX_NICS 8

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/4] qemu:net: Add TAP support for RX filtering on Linux
  2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
@ 2009-02-10 21:28   ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

The Linux tap driver provides an ioctl to set a TX filter.  Setting
this restricts the packets received onto the vlan.  We provide a
hotplug callback to clear the filter when a new device gets added.
The new rxfilter=off option can be used to disable exporting this
feature to backend drivers.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 net.c         |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 qemu-doc.texi |    6 ++++--
 vl.c          |    3 ++-
 3 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/net.c b/net.c
index e68cb40..973efea 100644
--- a/net.c
+++ b/net.c
@@ -724,6 +724,38 @@ static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
 }
 #endif
 
+#ifdef TUNSETTXFILTER
+static int tap_rxfilter(void *opaque, unsigned int flags, int count,
+                         uint8_t *list)
+{
+    TAPState *s = opaque;
+    struct tun_filter *filter;
+    int ret;
+
+    if (flags & QEMU_NET_PROMISC)
+        count = 0;
+
+    filter = qemu_mallocz(sizeof(*filter) + (count * ETH_ALEN));
+
+    memcpy(filter->addr, list, count * ETH_ALEN);
+    filter->count = count;
+
+    if (flags & QEMU_NET_ALLMULTI)
+        filter->flags |= TUN_FLT_ALLMULTI;
+  
+    ret = ioctl(s->fd, TUNSETTXFILTER, filter);
+
+    qemu_free(filter);
+
+    return ret;
+}
+
+static void tap_vlan_client_added(void *opaque)
+{
+    tap_rxfilter(opaque, 0, 0, NULL);
+}
+#endif
+
 static void tap_receive(void *opaque, const uint8_t *buf, int size)
 {
     TAPState *s = opaque;
@@ -762,7 +794,7 @@ static void tap_send(void *opaque)
 static TAPState *net_tap_fd_init(VLANState *vlan,
                                  const char *model,
                                  const char *name,
-                                 int fd)
+                                 int fd, int rxfilter)
 {
     TAPState *s;
 
@@ -772,6 +804,12 @@ static TAPState *net_tap_fd_init(VLANState *vlan,
 #ifdef HAVE_IOVEC
     s->vc->fd_readv = tap_receive_iov;
 #endif
+#ifdef TUNSETTXFILTER
+    if (rxfilter) {
+        s->vc->fd_rxfilter = tap_rxfilter;
+        s->vc->vlan_client_added = tap_vlan_client_added;
+    }
+#endif
     qemu_set_fd_handler(s->fd, tap_send, NULL, s);
     snprintf(s->vc->info_str, sizeof(s->vc->info_str), "fd=%d", fd);
     return s;
@@ -1005,7 +1043,8 @@ static int launch_script(const char *setup_script, const char *ifname, int fd)
 
 static int net_tap_init(VLANState *vlan, const char *model,
                         const char *name, const char *ifname1,
-                        const char *setup_script, const char *down_script)
+                        const char *setup_script, const char *down_script,
+                        int rxfilter)
 {
     TAPState *s;
     int fd;
@@ -1025,7 +1064,7 @@ static int net_tap_init(VLANState *vlan, const char *model,
 	if (launch_script(setup_script, ifname, fd))
 	    return -1;
     }
-    s = net_tap_fd_init(vlan, model, name, fd);
+    s = net_tap_fd_init(vlan, model, name, fd, rxfilter);
     if (!s)
         return -1;
     snprintf(s->vc->info_str, sizeof(s->vc->info_str),
@@ -1663,13 +1702,18 @@ int net_client_init(const char *device, const char *p)
     if (!strcmp(device, "tap")) {
         char ifname[64];
         char setup_script[1024], down_script[1024];
-        int fd;
+        int fd, rxfilter = 1;
         vlan->nb_host_devs++;
+
+        if (get_param_value(buf, sizeof(buf), "rxfilter", p) > 0)
+            if (!strcmp(buf, "off"))
+                rxfilter = 0;
+        
         if (get_param_value(buf, sizeof(buf), "fd", p) > 0) {
             fd = strtol(buf, NULL, 0);
             fcntl(fd, F_SETFL, O_NONBLOCK);
             ret = -1;
-            if (net_tap_fd_init(vlan, device, name, fd))
+            if (net_tap_fd_init(vlan, device, name, fd, rxfilter))
                 ret = 0;
         } else {
             if (get_param_value(ifname, sizeof(ifname), "ifname", p) <= 0) {
@@ -1681,7 +1725,7 @@ int net_client_init(const char *device, const char *p)
             if (get_param_value(down_script, sizeof(down_script), "downscript", p) == 0) {
                 pstrcpy(down_script, sizeof(down_script), DEFAULT_NETWORK_DOWN_SCRIPT);
             }
-            ret = net_tap_init(vlan, device, name, ifname, setup_script, down_script);
+            ret = net_tap_init(vlan, device, name, ifname, setup_script, down_script, rxfilter);
         }
     } else
 #endif
diff --git a/qemu-doc.texi b/qemu-doc.texi
index efb88d2..9a3957c 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -643,7 +643,7 @@ Use the user mode network stack which requires no administrator
 privilege to run.  @option{hostname=name} can be used to specify the client
 hostname reported by the builtin DHCP server.
 
-@item -net tap[,vlan=@var{n}][,name=@var{name}][,fd=@var{h}][,ifname=@var{name}][,script=@var{file}][,downscript=@var{dfile}]
+@item -net tap[,vlan=@var{n}][,name=@var{name}][,fd=@var{h}][,ifname=@var{name}][,script=@var{file}][,downscript=@var{dfile}][,rxfilter=on|off]
 Connect the host TAP network interface @var{name} to VLAN @var{n}, use
 the network script @var{file} to configure it and the network script 
 @var{dfile} to deconfigure it. If @var{name} is not provided, the OS 
@@ -651,7 +651,9 @@ automatically provides one. @option{fd}=@var{h} can be used to specify
 the handle of an already opened host TAP interface. The default network 
 configure script is @file{/etc/qemu-ifup} and the default network 
 deconfigure script is @file{/etc/qemu-ifdown}. Use @option{script=no} 
-or @option{downscript=no} to disable script execution. Example:
+or @option{downscript=no} to disable script execution. The rxfilter
+(Linux-only) option enables or disables availability of the tap device
+MAC filtering (default on). Example:
 
 @example
 qemu linux.img -net nic -net tap
diff --git a/vl.c b/vl.c
index aff2b2c..96ab696 100644
--- a/vl.c
+++ b/vl.c
@@ -3909,12 +3909,13 @@ static void help(int exitcode)
            "-net tap[,vlan=n][,name=str],ifname=name\n"
            "                connect the host TAP network interface to VLAN 'n'\n"
 #else
-           "-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile]\n"
+           "-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,rxfilter=on|off]\n"
            "                connect the host TAP network interface to VLAN 'n' and use the\n"
            "                network scripts 'file' (default=%s)\n"
            "                and 'dfile' (default=%s);\n"
            "                use '[down]script=no' to disable script execution;\n"
            "                use 'fd=h' to connect to an already opened TAP interface\n"
+           "                rxfilter enables|disables use of the tap MAC filter (default on)\n"
 #endif
            "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
            "                connect the vlan 'n' to another VLAN using a socket connection\n"


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 2/4] qemu:net: Add TAP support for RX filtering on Linux
@ 2009-02-10 21:28   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

The Linux tap driver provides an ioctl to set a TX filter.  Setting
this restricts the packets received onto the vlan.  We provide a
hotplug callback to clear the filter when a new device gets added.
The new rxfilter=off option can be used to disable exporting this
feature to backend drivers.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 net.c         |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 qemu-doc.texi |    6 ++++--
 vl.c          |    3 ++-
 3 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/net.c b/net.c
index e68cb40..973efea 100644
--- a/net.c
+++ b/net.c
@@ -724,6 +724,38 @@ static ssize_t tap_receive_iov(void *opaque, const struct iovec *iov,
 }
 #endif
 
+#ifdef TUNSETTXFILTER
+static int tap_rxfilter(void *opaque, unsigned int flags, int count,
+                         uint8_t *list)
+{
+    TAPState *s = opaque;
+    struct tun_filter *filter;
+    int ret;
+
+    if (flags & QEMU_NET_PROMISC)
+        count = 0;
+
+    filter = qemu_mallocz(sizeof(*filter) + (count * ETH_ALEN));
+
+    memcpy(filter->addr, list, count * ETH_ALEN);
+    filter->count = count;
+
+    if (flags & QEMU_NET_ALLMULTI)
+        filter->flags |= TUN_FLT_ALLMULTI;
+  
+    ret = ioctl(s->fd, TUNSETTXFILTER, filter);
+
+    qemu_free(filter);
+
+    return ret;
+}
+
+static void tap_vlan_client_added(void *opaque)
+{
+    tap_rxfilter(opaque, 0, 0, NULL);
+}
+#endif
+
 static void tap_receive(void *opaque, const uint8_t *buf, int size)
 {
     TAPState *s = opaque;
@@ -762,7 +794,7 @@ static void tap_send(void *opaque)
 static TAPState *net_tap_fd_init(VLANState *vlan,
                                  const char *model,
                                  const char *name,
-                                 int fd)
+                                 int fd, int rxfilter)
 {
     TAPState *s;
 
@@ -772,6 +804,12 @@ static TAPState *net_tap_fd_init(VLANState *vlan,
 #ifdef HAVE_IOVEC
     s->vc->fd_readv = tap_receive_iov;
 #endif
+#ifdef TUNSETTXFILTER
+    if (rxfilter) {
+        s->vc->fd_rxfilter = tap_rxfilter;
+        s->vc->vlan_client_added = tap_vlan_client_added;
+    }
+#endif
     qemu_set_fd_handler(s->fd, tap_send, NULL, s);
     snprintf(s->vc->info_str, sizeof(s->vc->info_str), "fd=%d", fd);
     return s;
@@ -1005,7 +1043,8 @@ static int launch_script(const char *setup_script, const char *ifname, int fd)
 
 static int net_tap_init(VLANState *vlan, const char *model,
                         const char *name, const char *ifname1,
-                        const char *setup_script, const char *down_script)
+                        const char *setup_script, const char *down_script,
+                        int rxfilter)
 {
     TAPState *s;
     int fd;
@@ -1025,7 +1064,7 @@ static int net_tap_init(VLANState *vlan, const char *model,
 	if (launch_script(setup_script, ifname, fd))
 	    return -1;
     }
-    s = net_tap_fd_init(vlan, model, name, fd);
+    s = net_tap_fd_init(vlan, model, name, fd, rxfilter);
     if (!s)
         return -1;
     snprintf(s->vc->info_str, sizeof(s->vc->info_str),
@@ -1663,13 +1702,18 @@ int net_client_init(const char *device, const char *p)
     if (!strcmp(device, "tap")) {
         char ifname[64];
         char setup_script[1024], down_script[1024];
-        int fd;
+        int fd, rxfilter = 1;
         vlan->nb_host_devs++;
+
+        if (get_param_value(buf, sizeof(buf), "rxfilter", p) > 0)
+            if (!strcmp(buf, "off"))
+                rxfilter = 0;
+        
         if (get_param_value(buf, sizeof(buf), "fd", p) > 0) {
             fd = strtol(buf, NULL, 0);
             fcntl(fd, F_SETFL, O_NONBLOCK);
             ret = -1;
-            if (net_tap_fd_init(vlan, device, name, fd))
+            if (net_tap_fd_init(vlan, device, name, fd, rxfilter))
                 ret = 0;
         } else {
             if (get_param_value(ifname, sizeof(ifname), "ifname", p) <= 0) {
@@ -1681,7 +1725,7 @@ int net_client_init(const char *device, const char *p)
             if (get_param_value(down_script, sizeof(down_script), "downscript", p) == 0) {
                 pstrcpy(down_script, sizeof(down_script), DEFAULT_NETWORK_DOWN_SCRIPT);
             }
-            ret = net_tap_init(vlan, device, name, ifname, setup_script, down_script);
+            ret = net_tap_init(vlan, device, name, ifname, setup_script, down_script, rxfilter);
         }
     } else
 #endif
diff --git a/qemu-doc.texi b/qemu-doc.texi
index efb88d2..9a3957c 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -643,7 +643,7 @@ Use the user mode network stack which requires no administrator
 privilege to run.  @option{hostname=name} can be used to specify the client
 hostname reported by the builtin DHCP server.
 
-@item -net tap[,vlan=@var{n}][,name=@var{name}][,fd=@var{h}][,ifname=@var{name}][,script=@var{file}][,downscript=@var{dfile}]
+@item -net tap[,vlan=@var{n}][,name=@var{name}][,fd=@var{h}][,ifname=@var{name}][,script=@var{file}][,downscript=@var{dfile}][,rxfilter=on|off]
 Connect the host TAP network interface @var{name} to VLAN @var{n}, use
 the network script @var{file} to configure it and the network script 
 @var{dfile} to deconfigure it. If @var{name} is not provided, the OS 
@@ -651,7 +651,9 @@ automatically provides one. @option{fd}=@var{h} can be used to specify
 the handle of an already opened host TAP interface. The default network 
 configure script is @file{/etc/qemu-ifup} and the default network 
 deconfigure script is @file{/etc/qemu-ifdown}. Use @option{script=no} 
-or @option{downscript=no} to disable script execution. Example:
+or @option{downscript=no} to disable script execution. The rxfilter
+(Linux-only) option enables or disables availability of the tap device
+MAC filtering (default on). Example:
 
 @example
 qemu linux.img -net nic -net tap
diff --git a/vl.c b/vl.c
index aff2b2c..96ab696 100644
--- a/vl.c
+++ b/vl.c
@@ -3909,12 +3909,13 @@ static void help(int exitcode)
            "-net tap[,vlan=n][,name=str],ifname=name\n"
            "                connect the host TAP network interface to VLAN 'n'\n"
 #else
-           "-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile]\n"
+           "-net tap[,vlan=n][,name=str][,fd=h][,ifname=name][,script=file][,downscript=dfile][,rxfilter=on|off]\n"
            "                connect the host TAP network interface to VLAN 'n' and use the\n"
            "                network scripts 'file' (default=%s)\n"
            "                and 'dfile' (default=%s);\n"
            "                use '[down]script=no' to disable script execution;\n"
            "                use 'fd=h' to connect to an already opened TAP interface\n"
+           "                rxfilter enables|disables use of the tap MAC filter (default on)\n"
 #endif
            "-net socket[,vlan=n][,name=str][,fd=h][,listen=[host]:port][,connect=host:port]\n"
            "                connect the vlan 'n' to another VLAN using a socket connection\n"

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
@ 2009-02-10 21:28   ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 hw/virtio-net.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 62153e9..3e4f526 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -35,6 +35,7 @@ typedef struct VirtIONet
     int mergeable_rx_bufs;
     int promisc;
     int allmulti;
+    int vlan_rxfilter;
     struct {
         int in_use;
         uint8_t *macs;
@@ -51,6 +52,43 @@ static VirtIONet *to_virtio_net(VirtIODevice *vdev)
     return (VirtIONet *)vdev;
 }
 
+static void virtio_net_set_vlan_rxfilter(VirtIONet *n)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint8_t *buf;
+    int flags = 0;
+
+    if (n->promisc)
+        flags |= QEMU_NET_PROMISC;
+    if (n->allmulti)
+        flags |= QEMU_NET_ALLMULTI;
+
+    buf = qemu_mallocz((n->mac_table.in_use + 2) * ETH_ALEN);
+
+    memcpy(&buf[ETH_ALEN*0], n->mac, ETH_ALEN);
+    memcpy(&buf[ETH_ALEN*1], n->mac_table.macs, n->mac_table.in_use * ETH_ALEN);
+    /* broadcast is a multicast addr, so add it at the end */
+    memcpy(&buf[ETH_ALEN*(n->mac_table.in_use + 1)], bcast, ETH_ALEN);
+
+    n->vlan_rxfilter = qemu_vlan_rxfilter(n->vc, flags,
+                                          n->mac_table.in_use + 2, buf);
+    qemu_free(buf);
+}
+
+static void virtio_net_vlan_client_added(void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    n->vlan_rxfilter = 0;
+}
+
+static void virtio_net_vlan_client_removed(void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    virtio_net_set_vlan_rxfilter(n);
+}
+
 static void virtio_net_get_config(VirtIODevice *vdev, uint8_t *config)
 {
     VirtIONet *n = to_virtio_net(vdev);
@@ -70,6 +108,7 @@ static void virtio_net_set_config(VirtIODevice *vdev, const uint8_t *config)
 
     if (memcmp(netcfg.mac, n->mac, ETH_ALEN)) {
         memcpy(n->mac, netcfg.mac, ETH_ALEN);
+        virtio_net_set_vlan_rxfilter(n);
         qemu_format_nic_info_str(n->vc, n->mac);
     }
 }
@@ -99,6 +138,7 @@ static void virtio_net_reset(VirtIODevice *vdev)
     /* Flush any MAC and VLAN filter table state */
     n->mac_table.in_use = 0;
     memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
+    virtio_net_set_vlan_rxfilter(n);
     memset(n->vlans, 0, MAX_VLAN >> 3);
 }
 
@@ -247,6 +287,10 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 
         virtqueue_push(vq, &elem, sizeof(status));
         virtio_notify(vdev, vq);
+
+        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE ||
+            ctrl.class == VIRTIO_NET_CTRL_MAC)
+            virtio_net_set_vlan_rxfilter(n);
     }
 }
 
@@ -334,15 +378,19 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
             return 0;
     }
 
-    if ((ptr[0] & 1) && n->allmulti)
+    /* If hw filtering is complete, no need to continue */
+    if (n->vlan_rxfilter)
         return 1;
 
-    if (!memcmp(ptr, bcast, sizeof(bcast)))
+    if ((ptr[0] & 1) && n->allmulti)
         return 1;
 
     if (!memcmp(ptr, n->mac, ETH_ALEN))
         return 1;
 
+    if (!memcmp(ptr, bcast, sizeof(bcast)))
+        return 1;
+
     for (i = 0; i < n->mac_table.in_use; i++) {
         if (!memcmp(ptr, &n->mac_table.macs[i * ETH_ALEN], ETH_ALEN))
             return 1;
@@ -552,6 +600,8 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     if (version_id >= 6)
         qemu_get_buffer(f, (uint8_t *)n->vlans, MAX_VLAN >> 3);
 
+    virtio_net_set_vlan_rxfilter(n);
+
     if (n->tx_timer_active) {
         qemu_mod_timer(n->tx_timer,
                        qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
@@ -589,6 +639,8 @@ void virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
     n->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  virtio_net_receive, virtio_net_can_receive, n);
     n->vc->link_status_changed = virtio_net_set_link_status;
+    n->vc->vlan_client_added = virtio_net_vlan_client_added;
+    n->vc->vlan_client_removed = virtio_net_vlan_client_removed;
 
     qemu_format_nic_info_str(n->vc, n->mac);
 


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-10 21:28   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 hw/virtio-net.c |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 62153e9..3e4f526 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -35,6 +35,7 @@ typedef struct VirtIONet
     int mergeable_rx_bufs;
     int promisc;
     int allmulti;
+    int vlan_rxfilter;
     struct {
         int in_use;
         uint8_t *macs;
@@ -51,6 +52,43 @@ static VirtIONet *to_virtio_net(VirtIODevice *vdev)
     return (VirtIONet *)vdev;
 }
 
+static void virtio_net_set_vlan_rxfilter(VirtIONet *n)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint8_t *buf;
+    int flags = 0;
+
+    if (n->promisc)
+        flags |= QEMU_NET_PROMISC;
+    if (n->allmulti)
+        flags |= QEMU_NET_ALLMULTI;
+
+    buf = qemu_mallocz((n->mac_table.in_use + 2) * ETH_ALEN);
+
+    memcpy(&buf[ETH_ALEN*0], n->mac, ETH_ALEN);
+    memcpy(&buf[ETH_ALEN*1], n->mac_table.macs, n->mac_table.in_use * ETH_ALEN);
+    /* broadcast is a multicast addr, so add it at the end */
+    memcpy(&buf[ETH_ALEN*(n->mac_table.in_use + 1)], bcast, ETH_ALEN);
+
+    n->vlan_rxfilter = qemu_vlan_rxfilter(n->vc, flags,
+                                          n->mac_table.in_use + 2, buf);
+    qemu_free(buf);
+}
+
+static void virtio_net_vlan_client_added(void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    n->vlan_rxfilter = 0;
+}
+
+static void virtio_net_vlan_client_removed(void *opaque)
+{
+    VirtIONet *n = opaque;
+
+    virtio_net_set_vlan_rxfilter(n);
+}
+
 static void virtio_net_get_config(VirtIODevice *vdev, uint8_t *config)
 {
     VirtIONet *n = to_virtio_net(vdev);
@@ -70,6 +108,7 @@ static void virtio_net_set_config(VirtIODevice *vdev, const uint8_t *config)
 
     if (memcmp(netcfg.mac, n->mac, ETH_ALEN)) {
         memcpy(n->mac, netcfg.mac, ETH_ALEN);
+        virtio_net_set_vlan_rxfilter(n);
         qemu_format_nic_info_str(n->vc, n->mac);
     }
 }
@@ -99,6 +138,7 @@ static void virtio_net_reset(VirtIODevice *vdev)
     /* Flush any MAC and VLAN filter table state */
     n->mac_table.in_use = 0;
     memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
+    virtio_net_set_vlan_rxfilter(n);
     memset(n->vlans, 0, MAX_VLAN >> 3);
 }
 
@@ -247,6 +287,10 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 
         virtqueue_push(vq, &elem, sizeof(status));
         virtio_notify(vdev, vq);
+
+        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE ||
+            ctrl.class == VIRTIO_NET_CTRL_MAC)
+            virtio_net_set_vlan_rxfilter(n);
     }
 }
 
@@ -334,15 +378,19 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
             return 0;
     }
 
-    if ((ptr[0] & 1) && n->allmulti)
+    /* If hw filtering is complete, no need to continue */
+    if (n->vlan_rxfilter)
         return 1;
 
-    if (!memcmp(ptr, bcast, sizeof(bcast)))
+    if ((ptr[0] & 1) && n->allmulti)
         return 1;
 
     if (!memcmp(ptr, n->mac, ETH_ALEN))
         return 1;
 
+    if (!memcmp(ptr, bcast, sizeof(bcast)))
+        return 1;
+
     for (i = 0; i < n->mac_table.in_use; i++) {
         if (!memcmp(ptr, &n->mac_table.macs[i * ETH_ALEN], ETH_ALEN))
             return 1;
@@ -552,6 +600,8 @@ static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
     if (version_id >= 6)
         qemu_get_buffer(f, (uint8_t *)n->vlans, MAX_VLAN >> 3);
 
+    virtio_net_set_vlan_rxfilter(n);
+
     if (n->tx_timer_active) {
         qemu_mod_timer(n->tx_timer,
                        qemu_get_clock(vm_clock) + TX_TIMER_INTERVAL);
@@ -589,6 +639,8 @@ void virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn)
     n->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  virtio_net_receive, virtio_net_can_receive, n);
     n->vc->link_status_changed = virtio_net_set_link_status;
+    n->vc->vlan_client_added = virtio_net_vlan_client_added;
+    n->vc->vlan_client_removed = virtio_net_vlan_client_removed;
 
     qemu_format_nic_info_str(n->vc, n->mac);
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
  2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
@ 2009-02-10 21:29   ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 hw/e1000.c |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 6f841d6..7855a91 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -85,6 +85,7 @@ typedef struct E1000State_st {
     uint32_t rxbuf_size;
     uint32_t rxbuf_min_shift;
     int check_rxov;
+    int vlan_rxfilter;
     struct e1000_tx {
         unsigned char header[256];
         unsigned char vlan_header[4];
@@ -143,6 +144,53 @@ static const char phy_regcap[0x20] = {
     [PHY_ID2] = PHY_R,		[M88E1000_PHY_SPEC_STATUS] = PHY_R
 };
 
+static void e1000_set_vlan_rxfilter(E1000State *s)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint32_t rctl = s->mac_reg[RCTL], ra[2], *rp;
+    uint8_t *buf;
+    int i, flags = 0;
+
+    if (rctl & E1000_RCTL_UPE)
+        flags |= QEMU_NET_PROMISC;
+    if (rctl & E1000_RCTL_MPE)
+        flags |= QEMU_NET_ALLMULTI;
+
+    /* Allocate for maximum size, 16 + bcast */
+    buf = qemu_mallocz(17 * 6);
+
+    for (i = 0, rp = s->mac_reg + RA; rp < s->mac_reg + RA + 32; rp += 2) {
+        if (!(rp[1] & E1000_RAH_AV))
+            continue;
+        ra[0] = cpu_to_le32(rp[0]);
+        ra[1] = cpu_to_le32(rp[1]);
+        memcpy(&buf[i * 6], (uint8_t *)ra, 6);
+        i++;
+    }
+
+    if (rctl & E1000_RCTL_BAM) {
+        memcpy(&buf[i * 6], bcast, 6);
+        i++;
+    }
+
+    s->vlan_rxfilter = qemu_vlan_rxfilter(s->vc, flags, i, buf);
+    qemu_free(buf);
+}
+
+static void e1000_vlan_client_added(void *opaque)
+{
+    E1000State *s = opaque;
+
+    s->vlan_rxfilter = 0;
+}
+
+static void e1000_vlan_client_removed(void *opaque)
+{
+    E1000State *s = opaque;
+
+    e1000_set_vlan_rxfilter(s);
+}
+
 static void
 ioport_map(PCIDevice *pci_dev, int region_num, uint32_t addr,
            uint32_t size, int type)
@@ -198,6 +246,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
     s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
     DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
            s->mac_reg[RCTL]);
+    e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -532,6 +581,9 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
             return 0;
     }
 
+    if (s->vlan_rxfilter)
+        return 1;
+
     if (rctl & E1000_RCTL_UPE)			// promiscuous
         return 1;
 
@@ -712,6 +764,9 @@ static void
 mac_writereg(E1000State *s, int index, uint32_t val)
 {
     s->mac_reg[index] = val;
+
+    if (index >= E1000_RA && index < E1000_RA + 32)
+        e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -964,6 +1019,9 @@ nic_load(QEMUFile *f, void *opaque, int version_id)
         for (j = 0; j < mac_regarraystosave[i].size; j++)
             qemu_get_be32s(f,
                            s->mac_reg + mac_regarraystosave[i].array0 + j);
+
+    e1000_set_vlan_rxfilter(s);
+
     return 0;
 }
 
@@ -1088,6 +1146,8 @@ pci_e1000_init(PCIBus *bus, NICInfo *nd, int devfn)
     d->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  e1000_receive, e1000_can_receive, d);
     d->vc->link_status_changed = e1000_set_link_status;
+    d->vc->vlan_client_added = e1000_vlan_client_added;
+    d->vc->vlan_client_removed = e1000_vlan_client_removed;
 
     qemu_format_nic_info_str(d->vc, d->nd->macaddr);
 


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
@ 2009-02-10 21:29   ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-10 21:29 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

 hw/e1000.c |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 6f841d6..7855a91 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -85,6 +85,7 @@ typedef struct E1000State_st {
     uint32_t rxbuf_size;
     uint32_t rxbuf_min_shift;
     int check_rxov;
+    int vlan_rxfilter;
     struct e1000_tx {
         unsigned char header[256];
         unsigned char vlan_header[4];
@@ -143,6 +144,53 @@ static const char phy_regcap[0x20] = {
     [PHY_ID2] = PHY_R,		[M88E1000_PHY_SPEC_STATUS] = PHY_R
 };
 
+static void e1000_set_vlan_rxfilter(E1000State *s)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint32_t rctl = s->mac_reg[RCTL], ra[2], *rp;
+    uint8_t *buf;
+    int i, flags = 0;
+
+    if (rctl & E1000_RCTL_UPE)
+        flags |= QEMU_NET_PROMISC;
+    if (rctl & E1000_RCTL_MPE)
+        flags |= QEMU_NET_ALLMULTI;
+
+    /* Allocate for maximum size, 16 + bcast */
+    buf = qemu_mallocz(17 * 6);
+
+    for (i = 0, rp = s->mac_reg + RA; rp < s->mac_reg + RA + 32; rp += 2) {
+        if (!(rp[1] & E1000_RAH_AV))
+            continue;
+        ra[0] = cpu_to_le32(rp[0]);
+        ra[1] = cpu_to_le32(rp[1]);
+        memcpy(&buf[i * 6], (uint8_t *)ra, 6);
+        i++;
+    }
+
+    if (rctl & E1000_RCTL_BAM) {
+        memcpy(&buf[i * 6], bcast, 6);
+        i++;
+    }
+
+    s->vlan_rxfilter = qemu_vlan_rxfilter(s->vc, flags, i, buf);
+    qemu_free(buf);
+}
+
+static void e1000_vlan_client_added(void *opaque)
+{
+    E1000State *s = opaque;
+
+    s->vlan_rxfilter = 0;
+}
+
+static void e1000_vlan_client_removed(void *opaque)
+{
+    E1000State *s = opaque;
+
+    e1000_set_vlan_rxfilter(s);
+}
+
 static void
 ioport_map(PCIDevice *pci_dev, int region_num, uint32_t addr,
            uint32_t size, int type)
@@ -198,6 +246,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
     s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
     DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
            s->mac_reg[RCTL]);
+    e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -532,6 +581,9 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
             return 0;
     }
 
+    if (s->vlan_rxfilter)
+        return 1;
+
     if (rctl & E1000_RCTL_UPE)			// promiscuous
         return 1;
 
@@ -712,6 +764,9 @@ static void
 mac_writereg(E1000State *s, int index, uint32_t val)
 {
     s->mac_reg[index] = val;
+
+    if (index >= E1000_RA && index < E1000_RA + 32)
+        e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -964,6 +1019,9 @@ nic_load(QEMUFile *f, void *opaque, int version_id)
         for (j = 0; j < mac_regarraystosave[i].size; j++)
             qemu_get_be32s(f,
                            s->mac_reg + mac_regarraystosave[i].array0 + j);
+
+    e1000_set_vlan_rxfilter(s);
+
     return 0;
 }
 
@@ -1088,6 +1146,8 @@ pci_e1000_init(PCIBus *bus, NICInfo *nd, int devfn)
     d->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  e1000_receive, e1000_can_receive, d);
     d->vc->link_status_changed = e1000_set_link_status;
+    d->vc->vlan_client_added = e1000_vlan_client_added;
+    d->vc->vlan_client_removed = e1000_vlan_client_removed;
 
     qemu_format_nic_info_str(d->vc, d->nd->macaddr);
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
  2009-02-10 21:29   ` [Qemu-devel] " Alex Williamson
@ 2009-02-11 15:11     ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm

On Tue, 2009-02-10 at 14:29 -0700, Alex Williamson wrote:
> Make use of qemu_vlan_rxfilter so that we can filter at a lower
> level.  We implement callbacks for devices being added and removed
> so that we can fall back to our own filtering or make another attempt
> to push filtering off to someone else.

Hmm, I've thought of a bug in this patch, I'm not accounting for the
multicast hash e1000 can use for imperfect filtering.  I'll need to
rework this one, maybe skip the tap filter if the hash is in use.  This
is independent of the other patches though.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
@ 2009-02-11 15:11     ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 15:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm

On Tue, 2009-02-10 at 14:29 -0700, Alex Williamson wrote:
> Make use of qemu_vlan_rxfilter so that we can filter at a lower
> level.  We implement callbacks for devices being added and removed
> so that we can fall back to our own filtering or make another attempt
> to push filtering off to someone else.

Hmm, I've thought of a bug in this patch, I'm not accounting for the
multicast hash e1000 can use for imperfect filtering.  I'll need to
rework this one, maybe skip the tap filter if the hash is in use.  This
is independent of the other patches though.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
  2009-02-10 21:29   ` [Qemu-devel] " Alex Williamson
@ 2009-02-11 17:11     ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 17:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, alex.williamson

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

Updated to address e1000 multicast hash table.

 hw/e1000.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 6f841d6..e3e5fa5 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -85,6 +85,7 @@ typedef struct E1000State_st {
     uint32_t rxbuf_size;
     uint32_t rxbuf_min_shift;
     int check_rxov;
+    int vlan_rxfilter;
     struct e1000_tx {
         unsigned char header[256];
         unsigned char vlan_header[4];
@@ -143,6 +144,66 @@ static const char phy_regcap[0x20] = {
     [PHY_ID2] = PHY_R,		[M88E1000_PHY_SPEC_STATUS] = PHY_R
 };
 
+static void e1000_set_vlan_rxfilter(E1000State *s)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint32_t rctl = s->mac_reg[RCTL], ra[2], *rp;
+    uint8_t *buf;
+    int i, flags = 0;
+
+    /*
+     * If the guest has encoded anything in the multicast hash table,
+     * we're stuck doing the filtering ourselves.  This should only
+     * happen if the guest has already consumed the RAR table.
+     */
+    for (i = 0; i < 128; i++) {
+        if (s->mac_reg[MTA + i]) {
+            qemu_vlan_rxfilter(s->vc, QEMU_NET_PROMISC, 0, NULL);
+            s->vlan_rxfilter = 0;
+            return;
+        }
+    }
+
+    if (rctl & E1000_RCTL_UPE)
+        flags |= QEMU_NET_PROMISC;
+    if (rctl & E1000_RCTL_MPE)
+        flags |= QEMU_NET_ALLMULTI;
+
+    /* Allocate for maximum size, 16 + bcast */
+    buf = qemu_mallocz(17 * 6);
+
+    for (i = 0, rp = s->mac_reg + RA; rp < s->mac_reg + RA + 32; rp += 2) {
+        if (!(rp[1] & E1000_RAH_AV))
+            continue;
+        ra[0] = cpu_to_le32(rp[0]);
+        ra[1] = cpu_to_le32(rp[1]);
+        memcpy(&buf[i * 6], (uint8_t *)ra, 6);
+        i++;
+    }
+
+    if (rctl & E1000_RCTL_BAM) {
+        memcpy(&buf[i * 6], bcast, 6);
+        i++;
+    }
+
+    s->vlan_rxfilter = qemu_vlan_rxfilter(s->vc, flags, i, buf);
+    qemu_free(buf);
+}
+
+static void e1000_vlan_client_added(void *opaque)
+{
+    E1000State *s = opaque;
+
+    s->vlan_rxfilter = 0;
+}
+
+static void e1000_vlan_client_removed(void *opaque)
+{
+    E1000State *s = opaque;
+
+    e1000_set_vlan_rxfilter(s);
+}
+
 static void
 ioport_map(PCIDevice *pci_dev, int region_num, uint32_t addr,
            uint32_t size, int type)
@@ -198,6 +259,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
     s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
     DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
            s->mac_reg[RCTL]);
+    e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -532,6 +594,9 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
             return 0;
     }
 
+    if (s->vlan_rxfilter)
+        return 1;
+
     if (rctl & E1000_RCTL_UPE)			// promiscuous
         return 1;
 
@@ -715,6 +780,13 @@ mac_writereg(E1000State *s, int index, uint32_t val)
 }
 
 static void
+mac_writereg_rx(E1000State *s, int index, uint32_t val)
+{
+    s->mac_reg[index] = val;
+    e1000_set_vlan_rxfilter(s);
+}
+
+static void
 set_rdt(E1000State *s, int index, uint32_t val)
 {
     s->check_rxov = 0;
@@ -790,8 +862,8 @@ static void (*macreg_writeops[])(E1000State *, int, uint32_t) = {
     [TDH] = set_16bit,	[RDH] = set_16bit,	[RDT] = set_rdt,
     [IMC] = set_imc,	[IMS] = set_ims,	[ICR] = set_icr,
     [EECD] = set_eecd,	[RCTL] = set_rx_control,
-    [RA ... RA+31] = &mac_writereg,
-    [MTA ... MTA+127] = &mac_writereg,
+    [RA ... RA+31] = &mac_writereg_rx,
+    [MTA ... MTA+127] = &mac_writereg_rx,
     [VFTA ... VFTA+127] = &mac_writereg,
 };
 enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
@@ -964,6 +1036,9 @@ nic_load(QEMUFile *f, void *opaque, int version_id)
         for (j = 0; j < mac_regarraystosave[i].size; j++)
             qemu_get_be32s(f,
                            s->mac_reg + mac_regarraystosave[i].array0 + j);
+
+    e1000_set_vlan_rxfilter(s);
+
     return 0;
 }
 
@@ -1088,6 +1163,8 @@ pci_e1000_init(PCIBus *bus, NICInfo *nd, int devfn)
     d->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  e1000_receive, e1000_can_receive, d);
     d->vc->link_status_changed = e1000_set_link_status;
+    d->vc->vlan_client_added = e1000_vlan_client_added;
+    d->vc->vlan_client_removed = e1000_vlan_client_removed;
 
     qemu_format_nic_info_str(d->vc, d->nd->macaddr);
 


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [Qemu-devel] [PATCH 4/4] qemu:e1000: Add support for qemu_vlan_rxfilter
@ 2009-02-11 17:11     ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 17:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, kvm

Make use of qemu_vlan_rxfilter so that we can filter at a lower
level.  We implement callbacks for devices being added and removed
so that we can fall back to our own filtering or make another attempt
to push filtering off to someone else.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

Updated to address e1000 multicast hash table.

 hw/e1000.c |   81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index 6f841d6..e3e5fa5 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -85,6 +85,7 @@ typedef struct E1000State_st {
     uint32_t rxbuf_size;
     uint32_t rxbuf_min_shift;
     int check_rxov;
+    int vlan_rxfilter;
     struct e1000_tx {
         unsigned char header[256];
         unsigned char vlan_header[4];
@@ -143,6 +144,66 @@ static const char phy_regcap[0x20] = {
     [PHY_ID2] = PHY_R,		[M88E1000_PHY_SPEC_STATUS] = PHY_R
 };
 
+static void e1000_set_vlan_rxfilter(E1000State *s)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint32_t rctl = s->mac_reg[RCTL], ra[2], *rp;
+    uint8_t *buf;
+    int i, flags = 0;
+
+    /*
+     * If the guest has encoded anything in the multicast hash table,
+     * we're stuck doing the filtering ourselves.  This should only
+     * happen if the guest has already consumed the RAR table.
+     */
+    for (i = 0; i < 128; i++) {
+        if (s->mac_reg[MTA + i]) {
+            qemu_vlan_rxfilter(s->vc, QEMU_NET_PROMISC, 0, NULL);
+            s->vlan_rxfilter = 0;
+            return;
+        }
+    }
+
+    if (rctl & E1000_RCTL_UPE)
+        flags |= QEMU_NET_PROMISC;
+    if (rctl & E1000_RCTL_MPE)
+        flags |= QEMU_NET_ALLMULTI;
+
+    /* Allocate for maximum size, 16 + bcast */
+    buf = qemu_mallocz(17 * 6);
+
+    for (i = 0, rp = s->mac_reg + RA; rp < s->mac_reg + RA + 32; rp += 2) {
+        if (!(rp[1] & E1000_RAH_AV))
+            continue;
+        ra[0] = cpu_to_le32(rp[0]);
+        ra[1] = cpu_to_le32(rp[1]);
+        memcpy(&buf[i * 6], (uint8_t *)ra, 6);
+        i++;
+    }
+
+    if (rctl & E1000_RCTL_BAM) {
+        memcpy(&buf[i * 6], bcast, 6);
+        i++;
+    }
+
+    s->vlan_rxfilter = qemu_vlan_rxfilter(s->vc, flags, i, buf);
+    qemu_free(buf);
+}
+
+static void e1000_vlan_client_added(void *opaque)
+{
+    E1000State *s = opaque;
+
+    s->vlan_rxfilter = 0;
+}
+
+static void e1000_vlan_client_removed(void *opaque)
+{
+    E1000State *s = opaque;
+
+    e1000_set_vlan_rxfilter(s);
+}
+
 static void
 ioport_map(PCIDevice *pci_dev, int region_num, uint32_t addr,
            uint32_t size, int type)
@@ -198,6 +259,7 @@ set_rx_control(E1000State *s, int index, uint32_t val)
     s->rxbuf_min_shift = ((val / E1000_RCTL_RDMTS_QUAT) & 3) + 1;
     DBGOUT(RX, "RCTL: %d, mac_reg[RCTL] = 0x%x\n", s->mac_reg[RDT],
            s->mac_reg[RCTL]);
+    e1000_set_vlan_rxfilter(s);
 }
 
 static void
@@ -532,6 +594,9 @@ receive_filter(E1000State *s, const uint8_t *buf, int size)
             return 0;
     }
 
+    if (s->vlan_rxfilter)
+        return 1;
+
     if (rctl & E1000_RCTL_UPE)			// promiscuous
         return 1;
 
@@ -715,6 +780,13 @@ mac_writereg(E1000State *s, int index, uint32_t val)
 }
 
 static void
+mac_writereg_rx(E1000State *s, int index, uint32_t val)
+{
+    s->mac_reg[index] = val;
+    e1000_set_vlan_rxfilter(s);
+}
+
+static void
 set_rdt(E1000State *s, int index, uint32_t val)
 {
     s->check_rxov = 0;
@@ -790,8 +862,8 @@ static void (*macreg_writeops[])(E1000State *, int, uint32_t) = {
     [TDH] = set_16bit,	[RDH] = set_16bit,	[RDT] = set_rdt,
     [IMC] = set_imc,	[IMS] = set_ims,	[ICR] = set_icr,
     [EECD] = set_eecd,	[RCTL] = set_rx_control,
-    [RA ... RA+31] = &mac_writereg,
-    [MTA ... MTA+127] = &mac_writereg,
+    [RA ... RA+31] = &mac_writereg_rx,
+    [MTA ... MTA+127] = &mac_writereg_rx,
     [VFTA ... VFTA+127] = &mac_writereg,
 };
 enum { NWRITEOPS = ARRAY_SIZE(macreg_writeops) };
@@ -964,6 +1036,9 @@ nic_load(QEMUFile *f, void *opaque, int version_id)
         for (j = 0; j < mac_regarraystosave[i].size; j++)
             qemu_get_be32s(f,
                            s->mac_reg + mac_regarraystosave[i].array0 + j);
+
+    e1000_set_vlan_rxfilter(s);
+
     return 0;
 }
 
@@ -1088,6 +1163,8 @@ pci_e1000_init(PCIBus *bus, NICInfo *nd, int devfn)
     d->vc = qemu_new_vlan_client(nd->vlan, nd->model, nd->name,
                                  e1000_receive, e1000_can_receive, d);
     d->vc->link_status_changed = e1000_set_link_status;
+    d->vc->vlan_client_added = e1000_vlan_client_added;
+    d->vc->vlan_client_removed = e1000_vlan_client_removed;
 
     qemu_format_nic_info_str(d->vc, d->nd->macaddr);
 

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
@ 2009-02-11 19:31   ` Mark McLoughlin
  -1 siblings, 0 replies; 43+ messages in thread
From: Mark McLoughlin @ 2009-02-11 19:31 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm

Hi Alex,

Just had a quick looked over these and they seem pretty good, but some
broad comments:

  - The logic around "is this a NIC+TAP pair?" would be good to have a 
    better API around. We need this to merge virtio GSO support too. 
    Anthony had some ideas here.

  - I think you could keep the client_added()/removed() logic in net.c 
    and things would be a lot cleaner. I think you just want to trigger 
    a reload of the filter, right? So a "reload this filter" callback 
    to qemu_vlan_rxfilter() might do it.

  - What do we need rxfilter=on|off on the command line for?

Cheers,
Mark.




^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-11 19:31   ` Mark McLoughlin
  0 siblings, 0 replies; 43+ messages in thread
From: Mark McLoughlin @ 2009-02-11 19:31 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm

Hi Alex,

Just had a quick looked over these and they seem pretty good, but some
broad comments:

  - The logic around "is this a NIC+TAP pair?" would be good to have a 
    better API around. We need this to merge virtio GSO support too. 
    Anthony had some ideas here.

  - I think you could keep the client_added()/removed() logic in net.c 
    and things would be a lot cleaner. I think you just want to trigger 
    a reload of the filter, right? So a "reload this filter" callback 
    to qemu_vlan_rxfilter() might do it.

  - What do we need rxfilter=on|off on the command line for?

Cheers,
Mark.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-11 19:31   ` [Qemu-devel] " Mark McLoughlin
@ 2009-02-11 19:43     ` Anthony Liguori
  -1 siblings, 0 replies; 43+ messages in thread
From: Anthony Liguori @ 2009-02-11 19:43 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: Alex Williamson, qemu-devel, kvm

Mark McLoughlin wrote:
> Hi Alex,
>
> Just had a quick looked over these and they seem pretty good, but some
> broad comments:
>
>   - The logic around "is this a NIC+TAP pair?" would be good to have a 
>     better API around. We need this to merge virtio GSO support too. 
>     Anthony had some ideas here.
>   

Right now, the networking API is based on VLANs and VLAN clients.  This 
is fine for a simple interface of sending/receiving packets one at a time.

When you want to support special features (like filtering, GSO) or want 
to supply many receive buffers, this falls apart because the VLAN client 
needs to consider what features are supported by every other VLAN client 
and also deal with things like hot add/remove from a VLAN.

A better architecture IMHO would be to have a NIC backend/frontend 
architecture.  This provides a rather simple interface to support 
feature negotiation and optimizes the interface for the common case (one 
VLAN client per VLAN).

You would then implement the VLAN layer as a NIC backend.  The 
user-visible interface wouldn't change at all.  You could even go as far 
as to special case the circumstance where you had a VLAN containing one 
host device and one guest device and simply avoid using the VLAN at all.

Regards,

Anthony Liguori

>   - I think you could keep the client_added()/removed() logic in net.c 
>     and things would be a lot cleaner. I think you just want to trigger 
>     a reload of the filter, right? So a "reload this filter" callback 
>     to qemu_vlan_rxfilter() might do it.
>
>   - What do we need rxfilter=on|off on the command line for?
>
> Cheers,
> Mark.
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>   


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-11 19:43     ` Anthony Liguori
  0 siblings, 0 replies; 43+ messages in thread
From: Anthony Liguori @ 2009-02-11 19:43 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: qemu-devel, Alex Williamson, kvm

Mark McLoughlin wrote:
> Hi Alex,
>
> Just had a quick looked over these and they seem pretty good, but some
> broad comments:
>
>   - The logic around "is this a NIC+TAP pair?" would be good to have a 
>     better API around. We need this to merge virtio GSO support too. 
>     Anthony had some ideas here.
>   

Right now, the networking API is based on VLANs and VLAN clients.  This 
is fine for a simple interface of sending/receiving packets one at a time.

When you want to support special features (like filtering, GSO) or want 
to supply many receive buffers, this falls apart because the VLAN client 
needs to consider what features are supported by every other VLAN client 
and also deal with things like hot add/remove from a VLAN.

A better architecture IMHO would be to have a NIC backend/frontend 
architecture.  This provides a rather simple interface to support 
feature negotiation and optimizes the interface for the common case (one 
VLAN client per VLAN).

You would then implement the VLAN layer as a NIC backend.  The 
user-visible interface wouldn't change at all.  You could even go as far 
as to special case the circumstance where you had a VLAN containing one 
host device and one guest device and simply avoid using the VLAN at all.

Regards,

Anthony Liguori

>   - I think you could keep the client_added()/removed() logic in net.c 
>     and things would be a lot cleaner. I think you just want to trigger 
>     a reload of the filter, right? So a "reload this filter" callback 
>     to qemu_vlan_rxfilter() might do it.
>
>   - What do we need rxfilter=on|off on the command line for?
>
> Cheers,
> Mark.
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>   

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-11 19:31   ` [Qemu-devel] " Mark McLoughlin
@ 2009-02-11 19:51     ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 19:51 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: qemu-devel, kvm

Hi Mark,

Thanks for the comments.

On Wed, 2009-02-11 at 19:31 +0000, Mark McLoughlin wrote:
> 
>   - The logic around "is this a NIC+TAP pair?" would be good to have a 
>     better API around. We need this to merge virtio GSO support too. 
>     Anthony had some ideas here.

Ok, I'll see if I can dig that up.

>   - I think you could keep the client_added()/removed() logic in net.c 
>     and things would be a lot cleaner. I think you just want to trigger 
>     a reload of the filter, right? So a "reload this filter" callback 
>     to qemu_vlan_rxfilter() might do it.

In this series, the vlan doesn't maintain state for the filter, so a
reload requires interaction of the NIC backend.  Anthony had requested a
common software filter that would make something like you're suggesting
easier, but I'm still wrestling with the nuances of how that might work.
Things like the e1000 multicast hash throw a kink in the plan, which
left me with each NIC needing to re-enable it's own filtering when
another client is added.

>   - What do we need rxfilter=on|off on the command line for?

Primarily because the current tun driver in Linux has a bug that it can
drop unicast packets requested to be included in the filter if it
overflows the exact match table.

http://www.spinics.net/lists/netdev/msg88451.html

Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-11 19:51     ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 19:51 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: qemu-devel, kvm

Hi Mark,

Thanks for the comments.

On Wed, 2009-02-11 at 19:31 +0000, Mark McLoughlin wrote:
> 
>   - The logic around "is this a NIC+TAP pair?" would be good to have a 
>     better API around. We need this to merge virtio GSO support too. 
>     Anthony had some ideas here.

Ok, I'll see if I can dig that up.

>   - I think you could keep the client_added()/removed() logic in net.c 
>     and things would be a lot cleaner. I think you just want to trigger 
>     a reload of the filter, right? So a "reload this filter" callback 
>     to qemu_vlan_rxfilter() might do it.

In this series, the vlan doesn't maintain state for the filter, so a
reload requires interaction of the NIC backend.  Anthony had requested a
common software filter that would make something like you're suggesting
easier, but I'm still wrestling with the nuances of how that might work.
Things like the e1000 multicast hash throw a kink in the plan, which
left me with each NIC needing to re-enable it's own filtering when
another client is added.

>   - What do we need rxfilter=on|off on the command line for?

Primarily because the current tun driver in Linux has a bug that it can
drop unicast packets requested to be included in the filter if it
overflows the exact match table.

http://www.spinics.net/lists/netdev/msg88451.html

Thanks,

Alex

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-11 19:51     ` [Qemu-devel] " Alex Williamson
@ 2009-02-11 20:19       ` Mark McLoughlin
  -1 siblings, 0 replies; 43+ messages in thread
From: Mark McLoughlin @ 2009-02-11 20:19 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm

On Wed, 2009-02-11 at 12:51 -0700, Alex Williamson wrote:
> >   - What do we need rxfilter=on|off on the command line for?
> 
> Primarily because the current tun driver in Linux has a bug that it can
> drop unicast packets requested to be included in the filter if it
> overflows the exact match table.
> 
> http://www.spinics.net/lists/netdev/msg88451.html

Oh dear. Does this suggest we should make rxfilter=off the default?

Is there any way we can automatically detect we're running on a broken
kernel?

Thanks,
Mark.



^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-11 20:19       ` Mark McLoughlin
  0 siblings, 0 replies; 43+ messages in thread
From: Mark McLoughlin @ 2009-02-11 20:19 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, kvm

On Wed, 2009-02-11 at 12:51 -0700, Alex Williamson wrote:
> >   - What do we need rxfilter=on|off on the command line for?
> 
> Primarily because the current tun driver in Linux has a bug that it can
> drop unicast packets requested to be included in the filter if it
> overflows the exact match table.
> 
> http://www.spinics.net/lists/netdev/msg88451.html

Oh dear. Does this suggest we should make rxfilter=off the default?

Is there any way we can automatically detect we're running on a broken
kernel?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-11 20:19       ` [Qemu-devel] " Mark McLoughlin
@ 2009-02-11 20:37         ` Alex Williamson
  -1 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 20:37 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: qemu-devel, kvm

On Wed, 2009-02-11 at 20:19 +0000, Mark McLoughlin wrote:
> On Wed, 2009-02-11 at 12:51 -0700, Alex Williamson wrote:
> > >   - What do we need rxfilter=on|off on the command line for?
> > 
> > Primarily because the current tun driver in Linux has a bug that it can
> > drop unicast packets requested to be included in the filter if it
> > overflows the exact match table.
> > 
> > http://www.spinics.net/lists/netdev/msg88451.html
> 
> Oh dear. Does this suggest we should make rxfilter=off the default?
> 
> Is there any way we can automatically detect we're running on a broken
> kernel?

TUNSETTXFITLER has only existed since 2.6.26, so the ioctl will fail on
anything older and it will be disabled anyway.  The patch will fix .29
and should get rolled into .28 stable, so we're looking at an exposure
of 2 kernel releases.  Unfortunately a few community distros went out on
those kernels, so perhaps the prudent approach would be to make the
default disabled until we're a few releases beyond.  I don't know any
way you could detect it outside of ugly parsing of uname -r.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
@ 2009-02-11 20:37         ` Alex Williamson
  0 siblings, 0 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-11 20:37 UTC (permalink / raw)
  To: Mark McLoughlin; +Cc: qemu-devel, kvm

On Wed, 2009-02-11 at 20:19 +0000, Mark McLoughlin wrote:
> On Wed, 2009-02-11 at 12:51 -0700, Alex Williamson wrote:
> > >   - What do we need rxfilter=on|off on the command line for?
> > 
> > Primarily because the current tun driver in Linux has a bug that it can
> > drop unicast packets requested to be included in the filter if it
> > overflows the exact match table.
> > 
> > http://www.spinics.net/lists/netdev/msg88451.html
> 
> Oh dear. Does this suggest we should make rxfilter=off the default?
> 
> Is there any way we can automatically detect we're running on a broken
> kernel?

TUNSETTXFITLER has only existed since 2.6.26, so the ioctl will fail on
anything older and it will be disabled anyway.  The patch will fix .29
and should get rolled into .28 stable, so we're looking at an exposure
of 2 kernel releases.  Unfortunately a few community distros went out on
those kernels, so perhaps the prudent approach would be to make the
default disabled until we're a few releases beyond.  I don't know any
way you could detect it outside of ugly parsing of uname -r.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-10 21:28   ` [Qemu-devel] " Alex Williamson
@ 2009-02-12 16:26     ` Paul Brook
  -1 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-12 16:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, kvm

> +static void virtio_net_vlan_client_added(void *opaque)
>...
> +static void virtio_net_vlan_client_removed(void *opaque)

Why are these two different?

It looks like what you really want is a callback for "Something changed,
and you need to reset your MAC filter."

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-12 16:26     ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-12 16:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

> +static void virtio_net_vlan_client_added(void *opaque)
>...
> +static void virtio_net_vlan_client_removed(void *opaque)

Why are these two different?

It looks like what you really want is a callback for "Something changed,
and you need to reset your MAC filter."

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-12 16:26     ` Paul Brook
  (?)
@ 2009-02-12 16:36     ` Alex Williamson
  2009-02-12 17:05         ` Paul Brook
  -1 siblings, 1 reply; 43+ messages in thread
From: Alex Williamson @ 2009-02-12 16:36 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, kvm

On Thu, 2009-02-12 at 16:26 +0000, Paul Brook wrote:
> > +static void virtio_net_vlan_client_added(void *opaque)
> >...
> > +static void virtio_net_vlan_client_removed(void *opaque)
> 
> Why are these two different?
> 
> It looks like what you really want is a callback for "Something changed,
> and you need to reset your MAC filter."

I think we'd end up with a race if we only had one callback.  For
instance if "change" was the result of a vlan client being removed, the
tap would clear the filter and the nic would try to install a filter.
The results would be different based on the calling order.

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-12 16:36     ` Alex Williamson
@ 2009-02-12 17:05         ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-12 17:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, kvm

On Thursday 12 February 2009, Alex Williamson wrote:
> On Thu, 2009-02-12 at 16:26 +0000, Paul Brook wrote:
> > > +static void virtio_net_vlan_client_added(void *opaque)
> > >...
> > > +static void virtio_net_vlan_client_removed(void *opaque)
> >
> > Why are these two different?
> >
> > It looks like what you really want is a callback for "Something changed,
> > and you need to reset your MAC filter."
>
> I think we'd end up with a race if we only had one callback.  For
> instance if "change" was the result of a vlan client being removed, the
> tap would clear the filter and the nic would try to install a filter.
> The results would be different based on the calling order.

In that case either your implementation or your abstraction is wrong. 
Devices shouldn't need to know or care why a filter failed to apply.

I'm not sure what you mean by "the tap would clear the filter".

You have three options:

- Devices request a filter, and that request may fail. qemu notifies the 
device when it needs to retry the filter. It doesn't make any difference 
whether we think we just broke the old filter, or we think a previously 
failed filter may succeed now, the device response is the same: Try to set a 
filter and see if it works. This is what I assume you're trying to implement.
- Implement reliable filters. The device requests a filter once[1]. qemu makes 
sure that filter is always honored.
- Devices request a filter once. qemu remembers what that filter is, and 
notifies the device if/when that filter becomes active/inactive.

Paul

[1] once == every time the filter set changes.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-12 17:05         ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-12 17:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

On Thursday 12 February 2009, Alex Williamson wrote:
> On Thu, 2009-02-12 at 16:26 +0000, Paul Brook wrote:
> > > +static void virtio_net_vlan_client_added(void *opaque)
> > >...
> > > +static void virtio_net_vlan_client_removed(void *opaque)
> >
> > Why are these two different?
> >
> > It looks like what you really want is a callback for "Something changed,
> > and you need to reset your MAC filter."
>
> I think we'd end up with a race if we only had one callback.  For
> instance if "change" was the result of a vlan client being removed, the
> tap would clear the filter and the nic would try to install a filter.
> The results would be different based on the calling order.

In that case either your implementation or your abstraction is wrong. 
Devices shouldn't need to know or care why a filter failed to apply.

I'm not sure what you mean by "the tap would clear the filter".

You have three options:

- Devices request a filter, and that request may fail. qemu notifies the 
device when it needs to retry the filter. It doesn't make any difference 
whether we think we just broke the old filter, or we think a previously 
failed filter may succeed now, the device response is the same: Try to set a 
filter and see if it works. This is what I assume you're trying to implement.
- Implement reliable filters. The device requests a filter once[1]. qemu makes 
sure that filter is always honored.
- Devices request a filter once. qemu remembers what that filter is, and 
notifies the device if/when that filter becomes active/inactive.

Paul

[1] once == every time the filter set changes.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-12 17:05         ` Paul Brook
  (?)
@ 2009-02-12 18:21         ` Alex Williamson
  2009-02-12 20:26           ` Jamie Lokier
  2009-02-13 12:40             ` Paul Brook
  -1 siblings, 2 replies; 43+ messages in thread
From: Alex Williamson @ 2009-02-12 18:21 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, kvm

On Thu, 2009-02-12 at 17:05 +0000, Paul Brook wrote:
> On Thursday 12 February 2009, Alex Williamson wrote:
> > On Thu, 2009-02-12 at 16:26 +0000, Paul Brook wrote:
> > > > +static void virtio_net_vlan_client_added(void *opaque)
> > > >...
> > > > +static void virtio_net_vlan_client_removed(void *opaque)
> > >
> > > Why are these two different?
> > >
> > > It looks like what you really want is a callback for "Something changed,
> > > and you need to reset your MAC filter."
> >
> > I think we'd end up with a race if we only had one callback.  For
> > instance if "change" was the result of a vlan client being removed, the
> > tap would clear the filter and the nic would try to install a filter.
> > The results would be different based on the calling order.
> 
> In that case either your implementation or your abstraction is wrong. 
> Devices shouldn't need to know or care why a filter failed to apply.

They don't, but they do need to know whether a filter they previously
applied successfully is no longer in place.  If they don't know this,
they have to assume the filter on the other side of the vlan is
transient and always double check it with their own filtering, which
seems like a waste of cache and cycles.

Is your objection that the NIC needs to associate the change in filter
state with a vlan event?  If so, maybe we can re-architect the callbacks
to something more appropriate.  We would need both a filter_cleared()
and a filter_retry() to achieve the same mechanics.  filter_cleared()
feels like something the tap device (the "filterer") would call on the
NIC (the "filteree").  But then the "filterer" needs some callback to
know when to clear and notify, which gets us back to somebody needs to
associate a vlan event with a change in the filter.  Suggestions?

> I'm not sure what you mean by "the tap would clear the filter".

By clear, I mean disable the filter, bringing the vlan back to an
unfiltered state.

> You have three options:
> 
> - Devices request a filter, and that request may fail. qemu notifies the 
> device when it needs to retry the filter. It doesn't make any difference 
> whether we think we just broke the old filter, or we think a previously 
> failed filter may succeed now, the device response is the same: Try to set a 
> filter and see if it works. This is what I assume you're trying to implement.
> - Implement reliable filters. The device requests a filter once[1]. qemu makes 
> sure that filter is always honored.
> - Devices request a filter once. qemu remembers what that filter is, and 
> notifies the device if/when that filter becomes active/inactive.

None of these are complete solutions.  My code behaves like this:

- A device requests a filter and is told if the request is successful
  - On success the device may skip it's own filtering
- If another vlan client is added, the following _must_ occur:
  - The "filterer" must clear (remove) the filter
  - The "filteree" must revert to using their own filtering
- If a vlan client is removed, the following _may_ occur:
  - vlan clients are notified that they may retry their filter

The added()/removed() interface feels like the right solution to this.
We could use a changed() function, but it would need to know the
direction of the change, which leads back to the same mechanics.  If
there's a better way, please suggest it.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] Re: [PATCH 0/4] qemu: TAP filtering support
  2009-02-11 20:37         ` [Qemu-devel] " Alex Williamson
  (?)
@ 2009-02-12 19:57         ` Jamie Lokier
  -1 siblings, 0 replies; 43+ messages in thread
From: Jamie Lokier @ 2009-02-12 19:57 UTC (permalink / raw)
  To: qemu-devel; +Cc: Mark McLoughlin, kvm

Alex Williamson wrote:
> TUNSETTXFITLER has only existed since 2.6.26, so the ioctl will fail on
> anything older and it will be disabled anyway.  The patch will fix .29
> and should get rolled into .28 stable, so we're looking at an exposure
> of 2 kernel releases.  Unfortunately a few community distros went out on
> those kernels, so perhaps the prudent approach would be to make the
> default disabled until we're a few releases beyond.  I don't know any
> way you could detect it outside of ugly parsing of uname -r.  Thanks,

I would parse uname -r - it's not hard, Linux versions have always
matched %d.%d.%d, and map in a standard way to a 32-bit integer for
easy comparisons.

A few QEMU releases later, someone will run the new version on a
2.6.26 to 2.6.28 host.

It's quite usual in my experience to run a QEMU that's much newer than
the corresponding host kernel, because upgrading host is a big deal
(if you only have one or two) due to being heavily used, whereas
upgrading QEMU/KVM is much easier as you can do it one VM guest at a time.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-12 18:21         ` Alex Williamson
@ 2009-02-12 20:26           ` Jamie Lokier
  2009-02-13 12:40             ` Paul Brook
  1 sibling, 0 replies; 43+ messages in thread
From: Jamie Lokier @ 2009-02-12 20:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paul Brook, kvm

Alex Williamson wrote:
> They don't, but they do need to know whether a filter they previously
> applied successfully is no longer in place.  If they don't know this,
> they have to assume the filter on the other side of the vlan is
> transient and always double check it with their own filtering, which
> seems like a waste of cache and cycles.

Double checking is not a waste of cycles if the host filtering is good
but not guaranteed, as it still means fewer packets cross the host
kernel/user boundary.

How reliable are the host filter interfaces anyway?

    - Are they 100% reliable?  For example, I remember at one time on
      Linux, when application A listened for broadcast IP packets, you
      wouldn't receive any which has a unicast MAC address for a
      different host, except for the times when tcpdump was run in
      parallel.  That's an example of when the app sees the host filtering
      behave differently depending on other apps run at the same time.
      Multicast hash filtering is similar, but also depends on the host NIC.

    - When updated, do the host filters take effect from the instant
      the host syscall returned, or can there be already queued
      packets which bypass the new filter?

> - A device requests a filter and is told if the request is successful
>   - On success the device may skip it's own filtering
> - If another vlan client is added, the following _must_ occur:
>   - The "filterer" must clear (remove) the filter
>   - The "filteree" must revert to using their own filtering
> - If a vlan client is removed, the following _may_ occur:
>   - vlan clients are notified that they may retry their filter
> 
> The added()/removed() interface feels like the right solution to this.
> We could use a changed() function, but it would need to know the
> direction of the change, which leads back to the same mechanics.  If
> there's a better way, please suggest it.  Thanks,

If there are two vlan clients, you don't necessarily need them to use
their own filters, if they are both the same, or compatible in some way.

How about this:

   - On any change, notify all clients by walking the list twice.
   - Phase 1.  For each client:
     - The client requests a filter.
   - Phase 2.  For each client:
     - The client enquires whether its filter is in place.
       If yes, it relies on it.  If no or unreliable, it filters itself.

During this procedure, don't do any packet I/O.

During phase 1, don't send a request to the host kernel until just one
at the end (if the filter has changed), otherwise there will be a
brief time window with no host filter which could leak packets despite
no real change wanted and no packet I/O performed.

The same procedure is done for any configuration change, and together
with that change.  Actually: Stop packet I/O, then phase 1, then
update host kernel taps if they've changed, then phase 2, then restart
packet I/O.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-12 18:21         ` Alex Williamson
@ 2009-02-13 12:40             ` Paul Brook
  2009-02-13 12:40             ` Paul Brook
  1 sibling, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-13 12:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, kvm

> - A device requests a filter and is told if the request is successful
>   - On success the device may skip it's own filtering
> - If another vlan client is added, the following _must_ occur:
>   - The "filterer" must clear (remove) the filter
>   - The "filteree" must revert to using their own filtering
> - If a vlan client is removed, the following _may_ occur:
>   - vlan clients are notified that they may retry their filter
>..
> The added()/removed() interface feels like the right solution to this.

I think your analysis is fundamentally flawed. Firstly I'm not sure the above 
holds when going between 1 and 2 clients on a vlan.  Even ignoring that, you 
are making implicit assumptions about when a filter will succeed. If these 
assumptions are broken (which seems likely if we ever implement filtering 
with more than 2 devices on a vlan) they you'll get subtle breakage in every 
single user of the API.

> We could use a changed() function, but it would need to know the
> direction of the change, which leads back to the same mechanics.  If
> there's a better way, please suggest it.  Thanks,

I still don't see why the device needs to know what's changed. The response 
should always be the same: Request a filter, and see if it works.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-13 12:40             ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-13 12:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

> - A device requests a filter and is told if the request is successful
>   - On success the device may skip it's own filtering
> - If another vlan client is added, the following _must_ occur:
>   - The "filterer" must clear (remove) the filter
>   - The "filteree" must revert to using their own filtering
> - If a vlan client is removed, the following _may_ occur:
>   - vlan clients are notified that they may retry their filter
>..
> The added()/removed() interface feels like the right solution to this.

I think your analysis is fundamentally flawed. Firstly I'm not sure the above 
holds when going between 1 and 2 clients on a vlan.  Even ignoring that, you 
are making implicit assumptions about when a filter will succeed. If these 
assumptions are broken (which seems likely if we ever implement filtering 
with more than 2 devices on a vlan) they you'll get subtle breakage in every 
single user of the API.

> We could use a changed() function, but it would need to know the
> direction of the change, which leads back to the same mechanics.  If
> there's a better way, please suggest it.  Thanks,

I still don't see why the device needs to know what's changed. The response 
should always be the same: Request a filter, and see if it works.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 12:40             ` Paul Brook
@ 2009-02-13 16:00               ` Jamie Lokier
  -1 siblings, 0 replies; 43+ messages in thread
From: Jamie Lokier @ 2009-02-13 16:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

Paul Brook wrote:
> > We could use a changed() function, but it would need to know the
> > direction of the change, which leads back to the same mechanics.  If
> > there's a better way, please suggest it.  Thanks,
> 
> I still don't see why the device needs to know what's changed. The response 
> should always be the same: Request a filter, and see if it works.

Well, you do need some way to notify a client that the filter which
used to work has been removed because it's no longer available, for
example when migrating between host kernels.

Or implement reliable filtering in the infrastructure which relays
packets internally in QEMU, so that each client can request a filter
and it always works, and the tap driver can tell the QEMU
infrastructure when kernel filtering is working and not, but each
client doesn't need to know that.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-13 16:00               ` Jamie Lokier
  0 siblings, 0 replies; 43+ messages in thread
From: Jamie Lokier @ 2009-02-13 16:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alex Williamson, kvm

Paul Brook wrote:
> > We could use a changed() function, but it would need to know the
> > direction of the change, which leads back to the same mechanics.  If
> > there's a better way, please suggest it.  Thanks,
> 
> I still don't see why the device needs to know what's changed. The response 
> should always be the same: Request a filter, and see if it works.

Well, you do need some way to notify a client that the filter which
used to work has been removed because it's no longer available, for
example when migrating between host kernels.

Or implement reliable filtering in the infrastructure which relays
packets internally in QEMU, so that each client can request a filter
and it always works, and the tap driver can tell the QEMU
infrastructure when kernel filtering is working and not, but each
client doesn't need to know that.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 16:00               ` Jamie Lokier
@ 2009-02-13 16:17                 ` Paul Brook
  -1 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-13 16:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jamie Lokier, Alex Williamson, kvm

On Friday 13 February 2009, Jamie Lokier wrote:
> Paul Brook wrote:
> > > We could use a changed() function, but it would need to know the
> > > direction of the change, which leads back to the same mechanics.  If
> > > there's a better way, please suggest it.  Thanks,
> >
> > I still don't see why the device needs to know what's changed. The
> > response should always be the same: Request a filter, and see if it
> > works.
>
> Well, you do need some way to notify a client that the filter which
> used to work has been removed because it's no longer available, for
> example when migrating between host kernels.

This is actually more evidence that the "add" and "remove" callbacks are 
bogus.

My point is that we're allowing the filter to fail for arbitrary reasons. Once 
you assume this, trying to be clever is just going to catch you out when we 
encounter a case you didn't think of.

A simple "Something changed, please try your filter again" callback 
automatically covers all these cases.

It may be that in some cases qemu already knows the filter is going to fail. 
However these events are rare (i.e. not performance critical) so it's far 
simpler to just use the same callback and have the device try anyway.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-13 16:17                 ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-13 16:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

On Friday 13 February 2009, Jamie Lokier wrote:
> Paul Brook wrote:
> > > We could use a changed() function, but it would need to know the
> > > direction of the change, which leads back to the same mechanics.  If
> > > there's a better way, please suggest it.  Thanks,
> >
> > I still don't see why the device needs to know what's changed. The
> > response should always be the same: Request a filter, and see if it
> > works.
>
> Well, you do need some way to notify a client that the filter which
> used to work has been removed because it's no longer available, for
> example when migrating between host kernels.

This is actually more evidence that the "add" and "remove" callbacks are 
bogus.

My point is that we're allowing the filter to fail for arbitrary reasons. Once 
you assume this, trying to be clever is just going to catch you out when we 
encounter a case you didn't think of.

A simple "Something changed, please try your filter again" callback 
automatically covers all these cases.

It may be that in some cases qemu already knows the filter is going to fail. 
However these events are rare (i.e. not performance critical) so it's far 
simpler to just use the same callback and have the device try anyway.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 16:17                 ` Paul Brook
  (?)
@ 2009-02-13 16:46                 ` Jamie Lokier
  2009-02-13 17:04                   ` Paul Brook
  -1 siblings, 1 reply; 43+ messages in thread
From: Jamie Lokier @ 2009-02-13 16:46 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, Alex Williamson, kvm

Paul Brook wrote:
> > Well, you do need some way to notify a client that the filter which
> > used to work has been removed because it's no longer available, for
> > example when migrating between host kernels.
> 
> This is actually more evidence that the "add" and "remove" callbacks are 
> bogus.

I agree, "add" and "remove" are misnamed among other bogosities.

> My point is that we're allowing the filter to fail for arbitrary
> reasons. Once you assume this, trying to be clever is just going to
> catch you out when we encounter a case you didn't think of.

Yes.

> A simple "Something changed, please try your filter again" callback 
> automatically covers all these cases.

It doesn't, if tap has no memory of how many clients require a filter.

If tap just answers "YES and installs the kernel filter" or "NO and
doesn't install the kernel filter" and doesn't remember how many
clients need a filter, then:

    1. Client A requests filter A
         -> tap says YES, installs kernel filter.
    2. Client B requests filter B
         -> tap can't do both, so deinstalls kernel filter.
         -> tap says NO to client B.
         -> tap notifies client A "there's been a change".
    3. Client A requests filter A
         -> tap doesn't remember that client B wants a filter, so
         -> tap says YES.
         -> tap notifies client B "there's been a change".
    4. Goto 2
         -> endless loop.

In other words, tap needs to distinguish three states:

     "1 filter requested and installed in the kernel"
     ">1 filter requested, none installed in the kernel"
     "0 filters requested, none installed in the kernel"

The interface to clients needs to keep that state updated in tap
somehow.  Just requesting a filter and reverting to software filtering
when the request failed doesn't do that.

Note this has nothing to do with the software filtering fallback.
Even if qemu internal filtering is always done, you still need the
above to keep the kernel filter correct.

Imho, since there needs to be software filter code for fallback
_anyway_, a better place to put that fallback is in the
client-independent plumbing which relays packets to each client.  And
if it's there, you can have a different software fallback filter for
each client anyway, so clients can assume filters are always installed
and working perfectly.

The interface between the plumbing and the tap driver still needs to
handle the issues described above.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 16:46                 ` Jamie Lokier
@ 2009-02-13 17:04                   ` Paul Brook
  2009-02-13 20:38                     ` Jamie Lokier
  0 siblings, 1 reply; 43+ messages in thread
From: Paul Brook @ 2009-02-13 17:04 UTC (permalink / raw)
  To: Jamie Lokier; +Cc: qemu-devel, Alex Williamson, kvm

> > A simple "Something changed, please try your filter again" callback
> > automatically covers all these cases.
>
> It doesn't, if tap has no memory of how many clients require a filter.

I'm talking about a callback for devices requesting an inbound filter. Devices 
implementing outgoing (device->vlan) filters just do as they're told.

> If tap just answers "YES and installs the kernel filter" or "NO and
> doesn't install the kernel filter" and doesn't remember how many
> clients need a filter, then:
>...
> In other words, tap needs to distinguish three states:
>
>      "1 filter requested and installed in the kernel"
>      ">1 filter requested, none installed in the kernel"
>      "0 filters requested, none installed in the kernel"

Absolutely not. This is the reason we have separate the "request an incoming 
filter" API from the "provide an outgoing filter" callback. It allows the 
vlan code to arbitrate in the middle. A vlan is a bus network, not a set of 
point-point connections. I haven't checked whether the proposed patch gets 
this right. I suspect it probably doesn't.

This is why the initial patch that had clients talking to each other directly 
was completely wrong.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 17:04                   ` Paul Brook
@ 2009-02-13 20:38                     ` Jamie Lokier
  2009-02-15 16:25                         ` Paul Brook
  0 siblings, 1 reply; 43+ messages in thread
From: Jamie Lokier @ 2009-02-13 20:38 UTC (permalink / raw)
  To: Paul Brook; +Cc: qemu-devel, Alex Williamson, kvm

Paul Brook wrote:
> > > A simple "Something changed, please try your filter again" callback
> > > automatically covers all these cases.
> >
> > It doesn't, if tap has no memory of how many clients require a filter.
> 
> I'm talking about a callback for devices requesting an inbound
> filter. Devices implementing outgoing (device->vlan) filters just do
> as they're told.

So am I.

> > If tap just answers "YES and installs the kernel filter" or "NO and
> > doesn't install the kernel filter" and doesn't remember how many
> > clients need a filter, then:
> >...
> > In other words, tap needs to distinguish three states:
> >
> >      "1 filter requested and installed in the kernel"
> >      ">1 filter requested, none installed in the kernel"
> >      "0 filters requested, none installed in the kernel"
> 
> Absolutely not. This is the reason we have separate the "request an incoming 
> filter" API from the "provide an outgoing filter" callback. It allows the 
> vlan code to arbitrate in the middle.

I'm guessing that "vlan code to arbitrate in the middle" is exactly
what I've suggested.

Maybe the description was messed up.  Substitute something else for
"tap" in the above: "The _vlan arbitration code which talks to the tap
device_ needs to distinguish three states...".

> A vlan is a bus network, not a set of point-point connections.

Yes, this is what I've assumed.

The callback you suggest for devices requesting an inbound filter will
infinite-loop when there's two such devices on the same vlan bus,
because each time the callback is called, that device will re-issue
its filter request which triggers the callback on the other similar
device.  Back and forth.

To avoid the infinite loop, the vlan code in the middle (if that's
where you want it, and I agree) has to distinguish between no inbound
filters requested by attached devices, and multiple incompatible
inbound filters requested by attached devices.

-- Jamie

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
  2009-02-13 20:38                     ` Jamie Lokier
@ 2009-02-15 16:25                         ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-15 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jamie Lokier, Alex Williamson, kvm

> The callback you suggest for devices requesting an inbound filter will
> infinite-loop when there's two such devices on the same vlan bus,
> because each time the callback is called, that device will re-issue
> its filter request which triggers the callback on the other similar
> device.  Back and forth.
>
> To avoid the infinite loop, the vlan code in the middle (if that's
> where you want it, and I agree) has to distinguish between no inbound
> filters requested by attached devices, and multiple incompatible
> inbound filters requested by attached devices.

Of course.

As I've said repeatedly, the only sane way to implement this is if you isolate 
individual devices from this kind of implementation detail. If you're API 
doesn't do that then it's wrong.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter
@ 2009-02-15 16:25                         ` Paul Brook
  0 siblings, 0 replies; 43+ messages in thread
From: Paul Brook @ 2009-02-15 16:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: kvm, Alex Williamson

> The callback you suggest for devices requesting an inbound filter will
> infinite-loop when there's two such devices on the same vlan bus,
> because each time the callback is called, that device will re-issue
> its filter request which triggers the callback on the other similar
> device.  Back and forth.
>
> To avoid the infinite loop, the vlan code in the middle (if that's
> where you want it, and I agree) has to distinguish between no inbound
> filters requested by attached devices, and multiple incompatible
> inbound filters requested by attached devices.

Of course.

As I've said repeatedly, the only sane way to implement this is if you isolate 
individual devices from this kind of implementation detail. If you're API 
doesn't do that then it's wrong.

Paul

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2009-02-15 16:25 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-10 21:28 [PATCH 0/4] qemu: TAP filtering support Alex Williamson
2009-02-10 21:28 ` [Qemu-devel] " Alex Williamson
2009-02-10 21:28 ` [PATCH 1/4] qemu:net: Add infrastructure for setting an RX filter through the vlan Alex Williamson
2009-02-10 21:28   ` [Qemu-devel] " Alex Williamson
2009-02-10 21:28 ` [PATCH 2/4] qemu:net: Add TAP support for RX filtering on Linux Alex Williamson
2009-02-10 21:28   ` [Qemu-devel] " Alex Williamson
2009-02-10 21:28 ` [PATCH 3/4] qemu:virtio-net: Add support for qemu_vlan_rxfilter Alex Williamson
2009-02-10 21:28   ` [Qemu-devel] " Alex Williamson
2009-02-12 16:26   ` Paul Brook
2009-02-12 16:26     ` Paul Brook
2009-02-12 16:36     ` Alex Williamson
2009-02-12 17:05       ` Paul Brook
2009-02-12 17:05         ` Paul Brook
2009-02-12 18:21         ` Alex Williamson
2009-02-12 20:26           ` Jamie Lokier
2009-02-13 12:40           ` Paul Brook
2009-02-13 12:40             ` Paul Brook
2009-02-13 16:00             ` Jamie Lokier
2009-02-13 16:00               ` Jamie Lokier
2009-02-13 16:17               ` Paul Brook
2009-02-13 16:17                 ` Paul Brook
2009-02-13 16:46                 ` Jamie Lokier
2009-02-13 17:04                   ` Paul Brook
2009-02-13 20:38                     ` Jamie Lokier
2009-02-15 16:25                       ` Paul Brook
2009-02-15 16:25                         ` Paul Brook
2009-02-10 21:29 ` [PATCH 4/4] qemu:e1000: " Alex Williamson
2009-02-10 21:29   ` [Qemu-devel] " Alex Williamson
2009-02-11 15:11   ` Alex Williamson
2009-02-11 15:11     ` [Qemu-devel] " Alex Williamson
2009-02-11 17:11   ` Alex Williamson
2009-02-11 17:11     ` [Qemu-devel] " Alex Williamson
2009-02-11 19:31 ` [PATCH 0/4] qemu: TAP filtering support Mark McLoughlin
2009-02-11 19:31   ` [Qemu-devel] " Mark McLoughlin
2009-02-11 19:43   ` Anthony Liguori
2009-02-11 19:43     ` [Qemu-devel] " Anthony Liguori
2009-02-11 19:51   ` Alex Williamson
2009-02-11 19:51     ` [Qemu-devel] " Alex Williamson
2009-02-11 20:19     ` Mark McLoughlin
2009-02-11 20:19       ` [Qemu-devel] " Mark McLoughlin
2009-02-11 20:37       ` Alex Williamson
2009-02-11 20:37         ` [Qemu-devel] " Alex Williamson
2009-02-12 19:57         ` Jamie Lokier

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.