* [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter
@ 2016-08-17  8:10 Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 01/10] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext Zhang Chen
                   ` (11 more replies)
  0 siblings, 12 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

COLO-compare is part of the COLO project. It compares
network packets to help COLO decide whether to trigger
a checkpoint.

Filter-rewriter is also part of the COLO project.
It rewrites some of the secondary's packets so that the
secondary guest's connections can be established successfully.
In this module we rewrite the ack of TCP packets sent from the
primary to the secondary, and the seq of TCP packets sent from
the secondary to the primary.

The full version is available on GitHub:
https://github.com/zhangckid/qemu/tree/colo-v2.7-proxy-mode-compare-and-rewriter-aug16


v12:
  - add qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
    to this series as the first patch.
  - update COLO net ascii figure.
  - add chardev socket check.
  - fix some typos.
  - add some comments.
  - rename net/colo-base.c to net/colo.c
  - rename network/transport_layer to network/transport_header.
  - move the job that clears conn_list when hashtable_size is oversized
    to connection_get.
  - reuse connection_destroy() to do colo_rm_connection().
  - fix pkt mem leak in colo_compare_connection().
    (result is released in g_queue_remove(), so it was not a leak)
  - rename thread_name "compare" to "colo-compare".
  - change icmp compare to memcmp().

v11:
  - Make patch 5 an independent patch series.
    [PATCH V3] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
  - For Jason's comments, merge filter-rewriter to this series.
    (patch 7,8,9)
  - Add reverse_connection_key()
  - remove conn_list in filter-rewriter
  - remove unprocessed_connections
  - add some comments

v10:
  - fix typo
  - Should we make patch 5 independent of this series?
    This patch just adds an API for qemu-char.

v9:
 p5:
  - use chr_update_read_handler_full() replace
    the chr_update_read_handler()
  - use io_watch_poll_prepare_full() replace
    the io_watch_poll_prepare()
  - use io_watch_poll_funcs_full replace
    the io_watch_poll_funcs
  - avoid code duplication

v8:
 p5:
  - add new patch:
    qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext

v7:
 p5:
   - add [PATCH]qemu-char: Fix context for g_source_attach()
     in this patch series.

v6: 
 p6:
   - add more commit log.
   - fix icmp comparison to compare all packet.

 p5:
   - add more comments in commit log.
   - change REGULAR_CHECK_MS to REGULAR_PACKET_CHECK_MS
   - make the old packet check independent of the compare thread
   - remove thread_status

 p4:
   - change this patch only about
     Connection and ConnectionKey.
   - add some comments in commit log.
   - remove mode in fill_connection_key().
   - fix some comments and bug.
   - move colo_conn_state to patch of
     "work with colo-frame"
   - remove conn_list_lock.
   - add MAX_QUEUE_SIZE; if primary_list or
     secondary_list grows bigger than MAX_QUEUE_SIZE,
     we will drop the packet.

 p3:
   - add new independent kernel jhash patch.

 p2:
   - add new independent colo-base patch.

 p1:
   - add an ASCII figure and some comments to explain it
   - move trace.h to p2
   - move QTAILQ_HEAD(, CompareState) net_compares to
     patch of "work with colo-frame"
   - add some comments in qemu-option.hx


v5:
 p3:
    - comments from Jason:
      we poll and handle the chardev in the compare thread.
      This way, there is no need for extra
      synchronization with the main loop.
      This depends on another patch:
      qemu-char: Fix context for g_source_attach()
    - remove QemuEvent
 p2:
    - remove conn->list_lock
 p1:
    - move compare_pri/sec_chr_in to p3
    - move compare_chr_send to p2

v4:
 p4:
    - add some comments
    - fix some trace-events
    - fix tcp compare error
 p3:
    - add rcu_read_lock().
    - fix trace name
    - fix jason's other comments
    - rebase some Dave's branch function
 p2:
    - colo_compare_connection(): change g_queue_push_head() to
      g_queue_push_tail() to match the sorted order.
    - remove pkt->s
    - move data structure to colo-base.h
    - add colo-base.c reuse codes for filter-rewriter
    - add some filter-rewriter needs struct
    - depends on previous SocketReadState patch
 p1:
    - except move qemu_chr_add_handlers()
      to colo thread
    - remove class_finalize
    - remove secondary arp codes
    - depends on previous SocketReadState patch

v3:
  - rebase colo-compare to colo-frame v2.7
  - fix most of Dave's comments
    (except RCU)
  - add TCP,UDP,ICMP and other packet comparison
  - add trace-event
  - add some comments
  - other bug fix
  - add RFC index
  - add usage in patch 1/4

v2:
  - add jhash.h

v1:
  - initial patch


Zhang Chen (10):
  qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
  colo-compare: introduce colo compare initialization
  net/colo.c: add colo.c to define and handle packet
  Jhash: add linux kernel jhashtable in qemu
  colo-compare: track connection and enqueue packet
  colo-compare: introduce packet comparison thread
  colo-compare: add TCP,UDP,ICMP packet comparison
  filter-rewriter: introduce filter-rewriter initialization
  filter-rewriter: track connection and parse packet
  filter-rewriter: rewrite tcp packet to keep secondary connection

 include/qemu/jhash.h  |  59 ++++
 include/sysemu/char.h |  11 +-
 net/Makefile.objs     |   3 +
 net/colo-compare.c    | 784 ++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.c            | 204 +++++++++++++
 net/colo.h            |  76 +++++
 net/filter-rewriter.c | 268 +++++++++++++++++
 qemu-char.c           |  77 +++--
 qemu-options.hx       |  52 ++++
 trace-events          |  14 +
 vl.c                  |   4 +-
 11 files changed, 1526 insertions(+), 26 deletions(-)
 create mode 100644 include/qemu/jhash.h
 create mode 100644 net/colo-compare.c
 create mode 100644 net/colo.c
 create mode 100644 net/colo.h
 create mode 100644 net/filter-rewriter.c

-- 
2.7.4


* [Qemu-devel] [PATCH V12 01/10] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization Zhang Chen
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert, Daniel P . Berrange,
	Paolo Bonzini

Add the qemu_chr_add_handlers_full() API. It accepts a
GMainContext so that the handlers run in that context
rather than in the main loop.
This follows comments from Daniel P. Berrange.
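
The shape of the change (a "_full" variant that takes the context, with the
old entry point kept as a thin wrapper that passes NULL) can be sketched in
isolation as below. This is only an illustration of the pattern: DemoContext,
DemoChardev and the demo_* functions are hypothetical stand-ins and are not
QEMU or GLib types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GMainContext. */
typedef struct DemoContext {
    const char *name;
} DemoContext;

typedef void DemoHandler(void *opaque);

/* Hypothetical stand-in for CharDriverState. */
typedef struct DemoChardev {
    DemoHandler *read_handler;
    void *opaque;
    DemoContext *context; /* NULL means "use the default main loop" */
} DemoChardev;

/* Extended entry point: the caller chooses the context. */
static void demo_add_handlers_full(DemoChardev *chr, DemoHandler *handler,
                                   void *opaque, DemoContext *context)
{
    assert(chr != NULL);
    chr->read_handler = handler;
    chr->opaque = opaque;
    chr->context = context;
}

/* Legacy entry point: same behaviour as before, default context. */
static void demo_add_handlers(DemoChardev *chr, DemoHandler *handler,
                              void *opaque)
{
    demo_add_handlers_full(chr, handler, opaque, NULL);
}
```

Existing callers keep using the short form and are unaffected, while a
caller that owns its own event loop thread passes its context explicitly.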

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>

Cc: Daniel P . Berrange <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>

---
 include/sysemu/char.h | 11 +++++++-
 qemu-char.c           | 77 +++++++++++++++++++++++++++++++++++----------------
 2 files changed, 63 insertions(+), 25 deletions(-)

diff --git a/include/sysemu/char.h b/include/sysemu/char.h
index 307fd8f..f997849 100644
--- a/include/sysemu/char.h
+++ b/include/sysemu/char.h
@@ -65,7 +65,8 @@ struct CharDriverState {
     int (*chr_sync_read)(struct CharDriverState *s,
                          const uint8_t *buf, int len);
     GSource *(*chr_add_watch)(struct CharDriverState *s, GIOCondition cond);
-    void (*chr_update_read_handler)(struct CharDriverState *s);
+    void (*chr_update_read_handler)(struct CharDriverState *s,
+                                    GMainContext *context);
     int (*chr_ioctl)(struct CharDriverState *s, int cmd, void *arg);
     int (*get_msgfds)(struct CharDriverState *s, int* fds, int num);
     int (*set_msgfds)(struct CharDriverState *s, int *fds, int num);
@@ -388,6 +389,14 @@ void qemu_chr_add_handlers(CharDriverState *s,
                            IOEventHandler *fd_event,
                            void *opaque);
 
+/* This API can make handler run in the context what you pass to. */
+void qemu_chr_add_handlers_full(CharDriverState *s,
+                                IOCanReadHandler *fd_can_read,
+                                IOReadHandler *fd_read,
+                                IOEventHandler *fd_event,
+                                void *opaque,
+                                GMainContext *context);
+
 void qemu_chr_be_generic_open(CharDriverState *s);
 void qemu_chr_accept_input(CharDriverState *s);
 int qemu_chr_add_client(CharDriverState *s, int fd);
diff --git a/qemu-char.c b/qemu-char.c
index b597ee1..a926175 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -448,11 +448,12 @@ void qemu_chr_fe_printf(CharDriverState *s, const char *fmt, ...)
 
 static void remove_fd_in_watch(CharDriverState *chr);
 
-void qemu_chr_add_handlers(CharDriverState *s,
-                           IOCanReadHandler *fd_can_read,
-                           IOReadHandler *fd_read,
-                           IOEventHandler *fd_event,
-                           void *opaque)
+void qemu_chr_add_handlers_full(CharDriverState *s,
+                                IOCanReadHandler *fd_can_read,
+                                IOReadHandler *fd_read,
+                                IOEventHandler *fd_event,
+                                void *opaque,
+                                GMainContext *context)
 {
     int fe_open;
 
@@ -466,8 +467,9 @@ void qemu_chr_add_handlers(CharDriverState *s,
     s->chr_read = fd_read;
     s->chr_event = fd_event;
     s->handler_opaque = opaque;
-    if (fe_open && s->chr_update_read_handler)
-        s->chr_update_read_handler(s);
+    if (fe_open && s->chr_update_read_handler) {
+        s->chr_update_read_handler(s, context);
+    }
 
     if (!s->explicit_fe_open) {
         qemu_chr_fe_set_open(s, fe_open);
@@ -480,6 +482,16 @@ void qemu_chr_add_handlers(CharDriverState *s,
     }
 }
 
+void qemu_chr_add_handlers(CharDriverState *s,
+                           IOCanReadHandler *fd_can_read,
+                           IOReadHandler *fd_read,
+                           IOEventHandler *fd_event,
+                           void *opaque)
+{
+    qemu_chr_add_handlers_full(s, fd_can_read, fd_read,
+                               fd_event, opaque, NULL);
+}
+
 static int null_chr_write(CharDriverState *chr, const uint8_t *buf, int len)
 {
     return len;
@@ -717,7 +729,8 @@ static void mux_chr_event(void *opaque, int event)
         mux_chr_send_event(d, i, event);
 }
 
-static void mux_chr_update_read_handler(CharDriverState *chr)
+static void mux_chr_update_read_handler(CharDriverState *chr,
+                                        GMainContext *context)
 {
     MuxDriver *d = chr->opaque;
 
@@ -731,8 +744,10 @@ static void mux_chr_update_read_handler(CharDriverState *chr)
     d->chr_event[d->mux_cnt] = chr->chr_event;
     /* Fix up the real driver with mux routines */
     if (d->mux_cnt == 0) {
-        qemu_chr_add_handlers(d->drv, mux_chr_can_read, mux_chr_read,
-                              mux_chr_event, chr);
+        qemu_chr_add_handlers_full(d->drv, mux_chr_can_read,
+                                   mux_chr_read,
+                                   mux_chr_event,
+                                   chr, context);
     }
     if (d->focus != -1) {
         mux_chr_send_event(d, d->focus, CHR_EVENT_MUX_OUT);
@@ -840,6 +855,7 @@ typedef struct IOWatchPoll
     IOCanReadHandler *fd_can_read;
     GSourceFunc fd_read;
     void *opaque;
+    GMainContext *context;
 } IOWatchPoll;
 
 static IOWatchPoll *io_watch_poll_from_source(GSource *source)
@@ -847,7 +863,8 @@ static IOWatchPoll *io_watch_poll_from_source(GSource *source)
     return container_of(source, IOWatchPoll, parent);
 }
 
-static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
+static gboolean io_watch_poll_prepare(GSource *source,
+                                      gint *timeout_)
 {
     IOWatchPoll *iwp = io_watch_poll_from_source(source);
     bool now_active = iwp->fd_can_read(iwp->opaque) > 0;
@@ -860,7 +877,7 @@ static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_)
         iwp->src = qio_channel_create_watch(
             iwp->ioc, G_IO_IN | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
         g_source_set_callback(iwp->src, iwp->fd_read, iwp->opaque, NULL);
-        g_source_attach(iwp->src, NULL);
+        g_source_attach(iwp->src, iwp->context);
     } else {
         g_source_destroy(iwp->src);
         g_source_unref(iwp->src);
@@ -907,19 +924,22 @@ static GSourceFuncs io_watch_poll_funcs = {
 static guint io_add_watch_poll(QIOChannel *ioc,
                                IOCanReadHandler *fd_can_read,
                                QIOChannelFunc fd_read,
-                               gpointer user_data)
+                               gpointer user_data,
+                               GMainContext *context)
 {
     IOWatchPoll *iwp;
     int tag;
 
-    iwp = (IOWatchPoll *) g_source_new(&io_watch_poll_funcs, sizeof(IOWatchPoll));
+    iwp = (IOWatchPoll *) g_source_new(&io_watch_poll_funcs,
+                                       sizeof(IOWatchPoll));
     iwp->fd_can_read = fd_can_read;
     iwp->opaque = user_data;
     iwp->ioc = ioc;
     iwp->fd_read = (GSourceFunc) fd_read;
     iwp->src = NULL;
+    iwp->context = context;
 
-    tag = g_source_attach(&iwp->parent, NULL);
+    tag = g_source_attach(&iwp->parent, context);
     g_source_unref(&iwp->parent);
     return tag;
 }
@@ -1051,7 +1071,8 @@ static GSource *fd_chr_add_watch(CharDriverState *chr, GIOCondition cond)
     return qio_channel_create_watch(s->ioc_out, cond);
 }
 
-static void fd_chr_update_read_handler(CharDriverState *chr)
+static void fd_chr_update_read_handler(CharDriverState *chr,
+                                       GMainContext *context)
 {
     FDCharDriver *s = chr->opaque;
 
@@ -1059,7 +1080,8 @@ static void fd_chr_update_read_handler(CharDriverState *chr)
     if (s->ioc_in) {
         chr->fd_in_tag = io_add_watch_poll(s->ioc_in,
                                            fd_chr_read_poll,
-                                           fd_chr_read, chr);
+                                           fd_chr_read, chr,
+                                           context);
     }
 }
 
@@ -1303,7 +1325,8 @@ static void pty_chr_update_read_handler_locked(CharDriverState *chr)
     }
 }
 
-static void pty_chr_update_read_handler(CharDriverState *chr)
+static void pty_chr_update_read_handler(CharDriverState *chr,
+                                        GMainContext *context)
 {
     qemu_mutex_lock(&chr->chr_write_lock);
     pty_chr_update_read_handler_locked(chr);
@@ -1407,7 +1430,8 @@ static void pty_chr_state(CharDriverState *chr, int connected)
         if (!chr->fd_in_tag) {
             chr->fd_in_tag = io_add_watch_poll(s->ioc,
                                                pty_chr_read_poll,
-                                               pty_chr_read, chr);
+                                               pty_chr_read,
+                                               chr, NULL);
         }
     }
 }
@@ -2546,7 +2570,8 @@ static gboolean udp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
     return TRUE;
 }
 
-static void udp_chr_update_read_handler(CharDriverState *chr)
+static void udp_chr_update_read_handler(CharDriverState *chr,
+                                        GMainContext *context)
 {
     NetCharDriver *s = chr->opaque;
 
@@ -2554,7 +2579,8 @@ static void udp_chr_update_read_handler(CharDriverState *chr)
     if (s->ioc) {
         chr->fd_in_tag = io_add_watch_poll(s->ioc,
                                            udp_chr_read_poll,
-                                           udp_chr_read, chr);
+                                           udp_chr_read, chr,
+                                           context);
     }
 }
 
@@ -2931,12 +2957,14 @@ static void tcp_chr_connect(void *opaque)
     if (s->ioc) {
         chr->fd_in_tag = io_add_watch_poll(s->ioc,
                                            tcp_chr_read_poll,
-                                           tcp_chr_read, chr);
+                                           tcp_chr_read,
+                                           chr, NULL);
     }
     qemu_chr_be_generic_open(chr);
 }
 
-static void tcp_chr_update_read_handler(CharDriverState *chr)
+static void tcp_chr_update_read_handler(CharDriverState *chr,
+                                        GMainContext *context)
 {
     TCPCharDriver *s = chr->opaque;
 
@@ -2948,7 +2976,8 @@ static void tcp_chr_update_read_handler(CharDriverState *chr)
     if (s->ioc) {
         chr->fd_in_tag = io_add_watch_poll(s->ioc,
                                            tcp_chr_read_poll,
-                                           tcp_chr_read, chr);
+                                           tcp_chr_read, chr,
+                                           context);
     }
 }
 
-- 
2.7.4


* [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 01/10] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  7:53   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet Zhang Chen
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

This is a COLO net ASCII figure:

 Primary qemu                                                           Secondary qemu
+--------------------------------------------------------------+       +----------------------------------------------------------------+
| +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
| |                                                          | |       |  |                                                           | |
| |                        guest                             | |       |  |                        guest                              | |
| |                                                          | |       |  |                                                           | |
| +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
|         |                          |                         |       |                        ^        |                              |
|         |                          |                         |       |                        |        |                              |
|         |  +------------------------------------------------------+  |                        |        |                              |
|netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
| +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
| |       |  |                       |      |        out       |    |  |  |                     |        |  filter execute order      | |
| |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
| |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
| | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
| | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
| | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +-------->  filter   +--> adjust |   adjust     +-->   filter   | | |
| | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
| | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
| | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
| |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
| |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
| |      |             +--------------+     |                  |  |    |                                                   |            |
| |      |   filter execute order     |     |                  |  |    |                                                   |            |
| |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
| +-----------------------------------------+                  |       |                                                                |
|        |                            |                        |       |                                                                |
+--------------------------------------------------------------+       +----------------------------------------------------------------+
         |guest receive               | guest send
         |                            |
+--------+----------------------------v------------------------+
|                                                              |                          NOTE: filter direction is rx/tx/all
|                         tap                                  |                          rx:receive packets sent to the netdev
|                                                              |                          tx:receive packets sent by the netdev
+--------------------------------------------------------------+

COLO-compare does the packet comparison.
Packets coming from the primary char indev will be sent to outdev.
Packets coming from the secondary char dev will be dropped after comparing.
colo-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id (source: primary send packet)
secondary_in=chardev2-id (source: secondary send packet)
outdev=chardev3-id
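
As a rough illustration of the flow described above, the sketch below
compares one primary packet with one secondary packet byte for byte. It is
not the code added by this patch (which only does initialization):
DemoPacket, DemoVerdict and the demo_* functions are hypothetical, and
treating any mismatch as a reason to request a checkpoint is an assumption
based on the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet view; not the Packet struct from net/colo.h. */
typedef struct DemoPacket {
    const unsigned char *data;
    size_t len;
} DemoPacket;

typedef enum DemoVerdict {
    DEMO_FORWARD_PRIMARY,   /* packets match: send the primary one to outdev */
    DEMO_REQUEST_CHECKPOINT /* packets differ: ask COLO for a checkpoint */
} DemoVerdict;

/* Byte-for-byte comparison; two empty packets count as equal. */
static bool demo_packets_match(const DemoPacket *pri, const DemoPacket *sec)
{
    assert(pri != NULL && sec != NULL);
    if (pri->len != sec->len) {
        return false;
    }
    return pri->len == 0 || memcmp(pri->data, sec->data, pri->len) == 0;
}

/* In both cases the secondary packet itself is never forwarded. */
static DemoVerdict demo_compare(const DemoPacket *pri, const DemoPacket *sec)
{
    return demo_packets_match(pri, sec) ? DEMO_FORWARD_PRIMARY
                                        : DEMO_REQUEST_CHECKPOINT;
}
```

A caller would feed it one packet from primary_in and one from secondary_in,
forward the primary packet to outdev on DEMO_FORWARD_PRIMARY, and drop the
secondary packet either way.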

usage:

primary:
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0

secondary:
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/Makefile.objs  |   1 +
 net/colo-compare.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-options.hx    |  39 ++++++++
 vl.c               |   3 +-
 4 files changed, 326 insertions(+), 1 deletion(-)
 create mode 100644 net/colo-compare.c

diff --git a/net/Makefile.objs b/net/Makefile.objs
index b7c22fd..ba92f73 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
 common-obj-y += filter.o
 common-obj-y += filter-buffer.o
 common-obj-y += filter-mirror.o
+common-obj-y += colo-compare.o
diff --git a/net/colo-compare.c b/net/colo-compare.c
new file mode 100644
index 0000000..cdc3e0e
--- /dev/null
+++ b/net/colo-compare.c
@@ -0,0 +1,284 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu-common.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi/error.h"
+#include "net/net.h"
+#include "net/vhost_net.h"
+#include "qom/object_interfaces.h"
+#include "qemu/iov.h"
+#include "qom/object.h"
+#include "qemu/typedefs.h"
+#include "net/queue.h"
+#include "sysemu/char.h"
+#include "qemu/sockets.h"
+#include "qapi-visit.h"
+
+#define TYPE_COLO_COMPARE "colo-compare"
+#define COLO_COMPARE(obj) \
+    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
+
+#define COMPARE_READ_LEN_MAX NET_BUFSIZE
+
+typedef struct CompareState {
+    Object parent;
+
+    char *pri_indev;
+    char *sec_indev;
+    char *outdev;
+    CharDriverState *chr_pri_in;
+    CharDriverState *chr_sec_in;
+    CharDriverState *chr_out;
+    QTAILQ_ENTRY(CompareState) next;
+    SocketReadState pri_rs;
+    SocketReadState sec_rs;
+} CompareState;
+
+typedef struct CompareClass {
+    ObjectClass parent_class;
+} CompareClass;
+
+typedef struct CompareChardevProps {
+    bool is_socket;
+    bool is_unix;
+} CompareChardevProps;
+
+static char *compare_get_pri_indev(Object *obj, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    return g_strdup(s->pri_indev);
+}
+
+static void compare_set_pri_indev(Object *obj, const char *value, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    g_free(s->pri_indev);
+    s->pri_indev = g_strdup(value);
+}
+
+static char *compare_get_sec_indev(Object *obj, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    return g_strdup(s->sec_indev);
+}
+
+static void compare_set_sec_indev(Object *obj, const char *value, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    g_free(s->sec_indev);
+    s->sec_indev = g_strdup(value);
+}
+
+static char *compare_get_outdev(Object *obj, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    return g_strdup(s->outdev);
+}
+
+static void compare_set_outdev(Object *obj, const char *value, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    g_free(s->outdev);
+    s->outdev = g_strdup(value);
+}
+
+static void compare_pri_rs_finalize(SocketReadState *pri_rs)
+{
+    /* if packet_enqueue pri pkt failed we will send unsupported packet */
+}
+
+static void compare_sec_rs_finalize(SocketReadState *sec_rs)
+{
+    /* if packet_enqueue sec pkt failed we will notify trace */
+}
+
+static int compare_chardev_opts(void *opaque,
+                                const char *name, const char *value,
+                                Error **errp)
+{
+    CompareChardevProps *props = opaque;
+
+    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
+        props->is_socket = true;
+    } else if (strcmp(name, "host") == 0) {
+        props->is_unix = true;
+    } else if (strcmp(name, "port") == 0) {
+    } else if (strcmp(name, "server") == 0) {
+    } else if (strcmp(name, "wait") == 0) {
+    } else {
+        error_setg(errp,
+                   "COLO-compare does not support a chardev with option %s=%s",
+                   name, value);
+        return -1;
+    }
+    return 0;
+}
+
+/*
+ * called from the main thread on the primary
+ * to setup colo-compare.
+ */
+static void colo_compare_complete(UserCreatable *uc, Error **errp)
+{
+    CompareState *s = COLO_COMPARE(uc);
+    CompareChardevProps props;
+
+    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
+        error_setg(errp, "colo compare needs 'primary_in' ,"
+                   "'secondary_in','outdev' property set");
+        return;
+    } else if (!strcmp(s->pri_indev, s->outdev) ||
+               !strcmp(s->sec_indev, s->outdev) ||
+               !strcmp(s->pri_indev, s->sec_indev)) {
+        error_setg(errp, "'indev' and 'outdev' could not be same "
+                   "for compare module");
+        return;
+    }
+
+    s->chr_pri_in = qemu_chr_find(s->pri_indev);
+    if (s->chr_pri_in == NULL) {
+        error_setg(errp, "Primary IN Device '%s' not found",
+                   s->pri_indev);
+        return;
+    }
+
+    /* inspect chardev opts */
+    memset(&props, 0, sizeof(props));
+    if (qemu_opt_foreach(s->chr_pri_in->opts, compare_chardev_opts, &props, errp)) {
+        return;
+    }
+
+    if (!props.is_socket || !props.is_unix) {
+        error_setg(errp, "chardev \"%s\" is not a unix socket",
+                   s->pri_indev);
+        return;
+    }
+
+    s->chr_sec_in = qemu_chr_find(s->sec_indev);
+    if (s->chr_sec_in == NULL) {
+        error_setg(errp, "Secondary IN Device '%s' not found",
+                   s->sec_indev);
+        return;
+    }
+
+    memset(&props, 0, sizeof(props));
+    if (qemu_opt_foreach(s->chr_sec_in->opts, compare_chardev_opts, &props, errp)) {
+        return;
+    }
+
+    if (!props.is_socket || !props.is_unix) {
+        error_setg(errp, "chardev \"%s\" is not a unix socket",
+                   s->sec_indev);
+        return;
+    }
+
+    s->chr_out = qemu_chr_find(s->outdev);
+    if (s->chr_out == NULL) {
+        error_setg(errp, "OUT Device '%s' not found", s->outdev);
+        return;
+    }
+
+    memset(&props, 0, sizeof(props));
+    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, &props, errp)) {
+        return;
+    }
+
+    if (!props.is_socket || !props.is_unix) {
+        error_setg(errp, "chardev \"%s\" is not a unix socket",
+                   s->outdev);
+        return;
+    }
+
+    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
+
+    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
+
+    qemu_chr_fe_claim_no_fail(s->chr_out);
+
+    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
+    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
+
+    return;
+}
+
+static void colo_compare_class_init(ObjectClass *oc, void *data)
+{
+    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
+
+    ucc->complete = colo_compare_complete;
+}
+
+static void colo_compare_init(Object *obj)
+{
+    object_property_add_str(obj, "primary_in",
+                            compare_get_pri_indev, compare_set_pri_indev,
+                            NULL);
+    object_property_add_str(obj, "secondary_in",
+                            compare_get_sec_indev, compare_set_sec_indev,
+                            NULL);
+    object_property_add_str(obj, "outdev",
+                            compare_get_outdev, compare_set_outdev,
+                            NULL);
+}
+
+static void colo_compare_finalize(Object *obj)
+{
+    CompareState *s = COLO_COMPARE(obj);
+
+    if (s->chr_pri_in) {
+        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
+        qemu_chr_fe_release(s->chr_pri_in);
+    }
+    if (s->chr_sec_in) {
+        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
+        qemu_chr_fe_release(s->chr_sec_in);
+    }
+    if (s->chr_out) {
+        qemu_chr_fe_release(s->chr_out);
+    }
+
+    g_free(s->pri_indev);
+    g_free(s->sec_indev);
+    g_free(s->outdev);
+}
+
+static const TypeInfo colo_compare_info = {
+    .name = TYPE_COLO_COMPARE,
+    .parent = TYPE_OBJECT,
+    .instance_size = sizeof(CompareState),
+    .instance_init = colo_compare_init,
+    .instance_finalize = colo_compare_finalize,
+    .class_size = sizeof(CompareClass),
+    .class_init = colo_compare_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_USER_CREATABLE },
+        { }
+    }
+};
+
+static void register_types(void)
+{
+    type_register_static(&colo_compare_info);
+}
+
+type_init(register_types);
diff --git a/qemu-options.hx b/qemu-options.hx
index 587de8f..33d5d0b 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} to the file specified by
 The file format is libpcap, so it can be analyzed with tools such as tcpdump
 or Wireshark.
 
+@item -object colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
+outdev=@var{chardevid}
+
+Colo-compare gets packets from primary_in@var{chardevid} and secondary_in@var{chardevid}, then compares each primary packet with
+the corresponding secondary packet. If the packets are the same, it outputs the primary
+packet to outdev@var{chardevid}; otherwise it notifies the COLO frame
+to do a checkpoint and then sends the primary packet to outdev@var{chardevid}.
+
+It is used together with filter-mirror and filter-redirector.
+
+@example
+
+primary:
+-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
+-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
+-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
+-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
+-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
+-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
+-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
+-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
+-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
+-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
+-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
+-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
+
+secondary:
+-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
+-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
+-chardev socket,id=red0,host=3.3.3.3,port=9003
+-chardev socket,id=red1,host=3.3.3.3,port=9004
+-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
+-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
+
+@end example
+
+For details of the command line above, see
+the colo-compare git log.
+
 @item -object secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
 @item -object secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
 
diff --git a/vl.c b/vl.c
index cbe51ac..c6b9a6f 100644
--- a/vl.c
+++ b/vl.c
@@ -2865,7 +2865,8 @@ static bool object_create_initial(const char *type)
     if (g_str_equal(type, "filter-buffer") ||
         g_str_equal(type, "filter-dump") ||
         g_str_equal(type, "filter-mirror") ||
-        g_str_equal(type, "filter-redirector")) {
+        g_str_equal(type, "filter-redirector") ||
+        g_str_equal(type, "colo-compare")) {
         return false;
     }
 
-- 
2.7.4

* [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 01/10] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  8:04   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu Zhang Chen
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

net/colo.c is shared by colo-compare and filter-rewriter.
It holds the common data structures, such as the network packet,
and the helper functions that both modules use.
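
As an illustration of the shared packet handling described above, here is
a minimal standalone sketch of the copy, early-parse, and destroy steps.
It is not the code from this patch: every Sketch*/sketch_* name is
hypothetical, plain malloc() stands in for the GLib allocators, and the
Ethernet header is assumed to carry no VLAN tag.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ETH_HLEN 14      /* Ethernet header, assuming no VLAN tag */
#define SKETCH_ETH_P_IP 0x0800  /* EtherType for IPv4 */

/* Simplified stand-in for the Packet struct in net/colo.h */
typedef struct SketchPacket {
    uint8_t *data;
    uint8_t *network_header;
    uint8_t *transport_header;
    int size;
} SketchPacket;

/* Take a private copy of the frame so the read buffer can be reused */
static SketchPacket *sketch_packet_new(const void *data, int size)
{
    SketchPacket *pkt = calloc(1, sizeof(*pkt));

    if (!pkt) {
        return NULL;
    }
    pkt->data = malloc(size);
    if (!pkt->data) {
        free(pkt);
        return NULL;
    }
    memcpy(pkt->data, data, size);
    pkt->size = size;
    return pkt;
}

static void sketch_packet_destroy(SketchPacket *pkt)
{
    if (pkt) {
        free(pkt->data);
        free(pkt);
    }
}

/* Return 0 for an IPv4 frame with a complete IP header, 1 otherwise */
static int sketch_parse_early(SketchPacket *pkt)
{
    uint16_t l3_proto;
    int ip_hdr_len;

    /* Check the length before touching any header byte */
    if (pkt->size < SKETCH_ETH_HLEN + 1) {
        return 1;
    }
    l3_proto = (uint16_t)(pkt->data[12] << 8 | pkt->data[13]);
    if (l3_proto != SKETCH_ETH_P_IP) {
        return 1;
    }
    pkt->network_header = pkt->data + SKETCH_ETH_HLEN;
    /* The low nibble of the first IP byte is the header length in words */
    ip_hdr_len = (pkt->network_header[0] & 0x0f) * 4;
    if (ip_hdr_len < 20 || pkt->size < SKETCH_ETH_HLEN + ip_hdr_len) {
        return 1;
    }
    pkt->transport_header = pkt->network_header + ip_hdr_len;
    return 0;
}
```

A caller would create the packet from the socket read buffer, call
sketch_parse_early(), and destroy the packet when the parse returns
non-zero, which mirrors how packet_enqueue() treats unsupported packets.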

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/Makefile.objs  |   1 +
 net/colo-compare.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 net/colo.c         |  70 +++++++++++++++++++++++++++++++++
 net/colo.h         |  38 ++++++++++++++++++
 trace-events       |   3 ++
 5 files changed, 223 insertions(+), 2 deletions(-)
 create mode 100644 net/colo.c
 create mode 100644 net/colo.h

diff --git a/net/Makefile.objs b/net/Makefile.objs
index ba92f73..beb504b 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -17,3 +17,4 @@ common-obj-y += filter.o
 common-obj-y += filter-buffer.o
 common-obj-y += filter-mirror.o
 common-obj-y += colo-compare.o
+common-obj-y += colo.o
diff --git a/net/colo-compare.c b/net/colo-compare.c
index cdc3e0e..d9e4459 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -27,13 +27,38 @@
 #include "sysemu/char.h"
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
+#include "net/colo.h"
+#include "trace.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
+#define MAX_QUEUE_SIZE 1024
 
+/*
+  + CompareState ++
+  |               |
+  +---------------+   +---------------+         +---------------+
+  |conn list      +--->conn           +--------->conn           |
+  +---------------+   +---------------+         +---------------+
+  |               |     |           |             |          |
+  +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
+                    |primary |  |secondary    |primary | |secondary
+                    |packet  |  |packet  +    |packet  | |packet  +
+                    +--------+  +--------+    +--------+ +--------+
+                        |           |             |          |
+                    +---v----+  +---v----+    +---v----+ +---v----+
+                    |primary |  |secondary    |primary | |secondary
+                    |packet  |  |packet  +    |packet  | |packet  +
+                    +--------+  +--------+    +--------+ +--------+
+                        |           |             |          |
+                    +---v----+  +---v----+    +---v----+ +---v----+
+                    |primary |  |secondary    |primary | |secondary
+                    |packet  |  |packet  +    |packet  | |packet  +
+                    +--------+  +--------+    +--------+ +--------+
+*/
 typedef struct CompareState {
     Object parent;
 
@@ -46,6 +71,9 @@ typedef struct CompareState {
     QTAILQ_ENTRY(CompareState) next;
     SocketReadState pri_rs;
     SocketReadState sec_rs;
+
+    /* hashtable to save connection */
+    GHashTable *connection_track_table;
 } CompareState;
 
 typedef struct CompareClass {
@@ -57,6 +85,76 @@ typedef struct CompareChardevProps {
     bool is_unix;
 } CompareChardevProps;
 
+enum {
+    PRIMARY_IN = 0,
+    SECONDARY_IN,
+};
+
+static int compare_chr_send(CharDriverState *out,
+                            const uint8_t *buf,
+                            uint32_t size);
+
+/*
+ * Return 0 on success; return -1 if the pkt
+ * is unsupported (ARP and IPv6) and will be sent later
+ */
+static int packet_enqueue(CompareState *s, int mode)
+{
+    Packet *pkt = NULL;
+
+    if (mode == PRIMARY_IN) {
+        pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
+    } else {
+        pkt = packet_new(s->sec_rs.buf, s->sec_rs.packet_len);
+    }
+
+    if (parse_packet_early(pkt)) {
+        packet_destroy(pkt, NULL);
+        pkt = NULL;
+        return -1;
+    }
+    /* TODO: get connection key from pkt */
+
+    /*
+     * TODO: use connection key get conn from
+     * connection_track_table
+     */
+
+    /*
+     * TODO: insert pkt to it's conn->primary_list
+     * or conn->secondary_list
+     */
+
+    return 0;
+}
+
+static int compare_chr_send(CharDriverState *out,
+                            const uint8_t *buf,
+                            uint32_t size)
+{
+    int ret = 0;
+    uint32_t len = htonl(size);
+
+    if (!size) {
+        return 0;
+    }
+
+    ret = qemu_chr_fe_write_all(out, (uint8_t *)&len, sizeof(len));
+    if (ret != sizeof(len)) {
+        goto err;
+    }
+
+    ret = qemu_chr_fe_write_all(out, (uint8_t *)buf, size);
+    if (ret != size) {
+        goto err;
+    }
+
+    return 0;
+
+err:
+    return ret < 0 ? ret : -EIO;
+}
+
 static char *compare_get_pri_indev(Object *obj, Error **errp)
 {
     CompareState *s = COLO_COMPARE(obj);
@@ -104,12 +202,21 @@ static void compare_set_outdev(Object *obj, const char *value, Error **errp)
 
 static void compare_pri_rs_finalize(SocketReadState *pri_rs)
 {
-    /* if packet_enqueue pri pkt failed we will send unsupported packet */
+    CompareState *s = container_of(pri_rs, CompareState, pri_rs);
+
+    if (packet_enqueue(s, PRIMARY_IN)) {
+        trace_colo_compare_main("primary: unsupported packet in");
+        compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
+    }
 }
 
 static void compare_sec_rs_finalize(SocketReadState *sec_rs)
 {
-    /* if packet_enqueue sec pkt failed we will notify trace */
+    CompareState *s = container_of(sec_rs, CompareState, sec_rs);
+
+    if (packet_enqueue(s, SECONDARY_IN)) {
+        trace_colo_compare_main("secondary: unsupported packet in");
+    }
 }
 
 static int compare_chardev_opts(void *opaque,
@@ -218,6 +325,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
+    /* use g_hash_table_new_full() to new a hashtable */
+
     return;
 }
 
diff --git a/net/colo.c b/net/colo.c
new file mode 100644
index 0000000..4daedd4
--- /dev/null
+++ b/net/colo.c
@@ -0,0 +1,70 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "net/colo.h"
+
+int parse_packet_early(Packet *pkt)
+{
+    int network_length;
+    uint8_t *data = pkt->data;
+    uint16_t l3_proto;
+    ssize_t l2hdr_len = eth_get_l2_hdr_length(data);
+
+    if (pkt->size < ETH_HLEN) {
+        error_report("pkt->size < ETH_HLEN");
+        return 1;
+    }
+    pkt->network_header = data + ETH_HLEN;
+    l3_proto = eth_get_l3_proto(data, l2hdr_len);
+    if (l3_proto != ETH_P_IP) {
+        return 1;
+    }
+
+    network_length = pkt->ip->ip_hl * 4;
+    if (pkt->size < ETH_HLEN + network_length) {
+        error_report("pkt->size < network_header + network_length");
+        return 1;
+    }
+    pkt->transport_header = pkt->network_header + network_length;
+
+    return 0;
+}
+
+Packet *packet_new(const void *data, int size)
+{
+    Packet *pkt = g_slice_new(Packet);
+
+    pkt->data = g_memdup(data, size);
+    pkt->size = size;
+
+    return pkt;
+}
+
+void packet_destroy(void *opaque, void *user_data)
+{
+    Packet *pkt = opaque;
+
+    g_free(pkt->data);
+    g_slice_free(Packet, pkt);
+}
+
+/*
+ * Clear hashtable, stop this hash growing really huge
+ */
+void connection_hashtable_reset(GHashTable *connection_track_table)
+{
+    g_hash_table_remove_all(connection_track_table);
+}
diff --git a/net/colo.h b/net/colo.h
new file mode 100644
index 0000000..8559f28
--- /dev/null
+++ b/net/colo.h
@@ -0,0 +1,38 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_BASE_H
+#define QEMU_COLO_BASE_H
+
+#include "slirp/slirp.h"
+#include "qemu/jhash.h"
+
+#define HASHTABLE_MAX_SIZE 16384
+
+typedef struct Packet {
+    void *data;
+    union {
+        uint8_t *network_header;
+        struct ip *ip;
+    };
+    uint8_t *transport_header;
+    int size;
+} Packet;
+
+int parse_packet_early(Packet *pkt);
+void connection_hashtable_reset(GHashTable *connection_track_table);
+Packet *packet_new(const void *data, int size);
+void packet_destroy(void *opaque, void *user_data);
+
+#endif /* QEMU_COLO_BASE_H */
diff --git a/trace-events b/trace-events
index ca7211b..703de1a 100644
--- a/trace-events
+++ b/trace-events
@@ -1916,3 +1916,6 @@ aspeed_vic_update_fiq(int flags) "Raising FIQ: %d"
 aspeed_vic_update_irq(int flags) "Raising IRQ: %d"
 aspeed_vic_read(uint64_t offset, unsigned size, uint32_t value) "From 0x%" PRIx64 " of size %u: 0x%" PRIx32
 aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64 " of size %u: 0x%" PRIx32
+
+# net/colo-compare.c
+colo_compare_main(const char *chr) ": %s"
-- 
2.7.4

* [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (2 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  8:05   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet Zhang Chen
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

Jhash is used by colo-compare and filter-rewriter
to save and look up network connection info.
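
To show how the two mixing macros combine into a hash for a connection
lookup, here is a minimal standalone sketch. The mix and final rounds are
copied from the header below; sketch_rol32(), SketchKey, and
sketch_key_hash() are hypothetical names, and the key layout is an
assumption (it is not packed), so the values will not match what QEMU
computes.

```c
#include <stdint.h>

/* Rotate left; shift is always in the range 1..31 here */
static inline uint32_t sketch_rol32(uint32_t word, unsigned int shift)
{
    return (word << shift) | (word >> (32 - shift));
}

/* Mix three 32-bit values reversibly (same rounds as __jhash_mix) */
#define SKETCH_JHASH_MIX(a, b, c)                      \
do {                                                   \
    a -= c;  a ^= sketch_rol32(c, 4);  c += b;         \
    b -= a;  b ^= sketch_rol32(a, 6);  a += c;         \
    c -= b;  c ^= sketch_rol32(b, 8);  b += a;         \
    a -= c;  a ^= sketch_rol32(c, 16); c += b;         \
    b -= a;  b ^= sketch_rol32(a, 19); a += c;         \
    c -= b;  c ^= sketch_rol32(b, 4);  b += a;         \
} while (0)

/* Final mixing of (a, b, c) into c (same rounds as __jhash_final) */
#define SKETCH_JHASH_FINAL(a, b, c)            \
do {                                           \
    c ^= b; c -= sketch_rol32(b, 14);          \
    a ^= c; a -= sketch_rol32(c, 11);          \
    b ^= a; b -= sketch_rol32(a, 25);          \
    c ^= b; c -= sketch_rol32(b, 16);          \
    a ^= c; a -= sketch_rol32(c, 4);           \
    b ^= a; b -= sketch_rol32(a, 14);          \
    c ^= b; c -= sketch_rol32(b, 24);          \
} while (0)

#define SKETCH_JHASH_INITVAL 0xdeadbeef

/* Hypothetical connection key: addresses, ports and IP protocol */
typedef struct SketchKey {
    uint32_t src;
    uint32_t dst;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t ip_proto;
} SketchKey;

static uint32_t sketch_key_hash(const SketchKey *key)
{
    uint32_t a, b, c;

    a = b = c = SKETCH_JHASH_INITVAL + (uint32_t)sizeof(*key);
    a += key->src;
    b += key->dst;
    /* Widen before shifting so the port never lands in a sign bit */
    c += (uint32_t)key->src_port | ((uint32_t)key->dst_port << 16);
    SKETCH_JHASH_MIX(a, b, c);

    a += key->ip_proto;
    SKETCH_JHASH_FINAL(a, b, c);

    return c;
}
```

The hash reads the key field by field instead of hashing raw bytes, so
struct padding never influences the result.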

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 include/qemu/jhash.h | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100644 include/qemu/jhash.h

diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
new file mode 100644
index 0000000..7222242
--- /dev/null
+++ b/include/qemu/jhash.h
@@ -0,0 +1,59 @@
+/* jhash.h: Jenkins hash support.
+  *
+  * Copyright (C) 2006. Bob Jenkins (bob_jenkins@burtleburtle.net)
+  *
+  * http://burtleburtle.net/bob/hash/
+  *
+  * These are the credits from Bob's sources:
+  *
+  * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
+  *
+  * These are functions for producing 32-bit hashes for hash table lookup.
+  * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
+  * are externally useful functions.  Routines to test the hash are included
+  * if SELF_TEST is defined.  You can use this free for any purpose. It's in
+  * the public domain.  It has no warranty.
+  *
+  * Copyright (C) 2009-2010 Jozsef Kadlecsik (kadlec@blackhole.kfki.hu)
+  *
+  * I've modified Bob's hash to be useful in the Linux kernel, and
+  * any bugs present are my fault.
+  * Jozsef
+  */
+
+#ifndef QEMU_JHASH_H__
+#define QEMU_JHASH_H__
+
+#include "qemu/bitops.h"
+
+/*
+ * hashtable relation copy from linux kernel jhash
+ */
+
+/* __jhash_mix -- mix 3 32-bit values reversibly. */
+#define __jhash_mix(a, b, c)                \
+{                                           \
+    a -= c;  a ^= rol32(c, 4);  c += b;     \
+    b -= a;  b ^= rol32(a, 6);  a += c;     \
+    c -= b;  c ^= rol32(b, 8);  b += a;     \
+    a -= c;  a ^= rol32(c, 16); c += b;     \
+    b -= a;  b ^= rol32(a, 19); a += c;     \
+    c -= b;  c ^= rol32(b, 4);  b += a;     \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a, b, c)  \
+{                               \
+    c ^= b; c -= rol32(b, 14);  \
+    a ^= c; a -= rol32(c, 11);  \
+    b ^= a; b -= rol32(a, 25);  \
+    c ^= b; c -= rol32(b, 16);  \
+    a ^= c; a -= rol32(c, 4);   \
+    b ^= a; b -= rol32(a, 14);  \
+    c ^= b; c -= rol32(b, 24);  \
+}
+
+/* An arbitrary initial parameter */
+#define JHASH_INITVAL           0xdeadbeef
+
+#endif /* QEMU_JHASH_H__ */
-- 
2.7.4

* [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (3 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  8:52   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread Zhang Chen
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

In this patch we use a hash table keyed with the kernel jhash to track
connections, and then enqueue network packets like this:

+ CompareState ++
|               |
+---------------+   +---------------+         +---------------+
|conn list      +--->conn           +--------->conn           |
+---------------+   +---------------+         +---------------+
|               |     |           |             |          |
+---------------+ +---v----+  +---v----+    +---v----+ +---v----+
                  |primary |  |secondary    |primary | |secondary
                  |packet  |  |packet  +    |packet  | |packet  +
                  +--------+  +--------+    +--------+ +--------+
                      |           |             |          |
                  +---v----+  +---v----+    +---v----+ +---v----+
                  |primary |  |secondary    |primary | |secondary
                  |packet  |  |packet  +    |packet  | |packet  +
                  +--------+  +--------+    +--------+ +--------+
                      |           |             |          |
                  +---v----+  +---v----+    +---v----+ +---v----+
                  |primary |  |secondary    |primary | |secondary
                  |packet  |  |packet  +    |packet  | |packet  +
                  +--------+  +--------+    +--------+ +--------+

We use conn_list to record connection info.
When we want to enqueue a packet, we first get the
connection from connection_track_table, then push
the packet onto the g_queue (pri/sec) of its own conn.
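
The lookup-then-enqueue flow above can be sketched in standalone C as
follows. This is an illustration under simplifying assumptions, not the
code in this patch: a small linear table stands in for the GHashTable,
fixed arrays of packet ids stand in for the two GQueues, and all
Sketch*/sketch_* names and the two size limits are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CONNS 8   /* stand-in for HASHTABLE_MAX_SIZE */
#define SKETCH_MAX_QUEUE 4   /* stand-in for MAX_QUEUE_SIZE */

enum { SKETCH_PRIMARY_IN = 0, SKETCH_SECONDARY_IN };

typedef struct SketchConnKey {
    uint32_t src, dst;
    uint16_t src_port, dst_port;
    uint8_t ip_proto;
} SketchConnKey;

typedef struct SketchConn {
    bool used;
    SketchConnKey key;
    int queue[2][SKETCH_MAX_QUEUE];  /* packet ids, one queue per side */
    int queue_len[2];
} SketchConn;

typedef struct SketchTable {
    SketchConn conns[SKETCH_MAX_CONNS];
} SketchTable;

/* Compare field by field so struct padding is never inspected */
static bool sketch_key_equal(const SketchConnKey *a, const SketchConnKey *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->ip_proto == b->ip_proto;
}

/* Return the connection for key, creating it if it is not tracked yet */
static SketchConn *sketch_connection_get(SketchTable *t,
                                         const SketchConnKey *key)
{
    SketchConn *free_slot = NULL;
    int i;

    for (i = 0; i < SKETCH_MAX_CONNS; i++) {
        SketchConn *c = &t->conns[i];

        if (c->used && sketch_key_equal(&c->key, key)) {
            return c;
        }
        if (!c->used && !free_slot) {
            free_slot = c;
        }
    }
    if (!free_slot) {
        /* Table full: drop every tracked connection and start over */
        memset(t, 0, sizeof(*t));
        free_slot = &t->conns[0];
    }
    memset(free_slot, 0, sizeof(*free_slot));
    free_slot->used = true;
    free_slot->key = *key;
    return free_slot;
}

/* Return 0 when queued, -1 when the per-side queue is full */
static int sketch_enqueue(SketchTable *t, const SketchConnKey *key,
                          int mode, int pkt_id)
{
    SketchConn *c = sketch_connection_get(t, key);

    if (c->queue_len[mode] >= SKETCH_MAX_QUEUE) {
        return -1;
    }
    c->queue[mode][c->queue_len[mode]++] = pkt_id;
    return 0;
}
```

In this sketch the caller still owns a packet that could not be queued,
so a -1 return is the point where it would be freed.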

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo-compare.c |  51 ++++++++++++++++++-----
 net/colo.c         | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.h         |  27 +++++++++++++
 3 files changed, 185 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index d9e4459..bab215b 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -72,6 +72,11 @@ typedef struct CompareState {
     SocketReadState pri_rs;
     SocketReadState sec_rs;
 
+    /* connection list: the connections belonging to this NIC can be found
+     * in this list.
+     * element type: Connection
+     */
+    GQueue conn_list;
     /* hashtable to save connection */
     GHashTable *connection_track_table;
 } CompareState;
@@ -100,7 +105,9 @@ static int compare_chr_send(CharDriverState *out,
  */
 static int packet_enqueue(CompareState *s, int mode)
 {
+    ConnectionKey key = {{ 0 } };
     Packet *pkt = NULL;
+    Connection *conn;
 
     if (mode == PRIMARY_IN) {
         pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
@@ -113,17 +120,34 @@ static int packet_enqueue(CompareState *s, int mode)
         pkt = NULL;
         return -1;
     }
-    /* TODO: get connection key from pkt */
+    fill_connection_key(pkt, &key);
 
-    /*
-     * TODO: use connection key get conn from
-     * connection_track_table
-     */
+    conn = connection_get(s->connection_track_table,
+                          &key,
+                          &s->conn_list);
 
-    /*
-     * TODO: insert pkt to it's conn->primary_list
-     * or conn->secondary_list
-     */
+    if (!conn->processing) {
+        g_queue_push_tail(&s->conn_list, conn);
+        conn->processing = true;
+    }
+
+    if (mode == PRIMARY_IN) {
+        if (g_queue_get_length(&conn->primary_list) <
+                               MAX_QUEUE_SIZE) {
+            g_queue_push_tail(&conn->primary_list, pkt);
+        } else {
+            error_report("colo compare primary queue size too big, "
+            "drop packet");
+        }
+    } else {
+        if (g_queue_get_length(&conn->secondary_list) <
+                               MAX_QUEUE_SIZE) {
+            g_queue_push_tail(&conn->secondary_list, pkt);
+        } else {
+            error_report("colo compare secondary queue size too big, "
+            "drop packet");
+        }
+    }
 
     return 0;
 }
@@ -325,7 +349,12 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
-    /* use g_hash_table_new_full() to new a hashtable */
+    g_queue_init(&s->conn_list);
+
+    s->connection_track_table = g_hash_table_new_full(connection_key_hash,
+                                                      connection_key_equal,
+                                                      g_free,
+                                                      connection_destroy);
 
     return;
 }
@@ -366,6 +395,8 @@ static void colo_compare_finalize(Object *obj)
         qemu_chr_fe_release(s->chr_out);
     }
 
+    g_queue_free(&s->conn_list);
+
     g_free(s->pri_indev);
     g_free(s->sec_indev);
     g_free(s->outdev);
diff --git a/net/colo.c b/net/colo.c
index 4daedd4..bc86553 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -16,6 +16,29 @@
 #include "qemu/error-report.h"
 #include "net/colo.h"
 
+uint32_t connection_key_hash(const void *opaque)
+{
+    const ConnectionKey *key = opaque;
+    uint32_t a, b, c;
+
+    /* Jenkins hash */
+    a = b = c = JHASH_INITVAL + sizeof(*key);
+    a += key->src.s_addr;
+    b += key->dst.s_addr;
+    c += (key->src_port | key->dst_port << 16);
+    __jhash_mix(a, b, c);
+
+    a += key->ip_proto;
+    __jhash_final(a, b, c);
+
+    return c;
+}
+
+int connection_key_equal(const void *key1, const void *key2)
+{
+    return memcmp(key1, key2, sizeof(ConnectionKey)) == 0;
+}
+
 int parse_packet_early(Packet *pkt)
 {
     int network_length;
@@ -43,6 +66,62 @@ int parse_packet_early(Packet *pkt)
     return 0;
 }
 
+void fill_connection_key(Packet *pkt, ConnectionKey *key)
+{
+    uint32_t tmp_ports;
+
+    key->ip_proto = pkt->ip->ip_p;
+
+    switch (key->ip_proto) {
+    case IPPROTO_TCP:
+    case IPPROTO_UDP:
+    case IPPROTO_DCCP:
+    case IPPROTO_ESP:
+    case IPPROTO_SCTP:
+    case IPPROTO_UDPLITE:
+        tmp_ports = *(uint32_t *)(pkt->transport_header);
+        key->src = pkt->ip->ip_src;
+        key->dst = pkt->ip->ip_dst;
+        key->src_port = ntohs(tmp_ports & 0xffff);
+        key->dst_port = ntohs(tmp_ports >> 16);
+        break;
+    case IPPROTO_AH:
+        tmp_ports = *(uint32_t *)(pkt->transport_header + 4);
+        key->src = pkt->ip->ip_src;
+        key->dst = pkt->ip->ip_dst;
+        key->src_port = ntohs(tmp_ports & 0xffff);
+        key->dst_port = ntohs(tmp_ports >> 16);
+        break;
+    default:
+        key->src_port = 0;
+        key->dst_port = 0;
+        break;
+    }
+}
+
+Connection *connection_new(ConnectionKey *key)
+{
+    Connection *conn = g_slice_new(Connection);
+
+    conn->ip_proto = key->ip_proto;
+    conn->processing = false;
+    g_queue_init(&conn->primary_list);
+    g_queue_init(&conn->secondary_list);
+
+    return conn;
+}
+
+void connection_destroy(void *opaque)
+{
+    Connection *conn = opaque;
+
+    g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
+    g_queue_free(&conn->primary_list);
+    g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
+    g_queue_free(&conn->secondary_list);
+    g_slice_free(Connection, conn);
+}
+
 Packet *packet_new(const void *data, int size)
 {
     Packet *pkt = g_slice_new(Packet);
@@ -68,3 +147,41 @@ void connection_hashtable_reset(GHashTable *connection_track_table)
 {
     g_hash_table_remove_all(connection_track_table);
 }
+
+static void colo_rm_connection(void *opaque, void *user_data)
+{
+    connection_destroy(opaque);
+}
+
+/* if not found, create a new connection and add to hash table */
+Connection *connection_get(GHashTable *connection_track_table,
+                           ConnectionKey *key,
+                           GQueue *conn_list)
+{
+    Connection *conn = g_hash_table_lookup(connection_track_table, key);
+    static uint32_t hashtable_size;
+
+    if (conn == NULL) {
+        ConnectionKey *new_key = g_memdup(key, sizeof(*key));
+
+        conn = connection_new(key);
+
+        hashtable_size += 1;
+        if (hashtable_size > HASHTABLE_MAX_SIZE) {
+            error_report("colo proxy connection hashtable full, clear it");
+            connection_hashtable_reset(connection_track_table);
+            /*
+             * clear the conn_list
+             */
+            if (conn_list) {
+                g_queue_foreach(conn_list, colo_rm_connection, NULL);
+            }
+
+            hashtable_size = 0;
+        }
+
+        g_hash_table_insert(connection_track_table, new_key, conn);
+    }
+
+    return conn;
+}
diff --git a/net/colo.h b/net/colo.h
index 8559f28..9cbc14e 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -30,7 +30,34 @@ typedef struct Packet {
     int size;
 } Packet;
 
+typedef struct ConnectionKey {
+    /* (src, dst) must be grouped, in the same way than in IP header */
+    struct in_addr src;
+    struct in_addr dst;
+    uint16_t src_port;
+    uint16_t dst_port;
+    uint8_t ip_proto;
+} QEMU_PACKED ConnectionKey;
+
+typedef struct Connection {
+    /* connection primary send queue: element type: Packet */
+    GQueue primary_list;
+    /* connection secondary send queue: element type: Packet */
+    GQueue secondary_list;
+    /* flag to enqueue unprocessed_connections */
+    bool processing;
+    uint8_t ip_proto;
+} Connection;
+
+uint32_t connection_key_hash(const void *opaque);
+int connection_key_equal(const void *opaque1, const void *opaque2);
 int parse_packet_early(Packet *pkt);
+void fill_connection_key(Packet *pkt, ConnectionKey *key);
+Connection *connection_new(ConnectionKey *key);
+void connection_destroy(void *opaque);
+Connection *connection_get(GHashTable *connection_track_table,
+                           ConnectionKey *key,
+                           GQueue *conn_list);
 void connection_hashtable_reset(GHashTable *connection_track_table);
 Packet *packet_new(const void *data, int size);
 void packet_destroy(void *opaque, void *user_data);
-- 
2.7.4

* [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (4 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  9:13   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison Zhang Chen
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

If the primary packet is the same as the secondary packet,
we send the primary packet and drop the secondary
packet; otherwise we notify the COLO frame to do a checkpoint.
If a primary packet arrives but the secondary packet does not,
then after REGULAR_PACKET_CHECK_MS milliseconds we treat
the primary packet as an old packet and do a checkpoint.
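
The two decisions described above, "are the packets the same" and "has
this primary packet waited too long", can be sketched in standalone C as
follows. The Sketch*/sketch_* names are hypothetical and the clock value
is passed in by the caller, so this illustrates the rule rather than the
code in this patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Same value as REGULAR_PACKET_CHECK_MS in this patch */
#define SKETCH_PACKET_CHECK_MS 3000

typedef struct SketchPkt {
    const uint8_t *data;
    int size;
    int64_t creation_ms;  /* when the packet was queued */
} SketchPkt;

/* Return 0 when both packets carry identical bytes, non-zero otherwise */
static int sketch_packet_compare(const SketchPkt *ppkt,
                                 const SketchPkt *spkt)
{
    if (ppkt->size != spkt->size) {
        return -1;
    }
    return memcmp(ppkt->data, spkt->data, (size_t)spkt->size);
}

/* A primary packet is old once it has waited longer than the limit */
static bool sketch_packet_is_old(const SketchPkt *ppkt, int64_t now_ms)
{
    return (now_ms - ppkt->creation_ms) > SKETCH_PACKET_CHECK_MS;
}
```

A non-zero compare result or an old primary packet is the condition
under which a checkpoint would be requested.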

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo-compare.c | 216 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo.c         |   1 +
 net/colo.h         |   3 +
 trace-events       |   2 +
 4 files changed, 222 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index bab215b..b90cf1f 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -36,6 +36,8 @@
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
+/* TODO: Should be configurable */
+#define REGULAR_PACKET_CHECK_MS 3000
 
 /*
   + CompareState ++
@@ -79,6 +81,10 @@ typedef struct CompareState {
     GQueue conn_list;
     /* hashtable to save connection */
     GHashTable *connection_track_table;
+    /* compare thread, a thread for each NIC */
+    QemuThread thread;
+    /* Timer used on the primary to find packets that are never matched */
+    QEMUTimer *timer;
 } CompareState;
 
 typedef struct CompareClass {
@@ -152,6 +158,113 @@ static int packet_enqueue(CompareState *s, int mode)
     return 0;
 }
 
+/*
+ * The IP packets sent by primary and secondary
+ * will be compared in here
+ * TODO support ip fragment, Out-Of-Order
+ * return:    0  means packet same
+ *            > 0 || < 0 means packet different
+ */
+static int colo_packet_compare(Packet *ppkt, Packet *spkt)
+{
+    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
+                               inet_ntoa(ppkt->ip->ip_dst), spkt->size,
+                               inet_ntoa(spkt->ip->ip_src),
+                               inet_ntoa(spkt->ip->ip_dst));
+
+    if (ppkt->size == spkt->size) {
+        return memcmp(ppkt->data, spkt->data, spkt->size);
+    } else {
+        return -1;
+    }
+}
+
+static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+{
+    trace_colo_compare_main("compare all");
+    return colo_packet_compare(ppkt, spkt);
+}
+
+static void colo_old_packet_check_one(void *opaque_packet,
+                                      void *opaque_found)
+{
+    int64_t now;
+    bool *found_old = (bool *)opaque_found;
+    Packet *ppkt = (Packet *)opaque_packet;
+
+    if (*found_old) {
+        /* Someone found an old packet earlier in the queue */
+        return;
+    }
+
+    now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) {
+        trace_colo_old_packet_check_found(ppkt->creation_ms);
+        *found_old = true;
+    }
+}
+
+static void colo_old_packet_check_one_conn(void *opaque,
+                                           void *user_data)
+{
+    bool found_old = false;
+    Connection *conn = opaque;
+
+    g_queue_foreach(&conn->primary_list, colo_old_packet_check_one,
+                    &found_old);
+    if (found_old) {
+        /* doing a checkpoint will flush the old packet */
+        /* TODO: colo_notify_checkpoint();*/
+    }
+}
+
+/*
+ * Look for old packets that the secondary hasn't matched,
+ * if we have some then we have to checkpoint to wake
+ * the secondary up.
+ */
+static void colo_old_packet_check(void *opaque)
+{
+    CompareState *s = opaque;
+
+    g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL);
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare connection
+ */
+static void colo_compare_connection(void *opaque, void *user_data)
+{
+    CompareState *s = user_data;
+    Connection *conn = opaque;
+    Packet *pkt = NULL;
+    GList *result = NULL;
+    int ret;
+
+    while (!g_queue_is_empty(&conn->primary_list) &&
+           !g_queue_is_empty(&conn->secondary_list)) {
+        pkt = g_queue_pop_tail(&conn->primary_list);
+        result = g_queue_find_custom(&conn->secondary_list,
+                              pkt, (GCompareFunc)colo_packet_compare_all);
+
+        if (result) {
+            ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
+            if (ret < 0) {
+                error_report("colo_send_primary_packet failed");
+            }
+            trace_colo_compare_main("packet same and release packet");
+            g_queue_remove(&conn->secondary_list, result->data);
+            packet_destroy(pkt, NULL);
+        } else {
+            trace_colo_compare_main("packet different");
+            g_queue_push_tail(&conn->primary_list, pkt);
+            /* TODO: colo_notify_checkpoint();*/
+            break;
+        }
+    }
+}
+
 static int compare_chr_send(CharDriverState *out,
                             const uint8_t *buf,
                             uint32_t size)
@@ -179,6 +292,65 @@ err:
     return ret < 0 ? ret : -EIO;
 }
 
+static int compare_chr_can_read(void *opaque)
+{
+    return COMPARE_READ_LEN_MAX;
+}
+
+/*
+ * called from the main thread on the primary for packets
+ * arriving over the socket from the primary.
+ */
+static void compare_pri_chr_in(void *opaque, const uint8_t *buf, int size)
+{
+    CompareState *s = COLO_COMPARE(opaque);
+    int ret;
+
+    ret = net_fill_rstate(&s->pri_rs, buf, size);
+    if (ret == -1) {
+        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
+        error_report("colo-compare primary_in error");
+    }
+}
+
+/*
+ * called from the main thread on the primary for packets
+ * arriving over the socket from the secondary.
+ */
+static void compare_sec_chr_in(void *opaque, const uint8_t *buf, int size)
+{
+    CompareState *s = COLO_COMPARE(opaque);
+    int ret;
+
+    ret = net_fill_rstate(&s->sec_rs, buf, size);
+    if (ret == -1) {
+        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
+        error_report("colo-compare secondary_in error");
+    }
+}
+
+static void *colo_compare_thread(void *opaque)
+{
+    GMainContext *worker_context;
+    GMainLoop *compare_loop;
+    CompareState *s = opaque;
+
+    worker_context = g_main_context_new();
+
+    qemu_chr_add_handlers_full(s->chr_pri_in, compare_chr_can_read,
+                          compare_pri_chr_in, NULL, s, worker_context);
+    qemu_chr_add_handlers_full(s->chr_sec_in, compare_chr_can_read,
+                          compare_sec_chr_in, NULL, s, worker_context);
+
+    compare_loop = g_main_loop_new(worker_context, FALSE);
+
+    g_main_loop_run(compare_loop);
+
+    g_main_loop_unref(compare_loop);
+    g_main_context_unref(worker_context);
+    return NULL;
+}
+
 static char *compare_get_pri_indev(Object *obj, Error **errp)
 {
     CompareState *s = COLO_COMPARE(obj);
@@ -231,6 +403,9 @@ static void compare_pri_rs_finalize(SocketReadState *pri_rs)
     if (packet_enqueue(s, PRIMARY_IN)) {
         trace_colo_compare_main("primary: unsupported packet in");
         compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
+    } else {
+        /* compare connection */
+        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
     }
 }
 
@@ -240,6 +415,9 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)
 
     if (packet_enqueue(s, SECONDARY_IN)) {
         trace_colo_compare_main("secondary: unsupported packet in");
+    } else {
+        /* compare connection */
+        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
     }
 }
 
@@ -266,6 +444,20 @@ static int compare_chardev_opts(void *opaque,
 }
 
 /*
+ * Check old packet regularly so it can watch for any packets
+ * that the secondary hasn't produced equivalents of.
+ */
+static void check_old_packet_regular(void *opaque)
+{
+    CompareState *s = opaque;
+
+    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+              REGULAR_PACKET_CHECK_MS);
+    /* if there is an old packet, we will notify a checkpoint */
+    colo_old_packet_check(s);
+}
+
+/*
  * called from the main thread on the primary
  * to setup colo-compare.
  */
@@ -273,6 +465,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
 {
     CompareState *s = COLO_COMPARE(uc);
     CompareChardevProps props;
+    char thread_name[64];
+    static int compare_id;
 
     if (!s->pri_indev || !s->sec_indev || !s->outdev) {
         error_setg(errp, "colo compare needs 'primary_in' ,"
@@ -356,6 +550,18 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
                                                       g_free,
                                                       connection_destroy);
 
+    sprintf(thread_name, "colo-compare %d", compare_id);
+    qemu_thread_create(&s->thread, thread_name,
+                       colo_compare_thread, s,
+                       QEMU_THREAD_JOINABLE);
+    compare_id++;
+
+    /* A regular timer to kick any packets that the secondary doesn't match */
+    s->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, /* Only when guest runs */
+                            check_old_packet_regular, s);
+    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                        REGULAR_PACKET_CHECK_MS);
+
     return;
 }
 
@@ -397,6 +603,16 @@ static void colo_compare_finalize(Object *obj)
 
     g_queue_free(&s->conn_list);
 
+    if (s->thread.thread) {
+        /* compare connection */
+        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
+        qemu_thread_join(&s->thread);
+    }
+
+    if (s->timer) {
+        timer_del(s->timer);
+    }
+
     g_free(s->pri_indev);
     g_free(s->sec_indev);
     g_free(s->outdev);
diff --git a/net/colo.c b/net/colo.c
index bc86553..da4b771 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -128,6 +128,7 @@ Packet *packet_new(const void *data, int size)
 
     pkt->data = g_memdup(data, size);
     pkt->size = size;
+    pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 
     return pkt;
 }
diff --git a/net/colo.h b/net/colo.h
index 9cbc14e..6b395a3 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -17,6 +17,7 @@
 
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
+#include "qemu/timer.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -28,6 +29,8 @@ typedef struct Packet {
     };
     uint8_t *transport_header;
     int size;
+    /* Time of packet creation, in wall clock ms */
+    int64_t creation_ms;
 } Packet;
 
 typedef struct ConnectionKey {
diff --git a/trace-events b/trace-events
index 703de1a..1537e91 100644
--- a/trace-events
+++ b/trace-events
@@ -1919,3 +1919,5 @@ aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64
 
 # net/colo-compare.c
 colo_compare_main(const char *chr) ": %s"
+colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
+colo_old_packet_check_found(int64_t old_time) "%" PRId64
-- 
2.7.4

* [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (5 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-31  9:33   ` Jason Wang
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 08/10] filter-rewriter: introduce filter-rewriter initialization Zhang Chen
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

Add TCP, UDP and ICMP packet comparison to replace the
plain IP packet comparison. This makes the packet
comparison more accurate, which means fewer
checkpoints and better efficiency.
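
As an illustration of why a protocol-aware comparison is more accurate:
the TCP comparison in this patch tolerates a differing IP identification
field (and the header checksum that depends on it) when the primary packet
has Don't Fragment set. The sketch below shows that idea on raw IPv4 bytes.
It is a simplified assumption-laden example, not the patch's code: the
function name and the offset macros are hypothetical, the buffers start at
the IP header rather than the Ethernet header, and the two fields are
skipped during comparison instead of being overwritten in the secondary
packet as the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_MIN_HLEN  20    /* IPv4 header without options */
#define IPV4_OFF_ID     4    /* identification, 2 bytes */
#define IPV4_OFF_FLAGS  6    /* flags + fragment offset, 2 bytes */
#define IPV4_OFF_CSUM  10    /* header checksum, 2 bytes */
#define IPV4_DF_BIT  0x40    /* Don't Fragment, in the first flags byte */

/*
 * Compare two raw IPv4 packets that start at the IP header.
 * If the primary packet has DF set, the identification and header
 * checksum fields are ignored; otherwise every byte must match.
 */
static bool sketch_ipv4_equal(const uint8_t *p, size_t len_p,
                              const uint8_t *s, size_t len_s)
{
    assert(p != NULL && s != NULL);

    if (len_p != len_s || len_p < IPV4_MIN_HLEN) {
        return false;
    }
    if (!(p[IPV4_OFF_FLAGS] & IPV4_DF_BIT)) {
        return memcmp(p, s, len_p) == 0;
    }
    /* version/IHL, TOS, total length */
    if (memcmp(p, s, IPV4_OFF_ID) != 0) {
        return false;
    }
    /* flags, fragment offset, TTL, protocol */
    if (memcmp(p + IPV4_OFF_FLAGS, s + IPV4_OFF_FLAGS,
               IPV4_OFF_CSUM - IPV4_OFF_FLAGS) != 0) {
        return false;
    }
    /* addresses, options and payload */
    return memcmp(p + IPV4_OFF_CSUM + 2, s + IPV4_OFF_CSUM + 2,
                  len_p - (IPV4_OFF_CSUM + 2)) == 0;
}
```

With a plain memcmp() over the whole packet, two otherwise identical
replies would miscompare on the identification field alone and trigger
a checkpoint that is not needed.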

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo-compare.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 trace-events       |   4 ++
 2 files changed, 152 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index b90cf1f..0daefd9 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -18,6 +18,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/error.h"
 #include "net/net.h"
+#include "net/eth.h"
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
@@ -179,9 +180,136 @@ static int colo_packet_compare(Packet *ppkt, Packet *spkt)
     }
 }
 
-static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+/*
+ * called from the compare thread on the primary
+ * for compare tcp packet
+ * compare_tcp copied from Dr. David Alan Gilbert's branch
+ */
+static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
+{
+    struct tcphdr *ptcp, *stcp;
+    int res;
+    char *sdebug, *ddebug;
+
+    trace_colo_compare_main("compare tcp");
+    if (ppkt->size != spkt->size) {
+        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+            trace_colo_compare_main("pkt size not same");
+        }
+        return -1;
+    }
+
+    ptcp = (struct tcphdr *)ppkt->transport_header;
+    stcp = (struct tcphdr *)spkt->transport_header;
+
+    if (ptcp->th_seq != stcp->th_seq) {
+        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+            trace_colo_compare_main("pkt tcp seq not same");
+        }
+        return -1;
+    }
+
+    /*
+     * The 'identification' field in the IP header is *very* random
+     * it almost never matches.  Fudge this by ignoring differences in
+     * unfragmented packets; they'll normally sort themselves out if different
+     * anyway, and it should recover at the TCP level.
+     * An alternative would be to get both the primary and secondary to rewrite
+     * somehow; but that would need some sync traffic to sync the state
+     */
+    if (ntohs(ppkt->ip->ip_off) & IP_DF) {
+        spkt->ip->ip_id = ppkt->ip->ip_id;
+        /* and the sum will be different if the IDs were different */
+        spkt->ip->ip_sum = ppkt->ip->ip_sum;
+    }
+
+    res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
+                (spkt->size - ETH_HLEN));
+
+    if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+        sdebug = g_strdup(inet_ntoa(ppkt->ip->ip_src));
+        ddebug = g_strdup(inet_ntoa(ppkt->ip->ip_dst));
+        fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
+        " s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__,
+                   sdebug, ddebug,
+                   ntohl(ptcp->th_seq), ntohl(ptcp->th_ack),
+                   ntohl(stcp->th_seq), ntohl(stcp->th_ack),
+                   res, ptcp->th_flags, stcp->th_flags);
+
+        trace_colo_compare_tcp_miscompare("Primary len", ppkt->size);
+        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+        trace_colo_compare_tcp_miscompare("Secondary len", spkt->size);
+        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+
+        g_free(sdebug);
+        g_free(ddebug);
+    }
+
+    return res;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare udp packet
+ */
+static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
+{
+    int ret;
+
+    trace_colo_compare_main("compare udp");
+    ret = colo_packet_compare(ppkt, spkt);
+
+    if (ret) {
+        trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size);
+        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+        trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size);
+        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+    }
+
+    return ret;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare icmp packet
+ */
+static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
+{
+    int network_length;
+
+    trace_colo_compare_main("compare icmp");
+    network_length = ppkt->ip->ip_hl * 4;
+    if (ppkt->size != spkt->size ||
+        ppkt->size < network_length + ETH_HLEN) {
+        return -1;
+    }
+
+    if (colo_packet_compare(ppkt, spkt)) {
+        trace_colo_compare_icmp_miscompare("primary pkt size",
+                                           ppkt->size);
+        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare",
+                     ppkt->size);
+        trace_colo_compare_icmp_miscompare("Secondary pkt size",
+                                           spkt->size);
+        qemu_hexdump((char *)spkt->data, stderr, "colo-compare",
+                     spkt->size);
+        return -1;
+    } else {
+        return 0;
+    }
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare other packet
+ */
+static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
 {
-    trace_colo_compare_main("compare all");
+    trace_colo_compare_main("compare other");
+    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
+                               inet_ntoa(ppkt->ip->ip_dst), spkt->size,
+                               inet_ntoa(spkt->ip->ip_src),
+                               inet_ntoa(spkt->ip->ip_dst));
     return colo_packet_compare(ppkt, spkt);
 }
 
@@ -245,8 +373,24 @@ static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
-        result = g_queue_find_custom(&conn->secondary_list,
-                              pkt, (GCompareFunc)colo_packet_compare_all);
+        switch (conn->ip_proto) {
+        case IPPROTO_TCP:
+            result = g_queue_find_custom(&conn->secondary_list,
+                     pkt, (GCompareFunc)colo_packet_compare_tcp);
+            break;
+        case IPPROTO_UDP:
+            result = g_queue_find_custom(&conn->secondary_list,
+                     pkt, (GCompareFunc)colo_packet_compare_udp);
+            break;
+        case IPPROTO_ICMP:
+            result = g_queue_find_custom(&conn->secondary_list,
+                     pkt, (GCompareFunc)colo_packet_compare_icmp);
+            break;
+        default:
+            result = g_queue_find_custom(&conn->secondary_list,
+                     pkt, (GCompareFunc)colo_packet_compare_other);
+            break;
+        }
 
         if (result) {
             ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
diff --git a/trace-events b/trace-events
index 1537e91..ab22eb2 100644
--- a/trace-events
+++ b/trace-events
@@ -1919,5 +1919,9 @@ aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64
 
 # net/colo-compare.c
 colo_compare_main(const char *chr) ": %s"
+colo_compare_tcp_miscompare(const char *sta, int size) ": %s = %d"
+colo_compare_udp_miscompare(const char *sta, int size) ": %s = %d"
+colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
 colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
 colo_old_packet_check_found(int64_t old_time) "%" PRId64
+colo_compare_miscompare(void) ""
-- 
2.7.4

* [Qemu-devel] [PATCH V12 08/10] filter-rewriter: introduce filter-rewriter initialization
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (6 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 09/10] filter-rewriter: track connection and parse packet Zhang Chen
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

Filter-rewriter is a part of the COLO project.
It rewrites some of the secondary's packets so that the
secondary guest's TCP connection can be established successfully.
In this module we rewrite the ack of TCP packets sent to the secondary
from the primary, and the seq of TCP packets sent to the primary from
the secondary.

usage:

colo secondary:
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
-object filter-rewriter,id=rew0,netdev=hn0,queue=all

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/Makefile.objs     |   1 +
 net/filter-rewriter.c | 110 ++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-options.hx       |  13 ++++++
 vl.c                  |   3 +-
 4 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 net/filter-rewriter.c

diff --git a/net/Makefile.objs b/net/Makefile.objs
index beb504b..2a80df5 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -18,3 +18,4 @@ common-obj-y += filter-buffer.o
 common-obj-y += filter-mirror.o
 common-obj-y += colo-compare.o
 common-obj-y += colo.o
+common-obj-y += filter-rewriter.o
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
new file mode 100644
index 0000000..e23c21d
--- /dev/null
+++ b/net/filter-rewriter.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "net/colo.h"
+#include "net/filter.h"
+#include "net/net.h"
+#include "qemu-common.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi-visit.h"
+#include "qom/object.h"
+#include "qemu/main-loop.h"
+#include "qemu/iov.h"
+#include "net/checksum.h"
+
+#define FILTER_COLO_REWRITER(obj) \
+    OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
+
+#define TYPE_FILTER_REWRITER "filter-rewriter"
+
+enum {
+    PRIMARY = 0,
+    SECONDARY,
+};
+
+typedef struct RewriterState {
+    NetFilterState parent_obj;
+    NetQueue *incoming_queue;
+    /* hashtable to save connection */
+    GHashTable *connection_track_table;
+} RewriterState;
+
+static void filter_rewriter_flush(NetFilterState *nf)
+{
+    RewriterState *s = FILTER_COLO_REWRITER(nf);
+
+    if (!qemu_net_queue_flush(s->incoming_queue)) {
+        /* Unable to empty the queue, purge remaining packets */
+        qemu_net_queue_purge(s->incoming_queue, nf->netdev);
+    }
+}
+
+static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
+                                         NetClientState *sender,
+                                         unsigned flags,
+                                         const struct iovec *iov,
+                                         int iovcnt,
+                                         NetPacketSent *sent_cb)
+{
+    /*
+     * if we get tcp packet
+     * we will rewrite it to make secondary guest's
+     * connection established successfully
+     */
+    return 0;
+}
+
+static void colo_rewriter_cleanup(NetFilterState *nf)
+{
+    RewriterState *s = FILTER_COLO_REWRITER(nf);
+
+    /* flush packets */
+    if (s->incoming_queue) {
+        filter_rewriter_flush(nf);
+        g_free(s->incoming_queue);
+    }
+}
+
+static void colo_rewriter_setup(NetFilterState *nf, Error **errp)
+{
+    RewriterState *s = FILTER_COLO_REWRITER(nf);
+
+    s->connection_track_table = g_hash_table_new_full(connection_key_hash,
+                                                      connection_key_equal,
+                                                      g_free,
+                                                      connection_destroy);
+    s->incoming_queue = qemu_new_net_queue(qemu_netfilter_pass_to_next, nf);
+}
+
+static void colo_rewriter_class_init(ObjectClass *oc, void *data)
+{
+    NetFilterClass *nfc = NETFILTER_CLASS(oc);
+
+    nfc->setup = colo_rewriter_setup;
+    nfc->cleanup = colo_rewriter_cleanup;
+    nfc->receive_iov = colo_rewriter_receive_iov;
+}
+
+static const TypeInfo colo_rewriter_info = {
+    .name = TYPE_FILTER_REWRITER,
+    .parent = TYPE_NETFILTER,
+    .class_init = colo_rewriter_class_init,
+    .instance_size = sizeof(RewriterState),
+};
+
+static void register_types(void)
+{
+    type_register_static(&colo_rewriter_info);
+}
+
+type_init(register_types);
diff --git a/qemu-options.hx b/qemu-options.hx
index 33d5d0b..8952361 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3859,6 +3859,19 @@ Create a filter-redirector we need to differ outdev id from indev id, id can not
 be the same. we can just use indev or outdev, but at least one of indev or outdev
 need to be specified.
 
+@item -object filter-rewriter,id=@var{id},netdev=@var{netdevid},rewriter-mode=@var{mode}[,queue=@var{all|rx|tx}]
+
+Filter-rewriter is a part of the COLO project. It rewrites TCP packets sent
+from the primary to the secondary to keep the secondary's TCP connection alive,
+and rewrites TCP packets sent from the secondary to the primary so that they
+can be handled by the client.
+
+usage:
+colo secondary:
+-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
+-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
+-object filter-rewriter,id=rew0,netdev=hn0,queue=all
+
 @item -object filter-dump,id=@var{id},netdev=@var{dev},file=@var{filename}][,maxlen=@var{len}]
 
 Dump the network traffic on netdev @var{dev} to the file specified by
diff --git a/vl.c b/vl.c
index c6b9a6f..b47be6a 100644
--- a/vl.c
+++ b/vl.c
@@ -2866,7 +2866,8 @@ static bool object_create_initial(const char *type)
         g_str_equal(type, "filter-dump") ||
         g_str_equal(type, "filter-mirror") ||
         g_str_equal(type, "filter-redirector") ||
-        g_str_equal(type, "colo-compare")) {
+        g_str_equal(type, "colo-compare") ||
+        g_str_equal(type, "filter-rewriter")) {
         return false;
     }
 
-- 
2.7.4

* [Qemu-devel] [PATCH V12 09/10] filter-rewriter: track connection and parse packet
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (7 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 08/10] filter-rewriter: introduce filter-rewriter initialization Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 10/10] filter-rewriter: rewrite tcp packet to keep secondary connection Zhang Chen
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

We use net/colo.h to track connections and parse packets.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo.c            | 14 ++++++++++++++
 net/colo.h            |  1 +
 net/filter-rewriter.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+)

diff --git a/net/colo.c b/net/colo.c
index da4b771..667df56 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -99,6 +99,20 @@ void fill_connection_key(Packet *pkt, ConnectionKey *key)
     }
 }
 
+void reverse_connection_key(ConnectionKey *key)
+{
+    struct in_addr tmp_ip;
+    uint16_t tmp_port;
+
+    tmp_ip = key->src;
+    key->src = key->dst;
+    key->dst = tmp_ip;
+
+    tmp_port = key->src_port;
+    key->src_port = key->dst_port;
+    key->dst_port = tmp_port;
+}
+
 Connection *connection_new(ConnectionKey *key)
 {
     Connection *conn = g_slice_new(Connection);
diff --git a/net/colo.h b/net/colo.h
index 6b395a3..0efaa6d 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -56,6 +56,7 @@ uint32_t connection_key_hash(const void *opaque);
 int connection_key_equal(const void *opaque1, const void *opaque2);
 int parse_packet_early(Packet *pkt);
 void fill_connection_key(Packet *pkt, ConnectionKey *key);
+void reverse_connection_key(ConnectionKey *key);
 Connection *connection_new(ConnectionKey *key);
 void connection_destroy(void *opaque);
 Connection *connection_get(GHashTable *connection_track_table,
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index e23c21d..0cb3cef 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -49,6 +49,20 @@ static void filter_rewriter_flush(NetFilterState *nf)
     }
 }
 
+/*
+ * Return 1 if the pkt is a TCP packet,
+ * 0 otherwise
+ */
+static int is_tcp_packet(Packet *pkt)
+{
+    if (!parse_packet_early(pkt) &&
+        pkt->ip->ip_p == IPPROTO_TCP) {
+        return 1;
+    } else {
+        return 0;
+    }
+}
+
 static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
                                          NetClientState *sender,
                                          unsigned flags,
@@ -56,11 +70,47 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
                                          int iovcnt,
                                          NetPacketSent *sent_cb)
 {
+    RewriterState *s = FILTER_COLO_REWRITER(nf);
+    Connection *conn;
+    ConnectionKey key = {{ 0 } };
+    Packet *pkt;
+    ssize_t size = iov_size(iov, iovcnt);
+    char *buf = g_malloc0(size);
+
+    iov_to_buf(iov, iovcnt, 0, buf, size);
+    pkt = packet_new(buf, size);
+
     /*
      * if we get tcp packet
      * we will rewrite it to make secondary guest's
      * connection established successfully
      */
+    if (pkt && is_tcp_packet(pkt)) {
+
+        fill_connection_key(pkt, &key);
+
+        if (sender == nf->netdev) {
+            /*
+             * We need to map TCP TX and RX packets
+             * to one connection.
+             */
+            reverse_connection_key(&key);
+        }
+        conn = connection_get(s->connection_track_table,
+                              &key,
+                              NULL);
+
+        if (sender == nf->netdev) {
+            /* NET_FILTER_DIRECTION_TX */
+            /* handle_primary_tcp_pkt */
+        } else {
+            /* NET_FILTER_DIRECTION_RX */
+            /* handle_secondary_tcp_pkt */
+        }
+    }
+
+    packet_destroy(pkt, NULL);
+    pkt = NULL;
     return 0;
 }
 
-- 
2.7.4

* [Qemu-devel] [PATCH V12 10/10] filter-rewriter: rewrite tcp packet to keep secondary connection
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (8 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 09/10] filter-rewriter: track connection and parse packet Zhang Chen
@ 2016-08-17  8:10 ` Zhang Chen
  2016-08-25  3:44 ` [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
  2016-08-31  9:39 ` Jason Wang
  11 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-17  8:10 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Zhang Chen, Li Zhijian, Wen Congyang, zhanghailiang,
	eddie . dong, Dr . David Alan Gilbert

Rewrite the TCP packets that the secondary receives and sends.
The following describes the case where the COLO guest is a TCP server.

First, the client starts a TCP handshake with a packet carrying
seq=client_seq, ack=0, flag=SYN. The COLO primary guest gets this packet
and mirrors it (filter-mirror) to the secondary guest, which receives it
through filter-redirector.
Then the primary guest responds with
(seq=primary_seq,ack=client_seq+1,flag=ACK|SYN)
and the secondary guest responds with
(seq=secondary_seq,ack=client_seq+1,flag=ACK|SYN).
At this point filter-rewriter saves secondary_seq in its TCP connection.
To finish the handshake, the client sends
(seq=client_seq+1,ack=primary_seq+1,flag=ACK).
Here filter-rewriter can get primary_seq, rewrite the ack from primary_seq+1
to secondary_seq+1 and recalculate the checksum, so the secondary's TCP
connection stays valid.

When data is sent and received:
the client sends (seq=client_seq+1+data_len,ack=primary_seq+1,flag=ACK|PSH);
filter-rewriter rewrites the ack and sends the packet to the secondary guest.

The primary guest responds with
(seq=primary_seq+1,ack=client_seq+1+data_len,flag=ACK)
and the secondary guest responds with
(seq=secondary_seq+1,ack=client_seq+1+data_len,flag=ACK).
We rewrite the secondary guest's seq from secondary_seq+1 to primary_seq+1,
so the TCP connection stays valid.

In the code we use offset (= secondary_seq - primary_seq)
to rewrite seq or ack:
handle_primary_tcp_pkt: tcp_pkt->th_ack += offset;
handle_secondary_tcp_pkt: tcp_pkt->th_seq -= offset;

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo.c            |   2 +
 net/colo.h            |   7 ++++
 net/filter-rewriter.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++-
 trace-events          |   5 +++
 4 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/net/colo.c b/net/colo.c
index 667df56..828a201 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -119,6 +119,8 @@ Connection *connection_new(ConnectionKey *key)
 
     conn->ip_proto = key->ip_proto;
     conn->processing = false;
+    conn->offset = 0;
+    conn->syn_flag = 0;
     g_queue_init(&conn->primary_list);
     g_queue_init(&conn->secondary_list);
 
diff --git a/net/colo.h b/net/colo.h
index 0efaa6d..5c3d003 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -50,6 +50,13 @@ typedef struct Connection {
     /* flag to enqueue unprocessed_connections */
     bool processing;
     uint8_t ip_proto;
+    /* offset = secondary_seq - primary_seq */
+    tcp_seq  offset;
+    /*
+     * this flag makes the offset update run
+     * only once for each tcp connection
+     */
+    int syn_flag;
 } Connection;
 
 uint32_t connection_key_hash(const void *opaque);
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 0cb3cef..c1cb7b2 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -21,6 +21,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "trace.h"
 
 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -63,6 +64,93 @@ static int is_tcp_packet(Packet *pkt)
     }
 }
 
+/* handle tcp packet from primary guest */
+static int handle_primary_tcp_pkt(NetFilterState *nf,
+                                  Connection *conn,
+                                  Packet *pkt)
+{
+    struct tcphdr *tcp_pkt;
+
+    tcp_pkt = (struct tcphdr *)pkt->transport_header;
+    if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) {
+        char *sdebug, *ddebug;
+        sdebug = g_strdup(inet_ntoa(pkt->ip->ip_src));
+        ddebug = g_strdup(inet_ntoa(pkt->ip->ip_dst));
+        trace_colo_filter_rewriter_pkt_info(__func__, sdebug, ddebug,
+                    ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack),
+                    tcp_pkt->th_flags);
+        trace_colo_filter_rewriter_conn_offset(conn->offset);
+        g_free(sdebug);
+        g_free(ddebug);
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_SYN)) {
+        /*
+         * this flag makes the offset update run
+         * only once for each tcp connection
+         */
+        conn->syn_flag = 1;
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK)) {
+        if (conn->syn_flag) {
+            /*
+             * offset = secondary_seq - primary_seq
+             * the ack arriving via the primary node acks
+             * primary_seq + 1, so th_ack - 1 is primary_seq
+             */
+            conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
+            conn->syn_flag = 0;
+        }
+        /* handle packets to the secondary from the primary */
+        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
+
+        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+    }
+
+    return 0;
+}
+
+/* handle tcp packet from secondary guest */
+static int handle_secondary_tcp_pkt(NetFilterState *nf,
+                                    Connection *conn,
+                                    Packet *pkt)
+{
+    struct tcphdr *tcp_pkt;
+
+    tcp_pkt = (struct tcphdr *)pkt->transport_header;
+
+    if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) {
+        char *sdebug, *ddebug;
+        sdebug = g_strdup(inet_ntoa(pkt->ip->ip_src));
+        ddebug = g_strdup(inet_ntoa(pkt->ip->ip_dst));
+        trace_colo_filter_rewriter_pkt_info(__func__, sdebug, ddebug,
+                    ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack),
+                    tcp_pkt->th_flags);
+        trace_colo_filter_rewriter_conn_offset(conn->offset);
+        g_free(sdebug);
+        g_free(ddebug);
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == (TH_ACK | TH_SYN))) {
+        /*
+         * save offset = secondary_seq and then
+         * in handle_primary_tcp_pkt make offset
+         * = secondary_seq - primary_seq
+         */
+        conn->offset = ntohl(tcp_pkt->th_seq);
+    }
+
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
+        /* handle packets to the primary from the secondary */
+        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
+
+        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+    }
+
+    return 0;
+}
+
 static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
                                          NetClientState *sender,
                                          unsigned flags,
@@ -102,10 +190,30 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 
         if (sender == nf->netdev) {
             /* NET_FILTER_DIRECTION_TX */
-            /* handle_primary_tcp_pkt */
+            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
+                qemu_net_queue_send(s->incoming_queue, sender, 0,
+                (const uint8_t *)pkt->data, pkt->size, NULL);
+                packet_destroy(pkt, NULL);
+                pkt = NULL;
+                /*
+                 * We block the original packet here; the rewritten
+                 * copy has already been queued for sending above
+                 */
+                return 1;
+            }
         } else {
             /* NET_FILTER_DIRECTION_RX */
-            /* handle_secondary_tcp_pkt */
+            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
+                qemu_net_queue_send(s->incoming_queue, sender, 0,
+                (const uint8_t *)pkt->data, pkt->size, NULL);
+                packet_destroy(pkt, NULL);
+                pkt = NULL;
+                /*
+                 * We block the original packet here; the rewritten
+                 * copy has already been queued for sending above
+                 */
+                return 1;
+            }
         }
     }
 
diff --git a/trace-events b/trace-events
index ab22eb2..a12279c 100644
--- a/trace-events
+++ b/trace-events
@@ -1925,3 +1925,8 @@ colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
 colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
 colo_old_packet_check_found(int64_t old_time) "%" PRId64
 colo_compare_miscompare(void) ""
+
+# net/filter-rewriter.c
+colo_filter_rewriter_debug(void) ""
+colo_filter_rewriter_pkt_info(const char *func, const char *src, const char *dst, uint32_t seq, uint32_t ack, uint32_t flag) "%s: src/dst: %s/%s p: seq/ack=%u/%u  flags=%x\n"
+colo_filter_rewriter_conn_offset(uint32_t offset) ": offset=%u\n"
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (9 preceding siblings ...)
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 10/10] filter-rewriter: rewrite tcp packet to keep secondary connection Zhang Chen
@ 2016-08-25  3:44 ` Zhang Chen
  2016-08-25  4:07   ` Jason Wang
  2016-08-31  9:39 ` Jason Wang
  11 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-25  3:44 UTC (permalink / raw)
  To: qemu devel, Jason Wang
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert

Hi~~ Jason.

If you have time, can you give me some feedback for this series?


Thanks

Zhang Chen



-- 
Thanks
zhangchen

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter
  2016-08-25  3:44 ` [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
@ 2016-08-25  4:07   ` Jason Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2016-08-25  4:07 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016-08-25 11:44, Zhang Chen wrote:
> Hi~~ Jason.
>
> If you have time, can you give me some feedback for this series?
>
>
> Thanks
>
> Zhang Chen

Yes, a little busy this week, will do it next week.

Thanks


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization Zhang Chen
@ 2016-08-31  7:53   ` Jason Wang
  2016-08-31  8:06     ` Hailiang Zhang
  2016-08-31  9:03     ` Zhang Chen
  0 siblings, 2 replies; 32+ messages in thread
From: Jason Wang @ 2016-08-31  7:53 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016-08-17 16:10, Zhang Chen wrote:
> This is a COLO net ASCII figure:
>
>   Primary qemu                                                           Secondary qemu
> +--------------------------------------------------------------+       +----------------------------------------------------------------+
> | +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
> | |                                                          | |       |  |                                                           | |
> | |                        guest                             | |       |  |                        guest                              | |
> | |                                                          | |       |  |                                                           | |
> | +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
> |         |                          |                         |       |                        ^        |                              |
> |         |                          |                         |       |                        |        |                              |
> |         |  +------------------------------------------------------+  |                        |        |                              |
> |netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
> | +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
> | |       |  |                       |      |        out       |    |  |  |                     |        |  filter execute order      | |
> | |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
> | |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
> | | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
> | | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
> | | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +-------->  filter   +--> adjust |   adjust     +-->   filter   | | |
> | | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
> | | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
> | | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
> | |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
> | |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
> | |      |             +--------------+     |                  |  |    |                                                   |            |
> | |      |   filter execute order     |     |                  |  |    |                                                   |            |
> | |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
> | +-----------------------------------------+                  |       |                                                                |
> |        |                            |                        |       |                                                                |
> +--------------------------------------------------------------+       +----------------------------------------------------------------+
>           |guest receive               | guest send
>           |                            |
> +--------+----------------------------v------------------------+
> |                                                              |                          NOTE: filter direction is rx/tx/all
> |                         tap                                  |                          rx:receive packets sent to the netdev
> |                                                              |                          tx:receive packets sent by the netdev
> +--------------------------------------------------------------+

It would be better to add a document under docs/ that explains this
configuration in detail, on top of this series.

> In COLO-compare, we do the packet comparison.
> Packets coming from the primary char indev will be sent to outdev.
> Packets coming from the secondary char dev will be dropped after comparing.
> colo-compare needs two input chardevs and one output chardev:
> primary_in=chardev1-id (source: packets sent by the primary)
> secondary_in=chardev2-id (source: packets sent by the secondary)
> outdev=chardev3-id
>
> usage:
>
> primary:
> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
> -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
> -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
> -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>
> secondary:
> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
> -chardev socket,id=red0,host=3.3.3.3,port=9003
> -chardev socket,id=red1,host=3.3.3.3,port=9004
> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   net/Makefile.objs  |   1 +
>   net/colo-compare.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   qemu-options.hx    |  39 ++++++++
>   vl.c               |   3 +-
>   4 files changed, 326 insertions(+), 1 deletion(-)
>   create mode 100644 net/colo-compare.c
>
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index b7c22fd..ba92f73 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>   common-obj-y += filter.o
>   common-obj-y += filter-buffer.o
>   common-obj-y += filter-mirror.o
> +common-obj-y += colo-compare.o
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> new file mode 100644
> index 0000000..cdc3e0e
> --- /dev/null
> +++ b/net/colo-compare.c
> @@ -0,0 +1,284 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2016 FUJITSU LIMITED
> + * Copyright (c) 2016 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu-common.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi/error.h"
> +#include "net/net.h"
> +#include "net/vhost_net.h"

Looks unnecessary.

> +#include "qom/object_interfaces.h"
> +#include "qemu/iov.h"
> +#include "qom/object.h"
> +#include "qemu/typedefs.h"
> +#include "net/queue.h"
> +#include "sysemu/char.h"
> +#include "qemu/sockets.h"
> +#include "qapi-visit.h"
> +
> +#define TYPE_COLO_COMPARE "colo-compare"
> +#define COLO_COMPARE(obj) \
> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
> +
> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
> +
> +typedef struct CompareState {
> +    Object parent;
> +
> +    char *pri_indev;
> +    char *sec_indev;
> +    char *outdev;
> +    CharDriverState *chr_pri_in;
> +    CharDriverState *chr_sec_in;
> +    CharDriverState *chr_out;
> +    QTAILQ_ENTRY(CompareState) next;

This does not appear to be used in this series, only in the commit
"colo-compare and filter-rewriter work with colo-frame". It would be
better to defer introducing it until that patch.

> +    SocketReadState pri_rs;
> +    SocketReadState sec_rs;
> +} CompareState;
> +
> +typedef struct CompareClass {
> +    ObjectClass parent_class;
> +} CompareClass;
> +
> +typedef struct CompareChardevProps {
> +    bool is_socket;
> +    bool is_unix;
> +} CompareChardevProps;
> +
> +static char *compare_get_pri_indev(Object *obj, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    return g_strdup(s->pri_indev);
> +}
> +
> +static void compare_set_pri_indev(Object *obj, const char *value, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    g_free(s->pri_indev);
> +    s->pri_indev = g_strdup(value);
> +}
> +
> +static char *compare_get_sec_indev(Object *obj, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    return g_strdup(s->sec_indev);
> +}
> +
> +static void compare_set_sec_indev(Object *obj, const char *value, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    g_free(s->sec_indev);
> +    s->sec_indev = g_strdup(value);
> +}
> +
> +static char *compare_get_outdev(Object *obj, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    return g_strdup(s->outdev);
> +}
> +
> +static void compare_set_outdev(Object *obj, const char *value, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    g_free(s->outdev);
> +    s->outdev = g_strdup(value);
> +}
> +
> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
> +{
> +    /* if packet_enqueue pri pkt failed we will send unsupported packet */
> +}
> +
> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
> +{
> +    /* if packet_enqueue sec pkt failed we will notify trace */
> +}
> +
> +static int compare_chardev_opts(void *opaque,
> +                                const char *name, const char *value,
> +                                Error **errp)
> +{
> +    CompareChardevProps *props = opaque;
> +
> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
> +        props->is_socket = true;
> +    } else if (strcmp(name, "host") == 0) {

Typo? net_vhost_chardev_opts() checks "path" here instead:

     } else if (strcmp(name, "path") == 0) {
         props->is_unix = true;
     }
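
To make the distinction concrete, here is a standalone sketch (plain C, no
QEMU headers) of an option classifier in which "path" marks a unix socket
and "host" marks a TCP socket. The ChardevKind struct, its is_tcp field and
the classify_opt() name are made up for illustration; they are not part of
this patch or of net_vhost_chardev_opts().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for CompareChardevProps, with TCP tracked too. */
typedef struct ChardevKind {
    bool is_socket;
    bool is_unix;   /* set when a "path" option is seen */
    bool is_tcp;    /* set when a "host" option is seen */
} ChardevKind;

/* Returns 0 if the option is understood, -1 otherwise. */
static int classify_opt(ChardevKind *k, const char *name, const char *value)
{
    assert(k && name && value);

    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
        k->is_socket = true;
    } else if (strcmp(name, "path") == 0) {
        k->is_unix = true;
    } else if (strcmp(name, "host") == 0) {
        k->is_tcp = true;
    } else if (strcmp(name, "port") == 0 ||
               strcmp(name, "server") == 0 ||
               strcmp(name, "wait") == 0) {
        /* accepted, but carries no information about the socket type */
    } else {
        return -1;
    }
    return 0;
}
```

With something along these lines the caller could accept either socket
flavour and only reject chardevs that are not sockets at all.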



> +        props->is_unix = true;
> +    } else if (strcmp(name, "port") == 0) {
> +    } else if (strcmp(name, "server") == 0) {
> +    } else if (strcmp(name, "wait") == 0) {
> +    } else {
> +        error_setg(errp,
> +                   "COLO-compare does not support a chardev with option %s=%s",
> +                   name, value);
> +        return -1;
> +    }
> +    return 0;
> +}
> +
> +/*
> + * called from the main thread on the primary
> + * to setup colo-compare.
> + */
> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
> +{
> +    CompareState *s = COLO_COMPARE(uc);
> +    CompareChardevProps props;
> +
> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
> +        error_setg(errp, "colo compare needs 'primary_in' ,"
> +                   "'secondary_in','outdev' property set");
> +        return;
> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
> +               !strcmp(s->sec_indev, s->outdev) ||
> +               !strcmp(s->pri_indev, s->sec_indev)) {
> +        error_setg(errp, "'indev' and 'outdev' could not be same "
> +                   "for compare module");
> +        return;
> +    }
> +
> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
> +    if (s->chr_pri_in == NULL) {
> +        error_setg(errp, "Primary IN Device '%s' not found",
> +                   s->pri_indev);
> +        return;
> +    }
> +
> +    /* inspect chardev opts */
> +    memset(&props, 0, sizeof(props));
> +    if (qemu_opt_foreach(s->chr_pri_in->opts, compare_chardev_opts, &props, errp)) {
> +        return;
> +    }
> +
> +    if (!props.is_socket || !props.is_unix) {
> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
> +                   s->pri_indev);
> +        return;
> +    }
> +
> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
> +    if (s->chr_sec_in == NULL) {
> +        error_setg(errp, "Secondary IN Device '%s' not found",
> +                   s->sec_indev);
> +        return;
> +    }
> +
> +    memset(&props, 0, sizeof(props));
> +    if (qemu_opt_foreach(s->chr_sec_in->opts, compare_chardev_opts, &props, errp)) {
> +        return;
> +    }
> +
> +    if (!props.is_socket || !props.is_unix) {
> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
> +                   s->sec_indev);

I believe TCP sockets are also supported?

> +        return;
> +    }
> +
> +    s->chr_out = qemu_chr_find(s->outdev);
> +    if (s->chr_out == NULL) {
> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
> +        return;
> +    }
> +
> +    memset(&props, 0, sizeof(props));
> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, &props, errp)) {
> +        return;
> +    }
> +
> +    if (!props.is_socket || !props.is_unix) {
> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
> +                   s->outdev);

Ditto. There is also code duplication here; please introduce a helper
that does the lookup and check above.
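
As a rough illustration of the shape such a helper could take, here is a
standalone mock in plain C. MockChardev and find_socket_chardev() are
hypothetical names, and the array lookup merely stands in for
qemu_chr_find() plus the qemu_opt_foreach() check; a real helper would take
an Error **errp instead of a string pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock of a chardev registry entry; not a QEMU type. */
typedef struct MockChardev {
    const char *id;
    bool is_socket;
} MockChardev;

/*
 * Hypothetical helper: look up a chardev by id and verify it is a socket.
 * Returns the device, or NULL with *err pointing at a static message.
 */
static const MockChardev *find_socket_chardev(const MockChardev *devs,
                                              size_t ndevs, const char *id,
                                              const char **err)
{
    size_t i;

    assert(devs && id && err);
    *err = NULL;

    for (i = 0; i < ndevs; i++) {
        if (strcmp(devs[i].id, id) == 0) {
            if (!devs[i].is_socket) {
                *err = "chardev is not a socket";
                return NULL;
            }
            return &devs[i];
        }
    }
    *err = "chardev not found";
    return NULL;
}
```

colo_compare_complete() would then call the helper once each for
pri_indev, sec_indev and outdev instead of repeating the block three times.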

> +        return;
> +    }
> +
> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
> +
> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
> +
> +    qemu_chr_fe_claim_no_fail(s->chr_out);
> +
> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
> +
> +    return;
> +}
> +
> +static void colo_compare_class_init(ObjectClass *oc, void *data)
> +{
> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
> +
> +    ucc->complete = colo_compare_complete;
> +}
> +
> +static void colo_compare_init(Object *obj)
> +{
> +    object_property_add_str(obj, "primary_in",
> +                            compare_get_pri_indev, compare_set_pri_indev,
> +                            NULL);
> +    object_property_add_str(obj, "secondary_in",
> +                            compare_get_sec_indev, compare_set_sec_indev,
> +                            NULL);
> +    object_property_add_str(obj, "outdev",
> +                            compare_get_outdev, compare_set_outdev,
> +                            NULL);
> +}
> +
> +static void colo_compare_finalize(Object *obj)
> +{
> +    CompareState *s = COLO_COMPARE(obj);
> +
> +    if (s->chr_pri_in) {
> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
> +        qemu_chr_fe_release(s->chr_pri_in);
> +    }
> +    if (s->chr_sec_in) {
> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
> +        qemu_chr_fe_release(s->chr_sec_in);
> +    }
> +    if (s->chr_out) {
> +        qemu_chr_fe_release(s->chr_out);
> +    }
> +
> +    g_free(s->pri_indev);
> +    g_free(s->sec_indev);
> +    g_free(s->outdev);
> +}
> +
> +static const TypeInfo colo_compare_info = {
> +    .name = TYPE_COLO_COMPARE,
> +    .parent = TYPE_OBJECT,
> +    .instance_size = sizeof(CompareState),
> +    .instance_init = colo_compare_init,
> +    .instance_finalize = colo_compare_finalize,
> +    .class_size = sizeof(CompareClass),
> +    .class_init = colo_compare_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { TYPE_USER_CREATABLE },
> +        { }
> +    }
> +};
> +
> +static void register_types(void)
> +{
> +    type_register_static(&colo_compare_info);
> +}
> +
> +type_init(register_types);
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 587de8f..33d5d0b 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} to the file specified by
>   The file format is libpcap, so it can be analyzed with tools such as tcpdump
>   or Wireshark.
>   
> +@item -object colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
> +outdev=@var{chardevid}
> +
> +Colo-compare gets packet from primary_in@var{chardevid} and secondary_in@var{chardevid}, than compare primary packet with
> +secondary packet. If the packet same, we will output primary

s/If the packet same/If the packets are the same/.

> +packet to outdev@var{chardevid}, else we will notify colo-frame
> +do checkpoint and send primary packet to outdev@var{chardevid}.
> +
> +we can use it with the help of filter-mirror and filter-redirector.

s/we/We/. Also, it looks like colo-compare must be used together with
filter-mirror and filter-redirector?

> +
> +@example
> +
> +primary:
> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
> +-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
> +-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
> +-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
> +
> +secondary:
> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu-ifdown
> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
> +-chardev socket,id=red0,host=3.3.3.3,port=9003
> +-chardev socket,id=red1,host=3.3.3.3,port=9004
> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
> +
> +@end example
> +
> +If you want to know the detail of above command line, you can read
> +the colo-compare git log.
> +
>   @item -object secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>   @item -object secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>   
> diff --git a/vl.c b/vl.c
> index cbe51ac..c6b9a6f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char *type)
>       if (g_str_equal(type, "filter-buffer") ||
>           g_str_equal(type, "filter-dump") ||
>           g_str_equal(type, "filter-mirror") ||
> -        g_str_equal(type, "filter-redirector")) {
> +        g_str_equal(type, "filter-redirector") ||
> +        g_str_equal(type, "colo-compare")) {
>           return false;
>       }
>   


* Re: [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet Zhang Chen
@ 2016-08-31  8:04   ` Jason Wang
  2016-08-31  9:19     ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  8:04 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 2016-08-17 16:10, Zhang Chen wrote:
> The net/colo.c is used by colo-compare and filter-rewriter.
> this can share common data structure like net packet,
> and other functions.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   net/Makefile.objs  |   1 +
>   net/colo-compare.c | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>   net/colo.c         |  70 +++++++++++++++++++++++++++++++++
>   net/colo.h         |  38 ++++++++++++++++++
>   trace-events       |   3 ++
>   5 files changed, 223 insertions(+), 2 deletions(-)
>   create mode 100644 net/colo.c
>   create mode 100644 net/colo.h
>
> diff --git a/net/Makefile.objs b/net/Makefile.objs
> index ba92f73..beb504b 100644
> --- a/net/Makefile.objs
> +++ b/net/Makefile.objs
> @@ -17,3 +17,4 @@ common-obj-y += filter.o
>   common-obj-y += filter-buffer.o
>   common-obj-y += filter-mirror.o
>   common-obj-y += colo-compare.o
> +common-obj-y += colo.o
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index cdc3e0e..d9e4459 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -27,13 +27,38 @@
>   #include "sysemu/char.h"
>   #include "qemu/sockets.h"
>   #include "qapi-visit.h"
> +#include "net/colo.h"
> +#include "trace.h"
>   
>   #define TYPE_COLO_COMPARE "colo-compare"
>   #define COLO_COMPARE(obj) \
>       OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>   
>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
> +#define MAX_QUEUE_SIZE 1024
>   
> +/*
> +  + CompareState ++
> +  |               |
> +  +---------------+   +---------------+         +---------------+
> +  |conn list      +--->conn           +--------->conn           |
> +  +---------------+   +---------------+         +---------------+
> +  |               |     |           |             |          |
> +  +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
> +                    |primary |  |secondary    |primary | |secondary
> +                    |packet  |  |packet  +    |packet  | |packet  +
> +                    +--------+  +--------+    +--------+ +--------+
> +                        |           |             |          |
> +                    +---v----+  +---v----+    +---v----+ +---v----+
> +                    |primary |  |secondary    |primary | |secondary
> +                    |packet  |  |packet  +    |packet  | |packet  +
> +                    +--------+  +--------+    +--------+ +--------+
> +                        |           |             |          |
> +                    +---v----+  +---v----+    +---v----+ +---v----+
> +                    |primary |  |secondary    |primary | |secondary
> +                    |packet  |  |packet  +    |packet  | |packet  +
> +                    +--------+  +--------+    +--------+ +--------+
> +*/
>   typedef struct CompareState {
>       Object parent;
>   
> @@ -46,6 +71,9 @@ typedef struct CompareState {
>       QTAILQ_ENTRY(CompareState) next;
>       SocketReadState pri_rs;
>       SocketReadState sec_rs;
> +
> +    /* hashtable to save connection */
> +    GHashTable *connection_track_table;
>   } CompareState;
>   
>   typedef struct CompareClass {
> @@ -57,6 +85,76 @@ typedef struct CompareChardevProps {
>       bool is_unix;
>   } CompareChardevProps;
>   
> +enum {
> +    PRIMARY_IN = 0,
> +    SECONDARY_IN,
> +};
> +
> +static int compare_chr_send(CharDriverState *out,
> +                            const uint8_t *buf,
> +                            uint32_t size);
> +
> +/*
> + * Return 0 on success, if return -1 means the pkt
> + * is unsupported(arp and ipv6) and will be sent later
> + */
> +static int packet_enqueue(CompareState *s, int mode)
> +{
> +    Packet *pkt = NULL;
> +
> +    if (mode == PRIMARY_IN) {
> +        pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
> +    } else {
> +        pkt = packet_new(s->sec_rs.buf, s->sec_rs.packet_len);
> +    }
> +
> +    if (parse_packet_early(pkt)) {
> +        packet_destroy(pkt, NULL);
> +        pkt = NULL;
> +        return -1;
> +    }
> +    /* TODO: get connection key from pkt */
> +
> +    /*
> +     * TODO: use connection key get conn from
> +     * connection_track_table
> +     */
> +
> +    /*
> +     * TODO: insert pkt to it's conn->primary_list
> +     * or conn->secondary_list
> +     */
> +
> +    return 0;
> +}
> +
> +static int compare_chr_send(CharDriverState *out,
> +                            const uint8_t *buf,
> +                            uint32_t size)
> +{
> +    int ret = 0;
> +    uint32_t len = htonl(size);
> +
> +    if (!size) {
> +        return 0;
> +    }
> +
> +    ret = qemu_chr_fe_write_all(out, (uint8_t *)&len, sizeof(len));
> +    if (ret != sizeof(len)) {
> +        goto err;
> +    }
> +
> +    ret = qemu_chr_fe_write_all(out, (uint8_t *)buf, size);
> +    if (ret != size) {
> +        goto err;
> +    }
> +
> +    return 0;
> +
> +err:
> +    return ret < 0 ? ret : -EIO;
> +}
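
For readers following along: compare_chr_send() above writes a 4-byte
length in network byte order followed by the payload. A standalone sketch
of that framing, assuming the receiving side parses the same layout, could
look like this. frame_encode() and frame_decode() are illustrative names
only and do not exist in QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write [be32 length][payload] into out.
 * Returns the number of bytes written, or 0 if size is 0 or out is too small.
 */
static size_t frame_encode(uint8_t *out, size_t cap,
                           const uint8_t *buf, uint32_t size)
{
    assert(out && buf);

    if (size == 0 || cap < 4 || cap - 4 < size) {
        return 0;
    }
    out[0] = (uint8_t)(size >> 24);
    out[1] = (uint8_t)(size >> 16);
    out[2] = (uint8_t)(size >> 8);
    out[3] = (uint8_t)size;
    memcpy(out + 4, buf, size);
    return 4 + (size_t)size;
}

/*
 * Parse one frame from in. Returns 0 and sets payload and size when a
 * complete frame is available, -1 when more bytes are needed.
 */
static int frame_decode(const uint8_t *in, size_t avail,
                        const uint8_t **payload, uint32_t *size)
{
    uint32_t len;

    assert(in && payload && size);

    if (avail < 4) {
        return -1;
    }
    len = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
          ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    if (avail - 4 < len) {
        return -1;
    }
    *payload = in + 4;
    *size = len;
    return 0;
}
```
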
> +
>   static char *compare_get_pri_indev(Object *obj, Error **errp)
>   {
>       CompareState *s = COLO_COMPARE(obj);
> @@ -104,12 +202,21 @@ static void compare_set_outdev(Object *obj, const char *value, Error **errp)
>   
>   static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>   {
> -    /* if packet_enqueue pri pkt failed we will send unsupported packet */
> +    CompareState *s = container_of(pri_rs, CompareState, pri_rs);
> +
> +    if (packet_enqueue(s, PRIMARY_IN)) {
> +        trace_colo_compare_main("primary: unsupported packet in");
> +        compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
> +    }
>   }
>   
>   static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>   {
> -    /* if packet_enqueue sec pkt failed we will notify trace */
> +    CompareState *s = container_of(sec_rs, CompareState, sec_rs);
> +
> +    if (packet_enqueue(s, SECONDARY_IN)) {
> +        trace_colo_compare_main("secondary: unsupported packet in");
> +    }
>   }
>   
>   static int compare_chardev_opts(void *opaque,
> @@ -218,6 +325,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>   
> +    /* use g_hash_table_new_full() to new a hashtable */
> +
>       return;
>   }
>   
> diff --git a/net/colo.c b/net/colo.c
> new file mode 100644
> index 0000000..4daedd4
> --- /dev/null
> +++ b/net/colo.c
> @@ -0,0 +1,70 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2016 FUJITSU LIMITED
> + * Copyright (c) 2016 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "net/colo.h"
> +
> +int parse_packet_early(Packet *pkt)
> +{
> +    int network_length;
> +    uint8_t *data = pkt->data;
> +    uint16_t l3_proto;
> +    ssize_t l2hdr_len = eth_get_l2_hdr_length(data);
> +
> +    if (pkt->size < ETH_HLEN) {
> +        error_report("pkt->size < ETH_HLEN");

This can be triggered by the guest, so better not to use error_report() here.

> +        return 1;
> +    }
> +    pkt->network_header = data + ETH_HLEN;

Should this use l2hdr_len instead of ETH_HLEN?

> +    l3_proto = eth_get_l3_proto(data, l2hdr_len);
> +    if (l3_proto != ETH_P_IP) {
> +        return 1;
> +    }
> +
> +    network_length = pkt->ip->ip_hl * 4;
> +    if (pkt->size < ETH_HLEN + network_length) {

Ditto.

> +        error_report("pkt->size < network_header + network_length");

And better not to use error_report() here either, since this can be
triggered by the guest.
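
To illustrate the l2hdr_len point, here is a standalone sketch (plain C, no
QEMU headers) that derives the L2 header length from the frame, checks
bounds before every read, and fails quietly instead of logging. It only
handles a single 802.1Q tag; the SK_* constants and parse_early_sketch() are
names invented for this example, not the eth_get_l2_hdr_length() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_HLEN   14      /* untagged Ethernet header */
#define SK_VLAN_HLEN  4       /* one 802.1Q tag */
#define SK_ETH_P_IP   0x0800
#define SK_ETH_P_VLAN 0x8100

static uint16_t rd_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Returns 0 and fills the header offsets for an IPv4 frame,
 * 1 for anything unsupported or truncated. No logging on failure,
 * because the input is guest controlled.
 */
static int parse_early_sketch(const uint8_t *data, size_t size,
                              size_t *net_off, size_t *trans_off)
{
    size_t l2len = SK_ETH_HLEN;
    size_t ip_hlen;
    uint16_t proto;

    assert(data && net_off && trans_off);

    if (size < SK_ETH_HLEN) {
        return 1;
    }
    proto = rd_be16(data + 12);
    if (proto == SK_ETH_P_VLAN) {
        if (size < SK_ETH_HLEN + SK_VLAN_HLEN) {
            return 1;
        }
        l2len += SK_VLAN_HLEN;
        proto = rd_be16(data + 16);   /* inner EtherType */
    }
    if (proto != SK_ETH_P_IP) {
        return 1;
    }
    if (size < l2len + 1) {           /* need the version/IHL byte */
        return 1;
    }
    ip_hlen = (size_t)(data[l2len] & 0x0f) * 4;
    if (ip_hlen < 20 || size < l2len + ip_hlen) {
        return 1;
    }
    *net_off = l2len;
    *trans_off = l2len + ip_hlen;
    return 0;
}
```

Note that the size check comes before any header byte is read, whereas the
patch calls eth_get_l2_hdr_length(data) before checking pkt->size.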

> +        return 1;
> +    }
> +    pkt->transport_header = pkt->network_header + network_length;
> +
> +    return 0;
> +}
> +
> +Packet *packet_new(const void *data, int size)
> +{
> +    Packet *pkt = g_slice_new(Packet);
> +
> +    pkt->data = g_memdup(data, size);
> +    pkt->size = size;
> +
> +    return pkt;
> +}
> +
> +void packet_destroy(void *opaque, void *user_data)
> +{
> +    Packet *pkt = opaque;
> +
> +    g_free(pkt->data);
> +    g_slice_free(Packet, pkt);
> +}
> +
> +/*
> + * Clear hashtable, stop this hash growing really huge
> + */
> +void connection_hashtable_reset(GHashTable *connection_track_table)
> +{
> +    g_hash_table_remove_all(connection_track_table);
> +}
> diff --git a/net/colo.h b/net/colo.h
> new file mode 100644
> index 0000000..8559f28
> --- /dev/null
> +++ b/net/colo.h
> @@ -0,0 +1,38 @@
> +/*
> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
> + * (a.k.a. Fault Tolerance or Continuous Replication)
> + *
> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
> + * Copyright (c) 2016 FUJITSU LIMITED
> + * Copyright (c) 2016 Intel Corporation
> + *
> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * later.  See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_COLO_BASE_H
> +#define QEMU_COLO_BASE_H
> +
> +#include "slirp/slirp.h"
> +#include "qemu/jhash.h"
> +
> +#define HASHTABLE_MAX_SIZE 16384
> +
> +typedef struct Packet {
> +    void *data;
> +    union {
> +        uint8_t *network_header;
> +        struct ip *ip;
> +    };
> +    uint8_t *transport_header;
> +    int size;
> +} Packet;
> +
> +int parse_packet_early(Packet *pkt);
> +void connection_hashtable_reset(GHashTable *connection_track_table);
> +Packet *packet_new(const void *data, int size);
> +void packet_destroy(void *opaque, void *user_data);
> +
> +#endif /* QEMU_COLO_BASE_H */
> diff --git a/trace-events b/trace-events
> index ca7211b..703de1a 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1916,3 +1916,6 @@ aspeed_vic_update_fiq(int flags) "Raising FIQ: %d"
>   aspeed_vic_update_irq(int flags) "Raising IRQ: %d"
>   aspeed_vic_read(uint64_t offset, unsigned size, uint32_t value) "From 0x%" PRIx64 " of size %u: 0x%" PRIx32
>   aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64 " of size %u: 0x%" PRIx32
> +
> +# net/colo-compare.c
> +colo_compare_main(const char *chr) ": %s"


* Re: [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu Zhang Chen
@ 2016-08-31  8:05   ` Jason Wang
  2016-08-31  9:20     ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  8:05 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016-08-17 16:10, Zhang Chen wrote:
> Jhash used by colo-compare and filter-rewriter

s/used/will be used/

> to save and lookup net connection info
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   include/qemu/jhash.h | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 59 insertions(+)
>   create mode 100644 include/qemu/jhash.h
>
> diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
> new file mode 100644
> index 0000000..7222242
> --- /dev/null
> +++ b/include/qemu/jhash.h
> @@ -0,0 +1,59 @@
> +/* jhash.h: Jenkins hash support.
> +  *
> +  * Copyright (C) 2006. Bob Jenkins (bob_jenkins@burtleburtle.net)
> +  *
> +  * http://burtleburtle.net/bob/hash/
> +  *
> +  * These are the credits from Bob's sources:
> +  *
> +  * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
> +  *
> +  * These are functions for producing 32-bit hashes for hash table lookup.
> +  * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
> +  * are externally useful functions.  Routines to test the hash are included
> +  * if SELF_TEST is defined.  You can use this free for any purpose. It's in
> +  * the public domain.  It has no warranty.
> +  *
> +  * Copyright (C) 2009-2010 Jozsef Kadlecsik (kadlec@blackhole.kfki.hu)
> +  *
> +  * I've modified Bob's hash to be useful in the Linux kernel, and
> +  * any bugs present are my fault.
> +  * Jozsef
> +  */
> +
> +#ifndef QEMU_JHASH_H__
> +#define QEMU_JHASH_H__
> +
> +#include "qemu/bitops.h"
> +
> +/*
> + * hashtable relation copy from linux kernel jhash
> + */
> +
> +/* __jhash_mix -- mix 3 32-bit values reversibly. */
> +#define __jhash_mix(a, b, c)                \
> +{                                           \
> +    a -= c;  a ^= rol32(c, 4);  c += b;     \
> +    b -= a;  b ^= rol32(a, 6);  a += c;     \
> +    c -= b;  c ^= rol32(b, 8);  b += a;     \
> +    a -= c;  a ^= rol32(c, 16); c += b;     \
> +    b -= a;  b ^= rol32(a, 19); a += c;     \
> +    c -= b;  c ^= rol32(b, 4);  b += a;     \
> +}
> +
> +/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
> +#define __jhash_final(a, b, c)  \
> +{                               \
> +    c ^= b; c -= rol32(b, 14);  \
> +    a ^= c; a -= rol32(c, 11);  \
> +    b ^= a; b -= rol32(a, 25);  \
> +    c ^= b; c -= rol32(b, 16);  \
> +    a ^= c; a -= rol32(c, 4);   \
> +    b ^= a; b -= rol32(a, 14);  \
> +    c ^= b; c -= rol32(b, 24);  \
> +}
> +
> +/* An arbitrary initial parameter */
> +#define JHASH_INITVAL           0xdeadbeef
> +
> +#endif /* QEMU_JHASH_H__ */
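
For context, a minimal standalone sketch of how these macros might be used
to hash a connection key. The __jhash_final macro and JHASH_INITVAL are
copied from the patch; rol32() is re-implemented locally because
qemu/bitops.h is not available outside the tree; ConnKeySketch and
conn_key_hash() are hypothetical and only loosely modeled on the kernel's
three-word hash, not taken from a later patch in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for rol32() from qemu/bitops.h. */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
    return (word << (shift & 31)) | (word >> ((32 - shift) & 31));
}

/* Copied from the patch above. */
#define __jhash_final(a, b, c)  \
{                               \
    c ^= b; c -= rol32(b, 14);  \
    a ^= c; a -= rol32(c, 11);  \
    b ^= a; b -= rol32(a, 25);  \
    c ^= b; c -= rol32(b, 16);  \
    a ^= c; a -= rol32(c, 4);   \
    b ^= a; b -= rol32(a, 14);  \
    c ^= b; c -= rol32(b, 24);  \
}

#define JHASH_INITVAL 0xdeadbeef

/* Hypothetical connection key; field layout is an assumption. */
typedef struct ConnKeySketch {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
} ConnKeySketch;

/* Fold the key into three 32-bit words and run the final mix. */
static uint32_t conn_key_hash(const ConnKeySketch *k)
{
    uint32_t a, b, c;

    assert(k);
    a = b = c = JHASH_INITVAL + (3 << 2);
    a += k->src_ip;
    b += k->dst_ip;
    c += ((uint32_t)k->src_port << 16) | k->dst_port;
    __jhash_final(a, b, c);
    return c;
}
```

A function of this kind could serve as the hash callback for the
connection_track_table, paired with an equality function over the same
fields.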


* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-31  7:53   ` Jason Wang
@ 2016-08-31  8:06     ` Hailiang Zhang
  2016-08-31  9:03     ` Zhang Chen
  1 sibling, 0 replies; 32+ messages in thread
From: Hailiang Zhang @ 2016-08-31  8:06 UTC (permalink / raw)
  To: Jason Wang, Zhang Chen, qemu devel
  Cc: peter.huangpeng, Li Zhijian, Wen Congyang, eddie . dong,
	Dr . David Alan Gilbert

On 2016/8/31 15:53, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> This a COLO net ascii figure:
>>
>>    Primary qemu                                                           Secondary qemu
>> +--------------------------------------------------------------+       +----------------------------------------------------------------+
>> | +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
>> | |                                                          | |       |  |                                                           | |
>> | |                        guest                             | |       |  |                        guest                              | |
>> | |                                                          | |       |  |                                                           | |
>> | +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
>> |         |                          |                         |       |                        ^        |                              |
>> |         |                          |                         |       |                        |        |                              |
>> |         |  +------------------------------------------------------+  |                        |        |                              |
>> |netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
>> | +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
>> | |       |  |                       |      |        out       |    |  |  |                     |        |  filter excute order       | |
>> | |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
>> | |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
>> | | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
>> | | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
>> | | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +-------->  filter   +--> adjust |   adjust     +-->   filter   | | |
>> | | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
>> | | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
>> | | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
>> | |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
>> | |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
>> | |      |             +--------------+     |                  |  |    |                                                   |            |
>> | |      |   filter excute order      |     |                  |  |    |                                                   |            |
>> | |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
>> | +-----------------------------------------+                  |       |                                                                |
>> |        |                            |                        |       |                                                                |
>> +--------------------------------------------------------------+       +----------------------------------------------------------------+
>>            |guest receive               | guest send
>>            |                            |
>> +--------+----------------------------v------------------------+
>> |                                                              |                          NOTE: filter direction is rx/tx/all
>> |                         tap                                  |                          rx:receive packets sent to the netdev
>> |                                                              |                          tx:receive packets sent by the netdev
>> +--------------------------------------------------------------+
>
> It's better to add a doc under docs to explain this configuration in
> detail on top of this series.
>

Agreed! I'm adding a new document that introduces COLO (COLO-FT.txt).
Since the COLO proxy is only used by COLO, IMHO a separate document for
it is unnecessary; we can add this description to COLO-FT.txt after it
has been merged ...

>> In COLO-compare, we do packet comparing job.
>> Packets coming from the primary char indev will be sent to outdev.
>> Packets coming from the secondary char dev will be dropped after comparing.
>> colo-comapre need two input chardev and one output chardev:
>> primary_in=chardev1-id (source: primary send packet)
>> secondary_in=chardev2-id (source: secondary send packet)
>> outdev=chardev3-id
>>
>> usage:
>>
>> primary:
>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>> -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>> -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>> -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>
>> secondary:
>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down script=/etc/qemu-ifdown
>> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>> -chardev socket,id=red0,host=3.3.3.3,port=9003
>> -chardev socket,id=red1,host=3.3.3.3,port=9004
>> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>    net/Makefile.objs  |   1 +
>>    net/colo-compare.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>    qemu-options.hx    |  39 ++++++++
>>    vl.c               |   3 +-
>>    4 files changed, 326 insertions(+), 1 deletion(-)
>>    create mode 100644 net/colo-compare.c
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index b7c22fd..ba92f73 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>>    common-obj-y += filter.o
>>    common-obj-y += filter-buffer.o
>>    common-obj-y += filter-mirror.o
>> +common-obj-y += colo-compare.o
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> new file mode 100644
>> index 0000000..cdc3e0e
>> --- /dev/null
>> +++ b/net/colo-compare.c
>> @@ -0,0 +1,284 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2016 FUJITSU LIMITED
>> + * Copyright (c) 2016 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu-common.h"
>> +#include "qapi/qmp/qerror.h"
>> +#include "qapi/error.h"
>> +#include "net/net.h"
>> +#include "net/vhost_net.h"
>
> Looks unnecessary.
>
>> +#include "qom/object_interfaces.h"
>> +#include "qemu/iov.h"
>> +#include "qom/object.h"
>> +#include "qemu/typedefs.h"
>> +#include "net/queue.h"
>> +#include "sysemu/char.h"
>> +#include "qemu/sockets.h"
>> +#include "qapi-visit.h"
>> +
>> +#define TYPE_COLO_COMPARE "colo-compare"
>> +#define COLO_COMPARE(obj) \
>> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>> +
>> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
>> +
>> +typedef struct CompareState {
>> +    Object parent;
>> +
>> +    char *pri_indev;
>> +    char *sec_indev;
>> +    char *outdev;
>> +    CharDriverState *chr_pri_in;
>> +    CharDriverState *chr_sec_in;
>> +    CharDriverState *chr_out;
>> +    QTAILQ_ENTRY(CompareState) next;
>
> This looks unused in this patch; it is only used in the commit
> "colo-compare and filter-rewriter work with colo-frame". We'd better
> delay introducing it until that patch.
>
>> +    SocketReadState pri_rs;
>> +    SocketReadState sec_rs;
>> +} CompareState;
>> +
>> +typedef struct CompareClass {
>> +    ObjectClass parent_class;
>> +} CompareClass;
>> +
>> +typedef struct CompareChardevProps {
>> +    bool is_socket;
>> +    bool is_unix;
>> +} CompareChardevProps;
>> +
>> +static char *compare_get_pri_indev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->pri_indev);
>> +}
>> +
>> +static void compare_set_pri_indev(Object *obj, const char *value, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->pri_indev);
>> +    s->pri_indev = g_strdup(value);
>> +}
>> +
>> +static char *compare_get_sec_indev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->sec_indev);
>> +}
>> +
>> +static void compare_set_sec_indev(Object *obj, const char *value, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->sec_indev);
>> +    s->sec_indev = g_strdup(value);
>> +}
>> +
>> +static char *compare_get_outdev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->outdev);
>> +}
>> +
>> +static void compare_set_outdev(Object *obj, const char *value, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->outdev);
>> +    s->outdev = g_strdup(value);
>> +}
>> +
>> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>> +{
>> +    /* if packet_enqueue pri pkt failed we will send unsupported packet */
>> +}
>> +
>> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>> +{
>> +    /* if packet_enqueue sec pkt failed we will notify trace */
>> +}
>> +
>> +static int compare_chardev_opts(void *opaque,
>> +                                const char *name, const char *value,
>> +                                Error **errp)
>> +{
>> +    CompareChardevProps *props = opaque;
>> +
>> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
>> +        props->is_socket = true;
>> +    } else if (strcmp(name, "host") == 0) {
>
> Typo? net_vhost_chardev_opts() did:
>
>       } else if (strcmp(name, "path") == 0) {
>           props->is_unix = true;
>       }
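
A minimal sketch of the distinction pointed out above, assuming the usual convention that a socket chardev given a "path" option is a unix socket while one given "host" is a TCP socket. SockProps and classify_opt() are hypothetical stand-ins for illustration, not QEMU code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for CompareChardevProps, with an extra TCP flag. */
typedef struct SockProps {
    bool is_socket;
    bool is_unix;
    bool is_tcp;
} SockProps;

/*
 * Record what a single name=value chardev option says about the backend.
 * Returns 0 if the option is recognised, -1 otherwise.
 */
static int classify_opt(SockProps *props, const char *name, const char *value)
{
    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
        props->is_socket = true;
    } else if (strcmp(name, "path") == 0) {
        props->is_unix = true;      /* unix sockets are named by path */
    } else if (strcmp(name, "host") == 0) {
        props->is_tcp = true;       /* TCP sockets are named by host */
    } else if (strcmp(name, "port") == 0 ||
               strcmp(name, "server") == 0 ||
               strcmp(name, "wait") == 0) {
        /* accepted, but carries no information about the socket type */
    } else {
        return -1;
    }
    return 0;
}
```

With flags split this way, the later check could accept either kind of socket instead of requiring is_unix.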
>
>
>
>> +        props->is_unix = true;
>> +    } else if (strcmp(name, "port") == 0) {
>> +    } else if (strcmp(name, "server") == 0) {
>> +    } else if (strcmp(name, "wait") == 0) {
>> +    } else {
>> +        error_setg(errp,
>> +                   "COLO-compare does not support a chardev with option %s=%s",
>> +                   name, value);
>> +        return -1;
>> +    }
>> +    return 0;
>> +}
>> +
>> +/*
>> + * called from the main thread on the primary
>> + * to setup colo-compare.
>> + */
>> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(uc);
>> +    CompareChardevProps props;
>> +
>> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>> +        error_setg(errp, "colo compare needs 'primary_in' ,"
>> +                   "'secondary_in','outdev' property set");
>> +        return;
>> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
>> +               !strcmp(s->sec_indev, s->outdev) ||
>> +               !strcmp(s->pri_indev, s->sec_indev)) {
>> +        error_setg(errp, "'indev' and 'outdev' could not be same "
>> +                   "for compare module");
>> +        return;
>> +    }
>> +
>> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
>> +    if (s->chr_pri_in == NULL) {
>> +        error_setg(errp, "Primary IN Device '%s' not found",
>> +                   s->pri_indev);
>> +        return;
>> +    }
>> +
>> +    /* inspect chardev opts */
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_pri_in->opts, compare_chardev_opts, &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->pri_indev);
>> +        return;
>> +    }
>> +
>> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
>> +    if (s->chr_sec_in == NULL) {
>> +        error_setg(errp, "Secondary IN Device '%s' not found",
>> +                   s->sec_indev);
>> +        return;
>> +    }
>> +
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_sec_in->opts, compare_chardev_opts, &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->sec_indev);
>
> I believe TCP sockets are also supported?
>
>> +        return;
>> +    }
>> +
>> +    s->chr_out = qemu_chr_find(s->outdev);
>> +    if (s->chr_out == NULL) {
>> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
>> +        return;
>> +    }
>> +
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->outdev);
>
> Ditto. There's also code duplication here; please introduce a helper to do the above.
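
One possible shape for such a helper, as a rough sketch only. FakeChardev, fake_chr_find() and find_and_check_chardev() are made-up names that stand in for CharDriverState, qemu_chr_find() and the real helper; error reporting is reduced to a string so the example is self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a chardev: an id plus two type flags. */
typedef struct FakeChardev {
    const char *id;
    bool is_socket;
    bool is_stream;   /* unix or TCP stream socket */
} FakeChardev;

static FakeChardev fake_chardevs[] = {
    { "compare0-0", true,  true  },
    { "compare1",   true,  true  },
    { "serial0",    false, false },
};

/* Stand-in for qemu_chr_find(): look a chardev up by id. */
static FakeChardev *fake_chr_find(const char *id)
{
    size_t n = sizeof(fake_chardevs) / sizeof(fake_chardevs[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(fake_chardevs[i].id, id) == 0) {
            return &fake_chardevs[i];
        }
    }
    return NULL;
}

/*
 * One helper replaces the three copies of "find, inspect, reject".
 * On failure it returns NULL and points *err at a static message.
 */
static FakeChardev *find_and_check_chardev(const char *id, const char **err)
{
    FakeChardev *chr = fake_chr_find(id);

    if (chr == NULL) {
        *err = "device not found";
        return NULL;
    }
    if (!chr->is_socket || !chr->is_stream) {
        *err = "chardev is not a socket";
        return NULL;
    }
    *err = NULL;
    return chr;
}
```

colo_compare_complete() would then call the helper once each for primary_in, secondary_in and outdev and return on the first failure.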
>
>> +        return;
>> +    }
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_out);
>> +
>> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>> +
>> +    return;
>> +}
>> +
>> +static void colo_compare_class_init(ObjectClass *oc, void *data)
>> +{
>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>> +
>> +    ucc->complete = colo_compare_complete;
>> +}
>> +
>> +static void colo_compare_init(Object *obj)
>> +{
>> +    object_property_add_str(obj, "primary_in",
>> +                            compare_get_pri_indev, compare_set_pri_indev,
>> +                            NULL);
>> +    object_property_add_str(obj, "secondary_in",
>> +                            compare_get_sec_indev, compare_set_sec_indev,
>> +                            NULL);
>> +    object_property_add_str(obj, "outdev",
>> +                            compare_get_outdev, compare_set_outdev,
>> +                            NULL);
>> +}
>> +
>> +static void colo_compare_finalize(Object *obj)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    if (s->chr_pri_in) {
>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
>> +        qemu_chr_fe_release(s->chr_pri_in);
>> +    }
>> +    if (s->chr_sec_in) {
>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
>> +        qemu_chr_fe_release(s->chr_sec_in);
>> +    }
>> +    if (s->chr_out) {
>> +        qemu_chr_fe_release(s->chr_out);
>> +    }
>> +
>> +    g_free(s->pri_indev);
>> +    g_free(s->sec_indev);
>> +    g_free(s->outdev);
>> +}
>> +
>> +static const TypeInfo colo_compare_info = {
>> +    .name = TYPE_COLO_COMPARE,
>> +    .parent = TYPE_OBJECT,
>> +    .instance_size = sizeof(CompareState),
>> +    .instance_init = colo_compare_init,
>> +    .instance_finalize = colo_compare_finalize,
>> +    .class_size = sizeof(CompareClass),
>> +    .class_init = colo_compare_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_USER_CREATABLE },
>> +        { }
>> +    }
>> +};
>> +
>> +static void register_types(void)
>> +{
>> +    type_register_static(&colo_compare_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 587de8f..33d5d0b 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} to the file specified by
>>    The file format is libpcap, so it can be analyzed with tools such as tcpdump
>>    or Wireshark.
>>
>> +@item -object colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
>> +outdev=@var{chardevid}
>> +
>> +Colo-compare gets packet from primary_in@var{chardevid} and secondary_in@var{chardevid}, than compare primary packet with
>> +secondary packet. If the packet same, we will output primary
>
> s/If the packet same/If the packets are the same/.
>
>> +packet to outdev@var{chardevid}, else we will notify colo-frame
>> +do checkpoint and send primary packet to outdev@var{chardevid}.
>> +
>> +we can use it with the help of filter-mirror and filter-redirector.
>
> s/we/We/. Also, it looks like colo-compare must be used together with
> filter-mirror and filter-redirector?
>
>> +
>> +@example
>> +
>> +primary:
>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>> +-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>> +-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>> +-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>> +
>> +secondary:
>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>> +-chardev socket,id=red0,host=3.3.3.3,port=9003
>> +-chardev socket,id=red1,host=3.3.3.3,port=9004
>> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>> +
>> +@end example
>> +
>> +If you want to know the detail of above command line, you can read
>> +the colo-compare git log.
>> +
>>    @item -object secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>    @item -object secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>
>> diff --git a/vl.c b/vl.c
>> index cbe51ac..c6b9a6f 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char *type)
>>        if (g_str_equal(type, "filter-buffer") ||
>>            g_str_equal(type, "filter-dump") ||
>>            g_str_equal(type, "filter-mirror") ||
>> -        g_str_equal(type, "filter-redirector")) {
>> +        g_str_equal(type, "filter-redirector") ||
>> +        g_str_equal(type, "colo-compare")) {
>>            return false;
>>        }
>>


* Re: [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet Zhang Chen
@ 2016-08-31  8:52   ` Jason Wang
  2016-08-31 11:52     ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  8:52 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016-08-17 16:10, Zhang Chen wrote:
> In this patch we use kernel jhash table to track
> connection, and then enqueue net packet like this:
>
> + CompareState ++
> |               |
> +---------------+   +---------------+         +---------------+
> |conn list      +--->conn           +--------->conn           |
> +---------------+   +---------------+         +---------------+
> |               |     |           |             |          |
> +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
>                    |primary |  |secondary    |primary | |secondary
>                    |packet  |  |packet  +    |packet  | |packet  +
>                    +--------+  +--------+    +--------+ +--------+
>                        |           |             |          |
>                    +---v----+  +---v----+    +---v----+ +---v----+
>                    |primary |  |secondary    |primary | |secondary
>                    |packet  |  |packet  +    |packet  | |packet  +
>                    +--------+  +--------+    +--------+ +--------+
>                        |           |             |          |
>                    +---v----+  +---v----+    +---v----+ +---v----+
>                    |primary |  |secondary    |primary | |secondary
>                    |packet  |  |packet  +    |packet  | |packet  +
>                    +--------+  +--------+    +--------+ +--------+
>
> We use conn_list to record connection info.
> When we want to enqueue a packet, we first get the
> connection from connection_track_table, then push
> the packet to the g_queue (pri/sec) in its own conn.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   net/colo-compare.c |  51 ++++++++++++++++++-----
>   net/colo.c         | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo.h         |  27 +++++++++++++
>   3 files changed, 185 insertions(+), 10 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index d9e4459..bab215b 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -72,6 +72,11 @@ typedef struct CompareState {
>       SocketReadState pri_rs;
>       SocketReadState sec_rs;
>   
> +    /* connection list: the connections belonged to this NIC could be found
> +     * in this list.
> +     * element type: Connection
> +     */
> +    GQueue conn_list;
>       /* hashtable to save connection */
>       GHashTable *connection_track_table;
>   } CompareState;
> @@ -100,7 +105,9 @@ static int compare_chr_send(CharDriverState *out,
>    */
>   static int packet_enqueue(CompareState *s, int mode)
>   {
> +    ConnectionKey key = {{ 0 } };
>       Packet *pkt = NULL;
> +    Connection *conn;
>   
>       if (mode == PRIMARY_IN) {
>           pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
> @@ -113,17 +120,34 @@ static int packet_enqueue(CompareState *s, int mode)
>           pkt = NULL;
>           return -1;
>       }
> -    /* TODO: get connection key from pkt */
> +    fill_connection_key(pkt, &key);
>   
> -    /*
> -     * TODO: use connection key get conn from
> -     * connection_track_table
> -     */
> +    conn = connection_get(s->connection_track_table,
> +                          &key,
> +                          &s->conn_list);
>   
> -    /*
> -     * TODO: insert pkt to it's conn->primary_list
> -     * or conn->secondary_list
> -     */
> +    if (!conn->processing) {
> +        g_queue_push_tail(&s->conn_list, conn);
> +        conn->processing = true;
> +    }
> +
> +    if (mode == PRIMARY_IN) {
> +        if (g_queue_get_length(&conn->primary_list) <
> +                               MAX_QUEUE_SIZE) {

Should be "<=" I believe.
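
For reference, the two guards differ only in the final queue length they allow. A toy counter (no GLib; DEMO_MAX_QUEUE_SIZE is a made-up value) shows that "<" stops at the limit itself while "<=" admits one more element, so the right choice depends on whether MAX_QUEUE_SIZE is meant to be the largest allowed length:

```c
#include <assert.h>

#define DEMO_MAX_QUEUE_SIZE 4   /* hypothetical; small so the effect is obvious */

/* Push until the "<" guard refuses; return the final queue length. */
static int fill_with_less_than(void)
{
    int len = 0;

    for (int i = 0; i < 100; i++) {
        if (len < DEMO_MAX_QUEUE_SIZE) {
            len++;
        }
    }
    return len;
}

/* Same loop with the "<=" guard. */
static int fill_with_less_or_equal(void)
{
    int len = 0;

    for (int i = 0; i < 100; i++) {
        if (len <= DEMO_MAX_QUEUE_SIZE) {
            len++;
        }
    }
    return len;
}
```

Here fill_with_less_than() ends at 4 and fill_with_less_or_equal() ends at 5.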

> +            g_queue_push_tail(&conn->primary_list, pkt);
> +        } else {
> +            error_report("colo compare primary queue size too big,"
> +            "drop packet");

indentation here looks odd.

> +        }
> +    } else {
> +        if (g_queue_get_length(&conn->secondary_list) <
> +                               MAX_QUEUE_SIZE) {
> +            g_queue_push_tail(&conn->secondary_list, pkt);
> +        } else {
> +            error_report("colo compare secondary queue size too big,"
> +            "drop packet");
> +        }
> +    }
>   
>       return 0;
>   }
> @@ -325,7 +349,12 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>   
> -    /* use g_hash_table_new_full() to new a hashtable */
> +    g_queue_init(&s->conn_list);
> +
> +    s->connection_track_table = g_hash_table_new_full(connection_key_hash,
> +                                                      connection_key_equal,
> +                                                      g_free,
> +                                                      connection_destroy);
>   
>       return;
>   }
> @@ -366,6 +395,8 @@ static void colo_compare_finalize(Object *obj)
>           qemu_chr_fe_release(s->chr_out);
>       }
>   
> +    g_queue_free(&s->conn_list);
> +
>       g_free(s->pri_indev);
>       g_free(s->sec_indev);
>       g_free(s->outdev);
> diff --git a/net/colo.c b/net/colo.c
> index 4daedd4..bc86553 100644
> --- a/net/colo.c
> +++ b/net/colo.c
> @@ -16,6 +16,29 @@
>   #include "qemu/error-report.h"
>   #include "net/colo.h"
>   
> +uint32_t connection_key_hash(const void *opaque)
> +{
> +    const ConnectionKey *key = opaque;
> +    uint32_t a, b, c;
> +
> +    /* Jenkins hash */
> +    a = b = c = JHASH_INITVAL + sizeof(*key);
> +    a += key->src.s_addr;
> +    b += key->dst.s_addr;
> +    c += (key->src_port | key->dst_port << 16);
> +    __jhash_mix(a, b, c);
> +
> +    a += key->ip_proto;
> +    __jhash_final(a, b, c);
> +
> +    return c;
> +}
> +
> +int connection_key_equal(const void *key1, const void *key2)
> +{
> +    return memcmp(key1, key2, sizeof(ConnectionKey)) == 0;
> +}
> +
>   int parse_packet_early(Packet *pkt)
>   {
>       int network_length;
> @@ -43,6 +66,62 @@ int parse_packet_early(Packet *pkt)
>       return 0;
>   }
>   
> +void fill_connection_key(Packet *pkt, ConnectionKey *key)
> +{
> +    uint32_t tmp_ports;
> +
> +    key->ip_proto = pkt->ip->ip_p;
> +
> +    switch (key->ip_proto) {
> +    case IPPROTO_TCP:
> +    case IPPROTO_UDP:
> +    case IPPROTO_DCCP:
> +    case IPPROTO_ESP:
> +    case IPPROTO_SCTP:
> +    case IPPROTO_UDPLITE:
> +        tmp_ports = *(uint32_t *)(pkt->transport_header);
> +        key->src = pkt->ip->ip_src;
> +        key->dst = pkt->ip->ip_dst;
> +        key->src_port = ntohs(tmp_ports & 0xffff);
> +        key->dst_port = ntohs(tmp_ports >> 16);
> +        break;
> +    case IPPROTO_AH:
> +        tmp_ports = *(uint32_t *)(pkt->transport_header + 4);
> +        key->src = pkt->ip->ip_src;
> +        key->dst = pkt->ip->ip_dst;
> +        key->src_port = ntohs(tmp_ports & 0xffff);
> +        key->dst_port = ntohs(tmp_ports >> 16);
> +        break;
> +    default:
> +        key->src_port = 0;
> +        key->dst_port = 0;
> +        break;
> +    }
> +}
> +
> +Connection *connection_new(ConnectionKey *key)
> +{
> +    Connection *conn = g_slice_new(Connection);
> +
> +    conn->ip_proto = key->ip_proto;
> +    conn->processing = false;
> +    g_queue_init(&conn->primary_list);
> +    g_queue_init(&conn->secondary_list);
> +
> +    return conn;
> +}
> +
> +void connection_destroy(void *opaque)
> +{
> +    Connection *conn = opaque;
> +
> +    g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
> +    g_queue_free(&conn->primary_list);
> +    g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
> +    g_queue_free(&conn->secondary_list);
> +    g_slice_free(Connection, conn);
> +}
> +
>   Packet *packet_new(const void *data, int size)
>   {
>       Packet *pkt = g_slice_new(Packet);
> @@ -68,3 +147,41 @@ void connection_hashtable_reset(GHashTable *connection_track_table)
>   {
>       g_hash_table_remove_all(connection_track_table);
>   }
> +
> +static void colo_rm_connection(void *opaque, void *user_data)
> +{

user_data is unused here.

> +    connection_destroy(opaque);
> +}
> +
> +/* if not found, create a new connection and add to hash table */
> +Connection *connection_get(GHashTable *connection_track_table,
> +                           ConnectionKey *key,
> +                           GQueue *conn_list)
> +{
> +    Connection *conn = g_hash_table_lookup(connection_track_table, key);
> +    static uint32_t hashtable_size;
> +
> +    if (conn == NULL) {
> +        ConnectionKey *new_key = g_memdup(key, sizeof(*key));
> +
> +        conn = connection_new(key);
> +
> +        hashtable_size += 1;

Use of uninitialized variable?
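
Strictly speaking, a function-local static object in C has static storage duration and is zero-initialised when no initialiser is given, so the counter does start at 0; an explicit "= 0" would only make that visible to the reader. A minimal illustration with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Count calls using a function-local static without an initialiser. */
static uint32_t bump_counter(void)
{
    static uint32_t counter;   /* zero-initialised by the language */

    counter += 1;
    return counter;
}
```

The first call returns 1 and the second returns 2. Note that such a counter is shared by every caller of the function, whichever table they pass in.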

> +        if (hashtable_size > HASHTABLE_MAX_SIZE) {

Should we use g_hash_table_size() here?

> +            error_report("colo proxy connection hashtable full, clear it");
> +            connection_hashtable_reset(connection_track_table);
> +            /*
> +             * clear the conn_list
> +             */
> +            if (conn_list) {
> +                g_queue_foreach(conn_list, colo_rm_connection, NULL);
> +            }
> +
> +            hashtable_size = 0;
> +        }
> +
> +        g_hash_table_insert(connection_track_table, new_key, conn);

Then there's no need for hashtable_size.
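
The idea, sketched with a toy table so it compiles without GLib. DemoTable and the demo_* functions are hypothetical; in the patch the size would come from g_hash_table_size() on connection_track_table and the reset from connection_hashtable_reset():

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TABLE_MAX 3   /* hypothetical; stands in for HASHTABLE_MAX_SIZE */

/* Toy table that knows its own element count, as GHashTable does. */
typedef struct DemoTable {
    size_t nelems;
} DemoTable;

static size_t demo_table_size(const DemoTable *t) { return t->nelems; }
static void demo_table_remove_all(DemoTable *t)   { t->nelems = 0; }
static void demo_table_insert(DemoTable *t)       { t->nelems++; }

/*
 * Insert one entry, clearing the table first if it is already full.
 * Returns 1 if a reset happened, 0 otherwise.
 */
static int demo_insert_bounded(DemoTable *t)
{
    int reset = 0;

    if (demo_table_size(t) >= DEMO_TABLE_MAX) {
        demo_table_remove_all(t);
        reset = 1;
    }
    demo_table_insert(t);
    return reset;
}
```

Asking the table for its size keeps the limit per table and cannot drift out of sync with the real contents, which a separate static counter can.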

> +    }
> +
> +    return conn;
> +}
> diff --git a/net/colo.h b/net/colo.h
> index 8559f28..9cbc14e 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -30,7 +30,34 @@ typedef struct Packet {
>       int size;
>   } Packet;
>   
> +typedef struct ConnectionKey {
> +    /* (src, dst) must be grouped, in the same way than in IP header */
> +    struct in_addr src;
> +    struct in_addr dst;
> +    uint16_t src_port;
> +    uint16_t dst_port;
> +    uint8_t ip_proto;
> +} QEMU_PACKED ConnectionKey;
> +
> +typedef struct Connection {
> +    /* connection primary send queue: element type: Packet */
> +    GQueue primary_list;
> +    /* connection secondary send queue: element type: Packet */
> +    GQueue secondary_list;
> +    /* flag to enqueue unprocessed_connections */
> +    bool processing;
> +    uint8_t ip_proto;
> +} Connection;
> +
> +uint32_t connection_key_hash(const void *opaque);
> +int connection_key_equal(const void *opaque1, const void *opaque2);
>   int parse_packet_early(Packet *pkt);
> +void fill_connection_key(Packet *pkt, ConnectionKey *key);
> +Connection *connection_new(ConnectionKey *key);
> +void connection_destroy(void *opaque);
> +Connection *connection_get(GHashTable *connection_track_table,
> +                           ConnectionKey *key,
> +                           GQueue *conn_list);
>   void connection_hashtable_reset(GHashTable *connection_track_table);
>   Packet *packet_new(const void *data, int size);
>   void packet_destroy(void *opaque, void *user_data);


* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-31  7:53   ` Jason Wang
  2016-08-31  8:06     ` Hailiang Zhang
@ 2016-08-31  9:03     ` Zhang Chen
  2016-08-31  9:20       ` Jason Wang
  1 sibling, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-31  9:03 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 08/31/2016 03:53 PM, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> This is a COLO net ASCII figure:
>>
>> [COLO net ASCII figure; the layout was destroyed by line wrapping in
>> this quoted copy. Labels that can still be read:
>>   Primary qemu: guest; netfilter (in filter execute order):
>>     filter mirror (tx), filter redirector (rx), filter redirector (rx),
>>     colo compare (pri in, sec in, out).
>>   Secondary qemu: guest; netfilter (in filter execute order):
>>     filter redirector (tx), rewriter (adjust ack, adjust seq; all),
>>     filter redirector (rx).
>>   Below the primary: tap (guest receive, guest send).
>>   NOTE: filter direction is rx/tx/all
>>     rx: receive packets sent to the netdev
>>     tx: receive packets sent by the netdev]
>
> It's better to add a doc under docs to explain this configuration in 
> detail on top of this series.
>

OK. Should I add /docs/colo-proxy.txt to explain it, or add this to
Hailiang's COLO-FT.txt after it is merged?

>> In COLO-compare, we do the packet comparing job.
>> Packets coming from the primary char indev will be sent to outdev.
>> Packets coming from the secondary char dev will be dropped after comparing.
>> colo-compare needs two input chardevs and one output chardev:
>> primary_in=chardev1-id (source: primary send packet)
>> secondary_in=chardev2-id (source: secondary send packet)
>> outdev=chardev3-id
>>
>> usage:
>>
>> primary:
>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>> -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>> -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>> -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>
>> secondary:
>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>> -chardev socket,id=red0,host=3.3.3.3,port=9003
>> -chardev socket,id=red1,host=3.3.3.3,port=9004
>> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   net/Makefile.objs  |   1 +
>>   net/colo-compare.c | 284 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   qemu-options.hx    |  39 ++++++++
>>   vl.c               |   3 +-
>>   4 files changed, 326 insertions(+), 1 deletion(-)
>>   create mode 100644 net/colo-compare.c
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index b7c22fd..ba92f73 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>>   common-obj-y += filter.o
>>   common-obj-y += filter-buffer.o
>>   common-obj-y += filter-mirror.o
>> +common-obj-y += colo-compare.o
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> new file mode 100644
>> index 0000000..cdc3e0e
>> --- /dev/null
>> +++ b/net/colo-compare.c
>> @@ -0,0 +1,284 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2016 FUJITSU LIMITED
>> + * Copyright (c) 2016 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu-common.h"
>> +#include "qapi/qmp/qerror.h"
>> +#include "qapi/error.h"
>> +#include "net/net.h"
>> +#include "net/vhost_net.h"
>
> Looks unnecessary.

I will remove it.

>
>> +#include "qom/object_interfaces.h"
>> +#include "qemu/iov.h"
>> +#include "qom/object.h"
>> +#include "qemu/typedefs.h"
>> +#include "net/queue.h"
>> +#include "sysemu/char.h"
>> +#include "qemu/sockets.h"
>> +#include "qapi-visit.h"
>> +
>> +#define TYPE_COLO_COMPARE "colo-compare"
>> +#define COLO_COMPARE(obj) \
>> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>> +
>> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
>> +
>> +typedef struct CompareState {
>> +    Object parent;
>> +
>> +    char *pri_indev;
>> +    char *sec_indev;
>> +    char *outdev;
>> +    CharDriverState *chr_pri_in;
>> +    CharDriverState *chr_sec_in;
>> +    CharDriverState *chr_out;
>> +    QTAILQ_ENTRY(CompareState) next;
>
> This does not look used in this series, only in commit "colo-compare and 
> filter-rewriter work with colo-frame". We'd better delay introducing it 
> until that patch.

OK~ I got your point.

>
>> +    SocketReadState pri_rs;
>> +    SocketReadState sec_rs;
>> +} CompareState;
>> +
>> +typedef struct CompareClass {
>> +    ObjectClass parent_class;
>> +} CompareClass;
>> +
>> +typedef struct CompareChardevProps {
>> +    bool is_socket;
>> +    bool is_unix;
>> +} CompareChardevProps;
>> +
>> +static char *compare_get_pri_indev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->pri_indev);
>> +}
>> +
>> +static void compare_set_pri_indev(Object *obj, const char *value, 
>> Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->pri_indev);
>> +    s->pri_indev = g_strdup(value);
>> +}
>> +
>> +static char *compare_get_sec_indev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->sec_indev);
>> +}
>> +
>> +static void compare_set_sec_indev(Object *obj, const char *value, 
>> Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->sec_indev);
>> +    s->sec_indev = g_strdup(value);
>> +}
>> +
>> +static char *compare_get_outdev(Object *obj, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    return g_strdup(s->outdev);
>> +}
>> +
>> +static void compare_set_outdev(Object *obj, const char *value, Error 
>> **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    g_free(s->outdev);
>> +    s->outdev = g_strdup(value);
>> +}
>> +
>> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>> +{
>> +    /* if packet_enqueue pri pkt failed we will send unsupported 
>> packet */
>> +}
>> +
>> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>> +{
>> +    /* if packet_enqueue sec pkt failed we will notify trace */
>> +}
>> +
>> +static int compare_chardev_opts(void *opaque,
>> +                                const char *name, const char *value,
>> +                                Error **errp)
>> +{
>> +    CompareChardevProps *props = opaque;
>> +
>> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
>> +        props->is_socket = true;
>> +    } else if (strcmp(name, "host") == 0) {
>
> Typo? net_vhost_chardev_opts() did:
>
>     } else if (strcmp(name, "path") == 0) {
>         props->is_unix = true;
>     }
>
>

No. In colo-compare we use a chardev like this:

-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait

If we only accept "path" here, this will trigger an error.
Should I add another branch for "path" here?
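
To make the question concrete, here is a minimal standalone sketch of an
option check that accepts either form. It is only an assumption about how
the fix might look, not the code in this patch: SocketProps, is_tcp,
check_socket_opt() and socket_props_ok() are hypothetical names, and the
option names are taken from the command line above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for CompareChardevProps, with a separate TCP flag. */
typedef struct SocketProps {
    bool is_socket;
    bool is_unix;
    bool is_tcp;
} SocketProps;

/*
 * Inspect one chardev option. Returns 0 if the option is acceptable,
 * -1 otherwise. "path" marks a unix socket, "host" marks a TCP socket.
 */
static int check_socket_opt(SocketProps *props,
                            const char *name, const char *value)
{
    if (strcmp(name, "backend") == 0) {
        if (strcmp(value, "socket") != 0) {
            return -1;
        }
        props->is_socket = true;
    } else if (strcmp(name, "path") == 0) {
        props->is_unix = true;
    } else if (strcmp(name, "host") == 0) {
        props->is_tcp = true;
    } else if (strcmp(name, "port") == 0 ||
               strcmp(name, "server") == 0 ||
               strcmp(name, "wait") == 0) {
        /* accepted, nothing to record */
    } else {
        return -1;
    }
    return 0;
}

/* A usable chardev is a socket that is either unix or TCP. */
static bool socket_props_ok(const SocketProps *props)
{
    return props->is_socket && (props->is_unix || props->is_tcp);
}
```

With something like this, the final check would test "socket and (unix or
tcp)" instead of requiring is_unix for a host= chardev.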


>
>> +        props->is_unix = true;
>> +    } else if (strcmp(name, "port") == 0) {
>> +    } else if (strcmp(name, "server") == 0) {
>> +    } else if (strcmp(name, "wait") == 0) {
>> +    } else {
>> +        error_setg(errp,
>> +                   "COLO-compare does not support a chardev with 
>> option %s=%s",
>> +                   name, value);
>> +        return -1;
>> +    }
>> +    return 0;
>> +}
>> +
>> +/*
>> + * called from the main thread on the primary
>> + * to setup colo-compare.
>> + */
>> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
>> +{
>> +    CompareState *s = COLO_COMPARE(uc);
>> +    CompareChardevProps props;
>> +
>> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>> +        error_setg(errp, "colo compare needs 'primary_in' ,"
>> +                   "'secondary_in','outdev' property set");
>> +        return;
>> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
>> +               !strcmp(s->sec_indev, s->outdev) ||
>> +               !strcmp(s->pri_indev, s->sec_indev)) {
>> +        error_setg(errp, "'indev' and 'outdev' could not be same "
>> +                   "for compare module");
>> +        return;
>> +    }
>> +
>> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
>> +    if (s->chr_pri_in == NULL) {
>> +        error_setg(errp, "Primary IN Device '%s' not found",
>> +                   s->pri_indev);
>> +        return;
>> +    }
>> +
>> +    /* inspect chardev opts */
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_pri_in->opts, compare_chardev_opts, 
>> &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->pri_indev);
>> +        return;
>> +    }
>> +
>> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
>> +    if (s->chr_sec_in == NULL) {
>> +        error_setg(errp, "Secondary IN Device '%s' not found",
>> +                   s->sec_indev);
>> +        return;
>> +    }
>> +
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_sec_in->opts, compare_chardev_opts, 
>> &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->sec_indev);
>
> I believe tcp socket is also supported?

If I understand correctly, "tcp socket" here means the "-chardev socket" backend.
I will rename "unix socket" to "tcp socket".

>
>> +        return;
>> +    }
>> +
>> +    s->chr_out = qemu_chr_find(s->outdev);
>> +    if (s->chr_out == NULL) {
>> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
>> +        return;
>> +    }
>> +
>> +    memset(&props, 0, sizeof(props));
>> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, 
>> &props, errp)) {
>> +        return;
>> +    }
>> +
>> +    if (!props.is_socket || !props.is_unix) {
>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>> +                   s->outdev);
>
> Ditto, and there's code duplication, please introduce a helper to do 
> above.

I don't understand what you mean by "helper".
Here we check each chardev; should I change it to "goto error;"?
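
If "helper" means one function that does the lookup and the check for a
single chardev, a rough standalone sketch could be the following. This is
my guess at the intent, not tested against QEMU: FakeChardev,
fake_chr_find() and find_and_check_chardev() are made-up stand-ins (the
real code would use qemu_chr_find() and the opts check), and the error is
returned as a plain string instead of through Error **errp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch compiles outside QEMU. */
typedef struct FakeChardev {
    const char *id;
    bool is_socket;
} FakeChardev;

static FakeChardev fake_chardevs[] = {
    { "compare0-0", true },
    { "compare1", true },
    { "compare_out0", true },
    { "serial0", false },
};

/* Plays the role of qemu_chr_find() in this sketch. */
static FakeChardev *fake_chr_find(const char *id)
{
    size_t i;

    for (i = 0; i < sizeof(fake_chardevs) / sizeof(fake_chardevs[0]); i++) {
        if (strcmp(fake_chardevs[i].id, id) == 0) {
            return &fake_chardevs[i];
        }
    }
    return NULL;
}

/*
 * The "helper": look up one chardev and validate it in a single place.
 * Returns 0 and stores the chardev in *chr on success; on failure
 * returns -1 and points *err at a short reason.
 */
static int find_and_check_chardev(FakeChardev **chr, const char *id,
                                  const char **err)
{
    *chr = fake_chr_find(id);
    if (*chr == NULL) {
        *err = "device not found";
        return -1;
    }
    if (!(*chr)->is_socket) {
        *err = "chardev is not a socket";
        return -1;
    }
    return 0;
}
```

colo_compare_complete() would then call the helper three times, once each
for primary_in, secondary_in and outdev, and return on the first failure.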

>
>> +        return;
>> +    }
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
>> +
>> +    qemu_chr_fe_claim_no_fail(s->chr_out);
>> +
>> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>> +
>> +    return;
>> +}
>> +
>> +static void colo_compare_class_init(ObjectClass *oc, void *data)
>> +{
>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>> +
>> +    ucc->complete = colo_compare_complete;
>> +}
>> +
>> +static void colo_compare_init(Object *obj)
>> +{
>> +    object_property_add_str(obj, "primary_in",
>> +                            compare_get_pri_indev, 
>> compare_set_pri_indev,
>> +                            NULL);
>> +    object_property_add_str(obj, "secondary_in",
>> +                            compare_get_sec_indev, 
>> compare_set_sec_indev,
>> +                            NULL);
>> +    object_property_add_str(obj, "outdev",
>> +                            compare_get_outdev, compare_set_outdev,
>> +                            NULL);
>> +}
>> +
>> +static void colo_compare_finalize(Object *obj)
>> +{
>> +    CompareState *s = COLO_COMPARE(obj);
>> +
>> +    if (s->chr_pri_in) {
>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
>> +        qemu_chr_fe_release(s->chr_pri_in);
>> +    }
>> +    if (s->chr_sec_in) {
>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
>> +        qemu_chr_fe_release(s->chr_sec_in);
>> +    }
>> +    if (s->chr_out) {
>> +        qemu_chr_fe_release(s->chr_out);
>> +    }
>> +
>> +    g_free(s->pri_indev);
>> +    g_free(s->sec_indev);
>> +    g_free(s->outdev);
>> +}
>> +
>> +static const TypeInfo colo_compare_info = {
>> +    .name = TYPE_COLO_COMPARE,
>> +    .parent = TYPE_OBJECT,
>> +    .instance_size = sizeof(CompareState),
>> +    .instance_init = colo_compare_init,
>> +    .instance_finalize = colo_compare_finalize,
>> +    .class_size = sizeof(CompareClass),
>> +    .class_init = colo_compare_class_init,
>> +    .interfaces = (InterfaceInfo[]) {
>> +        { TYPE_USER_CREATABLE },
>> +        { }
>> +    }
>> +};
>> +
>> +static void register_types(void)
>> +{
>> +    type_register_static(&colo_compare_info);
>> +}
>> +
>> +type_init(register_types);
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 587de8f..33d5d0b 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} 
>> to the file specified by
>>   The file format is libpcap, so it can be analyzed with tools such 
>> as tcpdump
>>   or Wireshark.
>>   +@item -object 
>> colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
>> +outdev=@var{chardevid}
>> +
>> +Colo-compare gets packet from primary_in@var{chardevid} and 
>> secondary_in@var{chardevid}, than compare primary packet with
>> +secondary packet. If the packet same, we will output primary
>
> s/If the packet same/If the packets are same/.

OK.

>
>> +packet to outdev@var{chardevid}, else we will notify colo-frame
>> +do checkpoint and send primary packet to outdev@var{chardevid}.
>> +
>> +we can use it with the help of filter-mirror and filter-redirector.
>
> s/we/We/ and looks like colo compare must be used with the help of 
> mirror and redirector?

Currently yes.

>
>> +
>> +@example
>> +
>> +primary:
>> +-netdev 
>> tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>> +-object 
>> filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>> +-object 
>> filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>> +-object 
>> colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>> +
>> +secondary:
>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down 
>> script=/etc/qemu-ifdown
>> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>> +-chardev socket,id=red0,host=3.3.3.3,port=9003
>> +-chardev socket,id=red1,host=3.3.3.3,port=9004
>> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>> +
>> +@end example
>> +
>> +If you want to know the detail of above command line, you can read
>> +the colo-compare git log.
>> +
>>   @item -object 
>> secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>   @item -object 
>> secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>   diff --git a/vl.c b/vl.c
>> index cbe51ac..c6b9a6f 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char 
>> *type)
>>       if (g_str_equal(type, "filter-buffer") ||
>>           g_str_equal(type, "filter-dump") ||
>>           g_str_equal(type, "filter-mirror") ||
>> -        g_str_equal(type, "filter-redirector")) {
>> +        g_str_equal(type, "filter-redirector") ||
>> +        g_str_equal(type, "colo-compare")) {
>>           return false;
>>       }
>
>
>
> .
>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread Zhang Chen
@ 2016-08-31  9:13   ` Jason Wang
  2016-09-01  4:50     ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  9:13 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016-08-17 16:10, Zhang Chen wrote:
> If primary packet is same with secondary packet,
> we will send primary packet and drop secondary
> packet, otherwise notify COLO frame to do checkpoint.
> If primary packet comes but secondary packet does not,
> after REGULAR_PACKET_CHECK_MS milliseconds we set
> the primary packet as old_packet,then do a checkpoint.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 216 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo.c         |   1 +
>   net/colo.h         |   3 +
>   trace-events       |   2 +
>   4 files changed, 222 insertions(+)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index bab215b..b90cf1f 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -36,6 +36,8 @@
>   
>   #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>   #define MAX_QUEUE_SIZE 1024
> +/* TODO: Should be configurable */
> +#define REGULAR_PACKET_CHECK_MS 3000
>   
>   /*
>     + CompareState ++
> @@ -79,6 +81,10 @@ typedef struct CompareState {
>       GQueue conn_list;
>       /* hashtable to save connection */
>       GHashTable *connection_track_table;
> +    /* compare thread, a thread for each NIC */
> +    QemuThread thread;
> +    /* Timer used on the primary to find packets that are never matched */
> +    QEMUTimer *timer;
>   } CompareState;
>   
>   typedef struct CompareClass {
> @@ -152,6 +158,113 @@ static int packet_enqueue(CompareState *s, int mode)
>       return 0;
>   }
>   
> +/*
> + * The IP packets sent by primary and secondary
> + * will be compared in here
> + * TODO support ip fragment, Out-Of-Order
> + * return:    0  means packet same
> + *            > 0 || < 0 means packet different
> + */
> +static int colo_packet_compare(Packet *ppkt, Packet *spkt)
> +{
> +    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
> +                               inet_ntoa(ppkt->ip->ip_dst), spkt->size,
> +                               inet_ntoa(spkt->ip->ip_src),
> +                               inet_ntoa(spkt->ip->ip_dst));
> +
> +    if (ppkt->size == spkt->size) {
> +        return memcmp(ppkt->data, spkt->data, spkt->size);
> +    } else {
> +        return -1;
> +    }
> +}
> +
> +static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
> +{
> +    trace_colo_compare_main("compare all");
> +    return colo_packet_compare(ppkt, spkt);
> +}
> +
> +static void colo_old_packet_check_one(void *opaque_packet,
> +                                      void *opaque_found)
> +{
> +    int64_t now;
> +    bool *found_old = (bool *)opaque_found;
> +    Packet *ppkt = (Packet *)opaque_packet;
> +
> +    if (*found_old) {
> +        /* Someone found an old packet earlier in the queue */
> +        return;
> +    }
> +
> +    now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
> +    if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) {
> +        trace_colo_old_packet_check_found(ppkt->creation_ms);
> +        *found_old = true;
> +    }
> +}
> +
> +static void colo_old_packet_check_one_conn(void *opaque,
> +                                           void *user_data)
> +{
> +    bool found_old = false;
> +    Connection *conn = opaque;
> +
> +    g_queue_foreach(&conn->primary_list, colo_old_packet_check_one,
> +                    &found_old);

As I mentioned in the last version, can we avoid iterating over all packets 
by using g_queue_find_custom() here?
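
Roughly what I have in mind, as a standalone sketch rather than a patch:
the comparison callback returns 0 for the first packet that is too old,
and the search stops there. find_custom() below is a plain-C stand-in that
mimics the contract of g_queue_find_custom() (the callback returns 0 on a
match); FakePacket, CHECK_MS, packet_is_old() and compare_calls are
made-up names for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CHECK_MS 3000

typedef struct FakePacket {
    int64_t creation_ms;
    struct FakePacket *next;
} FakePacket;

/* Comparison callback type: returns 0 when the element matches. */
typedef int (*CompareFn)(const FakePacket *pkt, const void *user_data);

/* Counts callback invocations, only to show that the search stops early. */
static int compare_calls;

/* Walk the list and return the first element for which func() returns 0. */
static FakePacket *find_custom(FakePacket *head, const void *user_data,
                               CompareFn func)
{
    FakePacket *p;

    for (p = head; p != NULL; p = p->next) {
        compare_calls++;
        if (func(p, user_data) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Matches (returns 0) when the packet is older than CHECK_MS at *now. */
static int packet_is_old(const FakePacket *pkt, const void *user_data)
{
    const int64_t *now = user_data;

    return (*now - pkt->creation_ms) > CHECK_MS ? 0 : 1;
}
```

This also removes the need for the found_old flag that
colo_old_packet_check_one() tests on every element.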

> +    if (found_old) {
> +        /* do checkpoint will flush old packet */
> +        /* TODO: colo_notify_checkpoint();*/
> +    }
> +}
> +
> +/*
> + * Look for old packets that the secondary hasn't matched,
> + * if we have some then we have to checkpoint to wake
> + * the secondary up.
> + */
> +static void colo_old_packet_check(void *opaque)
> +{
> +    CompareState *s = opaque;
> +
> +    g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, NULL);
> +}
> +
> +/*
> + * called from the compare thread on the primary
> + * for compare connection
> + */
> +static void colo_compare_connection(void *opaque, void *user_data)
> +{
> +    CompareState *s = user_data;
> +    Connection *conn = opaque;
> +    Packet *pkt = NULL;
> +    GList *result = NULL;
> +    int ret;
> +
> +    while (!g_queue_is_empty(&conn->primary_list) &&
> +           !g_queue_is_empty(&conn->secondary_list)) {
> +        pkt = g_queue_pop_tail(&conn->primary_list);
> +        result = g_queue_find_custom(&conn->secondary_list,
> +                              pkt, (GCompareFunc)colo_packet_compare_all);
> +
> +        if (result) {
> +            ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
> +            if (ret < 0) {
> +                error_report("colo_send_primary_packet failed");
> +            }
> +            trace_colo_compare_main("packet same and release packet");
> +            g_queue_remove(&conn->secondary_list, result->data);
> +            packet_destroy(pkt, NULL);
> +        } else {

Better to add a comment here explaining the case where the secondary packet 
arrives a little late.

> +            trace_colo_compare_main("packet different");
> +            g_queue_push_tail(&conn->primary_list, pkt);
> +            /* TODO: colo_notify_checkpoint();*/
> +            break;
> +        }
> +    }
> +}
> +
>   static int compare_chr_send(CharDriverState *out,
>                               const uint8_t *buf,
>                               uint32_t size)
> @@ -179,6 +292,65 @@ err:
>       return ret < 0 ? ret : -EIO;
>   }
>   
> +static int compare_chr_can_read(void *opaque)
> +{
> +    return COMPARE_READ_LEN_MAX;
> +}
> +
> +/*
> + * called from the main thread on the primary for packets
> + * arriving over the socket from the primary.
> + */
> +static void compare_pri_chr_in(void *opaque, const uint8_t *buf, int size)
> +{
> +    CompareState *s = COLO_COMPARE(opaque);
> +    int ret;
> +
> +    ret = net_fill_rstate(&s->pri_rs, buf, size);
> +    if (ret == -1) {
> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
> +        error_report("colo-compare primary_in error");
> +    }
> +}
> +
> +/*
> + * called from the main thread on the primary for packets
> + * arriving over the socket from the secondary.
> + */
> +static void compare_sec_chr_in(void *opaque, const uint8_t *buf, int size)
> +{
> +    CompareState *s = COLO_COMPARE(opaque);
> +    int ret;
> +
> +    ret = net_fill_rstate(&s->sec_rs, buf, size);
> +    if (ret == -1) {
> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
> +        error_report("colo-compare secondary_in error");
> +    }
> +}
> +
> +static void *colo_compare_thread(void *opaque)
> +{
> +    GMainContext *worker_context;
> +    GMainLoop *compare_loop;
> +    CompareState *s = opaque;
> +
> +    worker_context = g_main_context_new();
> +
> +    qemu_chr_add_handlers_full(s->chr_pri_in, compare_chr_can_read,
> +                          compare_pri_chr_in, NULL, s, worker_context);
> +    qemu_chr_add_handlers_full(s->chr_sec_in, compare_chr_can_read,
> +                          compare_sec_chr_in, NULL, s, worker_context);
> +
> +    compare_loop = g_main_loop_new(worker_context, FALSE);
> +
> +    g_main_loop_run(compare_loop);
> +
> +    g_main_loop_unref(compare_loop);
> +    g_main_context_unref(worker_context);
> +    return NULL;
> +}
> +
>   static char *compare_get_pri_indev(Object *obj, Error **errp)
>   {
>       CompareState *s = COLO_COMPARE(obj);
> @@ -231,6 +403,9 @@ static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>       if (packet_enqueue(s, PRIMARY_IN)) {
>           trace_colo_compare_main("primary: unsupported packet in");
>           compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
> +    } else {
> +        /* compare connection */
> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
>       }
>   }
>   
> @@ -240,6 +415,9 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>   
>       if (packet_enqueue(s, SECONDARY_IN)) {
>           trace_colo_compare_main("secondary: unsupported packet in");
> +    } else {
> +        /* compare connection */
> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
>       }
>   }
>   
> @@ -266,6 +444,20 @@ static int compare_chardev_opts(void *opaque,
>   }
>   
>   /*
> + * Check old packet regularly so it can watch for any packets
> + * that the secondary hasn't produced equivalents of.
> + */
> +static void check_old_packet_regular(void *opaque)
> +{
> +    CompareState *s = opaque;
> +
> +    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +              REGULAR_PACKET_CHECK_MS);
> +    /* if have old packet we will notify checkpoint */
> +    colo_old_packet_check(s);
> +}
> +
> +/*
>    * called from the main thread on the primary
>    * to setup colo-compare.
>    */
> @@ -273,6 +465,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>   {
>       CompareState *s = COLO_COMPARE(uc);
>       CompareChardevProps props;
> +    char thread_name[64];
> +    static int compare_id;
>   
>       if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>           error_setg(errp, "colo compare needs 'primary_in' ,"
> @@ -356,6 +550,18 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
>                                                         g_free,
>                                                         connection_destroy);
>   
> +    sprintf(thread_name, "colo-compare %d", compare_id);
> +    qemu_thread_create(&s->thread, thread_name,
> +                       colo_compare_thread, s,
> +                       QEMU_THREAD_JOINABLE);
> +    compare_id++;
> +
> +    /* A regular timer to kick any packets that the secondary doesn't match */
> +    s->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, /* Only when guest runs */
> +                            check_old_packet_regular, s);
> +    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> +                        REGULAR_PACKET_CHECK_MS);

I still think we need to make sure the timer is processed in the colo 
thread, since check_old_packet_regular() may iterate conn_list while the 
colo thread is modifying it at the same time.

> +
>       return;
>   }
>   
> @@ -397,6 +603,16 @@ static void colo_compare_finalize(Object *obj)
>   
>       g_queue_free(&s->conn_list);
>   
> +    if (s->thread.thread) {
> +        /* compare connection */
> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
> +        qemu_thread_join(&s->thread);
> +    }
> +
> +    if (s->timer) {
> +        timer_del(s->timer);
> +    }
> +
>       g_free(s->pri_indev);
>       g_free(s->sec_indev);
>       g_free(s->outdev);
> diff --git a/net/colo.c b/net/colo.c
> index bc86553..da4b771 100644
> --- a/net/colo.c
> +++ b/net/colo.c
> @@ -128,6 +128,7 @@ Packet *packet_new(const void *data, int size)
>   
>       pkt->data = g_memdup(data, size);
>       pkt->size = size;
> +    pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>   
>       return pkt;
>   }
> diff --git a/net/colo.h b/net/colo.h
> index 9cbc14e..6b395a3 100644
> --- a/net/colo.h
> +++ b/net/colo.h
> @@ -17,6 +17,7 @@
>   
>   #include "slirp/slirp.h"
>   #include "qemu/jhash.h"
> +#include "qemu/timer.h"
>   
>   #define HASHTABLE_MAX_SIZE 16384
>   
> @@ -28,6 +29,8 @@ typedef struct Packet {
>       };
>       uint8_t *transport_header;
>       int size;
> +    /* Time of packet creation, in wall clock ms */
> +    int64_t creation_ms;
>   } Packet;
>   
>   typedef struct ConnectionKey {
> diff --git a/trace-events b/trace-events
> index 703de1a..1537e91 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1919,3 +1919,5 @@ aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64
>   
>   # net/colo-compare.c
>   colo_compare_main(const char *chr) ": %s"
> +colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
> +colo_old_packet_check_found(int64_t old_time) "%" PRId64


* Re: [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet
  2016-08-31  8:04   ` Jason Wang
@ 2016-08-31  9:19     ` Zhang Chen
  0 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-31  9:19 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 08/31/2016 04:04 PM, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> The net/colo.c is used by colo-compare and filter-rewriter.
>> this can share common data structure like net packet,
>> and other functions.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   net/Makefile.objs  |   1 +
>>   net/colo-compare.c | 113 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   net/colo.c         |  70 +++++++++++++++++++++++++++++++++
>>   net/colo.h         |  38 ++++++++++++++++++
>>   trace-events       |   3 ++
>>   5 files changed, 223 insertions(+), 2 deletions(-)
>>   create mode 100644 net/colo.c
>>   create mode 100644 net/colo.h
>>
>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>> index ba92f73..beb504b 100644
>> --- a/net/Makefile.objs
>> +++ b/net/Makefile.objs
>> @@ -17,3 +17,4 @@ common-obj-y += filter.o
>>   common-obj-y += filter-buffer.o
>>   common-obj-y += filter-mirror.o
>>   common-obj-y += colo-compare.o
>> +common-obj-y += colo.o
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index cdc3e0e..d9e4459 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -27,13 +27,38 @@
>>   #include "sysemu/char.h"
>>   #include "qemu/sockets.h"
>>   #include "qapi-visit.h"
>> +#include "net/colo.h"
>> +#include "trace.h"
>>     #define TYPE_COLO_COMPARE "colo-compare"
>>   #define COLO_COMPARE(obj) \
>>       OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>>     #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>> +#define MAX_QUEUE_SIZE 1024
>>   +/*
>> +  + CompareState ++
>> +  |               |
>> +  +---------------+   +---------------+ +---------------+
>> +  |conn list      +--->conn +--------->conn           |
>> +  +---------------+   +---------------+ +---------------+
>> +  |               |     |           |             |          |
>> +  +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
>> +                    |primary |  |secondary    |primary | |secondary
>> +                    |packet  |  |packet  +    |packet  | |packet  +
>> +                    +--------+  +--------+    +--------+ +--------+
>> +                        |           |             |          |
>> +                    +---v----+  +---v----+    +---v----+ +---v----+
>> +                    |primary |  |secondary    |primary | |secondary
>> +                    |packet  |  |packet  +    |packet  | |packet  +
>> +                    +--------+  +--------+    +--------+ +--------+
>> +                        |           |             |          |
>> +                    +---v----+  +---v----+    +---v----+ +---v----+
>> +                    |primary |  |secondary    |primary | |secondary
>> +                    |packet  |  |packet  +    |packet  | |packet  +
>> +                    +--------+  +--------+    +--------+ +--------+
>> +*/
>>   typedef struct CompareState {
>>       Object parent;
>>   @@ -46,6 +71,9 @@ typedef struct CompareState {
>>       QTAILQ_ENTRY(CompareState) next;
>>       SocketReadState pri_rs;
>>       SocketReadState sec_rs;
>> +
>> +    /* hashtable to save connection */
>> +    GHashTable *connection_track_table;
>>   } CompareState;
>>     typedef struct CompareClass {
>> @@ -57,6 +85,76 @@ typedef struct CompareChardevProps {
>>       bool is_unix;
>>   } CompareChardevProps;
>>   +enum {
>> +    PRIMARY_IN = 0,
>> +    SECONDARY_IN,
>> +};
>> +
>> +static int compare_chr_send(CharDriverState *out,
>> +                            const uint8_t *buf,
>> +                            uint32_t size);
>> +
>> +/*
>> + * Return 0 on success, if return -1 means the pkt
>> + * is unsupported(arp and ipv6) and will be sent later
>> + */
>> +static int packet_enqueue(CompareState *s, int mode)
>> +{
>> +    Packet *pkt = NULL;
>> +
>> +    if (mode == PRIMARY_IN) {
>> +        pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
>> +    } else {
>> +        pkt = packet_new(s->sec_rs.buf, s->sec_rs.packet_len);
>> +    }
>> +
>> +    if (parse_packet_early(pkt)) {
>> +        packet_destroy(pkt, NULL);
>> +        pkt = NULL;
>> +        return -1;
>> +    }
>> +    /* TODO: get connection key from pkt */
>> +
>> +    /*
>> +     * TODO: use connection key get conn from
>> +     * connection_track_table
>> +     */
>> +
>> +    /*
>> +     * TODO: insert pkt to it's conn->primary_list
>> +     * or conn->secondary_list
>> +     */
>> +
>> +    return 0;
>> +}
>> +
>> +static int compare_chr_send(CharDriverState *out,
>> +                            const uint8_t *buf,
>> +                            uint32_t size)
>> +{
>> +    int ret = 0;
>> +    uint32_t len = htonl(size);
>> +
>> +    if (!size) {
>> +        return 0;
>> +    }
>> +
>> +    ret = qemu_chr_fe_write_all(out, (uint8_t *)&len, sizeof(len));
>> +    if (ret != sizeof(len)) {
>> +        goto err;
>> +    }
>> +
>> +    ret = qemu_chr_fe_write_all(out, (uint8_t *)buf, size);
>> +    if (ret != size) {
>> +        goto err;
>> +    }
>> +
>> +    return 0;
>> +
>> +err:
>> +    return ret < 0 ? ret : -EIO;
>> +}
>> +
>>   static char *compare_get_pri_indev(Object *obj, Error **errp)
>>   {
>>       CompareState *s = COLO_COMPARE(obj);
>> @@ -104,12 +202,21 @@ static void compare_set_outdev(Object *obj, 
>> const char *value, Error **errp)
>>     static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>>   {
>> -    /* if packet_enqueue pri pkt failed we will send unsupported 
>> packet */
>> +    CompareState *s = container_of(pri_rs, CompareState, pri_rs);
>> +
>> +    if (packet_enqueue(s, PRIMARY_IN)) {
>> +        trace_colo_compare_main("primary: unsupported packet in");
>> +        compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
>> +    }
>>   }
>>     static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>>   {
>> -    /* if packet_enqueue sec pkt failed we will notify trace */
>> +    CompareState *s = container_of(sec_rs, CompareState, sec_rs);
>> +
>> +    if (packet_enqueue(s, SECONDARY_IN)) {
>> +        trace_colo_compare_main("secondary: unsupported packet in");
>> +    }
>>   }
>>     static int compare_chardev_opts(void *opaque,
>> @@ -218,6 +325,8 @@ static void colo_compare_complete(UserCreatable 
>> *uc, Error **errp)
>>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>>   +    /* use g_hash_table_new_full() to new a hashtable */
>> +
>>       return;
>>   }
>>   diff --git a/net/colo.c b/net/colo.c
>> new file mode 100644
>> index 0000000..4daedd4
>> --- /dev/null
>> +++ b/net/colo.c
>> @@ -0,0 +1,70 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service 
>> (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2016 FUJITSU LIMITED
>> + * Copyright (c) 2016 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>> +#include "net/colo.h"
>> +
>> +int parse_packet_early(Packet *pkt)
>> +{
>> +    int network_length;
>> +    uint8_t *data = pkt->data;
>> +    uint16_t l3_proto;
>> +    ssize_t l2hdr_len = eth_get_l2_hdr_length(data);
>> +
>> +    if (pkt->size < ETH_HLEN) {
>> +        error_report("pkt->size < ETH_HLEN");
>
> This path can be triggered by the guest, so better not to use error_report() here.

OK, will change to error_setg().

>
>> +        return 1;
>> +    }
>> +    pkt->network_header = data + ETH_HLEN;
>
> Shouldn't this use l2hdr_len instead of ETH_HLEN?

OK~
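
For reference, a minimal sketch of what the l2hdr_len-based checks could
look like. It is self-contained and does not use the QEMU helpers:
SketchPacket, sk_be16() and the SK_* constants are hypothetical stand-ins
(not part of the patch), and only a single 802.1Q tag is assumed. Failures
are reported through the return value only, since they can be triggered by
the guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETH_HLEN   14      /* untagged Ethernet header length */
#define SK_VLAN_HLEN  4       /* length of one 802.1Q tag */
#define SK_ETH_P_IP   0x0800
#define SK_ETH_P_VLAN 0x8100

/* Hypothetical stand-in for Packet; offsets replace the header pointers. */
typedef struct SketchPacket {
    const uint8_t *data;
    size_t size;
    size_t network_off;    /* offset of the IP header */
    size_t transport_off;  /* offset of the TCP/UDP/ICMP header */
} SketchPacket;

/* Read a big-endian 16-bit value. */
static uint16_t sk_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, 1 if the packet is too short or not IPv4. */
static int sketch_parse_packet_early(SketchPacket *pkt)
{
    size_t l2hdr_len = SK_ETH_HLEN;
    size_t ip_hlen;
    uint16_t proto;

    assert(pkt != NULL && pkt->data != NULL);

    /* Check the size before touching any header byte. */
    if (pkt->size < SK_ETH_HLEN) {
        return 1;
    }
    proto = sk_be16(pkt->data + 12);
    if (proto == SK_ETH_P_VLAN) {
        /* A tagged frame carries the real ethertype 4 bytes later. */
        l2hdr_len += SK_VLAN_HLEN;
        if (pkt->size < l2hdr_len) {
            return 1;
        }
        proto = sk_be16(pkt->data + 16);
    }
    if (proto != SK_ETH_P_IP) {
        return 1;
    }
    /* The version/IHL byte must be present before it is read. */
    if (pkt->size < l2hdr_len + 1) {
        return 1;
    }
    ip_hlen = (size_t)(pkt->data[l2hdr_len] & 0x0f) * 4;
    if (ip_hlen < 20 || pkt->size < l2hdr_len + ip_hlen) {
        return 1;
    }
    pkt->network_off = l2hdr_len;
    pkt->transport_off = l2hdr_len + ip_hlen;
    return 0;
}
```

The point of the sketch is only that every offset is derived from
l2hdr_len, and that each length check happens before the bytes it guards
are read.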

>
>> +    l3_proto = eth_get_l3_proto(data, l2hdr_len);
>> +    if (l3_proto != ETH_P_IP) {
>> +        return 1;
>> +    }
>> +
>> +    network_length = pkt->ip->ip_hl * 4;
>> +    if (pkt->size < ETH_HLEN + network_length) {
>
> Ditto.

OK~

>
>> +        error_report("pkt->size < network_header + network_length");
>
> And better not to use error_report() here either, since this is triggered by the guest.

OK, will change to error_setg().

Thanks
Zhang Chen

>
>> +        return 1;
>> +    }
>> +    pkt->transport_header = pkt->network_header + network_length;
>> +
>> +    return 0;
>> +}
>> +
>> +Packet *packet_new(const void *data, int size)
>> +{
>> +    Packet *pkt = g_slice_new(Packet);
>> +
>> +    pkt->data = g_memdup(data, size);
>> +    pkt->size = size;
>> +
>> +    return pkt;
>> +}
>> +
>> +void packet_destroy(void *opaque, void *user_data)
>> +{
>> +    Packet *pkt = opaque;
>> +
>> +    g_free(pkt->data);
>> +    g_slice_free(Packet, pkt);
>> +}
>> +
>> +/*
>> + * Clear hashtable, stop this hash growing really huge
>> + */
>> +void connection_hashtable_reset(GHashTable *connection_track_table)
>> +{
>> +    g_hash_table_remove_all(connection_track_table);
>> +}
>> diff --git a/net/colo.h b/net/colo.h
>> new file mode 100644
>> index 0000000..8559f28
>> --- /dev/null
>> +++ b/net/colo.h
>> @@ -0,0 +1,38 @@
>> +/*
>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service 
>> (COLO)
>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>> + *
>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>> + * Copyright (c) 2016 FUJITSU LIMITED
>> + * Copyright (c) 2016 Intel Corporation
>> + *
>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>> + * later.  See the COPYING file in the top-level directory.
>> + */
>> +
>> +#ifndef QEMU_COLO_BASE_H
>> +#define QEMU_COLO_BASE_H
>> +
>> +#include "slirp/slirp.h"
>> +#include "qemu/jhash.h"
>> +
>> +#define HASHTABLE_MAX_SIZE 16384
>> +
>> +typedef struct Packet {
>> +    void *data;
>> +    union {
>> +        uint8_t *network_header;
>> +        struct ip *ip;
>> +    };
>> +    uint8_t *transport_header;
>> +    int size;
>> +} Packet;
>> +
>> +int parse_packet_early(Packet *pkt);
>> +void connection_hashtable_reset(GHashTable *connection_track_table);
>> +Packet *packet_new(const void *data, int size);
>> +void packet_destroy(void *opaque, void *user_data);
>> +
>> +#endif /* QEMU_COLO_BASE_H */
>> diff --git a/trace-events b/trace-events
>> index ca7211b..703de1a 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1916,3 +1916,6 @@ aspeed_vic_update_fiq(int flags) "Raising FIQ: %d"
>>   aspeed_vic_update_irq(int flags) "Raising IRQ: %d"
>>   aspeed_vic_read(uint64_t offset, unsigned size, uint32_t value) 
>> "From 0x%" PRIx64 " of size %u: 0x%" PRIx32
>>   aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 
>> 0x%" PRIx64 " of size %u: 0x%" PRIx32
>> +
>> +# net/colo-compare.c
>> +colo_compare_main(const char *chr) ": %s"
>
>
>
> .
>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu
  2016-08-31  8:05   ` Jason Wang
@ 2016-08-31  9:20     ` Zhang Chen
  0 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-31  9:20 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 08/31/2016 04:05 PM, Jason Wang wrote:
>
>
> On 2016年08月17日 16:10, Zhang Chen wrote:
>> Jhash used by colo-compare and filter-rewriter
>
> s/used/will be used/
>

OK~~

Thanks
Zhang Chen

>> to save and lookup net connection info
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   include/qemu/jhash.h | 59 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 59 insertions(+)
>>   create mode 100644 include/qemu/jhash.h
>>
>> diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
>> new file mode 100644
>> index 0000000..7222242
>> --- /dev/null
>> +++ b/include/qemu/jhash.h
>> @@ -0,0 +1,59 @@
>> +/* jhash.h: Jenkins hash support.
>> +  *
>> +  * Copyright (C) 2006. Bob Jenkins (bob_jenkins@burtleburtle.net)
>> +  *
>> +  * http://burtleburtle.net/bob/hash/
>> +  *
>> +  * These are the credits from Bob's sources:
>> +  *
>> +  * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
>> +  *
>> +  * These are functions for producing 32-bit hashes for hash table 
>> lookup.
>> +  * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and 
>> final()
>> +  * are externally useful functions.  Routines to test the hash are 
>> included
>> +  * if SELF_TEST is defined.  You can use this free for any purpose. 
>> It's in
>> +  * the public domain.  It has no warranty.
>> +  *
>> +  * Copyright (C) 2009-2010 Jozsef Kadlecsik (kadlec@blackhole.kfki.hu)
>> +  *
>> +  * I've modified Bob's hash to be useful in the Linux kernel, and
>> +  * any bugs present are my fault.
>> +  * Jozsef
>> +  */
>> +
>> +#ifndef QEMU_JHASH_H__
>> +#define QEMU_JHASH_H__
>> +
>> +#include "qemu/bitops.h"
>> +
>> +/*
>> + * hashtable relation copy from linux kernel jhash
>> + */
>> +
>> +/* __jhash_mix -- mix 3 32-bit values reversibly. */
>> +#define __jhash_mix(a, b, c)                \
>> +{                                           \
>> +    a -= c;  a ^= rol32(c, 4);  c += b;     \
>> +    b -= a;  b ^= rol32(a, 6);  a += c;     \
>> +    c -= b;  c ^= rol32(b, 8);  b += a;     \
>> +    a -= c;  a ^= rol32(c, 16); c += b;     \
>> +    b -= a;  b ^= rol32(a, 19); a += c;     \
>> +    c -= b;  c ^= rol32(b, 4);  b += a;     \
>> +}
>> +
>> +/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
>> +#define __jhash_final(a, b, c)  \
>> +{                               \
>> +    c ^= b; c -= rol32(b, 14);  \
>> +    a ^= c; a -= rol32(c, 11);  \
>> +    b ^= a; b -= rol32(a, 25);  \
>> +    c ^= b; c -= rol32(b, 16);  \
>> +    a ^= c; a -= rol32(c, 4);   \
>> +    b ^= a; b -= rol32(a, 14);  \
>> +    c ^= b; c -= rol32(b, 24);  \
>> +}
>> +
>> +/* An arbitrary initial parameter */
>> +#define JHASH_INITVAL           0xdeadbeef
>> +
>> +#endif /* QEMU_JHASH_H__ */
>
>
>
> .
>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-31  9:03     ` Zhang Chen
@ 2016-08-31  9:20       ` Jason Wang
  2016-08-31  9:39         ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  9:20 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Dr . David Alan Gilbert, eddie . dong, Li Zhijian, zhanghailiang



On 2016年08月31日 17:03, Zhang Chen wrote:
>
>
> On 08/31/2016 03:53 PM, Jason Wang wrote:
>>
>>
>> On 2016年08月17日 16:10, Zhang Chen wrote:
>>> This is a COLO net ASCII figure:
>>>
>>> [ASCII figure garbled by line wrapping in this quote. Recoverable content:
>>>  Primary qemu: guest on top of a netfilter chain consisting of
>>>  filter-mirror (tx), filter-redirector (rx) and filter-redirector (rx);
>>>  colo-compare sits next to the chain with "pri in", "sec in" and "out".
>>>  Secondary qemu: guest on top of a netfilter chain consisting of
>>>  filter-redirector (tx), the TCP rewriter (adjust ack / adjust seq, all)
>>>  and filter-redirector (rx).
>>>  Below the primary: the tap device (guest receive / guest send).
>>>  NOTE: filter direction is rx/tx/all
>>>  rx: receive packets sent to the netdev
>>>  tx: receive packets sent by the netdev]
>>
>> It's better to add a doc under docs to explain this configuration in 
>> detail on top of this series.
>>
>
> As you suggest, should I add docs/colo-proxy.txt to explain it, or add this to
> hailiang's COLO-FT.txt after it is merged?
>
>>> COLO-compare does the packet comparison.
>>> Packets coming from the primary char indev will be sent to outdev.
>>> Packets coming from the secondary char dev will be dropped after
>>> comparing.
>>> colo-compare needs two input chardevs and one output chardev:
>>> primary_in=chardev1-id (source: primary send packet)
>>> secondary_in=chardev2-id (source: secondary send packet)
>>> outdev=chardev3-id
>>>
>>> usage:
>>>
>>> primary:
>>> -netdev 
>>> tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>> -object 
>>> filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>> -object 
>>> filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>> -object 
>>> colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>>
>>> secondary:
>>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>> -chardev socket,id=red0,host=3.3.3.3,port=9003
>>> -chardev socket,id=red1,host=3.3.3.3,port=9004
>>> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>>
>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>> ---
>>>   net/Makefile.objs  |   1 +
>>>   net/colo-compare.c | 284 
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>   qemu-options.hx    |  39 ++++++++
>>>   vl.c               |   3 +-
>>>   4 files changed, 326 insertions(+), 1 deletion(-)
>>>   create mode 100644 net/colo-compare.c
>>>
>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>> index b7c22fd..ba92f73 100644
>>> --- a/net/Makefile.objs
>>> +++ b/net/Makefile.objs
>>> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>>>   common-obj-y += filter.o
>>>   common-obj-y += filter-buffer.o
>>>   common-obj-y += filter-mirror.o
>>> +common-obj-y += colo-compare.o
>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>> new file mode 100644
>>> index 0000000..cdc3e0e
>>> --- /dev/null
>>> +++ b/net/colo-compare.c
>>> @@ -0,0 +1,284 @@
>>> +/*
>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop Service 
>>> (COLO)
>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>> + *
>>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>>> + * Copyright (c) 2016 FUJITSU LIMITED
>>> + * Copyright (c) 2016 Intel Corporation
>>> + *
>>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>> + * later.  See the COPYING file in the top-level directory.
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qemu/error-report.h"
>>> +#include "qemu-common.h"
>>> +#include "qapi/qmp/qerror.h"
>>> +#include "qapi/error.h"
>>> +#include "net/net.h"
>>> +#include "net/vhost_net.h"
>>
>> Looks unnecessary.
>
> I will remove it.
>
>>
>>> +#include "qom/object_interfaces.h"
>>> +#include "qemu/iov.h"
>>> +#include "qom/object.h"
>>> +#include "qemu/typedefs.h"
>>> +#include "net/queue.h"
>>> +#include "sysemu/char.h"
>>> +#include "qemu/sockets.h"
>>> +#include "qapi-visit.h"
>>> +
>>> +#define TYPE_COLO_COMPARE "colo-compare"
>>> +#define COLO_COMPARE(obj) \
>>> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>>> +
>>> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>> +
>>> +typedef struct CompareState {
>>> +    Object parent;
>>> +
>>> +    char *pri_indev;
>>> +    char *sec_indev;
>>> +    char *outdev;
>>> +    CharDriverState *chr_pri_in;
>>> +    CharDriverState *chr_sec_in;
>>> +    CharDriverState *chr_out;
>>> +    QTAILQ_ENTRY(CompareState) next;
>>
>> This does not look used in this series, only in commit "colo-compare and
>> filter-rewriter work with colo-frame". We'd better delay
>> introducing it until that patch.
>
> OK~ I got your point.
>
>>
>>> +    SocketReadState pri_rs;
>>> +    SocketReadState sec_rs;
>>> +} CompareState;
>>> +
>>> +typedef struct CompareClass {
>>> +    ObjectClass parent_class;
>>> +} CompareClass;
>>> +
>>> +typedef struct CompareChardevProps {
>>> +    bool is_socket;
>>> +    bool is_unix;
>>> +} CompareChardevProps;
>>> +
>>> +static char *compare_get_pri_indev(Object *obj, Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    return g_strdup(s->pri_indev);
>>> +}
>>> +
>>> +static void compare_set_pri_indev(Object *obj, const char *value, 
>>> Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    g_free(s->pri_indev);
>>> +    s->pri_indev = g_strdup(value);
>>> +}
>>> +
>>> +static char *compare_get_sec_indev(Object *obj, Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    return g_strdup(s->sec_indev);
>>> +}
>>> +
>>> +static void compare_set_sec_indev(Object *obj, const char *value, 
>>> Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    g_free(s->sec_indev);
>>> +    s->sec_indev = g_strdup(value);
>>> +}
>>> +
>>> +static char *compare_get_outdev(Object *obj, Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    return g_strdup(s->outdev);
>>> +}
>>> +
>>> +static void compare_set_outdev(Object *obj, const char *value, 
>>> Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    g_free(s->outdev);
>>> +    s->outdev = g_strdup(value);
>>> +}
>>> +
>>> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>>> +{
>>> +    /* if packet_enqueue pri pkt failed we will send unsupported 
>>> packet */
>>> +}
>>> +
>>> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>>> +{
>>> +    /* if packet_enqueue sec pkt failed we will notify trace */
>>> +}
>>> +
>>> +static int compare_chardev_opts(void *opaque,
>>> +                                const char *name, const char *value,
>>> +                                Error **errp)
>>> +{
>>> +    CompareChardevProps *props = opaque;
>>> +
>>> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 
>>> 0) {
>>> +        props->is_socket = true;
>>> +    } else if (strcmp(name, "host") == 0) {
>>
>> Typo? net_vhost_chardev_opts() did:
>>
>>     } else if (strcmp(name, "path") == 0) {
>>         props->is_unix = true;
>>     }
>>
>>
>
> No. In colo-compare we use the chardev like this:
>
> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>
> If we only check for "path" here, it will trigger an error.
> Should I add another branch for "path" here?

If I understand the code correctly, "is_unix" means "is unix domain 
socket"? If yes, according to the help:

-chardev 
socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
[,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off]
          [,logfile=PATH][,logappend=on|off][,tls-creds=ID] (tcp)
-chardev 
socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds]
          [,mux=on|off][,logfile=PATH][,logappend=on|off] (unix)

"host" will not be used for UNIX domain socket.

And if UNIX domain socket is not supported, there's probably no need to 
differentiate it from other types.
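
To illustrate the distinction, a minimal sketch of an option callback that
treats "path" as the marker for a UNIX domain socket and "host" as the
marker for a TCP socket. It is self-contained: SketchChardevProps and
sketch_chardev_opt() are hypothetical names, and the is_tcp flag is an
assumption added for the example (the patch only has is_socket and
is_unix).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for CompareChardevProps. */
typedef struct SketchChardevProps {
    bool is_socket;
    bool is_unix;   /* UNIX domain socket: selected by "path" */
    bool is_tcp;    /* TCP socket: selected by "host" (assumed field) */
} SketchChardevProps;

/*
 * Called once per name=value option of a chardev.
 * Returns 0 if the option is accepted, -1 if it is not supported.
 */
static int sketch_chardev_opt(SketchChardevProps *props,
                              const char *name, const char *value)
{
    assert(props != NULL && name != NULL && value != NULL);

    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
        props->is_socket = true;
    } else if (strcmp(name, "host") == 0) {
        props->is_tcp = true;
    } else if (strcmp(name, "path") == 0) {
        props->is_unix = true;
    } else if (strcmp(name, "port") == 0) {
        /* accepted, nothing to record */
    } else if (strcmp(name, "server") == 0) {
        /* accepted, nothing to record */
    } else if (strcmp(name, "wait") == 0) {
        /* accepted, nothing to record */
    } else {
        return -1;
    }
    return 0;
}
```

If only TCP sockets are meant to be supported, the "path" branch and the
is_unix flag could be dropped and the final check reduced to is_socket.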

>
>
>>
>>> +        props->is_unix = true;
>>> +    } else if (strcmp(name, "port") == 0) {
>>> +    } else if (strcmp(name, "server") == 0) {
>>> +    } else if (strcmp(name, "wait") == 0) {
>>> +    } else {
>>> +        error_setg(errp,
>>> +                   "COLO-compare does not support a chardev with 
>>> option %s=%s",
>>> +                   name, value);
>>> +        return -1;
>>> +    }
>>> +    return 0;
>>> +}
>>> +
>>> +/*
>>> + * called from the main thread on the primary
>>> + * to setup colo-compare.
>>> + */
>>> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(uc);
>>> +    CompareChardevProps props;
>>> +
>>> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>>> +        error_setg(errp, "colo compare needs 'primary_in' ,"
>>> +                   "'secondary_in','outdev' property set");
>>> +        return;
>>> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
>>> +               !strcmp(s->sec_indev, s->outdev) ||
>>> +               !strcmp(s->pri_indev, s->sec_indev)) {
>>> +        error_setg(errp, "'indev' and 'outdev' could not be same "
>>> +                   "for compare module");
>>> +        return;
>>> +    }
>>> +
>>> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
>>> +    if (s->chr_pri_in == NULL) {
>>> +        error_setg(errp, "Primary IN Device '%s' not found",
>>> +                   s->pri_indev);
>>> +        return;
>>> +    }
>>> +
>>> +    /* inspect chardev opts */
>>> +    memset(&props, 0, sizeof(props));
>>> +    if (qemu_opt_foreach(s->chr_pri_in->opts, compare_chardev_opts, 
>>> &props, errp)) {
>>> +        return;
>>> +    }
>>> +
>>> +    if (!props.is_socket || !props.is_unix) {
>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>> +                   s->pri_indev);
>>> +        return;
>>> +    }
>>> +
>>> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
>>> +    if (s->chr_sec_in == NULL) {
>>> +        error_setg(errp, "Secondary IN Device '%s' not found",
>>> +                   s->sec_indev);
>>> +        return;
>>> +    }
>>> +
>>> +    memset(&props, 0, sizeof(props));
>>> +    if (qemu_opt_foreach(s->chr_sec_in->opts, compare_chardev_opts, 
>>> &props, errp)) {
>>> +        return;
>>> +    }
>>> +
>>> +    if (!props.is_socket || !props.is_unix) {
>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>> +                   s->sec_indev);
>>
>> I believe tcp socket is also supported?
>
> If I understand correctly, "tcp socket" here means "-chardev socket".
> I will rename "unix socket" to "tcp socket".
>
>>
>>> +        return;
>>> +    }
>>> +
>>> +    s->chr_out = qemu_chr_find(s->outdev);
>>> +    if (s->chr_out == NULL) {
>>> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
>>> +        return;
>>> +    }
>>> +
>>> +    memset(&props, 0, sizeof(props));
>>> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, 
>>> &props, errp)) {
>>> +        return;
>>> +    }
>>> +
>>> +    if (!props.is_socket || !props.is_unix) {
>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>> +                   s->outdev);
>>
>> Ditto. There's also code duplication; please introduce a helper to do
>> the above.
>
> I don't understand what you mean by "helper".
> Here we check each chardev; should I change it to "goto error;"?

A helper function to avoid duplicating the socket type inspection for
pri_in, sec_in and chr_out.
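
Roughly, the shape of such a helper could be as in the sketch below. It is
self-contained and does not use the QEMU API: SketchChardev, the lookup
table and sketch_chr_find() are hypothetical stand-ins for the chardev and
qemu_chr_find(), and the error is written to a plain buffer instead of an
Error object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a chardev with its inspected properties. */
typedef struct SketchChardev {
    const char *id;
    bool is_socket;
} SketchChardev;

/* Hypothetical stand-in for qemu_chr_find(), searching a fixed table. */
static const SketchChardev *sketch_chr_find(const SketchChardev *tbl,
                                            size_t n, const char *id)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(tbl[i].id, id) == 0) {
            return &tbl[i];
        }
    }
    return NULL;
}

/*
 * One helper does the lookup and the socket type check, so the caller
 * invokes it three times instead of repeating the same block.
 * Returns the chardev, or NULL with a message in err.
 */
static const SketchChardev *sketch_find_socket_chardev(
    const SketchChardev *tbl, size_t n, const char *id,
    const char *role, char *err, size_t errlen)
{
    const SketchChardev *chr;

    assert(tbl != NULL && id != NULL && role != NULL && err != NULL);

    chr = sketch_chr_find(tbl, n, id);
    if (chr == NULL) {
        snprintf(err, errlen, "%s Device '%s' not found", role, id);
        return NULL;
    }
    if (!chr->is_socket) {
        snprintf(err, errlen, "chardev \"%s\" is not a socket", id);
        return NULL;
    }
    return chr;
}
```

The complete function would then call the helper once each for the primary
input, the secondary input and the output device, returning early when it
yields NULL.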

>
>>
>>> +        return;
>>> +    }
>>> +
>>> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
>>> +
>>> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
>>> +
>>> +    qemu_chr_fe_claim_no_fail(s->chr_out);
>>> +
>>> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>>> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>>> +
>>> +    return;
>>> +}
>>> +
>>> +static void colo_compare_class_init(ObjectClass *oc, void *data)
>>> +{
>>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>>> +
>>> +    ucc->complete = colo_compare_complete;
>>> +}
>>> +
>>> +static void colo_compare_init(Object *obj)
>>> +{
>>> +    object_property_add_str(obj, "primary_in",
>>> +                            compare_get_pri_indev, 
>>> compare_set_pri_indev,
>>> +                            NULL);
>>> +    object_property_add_str(obj, "secondary_in",
>>> +                            compare_get_sec_indev, 
>>> compare_set_sec_indev,
>>> +                            NULL);
>>> +    object_property_add_str(obj, "outdev",
>>> +                            compare_get_outdev, compare_set_outdev,
>>> +                            NULL);
>>> +}
>>> +
>>> +static void colo_compare_finalize(Object *obj)
>>> +{
>>> +    CompareState *s = COLO_COMPARE(obj);
>>> +
>>> +    if (s->chr_pri_in) {
>>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
>>> +        qemu_chr_fe_release(s->chr_pri_in);
>>> +    }
>>> +    if (s->chr_sec_in) {
>>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
>>> +        qemu_chr_fe_release(s->chr_sec_in);
>>> +    }
>>> +    if (s->chr_out) {
>>> +        qemu_chr_fe_release(s->chr_out);
>>> +    }
>>> +
>>> +    g_free(s->pri_indev);
>>> +    g_free(s->sec_indev);
>>> +    g_free(s->outdev);
>>> +}
>>> +
>>> +static const TypeInfo colo_compare_info = {
>>> +    .name = TYPE_COLO_COMPARE,
>>> +    .parent = TYPE_OBJECT,
>>> +    .instance_size = sizeof(CompareState),
>>> +    .instance_init = colo_compare_init,
>>> +    .instance_finalize = colo_compare_finalize,
>>> +    .class_size = sizeof(CompareClass),
>>> +    .class_init = colo_compare_class_init,
>>> +    .interfaces = (InterfaceInfo[]) {
>>> +        { TYPE_USER_CREATABLE },
>>> +        { }
>>> +    }
>>> +};
>>> +
>>> +static void register_types(void)
>>> +{
>>> +    type_register_static(&colo_compare_info);
>>> +}
>>> +
>>> +type_init(register_types);
>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>> index 587de8f..33d5d0b 100644
>>> --- a/qemu-options.hx
>>> +++ b/qemu-options.hx
>>> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} 
>>> to the file specified by
>>>   The file format is libpcap, so it can be analyzed with tools such 
>>> as tcpdump
>>>   or Wireshark.
>>>   +@item -object 
>>> colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
>>> +outdev=@var{chardevid}
>>> +
>>> +Colo-compare gets packet from primary_in@var{chardevid} and 
>>> secondary_in@var{chardevid}, than compare primary packet with
>>> +secondary packet. If the packet same, we will output primary
>>
>> s/If the packet same/If the packets are the same/.
>
> OK.
>
>>
>>> +packet to outdev@var{chardevid}, else we will notify colo-frame
>>> +do checkpoint and send primary packet to outdev@var{chardevid}.
>>> +
>>> +we can use it with the help of filter-mirror and filter-redirector.
>>
>> s/we/We/, and it looks like colo-compare must be used together with
>> filter-mirror and filter-redirector?
>
> Currently yes.

Then please change the doc here.

Thanks

>
>>
>>> +
>>> +@example
>>> +
>>> +primary:
>>> +-netdev 
>>> tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>> +-object 
>>> filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>> +-object 
>>> filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>> +-object 
>>> colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>> +
>>> +secondary:
>>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>> +-chardev socket,id=red0,host=3.3.3.3,port=9003
>>> +-chardev socket,id=red1,host=3.3.3.3,port=9004
>>> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>> +
>>> +@end example
>>> +
>>> +If you want to know the detail of above command line, you can read
>>> +the colo-compare git log.
>>> +
>>>   @item -object 
>>> secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>   @item -object 
>>> secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>   diff --git a/vl.c b/vl.c
>>> index cbe51ac..c6b9a6f 100644
>>> --- a/vl.c
>>> +++ b/vl.c
>>> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char 
>>> *type)
>>>       if (g_str_equal(type, "filter-buffer") ||
>>>           g_str_equal(type, "filter-dump") ||
>>>           g_str_equal(type, "filter-mirror") ||
>>> -        g_str_equal(type, "filter-redirector")) {
>>> +        g_str_equal(type, "filter-redirector") ||
>>> +        g_str_equal(type, "colo-compare")) {
>>>           return false;
>>>       }
>>
>>
>>
>> .
>>
>


* Re: [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison
  2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison Zhang Chen
@ 2016-08-31  9:33   ` Jason Wang
  2016-09-01  5:00     ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Wang @ 2016-08-31  9:33 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 2016年08月17日 16:10, Zhang Chen wrote:
> We add TCP, UDP and ICMP packet comparison to replace
> IP packet comparison. This can increase the
> accuracy of the packet comparison.
> Fewer checkpoints mean better efficiency.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>   trace-events       |   4 ++
>   2 files changed, 152 insertions(+), 4 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index b90cf1f..0daefd9 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -18,6 +18,7 @@
>   #include "qapi/qmp/qerror.h"
>   #include "qapi/error.h"
>   #include "net/net.h"
> +#include "net/eth.h"
>   #include "net/vhost_net.h"
>   #include "qom/object_interfaces.h"
>   #include "qemu/iov.h"
> @@ -179,9 +180,136 @@ static int colo_packet_compare(Packet *ppkt, Packet *spkt)
>       }
>   }
>   
> -static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
> +/*
> + * called from the compare thread on the primary
> + * for compare tcp packet
> + * compare_tcp copied from Dr. David Alan Gilbert's branch
> + */
> +static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
> +{
> +    struct tcphdr *ptcp, *stcp;
> +    int res;
> +    char *sdebug, *ddebug;
> +
> +    trace_colo_compare_main("compare tcp");
> +    if (ppkt->size != spkt->size) {
> +        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
> +            trace_colo_compare_main("pkt size not same");
> +        }
> +        return -1;
> +    }
> +
> +    ptcp = (struct tcphdr *)ppkt->transport_header;
> +    stcp = (struct tcphdr *)spkt->transport_header;
> +
> +    if (ptcp->th_seq != stcp->th_seq) {
> +        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
> +            trace_colo_compare_main("pkt tcp seq not same");
> +        }
> +        return -1;
> +    }
> +
> +    /*
> +     * The 'identification' field in the IP header is *very* random
> +     * it almost never matches.  Fudge this by ignoring differences in
> +     * unfragmented packets; they'll normally sort themselves out if different
> +     * anyway, and it should recover at the TCP level.
> +     * An alternative would be to get both the primary and secondary to rewrite
> +     * somehow; but that would need some sync traffic to sync the state
> +     */
> +    if (ntohs(ppkt->ip->ip_off) & IP_DF) {
> +        spkt->ip->ip_id = ppkt->ip->ip_id;
> +        /* and the sum will be different if the IDs were different */
> +        spkt->ip->ip_sum = ppkt->ip->ip_sum;
> +    }
> +
> +    res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
> +                (spkt->size - ETH_HLEN));

This may work, but I worry about whether tagged packets can work here.
It looks like parse_packet_early() can recognize a vlan tag, but
fill_connection_key() cannot. This looks like it can result in queuing
packets into the wrong connection.
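
To make the concern concrete: with a fixed ETH_HLEN offset, a tagged
frame is read 4 bytes too early for every 802.1Q tag. A minimal
standalone sketch of a tag-aware offset calculation follows. The helper
name l3_offset() and the constants are hypothetical and not part of the
patch, and whether fill_connection_key() should use such an offset is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_BASE_HLEN 14      /* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN  4       /* TCI + encapsulated EtherType */
#define TPID_8021Q    0x8100  /* customer VLAN tag */
#define TPID_8021AD   0x88A8  /* service (outer) VLAN tag */

/* Read a big-endian 16-bit value without assuming alignment. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper, not part of the patch: return the offset of the
 * network-layer header inside an Ethernet frame, skipping any stacked
 * VLAN tags, or -1 if the frame is too short. On success *ethertype
 * receives the EtherType of the encapsulated payload.
 */
static int l3_offset(const uint8_t *frame, size_t len, uint16_t *ethertype)
{
    size_t off = ETH_BASE_HLEN;
    uint16_t proto;

    assert(frame != NULL && ethertype != NULL);
    if (len < ETH_BASE_HLEN) {
        return -1;
    }
    proto = read_be16(frame + 12);
    while (proto == TPID_8021Q || proto == TPID_8021AD) {
        if (len < off + VLAN_TAG_LEN) {
            return -1;
        }
        /* The encapsulated EtherType follows the 2-byte TCI. */
        proto = read_be16(frame + off + 2);
        off += VLAN_TAG_LEN;
    }
    *ethertype = proto;
    return (int)off;
}
```

For an untagged IPv4 frame this yields offset 14, and for a frame with
one 802.1Q tag it yields 18, so the key extraction and the memcmp()
above would both start at the IP header.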

> +
> +    if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
> +        sdebug = strdup(inet_ntoa(ppkt->ip->ip_src));
> +        ddebug = strdup(inet_ntoa(ppkt->ip->ip_dst));
> +        fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
> +        " s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__,
> +                   sdebug, ddebug,
> +                   ntohl(ptcp->th_seq), ntohl(ptcp->th_ack),
> +                   ntohl(stcp->th_seq), ntohl(stcp->th_ack),
> +                   res, ptcp->th_flags, stcp->th_flags);

I tend not to mix debug logs with tracepoints.
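
One way to do that, as a rough sketch only: format the details into a
buffer and hand the string to a single tracepoint instead of calling
fprintf(stderr, ...). The helper format_tcp_miscompare() below is
hypothetical, and a trace event that takes the resulting string is an
assumption, not something this series defines. Using inet_ntop() with
separate buffers also avoids the strdup() of inet_ntoa()'s static
buffer (which the patch then frees with g_free()).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: format the TCP miscompare
 * details into a caller-supplied buffer so that one tracepoint can carry
 * them. seq/ack values are expected in host byte order.
 * Returns the snprintf() result, or -1 if an address cannot be formatted.
 */
static int format_tcp_miscompare(char *buf, size_t len,
                                 struct in_addr src, struct in_addr dst,
                                 uint32_t p_seq, uint32_t p_ack,
                                 uint32_t s_seq, uint32_t s_ack)
{
    char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];

    assert(buf != NULL && len > 0);
    /* Each address gets its own buffer, unlike inet_ntoa(). */
    if (!inet_ntop(AF_INET, &src, s, sizeof(s)) ||
        !inet_ntop(AF_INET, &dst, d, sizeof(d))) {
        return -1;
    }
    return snprintf(buf, len,
                    "src/dst: %s/%s p: seq/ack=%u/%u s: seq/ack=%u/%u",
                    s, d, (unsigned)p_seq, (unsigned)p_ack,
                    (unsigned)s_seq, (unsigned)s_ack);
}
```

The caller would still guard this with trace_event_get_state(), as the
patch already does, so the formatting cost is only paid when the event
is enabled.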

> +
> +        trace_colo_compare_tcp_miscompare("Primary len", ppkt->size);
> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
> +        trace_colo_compare_tcp_miscompare("Secondary len", spkt->size);
> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
> +
> +        g_free(sdebug);
> +        g_free(ddebug);
> +    }
> +
> +    return res;
> +}
> +
> +/*
> + * called from the compare thread on the primary
> + * for compare udp packet
> + */
> +static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
> +{
> +    int ret;
> +
> +    trace_colo_compare_main("compare udp");
> +    ret = colo_packet_compare(ppkt, spkt);
> +
> +    if (ret) {
> +        trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size);
> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
> +        trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size);
> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
> +    }
> +
> +    return ret;
> +}
> +
> +/*
> + * called from the compare thread on the primary
> + * for compare icmp packet
> + */
> +static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
> +{
> +    int network_length;
> +
> +    trace_colo_compare_main("compare icmp");
> +    network_length = ppkt->ip->ip_hl * 4;
> +    if (ppkt->size != spkt->size ||
> +        ppkt->size < network_length + ETH_HLEN) {
> +        return -1;
> +    }
> +
> +    if (colo_packet_compare(ppkt, spkt)) {
> +        trace_colo_compare_icmp_miscompare("primary pkt size",
> +                                           ppkt->size);
> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare",
> +                     ppkt->size);
> +        trace_colo_compare_icmp_miscompare("Secondary pkt size",
> +                                           spkt->size);
> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare",
> +                     spkt->size);
> +        return -1;
> +    } else {
> +        return 0;
> +    }
> +}
> +
> +/*
> + * called from the compare thread on the primary
> + * for compare other packet
> + */
> +static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
>   {
> -    trace_colo_compare_main("compare all");
> +    trace_colo_compare_main("compare other");
> +    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
> +                               inet_ntoa(ppkt->ip->ip_dst), spkt->size,
> +                               inet_ntoa(spkt->ip->ip_src),
> +                               inet_ntoa(spkt->ip->ip_dst));
>       return colo_packet_compare(ppkt, spkt);
>   }
>   
> @@ -245,8 +373,24 @@ static void colo_compare_connection(void *opaque, void *user_data)
>       while (!g_queue_is_empty(&conn->primary_list) &&
>              !g_queue_is_empty(&conn->secondary_list)) {
>           pkt = g_queue_pop_tail(&conn->primary_list);
> -        result = g_queue_find_custom(&conn->secondary_list,
> -                              pkt, (GCompareFunc)colo_packet_compare_all);
> +        switch (conn->ip_proto) {
> +        case IPPROTO_TCP:
> +            result = g_queue_find_custom(&conn->secondary_list,
> +                     pkt, (GCompareFunc)colo_packet_compare_tcp);
> +            break;
> +        case IPPROTO_UDP:
> +            result = g_queue_find_custom(&conn->secondary_list,
> +                     pkt, (GCompareFunc)colo_packet_compare_udp);
> +            break;
> +        case IPPROTO_ICMP:
> +            result = g_queue_find_custom(&conn->secondary_list,
> +                     pkt, (GCompareFunc)colo_packet_compare_icmp);
> +            break;
> +        default:
> +            result = g_queue_find_custom(&conn->secondary_list,
> +                     pkt, (GCompareFunc)colo_packet_compare_other);
> +            break;
> +        }
>   
>           if (result) {
>               ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
> diff --git a/trace-events b/trace-events
> index 1537e91..ab22eb2 100644
> --- a/trace-events
> +++ b/trace-events
> @@ -1919,5 +1919,9 @@ aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64
>   
>   # net/colo-compare.c
>   colo_compare_main(const char *chr) ": %s"
> +colo_compare_tcp_miscompare(const char *sta, int size) ": %s = %d"
> +colo_compare_udp_miscompare(const char *sta, int size) ": %s = %d"
> +colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
>   colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
>   colo_old_packet_check_found(int64_t old_time) "%" PRId64
> +colo_compare_miscompare(void) ""


* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-31  9:20       ` Jason Wang
@ 2016-08-31  9:39         ` Zhang Chen
  2016-09-01  6:32           ` Zhang Chen
  0 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-08-31  9:39 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Dr . David Alan Gilbert, eddie . dong, Li Zhijian, zhanghailiang



On 08/31/2016 05:20 PM, Jason Wang wrote:
>
>
> On 2016年08月31日 17:03, Zhang Chen wrote:
>>
>>
>> On 08/31/2016 03:53 PM, Jason Wang wrote:
>>>
>>>
>>> On 2016年08月17日 16:10, Zhang Chen wrote:
>>>> This a COLO net ascii figure:
>>>>
>>>>   [ASCII figure; its layout was lost to line wrapping in this quote.
>>>>   Labels that survive:
>>>>   - Primary qemu: guest; netfilter with filter mirror (tx),
>>>>     filter redirector (rx) and filter redirector (rx), marked
>>>>     "filter execute order"; colo compare with "pri in", "sec in"
>>>>     and "out"; tap below, with "guest receive" and "guest send".
>>>>   - Secondary qemu: guest; netfilter with filter redirector (tx),
>>>>     TCP rewriter ("adjust ack" / "adjust seq", all) and
>>>>     filter redirector (rx), marked "filter execute order".]
>>>>
>>>>   NOTE: filter direction is rx/tx/all
>>>>   rx: receive packets sent to the netdev
>>>>   tx: receive packets sent by the netdev
>>>
>>> It's better to add a doc under docs to explain this configuration in 
>>> detail on top of this series.
>>>
>>
>> As you say, Am I add /docs/colo-proxy.txt to explain it or add this 
>> in hailiang's COLO-FT.txt after merge?

Can you suggest which way I should go for the doc?


>>
>>>> In COLO-compare, we do packet comparing job.
>>>> Packets coming from the primary char indev will be sent to outdev.
>>>> Packets coming from the secondary char dev will be dropped after 
>>>> comparing.
>>>> colo-comapre need two input chardev and one output chardev:
>>>> primary_in=chardev1-id (source: primary send packet)
>>>> secondary_in=chardev2-id (source: secondary send packet)
>>>> outdev=chardev3-id
>>>>
>>>> usage:
>>>>
>>>> primary:
>>>> -netdev 
>>>> tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>>> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>>> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>>> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>>> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>>> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>>> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>>> -object 
>>>> filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>>> -object 
>>>> filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>>> -object 
>>>> colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>>>
>>>> secondary:
>>>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down 
>>>> script=/etc/qemu-ifdown
>>>> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>>> -chardev socket,id=red0,host=3.3.3.3,port=9003
>>>> -chardev socket,id=red1,host=3.3.3.3,port=9004
>>>> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>>> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>>>
>>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>> ---
>>>>   net/Makefile.objs  |   1 +
>>>>   net/colo-compare.c | 284 
>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>   qemu-options.hx    |  39 ++++++++
>>>>   vl.c               |   3 +-
>>>>   4 files changed, 326 insertions(+), 1 deletion(-)
>>>>   create mode 100644 net/colo-compare.c
>>>>
>>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>>> index b7c22fd..ba92f73 100644
>>>> --- a/net/Makefile.objs
>>>> +++ b/net/Makefile.objs
>>>> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>>>>   common-obj-y += filter.o
>>>>   common-obj-y += filter-buffer.o
>>>>   common-obj-y += filter-mirror.o
>>>> +common-obj-y += colo-compare.o
>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>> new file mode 100644
>>>> index 0000000..cdc3e0e
>>>> --- /dev/null
>>>> +++ b/net/colo-compare.c
>>>> @@ -0,0 +1,284 @@
>>>> +/*
>>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop 
>>>> Service (COLO)
>>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>>> + *
>>>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>>>> + * Copyright (c) 2016 FUJITSU LIMITED
>>>> + * Copyright (c) 2016 Intel Corporation
>>>> + *
>>>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>> + *
>>>> + * This work is licensed under the terms of the GNU GPL, version 2 or
>>>> + * later.  See the COPYING file in the top-level directory.
>>>> + */
>>>> +
>>>> +#include "qemu/osdep.h"
>>>> +#include "qemu/error-report.h"
>>>> +#include "qemu-common.h"
>>>> +#include "qapi/qmp/qerror.h"
>>>> +#include "qapi/error.h"
>>>> +#include "net/net.h"
>>>> +#include "net/vhost_net.h"
>>>
>>> Looks unnecessary.
>>
>> I will remove it.
>>
>>>
>>>> +#include "qom/object_interfaces.h"
>>>> +#include "qemu/iov.h"
>>>> +#include "qom/object.h"
>>>> +#include "qemu/typedefs.h"
>>>> +#include "net/queue.h"
>>>> +#include "sysemu/char.h"
>>>> +#include "qemu/sockets.h"
>>>> +#include "qapi-visit.h"
>>>> +
>>>> +#define TYPE_COLO_COMPARE "colo-compare"
>>>> +#define COLO_COMPARE(obj) \
>>>> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>>>> +
>>>> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>>> +
>>>> +typedef struct CompareState {
>>>> +    Object parent;
>>>> +
>>>> +    char *pri_indev;
>>>> +    char *sec_indev;
>>>> +    char *outdev;
>>>> +    CharDriverState *chr_pri_in;
>>>> +    CharDriverState *chr_sec_in;
>>>> +    CharDriverState *chr_out;
>>>> +    QTAILQ_ENTRY(CompareState) next;
>>>
>>> This looks not used in this series but in commit "colo-compare and 
>>> filter-rewriter work with colo-frame". We'd better delay the 
>>> introducing to that patch.
>>
>> OK~ I got your point.
>>
>>>
>>>> +    SocketReadState pri_rs;
>>>> +    SocketReadState sec_rs;
>>>> +} CompareState;
>>>> +
>>>> +typedef struct CompareClass {
>>>> +    ObjectClass parent_class;
>>>> +} CompareClass;
>>>> +
>>>> +typedef struct CompareChardevProps {
>>>> +    bool is_socket;
>>>> +    bool is_unix;
>>>> +} CompareChardevProps;
>>>> +
>>>> +static char *compare_get_pri_indev(Object *obj, Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    return g_strdup(s->pri_indev);
>>>> +}
>>>> +
>>>> +static void compare_set_pri_indev(Object *obj, const char *value, 
>>>> Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    g_free(s->pri_indev);
>>>> +    s->pri_indev = g_strdup(value);
>>>> +}
>>>> +
>>>> +static char *compare_get_sec_indev(Object *obj, Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    return g_strdup(s->sec_indev);
>>>> +}
>>>> +
>>>> +static void compare_set_sec_indev(Object *obj, const char *value, 
>>>> Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    g_free(s->sec_indev);
>>>> +    s->sec_indev = g_strdup(value);
>>>> +}
>>>> +
>>>> +static char *compare_get_outdev(Object *obj, Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    return g_strdup(s->outdev);
>>>> +}
>>>> +
>>>> +static void compare_set_outdev(Object *obj, const char *value, 
>>>> Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    g_free(s->outdev);
>>>> +    s->outdev = g_strdup(value);
>>>> +}
>>>> +
>>>> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>>>> +{
>>>> +    /* if packet_enqueue pri pkt failed we will send unsupported 
>>>> packet */
>>>> +}
>>>> +
>>>> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>>>> +{
>>>> +    /* if packet_enqueue sec pkt failed we will notify trace */
>>>> +}
>>>> +
>>>> +static int compare_chardev_opts(void *opaque,
>>>> +                                const char *name, const char *value,
>>>> +                                Error **errp)
>>>> +{
>>>> +    CompareChardevProps *props = opaque;
>>>> +
>>>> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 
>>>> 0) {
>>>> +        props->is_socket = true;
>>>> +    } else if (strcmp(name, "host") == 0) {
>>>
>>> Typo? net_vhost_chardev_opts() did:
>>>
>>>     } else if (strcmp(name, "path") == 0) {
>>>         props->is_unix = true;
>>>     }
>>>
>>>
>>
>> No, In colo-compare we use chardev like this:
>>
>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>
>> If we only use "path" here will trigger a error.
>> Should I add anthor "path" here?
>
> If I understand the code correctly, "is_unix" means "is unix domain 
> socket"? If yes, according to the help:
>
> -chardev 
> socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
> [,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off]
>          [,logfile=PATH][,logappend=on|off][,tls-creds=ID] (tcp)
> -chardev 
> socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds]
>          [,mux=on|off][,logfile=PATH][,logappend=on|off] (unix)
>
> "host" will not be used for UNIX domain socket.
>
> And if UNIX domain socket is not supported, there's probably no need 
> to differentiate it from other types.

OK, I will remove "is_unix" in the next version.
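
For reference, a minimal standalone sketch of what the option callback
could look like once "is_unix" is gone. The names SketchChardevProps
and sketch_chardev_opt() are hypothetical, and the Error **errp
reporting of the real compare_chardev_opts() is left out so that the
sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for the patch's CompareChardevProps, minus is_unix. */
typedef struct SketchChardevProps {
    bool is_socket;
} SketchChardevProps;

/*
 * Hypothetical variant of compare_chardev_opts(): accept the options the
 * usage example needs and record only whether the backend is a socket.
 * Returns 0 if the option is accepted, -1 otherwise.
 */
static int sketch_chardev_opt(SketchChardevProps *props,
                              const char *name, const char *value)
{
    assert(props != NULL && name != NULL && value != NULL);

    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") == 0) {
        props->is_socket = true;
        return 0;
    }
    /* Options that are allowed but carry no information we track. */
    if (strcmp(name, "host") == 0 || strcmp(name, "port") == 0 ||
        strcmp(name, "server") == 0 || strcmp(name, "wait") == 0) {
        return 0;
    }
    return -1;
}
```

As in the patch, "path" is still rejected here, so a UNIX domain socket
chardev fails the check without needing a separate flag.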


>
>>
>>
>>>
>>>> +        props->is_unix = true;
>>>> +    } else if (strcmp(name, "port") == 0) {
>>>> +    } else if (strcmp(name, "server") == 0) {
>>>> +    } else if (strcmp(name, "wait") == 0) {
>>>> +    } else {
>>>> +        error_setg(errp,
>>>> +                   "COLO-compare does not support a chardev with 
>>>> option %s=%s",
>>>> +                   name, value);
>>>> +        return -1;
>>>> +    }
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +/*
>>>> + * called from the main thread on the primary
>>>> + * to setup colo-compare.
>>>> + */
>>>> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(uc);
>>>> +    CompareChardevProps props;
>>>> +
>>>> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>>>> +        error_setg(errp, "colo compare needs 'primary_in' ,"
>>>> +                   "'secondary_in','outdev' property set");
>>>> +        return;
>>>> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
>>>> +               !strcmp(s->sec_indev, s->outdev) ||
>>>> +               !strcmp(s->pri_indev, s->sec_indev)) {
>>>> +        error_setg(errp, "'indev' and 'outdev' could not be same "
>>>> +                   "for compare module");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
>>>> +    if (s->chr_pri_in == NULL) {
>>>> +        error_setg(errp, "Primary IN Device '%s' not found",
>>>> +                   s->pri_indev);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* inspect chardev opts */
>>>> +    memset(&props, 0, sizeof(props));
>>>> +    if (qemu_opt_foreach(s->chr_pri_in->opts, 
>>>> compare_chardev_opts, &props, errp)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!props.is_socket || !props.is_unix) {
>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>> +                   s->pri_indev);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
>>>> +    if (s->chr_sec_in == NULL) {
>>>> +        error_setg(errp, "Secondary IN Device '%s' not found",
>>>> +                   s->sec_indev);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    memset(&props, 0, sizeof(props));
>>>> +    if (qemu_opt_foreach(s->chr_sec_in->opts, 
>>>> compare_chardev_opts, &props, errp)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!props.is_socket || !props.is_unix) {
>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>> +                   s->sec_indev);
>>>
>>> I believe tcp socket is also supported?
>>
>> If I understand correctly, "tcp socket" in here is the "-chardev 
>> socket".
>> I will rename "unix socket" to "tcp socket".
>>
>>>
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    s->chr_out = qemu_chr_find(s->outdev);
>>>> +    if (s->chr_out == NULL) {
>>>> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    memset(&props, 0, sizeof(props));
>>>> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, 
>>>> &props, errp)) {
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (!props.is_socket || !props.is_unix) {
>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>> +                   s->outdev);
>>>
>>> Ditto, and there's code duplication, please introduce a helper to do 
>>> above.
>>
>> I don't understand what the "helper"?
>> In here we check each chardev, will I change to "goto error;" ?
>
> A helper to avoid the code duplication for socket type inspection for 
> pri_in,scr_in and chr_out.

I got it~~
I will add it in the next version.
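
My understanding of the helper, as a standalone sketch: one function
does the lookup and the socket check, and colo_compare_complete() calls
it once each for primary_in, secondary_in and outdev. All names below
are hypothetical; SketchChardev and sketch_chr_find() only stand in for
CharDriverState and qemu_chr_find() so that the sketch compiles outside
QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for QEMU's CharDriverState; only what the sketch needs. */
typedef struct SketchChardev {
    const char *id;
    bool is_socket;
} SketchChardev;

enum {
    SKETCH_OK = 0,
    SKETCH_NOT_FOUND = -1,
    SKETCH_NOT_SOCKET = -2,
};

/* Stand-in for qemu_chr_find(): linear search of a caller-supplied table. */
static SketchChardev *sketch_chr_find(SketchChardev *table, size_t n,
                                      const char *id)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].id, id) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/*
 * Hypothetical helper: look up one chardev by id and verify that it is a
 * socket. *out is only written on success.
 */
static int sketch_find_and_check(SketchChardev *table, size_t n,
                                 const char *id, SketchChardev **out)
{
    SketchChardev *chr;

    assert(table != NULL && id != NULL && out != NULL);
    chr = sketch_chr_find(table, n, id);
    if (chr == NULL) {
        return SKETCH_NOT_FOUND;
    }
    if (!chr->is_socket) {
        return SKETCH_NOT_SOCKET;
    }
    *out = chr;
    return SKETCH_OK;
}
```

In the real code the two failure cases would map to the existing
error_setg() messages, and no "goto error" is needed because nothing is
claimed before all three checks have passed.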


>
>>
>>>
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
>>>> +
>>>> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
>>>> +
>>>> +    qemu_chr_fe_claim_no_fail(s->chr_out);
>>>> +
>>>> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>>>> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>>>> +
>>>> +    return;
>>>> +}
>>>> +
>>>> +static void colo_compare_class_init(ObjectClass *oc, void *data)
>>>> +{
>>>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>>>> +
>>>> +    ucc->complete = colo_compare_complete;
>>>> +}
>>>> +
>>>> +static void colo_compare_init(Object *obj)
>>>> +{
>>>> +    object_property_add_str(obj, "primary_in",
>>>> +                            compare_get_pri_indev, 
>>>> compare_set_pri_indev,
>>>> +                            NULL);
>>>> +    object_property_add_str(obj, "secondary_in",
>>>> +                            compare_get_sec_indev, 
>>>> compare_set_sec_indev,
>>>> +                            NULL);
>>>> +    object_property_add_str(obj, "outdev",
>>>> +                            compare_get_outdev, compare_set_outdev,
>>>> +                            NULL);
>>>> +}
>>>> +
>>>> +static void colo_compare_finalize(Object *obj)
>>>> +{
>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>> +
>>>> +    if (s->chr_pri_in) {
>>>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
>>>> +        qemu_chr_fe_release(s->chr_pri_in);
>>>> +    }
>>>> +    if (s->chr_sec_in) {
>>>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
>>>> +        qemu_chr_fe_release(s->chr_sec_in);
>>>> +    }
>>>> +    if (s->chr_out) {
>>>> +        qemu_chr_fe_release(s->chr_out);
>>>> +    }
>>>> +
>>>> +    g_free(s->pri_indev);
>>>> +    g_free(s->sec_indev);
>>>> +    g_free(s->outdev);
>>>> +}
>>>> +
>>>> +static const TypeInfo colo_compare_info = {
>>>> +    .name = TYPE_COLO_COMPARE,
>>>> +    .parent = TYPE_OBJECT,
>>>> +    .instance_size = sizeof(CompareState),
>>>> +    .instance_init = colo_compare_init,
>>>> +    .instance_finalize = colo_compare_finalize,
>>>> +    .class_size = sizeof(CompareClass),
>>>> +    .class_init = colo_compare_class_init,
>>>> +    .interfaces = (InterfaceInfo[]) {
>>>> +        { TYPE_USER_CREATABLE },
>>>> +        { }
>>>> +    }
>>>> +};
>>>> +
>>>> +static void register_types(void)
>>>> +{
>>>> +    type_register_static(&colo_compare_info);
>>>> +}
>>>> +
>>>> +type_init(register_types);
>>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>>> index 587de8f..33d5d0b 100644
>>>> --- a/qemu-options.hx
>>>> +++ b/qemu-options.hx
>>>> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev @var{dev} 
>>>> to the file specified by
>>>>   The file format is libpcap, so it can be analyzed with tools such 
>>>> as tcpdump
>>>>   or Wireshark.
>>>>   +@item -object 
>>>> colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
>>>> +outdev=@var{chardevid}
>>>> +
>>>> +Colo-compare gets packet from primary_in@var{chardevid} and 
>>>> secondary_in@var{chardevid}, than compare primary packet with
>>>> +secondary packet. If the packet same, we will output primary
>>>
>>> s/If the packet same/If the packets are same/.
>>
>> OK.
>>
>>>
>>>> +packet to outdev@var{chardevid}, else we will notify colo-frame
>>>> +do checkpoint and send primary packet to outdev@var{chardevid}.
>>>> +
>>>> +we can use it with the help of filter-mirror and filter-redirector.
>>>
>>> s/we/We/ and looks like colo compare must be used with the help of 
>>> mirror and redirector?
>>
>> Currently yes.
>
> Then please change the doc here.

s/We can use it/We must use it/

Thanks
Zhang Chen

>
> Thanks
>
>>
>>>
>>>> +
>>>> +@example
>>>> +
>>>> +primary:
>>>> +-netdev 
>>>> tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>>> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>>> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>>> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>>> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>>> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>>> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>>> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>>> +-object 
>>>> filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>>> +-object 
>>>> filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>>> +-object 
>>>> colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>>> +
>>>> +secondary:
>>>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,down 
>>>> script=/etc/qemu-ifdown
>>>> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>>> +-chardev socket,id=red0,host=3.3.3.3,port=9003
>>>> +-chardev socket,id=red1,host=3.3.3.3,port=9004
>>>> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>>> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>>> +
>>>> +@end example
>>>> +
>>>> +If you want to know the detail of above command line, you can read
>>>> +the colo-compare git log.
>>>> +
>>>>   @item -object 
>>>> secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>>   @item -object 
>>>> secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>>   diff --git a/vl.c b/vl.c
>>>> index cbe51ac..c6b9a6f 100644
>>>> --- a/vl.c
>>>> +++ b/vl.c
>>>> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char 
>>>> *type)
>>>>       if (g_str_equal(type, "filter-buffer") ||
>>>>           g_str_equal(type, "filter-dump") ||
>>>>           g_str_equal(type, "filter-mirror") ||
>>>> -        g_str_equal(type, "filter-redirector")) {
>>>> +        g_str_equal(type, "filter-redirector") ||
>>>> +        g_str_equal(type, "colo-compare")) {
>>>>           return false;
>>>>       }
>>>
>>>
>>>
>>> .
>>>
>>
>
>
>
> .
>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter
  2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
                   ` (10 preceding siblings ...)
  2016-08-25  3:44 ` [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
@ 2016-08-31  9:39 ` Jason Wang
  11 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2016-08-31  9:39 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 2016年08月17日 16:10, Zhang Chen wrote:
> COLO-compare is a part of COLO project. It is used
> to compare the network package to help COLO decide
> whether to do checkpoint.
>
> Filter-rewriter is a part of COLO project too.
> It will rewrite some of secondary packet to make
> secondary guest's connection established successfully.
> In this module we will rewrite tcp packet's ack to the secondary
> from primary,and rewrite tcp packet's seq to the primary from
> secondary.
>
> The full version in this github:
> https://github.com/zhangckid/qemu/tree/colo-v2.7-proxy-mode-compare-and-rewriter-aug16

Almost there, just a few nits.

I will try to merge this in the next version if all comments are
addressed. We can do other fixes or optimizations on top.

>
>
> v12:
>    - add qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
>      to this series as the first patch.
>    - update COLO net ascii figure.
>    - add chardev socket check.
>    - fix some typo.
>    - add some comments.
>    - rename net/colo-base.c to net/colo.c
>    - rename network/transport_layer to network/transport_header.
>    - move the job that clear coon_list when hashtable_size oversize
>      to connection_get.
>    - reuse connection_destroy() do colo_rm_connection().
>    - fix pkt mem leak in colo_compare_connection().
>      (result be released in g_queue_remove(), so it were not leak)
>    - rename thread_name "compare" to "colo-compare".
>    - change icmp compare to memcmp().
>
> v11:
>    - Make patch 5 to a independent patch series.
>      [PATCH V3] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
>    - For Jason's comments, merge filter-rewriter to this series.
>      (patch 7,8,9)
>    - Add reverse_connection_key()
>    - remove conn_list in filter-rewriter
>    - remove unprocessed_connections
>    - add some comments
>
> v10:
>    - fix typo
>    - Should we make patch 5 independent with this series?
>      This patch just add a API for qemu-char.
>
> v9:
>   p5:
>    - use chr_update_read_handler_full() replace
>      the chr_update_read_handler()
>    - use io_watch_poll_prepare_full() replace
>      the io_watch_poll_prepare()
>    - use io_watch_poll_funcs_full replace
>      the io_watch_poll_funcs
>    - avoid code duplication
>
> v8:
>   p5:
>    - add new patch:
>      qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
>
> v7:
>   p5:
>     - add [PATCH]qemu-char: Fix context for g_source_attach()
>       in this patch series.
>
> v6:
>   p6:
>     - add more commit log.
>     - fix icmp comparison to compare all packet.
>
>   p5:
>     - add more cpmments in commit log.
>     - change REGULAR_CHECK_MS to REGULAR_PACKET_CHECK_MS
>     - make check old packet independent to compare thread
>     - remove thread_status
>
>   p4:
>     - change this patch only about
>       Connection and ConnectionKey.
>     - add some comments in commit log.
>     - remove mode in fill_connection_key().
>     - fix some comments and bug.
>     - move colo_conn_state to patch of
>       "work with colo-frame"
>     - remove conn_list_lock.
>     - add MAX_QUEUE_SIZE, if primary_list or
>       secondary_list bigger than MAX_QUEUE_SIZE
>       we will drop packet.
>
>   p3:
>     - add new independent kernel jhash patch.
>
>   p2:
>     - add new independent colo-base patch.
>
>   p1:
>     - add a ascii figure and some comments to explain it
>     - move trace.h to p2
>     - move QTAILQ_HEAD(, CompareState) net_compares to
>       patch of "work with colo-frame"
>     - add some comments in qemu-option.hx
>
>
> v5:
>   p3:
>      - comments from Jason
>        we poll and handle chardev in compare thread,
>        Through this way, there's no need for extra
>        synchronization with main loop
>        this depend on another patch:
>        qemu-char: Fix context for g_source_attach()
>      - remove QemuEvent
>   p2:
>      - remove conn->list_lock
>   p1:
>      - move compare_pri/sec_chr_in to p3
>      - move compare_chr_send to p2
>
> v4:
>   p4:
>      - add some comments
>      - fix some trace-events
>      - fix tcp compare error
>   p3:
>      - add rcu_read_lock().
>      - fix trace name
>      - fix jason's other comments
>      - rebase some Dave's branch function
>   p2:
>      - colo_compare_connection() change g_queue_push_head() to
>      - g_queue_push_tail() match to sorted order.
>      - remove pkt->s
>      - move data structure to colo-base.h
>      - add colo-base.c reuse codes for filter-rewriter
>      - add some filter-rewriter needs struct
>      - depends on previous SocketReadState patch
>   p1:
>      - except move qemu_chr_add_handlers()
>        to colo thread
>      - remove class_finalize
>      - remove secondary arp codes
>      - depends on previous SocketReadState patch
>
> v3:
>    - rebase colo-compare to colo-frame v2.7
>    - fix most of Dave's comments
>      (except RCU)
>    - add TCP,UDP,ICMP and other packet comparison
>    - add trace-event
>    - add some comments
>    - other bug fix
>    - add RFC index
>    - add usage in patch 1/4
>
> v2:
>    - add jhash.h
>
> v1:
>    - initial patch
>
>
> Zhang Chen (10):
>    qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext
>    colo-compare: introduce colo compare initialization
>    net/colo.c: add colo.c to define and handle packet
>    Jhash: add linux kernel jhashtable in qemu
>    colo-compare: track connection and enqueue packet
>    colo-compare: introduce packet comparison thread
>    colo-compare: add TCP,UDP,ICMP packet comparison
>    filter-rewriter: introduce filter-rewriter initialization
>    filter-rewriter: track connection and parse packet
>    filter-rewriter: rewrite tcp packet to keep secondary connection
>
>   include/qemu/jhash.h  |  59 ++++
>   include/sysemu/char.h |  11 +-
>   net/Makefile.objs     |   3 +
>   net/colo-compare.c    | 784 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo.c            | 204 +++++++++++++
>   net/colo.h            |  76 +++++
>   net/filter-rewriter.c | 268 +++++++++++++++++
>   qemu-char.c           |  77 +++--
>   qemu-options.hx       |  52 ++++
>   trace-events          |  14 +
>   vl.c                  |   4 +-
>   11 files changed, 1526 insertions(+), 26 deletions(-)
>   create mode 100644 include/qemu/jhash.h
>   create mode 100644 net/colo-compare.c
>   create mode 100644 net/colo.c
>   create mode 100644 net/colo.h
>   create mode 100644 net/filter-rewriter.c
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet
  2016-08-31  8:52   ` Jason Wang
@ 2016-08-31 11:52     ` Zhang Chen
  0 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-08-31 11:52 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 08/31/2016 04:52 PM, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> In this patch we use kernel jhash table to track
>> connection, and then enqueue net packet like this:
>>
>> + CompareState ++
>> |               |
>> +---------------+   +---------------+         +---------------+
>> |conn list      +--->conn +--------->conn           |
>> +---------------+   +---------------+         +---------------+
>> |               |     |           |             |          |
>> +---------------+ +---v----+  +---v----+    +---v----+ +---v----+
>>                    |primary |  |secondary    |primary | |secondary
>>                    |packet  |  |packet  +    |packet  | |packet +
>>                    +--------+  +--------+    +--------+ +--------+
>>                        |           |             |          |
>>                    +---v----+  +---v----+    +---v----+ +---v----+
>>                    |primary |  |secondary    |primary | |secondary
>>                    |packet  |  |packet  +    |packet  | |packet +
>>                    +--------+  +--------+    +--------+ +--------+
>>                        |           |             |          |
>>                    +---v----+  +---v----+    +---v----+ +---v----+
>>                    |primary |  |secondary    |primary | |secondary
>>                    |packet  |  |packet  +    |packet  | |packet +
>>                    +--------+  +--------+    +--------+ +--------+
>>
>> We use conn_list to record connection info.
>> When we want to enqueue a packet, firstly get the
>> connection from connection_track_table. then push
>> the packet to g_queue(pri/sec) in it's own conn.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c |  51 ++++++++++++++++++-----
>>   net/colo.c         | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo.h         |  27 +++++++++++++
>>   3 files changed, 185 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index d9e4459..bab215b 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -72,6 +72,11 @@ typedef struct CompareState {
>>       SocketReadState pri_rs;
>>       SocketReadState sec_rs;
>>   +    /* connection list: the connections belonged to this NIC could 
>> be found
>> +     * in this list.
>> +     * element type: Connection
>> +     */
>> +    GQueue conn_list;
>>       /* hashtable to save connection */
>>       GHashTable *connection_track_table;
>>   } CompareState;
>> @@ -100,7 +105,9 @@ static int compare_chr_send(CharDriverState *out,
>>    */
>>   static int packet_enqueue(CompareState *s, int mode)
>>   {
>> +    ConnectionKey key = {{ 0 } };
>>       Packet *pkt = NULL;
>> +    Connection *conn;
>>         if (mode == PRIMARY_IN) {
>>           pkt = packet_new(s->pri_rs.buf, s->pri_rs.packet_len);
>> @@ -113,17 +120,34 @@ static int packet_enqueue(CompareState *s, int 
>> mode)
>>           pkt = NULL;
>>           return -1;
>>       }
>> -    /* TODO: get connection key from pkt */
>> +    fill_connection_key(pkt, &key);
>>   -    /*
>> -     * TODO: use connection key get conn from
>> -     * connection_track_table
>> -     */
>> +    conn = connection_get(s->connection_track_table,
>> +                          &key,
>> +                          &s->conn_list);
>>   -    /*
>> -     * TODO: insert pkt to it's conn->primary_list
>> -     * or conn->secondary_list
>> -     */
>> +    if (!conn->processing) {
>> +        g_queue_push_tail(&s->conn_list, conn);
>> +        conn->processing = true;
>> +    }
>> +
>> +    if (mode == PRIMARY_IN) {
>> +        if (g_queue_get_length(&conn->primary_list) <
>> +                               MAX_QUEUE_SIZE) {
>
> Should be "<=" I believe.

I will fix it in the next version.
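
For reference, the two comparisons differ by exactly one element in the
largest queue length they allow. The sketch below is only an illustration:
the helper names and the small limit are hypothetical stand-ins, not code
from the patch. Which form is wanted depends on whether MAX_QUEUE_SIZE is
meant as the largest permitted length or as the last length at which a
push is still accepted.

```c
#include <stddef.h>

#define SKETCH_MAX_QUEUE_SIZE 4 /* small stand-in for MAX_QUEUE_SIZE */

/* Offer 'attempts' packets; accept one only while length < limit. */
size_t fill_with_less_than(size_t attempts)
{
    size_t len = 0;

    for (size_t i = 0; i < attempts; i++) {
        if (len < SKETCH_MAX_QUEUE_SIZE) {
            len++;
        }
    }
    return len;
}

/* Same loop, but accept a packet while length <= limit. */
size_t fill_with_less_equal(size_t attempts)
{
    size_t len = 0;

    for (size_t i = 0; i < attempts; i++) {
        if (len <= SKETCH_MAX_QUEUE_SIZE) {
            len++;
        }
    }
    return len;
}
```

With ten offered packets the first helper stops at 4 queued packets and
the second at 5.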

>
>> + g_queue_push_tail(&conn->primary_list, pkt);
>> +        } else {
>> +            error_report("colo compare primary queue size too big,"
>> +            "drop packet");
>
> indentation here looks odd.
>

I will fix it~

>> +        }
>> +    } else {
>> +        if (g_queue_get_length(&conn->secondary_list) <
>> +                               MAX_QUEUE_SIZE) {
>> +            g_queue_push_tail(&conn->secondary_list, pkt);
>> +        } else {
>> +            error_report("colo compare secondary queue size too big,"
>> +            "drop packet");
>> +        }
>> +    }
>>         return 0;
>>   }
>> @@ -325,7 +349,12 @@ static void colo_compare_complete(UserCreatable 
>> *uc, Error **errp)
>>       net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>>       net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>>   -    /* use g_hash_table_new_full() to new a hashtable */
>> +    g_queue_init(&s->conn_list);
>> +
>> +    s->connection_track_table = 
>> g_hash_table_new_full(connection_key_hash,
>> + connection_key_equal,
>> +                                                      g_free,
>> + connection_destroy);
>>         return;
>>   }
>> @@ -366,6 +395,8 @@ static void colo_compare_finalize(Object *obj)
>>           qemu_chr_fe_release(s->chr_out);
>>       }
>>   +    g_queue_free(&s->conn_list);
>> +
>>       g_free(s->pri_indev);
>>       g_free(s->sec_indev);
>>       g_free(s->outdev);
>> diff --git a/net/colo.c b/net/colo.c
>> index 4daedd4..bc86553 100644
>> --- a/net/colo.c
>> +++ b/net/colo.c
>> @@ -16,6 +16,29 @@
>>   #include "qemu/error-report.h"
>>   #include "net/colo.h"
>>   +uint32_t connection_key_hash(const void *opaque)
>> +{
>> +    const ConnectionKey *key = opaque;
>> +    uint32_t a, b, c;
>> +
>> +    /* Jenkins hash */
>> +    a = b = c = JHASH_INITVAL + sizeof(*key);
>> +    a += key->src.s_addr;
>> +    b += key->dst.s_addr;
>> +    c += (key->src_port | key->dst_port << 16);
>> +    __jhash_mix(a, b, c);
>> +
>> +    a += key->ip_proto;
>> +    __jhash_final(a, b, c);
>> +
>> +    return c;
>> +}
>> +
>> +int connection_key_equal(const void *key1, const void *key2)
>> +{
>> +    return memcmp(key1, key2, sizeof(ConnectionKey)) == 0;
>> +}
>> +
>>   int parse_packet_early(Packet *pkt)
>>   {
>>       int network_length;
>> @@ -43,6 +66,62 @@ int parse_packet_early(Packet *pkt)
>>       return 0;
>>   }
>>   +void fill_connection_key(Packet *pkt, ConnectionKey *key)
>> +{
>> +    uint32_t tmp_ports;
>> +
>> +    key->ip_proto = pkt->ip->ip_p;
>> +
>> +    switch (key->ip_proto) {
>> +    case IPPROTO_TCP:
>> +    case IPPROTO_UDP:
>> +    case IPPROTO_DCCP:
>> +    case IPPROTO_ESP:
>> +    case IPPROTO_SCTP:
>> +    case IPPROTO_UDPLITE:
>> +        tmp_ports = *(uint32_t *)(pkt->transport_header);
>> +        key->src = pkt->ip->ip_src;
>> +        key->dst = pkt->ip->ip_dst;
>> +        key->src_port = ntohs(tmp_ports & 0xffff);
>> +        key->dst_port = ntohs(tmp_ports >> 16);
>> +        break;
>> +    case IPPROTO_AH:
>> +        tmp_ports = *(uint32_t *)(pkt->transport_header + 4);
>> +        key->src = pkt->ip->ip_src;
>> +        key->dst = pkt->ip->ip_dst;
>> +        key->src_port = ntohs(tmp_ports & 0xffff);
>> +        key->dst_port = ntohs(tmp_ports >> 16);
>> +        break;
>> +    default:
>> +        key->src_port = 0;
>> +        key->dst_port = 0;
>> +        break;
>> +    }
>> +}
>> +
>> +Connection *connection_new(ConnectionKey *key)
>> +{
>> +    Connection *conn = g_slice_new(Connection);
>> +
>> +    conn->ip_proto = key->ip_proto;
>> +    conn->processing = false;
>> +    g_queue_init(&conn->primary_list);
>> +    g_queue_init(&conn->secondary_list);
>> +
>> +    return conn;
>> +}
>> +
>> +void connection_destroy(void *opaque)
>> +{
>> +    Connection *conn = opaque;
>> +
>> +    g_queue_foreach(&conn->primary_list, packet_destroy, NULL);
>> +    g_queue_free(&conn->primary_list);
>> +    g_queue_foreach(&conn->secondary_list, packet_destroy, NULL);
>> +    g_queue_free(&conn->secondary_list);
>> +    g_slice_free(Connection, conn);
>> +}
>> +
>>   Packet *packet_new(const void *data, int size)
>>   {
>>       Packet *pkt = g_slice_new(Packet);
>> @@ -68,3 +147,41 @@ void connection_hashtable_reset(GHashTable 
>> *connection_track_table)
>>   {
>>       g_hash_table_remove_all(connection_track_table);
>>   }
>> +
>> +static void colo_rm_connection(void *opaque, void *user_data)
>> +{
>
> user_data is unused here.

OK, I will fix this in the next version.

>
>> +    connection_destroy(opaque);
>> +}
>> +
>> +/* if not found, create a new connection and add to hash table */
>> +Connection *connection_get(GHashTable *connection_track_table,
>> +                           ConnectionKey *key,
>> +                           GQueue *conn_list)
>> +{
>> +    Connection *conn = g_hash_table_lookup(connection_track_table, 
>> key);
>> +    static uint32_t hashtable_size;
>> +
>> +    if (conn == NULL) {
>> +        ConnectionKey *new_key = g_memdup(key, sizeof(*key));
>> +
>> +        conn = connection_new(key);
>> +
>> +        hashtable_size += 1;
>
> Use of uninitialized variable?

A static variable is automatically initialized to 0.
I will remove it in the next version.
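
To illustrate the point about static storage: C requires objects with
static storage duration to start out as zero when no initializer is
given, so the counter in the patch is well defined. A minimal sketch;
next_count() is a hypothetical helper, not part of the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns how many times it has been called.
 * 'calls' has static storage duration and no explicit initializer,
 * so it starts at 0 before the first call.
 */
uint32_t next_count(void)
{
    static uint32_t calls;

    calls += 1;
    return calls;
}
```

The value persists across calls, which is also why such a counter is
shared by every caller of the function rather than kept per hash table.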

>
>> +        if (hashtable_size > HASHTABLE_MAX_SIZE) {
>
> Should we use g_hash_table_size() here.

good idea~~

>
>> +            error_report("colo proxy connection hashtable full, 
>> clear it");
>> +            connection_hashtable_reset(connection_track_table);
>> +            /*
>> +             * clear the conn_list
>> +             */
>> +            if (conn_list) {
>> +                g_queue_foreach(conn_list, colo_rm_connection, NULL);
>> +            }
>> +
>> +            hashtable_size = 0;
>> +        }
>> +
>> +        g_hash_table_insert(connection_track_table, new_key, conn);
>
> Then there's no need for hashtable_size.

OK, I will remove it.

Thanks
Zhang Chen

>
>> +    }
>> +
>> +    return conn;
>> +}
>> diff --git a/net/colo.h b/net/colo.h
>> index 8559f28..9cbc14e 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -30,7 +30,34 @@ typedef struct Packet {
>>       int size;
>>   } Packet;
>>   +typedef struct ConnectionKey {
>> +    /* (src, dst) must be grouped, in the same way than in IP header */
>> +    struct in_addr src;
>> +    struct in_addr dst;
>> +    uint16_t src_port;
>> +    uint16_t dst_port;
>> +    uint8_t ip_proto;
>> +} QEMU_PACKED ConnectionKey;
>> +
>> +typedef struct Connection {
>> +    /* connection primary send queue: element type: Packet */
>> +    GQueue primary_list;
>> +    /* connection secondary send queue: element type: Packet */
>> +    GQueue secondary_list;
>> +    /* flag to enqueue unprocessed_connections */
>> +    bool processing;
>> +    uint8_t ip_proto;
>> +} Connection;
>> +
>> +uint32_t connection_key_hash(const void *opaque);
>> +int connection_key_equal(const void *opaque1, const void *opaque2);
>>   int parse_packet_early(Packet *pkt);
>> +void fill_connection_key(Packet *pkt, ConnectionKey *key);
>> +Connection *connection_new(ConnectionKey *key);
>> +void connection_destroy(void *opaque);
>> +Connection *connection_get(GHashTable *connection_track_table,
>> +                           ConnectionKey *key,
>> +                           GQueue *conn_list);
>>   void connection_hashtable_reset(GHashTable *connection_track_table);
>>   Packet *packet_new(const void *data, int size);
>>   void packet_destroy(void *opaque, void *user_data);
>
>
>
> .
>

-- 
Thanks
zhangchen

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread
  2016-08-31  9:13   ` Jason Wang
@ 2016-09-01  4:50     ` Zhang Chen
  2016-09-01  7:38       ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-09-01  4:50 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 08/31/2016 05:13 PM, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> If primary packet is same with secondary packet,
>> we will send primary packet and drop secondary
>> packet, otherwise notify COLO frame to do checkpoint.
>> If primary packet comes but secondary packet does not,
>> after REGULAR_PACKET_CHECK_MS milliseconds we set
>> the primary packet as old_packet,then do a checkpoint.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 216 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   net/colo.c         |   1 +
>>   net/colo.h         |   3 +
>>   trace-events       |   2 +
>>   4 files changed, 222 insertions(+)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index bab215b..b90cf1f 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -36,6 +36,8 @@
>>     #define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>   #define MAX_QUEUE_SIZE 1024
>> +/* TODO: Should be configurable */
>> +#define REGULAR_PACKET_CHECK_MS 3000
>>     /*
>>     + CompareState ++
>> @@ -79,6 +81,10 @@ typedef struct CompareState {
>>       GQueue conn_list;
>>       /* hashtable to save connection */
>>       GHashTable *connection_track_table;
>> +    /* compare thread, a thread for each NIC */
>> +    QemuThread thread;
>> +    /* Timer used on the primary to find packets that are never 
>> matched */
>> +    QEMUTimer *timer;
>>   } CompareState;
>>     typedef struct CompareClass {
>> @@ -152,6 +158,113 @@ static int packet_enqueue(CompareState *s, int 
>> mode)
>>       return 0;
>>   }
>>   +/*
>> + * The IP packets sent by primary and secondary
>> + * will be compared in here
>> + * TODO support ip fragment, Out-Of-Order
>> + * return:    0  means packet same
>> + *            > 0 || < 0 means packet different
>> + */
>> +static int colo_packet_compare(Packet *ppkt, Packet *spkt)
>> +{
>> +    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
>> + inet_ntoa(ppkt->ip->ip_dst), spkt->size,
>> + inet_ntoa(spkt->ip->ip_src),
>> + inet_ntoa(spkt->ip->ip_dst));
>> +
>> +    if (ppkt->size == spkt->size) {
>> +        return memcmp(ppkt->data, spkt->data, spkt->size);
>> +    } else {
>> +        return -1;
>> +    }
>> +}
>> +
>> +static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
>> +{
>> +    trace_colo_compare_main("compare all");
>> +    return colo_packet_compare(ppkt, spkt);
>> +}
>> +
>> +static void colo_old_packet_check_one(void *opaque_packet,
>> +                                      void *opaque_found)
>> +{
>> +    int64_t now;
>> +    bool *found_old = (bool *)opaque_found;
>> +    Packet *ppkt = (Packet *)opaque_packet;
>> +
>> +    if (*found_old) {
>> +        /* Someone found an old packet earlier in the queue */
>> +        return;
>> +    }
>> +
>> +    now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>> +    if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) {
>> + trace_colo_old_packet_check_found(ppkt->creation_ms);
>> +        *found_old = true;
>> +    }
>> +}
>> +
>> +static void colo_old_packet_check_one_conn(void *opaque,
>> +                                           void *user_data)
>> +{
>> +    bool found_old = false;
>> +    Connection *conn = opaque;
>> +
>> +    g_queue_foreach(&conn->primary_list, colo_old_packet_check_one,
>> +                    &found_old);
>
> As I mentioned in last version, can we avoid iterating all packets by 
> using g_queue_find_custom() here?

OK~~ I got it.
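
As a rough sketch of the difference: a "find" style walk can return at
the first old packet, while a "foreach" style walk always visits every
element. The code below is plain C with simplified stand-in types (Pkt,
OldCheck and the function names are assumptions made for illustration);
it mirrors the GCompareFunc convention of returning 0 on a match but is
not GLib code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the patch's Packet in a singly linked queue. */
typedef struct Pkt {
    int64_t creation_ms;
    struct Pkt *next;
} Pkt;

/* Parameters for the age check. */
typedef struct OldCheck {
    int64_t now_ms;
    int64_t max_age_ms;
} OldCheck;

/* Same convention as GCompareFunc: return 0 when the element matches. */
typedef int (*pkt_cmp_fn)(const Pkt *pkt, const void *user_data);

/* Walk the list and stop at the first element for which cmp() returns 0. */
Pkt *pkt_find_custom(Pkt *head, const void *user_data, pkt_cmp_fn cmp)
{
    for (Pkt *p = head; p != NULL; p = p->next) {
        if (cmp(p, user_data) == 0) {
            return p;
        }
    }
    return NULL;
}

/* Matches (returns 0) when the packet is older than max_age_ms. */
int pkt_is_old(const Pkt *pkt, const void *user_data)
{
    const OldCheck *c = user_data;

    return (c->now_ms - pkt->creation_ms) > c->max_age_ms ? 0 : 1;
}
```

A non-NULL result plays the role of found_old in the patch, and no
per-element flag check is needed because the walk ends at the match.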

>
>> +    if (found_old) {
>> +        /* do checkpoint will flush old packet */
>> +        /* TODO: colo_notify_checkpoint();*/
>> +    }
>> +}
>> +
>> +/*
>> + * Look for old packets that the secondary hasn't matched,
>> + * if we have some then we have to checkpoint to wake
>> + * the secondary up.
>> + */
>> +static void colo_old_packet_check(void *opaque)
>> +{
>> +    CompareState *s = opaque;
>> +
>> +    g_queue_foreach(&s->conn_list, colo_old_packet_check_one_conn, 
>> NULL);
>> +}
>> +
>> +/*
>> + * called from the compare thread on the primary
>> + * for compare connection
>> + */
>> +static void colo_compare_connection(void *opaque, void *user_data)
>> +{
>> +    CompareState *s = user_data;
>> +    Connection *conn = opaque;
>> +    Packet *pkt = NULL;
>> +    GList *result = NULL;
>> +    int ret;
>> +
>> +    while (!g_queue_is_empty(&conn->primary_list) &&
>> +           !g_queue_is_empty(&conn->secondary_list)) {
>> +        pkt = g_queue_pop_tail(&conn->primary_list);
>> +        result = g_queue_find_custom(&conn->secondary_list,
>> +                              pkt, 
>> (GCompareFunc)colo_packet_compare_all);
>> +
>> +        if (result) {
>> +            ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
>> +            if (ret < 0) {
>> +                error_report("colo_send_primary_packet failed");
>> +            }
>> +            trace_colo_compare_main("packet same and release packet");
>> +            g_queue_remove(&conn->secondary_list, result->data);
>> +            packet_destroy(pkt, NULL);
>> +        } else {
>
> Better add a comment to explain the case when secondary packet comes a 
> little bit late here.

OK, I will add comments in the next version.

>
>> + trace_colo_compare_main("packet different");
>> +            g_queue_push_tail(&conn->primary_list, pkt);
>> +            /* TODO: colo_notify_checkpoint();*/
>> +            break;
>> +        }
>> +    }
>> +}
>> +
>>   static int compare_chr_send(CharDriverState *out,
>>                               const uint8_t *buf,
>>                               uint32_t size)
>> @@ -179,6 +292,65 @@ err:
>>       return ret < 0 ? ret : -EIO;
>>   }
>>   +static int compare_chr_can_read(void *opaque)
>> +{
>> +    return COMPARE_READ_LEN_MAX;
>> +}
>> +
>> +/*
>> + * called from the main thread on the primary for packets
>> + * arriving over the socket from the primary.
>> + */
>> +static void compare_pri_chr_in(void *opaque, const uint8_t *buf, int 
>> size)
>> +{
>> +    CompareState *s = COLO_COMPARE(opaque);
>> +    int ret;
>> +
>> +    ret = net_fill_rstate(&s->pri_rs, buf, size);
>> +    if (ret == -1) {
>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, NULL);
>> +        error_report("colo-compare primary_in error");
>> +    }
>> +}
>> +
>> +/*
>> + * called from the main thread on the primary for packets
>> + * arriving over the socket from the secondary.
>> + */
>> +static void compare_sec_chr_in(void *opaque, const uint8_t *buf, int 
>> size)
>> +{
>> +    CompareState *s = COLO_COMPARE(opaque);
>> +    int ret;
>> +
>> +    ret = net_fill_rstate(&s->sec_rs, buf, size);
>> +    if (ret == -1) {
>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, NULL);
>> +        error_report("colo-compare secondary_in error");
>> +    }
>> +}
>> +
>> +static void *colo_compare_thread(void *opaque)
>> +{
>> +    GMainContext *worker_context;
>> +    GMainLoop *compare_loop;
>> +    CompareState *s = opaque;
>> +
>> +    worker_context = g_main_context_new();
>> +
>> +    qemu_chr_add_handlers_full(s->chr_pri_in, compare_chr_can_read,
>> +                          compare_pri_chr_in, NULL, s, worker_context);
>> +    qemu_chr_add_handlers_full(s->chr_sec_in, compare_chr_can_read,
>> +                          compare_sec_chr_in, NULL, s, worker_context);
>> +
>> +    compare_loop = g_main_loop_new(worker_context, FALSE);
>> +
>> +    g_main_loop_run(compare_loop);
>> +
>> +    g_main_loop_unref(compare_loop);
>> +    g_main_context_unref(worker_context);
>> +    return NULL;
>> +}
>> +
>>   static char *compare_get_pri_indev(Object *obj, Error **errp)
>>   {
>>       CompareState *s = COLO_COMPARE(obj);
>> @@ -231,6 +403,9 @@ static void 
>> compare_pri_rs_finalize(SocketReadState *pri_rs)
>>       if (packet_enqueue(s, PRIMARY_IN)) {
>>           trace_colo_compare_main("primary: unsupported packet in");
>>           compare_chr_send(s->chr_out, pri_rs->buf, pri_rs->packet_len);
>> +    } else {
>> +        /* compare connection */
>> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
>>       }
>>   }
>>   @@ -240,6 +415,9 @@ static void 
>> compare_sec_rs_finalize(SocketReadState *sec_rs)
>>         if (packet_enqueue(s, SECONDARY_IN)) {
>>           trace_colo_compare_main("secondary: unsupported packet in");
>> +    } else {
>> +        /* compare connection */
>> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
>>       }
>>   }
>>   @@ -266,6 +444,20 @@ static int compare_chardev_opts(void *opaque,
>>   }
>>     /*
>> + * Check old packet regularly so it can watch for any packets
>> + * that the secondary hasn't produced equivalents of.
>> + */
>> +static void check_old_packet_regular(void *opaque)
>> +{
>> +    CompareState *s = opaque;
>> +
>> +    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +              REGULAR_PACKET_CHECK_MS);
>> +    /* if have old packet we will notify checkpoint */
>> +    colo_old_packet_check(s);
>> +}
>> +
>> +/*
>>    * called from the main thread on the primary
>>    * to setup colo-compare.
>>    */
>> @@ -273,6 +465,8 @@ static void colo_compare_complete(UserCreatable 
>> *uc, Error **errp)
>>   {
>>       CompareState *s = COLO_COMPARE(uc);
>>       CompareChardevProps props;
>> +    char thread_name[64];
>> +    static int compare_id;
>>         if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>>           error_setg(errp, "colo compare needs 'primary_in' ,"
>> @@ -356,6 +550,18 @@ static void colo_compare_complete(UserCreatable 
>> *uc, Error **errp)
>>                                                         g_free,
>> connection_destroy);
>>   +    sprintf(thread_name, "colo-compare %d", compare_id);
>> +    qemu_thread_create(&s->thread, thread_name,
>> +                       colo_compare_thread, s,
>> +                       QEMU_THREAD_JOINABLE);
>> +    compare_id++;
>> +
>> +    /* A regular timer to kick any packets that the secondary 
>> doesn't match */
>> +    s->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, /* Only when guest 
>> runs */
>> +                            check_old_packet_regular, s);
>> +    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +                        REGULAR_PACKET_CHECK_MS);
>
> I still think we need to make sure the timer were processed in colo 
> thread. Since check_old_packet_regular may iterate conn_list which may 
> be modified by colo thread at the same time.

Makes sense. Here we only read conn_list, but since the colo thread may
modify it at the same time, maybe we should add a lock for it?
We don't have an easy way to make the timer's handler run in the colo
thread, so the handler runs in the main loop. Maybe we can do this job
later.
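
A minimal sketch of the locking idea, assuming a plain pthread mutex and
a fixed-size array as stand-ins (ConnList and the function names are
hypothetical; QEMU code would normally use its own mutex wrappers and
the real GQueue). The point is that the reader takes the same lock as
the writer, even though it never modifies the list:

```c
#include <pthread.h>
#include <stddef.h>

#define SKETCH_MAX_CONN 16

/* Hypothetical stand-in for conn_list plus the lock that guards it. */
typedef struct ConnList {
    pthread_mutex_t lock;
    int oldest_pkt_age_ms[SKETCH_MAX_CONN];
    size_t len;
} ConnList;

void conn_list_init(ConnList *l)
{
    pthread_mutex_init(&l->lock, NULL);
    l->len = 0;
}

/* Writer path (compare thread): returns 0 on success, -1 if full. */
int conn_list_add(ConnList *l, int age_ms)
{
    int ret = -1;

    pthread_mutex_lock(&l->lock);
    if (l->len < SKETCH_MAX_CONN) {
        l->oldest_pkt_age_ms[l->len++] = age_ms;
        ret = 0;
    }
    pthread_mutex_unlock(&l->lock);
    return ret;
}

/* Reader path (timer handler): count connections with an old packet. */
size_t conn_list_count_old(ConnList *l, int limit_ms)
{
    size_t old = 0;

    pthread_mutex_lock(&l->lock);
    for (size_t i = 0; i < l->len; i++) {
        if (l->oldest_pkt_age_ms[i] > limit_ms) {
            old++;
        }
    }
    pthread_mutex_unlock(&l->lock);
    return old;
}
```

Running the timer handler in the colo thread's own context, as Jason
suggests, would avoid the lock entirely; this sketch only covers the
lock-based alternative.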

Thanks
Zhang Chen

>
>> +
>>       return;
>>   }
>>   @@ -397,6 +603,16 @@ static void colo_compare_finalize(Object *obj)
>>         g_queue_free(&s->conn_list);
>>   +    if (s->thread.thread) {
>> +        /* compare connection */
>> +        g_queue_foreach(&s->conn_list, colo_compare_connection, s);
>> +        qemu_thread_join(&s->thread);
>> +    }
>> +
>> +    if (s->timer) {
>> +        timer_del(s->timer);
>> +    }
>> +
>>       g_free(s->pri_indev);
>>       g_free(s->sec_indev);
>>       g_free(s->outdev);
>> diff --git a/net/colo.c b/net/colo.c
>> index bc86553..da4b771 100644
>> --- a/net/colo.c
>> +++ b/net/colo.c
>> @@ -128,6 +128,7 @@ Packet *packet_new(const void *data, int size)
>>         pkt->data = g_memdup(data, size);
>>       pkt->size = size;
>> +    pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>>         return pkt;
>>   }
>> diff --git a/net/colo.h b/net/colo.h
>> index 9cbc14e..6b395a3 100644
>> --- a/net/colo.h
>> +++ b/net/colo.h
>> @@ -17,6 +17,7 @@
>>     #include "slirp/slirp.h"
>>   #include "qemu/jhash.h"
>> +#include "qemu/timer.h"
>>     #define HASHTABLE_MAX_SIZE 16384
>>   @@ -28,6 +29,8 @@ typedef struct Packet {
>>       };
>>       uint8_t *transport_header;
>>       int size;
>> +    /* Time of packet creation, in wall clock ms */
>> +    int64_t creation_ms;
>>   } Packet;
>>     typedef struct ConnectionKey {
>> diff --git a/trace-events b/trace-events
>> index 703de1a..1537e91 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1919,3 +1919,5 @@ aspeed_vic_write(uint64_t offset, unsigned 
>> size, uint32_t data) "To 0x%" PRIx64
>>     # net/colo-compare.c
>>   colo_compare_main(const char *chr) ": %s"
>> +colo_compare_ip_info(int psize, const char *sta, const char *stb, 
>> int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src 
>> = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
>> +colo_old_packet_check_found(int64_t old_time) "%" PRId64
>
>
>
> .
>

-- 
Thanks
zhangchen

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison
  2016-08-31  9:33   ` Jason Wang
@ 2016-09-01  5:00     ` Zhang Chen
  2016-09-01  7:40       ` Jason Wang
  0 siblings, 1 reply; 32+ messages in thread
From: Zhang Chen @ 2016-09-01  5:00 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 08/31/2016 05:33 PM, Jason Wang wrote:
>
>
> On 2016-08-17 16:10, Zhang Chen wrote:
>> We add TCP,UDP,ICMP packet comparison to replace
>> IP packet comparison. This can increase the
>> accuracy of the package comparison.
>> Less checkpoint more efficiency.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>>   trace-events       |   4 ++
>>   2 files changed, 152 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index b90cf1f..0daefd9 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -18,6 +18,7 @@
>>   #include "qapi/qmp/qerror.h"
>>   #include "qapi/error.h"
>>   #include "net/net.h"
>> +#include "net/eth.h"
>>   #include "net/vhost_net.h"
>>   #include "qom/object_interfaces.h"
>>   #include "qemu/iov.h"
>> @@ -179,9 +180,136 @@ static int colo_packet_compare(Packet *ppkt, 
>> Packet *spkt)
>>       }
>>   }
>>   -static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
>> +/*
>> + * called from the compare thread on the primary
>> + * for compare tcp packet
>> + * compare_tcp copied from Dr. David Alan Gilbert's branch
>> + */
>> +static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>> +{
>> +    struct tcphdr *ptcp, *stcp;
>> +    int res;
>> +    char *sdebug, *ddebug;
>> +
>> +    trace_colo_compare_main("compare tcp");
>> +    if (ppkt->size != spkt->size) {
>> +        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
>> +            trace_colo_compare_main("pkt size not same");
>> +        }
>> +        return -1;
>> +    }
>> +
>> +    ptcp = (struct tcphdr *)ppkt->transport_header;
>> +    stcp = (struct tcphdr *)spkt->transport_header;
>> +
>> +    if (ptcp->th_seq != stcp->th_seq) {
>> +        if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
>> +            trace_colo_compare_main("pkt tcp seq not same");
>> +        }
>> +        return -1;
>> +    }
>> +
>> +    /*
>> +     * The 'identification' field in the IP header is *very* random
>> +     * it almost never matches.  Fudge this by ignoring differences in
>> +     * unfragmented packets; they'll normally sort themselves out if 
>> different
>> +     * anyway, and it should recover at the TCP level.
>> +     * An alternative would be to get both the primary and secondary 
>> to rewrite
>> +     * somehow; but that would need some sync traffic to sync the state
>> +     */
>> +    if (ntohs(ppkt->ip->ip_off) & IP_DF) {
>> +        spkt->ip->ip_id = ppkt->ip->ip_id;
>> +        /* and the sum will be different if the IDs were different */
>> +        spkt->ip->ip_sum = ppkt->ip->ip_sum;
>> +    }
>> +
>> +    res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
>> +                (spkt->size - ETH_HLEN));
>
> This may work, but I worry about whether tagged packets can work 
> here. It looks like parse_packet_early() can recognize a VLAN tag, but 
> fill_connection_key() cannot. This could result in queuing packets 
> into the wrong connection.

Currently the COLO proxy does not support VLAN; we will add this feature in the 
future.
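
For illustration only (not part of the patch): the IP-ID fudge quoted above overwrites ip_id and ip_sum in the secondary packet before calling memcmp(). A rough standalone sketch of the same idea that skips those two fields instead of rewriting them could look like the following. The sketch_ names are hypothetical, the buffers are assumed to start at the IPv4 header (that is, untagged frames with the Ethernet header already skipped), and both packets are assumed to have the same length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_IP_HLEN_MIN 20      /* IPv4 header without options */
#define SKETCH_IP_DF       0x4000  /* "don't fragment" bit in flags/offset */

/*
 * Compare two IPv4 packets of equal length, both starting at the IP header.
 * If the primary packet has DF set, the identification field (bytes 4-5)
 * and the header checksum (bytes 10-11) are skipped.
 * Returns 0 on match, -1 otherwise.
 */
int sketch_ip_compare(const uint8_t *p, const uint8_t *s, size_t len)
{
    uint16_t flags_off;

    assert(p != NULL && s != NULL);
    if (len < SKETCH_IP_HLEN_MIN) {
        return -1;
    }
    /* Flags and fragment offset are stored in network byte order. */
    flags_off = (uint16_t)((p[6] << 8) | p[7]);
    if (!(flags_off & SKETCH_IP_DF)) {
        return memcmp(p, s, len) ? -1 : 0;
    }
    if (memcmp(p, s, 4) ||                   /* version .. total length */
        memcmp(p + 6, s + 6, 4) ||           /* flags/offset, TTL, protocol */
        memcmp(p + 12, s + 12, len - 12)) {  /* addresses, options, payload */
        return -1;
    }
    return 0;
}
```

Unlike the quoted code, this leaves the secondary packet untouched, which may matter if the packet is inspected again later.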

>
>> +
>> +    if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
>> +        sdebug = g_strdup(inet_ntoa(ppkt->ip->ip_src));
>> +        ddebug = g_strdup(inet_ntoa(ppkt->ip->ip_dst));
>> +        fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
>> +        " s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__,
>> +                   sdebug, ddebug,
>> +                   ntohl(ptcp->th_seq), ntohl(ptcp->th_ack),
>> +                   ntohl(stcp->th_seq), ntohl(stcp->th_ack),
>> +                   res, ptcp->th_flags, stcp->th_flags);
>
> I tend not to mix debug logs with tracepoints.

OK, I will change trace_colo_compare_tcp_miscompare() to fprintf() here.

Thanks
Zhang Chen

>
>> +
>> +        trace_colo_compare_tcp_miscompare("Primary len", ppkt->size);
>> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
>> +        trace_colo_compare_tcp_miscompare("Secondary len", spkt->size);
>> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
>> +
>> +        g_free(sdebug);
>> +        g_free(ddebug);
>> +    }
>> +
>> +    return res;
>> +}
>> +
>> +/*
>> + * called from the compare thread on the primary
>> + * for compare udp packet
>> + */
>> +static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
>> +{
>> +    int ret;
>> +
>> +    trace_colo_compare_main("compare udp");
>> +    ret = colo_packet_compare(ppkt, spkt);
>> +
>> +    if (ret) {
>> +        trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size);
>> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
>> +        trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size);
>> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +/*
>> + * called from the compare thread on the primary
>> + * for compare icmp packet
>> + */
>> +static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
>> +{
>> +    int network_length;
>> +
>> +    trace_colo_compare_main("compare icmp");
>> +    network_length = ppkt->ip->ip_hl * 4;
>> +    if (ppkt->size != spkt->size ||
>> +        ppkt->size < network_length + ETH_HLEN) {
>> +        return -1;
>> +    }
>> +
>> +    if (colo_packet_compare(ppkt, spkt)) {
>> +        trace_colo_compare_icmp_miscompare("primary pkt size",
>> +                                           ppkt->size);
>> +        qemu_hexdump((char *)ppkt->data, stderr, "colo-compare",
>> +                     ppkt->size);
>> +        trace_colo_compare_icmp_miscompare("Secondary pkt size",
>> +                                           spkt->size);
>> +        qemu_hexdump((char *)spkt->data, stderr, "colo-compare",
>> +                     spkt->size);
>> +        return -1;
>> +    } else {
>> +        return 0;
>> +    }
>> +}
>> +
>> +/*
>> + * called from the compare thread on the primary
>> + * for compare other packet
>> + */
>> +static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
>>   {
>> -    trace_colo_compare_main("compare all");
>> +    trace_colo_compare_main("compare other");
>> +    trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
>> +                               inet_ntoa(ppkt->ip->ip_dst), spkt->size,
>> +                               inet_ntoa(spkt->ip->ip_src),
>> +                               inet_ntoa(spkt->ip->ip_dst));
>>       return colo_packet_compare(ppkt, spkt);
>>   }
>>   @@ -245,8 +373,24 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>       while (!g_queue_is_empty(&conn->primary_list) &&
>>              !g_queue_is_empty(&conn->secondary_list)) {
>>           pkt = g_queue_pop_tail(&conn->primary_list);
>> -        result = g_queue_find_custom(&conn->secondary_list,
>> -                              pkt, (GCompareFunc)colo_packet_compare_all);
>> +        switch (conn->ip_proto) {
>> +        case IPPROTO_TCP:
>> +            result = g_queue_find_custom(&conn->secondary_list,
>> +                     pkt, (GCompareFunc)colo_packet_compare_tcp);
>> +            break;
>> +        case IPPROTO_UDP:
>> +            result = g_queue_find_custom(&conn->secondary_list,
>> +                     pkt, (GCompareFunc)colo_packet_compare_udp);
>> +            break;
>> +        case IPPROTO_ICMP:
>> +            result = g_queue_find_custom(&conn->secondary_list,
>> +                     pkt, (GCompareFunc)colo_packet_compare_icmp);
>> +            break;
>> +        default:
>> +            result = g_queue_find_custom(&conn->secondary_list,
>> +                     pkt, (GCompareFunc)colo_packet_compare_other);
>> +            break;
>> +        }
>>             if (result) {
>>               ret = compare_chr_send(s->chr_out, pkt->data, pkt->size);
>> diff --git a/trace-events b/trace-events
>> index 1537e91..ab22eb2 100644
>> --- a/trace-events
>> +++ b/trace-events
>> @@ -1919,5 +1919,9 @@ aspeed_vic_write(uint64_t offset, unsigned size, uint32_t data) "To 0x%" PRIx64
>>     # net/colo-compare.c
>>   colo_compare_main(const char *chr) ": %s"
>> +colo_compare_tcp_miscompare(const char *sta, int size) ": %s = %d"
>> +colo_compare_udp_miscompare(const char *sta, int size) ": %s = %d"
>> +colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
>>   colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
>>   colo_old_packet_check_found(int64_t old_time) "%" PRId64
>> +colo_compare_miscompare(void) ""
>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization
  2016-08-31  9:39         ` Zhang Chen
@ 2016-09-01  6:32           ` Zhang Chen
  0 siblings, 0 replies; 32+ messages in thread
From: Zhang Chen @ 2016-09-01  6:32 UTC (permalink / raw)
  To: qemu-devel



On 08/31/2016 05:39 PM, Zhang Chen wrote:
>
>
> On 08/31/2016 05:20 PM, Jason Wang wrote:
>>
>>
>> On 2016年08月31日 17:03, Zhang Chen wrote:
>>>
>>>
>>> On 08/31/2016 03:53 PM, Jason Wang wrote:
>>>>
>>>>
>>>> On 2016年08月17日 16:10, Zhang Chen wrote:
>>>>> This is a COLO net ASCII figure:
>>>>>
>>>>>   Primary qemu                                                           Secondary qemu
>>>>> +--------------------------------------------------------------+       +----------------------------------------------------------------+
>>>>> | +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
>>>>> | |                                                          | |       |  |                                                           | |
>>>>> | |                        guest                             | |       |  |                        guest                              | |
>>>>> | |                                                          | |       |  |                                                           | |
>>>>> | +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
>>>>> |         |                          |                         |       |                        ^        |                              |
>>>>> |         |                          |                         |       |                        |        |                              |
>>>>> |         |  +------------------------------------------------------+  |                        |        |                              |
>>>>> |netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
>>>>> | +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
>>>>> | |       |  |                       |      |        out       |    |  |  |                     |        |  filter execute order      | |
>>>>> | |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
>>>>> | |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
>>>>> | | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
>>>>> | | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
>>>>> | | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +-------->  filter   +--> adjust |   adjust     +-->   filter   | | |
>>>>> | | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
>>>>> | | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
>>>>> | | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
>>>>> | |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
>>>>> | |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
>>>>> | |      |             +--------------+     |                  |  |    |                                                   |            |
>>>>> | |      |   filter execute order     |     |                  |  |    |                                                   |            |
>>>>> | |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
>>>>> | +-----------------------------------------+                  |       |                                                                |
>>>>> |        |                            |                        |       |                                                                |
>>>>> +--------------------------------------------------------------+       +----------------------------------------------------------------+
>>>>>          |guest receive               | guest send
>>>>>          |                            |
>>>>> +--------+----------------------------v------------------------+
>>>>> |                                                              |                          NOTE: filter direction is rx/tx/all
>>>>> |                         tap                                  |                          rx: receive packets sent to the netdev
>>>>> |                                                              |                          tx: receive packets sent by the netdev
>>>>> +--------------------------------------------------------------+
>>>>
>>>> It's better to add a doc under docs to explain this configuration 
>>>> in detail on top of this series.
>>>>
>>>
>>> Following your suggestion, should I add /docs/colo-proxy.txt to explain 
>>> it, or add this to hailiang's COLO-FT.txt after the merge?
>
> Can you suggest how I should do the doc?
>
>

Sorry, I misunderstood what hailiang meant. I will add /docs/colo-proxy.txt 
in the next version.

Thanks
Zhang Chen

>>>
>>>>> COLO-compare does the packet comparison job.
>>>>> Packets coming from the primary char indev will be sent to outdev.
>>>>> Packets coming from the secondary char dev will be dropped after 
>>>>> comparing.
>>>>> colo-compare needs two input chardevs and one output chardev:
>>>>> primary_in=chardev1-id (source: primary send packet)
>>>>> secondary_in=chardev2-id (source: secondary send packet)
>>>>> outdev=chardev3-id
>>>>>
>>>>> usage:
>>>>>
>>>>> primary:
>>>>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>>> -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>>>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>>>> -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>>>> -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>>>> -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>>>> -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>>>> -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>>>> -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>>>> -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>>>> -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>>>> -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>>>>
>>>>> secondary:
>>>>> -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>>> -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>>>> -chardev socket,id=red0,host=3.3.3.3,port=9003
>>>>> -chardev socket,id=red1,host=3.3.3.3,port=9004
>>>>> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>>>> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>>>>
>>>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>>>>> ---
>>>>>   net/Makefile.objs  |   1 +
>>>>>   net/colo-compare.c | 284 
>>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>   qemu-options.hx    |  39 ++++++++
>>>>>   vl.c               |   3 +-
>>>>>   4 files changed, 326 insertions(+), 1 deletion(-)
>>>>>   create mode 100644 net/colo-compare.c
>>>>>
>>>>> diff --git a/net/Makefile.objs b/net/Makefile.objs
>>>>> index b7c22fd..ba92f73 100644
>>>>> --- a/net/Makefile.objs
>>>>> +++ b/net/Makefile.objs
>>>>> @@ -16,3 +16,4 @@ common-obj-$(CONFIG_NETMAP) += netmap.o
>>>>>   common-obj-y += filter.o
>>>>>   common-obj-y += filter-buffer.o
>>>>>   common-obj-y += filter-mirror.o
>>>>> +common-obj-y += colo-compare.o
>>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>>> new file mode 100644
>>>>> index 0000000..cdc3e0e
>>>>> --- /dev/null
>>>>> +++ b/net/colo-compare.c
>>>>> @@ -0,0 +1,284 @@
>>>>> +/*
>>>>> + * COarse-grain LOck-stepping Virtual Machines for Non-stop 
>>>>> Service (COLO)
>>>>> + * (a.k.a. Fault Tolerance or Continuous Replication)
>>>>> + *
>>>>> + * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
>>>>> + * Copyright (c) 2016 FUJITSU LIMITED
>>>>> + * Copyright (c) 2016 Intel Corporation
>>>>> + *
>>>>> + * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>>> + *
>>>>> + * This work is licensed under the terms of the GNU GPL, version 
>>>>> 2 or
>>>>> + * later.  See the COPYING file in the top-level directory.
>>>>> + */
>>>>> +
>>>>> +#include "qemu/osdep.h"
>>>>> +#include "qemu/error-report.h"
>>>>> +#include "qemu-common.h"
>>>>> +#include "qapi/qmp/qerror.h"
>>>>> +#include "qapi/error.h"
>>>>> +#include "net/net.h"
>>>>> +#include "net/vhost_net.h"
>>>>
>>>> Looks unnecessary.
>>>
>>> I will remove it.
>>>
>>>>
>>>>> +#include "qom/object_interfaces.h"
>>>>> +#include "qemu/iov.h"
>>>>> +#include "qom/object.h"
>>>>> +#include "qemu/typedefs.h"
>>>>> +#include "net/queue.h"
>>>>> +#include "sysemu/char.h"
>>>>> +#include "qemu/sockets.h"
>>>>> +#include "qapi-visit.h"
>>>>> +
>>>>> +#define TYPE_COLO_COMPARE "colo-compare"
>>>>> +#define COLO_COMPARE(obj) \
>>>>> +    OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
>>>>> +
>>>>> +#define COMPARE_READ_LEN_MAX NET_BUFSIZE
>>>>> +
>>>>> +typedef struct CompareState {
>>>>> +    Object parent;
>>>>> +
>>>>> +    char *pri_indev;
>>>>> +    char *sec_indev;
>>>>> +    char *outdev;
>>>>> +    CharDriverState *chr_pri_in;
>>>>> +    CharDriverState *chr_sec_in;
>>>>> +    CharDriverState *chr_out;
>>>>> +    QTAILQ_ENTRY(CompareState) next;
>>>>
>>>> This looks not used in this series but in commit "colo-compare and 
>>>> filter-rewriter work with colo-frame". We'd better delay the 
>>>> introducing to that patch.
>>>
>>> OK~ I got your point.
>>>
>>>>
>>>>> +    SocketReadState pri_rs;
>>>>> +    SocketReadState sec_rs;
>>>>> +} CompareState;
>>>>> +
>>>>> +typedef struct CompareClass {
>>>>> +    ObjectClass parent_class;
>>>>> +} CompareClass;
>>>>> +
>>>>> +typedef struct CompareChardevProps {
>>>>> +    bool is_socket;
>>>>> +    bool is_unix;
>>>>> +} CompareChardevProps;
>>>>> +
>>>>> +static char *compare_get_pri_indev(Object *obj, Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    return g_strdup(s->pri_indev);
>>>>> +}
>>>>> +
>>>>> +static void compare_set_pri_indev(Object *obj, const char *value, 
>>>>> Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    g_free(s->pri_indev);
>>>>> +    s->pri_indev = g_strdup(value);
>>>>> +}
>>>>> +
>>>>> +static char *compare_get_sec_indev(Object *obj, Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    return g_strdup(s->sec_indev);
>>>>> +}
>>>>> +
>>>>> +static void compare_set_sec_indev(Object *obj, const char *value, 
>>>>> Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    g_free(s->sec_indev);
>>>>> +    s->sec_indev = g_strdup(value);
>>>>> +}
>>>>> +
>>>>> +static char *compare_get_outdev(Object *obj, Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    return g_strdup(s->outdev);
>>>>> +}
>>>>> +
>>>>> +static void compare_set_outdev(Object *obj, const char *value, 
>>>>> Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    g_free(s->outdev);
>>>>> +    s->outdev = g_strdup(value);
>>>>> +}
>>>>> +
>>>>> +static void compare_pri_rs_finalize(SocketReadState *pri_rs)
>>>>> +{
>>>>> +    /* if packet_enqueue pri pkt failed we will send unsupported 
>>>>> packet */
>>>>> +}
>>>>> +
>>>>> +static void compare_sec_rs_finalize(SocketReadState *sec_rs)
>>>>> +{
>>>>> +    /* if packet_enqueue sec pkt failed we will notify trace */
>>>>> +}
>>>>> +
>>>>> +static int compare_chardev_opts(void *opaque,
>>>>> +                                const char *name, const char *value,
>>>>> +                                Error **errp)
>>>>> +{
>>>>> +    CompareChardevProps *props = opaque;
>>>>> +
>>>>> +    if (strcmp(name, "backend") == 0 && strcmp(value, "socket") 
>>>>> == 0) {
>>>>> +        props->is_socket = true;
>>>>> +    } else if (strcmp(name, "host") == 0) {
>>>>
>>>> Typo? net_vhost_chardev_opts() did:
>>>>
>>>>     } else if (strcmp(name, "path") == 0) {
>>>>         props->is_unix = true;
>>>>     }
>>>>
>>>>
>>>
>>> No, in colo-compare we use the chardev like this:
>>>
>>> -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>>
>>> If we only accept "path" here, it will trigger an error.
>>> Should I add another branch for "path" here?
>>
>> If I understand the code correctly, "is_unix" means "is unix domain 
>> socket"? If yes, according to the help:
>>
>> -chardev 
>> socket,id=id[,host=host],port=port[,to=to][,ipv4][,ipv6][,nodelay][,reconnect=seconds]
>> [,server][,nowait][,telnet][,reconnect=seconds][,mux=on|off]
>>          [,logfile=PATH][,logappend=on|off][,tls-creds=ID] (tcp)
>> -chardev 
>> socket,id=id,path=path[,server][,nowait][,telnet][,reconnect=seconds]
>>          [,mux=on|off][,logfile=PATH][,logappend=on|off] (unix)
>>
>> "host" will not be used for UNIX domain socket.
>>
>> And if UNIX domain socket is not supported, there's probably no need 
>> to differentiate it from other types.
>
> OK, I will remove the "is_unix" in next version.
>
>
>>
>>>
>>>
>>>>
>>>>> +        props->is_unix = true;
>>>>> +    } else if (strcmp(name, "port") == 0) {
>>>>> +    } else if (strcmp(name, "server") == 0) {
>>>>> +    } else if (strcmp(name, "wait") == 0) {
>>>>> +    } else {
>>>>> +        error_setg(errp,
>>>>> +                   "COLO-compare does not support a chardev with 
>>>>> option %s=%s",
>>>>> +                   name, value);
>>>>> +        return -1;
>>>>> +    }
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * called from the main thread on the primary
>>>>> + * to setup colo-compare.
>>>>> + */
>>>>> +static void colo_compare_complete(UserCreatable *uc, Error **errp)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(uc);
>>>>> +    CompareChardevProps props;
>>>>> +
>>>>> +    if (!s->pri_indev || !s->sec_indev || !s->outdev) {
>>>>> +        error_setg(errp, "colo compare needs 'primary_in' ,"
>>>>> +                   "'secondary_in','outdev' property set");
>>>>> +        return;
>>>>> +    } else if (!strcmp(s->pri_indev, s->outdev) ||
>>>>> +               !strcmp(s->sec_indev, s->outdev) ||
>>>>> +               !strcmp(s->pri_indev, s->sec_indev)) {
>>>>> +        error_setg(errp, "'indev' and 'outdev' could not be same "
>>>>> +                   "for compare module");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    s->chr_pri_in = qemu_chr_find(s->pri_indev);
>>>>> +    if (s->chr_pri_in == NULL) {
>>>>> +        error_setg(errp, "Primary IN Device '%s' not found",
>>>>> +                   s->pri_indev);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    /* inspect chardev opts */
>>>>> +    memset(&props, 0, sizeof(props));
>>>>> +    if (qemu_opt_foreach(s->chr_pri_in->opts, 
>>>>> compare_chardev_opts, &props, errp)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!props.is_socket || !props.is_unix) {
>>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>>> +                   s->pri_indev);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    s->chr_sec_in = qemu_chr_find(s->sec_indev);
>>>>> +    if (s->chr_sec_in == NULL) {
>>>>> +        error_setg(errp, "Secondary IN Device '%s' not found",
>>>>> +                   s->sec_indev);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    memset(&props, 0, sizeof(props));
>>>>> +    if (qemu_opt_foreach(s->chr_sec_in->opts, 
>>>>> compare_chardev_opts, &props, errp)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!props.is_socket || !props.is_unix) {
>>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>>> +                   s->sec_indev);
>>>>
>>>> I believe tcp socket is also supported?
>>>
>>> If I understand correctly, "tcp socket" in here is the "-chardev 
>>> socket".
>>> I will rename "unix socket" to "tcp socket".
>>>
>>>>
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    s->chr_out = qemu_chr_find(s->outdev);
>>>>> +    if (s->chr_out == NULL) {
>>>>> +        error_setg(errp, "OUT Device '%s' not found", s->outdev);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    memset(&props, 0, sizeof(props));
>>>>> +    if (qemu_opt_foreach(s->chr_out->opts, compare_chardev_opts, 
>>>>> &props, errp)) {
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (!props.is_socket || !props.is_unix) {
>>>>> +        error_setg(errp, "chardev \"%s\" is not a unix socket",
>>>>> +                   s->outdev);
>>>>
>>>> Ditto, and there's code duplication, please introduce a helper to 
>>>> do above.
>>>
>>> I don't understand what you mean by "helper".
>>> Here we check each chardev; should I change it to "goto error;"?
>>
>> A helper to avoid the code duplication for socket type inspection for 
>> pri_in, sec_in and chr_out.
>
> I got it~~
> I will add it in next version.
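
For illustration only, a minimal standalone sketch of what such a helper could look like. It does not use the real QEMU types: SketchOpt and sketch_check_socket_chardev() are hypothetical stand-ins for the QemuOpts iteration done by compare_chardev_opts(), and the accepted option names are simply the ones listed in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one name=value chardev option. */
typedef struct SketchOpt {
    const char *name;
    const char *value;
} SketchOpt;

/*
 * Return 0 if the options describe a socket chardev that uses only the
 * options accepted in the patch, -1 otherwise.
 */
int sketch_check_socket_chardev(const SketchOpt *opts, size_t n)
{
    bool is_socket = false;
    size_t i;

    assert(opts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const char *name = opts[i].name;

        if (strcmp(name, "backend") == 0) {
            is_socket = strcmp(opts[i].value, "socket") == 0;
        } else if (strcmp(name, "host") != 0 && strcmp(name, "port") != 0 &&
                   strcmp(name, "server") != 0 && strcmp(name, "wait") != 0) {
            return -1; /* unsupported option */
        }
    }
    return is_socket ? 0 : -1;
}
```

With one helper of this shape, the three repeated memset()/qemu_opt_foreach()/error_setg() sequences for pri_indev, sec_indev and outdev would collapse into three calls.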
>
>
>>
>>>
>>>>
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    qemu_chr_fe_claim_no_fail(s->chr_pri_in);
>>>>> +
>>>>> +    qemu_chr_fe_claim_no_fail(s->chr_sec_in);
>>>>> +
>>>>> +    qemu_chr_fe_claim_no_fail(s->chr_out);
>>>>> +
>>>>> +    net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
>>>>> +    net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
>>>>> +
>>>>> +    return;
>>>>> +}
>>>>> +
>>>>> +static void colo_compare_class_init(ObjectClass *oc, void *data)
>>>>> +{
>>>>> +    UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
>>>>> +
>>>>> +    ucc->complete = colo_compare_complete;
>>>>> +}
>>>>> +
>>>>> +static void colo_compare_init(Object *obj)
>>>>> +{
>>>>> +    object_property_add_str(obj, "primary_in",
>>>>> +                            compare_get_pri_indev, 
>>>>> compare_set_pri_indev,
>>>>> +                            NULL);
>>>>> +    object_property_add_str(obj, "secondary_in",
>>>>> +                            compare_get_sec_indev, 
>>>>> compare_set_sec_indev,
>>>>> +                            NULL);
>>>>> +    object_property_add_str(obj, "outdev",
>>>>> +                            compare_get_outdev, compare_set_outdev,
>>>>> +                            NULL);
>>>>> +}
>>>>> +
>>>>> +static void colo_compare_finalize(Object *obj)
>>>>> +{
>>>>> +    CompareState *s = COLO_COMPARE(obj);
>>>>> +
>>>>> +    if (s->chr_pri_in) {
>>>>> +        qemu_chr_add_handlers(s->chr_pri_in, NULL, NULL, NULL, 
>>>>> NULL);
>>>>> +        qemu_chr_fe_release(s->chr_pri_in);
>>>>> +    }
>>>>> +    if (s->chr_sec_in) {
>>>>> +        qemu_chr_add_handlers(s->chr_sec_in, NULL, NULL, NULL, 
>>>>> NULL);
>>>>> +        qemu_chr_fe_release(s->chr_sec_in);
>>>>> +    }
>>>>> +    if (s->chr_out) {
>>>>> +        qemu_chr_fe_release(s->chr_out);
>>>>> +    }
>>>>> +
>>>>> +    g_free(s->pri_indev);
>>>>> +    g_free(s->sec_indev);
>>>>> +    g_free(s->outdev);
>>>>> +}
>>>>> +
>>>>> +static const TypeInfo colo_compare_info = {
>>>>> +    .name = TYPE_COLO_COMPARE,
>>>>> +    .parent = TYPE_OBJECT,
>>>>> +    .instance_size = sizeof(CompareState),
>>>>> +    .instance_init = colo_compare_init,
>>>>> +    .instance_finalize = colo_compare_finalize,
>>>>> +    .class_size = sizeof(CompareClass),
>>>>> +    .class_init = colo_compare_class_init,
>>>>> +    .interfaces = (InterfaceInfo[]) {
>>>>> +        { TYPE_USER_CREATABLE },
>>>>> +        { }
>>>>> +    }
>>>>> +};
>>>>> +
>>>>> +static void register_types(void)
>>>>> +{
>>>>> +    type_register_static(&colo_compare_info);
>>>>> +}
>>>>> +
>>>>> +type_init(register_types);
>>>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>>>> index 587de8f..33d5d0b 100644
>>>>> --- a/qemu-options.hx
>>>>> +++ b/qemu-options.hx
>>>>> @@ -3866,6 +3866,45 @@ Dump the network traffic on netdev 
>>>>> @var{dev} to the file specified by
>>>>>   The file format is libpcap, so it can be analyzed with tools 
>>>>> such as tcpdump
>>>>>   or Wireshark.
>>>>>   +@item -object 
>>>>> colo-compare,id=@var{id},primary_in=@var{chardevid},secondary_in=@var{chardevid},
>>>>> +outdev=@var{chardevid}
>>>>> +
>>>>> +Colo-compare gets packet from primary_in@var{chardevid} and 
>>>>> secondary_in@var{chardevid}, than compare primary packet with
>>>>> +secondary packet. If the packet same, we will output primary
>>>>
>>>> s/If the packet same/If the packets are same/.
>>>
>>> OK.
>>>
>>>>
>>>>> +packet to outdev@var{chardevid}, else we will notify colo-frame
>>>>> +do checkpoint and send primary packet to outdev@var{chardevid}.
>>>>> +
>>>>> +we can use it with the help of filter-mirror and filter-redirector.
>>>>
>>>> s/we/We/ and looks like colo compare must be used with the help of 
>>>> mirror and redirector?
>>>
>>> Currently yes.
>>
>> Then please change the doc here.
>
> s/We can use it/We must use it.
>
> Thanks
> Zhang Chen
>
>>
>> Thanks
>>
>>>
>>>>
>>>>> +
>>>>> +@example
>>>>> +
>>>>> +primary:
>>>>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>>> +-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
>>>>> +-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
>>>>> +-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
>>>>> +-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
>>>>> +-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
>>>>> +-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
>>>>> +-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
>>>>> +-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
>>>>> +-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
>>>>> +-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
>>>>> +-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
>>>>> +
>>>>> +secondary:
>>>>> +-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
>>>>> +-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
>>>>> +-chardev socket,id=red0,host=3.3.3.3,port=9003
>>>>> +-chardev socket,id=red1,host=3.3.3.3,port=9004
>>>>> +-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
>>>>> +-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
>>>>> +
>>>>> +@end example
>>>>> +
>>>>> +If you want to know the detail of above command line, you can read
>>>>> +the colo-compare git log.
>>>>> +
>>>>>   @item -object 
>>>>> secret,id=@var{id},data=@var{string},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>>>   @item -object 
>>>>> secret,id=@var{id},file=@var{filename},format=@var{raw|base64}[,keyid=@var{secretid},iv=@var{string}]
>>>>>   diff --git a/vl.c b/vl.c
>>>>> index cbe51ac..c6b9a6f 100644
>>>>> --- a/vl.c
>>>>> +++ b/vl.c
>>>>> @@ -2865,7 +2865,8 @@ static bool object_create_initial(const char 
>>>>> *type)
>>>>>       if (g_str_equal(type, "filter-buffer") ||
>>>>>           g_str_equal(type, "filter-dump") ||
>>>>>           g_str_equal(type, "filter-mirror") ||
>>>>> -        g_str_equal(type, "filter-redirector")) {
>>>>> +        g_str_equal(type, "filter-redirector") ||
>>>>> +        g_str_equal(type, "colo-compare")) {
>>>>>           return false;
>>>>>       }
>>>>

-- 
Thanks
zhangchen


* Re: [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread
  2016-09-01  4:50     ` Zhang Chen
@ 2016-09-01  7:38       ` Jason Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2016-09-01  7:38 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, Wen Congyang, zhanghailiang, eddie . dong,
	Dr . David Alan Gilbert



On 2016年09月01日 12:50, Zhang Chen wrote:
>>>   + sprintf(thread_name, "colo-compare %d", compare_id);
>>> +    qemu_thread_create(&s->thread, thread_name,
>>> +                       colo_compare_thread, s,
>>> +                       QEMU_THREAD_JOINABLE);
>>> +    compare_id++;
>>> +
>>> +    /* A regular timer to kick any packets that the secondary 
>>> doesn't match */
>>> +    s->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, /* Only when guest 
>>> runs */
>>> +                            check_old_packet_regular, s);
>>> +    timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>>> +                        REGULAR_PACKET_CHECK_MS);
>>
>> I still think we need to make sure the timer is processed in the colo 
>> thread, since check_old_packet_regular may iterate conn_list, which 
>> may be modified by the colo thread at the same time.
>
> Makes sense, but here we only read the conn_list; maybe we should 
> add a lock for it?
> Because we don't have an easy way to make the timer's handler run in 
> the colo thread, the handler runs in the main loop. Maybe we can do 
> this job later.
>
> Thanks
> Zhang Chen 

A lock is OK for this series. But please add a TODO here; we will need 
something like patch 1 to make sure the timer can be processed outside 
the main loop in the future.
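
For illustration only, a minimal standalone sketch of the locking idea. It uses a plain pthread mutex rather than QEMU's own mutex API, and SketchConnList with its two functions are hypothetical stand-ins for conn_list, the compare thread, and the timer handler.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CONN 16

/* Hypothetical stand-in for the connection list shared by both contexts. */
typedef struct SketchConnList {
    pthread_mutex_t lock;
    int64_t oldest_ms[SKETCH_MAX_CONN]; /* arrival time of oldest queued pkt */
    size_t count;
} SketchConnList;

/* Compare-thread side: modify the list only while holding the lock. */
int sketch_conn_add(SketchConnList *l, int64_t oldest_ms)
{
    int ret = -1;

    assert(l != NULL);
    pthread_mutex_lock(&l->lock);
    if (l->count < SKETCH_MAX_CONN) {
        l->oldest_ms[l->count++] = oldest_ms;
        ret = 0;
    }
    pthread_mutex_unlock(&l->lock);
    return ret;
}

/* Timer side: count connections whose oldest packet exceeds max_age_ms. */
size_t sketch_count_stale(SketchConnList *l, int64_t now_ms,
                          int64_t max_age_ms)
{
    size_t i, stale = 0;

    assert(l != NULL);
    pthread_mutex_lock(&l->lock);
    for (i = 0; i < l->count; i++) {
        if (now_ms - l->oldest_ms[i] > max_age_ms) {
            stale++;
        }
    }
    pthread_mutex_unlock(&l->lock);
    return stale;
}
```

Even a read-only walk takes the lock here, because the other thread may be appending or removing entries while the timer handler iterates.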

Thanks


* Re: [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison
  2016-09-01  5:00     ` Zhang Chen
@ 2016-09-01  7:40       ` Jason Wang
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Wang @ 2016-09-01  7:40 UTC (permalink / raw)
  To: Zhang Chen, qemu devel
  Cc: Li Zhijian, eddie . dong, Dr . David Alan Gilbert, zhanghailiang



On 2016年09月01日 13:00, Zhang Chen wrote:
>>> +    /*
>>> +     * The 'identification' field in the IP header is *very* random
>>> +     * it almost never matches.  Fudge this by ignoring differences in
>>> +     * unfragmented packets; they'll normally sort themselves out if different
>>> +     * anyway, and it should recover at the TCP level.
>>> +     * An alternative would be to get both the primary and secondary to rewrite
>>> +     * somehow; but that would need some sync traffic to sync the state
>>> +     */
>>> +    if (ntohs(ppkt->ip->ip_off) & IP_DF) {
>>> +        spkt->ip->ip_id = ppkt->ip->ip_id;
>>> +        /* and the sum will be different if the IDs were different */
>>> +        spkt->ip->ip_sum = ppkt->ip->ip_sum;
>>> +    }
>>> +
>>> +    res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
>>> +                (spkt->size - ETH_HLEN));
>>
>> This may work, but I worry about whether tagged packets can work 
>> here. Looks like parse_packet_early() can recognize a vlan tag, but 
>> fill_connection_key() cannot. This could result in queuing packets 
>> into the wrong connection.
>
> Currently COLO proxy doesn't support vlan; we will add this feature in 
> the future.

Looks like the current code can still queue vlan packets; please make 
sure it can't.
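
One way to do that, as a rough sketch only: check the EtherType before a packet is ever enqueued and refuse anything carrying an 802.1Q tag. This is not taken from the series; frame_is_untagged_ipv4() and the SKETCH_* constants are hypothetical names, and the sketch works on a raw byte buffer rather than on the Packet structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HLEN   14      /* dst MAC + src MAC + EtherType */
#define SKETCH_ETH_P_VLAN 0x8100  /* IEEE 802.1Q tag protocol identifier */
#define SKETCH_ETH_P_IP   0x0800  /* IPv4 */

/*
 * Return true only for untagged IPv4 frames. The caller is expected to
 * skip the connection queues for anything else, so a vlan-tagged frame
 * can never be keyed with offsets that assume an untagged header.
 */
static bool frame_is_untagged_ipv4(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);

    if (len < SKETCH_ETH_HLEN) {
        return false;               /* too short to hold an Ethernet header */
    }

    /* EtherType is stored big-endian at byte offset 12. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    if (ethertype == SKETCH_ETH_P_VLAN) {
        return false;               /* tagged: not supported, do not enqueue */
    }
    return ethertype == SKETCH_ETH_P_IP;
}
```

The explicit vlan branch is redundant with the final IPv4 check, but it marks the place where tag handling would go once vlan support is added.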

Thanks


end of thread, other threads:[~2016-09-01  7:40 UTC | newest]

Thread overview: 32+ messages
2016-08-17  8:10 [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 01/10] qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 02/10] colo-compare: introduce colo compare initialization Zhang Chen
2016-08-31  7:53   ` Jason Wang
2016-08-31  8:06     ` Hailiang Zhang
2016-08-31  9:03     ` Zhang Chen
2016-08-31  9:20       ` Jason Wang
2016-08-31  9:39         ` Zhang Chen
2016-09-01  6:32           ` Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 03/10] net/colo.c: add colo.c to define and handle packet Zhang Chen
2016-08-31  8:04   ` Jason Wang
2016-08-31  9:19     ` Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 04/10] Jhash: add linux kernel jhashtable in qemu Zhang Chen
2016-08-31  8:05   ` Jason Wang
2016-08-31  9:20     ` Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 05/10] colo-compare: track connection and enqueue packet Zhang Chen
2016-08-31  8:52   ` Jason Wang
2016-08-31 11:52     ` Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 06/10] colo-compare: introduce packet comparison thread Zhang Chen
2016-08-31  9:13   ` Jason Wang
2016-09-01  4:50     ` Zhang Chen
2016-09-01  7:38       ` Jason Wang
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 07/10] colo-compare: add TCP, UDP, ICMP packet comparison Zhang Chen
2016-08-31  9:33   ` Jason Wang
2016-09-01  5:00     ` Zhang Chen
2016-09-01  7:40       ` Jason Wang
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 08/10] filter-rewriter: introduce filter-rewriter initialization Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 09/10] filter-rewriter: track connection and parse packet Zhang Chen
2016-08-17  8:10 ` [Qemu-devel] [PATCH V12 10/10] filter-rewriter: rewrite tcp packet to keep secondary connection Zhang Chen
2016-08-25  3:44 ` [Qemu-devel] [PATCH V12 00/10] Introduce COLO-compare and filter-rewriter Zhang Chen
2016-08-25  4:07   ` Jason Wang
2016-08-31  9:39 ` Jason Wang
