From: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
To: qemu devel <qemu-devel@nongnu.org>, Jason Wang <jasowang@redhat.com>
Cc: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>,
	Li Zhijian <lizhijian@cn.fujitsu.com>,
	Wen Congyang <wency@cn.fujitsu.com>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	"eddie . dong" <eddie.dong@intel.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH V12 10/10] filter-rewriter: rewrite tcp packet to keep secondary connection
Date: Wed, 17 Aug 2016 16:10:28 +0800
Message-ID: <1471421428-26379-11-git-send-email-zhangchen.fnst@cn.fujitsu.com>
In-Reply-To: <1471421428-26379-1-git-send-email-zhangchen.fnst@cn.fujitsu.com>

Rewrite the TCP packets that the secondary guest receives and sends.
This covers the case where the COLO guest is the TCP server.

First, the client starts a TCP handshake with a packet carrying
seq=client_seq, ack=0, flag=SYN. The COLO primary guest gets this packet
and mirrors it (filter-mirror) to the secondary guest, which receives it
through filter-redirector.
Then the primary guest responds with
(seq=primary_seq,ack=client_seq+1,flag=ACK|SYN)
and the secondary guest responds with
(seq=secondary_seq,ack=client_seq+1,flag=ACK|SYN).
At this point filter-rewriter saves secondary_seq in its TCP connection.
To finish the handshake, the client sends
(seq=client_seq+1,ack=primary_seq+1,flag=ACK).
Here filter-rewriter learns primary_seq, rewrites the ack from
primary_seq+1 to secondary_seq+1 and recalculates the checksum, so the
secondary TCP connection stays valid.
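
As an illustration only (not part of this patch), the handshake
bookkeeping described above can be sketched as follows. ConnSketch and
the three on_* helpers are hypothetical names, and sequence numbers are
taken in host byte order to keep the sketch short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields this patch adds to Connection. */
typedef struct {
    uint32_t offset;   /* secondary_seq - primary_seq once the handshake is done */
    bool     syn_seen; /* client SYN seen, offset not finalised yet */
} ConnSketch;

/* Client SYN, mirrored from the primary side: arm the one-shot update. */
static void on_client_syn(ConnSketch *c)
{
    c->syn_seen = true;
}

/* Secondary SYN|ACK: remember the secondary's initial sequence number. */
static void on_secondary_synack(ConnSketch *c, uint32_t secondary_seq)
{
    c->offset = secondary_seq;
}

/*
 * Client ACK with ack = primary_seq + 1. The first call finalises the
 * offset; every call returns the ack value to forward to the secondary.
 */
static uint32_t on_client_ack(ConnSketch *c, uint32_t ack)
{
    if (c->syn_seen) {
        c->offset -= ack - 1;  /* unsigned arithmetic wraps modulo 2^32 */
        c->syn_seen = false;
    }
    return ack + c->offset;
}
```

With primary_seq=1000 and secondary_seq=5000, the client's final ACK
carries ack=1001; the sketch settles on offset=4000 and forwards
ack=5001, which is secondary_seq+1.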

When data packets are sent and received:
The client sends
(seq=client_seq+1+data_len,ack=primary_seq+1,flag=ACK|PSH).
filter-rewriter rewrites the ack and sends the packet to the secondary
guest.

The primary guest responds with
(seq=primary_seq+1,ack=client_seq+1+data_len,flag=ACK)
and the secondary guest responds with
(seq=secondary_seq+1,ack=client_seq+1+data_len,flag=ACK).
We rewrite the secondary guest's seq from secondary_seq+1 to
primary_seq+1, so the TCP connection stays valid.

In the code we use offset (= secondary_seq - primary_seq)
to rewrite seq or ack:
handle_primary_tcp_pkt: tcp_pkt->th_ack += offset;
handle_secondary_tcp_pkt: tcp_pkt->th_seq -= offset;
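
The two lines above are shorthand: the header fields are in network
byte order, so the patch converts with ntohl()/htonl() around the
arithmetic. A minimal sketch of that step, assuming a hypothetical
two-field TcpSeqAck struct in place of the real struct tcphdr:

```c
#include <stdint.h>
#include <arpa/inet.h>  /* htonl(), ntohl() (POSIX) */

/* Hypothetical slice of a TCP header; both fields are in network byte order. */
typedef struct {
    uint32_t th_seq;
    uint32_t th_ack;
} TcpSeqAck;

/* Client -> secondary: move ack from the primary's sequence space to the secondary's. */
static void rewrite_ack_for_secondary(TcpSeqAck *h, uint32_t offset)
{
    h->th_ack = htonl(ntohl(h->th_ack) + offset);
}

/* Secondary -> client: move seq back into the primary's sequence space. */
static void rewrite_seq_from_secondary(TcpSeqAck *h, uint32_t offset)
{
    h->th_seq = htonl(ntohl(h->th_seq) - offset);
}
```

Both helpers rely on uint32_t wrapping modulo 2^32, which matches how
TCP sequence numbers wrap. After either rewrite the TCP checksum has to
be recomputed, as the patch does with net_checksum_calculate().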

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
---
 net/colo.c            |   2 +
 net/colo.h            |   7 ++++
 net/filter-rewriter.c | 112 +++++++++++++++++++++++++++++++++++++++++++++++++-
 trace-events          |   5 +++
 4 files changed, 124 insertions(+), 2 deletions(-)

diff --git a/net/colo.c b/net/colo.c
index 667df56..828a201 100644
--- a/net/colo.c
+++ b/net/colo.c
@@ -119,6 +119,8 @@ Connection *connection_new(ConnectionKey *key)
 
     conn->ip_proto = key->ip_proto;
     conn->processing = false;
+    conn->offset = 0;
+    conn->syn_flag = 0;
     g_queue_init(&conn->primary_list);
     g_queue_init(&conn->secondary_list);
 
diff --git a/net/colo.h b/net/colo.h
index 0efaa6d..5c3d003 100644
--- a/net/colo.h
+++ b/net/colo.h
@@ -50,6 +50,13 @@ typedef struct Connection {
     /* flag to enqueue unprocessed_connections */
     bool processing;
     uint8_t ip_proto;
+    /* offset = secondary_seq - primary_seq */
+    tcp_seq  offset;
+    /*
+     * Flag that makes the offset update run
+     * only once per TCP connection
+     */
+    int syn_flag;
 } Connection;
 
 uint32_t connection_key_hash(const void *opaque);
diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c
index 0cb3cef..c1cb7b2 100644
--- a/net/filter-rewriter.c
+++ b/net/filter-rewriter.c
@@ -21,6 +21,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/iov.h"
 #include "net/checksum.h"
+#include "trace.h"
 
 #define FILTER_COLO_REWRITER(obj) \
     OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER)
@@ -63,6 +64,93 @@ static int is_tcp_packet(Packet *pkt)
     }
 }
 
+/* handle tcp packet from primary guest */
+static int handle_primary_tcp_pkt(NetFilterState *nf,
+                                  Connection *conn,
+                                  Packet *pkt)
+{
+    struct tcphdr *tcp_pkt;
+
+    tcp_pkt = (struct tcphdr *)pkt->transport_header;
+    if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) {
+        char *sdebug, *ddebug;
+        sdebug = g_strdup(inet_ntoa(pkt->ip->ip_src));
+        ddebug = g_strdup(inet_ntoa(pkt->ip->ip_dst));
+        trace_colo_filter_rewriter_pkt_info(__func__, sdebug, ddebug,
+                    ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack),
+                    tcp_pkt->th_flags);
+        trace_colo_filter_rewriter_conn_offset(conn->offset);
+        g_free(sdebug);
+        g_free(ddebug);
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_SYN)) {
+        /*
+         * Flag that makes the offset update run
+         * only once per TCP connection
+         */
+        conn->syn_flag = 1;
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK)) {
+        if (conn->syn_flag) {
+            /*
+             * offset = secondary_seq - primary_seq
+             * this ACK arrives from the primary node and acknowledges
+             * primary_seq + 1, so th_ack - 1 gives primary_seq
+             */
+            conn->offset -= (ntohl(tcp_pkt->th_ack) - 1);
+            conn->syn_flag = 0;
+        }
+        /* handle packets to the secondary from the primary */
+        tcp_pkt->th_ack = htonl(ntohl(tcp_pkt->th_ack) + conn->offset);
+
+        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+    }
+
+    return 0;
+}
+
+/* handle tcp packet from secondary guest */
+static int handle_secondary_tcp_pkt(NetFilterState *nf,
+                                    Connection *conn,
+                                    Packet *pkt)
+{
+    struct tcphdr *tcp_pkt;
+
+    tcp_pkt = (struct tcphdr *)pkt->transport_header;
+
+    if (trace_event_get_state(TRACE_COLO_FILTER_REWRITER_DEBUG)) {
+        char *sdebug, *ddebug;
+        sdebug = g_strdup(inet_ntoa(pkt->ip->ip_src));
+        ddebug = g_strdup(inet_ntoa(pkt->ip->ip_dst));
+        trace_colo_filter_rewriter_pkt_info(__func__, sdebug, ddebug,
+                    ntohl(tcp_pkt->th_seq), ntohl(tcp_pkt->th_ack),
+                    tcp_pkt->th_flags);
+        trace_colo_filter_rewriter_conn_offset(conn->offset);
+        g_free(sdebug);
+        g_free(ddebug);
+    }
+
+    if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == (TH_ACK | TH_SYN))) {
+        /*
+         * Save secondary_seq in offset for now;
+         * handle_primary_tcp_pkt later turns it into
+         * offset = secondary_seq - primary_seq
+         */
+        conn->offset = ntohl(tcp_pkt->th_seq);
+    }
+
+    if ((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK) {
+        /* handle packets to the primary from the secondary */
+        tcp_pkt->th_seq = htonl(ntohl(tcp_pkt->th_seq) - conn->offset);
+
+        net_checksum_calculate((uint8_t *)pkt->data, pkt->size);
+    }
+
+    return 0;
+}
+
 static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
                                          NetClientState *sender,
                                          unsigned flags,
@@ -102,10 +190,30 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf,
 
         if (sender == nf->netdev) {
             /* NET_FILTER_DIRECTION_TX */
-            /* handle_primary_tcp_pkt */
+            if (!handle_primary_tcp_pkt(nf, conn, pkt)) {
+                qemu_net_queue_send(s->incoming_queue, sender, 0,
+                (const uint8_t *)pkt->data, pkt->size, NULL);
+                packet_destroy(pkt, NULL);
+                pkt = NULL;
+                /*
+                 * The rewritten packet has been queued
+                 * above, so block the original here
+                 */
+                return 1;
+            }
         } else {
             /* NET_FILTER_DIRECTION_RX */
-            /* handle_secondary_tcp_pkt */
+            if (!handle_secondary_tcp_pkt(nf, conn, pkt)) {
+                qemu_net_queue_send(s->incoming_queue, sender, 0,
+                (const uint8_t *)pkt->data, pkt->size, NULL);
+                packet_destroy(pkt, NULL);
+                pkt = NULL;
+                /*
+                 * The rewritten packet has been queued
+                 * above, so block the original here
+                 */
+                return 1;
+            }
         }
     }
 
diff --git a/trace-events b/trace-events
index ab22eb2..a12279c 100644
--- a/trace-events
+++ b/trace-events
@@ -1925,3 +1925,8 @@ colo_compare_icmp_miscompare(const char *sta, int size) ": %s = %d"
 colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, spkt size = %d, ip_src = %s, ip_dst = %s"
 colo_old_packet_check_found(int64_t old_time) "%" PRId64
 colo_compare_miscompare(void) ""
+
+# net/filter-rewriter.c
+colo_filter_rewriter_debug(void) ""
+colo_filter_rewriter_pkt_info(const char *func, const char *src, const char *dst, uint32_t seq, uint32_t ack, uint32_t flag) "%s: src/dst: %s/%s p: seq/ack=%u/%u  flags=%x\n"
+colo_filter_rewriter_conn_offset(uint32_t offset) ": offset=%u\n"
-- 
2.7.4
