* [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance
@ 2017-07-13  5:52 Zhang Chen
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-13  5:52 UTC (permalink / raw)
  To: qemu devel, Jason Wang; +Cc: Zhang Chen, zhanghailiang, Li Zhijian

In this series, we do a lot of work to optimize COLO net performance,
mainly focusing on the TCP protocol.

V2:
 - Rename p2's subject.

Zhang Chen (4):
  net/colo-compare.c: Add checkpoint min period to optimize performance
  net/colo-compare.c: Compare the tcp packets that has the same sequence
    number
  net/colo-compare.c: Optimize unpredictable tcp options comparison
  net/colo-compare.c: Adjust net queue pop order for performance

 net/colo-compare.c | 57 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 12 deletions(-)

-- 
2.7.4


* [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-13  5:52 [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance Zhang Chen
@ 2017-07-13  5:52 ` Zhang Chen
  2017-07-14  3:22   ` Jason Wang
  2017-07-14 12:10   ` Dr. David Alan Gilbert
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number Zhang Chen
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-13  5:52 UTC (permalink / raw)
  To: qemu devel, Jason Wang; +Cc: Zhang Chen, zhanghailiang, Li Zhijian

If colo-compare finds a first mismatched packet, the packets that
follow are almost certainly different too. There is no need to trigger
many checkpoints during that time, so we add a minimum period between
checkpoints, set to 3 seconds by default.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 6d500e1..0f8e198 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -40,6 +40,9 @@
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+/* TODO: Should be configurable */
+#define CHECKPOINT_MIN_TIME 3000
+
 /*
   + CompareState ++
   |               |
@@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
     Packet *pkt = NULL;
     GList *result = NULL;
     int ret;
+    static int64_t checkpoint_time_ms;
 
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
@@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
              */
             trace_colo_compare_main("packet different");
             g_queue_push_tail(&conn->primary_list, pkt);
-            /* TODO: colo_notify_checkpoint();*/
+
+            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
+                /*
+                 * TODO: Notify colo frame to do checkpoint.
+                 * colo_compare_inconsistent_notify();
+                 */
+                checkpoint_time_ms = pkt->creation_ms;
+            }
             break;
         }
     }
-- 
2.7.4


* [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-13  5:52 [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance Zhang Chen
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
@ 2017-07-13  5:52 ` Zhang Chen
  2017-07-14  3:25   ` Jason Wang
  2017-07-14 12:24   ` Dr. David Alan Gilbert
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison Zhang Chen
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 4/4] net/colo-compare.c: Adjust net queue pop order for performance Zhang Chen
  3 siblings, 2 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-13  5:52 UTC (permalink / raw)
  To: qemu devel, Jason Wang; +Cc: Zhang Chen, zhanghailiang, Li Zhijian

If the primary packet's sequence number differs from the secondary
packet's sequence number, there is no need to compare the packets'
other fields.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 0f8e198..2caeb80 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
     ptcp = (struct tcphdr *)ppkt->transport_header;
     stcp = (struct tcphdr *)spkt->transport_header;
 
+    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
+        ptcp->th_seq != stcp->th_seq) {
+        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
+        return -1;
+    }
+
     /*
      * The 'identification' field in the IP header is *very* random
      * it almost never matches.  Fudge this by ignoring differences in
-- 
2.7.4


* [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison
  2017-07-13  5:52 [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance Zhang Chen
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number Zhang Chen
@ 2017-07-13  5:52 ` Zhang Chen
  2017-07-14  3:33   ` Jason Wang
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 4/4] net/colo-compare.c: Adjust net queue pop order for performance Zhang Chen
  3 siblings, 1 reply; 19+ messages in thread
From: Zhang Chen @ 2017-07-13  5:52 UTC (permalink / raw)
  To: qemu devel, Jason Wang; +Cc: Zhang Chen, zhanghailiang, Li Zhijian

When the network is busy, some TCP options (like SACK) occur
unpredictably on the primary or the secondary side. This makes the
packet sizes differ even though the two packets' payloads are
identical. COLO only cares about the packet payload, so we skip the
options field.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 2caeb80..6406c4a 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -183,7 +183,10 @@ static int packet_enqueue(CompareState *s, int mode)
  * return:    0  means packet same
  *            > 0 || < 0 means packet different
  */
-static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
+static int colo_packet_compare_common(Packet *ppkt,
+                                      Packet *spkt,
+                                      int poffset,
+                                      int soffset)
 {
     if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
         char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
@@ -198,9 +201,10 @@ static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
                                    sec_ip_src, sec_ip_dst);
     }
 
-    if (ppkt->size == spkt->size) {
-        return memcmp(ppkt->data + offset, spkt->data + offset,
-                      spkt->size - offset);
+    if (ppkt->size == spkt->size || poffset != soffset) {
+        return memcmp(ppkt->data + poffset,
+                      spkt->data + soffset,
+                      spkt->size - soffset);
     } else {
         trace_colo_compare_main("Net packet size are not the same");
         return -1;
@@ -263,12 +267,22 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
      * so we just need skip this field.
      */
     if (ptcp->th_off > 5) {
-        ptrdiff_t tcp_offset;
-        tcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
+        ptrdiff_t ptcp_offset, stcp_offset;
+
+        ptcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
                      + (ptcp->th_off * 4);
-        res = colo_packet_compare_common(ppkt, spkt, tcp_offset);
+        stcp_offset = spkt->transport_header - (uint8_t *)spkt->data
+                     + (stcp->th_off * 4);
+
+        /*
+         * When network is busy, some tcp options(like sack) will unpredictable
+         * occur in primary side or secondary side. it will make packet size
+         * not same, but the two packet's payload is identical. colo just
+         * care about packet payload, so we skip the option field.
+         */
+        res = colo_packet_compare_common(ppkt, spkt, ptcp_offset, stcp_offset);
     } else if (ptcp->th_sum == stcp->th_sum) {
-        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN);
+        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN, ETH_HLEN);
     } else {
         res = -1;
     }
@@ -328,6 +342,7 @@ static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
      * the ip payload here.
      */
     ret = colo_packet_compare_common(ppkt, spkt,
+                                     network_header_length + ETH_HLEN,
                                      network_header_length + ETH_HLEN);
 
     if (ret) {
@@ -365,6 +380,7 @@ static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
      * the ip payload here.
      */
     if (colo_packet_compare_common(ppkt, spkt,
+                                   network_header_length + ETH_HLEN,
                                    network_header_length + ETH_HLEN)) {
         trace_colo_compare_icmp_miscompare("primary pkt size",
                                            ppkt->size);
@@ -402,7 +418,7 @@ static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
                                    sec_ip_src, sec_ip_dst);
     }
 
-    return colo_packet_compare_common(ppkt, spkt, 0);
+    return colo_packet_compare_common(ppkt, spkt, 0, 0);
 }
 
 static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
-- 
2.7.4
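
A note on the comparison above: when the two offsets differ, the patched
condition calls memcmp() with a length of spkt->size - soffset, which seems
to assume that both payloads have the same length. The following is only a
minimal sketch of a payload comparison that checks the lengths explicitly
before comparing. PacketSketch and payload_compare() are hypothetical names
used for illustration; they are not part of net/colo-compare.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for QEMU's Packet. */
typedef struct PacketSketch {
    const uint8_t *data; /* start of the frame */
    size_t size;         /* total frame length in bytes */
} PacketSketch;

/*
 * Compare only the bytes after each packet's own header offset.
 * Returns 0 if both payloads have the same length and content,
 * -1 otherwise (including when an offset lies past the packet end).
 */
static int payload_compare(const PacketSketch *ppkt, size_t poffset,
                           const PacketSketch *spkt, size_t soffset)
{
    if (poffset > ppkt->size || soffset > spkt->size) {
        return -1;
    }

    size_t plen = ppkt->size - poffset;
    size_t slen = spkt->size - soffset;

    /* Different payload lengths can never be an identical payload. */
    if (plen != slen) {
        return -1;
    }

    return memcmp(ppkt->data + poffset, spkt->data + soffset, plen) ? -1 : 0;
}
```

With this shape, a primary packet with a 4-byte header and a secondary
packet with an 8-byte header compare as equal whenever the bytes that follow
the headers match, and the comparison never reads past the shorter packet.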


* [Qemu-devel] [PATCH V2 4/4] net/colo-compare.c: Adjust net queue pop order for performance
  2017-07-13  5:52 [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance Zhang Chen
                   ` (2 preceding siblings ...)
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison Zhang Chen
@ 2017-07-13  5:52 ` Zhang Chen
  3 siblings, 0 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-13  5:52 UTC (permalink / raw)
  To: qemu devel, Jason Wang; +Cc: Zhang Chen, zhanghailiang, Li Zhijian

packet_enqueue() uses g_queue_push_tail() to enqueue net packets,
so it is more efficient to use g_queue_pop_head() to get packets
for comparison. That improves the success rate of comparison.
In my test, the performance of an FTP put of a 1000M file
increased by 10%.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 6406c4a..9397269 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -484,7 +484,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
 
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
-        pkt = g_queue_pop_tail(&conn->primary_list);
+        pkt = g_queue_pop_head(&conn->primary_list);
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -519,7 +519,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
              * until next comparison.
              */
             trace_colo_compare_main("packet different");
-            g_queue_push_tail(&conn->primary_list, pkt);
+            g_queue_push_head(&conn->primary_list, pkt);
 
             if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
                 /*
-- 
2.7.4
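
To illustrate the ordering argument: if packets are appended at the tail,
popping from the head yields the oldest packet first, and pushing a
miscompared packet back onto the head keeps it first in line for the next
round. The sketch below uses a small fixed-size deque of ints as a stand-in
for GQueue; IntDeque and its functions are hypothetical and only mirror the
push_tail / pop_head / push_head operations used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEQUE_CAP 16

/* Tiny ring-buffer deque; a stand-in for GLib's GQueue in this sketch. */
typedef struct IntDeque {
    int items[DEQUE_CAP];
    size_t head; /* index of the oldest element */
    size_t len;  /* number of stored elements */
} IntDeque;

/* Append at the tail, like packet_enqueue() does for new packets. */
static bool deque_push_tail(IntDeque *q, int v)
{
    if (q->len == DEQUE_CAP) {
        return false;
    }
    q->items[(q->head + q->len) % DEQUE_CAP] = v;
    q->len++;
    return true;
}

/* Put an element back at the front, e.g. after a miscompare. */
static bool deque_push_head(IntDeque *q, int v)
{
    if (q->len == DEQUE_CAP) {
        return false;
    }
    q->head = (q->head + DEQUE_CAP - 1) % DEQUE_CAP;
    q->items[q->head] = v;
    q->len++;
    return true;
}

/* Remove the oldest element; returns false if the deque is empty. */
static bool deque_pop_head(IntDeque *q, int *out)
{
    if (q->len == 0) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1) % DEQUE_CAP;
    q->len--;
    return true;
}
```

Enqueueing 1, 2, 3 and then popping from the head returns 1 first; pushing
it back onto the head and popping again returns 1 once more, so a packet
that failed comparison keeps its position ahead of newer packets.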


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
@ 2017-07-14  3:22   ` Jason Wang
  2017-07-17  6:42     ` Zhang Chen
  2017-07-14 12:10   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Wang @ 2017-07-14  3:22 UTC (permalink / raw)
  To: Zhang Chen, qemu devel; +Cc: zhanghailiang, Li Zhijian



On 2017-07-13 13:52, Zhang Chen wrote:
> If colo-compare find out the first different packet that means
> the following packet almost is different. we needn't do a lot
> of checkpoint in this time, so we set the no-need-checkpoint
> peroid, default just set 3 second.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 6d500e1..0f8e198 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -40,6 +40,9 @@
>   /* TODO: Should be configurable */
>   #define REGULAR_PACKET_CHECK_MS 3000
>   
> +/* TODO: Should be configurable */
> +#define CHECKPOINT_MIN_TIME 3000
> +
>   /*
>     + CompareState ++
>     |               |
> @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>       Packet *pkt = NULL;
>       GList *result = NULL;
>       int ret;
> +    static int64_t checkpoint_time_ms;

Let's avoid a static variable here, since we support more than one
compare instance.

Thanks

>   
>       while (!g_queue_is_empty(&conn->primary_list) &&
>              !g_queue_is_empty(&conn->secondary_list)) {
> @@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
>                */
>               trace_colo_compare_main("packet different");
>               g_queue_push_tail(&conn->primary_list, pkt);
> -            /* TODO: colo_notify_checkpoint();*/
> +
> +            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
> +                /*
> +                 * TODO: Notify colo frame to do checkpoint.
> +                 * colo_compare_inconsistent_notify();
> +                 */
> +                checkpoint_time_ms = pkt->creation_ms;
> +            }
>               break;
>           }
>       }


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number Zhang Chen
@ 2017-07-14  3:25   ` Jason Wang
  2017-07-17  7:39     ` Zhang Chen
  2017-07-14 12:24   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Wang @ 2017-07-14  3:25 UTC (permalink / raw)
  To: Zhang Chen, qemu devel; +Cc: Li Zhijian, zhanghailiang



On 2017-07-13 13:52, Zhang Chen wrote:
> If primary packet's sequence number not same with secondary packet's
> sequence number, no need to compare the packet other field.
>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 0f8e198..2caeb80 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>       ptcp = (struct tcphdr *)ppkt->transport_header;
>       stcp = (struct tcphdr *)spkt->transport_header;
>   
> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> +        ptcp->th_seq != stcp->th_seq) {
> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> +        return -1;
> +    }
> +
>       /*
>        * The 'identification' field in the IP header is *very* random
>        * it almost never matches.  Fudge this by ignoring differences in

Do we have any statistics for this?

Thanks


* Re: [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison Zhang Chen
@ 2017-07-14  3:33   ` Jason Wang
  2017-07-17  9:06     ` Zhang Chen
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2017-07-14  3:33 UTC (permalink / raw)
  To: Zhang Chen, qemu devel; +Cc: zhanghailiang, Li Zhijian



On 2017-07-13 13:52, Zhang Chen wrote:
> When network is busy, some tcp options(like sack) will unpredictable
> occur in primary side or secondary side. it will make packet size
> not same, but the two packet's payload is identical. colo just
> care about packet payload, so we skip the option field.

A question: if the SACK options are not the same, does that mean, e.g.,
that some packets were lost only on the primary or only on the secondary?
If so, we will be out of sync soon. Is it really better to delay the
checkpoint here?

Thanks

>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 34 +++++++++++++++++++++++++---------
>   1 file changed, 25 insertions(+), 9 deletions(-)
>
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 2caeb80..6406c4a 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -183,7 +183,10 @@ static int packet_enqueue(CompareState *s, int mode)
>    * return:    0  means packet same
>    *            > 0 || < 0 means packet different
>    */
> -static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
> +static int colo_packet_compare_common(Packet *ppkt,
> +                                      Packet *spkt,
> +                                      int poffset,
> +                                      int soffset)
>   {
>       if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
>           char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
> @@ -198,9 +201,10 @@ static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
>                                      sec_ip_src, sec_ip_dst);
>       }
>   
> -    if (ppkt->size == spkt->size) {
> -        return memcmp(ppkt->data + offset, spkt->data + offset,
> -                      spkt->size - offset);
> +    if (ppkt->size == spkt->size || poffset != soffset) {
> +        return memcmp(ppkt->data + poffset,
> +                      spkt->data + soffset,
> +                      spkt->size - soffset);
>       } else {
>           trace_colo_compare_main("Net packet size are not the same");
>           return -1;
> @@ -263,12 +267,22 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>        * so we just need skip this field.
>        */
>       if (ptcp->th_off > 5) {
> -        ptrdiff_t tcp_offset;
> -        tcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
> +        ptrdiff_t ptcp_offset, stcp_offset;
> +
> +        ptcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
>                        + (ptcp->th_off * 4);
> -        res = colo_packet_compare_common(ppkt, spkt, tcp_offset);
> +        stcp_offset = spkt->transport_header - (uint8_t *)spkt->data
> +                     + (stcp->th_off * 4);
> +
> +        /*
> +         * When network is busy, some tcp options(like sack) will unpredictable
> +         * occur in primary side or secondary side. it will make packet size
> +         * not same, but the two packet's payload is identical. colo just
> +         * care about packet payload, so we skip the option field.
> +         */
> +        res = colo_packet_compare_common(ppkt, spkt, ptcp_offset, stcp_offset);
>       } else if (ptcp->th_sum == stcp->th_sum) {
> -        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN);
> +        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN, ETH_HLEN);
>       } else {
>           res = -1;
>       }
> @@ -328,6 +342,7 @@ static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
>        * the ip payload here.
>        */
>       ret = colo_packet_compare_common(ppkt, spkt,
> +                                     network_header_length + ETH_HLEN,
>                                        network_header_length + ETH_HLEN);
>   
>       if (ret) {
> @@ -365,6 +380,7 @@ static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
>        * the ip payload here.
>        */
>       if (colo_packet_compare_common(ppkt, spkt,
> +                                   network_header_length + ETH_HLEN,
>                                      network_header_length + ETH_HLEN)) {
>           trace_colo_compare_icmp_miscompare("primary pkt size",
>                                              ppkt->size);
> @@ -402,7 +418,7 @@ static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
>                                      sec_ip_src, sec_ip_dst);
>       }
>   
> -    return colo_packet_compare_common(ppkt, spkt, 0);
> +    return colo_packet_compare_common(ppkt, spkt, 0, 0);
>   }
>   
>   static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
  2017-07-14  3:22   ` Jason Wang
@ 2017-07-14 12:10   ` Dr. David Alan Gilbert
  2017-07-17  9:33     ` Zhang Chen
  1 sibling, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-14 12:10 UTC (permalink / raw)
  To: Zhang Chen; +Cc: qemu devel, Jason Wang, Li Zhijian, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> If colo-compare find out the first different packet that means
> the following packet almost is different. we needn't do a lot
> of checkpoint in this time, so we set the no-need-checkpoint
> peroid, default just set 3 second.
> 
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-compare.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 6d500e1..0f8e198 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -40,6 +40,9 @@
>  /* TODO: Should be configurable */
>  #define REGULAR_PACKET_CHECK_MS 3000
>  
> +/* TODO: Should be configurable */

Yes it should!

> +#define CHECKPOINT_MIN_TIME 3000
> +
>  /*
>    + CompareState ++
>    |               |
> @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>      Packet *pkt = NULL;
>      GList *result = NULL;
>      int ret;
> +    static int64_t checkpoint_time_ms;
>  
>      while (!g_queue_is_empty(&conn->primary_list) &&
>             !g_queue_is_empty(&conn->secondary_list)) {
> @@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
>               */
>              trace_colo_compare_main("packet different");
>              g_queue_push_tail(&conn->primary_list, pkt);
> -            /* TODO: colo_notify_checkpoint();*/
> +
> +            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
> +                /*
> +                 * TODO: Notify colo frame to do checkpoint.
> +                 * colo_compare_inconsistent_notify();
> +                 */
> +                checkpoint_time_ms = pkt->creation_ms;
> +            }

You need to be careful how this interacts with the actual start of the
checkpoint.   Let's say you have two miscompared packets close to each
other:


    miscompare!
         checkpoint
    miscompare!
         ignore it because it was close to the 1st one

   That means we never trigger the 2nd checkpoint and it'll carry on
until the maximum checkpoint length.

   But also, I think you need to consider what happens to future packets
being compared; as soon as you know there's a miscompare, you can't
release any packets until the checkpoint.
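
One possible way to address both points, offered only as a sketch under
assumptions and not as the COLO design: remember a miscompare that arrives
inside the minimum period instead of dropping it, request the deferred
checkpoint from a timer once the period has passed, and block packet release
while a miscompare is outstanding. CheckpointGate and all gate_* functions
are hypothetical names; the timer hook and the "checkpoint done" callback
are assumed to exist in the surrounding code.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHECKPOINT_MIN_TIME 3000 /* ms, same value as in the patch */

typedef struct CheckpointGate {
    int64_t last_request_ms; /* time of the last checkpoint request */
    bool dirty;              /* miscompare seen since the last checkpoint */
    bool requested;          /* request sent, checkpoint not finished yet */
} CheckpointGate;

/* Send a request if one is needed and the minimum period has passed. */
static bool gate_try_request(CheckpointGate *g, int64_t now_ms)
{
    if (g->dirty && !g->requested &&
        now_ms - g->last_request_ms > CHECKPOINT_MIN_TIME) {
        g->requested = true;
        g->last_request_ms = now_ms;
        return true; /* caller should notify the COLO frame now */
    }
    return false;
}

/* Called on every miscompare; the miscompare is never forgotten. */
static bool gate_on_miscompare(CheckpointGate *g, int64_t now_ms)
{
    g->dirty = true;
    return gate_try_request(g, now_ms);
}

/* Called periodically so that a deferred request eventually fires. */
static bool gate_on_timer(CheckpointGate *g, int64_t now_ms)
{
    return gate_try_request(g, now_ms);
}

/* Called when a checkpoint has completed and both sides are in sync. */
static void gate_on_checkpoint_done(CheckpointGate *g)
{
    g->dirty = false;
    g->requested = false;
}

/* Packets may only be released while no miscompare is outstanding. */
static bool gate_may_release(const CheckpointGate *g)
{
    return !g->dirty;
}
```

In the scenario above, the second miscompare sets the dirty flag even though
no request is sent, so release stays blocked and the timer triggers the
second checkpoint once CHECKPOINT_MIN_TIME has elapsed.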

Dave

>              break;


>          }
>      }
> -- 
> 2.7.4
> 
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number Zhang Chen
  2017-07-14  3:25   ` Jason Wang
@ 2017-07-14 12:24   ` Dr. David Alan Gilbert
  1 sibling, 0 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-14 12:24 UTC (permalink / raw)
  To: Zhang Chen; +Cc: qemu devel, Jason Wang, Li Zhijian, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> If primary packet's sequence number not same with secondary packet's
> sequence number, no need to compare the packet other field.
> 
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>  net/colo-compare.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/colo-compare.c b/net/colo-compare.c
> index 0f8e198..2caeb80 100644
> --- a/net/colo-compare.c
> +++ b/net/colo-compare.c
> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>      ptcp = (struct tcphdr *)ppkt->transport_header;
>      stcp = (struct tcphdr *)spkt->transport_header;
>  
> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> +        ptcp->th_seq != stcp->th_seq) {
> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
> +        return -1;
> +    }

Do you need to check that stcp->th_flags is the same?

Looking back at patches I had in this area; I was doing
  if (ptcp->th_flags == stcp->th_flags &&

see:
   https://github.com/orbitfp7/qemu/commit/848ca1113aec802dd032fd5b6d6b301931b3e1e0
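
As a rough illustration of the idea (an assumption about how the two checks
could be combined, not a copy of the commit linked above), the early-exit
test could treat differing flags as a mismatch first and then compare
sequence numbers for non-SYN segments. TcpHdrSketch, SK_TH_SYN and
tcp_quick_mismatch() are hypothetical names; only the two header fields
used here are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_TH_SYN 0x02 /* SYN bit in the TCP flags byte */

/* Hypothetical subset of struct tcphdr, enough for this check. */
typedef struct TcpHdrSketch {
    uint32_t th_seq;  /* sequence number; only tested for equality */
    uint8_t th_flags; /* TCP flags */
} TcpHdrSketch;

/*
 * Return true if the two headers can be declared different without
 * looking at the rest of the packet.
 */
static bool tcp_quick_mismatch(const TcpHdrSketch *ptcp,
                               const TcpHdrSketch *stcp)
{
    /* Different flags: the packets cannot be the same segment. */
    if (ptcp->th_flags != stcp->th_flags) {
        return true;
    }

    /*
     * Same flags. As in the patch, skip the sequence check for SYN
     * segments and compare sequence numbers for everything else.
     */
    if (!(ptcp->th_flags & SK_TH_SYN) && ptcp->th_seq != stcp->th_seq) {
        return true;
    }

    return false;
}
```

Checking the flags first means a SYN on one side paired with a non-SYN on
the other is rejected immediately, which the patch as posted would not
catch when only ptcp carries the SYN flag.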

Dave

>      /*
>       * The 'identification' field in the IP header is *very* random
>       * it almost never matches.  Fudge this by ignoring differences in
> -- 
> 2.7.4
> 
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-14  3:22   ` Jason Wang
@ 2017-07-17  6:42     ` Zhang Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-17  6:42 UTC (permalink / raw)
  To: Jason Wang, qemu devel; +Cc: zhangchen.fnst, zhanghailiang, Li Zhijian



On 07/14/2017 11:22 AM, Jason Wang wrote:
>
>
> On 2017-07-13 13:52, Zhang Chen wrote:
>> If colo-compare find out the first different packet that means
>> the following packet almost is different. we needn't do a lot
>> of checkpoint in this time, so we set the no-need-checkpoint
>> peroid, default just set 3 second.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 6d500e1..0f8e198 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -40,6 +40,9 @@
>>   /* TODO: Should be configurable */
>>   #define REGULAR_PACKET_CHECK_MS 3000
>>   +/* TODO: Should be configurable */
>> +#define CHECKPOINT_MIN_TIME 3000
>> +
>>   /*
>>     + CompareState ++
>>     |               |
>> @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, 
>> void *user_data)
>>       Packet *pkt = NULL;
>>       GList *result = NULL;
>>       int ret;
>> +    static int64_t checkpoint_time_ms;
>
> Let's avoid static variable here since we support more than one 
> compare instance.

OK, I will add the "checkpoint_time_ms" to CompareState.
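
Roughly, the per-instance version could look like the sketch below. This is
only an outline under assumptions: CompareStateSketch is a hypothetical,
trimmed-down stand-in for CompareState, and checkpoint_allowed() is an
illustrative helper name, not an existing function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum gap between two checkpoint requests, as in the patch. */
#define CHECKPOINT_MIN_TIME 3000

/* Hypothetical, trimmed-down stand-in for QEMU's CompareState. */
typedef struct CompareStateSketch {
    int64_t checkpoint_time_ms; /* time of the last checkpoint request */
} CompareStateSketch;

/*
 * Return true (and record the time) if a checkpoint request is allowed
 * for a miscompared packet created at creation_ms; return false while
 * this instance is still inside its minimum period.
 */
static bool checkpoint_allowed(CompareStateSketch *s, int64_t creation_ms)
{
    if (creation_ms - s->checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
        s->checkpoint_time_ms = creation_ms;
        return true;
    }
    return false;
}
```

Because the timestamp lives in the state structure, two compare instances
no longer throttle each other the way a function-level static would.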

Thanks
Zhang Chen

>
> Thanks
>
>>         while (!g_queue_is_empty(&conn->primary_list) &&
>>              !g_queue_is_empty(&conn->secondary_list)) {
>> @@ -494,7 +498,14 @@ static void colo_compare_connection(void 
>> *opaque, void *user_data)
>>                */
>>               trace_colo_compare_main("packet different");
>>               g_queue_push_tail(&conn->primary_list, pkt);
>> -            /* TODO: colo_notify_checkpoint();*/
>> +
>> +            if (pkt->creation_ms - checkpoint_time_ms > 
>> CHECKPOINT_MIN_TIME) {
>> +                /*
>> +                 * TODO: Notify colo frame to do checkpoint.
>> +                 * colo_compare_inconsistent_notify();
>> +                 */
>> +                checkpoint_time_ms = pkt->creation_ms;
>> +            }
>>               break;
>>           }
>>       }
>
>
>
> .
>

-- 
Thanks
Zhang Chen


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-14  3:25   ` Jason Wang
@ 2017-07-17  7:39     ` Zhang Chen
  2017-07-17  8:55       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 19+ messages in thread
From: Zhang Chen @ 2017-07-17  7:39 UTC (permalink / raw)
  To: Jason Wang, qemu devel
  Cc: zhangchen.fnst, Li Zhijian, zhanghailiang, Dr. David Alan Gilbert



On 07/14/2017 11:25 AM, Jason Wang wrote:
>
>
> On 2017-07-13 13:52, Zhang Chen wrote:
>> If primary packet's sequence number not same with secondary packet's
>> sequence number, no need to compare the packet other field.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 0f8e198..2caeb80 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, 
>> Packet *ppkt)
>>       ptcp = (struct tcphdr *)ppkt->transport_header;
>>       stcp = (struct tcphdr *)spkt->transport_header;
>>   +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
>> +        ptcp->th_seq != stcp->th_seq) {
>> +        trace_colo_compare_main("colo_packet_compare_tcp seq not 
>> same");
>> +        return -1;
>> +    }
>> +
>>       /*
>>        * The 'identification' field in the IP header is *very* random
>>        * it almost never matches.  Fudge this by ignoring differences in
>
> Do we have any statistics numbers for this?

Having rethought this patch, I will remove it in the next version and
send an independent patch in the future.
In the FTP get test, the primary guest sends lots of packets that differ
from the secondary guest's: the individual packet payloads are not the
same, but the total payload is the same.
I think I have to buffer some packets' payloads by sequence number
for comparison?
Any ideas about this?

Thanks
Zhang Chen

>
> Thanks
>
>
>

-- 
Thanks
Zhang Chen


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-17  7:39     ` Zhang Chen
@ 2017-07-17  8:55       ` Dr. David Alan Gilbert
  2017-07-17  9:23         ` Zhang Chen
  0 siblings, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-17  8:55 UTC (permalink / raw)
  To: Zhang Chen; +Cc: Jason Wang, qemu devel, Li Zhijian, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> 
> On 07/14/2017 11:25 AM, Jason Wang wrote:
> > 
> > 
> > On 2017-07-13 13:52, Zhang Chen wrote:
> > > If primary packet's sequence number not same with secondary packet's
> > > sequence number, no need to compare the packet other field.
> > > 
> > > Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> > > ---
> > >   net/colo-compare.c | 6 ++++++
> > >   1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > index 0f8e198..2caeb80 100644
> > > --- a/net/colo-compare.c
> > > +++ b/net/colo-compare.c
> > > @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet
> > > *spkt, Packet *ppkt)
> > >       ptcp = (struct tcphdr *)ppkt->transport_header;
> > >       stcp = (struct tcphdr *)spkt->transport_header;
> > >   +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> > > +        ptcp->th_seq != stcp->th_seq) {
> > > +        trace_colo_compare_main("colo_packet_compare_tcp seq not
> > > same");
> > > +        return -1;
> > > +    }
> > > +
> > >       /*
> > >        * The 'identification' field in the IP header is *very* random
> > >        * it almost never matches.  Fudge this by ignoring differences in
> > 
> > Do we have any statistics numbers for this?
> 
> Rethink about this patch, I will remove it in next version and send a
> independent
> patch in the future.
> Because in FTP get test, primary guest send lots of packet differ to
> secondary guest's,
> the packet payload are not same, but the total payload are same.

Do you mean that the TCP stream is the same but the packet sizes are
different due to different fragmentation?

> I think I have to buffer some packet's payload depend on sequence number for
> comparison?
> Any idea about this?

The original COLO discussions ~2-3 years ago talked about performing TCP
reassembly and comparing the TCP stream; not a simple task.

But the version I worked with also had the rewrite of the sequence
numbers on the secondary to cause them to match even with the same
fragmentation - but that doesn't seem to be upstream yet.
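
For readers unfamiliar with the sequence-number rewrite mentioned above, here
is a minimal sketch of the idea, assuming the two guests pick different initial
sequence numbers and the secondary's numbers are shifted by a fixed offset.
The type and function names are hypothetical and are not taken from any QEMU
source; values are in host byte order.

```c
#include <stdint.h>

/* Hypothetical per-connection state; not a QEMU structure. */
typedef struct {
    uint32_t offset;   /* primary ISN - secondary ISN, modulo 2^32 */
} SeqRewrite;

/* Record the offset once both initial sequence numbers are known. */
static void seq_rewrite_init(SeqRewrite *r, uint32_t primary_isn,
                             uint32_t secondary_isn)
{
    /* Unsigned subtraction wraps modulo 2^32, which is what we want. */
    r->offset = primary_isn - secondary_isn;
}

/* Map a sequence number sent by the secondary into the primary's space. */
static uint32_t seq_rewrite_out(const SeqRewrite *r, uint32_t secondary_seq)
{
    return secondary_seq + r->offset;
}

/* Map an ACK destined for the secondary back into its own space. */
static uint32_t seq_rewrite_in(const SeqRewrite *r, uint32_t ack)
{
    return ack - r->offset;
}
```

With this in place the sequence numbers line up, but the segment boundaries
can still differ, which is the problem discussed in this sub-thread.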

Dave

> 
> Thanks
> Zhang Chen
> 
> > 
> > Thanks
> > 
> > 
> > 
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison
  2017-07-14  3:33   ` Jason Wang
@ 2017-07-17  9:06     ` Zhang Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-17  9:06 UTC (permalink / raw)
  To: Jason Wang, qemu devel; +Cc: zhangchen.fnst, zhanghailiang, Li Zhijian



On 07/14/2017 11:33 AM, Jason Wang wrote:
>
>
> On 2017年07月13日 13:52, Zhang Chen wrote:
>> When network is busy, some tcp options(like sack) will unpredictable
>> occur in primary side or secondary side. it will make packet size
>> not same, but the two packet's payload is identical. colo just
>> care about packet payload, so we skip the option field.
>
> A question is, if SACK were not same, does it mean e.g some packet 
> were lost just for primary or secondary? If yes, we will be out of 
> sync soon. Is it really better to delay the checkpoint here?

SACK is designed to optimize TCP fast retransmit, but in the COLO case this
TCP option makes COLO-compare trigger checkpoints frequently, and we get
better performance by letting normal TCP ACKs do the retransmit job.
In the worst case, a skipped TCP option will make the primary and secondary
send different packets afterwards, which will trigger a checkpoint very soon.
So I think we do not need to care about the SACK field in the COLO case.
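
To illustrate what skipping the option field means, here is a minimal
standalone sketch, not the code from the patch. It assumes each buffer starts
at the TCP header and reads the header length from the data-offset nibble
(byte 12, upper four bits, in 32-bit words). The function name is
hypothetical. Unlike the patch, this sketch also requires the two payload
lengths to be equal before calling memcmp(); that check is my own assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare only the payloads of two TCP segments, ignoring header and options.
 * pseg/sseg point at the start of the TCP header, plen/slen are the total
 * segment lengths. Returns 0 if the payloads match, non-zero otherwise.
 */
static int tcp_payload_compare(const uint8_t *pseg, size_t plen,
                               const uint8_t *sseg, size_t slen)
{
    size_t phdr, shdr;

    if (plen < 20 || slen < 20) {
        return -1;                      /* shorter than a minimal TCP header */
    }
    phdr = (size_t)(pseg[12] >> 4) * 4; /* data offset in bytes */
    shdr = (size_t)(sseg[12] >> 4) * 4;
    if (phdr < 20 || shdr < 20 || phdr > plen || shdr > slen) {
        return -1;                      /* malformed header length */
    }
    if (plen - phdr != slen - shdr) {
        return -1;                      /* payload lengths differ */
    }
    return memcmp(pseg + phdr, sseg + shdr, plen - phdr);
}
```

A primary segment with no options and a secondary segment carrying 12 bytes of
options compare equal here as long as the payload bytes are identical.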

Thanks
Zhang Chen


>
> Thanks
>
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 34 +++++++++++++++++++++++++---------
>>   1 file changed, 25 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 2caeb80..6406c4a 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -183,7 +183,10 @@ static int packet_enqueue(CompareState *s, int mode)
>>    * return:    0  means packet same
>>    *            > 0 || < 0 means packet different
>>    */
>> -static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
>> +static int colo_packet_compare_common(Packet *ppkt,
>> +                                      Packet *spkt,
>> +                                      int poffset,
>> +                                      int soffset)
>>   {
>>       if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
>>           char pri_ip_src[20], pri_ip_dst[20], sec_ip_src[20], sec_ip_dst[20];
>> @@ -198,9 +201,10 @@ static int colo_packet_compare_common(Packet *ppkt, Packet *spkt, int offset)
>>                                      sec_ip_src, sec_ip_dst);
>>       }
>>
>> -    if (ppkt->size == spkt->size) {
>> -        return memcmp(ppkt->data + offset, spkt->data + offset,
>> -                      spkt->size - offset);
>> +    if (ppkt->size == spkt->size || poffset != soffset) {
>> +        return memcmp(ppkt->data + poffset,
>> +                      spkt->data + soffset,
>> +                      spkt->size - soffset);
>>       } else {
>>           trace_colo_compare_main("Net packet size are not the same");
>>           return -1;
>> @@ -263,12 +267,22 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>>        * so we just need skip this field.
>>        */
>>       if (ptcp->th_off > 5) {
>> -        ptrdiff_t tcp_offset;
>> -        tcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
>> +        ptrdiff_t ptcp_offset, stcp_offset;
>> +
>> +        ptcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
>>                        + (ptcp->th_off * 4);
>> -        res = colo_packet_compare_common(ppkt, spkt, tcp_offset);
>> +        stcp_offset = spkt->transport_header - (uint8_t *)spkt->data
>> +                     + (stcp->th_off * 4);
>> +
>> +        /*
>> +         * When network is busy, some tcp options(like sack) will unpredictable
>> +         * occur in primary side or secondary side. it will make packet size
>> +         * not same, but the two packet's payload is identical. colo just
>> +         * care about packet payload, so we skip the option field.
>> +         */
>> +        res = colo_packet_compare_common(ppkt, spkt, ptcp_offset, stcp_offset);
>>       } else if (ptcp->th_sum == stcp->th_sum) {
>> -        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN);
>> +        res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN, ETH_HLEN);
>>       } else {
>>           res = -1;
>>       }
>> @@ -328,6 +342,7 @@ static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
>>        * the ip payload here.
>>        */
>>       ret = colo_packet_compare_common(ppkt, spkt,
>> +                                     network_header_length + ETH_HLEN,
>>                                        network_header_length + ETH_HLEN);
>>
>>       if (ret) {
>> @@ -365,6 +380,7 @@ static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
>>        * the ip payload here.
>>        */
>>       if (colo_packet_compare_common(ppkt, spkt,
>> +                                   network_header_length + ETH_HLEN,
>>                                      network_header_length + ETH_HLEN)) {
>>           trace_colo_compare_icmp_miscompare("primary pkt size",
>>                                              ppkt->size);
>> @@ -402,7 +418,7 @@ static int colo_packet_compare_other(Packet *spkt, Packet *ppkt)
>>                                      sec_ip_src, sec_ip_dst);
>>       }
>>
>> -    return colo_packet_compare_common(ppkt, spkt, 0);
>> +    return colo_packet_compare_common(ppkt, spkt, 0, 0);
>>   }
>>
>>   static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time)
>
>
>
> .
>

-- 
Thanks
Zhang Chen


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-17  8:55       ` Dr. David Alan Gilbert
@ 2017-07-17  9:23         ` Zhang Chen
  2017-07-17 10:02           ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 19+ messages in thread
From: Zhang Chen @ 2017-07-17  9:23 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: zhangchen.fnst, Jason Wang, qemu devel, Li Zhijian, zhanghailiang



On 07/17/2017 04:55 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>>
>> On 07/14/2017 11:25 AM, Jason Wang wrote:
>>>
>>> On 2017年07月13日 13:52, Zhang Chen wrote:
>>>> If primary packet's sequence number not same with secondary packet's
>>>> sequence number, no need to compare the packet other field.
>>>>
>>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>> ---
>>>>    net/colo-compare.c | 6 ++++++
>>>>    1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>> index 0f8e198..2caeb80 100644
>>>> --- a/net/colo-compare.c
>>>> +++ b/net/colo-compare.c
>>>> @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
>>>>        ptcp = (struct tcphdr *)ppkt->transport_header;
>>>>        stcp = (struct tcphdr *)spkt->transport_header;
>>>>
>>>> +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
>>>> +        ptcp->th_seq != stcp->th_seq) {
>>>> +        trace_colo_compare_main("colo_packet_compare_tcp seq not same");
>>>> +        return -1;
>>>> +    }
>>>> +
>>>>        /*
>>>>         * The 'identification' field in the IP header is *very* random
>>>>         * it almost never matches.  Fudge this by ignoring differences in
>>> Do we have any statistics numbers for this?
>> Rethink about this patch, I will remove it in next version and send a
>> independent
>> patch in the future.
>> Because in FTP get test, primary guest send lots of packet differ to
>> secondary guest's,
>> the packet payload are not same, but the total payload are same.
> Do you mean that the TCP stream is the same but the packet sizes are
> different due to different fragmentation?

Yes, like this:
We send this payload: "1234567890".

primary:
pkt1 payload:"123"
pkt2 payload:"4567890"

secondary:
pkt1 payload:"1234567890"
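
The example above can be sketched as a comparison of the byte streams rather
than of individual packets. This is only an illustration under the assumption
that both sides' segments are already in order and start at the same sequence
number; the Segment type and same_stream() are hypothetical names, not part of
colo-compare.

```c
#include <stddef.h>
#include <string.h>

/* One TCP payload; a stand-in for a real packet structure. */
typedef struct {
    const char *data;
    size_t len;
} Segment;

/* Return 1 if both segment lists carry the same byte stream, else 0. */
static int same_stream(const Segment *a, size_t na,
                       const Segment *b, size_t nb)
{
    size_t ia = 0, ib = 0;   /* current segment on each side */
    size_t oa = 0, ob = 0;   /* offset inside the current segment */

    for (;;) {
        /* Step over segments that are fully consumed (or empty). */
        while (ia < na && oa == a[ia].len) { ia++; oa = 0; }
        while (ib < nb && ob == b[ib].len) { ib++; ob = 0; }
        if (ia == na || ib == nb) {
            /* Equal only if both sides ran out at the same time. */
            return ia == na && ib == nb;
        }
        /* Compare the overlap of the two current segments. */
        size_t n = a[ia].len - oa;
        size_t m = b[ib].len - ob;
        if (m < n) {
            n = m;
        }
        if (memcmp(a[ia].data + oa, b[ib].data + ob, n) != 0) {
            return 0;
        }
        oa += n;
        ob += n;
    }
}
```

With the primary sending "123" then "4567890" and the secondary sending
"1234567890" in one packet, same_stream() reports the two sides as equal even
though no single pair of packets matches.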


>
>> I think I have to buffer some packet's payload depend on sequence number for
>> comparison?
>> Any idea about this?
> The original COLO discussions ~2-3 years ago talked about performing TCP
> reassembly and comparing the TCP stream; not a simple task.
>
> But the version I worked with also had the rewrite of the sequence
> numbers on the secondary to cause them to match even with the same
> fragmentation - but that doesn't seem to be upstream yet.

In current QEMU upstream we use filter-rewriter to rewrite the sequence
numbers on the secondary, but we cannot avoid different fragmentation
on the two sides.
Any suggestions on how to guarantee that the primary side and the secondary
side have the same fragmentation?

Thanks
Zhang Chen

>
> Dave
>
>> Thanks
>> Zhang Chen
>>
>>> Thanks
>>>
>>>
>>>
>> -- 
>> Thanks
>> Zhang Chen
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

-- 
Thanks
Zhang Chen


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-14 12:10   ` Dr. David Alan Gilbert
@ 2017-07-17  9:33     ` Zhang Chen
  2017-07-17 12:24       ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 19+ messages in thread
From: Zhang Chen @ 2017-07-17  9:33 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: zhangchen.fnst, qemu devel, Jason Wang, Li Zhijian, zhanghailiang



On 07/14/2017 08:10 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>> If colo-compare find out the first different packet that means
>> the following packet almost is different. we needn't do a lot
>> of checkpoint in this time, so we set the no-need-checkpoint
>> peroid, default just set 3 second.
>>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>   net/colo-compare.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>> index 6d500e1..0f8e198 100644
>> --- a/net/colo-compare.c
>> +++ b/net/colo-compare.c
>> @@ -40,6 +40,9 @@
>>   /* TODO: Should be configurable */
>>   #define REGULAR_PACKET_CHECK_MS 3000
>>   
>> +/* TODO: Should be configurable */
> Yes it should!
>
>> +#define CHECKPOINT_MIN_TIME 3000
>> +
>>   /*
>>     + CompareState ++
>>     |               |
>> @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>       Packet *pkt = NULL;
>>       GList *result = NULL;
>>       int ret;
>> +    static int64_t checkpoint_time_ms;
>>   
>>       while (!g_queue_is_empty(&conn->primary_list) &&
>>              !g_queue_is_empty(&conn->secondary_list)) {
>> @@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>                */
>>               trace_colo_compare_main("packet different");
>>               g_queue_push_tail(&conn->primary_list, pkt);
>> -            /* TODO: colo_notify_checkpoint();*/
>> +
>> +            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
>> +                /*
>> +                 * TODO: Notify colo frame to do checkpoint.
>> +                 * colo_compare_inconsistent_notify();
>> +                 */
>> +                checkpoint_time_ms = pkt->creation_ms;
>> +            }
> You need to be careful how this interacts with the actual start of the
> checkpoint.   Lets say you have two miscompared packets close to each
> other:
>
>
>      miscompare!
>           checkpoint
>      miscompare!
>           ignore it because it was close to the 1st one
>
>     That means we never trigger the 2nd checkpoint and it'll carry on
> until the maximum checkpoint length.
>
>     But also, I think you need to consider what happens to future packets
> being compared; you can't release any packets now until the checkpoint
> as soon as you know there's a miscompare.

Doing the checkpoint takes some time, and during this period we can ignore
further miscompares to get better performance. Like this:

currently:

     miscompare!
          notify checkpoint
     miscompare!
          notify checkpoint
     miscompare!
          notify checkpoint
     miscompare!
          notify checkpoint
     vm_stop and do checkpoint

     vm_start and finish checkpoint

     vm_stop and do checkpoint

     vm_start and finish checkpoint

     vm_stop and do checkpoint

     vm_start and finish checkpoint

     vm_stop and do checkpoint

     vm_start and finish checkpoint


running normally.


after:

     miscompare!
          notify checkpoint
     miscompare!
          ignore
     miscompare!
          ignore
     miscompare!
          ignore
     vm_stop and do checkpoint

     vm_start and finish checkpoint

running normally.
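
The "after" behaviour can be sketched as a small rate limiter. This is a
standalone illustration of the idea in the patch, not the patch itself: the
state lives in a hypothetical CheckpointLimiter struct instead of a
function-local static, and an explicit flag marks whether any notification has
been sent yet. The 3000 ms value mirrors CHECKPOINT_MIN_TIME from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHECKPOINT_MIN_TIME_MS 3000

/* Hypothetical per-comparer state. */
typedef struct {
    int64_t last_notify_ms;  /* time of the last checkpoint notification */
    bool notified_once;      /* false until the first notification */
} CheckpointLimiter;

/*
 * Called for every miscompare. Returns true if a checkpoint notification
 * should be sent, false if the miscompare falls inside the quiet period.
 */
static bool miscompare_should_notify(CheckpointLimiter *l, int64_t now_ms)
{
    if (l->notified_once &&
        now_ms - l->last_notify_ms <= CHECKPOINT_MIN_TIME_MS) {
        return false;        /* too close to the previous notification */
    }
    l->notified_once = true;
    l->last_notify_ms = now_ms;
    return true;
}
```

Four miscompares arriving within the quiet period produce one notification
here instead of four. Note that this sketch only shows the suppression; it
does not address what happens to miscompares that arrive after the checkpoint
has actually started.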



Thanks
Zhang Chen
  


>
> Dave
>
>>               break;
>
>>           }
>>       }
>> -- 
>> 2.7.4
>>
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

-- 
Thanks
Zhang Chen


* Re: [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number
  2017-07-17  9:23         ` Zhang Chen
@ 2017-07-17 10:02           ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-17 10:02 UTC (permalink / raw)
  To: Zhang Chen; +Cc: Jason Wang, qemu devel, Li Zhijian, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> 
> On 07/17/2017 04:55 PM, Dr. David Alan Gilbert wrote:
> > * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> > > 
> > > On 07/14/2017 11:25 AM, Jason Wang wrote:
> > > > 
> > > > On 2017年07月13日 13:52, Zhang Chen wrote:
> > > > > If primary packet's sequence number not same with secondary packet's
> > > > > sequence number, no need to compare the packet other field.
> > > > > 
> > > > > Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> > > > > ---
> > > > >    net/colo-compare.c | 6 ++++++
> > > > >    1 file changed, 6 insertions(+)
> > > > > 
> > > > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > > > index 0f8e198..2caeb80 100644
> > > > > --- a/net/colo-compare.c
> > > > > +++ b/net/colo-compare.c
> > > > > @@ -222,6 +222,12 @@ static int colo_packet_compare_tcp(Packet
> > > > > *spkt, Packet *ppkt)
> > > > >        ptcp = (struct tcphdr *)ppkt->transport_header;
> > > > >        stcp = (struct tcphdr *)spkt->transport_header;
> > > > >    +    if ((ptcp->th_flags & TH_SYN) != TH_SYN &&
> > > > > +        ptcp->th_seq != stcp->th_seq) {
> > > > > +        trace_colo_compare_main("colo_packet_compare_tcp seq not
> > > > > same");
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > >        /*
> > > > >         * The 'identification' field in the IP header is *very* random
> > > > >         * it almost never matches.  Fudge this by ignoring differences in
> > > > Do we have any statistics numbers for this?
> > > Rethink about this patch, I will remove it in next version and send a
> > > independent
> > > patch in the future.
> > > Because in FTP get test, primary guest send lots of packet differ to
> > > secondary guest's,
> > > the packet payload are not same, but the total payload are same.
> > Do you mean that the TCP stream is the same but the packet sizes are
> > different due to different fragmentation?
> 
> Yes, like that:
> We send this payload: "1234567890".
> 
> primary:
> pkt1 payload:"123"
> pkt2 payload:"4567890"
> 
> secondary:
> pkt1 payload:"1234567890"

Yes; I think it comes down to very fine grain timing and interaction
with nagling; if the guest is that bit slower in generating the output,
the network code will decide to send it.

> > 
> > > I think I have to buffer some packet's payload depend on sequence number for
> > > comparison?
> > > Any idea about this?
> > The original COLO discussions ~2-3 years ago talked about performing TCP
> > reassembly and comparing the TCP stream; not a simple task.
> > 
> > But the version I worked with also had the rewrite of the sequence
> > numbers on the secondary to cause them to match even with the same
> > fragmentation - but that doesn't seem to be upstream yet.
> 
> In current qemu upstream we use filter-rewriter to rewrite the sequence
> numbers on the secondary, but we can not avoid different fragmentation in
> two side.
> Any comments about guarantee the primary side and the secondary side have
> the same fragmentation?

I don't think you can; the only choice is to perform the comparison
after de-fragmentation - or to do the same thing by building your own
reassembly.
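
As a rough idea of what building your own reassembly could look like, here is
a deliberately tiny sketch. It assumes a fixed 64-byte window, host-order
sequence numbers, and no retransmission or overlap policy beyond "last write
wins"; all names are hypothetical. A real implementation would need far more
than this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REASM_WINDOW 64

/* Hypothetical reassembly window for one direction of one connection. */
typedef struct {
    uint32_t base_seq;               /* sequence number of buf[0] */
    uint8_t  buf[REASM_WINDOW];      /* reassembled payload bytes */
    uint8_t  present[REASM_WINDOW];  /* 1 if the byte has arrived */
} Reasm;

static void reasm_init(Reasm *r, uint32_t base_seq)
{
    memset(r, 0, sizeof(*r));
    r->base_seq = base_seq;
}

/* Store one segment; returns 0 on success, -1 if it is outside the window. */
static int reasm_add(Reasm *r, uint32_t seq, const void *data, size_t len)
{
    /* Modulo-2^32 distance from the window start, safe across wraparound. */
    uint32_t off = seq - r->base_seq;

    if (off >= REASM_WINDOW || len > REASM_WINDOW - off) {
        return -1;
    }
    memcpy(r->buf + off, data, len);
    memset(r->present + off, 1, len);
    return 0;
}

/* Number of in-order bytes available starting at base_seq. */
static size_t reasm_contiguous(const Reasm *r)
{
    size_t n = 0;

    while (n < REASM_WINDOW && r->present[n]) {
        n++;
    }
    return n;
}
```

Each side would feed its segments into its own window, and the comparison
would then run over the contiguous prefix of both buffers instead of over
individual packets.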

Dave

> Thanks
> Zhang Chen
> 
> > 
> > Dave
> > 
> > > Thanks
> > > Zhang Chen
> > > 
> > > > Thanks
> > > > 
> > > > 
> > > > 
> > > -- 
> > > Thanks
> > > Zhang Chen
> > > 
> > > 
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> > 
> > .
> > 
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-17  9:33     ` Zhang Chen
@ 2017-07-17 12:24       ` Dr. David Alan Gilbert
  2017-07-18  2:20         ` Zhang Chen
  0 siblings, 1 reply; 19+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-17 12:24 UTC (permalink / raw)
  To: Zhang Chen; +Cc: qemu devel, Jason Wang, Li Zhijian, zhanghailiang

* Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> 
> 
> On 07/14/2017 08:10 PM, Dr. David Alan Gilbert wrote:
> > * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
> > > If colo-compare find out the first different packet that means
> > > the following packet almost is different. we needn't do a lot
> > > of checkpoint in this time, so we set the no-need-checkpoint
> > > peroid, default just set 3 second.
> > > 
> > > Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> > > ---
> > >   net/colo-compare.c | 13 ++++++++++++-
> > >   1 file changed, 12 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > > index 6d500e1..0f8e198 100644
> > > --- a/net/colo-compare.c
> > > +++ b/net/colo-compare.c
> > > @@ -40,6 +40,9 @@
> > >   /* TODO: Should be configurable */
> > >   #define REGULAR_PACKET_CHECK_MS 3000
> > > +/* TODO: Should be configurable */
> > Yes it should!
> > 
> > > +#define CHECKPOINT_MIN_TIME 3000
> > > +
> > >   /*
> > >     + CompareState ++
> > >     |               |
> > > @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
> > >       Packet *pkt = NULL;
> > >       GList *result = NULL;
> > >       int ret;
> > > +    static int64_t checkpoint_time_ms;
> > >       while (!g_queue_is_empty(&conn->primary_list) &&
> > >              !g_queue_is_empty(&conn->secondary_list)) {
> > > @@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
> > >                */
> > >               trace_colo_compare_main("packet different");
> > >               g_queue_push_tail(&conn->primary_list, pkt);
> > > -            /* TODO: colo_notify_checkpoint();*/
> > > +
> > > +            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
> > > +                /*
> > > +                 * TODO: Notify colo frame to do checkpoint.
> > > +                 * colo_compare_inconsistent_notify();
> > > +                 */
> > > +                checkpoint_time_ms = pkt->creation_ms;
> > > +            }
> > You need to be careful how this interacts with the actual start of the
> > checkpoint.   Lets say you have two miscompared packets close to each
> > other:
> > 
> > 
> >      miscompare!
> >           checkpoint
> >      miscompare!
> >           ignore it because it was close to the 1st one
> > 
> >     That means we never trigger the 2nd checkpoint and it'll carry on
> > until the maximum checkpoint length.
> > 
> >     But also, I think you need to consider what happens to future packets
> > being compared; you can't release any packets now until the checkpoint
> > as soon as you know there's a miscompare.
> 
> We need some time to do the checkpoint, and in this period we can ignore
> the miscompare to get better performance. Like that:
> 
> currently:
> 
>     miscompare!
>          notify checkpoint
>     miscompare!
>          notify checkpoint
>     miscompare!
>          notify checkpoint
>     miscompare!
>          notify checkpoint
>     vm_stop and do checkpoint
> 
>     vm_start and finish checkpoint
> 
>     vm_stop and do checkpoint
> 
>     vm_start and finish checkpoint
> 
>     vm_stop and do checkpoint
> 
>     vm_start and finish checkpoint
> 
>     vm_stop and do checkpoint
> 
>     vm_start and finish checkpoint
> 
> 
> running normally.
> 
> 
> after:
> 
>     miscompare!
>          notify checkpoint
>     miscompare!
>          ignore
>     miscompare!
>          ignore
>     miscompare!
>          ignore
>     vm_stop and do checkpoint
> 
>     vm_start and finish checkpoint
> 
> running normally.

Yes, but you must make sure that you don't
ignore any miscompares after the start of the next checkpoint - I don't
see how you avoid that.

Also we must be careful about packets released after the 1st miscompare.
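
One way to picture the first concern is a gate that suppresses miscompares
only while a checkpoint request is pending, and re-arms as soon as the
checkpoint actually starts. This is a sketch under the assumption that the
comparer is told when the checkpoint begins; the thread does not say how that
notification would arrive, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical state: is a checkpoint requested but not yet started? */
typedef struct {
    bool request_pending;
} CheckpointGate;

/*
 * Called on every miscompare. Returns true if a new checkpoint request
 * must be sent, false if one is already pending and this one is redundant.
 */
static bool gate_on_miscompare(CheckpointGate *g)
{
    if (g->request_pending) {
        return false;
    }
    g->request_pending = true;
    return true;
}

/* Called when the checkpoint actually starts (for example at vm_stop). */
static void gate_on_checkpoint_start(CheckpointGate *g)
{
    /* Any miscompare seen from now on belongs to the next epoch. */
    g->request_pending = false;
}
```

Compared with a purely time-based window, a miscompare that arrives after the
checkpoint has started is never swallowed, regardless of how recently the
previous request was sent.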

Dave

> 
> 
> Thanks
> Zhang Chen
> 
> 
> > 
> > Dave
> > 
> > >               break;
> > 
> > >           }
> > >       }
> > > -- 
> > > 2.7.4
> > > 
> > > 
> > > 
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> > 
> > .
> > 
> 
> -- 
> Thanks
> Zhang Chen
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance
  2017-07-17 12:24       ` Dr. David Alan Gilbert
@ 2017-07-18  2:20         ` Zhang Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Zhang Chen @ 2017-07-18  2:20 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: zhangchen.fnst, qemu devel, Jason Wang, Li Zhijian, zhanghailiang



On 07/17/2017 08:24 PM, Dr. David Alan Gilbert wrote:
> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>>
>> On 07/14/2017 08:10 PM, Dr. David Alan Gilbert wrote:
>>> * Zhang Chen (zhangchen.fnst@cn.fujitsu.com) wrote:
>>>> If colo-compare find out the first different packet that means
>>>> the following packet almost is different. we needn't do a lot
>>>> of checkpoint in this time, so we set the no-need-checkpoint
>>>> peroid, default just set 3 second.
>>>>
>>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>>> ---
>>>>    net/colo-compare.c | 13 ++++++++++++-
>>>>    1 file changed, 12 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/colo-compare.c b/net/colo-compare.c
>>>> index 6d500e1..0f8e198 100644
>>>> --- a/net/colo-compare.c
>>>> +++ b/net/colo-compare.c
>>>> @@ -40,6 +40,9 @@
>>>>    /* TODO: Should be configurable */
>>>>    #define REGULAR_PACKET_CHECK_MS 3000
>>>> +/* TODO: Should be configurable */
>>> Yes it should!
>>>
>>>> +#define CHECKPOINT_MIN_TIME 3000
>>>> +
>>>>    /*
>>>>      + CompareState ++
>>>>      |               |
>>>> @@ -455,6 +458,7 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>>>        Packet *pkt = NULL;
>>>>        GList *result = NULL;
>>>>        int ret;
>>>> +    static int64_t checkpoint_time_ms;
>>>>        while (!g_queue_is_empty(&conn->primary_list) &&
>>>>               !g_queue_is_empty(&conn->secondary_list)) {
>>>> @@ -494,7 +498,14 @@ static void colo_compare_connection(void *opaque, void *user_data)
>>>>                 */
>>>>                trace_colo_compare_main("packet different");
>>>>                g_queue_push_tail(&conn->primary_list, pkt);
>>>> -            /* TODO: colo_notify_checkpoint();*/
>>>> +
>>>> +            if (pkt->creation_ms - checkpoint_time_ms > CHECKPOINT_MIN_TIME) {
>>>> +                /*
>>>> +                 * TODO: Notify colo frame to do checkpoint.
>>>> +                 * colo_compare_inconsistent_notify();
>>>> +                 */
>>>> +                checkpoint_time_ms = pkt->creation_ms;
>>>> +            }
>>> You need to be careful how this interacts with the actual start of the
>>> checkpoint.   Lets say you have two miscompared packets close to each
>>> other:
>>>
>>>
>>>       miscompare!
>>>            checkpoint
>>>       miscompare!
>>>            ignore it because it was close to the 1st one
>>>
>>>      That means we never trigger the 2nd checkpoint and it'll carry on
>>> until the maximum checkpoint length.
>>>
>>>      But also, I think you need to consider what happens to future packets
>>> being compared; you can't release any packets now until the checkpoint
>>> as soon as you know there's a miscompare.
>> We need some time to do the checkpoint, and in this period we can ignore
>> the miscompare to get better performance. Like that:
>>
>> currently:
>>
>>      miscompare!
>>           notify checkpoint
>>      miscompare!
>>           notify checkpoint
>>      miscompare!
>>           notify checkpoint
>>      miscompare!
>>           notify checkpoint
>>      vm_stop and do checkpoint
>>
>>      vm_start and finish checkpoint
>>
>>      vm_stop and do checkpoint
>>
>>      vm_start and finish checkpoint
>>
>>      vm_stop and do checkpoint
>>
>>      vm_start and finish checkpoint
>>
>>      vm_stop and do checkpoint
>>
>>      vm_start and finish checkpoint
>>
>>
>> running normally.
>>
>>
>> after:
>>
>>      miscompare!
>>           notify checkpoint
>>      miscompare!
>>           ignore
>>      miscompare!
>>           ignore
>>      miscompare!
>>           ignore
>>      vm_stop and do checkpoint
>>
>>      vm_start and finish checkpoint
>>
>> running normally.
> Yes, but you must make sure that you don't
> ignore any miscompares after the start of the next checkpoint - I don't
> see how you avoid that.

Good catch, I will fix it in the next version.

>
> Also we must be careful about packets released after the 1st miscompare.

Yes, after the 1st miscompare, all ignored packets will be enqueued.
Then we will flush all packets in the queue while doing the checkpoint.
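
A minimal sketch of that hold-and-flush behaviour, assuming a small fixed-size
queue and integer IDs standing in for packets. The names and the queue limit
are hypothetical and only illustrate the ordering of events.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HOLD_MAX 16

/* Hypothetical queue of packets held back after a miscompare. */
typedef struct {
    int    ids[HOLD_MAX];  /* stand-in for packet pointers */
    size_t count;
    bool   holding;        /* set by the first miscompare */
} HoldQueue;

static void hold_init(HoldQueue *q)
{
    memset(q, 0, sizeof(*q));
}

/* The first miscompare switches the queue into holding mode. */
static void hold_on_miscompare(HoldQueue *q)
{
    q->holding = true;
}

/* Returns 1 if the packet may be released now, 0 if held, -1 if full. */
static int hold_submit(HoldQueue *q, int id)
{
    if (!q->holding) {
        return 1;
    }
    if (q->count == HOLD_MAX) {
        return -1;
    }
    q->ids[q->count++] = id;
    return 0;
}

/* At checkpoint time: hand back all held packets in order and reset. */
static size_t hold_flush(HoldQueue *q, int out[HOLD_MAX])
{
    size_t n = q->count;

    memcpy(out, q->ids, n * sizeof(q->ids[0]));
    q->count = 0;
    q->holding = false;
    return n;
}
```

Nothing is released between the first miscompare and the flush, which is the
property asked about above.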

Thanks
Zhang Chen

>
> Dave
>
>>
>> Thanks
>> Zhang Chen
>>
>>
>>> Dave
>>>
>>>>                break;
>>>>            }
>>>>        }
>>>> -- 
>>>> 2.7.4
>>>>
>>>>
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>>
>>> .
>>>
>> -- 
>> Thanks
>> Zhang Chen
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

-- 
Thanks
Zhang Chen


Thread overview: 19+ messages
2017-07-13  5:52 [Qemu-devel] [PATCH V2 0/4] Optimize COLO-compare performance Zhang Chen
2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 1/4] net/colo-compare.c: Add checkpoint min period to optimize performance Zhang Chen
2017-07-14  3:22   ` Jason Wang
2017-07-17  6:42     ` Zhang Chen
2017-07-14 12:10   ` Dr. David Alan Gilbert
2017-07-17  9:33     ` Zhang Chen
2017-07-17 12:24       ` Dr. David Alan Gilbert
2017-07-18  2:20         ` Zhang Chen
2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 2/4] net/colo-compare.c: Compare the tcp packets that has the same sequence number Zhang Chen
2017-07-14  3:25   ` Jason Wang
2017-07-17  7:39     ` Zhang Chen
2017-07-17  8:55       ` Dr. David Alan Gilbert
2017-07-17  9:23         ` Zhang Chen
2017-07-17 10:02           ` Dr. David Alan Gilbert
2017-07-14 12:24   ` Dr. David Alan Gilbert
2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 3/4] net/colo-compare.c: Optimize unpredictable tcp options comparison Zhang Chen
2017-07-14  3:33   ` Jason Wang
2017-07-17  9:06     ` Zhang Chen
2017-07-13  5:52 ` [Qemu-devel] [PATCH V2 4/4] net/colo-compare.c: Adjust net queue pop order for performance Zhang Chen
